Saturday, November 24, 2018

Integration Patterns and Best Practices for Salesforce - Part 2

This is the second post in my series on integration patterns and best practices. It covers the remaining patterns.

To understand the patterns covered in the previous post, see the link below:
Integration Patterns and Best Practices for Salesforce - Part 1

If you are planning to take the "Salesforce Certified Integration Architecture Designer" exam, the information below will be helpful.

Note: This post summarizes the information in the Integration Patterns and Practices documentation provided by Salesforce.

Batch Data Synchronization

Use this pattern if you want to import data into Salesforce and export data out of Salesforce, taking into consideration that these imports and exports can interfere with end-user operations during business hours, and involve large amounts of data.
Below are different scenarios that can use this pattern:
  • Extract and transform accounts, contacts, and opportunities from the current CRM system and load the data into Salesforce (initial data import).
  • Extract, transform, and load customer billing data into Salesforce from a remote system on a weekly basis (ongoing).
  • Extract customer activity information from Salesforce and import it into an on-premises data warehouse on a weekly basis (ongoing backup).      
Change data capture
  • If the remote system is the master
Leverage a third-party ETL tool that allows you to run change data capture against source data. The tool reacts to changes in the source data set, transforms the data, and then calls Salesforce Bulk API to issue DML statements. This can also be implemented using the Salesforce SOAP API.
  • If Salesforce is the master
If Salesforce is the data source, you can use time/status information on individual rows to query the data and filter the target result set. This can be implemented by using SOQL together with SOAP API and the query() method, or by using SOAP API and the getUpdated() method.
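As a rough illustration of the time-based filtering described above, here is a minimal Python sketch that builds a SOQL query for rows changed since the last sync. The function name and the choice of SystemModstamp as the timestamp field are my assumptions for illustration; the query would still need to be sent through the API of your choice.

```python
from datetime import datetime, timezone


def build_incremental_soql(sobject, fields, last_sync):
    """Return a SOQL query selecting rows modified after last_sync."""
    # SOQL datetime literals are unquoted ISO 8601 values in UTC.
    stamp = last_sync.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"SELECT {', '.join(fields)} FROM {sobject} "
        f"WHERE SystemModstamp > {stamp} ORDER BY SystemModstamp"
    )


# Hypothetical example: the previous weekly export ran on 17 November 2018.
last_sync = datetime(2018, 11, 17, 0, 0, 0, tzinfo=timezone.utc)
query = build_incremental_soql("Account", ["Id", "Name"], last_sync)
print(query)
```

The ETL tool would store the timestamp of each successful run in its own control table and pass it in as last_sync on the next run.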

When using middleware, it is recommended that you create the control tables and associated data structures in an environment that the ETL tool can access. This provides adequate levels of resilience. Treat Salesforce as a spoke in this process, with the ETL infrastructure as the hub.
For an ETL tool to gain maximum benefit from data synchronization capabilities, consider the following:
  • Chain and sequence the ETL jobs to provide a cohesive process.
  • Use primary keys from both systems to match incoming data.
  • Use specific API methods to extract only updated data.
  • If importing child records in a master-detail or lookup relationship, group the imported data using its parent key at the source to avoid locking. For example, if you’re importing contact data, be sure to group the contact data by the parent account key so that the maximum number of contacts for a single account can be loaded in one API call. Failure to group the imported data usually results in the first contact record being loaded and subsequent contact records for that account failing in the context of the API call.
  • Any post-import processing, such as triggers, should only process data selectively.
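The grouping advice above can be sketched in Python as follows. This is only an illustration: the function name, the dictionary-based records, and the default batch size of 200 are my assumptions, not part of any Salesforce library.

```python
from collections import defaultdict


def batch_by_parent(records, parent_key, batch_size=200):
    """Split records into batches, keeping children of one parent together."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[parent_key]].append(record)

    batches, current = [], []
    for parent_id in sorted(grouped):
        children = grouped[parent_id]
        # Start a new batch instead of splitting a parent's children
        # across two batches when they would not fit in the current one.
        if current and len(current) + len(children) > batch_size:
            batches.append(current)
            current = []
        for child in children:
            current.append(child)
            if len(current) == batch_size:
                batches.append(current)
                current = []
    if current:
        batches.append(current)
    return batches


contacts = [
    {"LastName": "A1", "AccountId": "acc-1"},
    {"LastName": "B1", "AccountId": "acc-2"},
    {"LastName": "A2", "AccountId": "acc-1"},
    {"LastName": "B2", "AccountId": "acc-2"},
    {"LastName": "A3", "AccountId": "acc-1"},
]
batches = batch_by_parent(contacts, "AccountId", batch_size=3)
```

With a batch size of 3, all contacts of acc-1 land in the first batch and all contacts of acc-2 in the second, so no two API calls compete for a lock on the same account. A parent with more children than the batch size is still split, which is unavoidable.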
Error Handling and Recovery
  • Exporting data from SFDC
If an error occurs during a read operation from Salesforce, the middleware should perform the following steps:
    • Log the error
    • Retry the read operation
    • Terminate if unsuccessful
    • Send a notification
  • Importing data into SFDC
Errors that occur during a write operation via middleware can result from a combination of factors in the application (for example, record locking errors). The API calls return a result set that consists of the information listed below. Use this information to retry the write operation if necessary.
    • Record identifying information
    • Success/failure notification
    • Collection of errors for each record
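To make the write-side handling concrete, here is a minimal Python sketch that retries only the rows reported as failed. The write_batch callable is hypothetical and stands in for whatever API client the middleware uses; I assume it returns one result per input record, in order, with "id", "success", and "errors" keys that mirror the result set described above.

```python
import logging

logger = logging.getLogger("sync")


def write_with_retry(records, write_batch, max_attempts=3):
    """Write records, retrying only the rows the API reported as failed.

    Returns the records that still fail after the final attempt.
    """
    pending = list(records)
    for attempt in range(1, max_attempts + 1):
        results = write_batch(pending)
        failed = []
        for record, result in zip(pending, results):
            if not result["success"]:
                logger.warning(
                    "attempt %d failed for %r: %s",
                    attempt, record, result["errors"],
                )
                failed.append(record)
        if not failed:
            return []
        pending = failed
    return pending


# Fake writer for demonstration: record "b" hits a lock on the first call only.
calls = []


def fake_write(batch):
    calls.append(list(batch))
    first_call = len(calls) == 1
    results = []
    for record in batch:
        locked = first_call and record == "b"
        results.append({
            "id": record,
            "success": not locked,
            "errors": ["UNABLE_TO_LOCK_ROW"] if locked else [],
        })
    return results


leftover = write_with_retry(["a", "b", "c"], fake_write)
```

Anything left in the returned list after the last attempt would be logged and sent as a notification, in line with the read-side steps.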
Security Considerations
  • A Lightning Platform license is required to allow authenticated API access to the Salesforce API.
  • It is recommended to use standard encryption to keep password access secure.
  • Use the HTTPS protocol when making calls to the Salesforce APIs. You can also proxy traffic to the Salesforce APIs through an on-premises security solution, if necessary.
Timeliness

Timeliness is not a significant factor because these operations run in the background. However, loading batches during business hours might result in some contention, resulting in either a user's update failing or, more significantly, a batch load (or partial batch load) failing.
For organizations that have global operations, it might not be feasible to run all batch processes at the same time because the system might continually be in use. Data segmentation techniques using record types and other filtering criteria can be used to avoid data contention in these cases.

Data Volumes

This pattern is mainly used for bulk data import and export.

Remote Call-In


Use this pattern when a remote system needs to connect to Salesforce and, after authenticating, update records in SFDC.
The following options are available:
  • SOAP API
Query, create, update, or delete records and obtain metadata information from Salesforce.
Salesforce provides two WSDLs for remote systems:
  • Enterprise WSDL: Provides a strongly-typed WSDL that’s specific to a Salesforce organization.
  • Partner WSDL: Contains a loosely-typed WSDL that’s not specific to a Salesforce organization. It works with the generic sObject structure.
Security :- The client executing SOAP API must have a valid login and obtain a session to perform any API calls. The API respects object-level and field-level security configured in the application based on the logged in user’s profile.
Data Volume :- For bulk data operations (more than 500,000 records), use the REST-based Bulk API.
  • REST API
Query, create, update, or delete records and obtain metadata information from Salesforce.
REST exposes resources (entities/objects) as URIs and uses HTTP verbs to define CRUD operations on these resources. Unlike SOAP, the REST API requires no predefined contract, uses XML and JSON for responses, and has loose typing. REST API is lightweight and provides a simple method for interacting with Salesforce. Its advantages include ease of integration and development, and it’s an excellent choice for use with mobile applications and Web 2.0 projects.
Security :- We recommend that the remote system establish an OAuth trust for authorization. It’s also possible to make REST calls with a valid session ID that might have been obtained by other means (for example, retrieved by calling SOAP API or provided via an outbound message).
We recommend that clients that call the REST API cache and reuse the session ID to maximize performance, rather than obtaining a new session ID for each call.
  • Custom web services/Apex REST classes
We can create custom web services and provide the WSDL to the remote system so that it can consume the WSDL and call the custom web service methods. If we create Apex REST services, the remote system can call the URIs directly.
Custom web services or Apex REST services are useful when you need to update multiple records related to different objects in a single call, because the developer controls the logic for the complete transaction.
  • Bulk API
Bulk API is based on REST principles, and is optimized for loading or deleting large sets of data. It has the same accessibility and security behavior as REST API.
Bulk API allows the client application to query, insert, update, upsert, or delete a large number of records asynchronously by submitting a number of batches, which are processed in the background by Salesforce. In contrast, SOAP API is optimized for real-time client applications that update small numbers of records at a time.
Although SOAP API can also be used for processing large numbers of records, when the data sets contain hundreds of thousands to millions of records, it becomes less practical. This is due to its relatively high overhead and lower performance characteristics.
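To show how the REST API option maps a record to a URI and an HTTP verb, here is a small Python sketch that builds (but does not send) an update request. The instance URL, token, record ID, and API version below are placeholder assumptions; the helper name is mine.

```python
import json
import urllib.request


def build_update_request(instance_url, access_token, sobject, record_id,
                         fields, api_version="44.0"):
    """Build a REST API request that updates one record (not sent here)."""
    url = (
        f"{instance_url}/services/data/v{api_version}"
        f"/sobjects/{sobject}/{record_id}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(fields).encode("utf-8"),
        method="PATCH",  # PATCH updates an existing record
        headers={
            # The same OAuth token or session ID is reused for every call.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


request = build_update_request(
    "https://example.my.salesforce.com",  # placeholder instance URL
    "PLACEHOLDER_TOKEN",                  # obtained once via OAuth, then cached
    "Account",
    "001xx000003DGb2AAG",                 # placeholder record ID
    {"Phone": "555-0100"},
)
```

Passing the request to urllib.request.urlopen would perform the call. Caching the token and passing it to every request follows the session reuse recommendation above.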

Error Handling and Recovery

Error handling needs to be implemented by the remote system or middleware. The middleware or remote system should implement retry logic and also make sure that duplicate requests are not sent to Salesforce. We can handle duplicate requests in custom web services or Apex REST services, but the remote system still needs its own mechanism for this.

Timeliness

SOAP and REST APIs are synchronous.

Data Volume

SOAP/REST API
  • Login: The login request size is limited to 10 KB or less.
  • Create, Update, Delete: The remote system can create, update, or delete up to 200 records at a time. Multiple calls can be made to process more than a total of 200 records, but each request is limited to 200 records in size.
  • Query Results Size: By default, the number of rows returned in the query result object (batch size) by a query() or queryMore() call is set to 500. Where the number of rows to be returned exceeds the batch size, use the queryMore() API call to iterate through multiple batches. The maximum batch size is 2,000 records.
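The batch iteration described for queries can be sketched as a simple loop. The sketch below uses the REST API response shape, where each page carries "records", a "done" flag, and a "nextRecordsUrl" for the next page (the REST counterpart of calling queryMore()). The fetch callable is hypothetical and stands in for an authenticated HTTP GET that returns parsed JSON.

```python
def query_all(fetch, first_url):
    """Collect every row of a query by following nextRecordsUrl until done."""
    records, url = [], first_url
    while url:
        page = fetch(url)
        records.extend(page["records"])
        # The final page has done=True and no nextRecordsUrl.
        url = None if page["done"] else page["nextRecordsUrl"]
    return records


# Fake pages for demonstration; real URLs and payloads will differ.
pages = {
    "/query?q=demo": {
        "records": [{"Id": "1"}, {"Id": "2"}],
        "done": False,
        "nextRecordsUrl": "/query/next-page",
    },
    "/query/next-page": {"records": [{"Id": "3"}], "done": True},
}
rows = query_all(pages.__getitem__, "/query?q=demo")
```

Here rows ends up holding all three records across both pages.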
Bulk API
Bulk API is synchronous when submitting the batch request and associated data. The actual processing of the data occurs asynchronously in the background.
  • Up to 2,000 batches can be submitted per rolling 24-hour period.
  • A batch can contain a maximum of 10,000 records.
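A quick way to check whether a planned load fits within these limits is to work out the batch count up front. This Python sketch uses the two figures quoted above as constants; the function name is mine, and the limits may differ for your org or API version.

```python
import math

MAX_RECORDS_PER_BATCH = 10_000  # per the limit quoted above
MAX_BATCHES_PER_DAY = 2_000     # per rolling 24-hour period, as quoted above


def plan_bulk_load(total_records, batch_size=MAX_RECORDS_PER_BATCH):
    """Return the number of batches needed and whether it fits the daily limit."""
    if not 0 < batch_size <= MAX_RECORDS_PER_BATCH:
        raise ValueError("batch_size must be between 1 and 10,000")
    batches = math.ceil(total_records / batch_size)
    return {
        "batches": batches,
        "fits_daily_limit": batches <= MAX_BATCHES_PER_DAY,
    }


small = plan_bulk_load(1_250_000)   # 125 batches
large = plan_bulk_load(25_000_000)  # 2,500 batches, over the daily limit
```

A load that does not fit would need to be spread across more than one 24-hour window.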


UI Update Based on Data Changes


When an event occurs in Salesforce, such as an update to a record, the user should be notified in the Salesforce user interface without having to refresh the screen and potentially lose work.
The recommended solution to this integration problem is to use the Salesforce Streaming API.
This solution comprises the following components:
  • A PushTopic with a query definition that allows you to:
    • Specify what events trigger an update
    • Select what data to include in the notification
  • A JavaScript-based implementation of the Bayeux protocol (currently CometD) that can be used by the user interface
  • A Visualforce page
  • A JavaScript library included as a static resource
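As a rough sketch of the first component, the Python snippet below assembles the field values for a PushTopic record. The topic name, query, and API version are made-up examples, and the helper function is mine; the record itself would be created in Salesforce (for example, via Apex or an API call), after which clients subscribe to the matching /topic/ channel.

```python
def push_topic_definition(name, query, api_version=44.0):
    """Return field values for a PushTopic record (illustrative values)."""
    return {
        "Name": name,
        "Query": query,                    # selects the data sent in notifications
        "ApiVersion": api_version,
        "NotifyForOperationCreate": True,  # which events trigger an update
        "NotifyForOperationUpdate": True,
        "NotifyForFields": "Referenced",   # notify on changes to queried fields
    }


topic = push_topic_definition("CaseUpdates", "SELECT Id, Status FROM Case")
channel = f"/topic/{topic['Name']}"  # channel the CometD client subscribes to
```

The JavaScript CometD client on the Visualforce page would then subscribe to the channel and update the UI as notifications arrive.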
Benefits of the Streaming API
  • No need to write a polling mechanism to identify record changes
  • Users do not have to refresh the record or invoke any action to get the latest updates
Limitations
  • Delivery of notifications isn’t guaranteed.
  • Order of notifications isn’t guaranteed.
  • Notifications aren’t generated from record changes made by Bulk API.
Security Considerations

It respects Salesforce organization-level security.

Idempotent Design Considerations
  • Remote Process Invocation: Request and Reply / Request and Forget
It’s important to ensure that the remote procedure being called is idempotent, meaning it can recognize a repeated request and avoid processing it twice. It’s almost impossible to guarantee that Salesforce only calls once, especially if the call is triggered from a user interface event. Even if Salesforce makes a single call, there’s no guarantee that other processes (for example, middleware) do the same.
The most typical method of building an idempotent receiver is for it to track duplicates based on unique message identifiers sent by the consumer. Apex web service or REST calls must be customized to send a unique message ID.
  • Remote Call-In
The remote system must manage multiple (duplicate) calls, in the case of errors or timeouts, to avoid duplicate inserts and redundant updates (especially if downstream triggers and workflow rules fire). While it’s possible to manage some of these situations within Salesforce (particularly in the case of custom SOAP and REST services), we recommend that the remote system (or middleware) manages error handling and idempotent design.
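The duplicate-tracking idea can be illustrated with a minimal Python sketch of an idempotent receiver. The class and its in-memory dictionary are my assumptions for illustration; a real receiver would persist the seen message IDs in a durable store.

```python
class IdempotentReceiver:
    """Process each message ID once and replay the stored result on repeats."""

    def __init__(self, handler):
        self._handler = handler
        self._results = {}  # message ID -> result of the first processing

    def receive(self, message_id, payload):
        if message_id in self._results:
            # Duplicate delivery (for example, a retry after a timeout):
            # return the earlier result without running the handler again.
            return self._results[message_id]
        result = self._handler(payload)
        self._results[message_id] = result
        return result


created = []


def create_order(payload):
    created.append(payload)
    return f"order-{len(created)}"


receiver = IdempotentReceiver(create_order)
first = receiver.receive("msg-001", {"sku": "ABC"})
retry = receiver.receive("msg-001", {"sku": "ABC"})  # same message ID, sent twice
```

Both calls return the same order reference, and only one order is created, which is the behavior the caller relies on when it retries after a timeout.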



Hope this helps!

      1 comment:

      1. Regarding Change Data Capture as mentioned in the first param, I believe the scenarios are inverted. I think it should be "When SFDC is master, use change data capture"