It's fair to say that the Reltio Professional Services (PS) Team loves Reltio's Integration Hub (RIH).
Since Reltio partnered with Workato, we have developed (through customer engagements, or just for fun) upwards of 25 different RIH recipes. Many of these can be used with minimal changes to suit a customer's requirements, and they are used by our PS and SC teams, partners, and customers as part of their implementations and beyond.
We have developed recipes for:
- Exporting & Importing Data
- ETL
- Reference Data Management (RDM)
- Unmerging
- Custom Reports which provide information about
Throughout the development cycle, we have tuned these recipes with their task consumption in mind, to ensure they are designed optimally for the customer's functional and non-functional requirements. So I'm here to share the best practices we follow, so you can benefit from them too!
First, let's understand the major factors of task usage. There are many reasons why task consumption for a given recipe can rise, but it boils down to three main factors that broadly impact the task consumption of any recipe:

1. The number of jobs. E.g. say you work in the sales org of a business. For every new customer opportunity, you run a job that performs a particular action, consuming about 15 tasks each time the job runs. If about ten thousand customer opportunities were processed each day, those ten thousand jobs would be a multiplier for the number of tasks consumed by each job. This becomes task-intensive.

2. The number of events per job. E.g. you are reading a CSV file from your S3 bucket (other buckets are available) and then processing a couple of actions in HubSpot (other equally lovely CRM solutions are compatible with RIH too). Whether each event is processed individually or as a batch can have a multiplier effect on the total task count for a job. Additionally, keep in mind that data synchronisation or scheduled jobs typically process more than one event per trigger event. This can also become task-intensive.

3. The efficiency of processing events. The way you choose to implement processing logic affects how efficiently events are processed and the overall number of tasks consumed in every job. Using the right tools can strike a balance between development complexity and task optimisation. There are many tools in your arsenal here; PS often use Python, JavaScript or Ruby, which adds a little complexity when building the recipe but improves task usage massively.
Let's talk about task optimisation. From the section above, we understand how tasks are consumed and the different factors affecting overall task consumption for each job. There are a couple of approaches you can use to automate the identification of recipes in need of optimisation within RIH, so you don't have to do so reactively. By using the approaches below to assess high-consumption recipes proactively, you can flag certain recipes as potential candidates for optimisation. Once you've identified those recipes, you can apply the optimisation strategies we'll discuss to bring overall task consumption down to acceptable levels.
Which ones need some fine-tuning? There are two ways to identify your high-consumption recipes: the predictive approach and the reactive approach. First, we will look at the predictive approach, which is typically used when you don't have a lot of historical data to reference.
The Predictive Approach
In this example provided by Workato, you're deploying new recipes into your production environment, transitioning from a previous batch process to near real time (NRT). Consequently, it would take several months in production before trends indicating high overall task consumption became apparent. The predictive approach instead starts extrapolating data based on the past month's usage within one or two months of starting the recipe. This provides a prediction of annual task usage, helping you decide whether to consider that recipe for optimisation.
This recipe uses a RecipeOps connector to list all the active recipes in the workspace. It then uses a simple JavaScript step to extrapolate new data based on the historical data of those recipes, which is subsequently output to a CSV file. Here's an example of what you might see when you output the data:
As you can see, there are several recipes here with huge counts of Estimated Annual Tasks; these would be the ones to look at optimising first.
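As a rough sketch of what that extrapolation step does (the actual recipe uses a JavaScript step inside RIH; this Python version, with illustrative recipe names and a hypothetical one-million-task annual threshold, is just to show the logic):

```python
import csv
import io

def estimate_annual_tasks(recipes, flag_threshold=1_000_000):
    """Extrapolate annual task usage from the past month's task counts.

    `flag_threshold` is an illustrative cut-off, not an RIH limit.
    """
    rows = []
    for recipe in recipes:
        estimated_annual = recipe["tasks_last_month"] * 12
        rows.append({
            "recipe_name": recipe["name"],
            "tasks_last_month": recipe["tasks_last_month"],
            "estimated_annual_tasks": estimated_annual,
            "optimisation_candidate": estimated_annual >= flag_threshold,
        })
    # Sort so the heaviest consumers appear first.
    rows.sort(key=lambda r: r["estimated_annual_tasks"], reverse=True)
    return rows

def to_csv(rows):
    """Render the report as CSV, as the recipe does."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical recipes and monthly counts, standing in for RecipeOps output.
recipes = [
    {"name": "Opportunity NRT Sync", "tasks_last_month": 450_000},
    {"name": "Nightly RDM Export", "tasks_last_month": 12_000},
]
report = estimate_annual_tasks(recipes)
print(to_csv(report))
```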
The Reactive Approach
The reactive approach to identifying recipes for optimisation can be used once you have a decent amount of history for your recipes. To implement it, follow these steps:
- Focus on the lifetime task consumption metric to identify which recipes to address first.
- Use meaningful periods, such as three to six months, when comparing usage across recipes.
- Factor seasonality into your calculation: are you likely to see more customer opportunities at specific times of the year, for example?
- Consider any upward trends in task consumption, to identify whether any automation is showing periodic spikes or gradual growth in task usage.
- Prioritise your list. Once you have identified the various factors contributing to overall task consumption for different recipes, you can assess those recipes against business needs and determine the most appropriate optimisation strategies. Next, categorise them by business criticality. At a high level, every recipe serves either real-time or batch/deferred processing needs. Determine whether any automation can be changed from (near) real-time processing to deferred processing and batching. For example, it may be possible to convert a recipe with a five-minute polling interval to an hourly or daily sync job without affecting productivity.
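A minimal sketch of that triage decision, assuming hypothetical recipe metadata fields such as `time_critical` and `polling_interval_minutes` (RIH does not expose these names; they stand in for information you would gather during assessment):

```python
def triage(recipe):
    """Suggest a deferral strategy for a high-consumption recipe.

    Field names and the 5-minute threshold are illustrative assumptions.
    """
    if not recipe["time_critical"] and recipe["polling_interval_minutes"] <= 5:
        # A 5-minute poll runs 288 jobs/day; an hourly sync runs 24.
        return "convert to hourly or daily batch sync"
    if recipe["time_critical"]:
        return "keep near real-time; optimise triggers and batching instead"
    return "review trigger conditions and batch sizes"

suggestion = triage({"name": "Contact Sync", "time_critical": False,
                     "polling_interval_minutes": 5})
print(suggestion)  # convert to hourly or daily batch sync
```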
How to optimise
Once you have data and an understanding of the different factors contributing to overall task consumption for different recipes, you can use this information to choose an optimisation strategy. Let’s take a closer look at the three categories of optimisation strategies.
Trigger Strategies: Manage how automations are triggered to control the number of jobs for a given recipe.
| Trigger | Optimisation Strategy | Description | When to use | Efficacy | Complexity |
|---|---|---|---|---|---|
| Trigger Conditions | Use trigger conditions to filter only the events that the recipe should process. | Trigger conditions are additional rules that determine which events should be selected for processing; for example, process only closed-won opportunities. Trigger conditions eliminate extraneous jobs: no task is consumed if conditions are not met. | Need to process only certain records that fulfil predefined criteria based on event data. | Low - Moderate | Low |
| Batch/Bulk Triggers | Use batch triggers, when available, to process multiple events in a single job. | Batch triggers can process hundreds to thousands of records together in a single job. Larger batch sizes cut down the number of jobs needed to process the same number of records, and hence the number of tasks used. Combine batch triggers with batch actions in the recipes when possible for substantial performance gains. | Data sync use cases that process a high volume of information, for improved throughput and task savings, when the source system supports batch/bulk actions. | High | Low |
| Polling Frequency | Set the polling interval to a higher value. Triggers also support setting a custom interval value. | Most RIH triggers are poll-based. The default polling interval is every 5 minutes. Lower polling intervals create more jobs, resulting in higher task consumption, especially where additional pre-processing logic such as data enrichment is required. | The source system does not have frequent new or updated records, and/or use cases are not time-critical. | Moderate - High | Low |
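To see why polling frequency and batch size matter so much, here is a back-of-the-envelope sketch. The record volumes and tasks-per-job figures are illustrative assumptions, not RIH billing rules:

```python
def daily_jobs(polling_interval_minutes):
    """How many polls (and hence potential jobs) run per day."""
    return (24 * 60) // polling_interval_minutes

def daily_tasks(records_per_day, batch_size, tasks_per_job):
    """Tasks per day when each job handles one batch of records."""
    jobs = -(-records_per_day // batch_size)  # ceiling division
    return jobs * tasks_per_job

# Default 5-minute polling vs an hourly sync:
print(daily_jobs(5))    # 288 polls per day
print(daily_jobs(60))   # 24 polls per day

# 10,000 records/day at an assumed 5 tasks per job:
print(daily_tasks(10_000, 1, 5))     # one record per job: 50000 tasks
print(daily_tasks(10_000, 500, 5))   # 500-record batches: 100 tasks
```

The last two lines show the "reduce task usage by a factor of X" effect described in the batch strategies below.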
Batch Strategies: Implement batching techniques to optimise the processing of multiple events in a single job.
| Strategy | Optimisation Strategy | Description | When to use | Efficacy | Complexity |
|---|---|---|---|---|---|
| Batch/Bulk Actions | Use batch/bulk actions, when available, to process multiple events in a single step. | Batch actions can process hundreds to thousands of records together in a single action, and custom batch sizes can lead to significant task savings when repeat actions are needed to manage target system limitations. Instead of processing 1 record at a time (1 task each), you can process batches of X records (1 task per batch), reducing task usage by a factor of X. | Data synchronisation use cases that process a high volume of information, for improved throughput and task savings, when the target system supports batch/bulk actions. | High | Low |
| Filtering Events | Use action-specific filtering criteria, such as WHERE conditions or query parameters in APIs, to limit the number of events returned from search/list actions. | Recipes often require only a subset of data based on specific business logic or dynamic request data in the trigger event. Similar to filtering events in triggers using conditions, applying filtering logic in search or list actions limits the results returned by the business applications to only the required records, eliminating any additional filtering steps in the recipes. | Need to filter a batch of events/list input from search/GET actions based on specific criteria before using them in the recipe. | Moderate - High | Low |
| Delta Sync | Process only the changed records instead of the full data set. | Delta synchronisation processes only data that has changed (new or updated records) since the last successful job execution, instead of processing the complete record set every time. Use identifiers in the source data, such as a last-modified date field, to read only the changed records. This approach ensures improved data consistency, efficient task consumption, and reduced load on source and target business systems. | Data synchronisation use cases where the source system supports changed-data identification. Refer to database design best practices for improving performance and reducing load on all systems. | Moderate - High | Low |
| ELT Pipeline | Use the ELT pattern instead of ETL for complex data transformations. | ELT (Extract, Load, and Transform) is an alternative approach to ETL that loads data into the data warehousing platform first and then applies the transformation logic. The ELT pattern improves efficiency by taking advantage of purpose-built data warehousing platforms for high-volume data processing, while the recipe provides the orchestration and automation logic. | High-volume data processing requirements on modern data warehousing solutions such as Snowflake. | High | High |
| Streaming | Use streaming actions to automate large file transfers. | File streaming is the concept of reading and writing a file in smaller parts (chunks) in sequence. A typical example is transferring records from a shared file system (SFTP) to a file hosting platform for analysis (Amazon S3). When transferring a file from a source app to a destination app, Workato splits the file into smaller chunks and downloads them; these chunks are then uploaded to the destination app in separate requests. | File management/analytics use cases where both source and target systems support streaming actions. | High | Low |
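The delta sync row above can be sketched in a few lines. The `last_modified` field and the watermark handling are assumptions about the source data, not a specific RIH connector:

```python
from datetime import datetime, timezone

def delta_records(records, last_run):
    """Return only records changed since the last successful job run."""
    return [r for r in records if r["last_modified"] > last_run]

# Watermark from the previous successful run (illustrative).
last_run = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    {"id": "a", "last_modified": datetime(2023, 12, 30, tzinfo=timezone.utc)},
    {"id": "b", "last_modified": datetime(2024, 1, 2, tzinfo=timezone.utc)},
]
changed = delta_records(records, last_run)
print([r["id"] for r in changed])  # ['b']
```

After a successful run, the watermark would be advanced to the job's start time so the next run picks up only subsequent changes.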
Recipe Efficiency Strategies: Use alternative methods to implement business logic in recipes more effectively.
| Strategy | Optimisation Strategy | Description | When to use | Efficacy | Complexity |
|---|---|---|---|---|---|
| Formula Mode | Use inline formulas for data transformations instead of repeat actions over lists. | Most business applications accept and return complex data structures, broadly classifiable as JSON objects or arrays. Formulas let users work with and format data easily; formulas in RIH are whitelisted Ruby methods. Mapping data in formula mode lets you handle complex data structures and counts as just one task per step. | Recipe logic requires data validation, cleansing, extraction, type casting, and/or transformations. | Moderate - High | Medium |
| List Processing | Define the target list schema and use dynamic list mode for implicit looping over existing lists. | RIH supports implicit looping over existing lists in dynamic list input mode: Workato automatically loops over the input list and applies all configured data pill transformation logic to each element. Use either the Mapper or Accumulate List actions in dynamic input mode. For Mapper actions, you can define the target list schema using Common Data Models from the tools menu; for the Accumulate List action, edit the schema directly in the recipe to define list item fields as an array. | Need to apply data pill level transformations to all the elements in an existing list before using it in other actions. | Moderate - High | Low |
| Collections | Use Collections in RIH for complex data processing. | Collections in RIH use an in-memory database that lets you work with data from multiple sources. You can create lists and query them using standard SQL syntax, with common SQL keywords like WHERE, GROUP BY, and JOIN, to manipulate data from tables into your desired format. | Complex data processing requirements such as merging/enriching data from different sources, applying transformations to a batch of events, aggregating data for a summary view, etc. | Moderate - High | Medium |
| Custom Code | Use custom code for implementing complex business logic and data processing. | You can perform compute-intensive processing more efficiently with a few lines of code. Examples include flattening hierarchical data structures, filtering lists, transforming object maps, or running nested loops. Write and execute custom code in recipes using popular languages such as Ruby, JavaScript, and Python; each execution of such custom code counts as 1 task. | Need for complex processing such as hierarchical data structures, high/nested cardinality, custom parsing logic, semi-structured or unstructured data processing, filtering lists, etc. | Moderate - High | Medium - High |
| Variable Grouping | Use a single Variables by RIH step to create multiple variables. | Multiple steps to create variables can quickly add to overall task usage when combined with frequent recipe execution or high-volume data processing. | Recipes with several related variables that can be grouped into a single step. | Low | Low |
| Variable Declarations | Declare and define variables close to where they are first used. | Recipe steps such as declaring and defining variables count towards task usage. Often, multiple variables are declared upfront for use inside a conditional block that is never executed. For frequently executed recipes, such task consumption adds up quickly; defining variables inside the conditional block avoids unused variables inflating the overall task count. | Recipe logic with variables used only inside a conditional block. | Low | Low |
| Response Caching | Use cached responses when implementing API GET requests. | RIH can cache API responses for GET request endpoints. You can specify the time in seconds (up to 3600) that the API response is stored in the cache before being refreshed or deleted. RIH will use a cached response for any request matching the cache key in the client request, avoiding triggering the API endpoint recipe and consuming just 1 task. | Need to implement an API endpoint for frequently accessed data, or multiple concurrent requesters accessing shared data. | Moderate - High | Low |
| Custom SDK | Use the RIH connector SDK to implement complex automation logic. | RIH's custom SDK connector allows you to write business logic in Ruby while providing a consistent low-code/no-code approach for using it in recipes. The custom SDK connector also supports advanced capabilities that can help optimise task counts for commonly used recipe logic; for example, a multi-step action that gets asynchronous, long-running Google BigQuery results in a single step counts as 1 task. | Need for complex automation logic such as asynchronous processing with polling, or special Ruby methods supported by the connector SDK. | Moderate - High | High |
| Reusable Recipes | Optimise a recipe and reuse it as a regular function. | RIH allows you to call recipes from inside other recipes, so that regular functions can be used in multiple recipes. | When you have a complex recipe that could be modularised, with existing functions incorporated to improve development efficiency. | Moderate - High | Low |
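As an example of the kind of custom-code step described in the table above, here is a hypothetical Python snippet that flattens a hierarchical record into dot-notation keys in a single pass (one task), instead of using nested repeat steps:

```python
def flatten(obj, prefix=""):
    """Flatten a nested dict into dot-notation keys, e.g. address.geo.lat."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path prefix along.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

# Illustrative record, standing in for a trigger event's data structure.
record = {"name": "Acme", "address": {"city": "London", "geo": {"lat": 51.5}}}
print(flatten(record))
```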
Please note that the strategies above are adapted from Workato's best practices on the same topic.
Monitoring and Tracking
You've identified, assessed, and optimised your high-consumption recipes, but imagine having a dashboard where you can monitor task consumption trends month by month. You could see which recipes are experiencing periodic spikes, and use that dashboard to determine why those spikes are happening and how to fix them. Well, you can!
Monitoring Task Consumption
Task consumption can be tracked in several areas within RIH. You can monitor task consumption for a specific job, recipe, or at the account level. You can also include task consumption metrics in your regular administrative reporting and automate task monitoring and management using RecipeOps.

Monitoring at the Job Level
Here, you can select a job to view the total number of tasks used. If a recipe calls other recipe functions, it will not display the total number of tasks across all child jobs; each "child job" has its own task usage details.

Monitoring at the Recipe Level
Recipe task usage can be found under the Settings tab of the recipe. Alternatively, you can filter specific recipes and time windows from the Dashboard.

Monitoring at the Tenant Level
Monthly tenant usage is available in the Dashboard. It provides customisable views of task consumption, from hourly to monthly and overall.

Monitoring at the Account Level
You should always have visibility of your task counts across all of your tenants: Dev, Test and Prod. This is another way to flag the need to optimise your recipes. You can achieve it in a number of ways:
- Work closely with your Customer Success Manager, who will have a regular cadence with you and share your usage against all your entitlements.
- Monitor proactively by leveraging RecipeOps and other connectors to develop a recipe that monitors your usage.
In Conclusion
We have looked at task usage, at identifying which recipes to optimise, and at how to prioritise that work. We have looked at three families of optimisation strategies: trigger strategies (managing how automations are triggered to control the number of jobs for a given recipe), batch strategies (batching techniques to optimise the processing of multiple events in a single job), and recipe efficiency strategies (alternative methods to implement business logic in recipes more effectively). Lastly, we looked at how to monitor this good work and track changes in consumption at the job, recipe and account levels.
Reltio PS have implemented many of these enhancements in their own recipes to ensure they are optimised. We have seen some great successes with these approaches, and we know you will too!
Paul Lawrence Technical Delivery Manager
Father, Analytical thinker, and Builder of Lego.
Paul is a Project Manager & Senior Account Manager with 16 years of experience working with global brands in FMCG, Pharma, and Manufacturing, managing high-value and enterprise accounts to drive renewals and growth of existing contract value while delivering exceptional customer adoption and best practice.
Special thanks to Moin Ur Raza, Diparnab Dey, Kevin King, Aditi Verma and Zaid Jawad for taking the time to review this blog and for providing feedback!