This blog describes best practices for adding a new source to a domain and validating the match rules against it. It lists all of the steps in the order they should be considered. Generally, the Professional Services team will consult with and guide the client on which steps are relevant for their implementation.
Analysis & Data Profiling
When adding a new source to an existing configuration, data profiling must be performed on the data set to ensure that its data quality is similar to that of the existing data set, and that the existing match rules will support the new data set. This is the most important part of adding a new data set to Reltio. Data will make or break your solution.
The steps for analysing and profiling the new source are below:
- Conduct a mapping exercise to determine if the new data set maps into the existing configuration (if new attributes are required, that work will be done in the configuration step below).
- Use any analysis tool to profile the data set to be loaded into Reltio.
- Identify any data quality issues with the data set.
- Decide with the client how to address any data quality issues identified (e.g. resolve data quality issues in the source prior to loading the new data set into Reltio, or ensure enough data stewards are staffed to resolve suspect matches).
- Identify if there is a unique ID column (for example, personID / orgID) in the source that can be used as the crosswalk ID. If none exists, identify if a concatenation of columns can be used to create a unique key.
- Determine if there is a need to configure a lookupType (RDM) for the new source. If existing RDM lookups can be used, determine the values for the lookups.
- Determine if the introduction of new data will require additions or changes to the existing set of match rules.
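Some of the profiling checks above (fill rates, distinct counts, and finding a column or combination of columns that could serve as a crosswalk ID) can be sketched in Python. This is a minimal sketch and not a Reltio tool: the function names, the sample columns, and the two-column limit on concatenated keys are assumptions made for illustration.

```python
import csv
import io
from itertools import combinations


def profile_rows(rows):
    """Return fill rate and distinct-value count for each column."""
    report = {}
    if not rows:
        return report
    for col in rows[0].keys():
        values = [(r.get(col) or "").strip() for r in rows]
        filled = [v for v in values if v]
        report[col] = {
            "fill_rate": len(filled) / len(rows),
            "distinct": len(set(filled)),
        }
    return report


def is_unique_key(rows, cols):
    """True when every row has all key parts filled and no key repeats."""
    seen = set()
    for r in rows:
        parts = tuple((r.get(c) or "").strip() for c in cols)
        if not all(parts) or parts in seen:
            return False
        seen.add(parts)
    return True


def find_crosswalk_key(rows, max_cols=2):
    """Return the smallest column combination that is unique, or None."""
    if not rows:
        return None
    columns = list(rows[0].keys())
    for size in range(1, max_cols + 1):
        for cols in combinations(columns, size):
            if is_unique_key(rows, cols):
                return cols
    return None


# Hypothetical extract from the new source (illustrative values only).
SAMPLE_CSV = """personID,firstName,lastName,phone
101,Ann,Lee,555-0100
102,Bob,Ray,
103,Ann,Lee,555-0100
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = profile_rows(rows)
key = find_crosswalk_key(rows)

for column, stats in report.items():
    print(column, stats)
print("crosswalk key candidate:", key)
```

A low fill rate or a low distinct count on an attribute used by an existing match rule is a signal to revisit that rule before loading.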
NOTE: If the data does not adhere to certain expectations, then the source system's data entry processes should be reviewed (Who is capturing the data? Why is it being captured a certain way? Are there dependencies on ordering or opportunity management? Is the required data available at the time of data entry?). Oftentimes, rogue business processes are uncovered during this exercise.
In addition to Data Profiling of the new source and data set, any new configuration required must be implemented and tested prior to loading the new data set.
New configuration includes:
1. Defining the source in the Reltio Platform
- This can be done in two ways - via the API (L3 configuration) or via the console.
Add the source in the sources section of the L3 configuration,
or via the console (Data Modeler -> Sources)
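As a rough illustration of the API route, the sketch below builds a source entry and appends it to a sources list in a configuration held as a Python dict. The field names (uri, label, abbreviation), the URI pattern, and the source name "NewCRM" are assumptions for illustration and should be checked against the tenant's actual L3 configuration before use.

```python
import json


def build_source(name, abbreviation=None):
    """Build a source entry. Field names are assumed, not confirmed."""
    return {
        "uri": f"configuration/sources/{name}",
        "label": name,
        "abbreviation": abbreviation or name,
    }


def add_source(l3_config, source):
    """Append the source unless one with the same uri already exists."""
    sources = l3_config.setdefault("sources", [])
    if any(s.get("uri") == source["uri"] for s in sources):
        return False
    sources.append(source)
    return True


# Hypothetical configuration fragment with one existing source.
l3_config = {"sources": [build_source("Reltio")]}
added = add_source(l3_config, build_source("NewCRM", "NCRM"))

print(json.dumps(l3_config["sources"], indent=2))
```

Checking for an existing uri first keeps the step repeatable when the same change is applied to several tenants.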
2. Add the source into the corresponding RDM tenant, if applicable
- Log in to the corresponding RDM tenant
- Click on Sources in the left menu
- Click on “+ SOURCE SYSTEM”
- Fill out the Name and Abbreviation attributes so that they match what is in the MDM tenant. Click “DONE”
- Additional RDM Lookup Type mappings may be needed, based on profiling results and overall solution design.
3. Creation of any new attributes required to be stored in Reltio from the new source
- This can be done in two ways - via the API or via the console.
Add the attributes in the attributes section of the entityType,
or via the console (Data Modeler -> EntityTypes -> Select the entityType to which you would like to add the attribute -> Create New -> Define the parameters for the attribute)
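The sketch below shows one way an attribute definition could be assembled and added to an entityType held as a Python dict. The field names (uri, name, label, type), the URI pattern, the entityType "Individual", and the attribute "LoyaltyId" are all assumptions for illustration. The real definition in a tenant may require more fields.

```python
import json


def build_attribute(entity_type, name, label, data_type="String"):
    """Build an attribute definition. Field names are assumed."""
    return {
        "uri": f"configuration/entityTypes/{entity_type}/attributes/{name}",
        "name": name,
        "label": label,
        "type": data_type,
    }


def add_attribute(entity_config, attribute):
    """Add the attribute, refusing to overwrite an existing name."""
    attributes = entity_config.setdefault("attributes", [])
    if any(a.get("name") == attribute["name"] for a in attributes):
        raise ValueError(f"attribute already defined: {attribute['name']}")
    attributes.append(attribute)
    return entity_config


# Hypothetical entityType with one existing attribute.
entity_config = {
    "uri": "configuration/entityTypes/Individual",
    "attributes": [build_attribute("Individual", "Phone", "Phone")],
}
add_attribute(
    entity_config,
    build_attribute("Individual", "LoyaltyId", "Loyalty Id"),
)

print(json.dumps(entity_config["attributes"], indent=2))
```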
4. Creation or updating of any match rules for the source that are known prior to loading the data
- This can be done in two ways - via the API or via the console.
Add the matchrules in the matchgroups section of the entityType in L3 configuration. Sample below -
"label": "Exact(Phone), Fuzzy(Name)",
or via the console (Data Modeler -> EntityTypes -> Select the entityType to which you would like to add the match rule -> Create New -> Define the attributes, the match rule type (suspect / automatic) and the matching logic for the attributes (fuzzy or exact))
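To make the pieces of a match rule concrete (a label, a suspect or automatic type, and exact or fuzzy logic per attribute), here is a hedged Python sketch that assembles a match group as a dict and checks that every attribute it references is defined on the entityType. The keys (uri, type, rule, and, exact, fuzzy), the URI pattern, and the attribute names are assumptions for illustration. Only the label text comes from the sample fragment above.

```python
import json


def build_match_group(entity_type, name, label, rule_type, exact=(), fuzzy=()):
    """Build a match group dict. Key names are assumed, not confirmed."""
    if rule_type not in ("suspect", "automatic"):
        raise ValueError("rule_type must be 'suspect' or 'automatic'")
    base = f"configuration/entityTypes/{entity_type}"
    operands = {}
    if exact:
        operands["exact"] = [f"{base}/attributes/{a}" for a in exact]
    if fuzzy:
        operands["fuzzy"] = [f"{base}/attributes/{a}" for a in fuzzy]
    return {
        "uri": f"{base}/matchGroups/{name}",
        "label": label,
        "type": rule_type,
        "rule": {"and": operands},
    }


def missing_attributes(entity_config, match_group):
    """List attribute URIs used by the rule but absent from the entityType."""
    known = {a["uri"] for a in entity_config.get("attributes", [])}
    used = [u for uris in match_group["rule"]["and"].values() for u in uris]
    return [u for u in used if u not in known]


# Hypothetical entityType carrying the two attributes the rule uses.
base = "configuration/entityTypes/Individual/attributes"
entity_config = {
    "attributes": [{"uri": f"{base}/Phone"}, {"uri": f"{base}/Name"}],
}
match_group = build_match_group(
    "Individual",
    "PhoneExactNameFuzzy",
    "Exact(Phone), Fuzzy(Name)",
    "suspect",
    exact=["Phone"],
    fuzzy=["Name"],
)

print(json.dumps(match_group, indent=2))
print("missing attributes:", missing_attributes(entity_config, match_group))
```

Starting a new rule as suspect rather than automatic lets data stewards review its results before any merges happen on their own.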
External Match Tool
The External Match tool can be used to analyse match rule results prior to loading the new data set into a tenant. Documentation on the External Match tool is found here ->
Based on the analysis from the External Match tool, the customer can determine if the results are as expected, or if adjustments need to be made to the match rules prior to loading the new data set.
The External Match tool can be executed from the console.
The user needs to:
- upload the .csv file which is to be used for matching. The file can be in the same format as the source file or in a different format.
- perform mapping - map columns from the .csv file to the match attributes on the right. This is similar to running the data loader from the console.
- select the match rules that the job will be executed on, and
- run the job.
After the job is executed, the user will have the option to export the match reports. Two files are exported - matched and unmatched records. The user can then review potential suspect matches and auto merges and determine if the match rules need to be changed.
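Once the two files are exported, a quick summary can help decide whether another tuning iteration is needed. The Python sketch below counts matched and unmatched rows and tallies the matched rows by rule. It assumes one row per input record and a column naming the rule that fired. The column name "matchRule" and the sample rows are hypothetical, so adjust them to the actual layout of the exported files.

```python
import csv
import io
from collections import Counter


def summarize_match_reports(matched_file, unmatched_file, rule_column):
    """Summarize exported match reports read from two open CSV files."""
    matched = list(csv.DictReader(matched_file))
    unmatched = list(csv.DictReader(unmatched_file))
    total = len(matched) + len(unmatched)
    return {
        "matched": len(matched),
        "unmatched": len(unmatched),
        "match_rate": len(matched) / total if total else 0.0,
        "by_rule": Counter(r.get(rule_column) or "" for r in matched),
    }


# Hypothetical export contents, inlined so the sketch runs without files.
MATCHED_CSV = """personID,matchRule
101,PhoneExactNameFuzzy
103,PhoneExactNameFuzzy
104,EmailExact
"""
UNMATCHED_CSV = """personID
102
"""

summary = summarize_match_reports(
    io.StringIO(MATCHED_CSV), io.StringIO(UNMATCHED_CSV), "matchRule"
)

print("matched:", summary["matched"], "unmatched:", summary["unmatched"])
print("match rate:", round(summary["match_rate"], 2))
for rule, count in summary["by_rule"].most_common():
    print(rule, count)
```

With real exports, pass file objects opened with `open(path, newline="")` in place of the in-memory strings.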
For more details and screenshots on the steps, please refer to the documentation link below -
Match Tuning for adding a source is very similar to initial match tuning in a project. The analysis and data profiling you complete in the beginning of an onboarding project will give you an indication of how many iterations to plan for match tuning.
If the results of your analysis and data profiling indicate you need additional match tuning, please leverage the Creating Match Rules course in Reltio Academy for additional best practices and guidance.
When data profiling and the External Match tool analysis are complete, update any required configuration in DEV and TEST and test it properly prior to loading the data into the PROD tenant.
Learn More with the Reltio Community
The Reltio Community is a great place to learn more about how to use the Reltio products and connect with Master Data Management peers. Rely on the expertise of Reltio partners, customers, and technical experts.