Anatomy of a Match Rule

By Terence Kirk posted 06-09-2021 08:00

The match rules for your Reltio tenant will control how profiles get identified as matches, and if they match, whether to merge them automatically or send them for review by a data steward. Before creating your match rules, think about what is needed to identify two profiles as a match.

If you want to go deeper into matching, you can find more on our Matching Page.

Let’s say you have customer profiles from multiple sources in your tenant and you need to merge them when there are matches. There may be some profiles with very little information and others that contain more than you need for matching.

What similarities between two profiles are required for the system to automatically identify these profiles as a match and merge them automatically, and which ones should be reviewed by a data steward? Once you have a good understanding of your business requirements and the characteristics of the data from your business systems, you can get started on creating your match rules.

Video on Anatomy of a Match Rule

Match Rule Basics

Match rules work by comparing the attributes of one profile with those of another profile to see if they are the same. A number of comparison operators are provided:

Exact match checks to see if the attributes being used for comparison are identical for both records. In this case NULLs are not treated as matches, so if either or both attributes are NULL, they will not be treated as a match.
Exact or Null is the same as Exact, except that if one of the attribute values is NULL, it will be treated as a match.
Exact or All Null is the same as Exact, except that if both of the attribute values are NULL, it will be treated as a match.
Fuzzy match checks to see if the attributes being used for comparison are similar, using probabilistic match determinations that consider likely variations in data patterns such as misspellings (“John” vs “Jon”), transpositions (“John Smith” vs “Smith, John”), omissions (“Mike’s” vs “Mikes”), and phonetic variations. Reltio provides a number of algorithms for fuzzy matching that can be used based on the characteristics of your data.
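To make the NULL handling of these operators concrete, here is a minimal Python sketch of their semantics. It is an illustration only, not Reltio code: the function names are hypothetical, None stands in for NULL, and the fuzzy function uses Python's difflib similarity ratio with an arbitrary 0.8 threshold as a stand-in for Reltio's fuzzy algorithms. Treating two NULLs as a match under Exact or Null is also an assumption of this sketch.

```python
from difflib import SequenceMatcher
from typing import Optional


def exact(a: Optional[str], b: Optional[str]) -> bool:
    """Exact: both values are present and identical; any NULL means no match."""
    return a is not None and b is not None and a == b


def exact_or_null(a: Optional[str], b: Optional[str]) -> bool:
    """Exact or Null: identical values, or a NULL on either side.

    Assumption: two NULLs also count as a match here.
    """
    return a is None or b is None or a == b


def exact_or_all_null(a: Optional[str], b: Optional[str]) -> bool:
    """Exact or All Null: identical values, or NULL on both sides."""
    if a is None and b is None:
        return True
    return exact(a, b)


def fuzzy(a: Optional[str], b: Optional[str], threshold: float = 0.8) -> bool:
    """Stand-in fuzzy comparison: case-insensitive similarity ratio >= threshold."""
    if a is None or b is None:
        return False
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


print(exact("John", "John"))            # True
print(exact(None, None))                # False
print(exact_or_null("John", None))      # True
print(exact_or_all_null(None, None))    # True
print(exact_or_all_null("John", None))  # False
print(fuzzy("John", "Jon"))             # True
```

The only difference between the three exact variants is what happens when a value is missing, which is why the choice matters most for sparsely populated attributes.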

Several helper operators are also provided to refine how match rules are processed:

Equals can be used to trigger a condition when an attribute equals a fixed value e.g. State equals “CA”
Not Equals can be used to trigger a condition when an attribute does not equal a fixed value e.g. Gender not equals “Male”
In is the same as Equals except you can specify a number of values e.g. State in (CA, WA, OR)
And, Or, Not enable you to combine match clauses using logical operations e.g. (Fuzzy(FirstName) OR Fuzzy(LastName)) AND NOT(Exact(gender))
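The logical example above, (Fuzzy(FirstName) OR Fuzzy(LastName)) AND NOT(Exact(gender)), can be sketched in Python as follows. This is a conceptual illustration, not Reltio syntax: profiles are plain dictionaries, the helper names are hypothetical, and the fuzzy check is a difflib similarity ratio with an arbitrary threshold standing in for a real fuzzy comparator.

```python
from difflib import SequenceMatcher


def exact(a, b):
    """Both values are present and identical."""
    return a is not None and b is not None and a == b


def fuzzy(a, b, threshold=0.8):
    """Stand-in fuzzy comparison based on a similarity ratio."""
    if a is None or b is None:
        return False
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def in_values(value, allowed):
    """In: the attribute equals one of several fixed values."""
    return value in allowed


def rule(p1, p2):
    """(Fuzzy(FirstName) OR Fuzzy(LastName)) AND NOT(Exact(Gender))"""
    name_is_close = (
        fuzzy(p1.get("FirstName"), p2.get("FirstName"))
        or fuzzy(p1.get("LastName"), p2.get("LastName"))
    )
    return name_is_close and not exact(p1.get("Gender"), p2.get("Gender"))


a = {"FirstName": "John", "LastName": "Smith", "Gender": "Male", "State": "CA"}
b = {"FirstName": "Jon", "LastName": "Smith", "Gender": "Female", "State": "CA"}
c = {"FirstName": "Jon", "LastName": "Smith", "Gender": "Male", "State": "NY"}

print(rule(a, b))  # True: names are close and genders differ
print(rule(a, c))  # False: genders are exactly equal, so NOT(Exact) fails
print(in_values(a["State"], {"CA", "WA", "OR"}))  # True
```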

Here’s an example of how to define a match rule using the console data modeler:

This is how the rule is evaluated:

Several comparator classes are provided to support the comparison algorithms - see the doc portal for details. You can select the one that best supports your business requirements. Here is an example of how to define comparator classes using JSON:

When evaluating attributes for matching, “tokenization” is used to find match candidates that will be used by the match rules for comparison purposes. Tokenization avoids the need to compare every possible pair of records (out of perhaps millions of records) by identifying “match candidate pairs” in your tenant that are “close enough” to be used when evaluating match rules. Here’s an example of how to define match token classes using JSON:

Match Token classes are designed to mimic the same strategies as the comparator classes. So if you’re using a SoundexComparator for an attribute, and you determine that tokenizing that attribute is important, then you’ll likely want to use the SoundexTextMatchToken class for it. Each comparator provides a recommended match token class. As a general principle you should always use the recommended tokenizer class for a given comparator class.
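To show why tokenization cuts down the number of comparisons, here is a small Python sketch of the general idea: each profile gets a phonetic token, and only profiles that share a token become match candidate pairs. This is not Reltio's implementation. The soundex function is a simplified version of the classic Soundex code, and the candidate_pairs function and the sample profiles are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Classic Soundex letter groups; vowels, h, w and y carry no code.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return a 4-character Soundex-style token for a name."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    token = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            token += digit
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = digit
    return (token + "000")[:4]


def candidate_pairs(profiles):
    """Group profiles by last-name token and pair up only those sharing one."""
    buckets = defaultdict(list)
    for profile_id, profile in profiles.items():
        buckets[soundex(profile["LastName"])].append(profile_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


profiles = {
    1: {"LastName": "Smith"},
    2: {"LastName": "Smyth"},
    3: {"LastName": "Jones"},
    4: {"LastName": "Johnson"},
}

print(soundex("Smith"), soundex("Smyth"))  # S530 S530
print(candidate_pairs(profiles))           # {(1, 2)}
```

Four profiles give six possible pairs, but only one pair shares a token, so only that pair would go on to be evaluated by the match rules. The same sketch also hints at why the token class should follow the comparator: if the tokens are stricter than the comparison, pairs the comparator would have matched never become candidates.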

There are two main types of match rule:

Automatic: If an automatic match rule is triggered, the profiles will be merged automatically.
Suspect: If a suspect match rule is triggered, the profiles will be flagged as a Potential Match for review by a Data Steward, who will have three options:

Decide that the profiles are the same and merge them
Decide that the profiles are not the same and flag them as “Not a Match”, which will prevent them from showing as Potential Matches in the future.
If workflow is enabled, send the potential match for review by someone else, for example a business analyst.

A Match Rule Example

Let’s say your business rules dictate that exact matches between a Contact’s first name, last name, phone number and address line 1 can be merged automatically by the system, while any fuzzy matches of these attributes are to be flagged as Potential Matches for review by a Data Steward.

First name, last name, phone number and address line 1 are the attributes that will be used for the comparison. You will create two match rules:

An automatic match rule that will use the Exact comparison operator to compare the values of first name, last name, phone number and address line 1. If the rule is triggered the profiles will merge.
A suspect match rule that will use the Fuzzy comparison operator to compare the values of first name, last name, phone number and address line 1. If this rule is triggered the profiles will be flagged as Potential Matches for review by a Data Steward.
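The two rules in this example can be sketched in Python to show how a pair of profiles ends up merged, flagged, or ignored. This is a simplified illustration rather than Reltio behavior: the attribute names, the evaluate function and the sample contacts are hypothetical, the fuzzy check is a difflib similarity ratio with an arbitrary 0.75 threshold, and the sketch simply lets the automatic rule take precedence when both rules would trigger.

```python
from difflib import SequenceMatcher

ATTRIBUTES = ("FirstName", "LastName", "Phone", "AddressLine1")


def exact(a, b):
    """Both values are present and identical."""
    return a is not None and b is not None and a == b


def fuzzy(a, b, threshold=0.75):
    """Stand-in fuzzy comparison based on a similarity ratio."""
    if a is None or b is None:
        return False
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def evaluate(p1, p2):
    """Apply the automatic rule first, then the suspect rule."""
    if all(exact(p1.get(k), p2.get(k)) for k in ATTRIBUTES):
        return "auto-merge"
    if all(fuzzy(p1.get(k), p2.get(k)) for k in ATTRIBUTES):
        return "potential match"
    return "no match"


a = {"FirstName": "John", "LastName": "Smith",
     "Phone": "555-0100", "AddressLine1": "12 Main Street"}
b = dict(a)  # identical copy from another source
c = {"FirstName": "Jon", "LastName": "Smyth",
     "Phone": "555-0100", "AddressLine1": "12 Main St"}
d = {"FirstName": "Mary", "LastName": "Jones",
     "Phone": "555-0199", "AddressLine1": "9 Oak Ave"}

print(evaluate(a, b))  # auto-merge
print(evaluate(a, c))  # potential match
print(evaluate(a, d))  # no match
```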

Writing Match Rules

Now that you understand how match rules work, how do you create them? The console data modeler has a view that enables you to define match rules.

More complex match rules will need to be entered by editing the configuration JSON.

There are several things to keep in mind when making changes to match rules.

Download a copy of your current configuration before making any changes so that you can revert to the original if needed.
After you change the match rules, existing profiles are not re-evaluated automatically, so you won’t see any effect from the updated rules right away. To re-evaluate them, run the Rebuild Match Tables task in the console Tenant Management Application. If you don’t rebuild, your new match rules will only apply to new or updated data in your tenant.
If you are doing a large data load (e.g. an initial data load), disable real-time evaluation of match rules during the load and run the Rebuild Match Tables task when the load completes. Don’t forget to re-enable real-time evaluation of match rules afterwards.
All match rules for a profile are evaluated every time match rules are processed, so if you have say 4 or 5 suspect match rules for an entity type, you may see that several suspect match rules were triggered.
When making changes to match rules, the best practice is to make them “suspect” and test them on a representative subset of your data in a development environment, because it can be very time consuming to undo the results of an incorrect system-wide auto-merge. Follow these steps (in a development environment):

a. Set the match rule to “suspect”.
b. Make the required changes.
c. Rebuild the match tables.
d. Evaluate the Potential Matches to determine whether the rule is working the way you intend. Make sure to check all the business scenarios in your data set.
e. If the rule is not working as expected, go back to step b.
f. Once the rule is working as expected, set it to automatic, rebuild the match tables, and check once more that the merges are working the way you expect.
g. Apply the changes to Production.

Match Rule Efficiency

When you are provided with a Reltio tenant, it will have a starter configuration based on one of the Reltio accelerators, such as Account 360, Identity 360 or LS Customer 360. These configurations include a set of match rules for each entity type that cover commonly occurring business scenarios. You can use them as a starting point for developing your own match rules, and you should delete any that you do not need. As a general rule, keep your match rules to the minimum needed to support your business, and avoid having multiple match rules that do essentially the same thing, as this results in inefficiency.

To help you identify problems with your match rules, you can check their efficiency using the analyze match rules tool. The tool will lay out your match rules in a table and flag ones that overlap significantly. It will also flag problematic rules. The tool is especially handy when multiple people are developing match rules for a tenant. It’s also a great way to get a bird’s eye view of your match rules and help you to optimize them.

Learn More with the Reltio Community

The Reltio Community is a great place to learn more about how to use the Reltio products and connect with Master Data Management peers. Rely on the expertise of Reltio partners, customers, and technical experts.

Permalink

https://community.reltio.com/blogs/terence-kirk/2021/06/08/anatomy-of-a-match-rule
