Match Tuning Best Practices: Testing and Tuning

By Joel Snipes posted 11-08-2021 15:39

Welcome to the final piece of my three-part blog series on match and merge in Reltio. If you stumbled on this blog first, I invite you to check out the first two blogs as well as my webinar on the topic. Otherwise, we will jump into the testing and tuning topics surrounding match and merge.

Part 1: Data Profiling and Analysis

Part 2: Design and Implementation

Testing and Tuning

Building match rules is an iterative process. First, you do your best to design your rules using the background gathered from your data profiling. Then, you implement your rules, being mindful of comparator options and tokenization strategies. Finally, you review how your rules performed and look for opportunities to improve further.

Static Match Rule Analyzer

The Static Match Rule Analyzer is a tool built into Reltio that identifies syntax and comparator issues in your match rules. It generates an easy-to-understand HTML report that either confirms your rules are configured correctly or comments on where they could be improved.

To generate this report, head to your console and click on your data modeler. Then, click “Analyze” in the top right corner.

When checking this report, keep in mind it is “data agnostic”. This means that it isn’t necessarily looking at your data, but only at the syntax of your rules. The static match rule analyzer will give you feedback in its comments section indicating if your rules are syntactically correct, redundant, or if they were designed well.

The static match rule analyzer uses a color-coding system to indicate the status of each rule.

Green indicates a well-configured rule.

Yellow indicates your rule may be a duplicate, as it has many similarities with another rule. Consider combining the two rules or removing one of them.

Blue indicates that some of the operators used in the match rule could be either wrong or inefficient. Follow the guidance in the comments section to resolve the issue.

Dynamic Match Rule Analyzer

The Dynamic Match Rule Analyzer evaluates how your match rules perform against the data in your tenant. It helps identify tokens with too many collisions, under-matching, and redundant rules.

To use the dynamic match rule analyzer, head to your console and go to Tenant Management. Click on “Jobs”, start a “New Job”, and select “Analyze Match Rule”. Then choose what you would like analyzed.

When it finishes analyzing, click on “Match Analysis” on the left side to review your report. The report has three sections, which we will discuss in turn, the first being Summary.

The Summary view provides a high-level overview of how your match rules are performing for each entityType. In my report, a red warning on my Organization entityType indicates something is wrong: my entities are generating very few token phrases, meaning I may be under-matching. The next view, Match Token Phrases, will help me diagnose the issue.

Match Token Phrases

The Match Token Phrases view shows a bell curve of the number of tokens each of your entities has generated. The default view shows a combination of all match rules, but you can target individual rules with the drop down in the top right corner.

For my Organization ruleset, the average Organization generates only 2 tokens, and no Organization generates more than 4. Ideally, for a collection of match rules, we want the bulk of the bell curve between 5 and 300 tokens.

To remedy this, I would consider adding a fuzzy or soundex operand to the Name attribute. This would generate multiple tokens for each Organization's Name and move my bell curve toward the middle. If your bell curve is too far to the right, consider using “ignoreInToken” on an operator that generates multiple tokens.
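If you can get per-entity token counts out of your tenant, the 5-to-300 guideline can be checked with a few lines of Python. This is only a rough sketch: the list of counts is made-up sample data, the function names are hypothetical, and the 50% cutoff used to flag a problem is my own assumption rather than anything the Dynamic Match Rule Analyzer reports.

```python
from statistics import mean

# Guideline from this post: the bulk of entities should generate
# between 5 and 300 tokens.
LOW, HIGH = 5, 300


def summarize_token_counts(token_counts):
    """Report the average and the share of entities below, in, and above the range."""
    total = len(token_counts)
    below = sum(1 for count in token_counts if count < LOW)
    above = sum(1 for count in token_counts if count > HIGH)
    return {
        "average": mean(token_counts),
        "below_pct": 100 * below / total,
        "in_range_pct": 100 * (total - below - above) / total,
        "above_pct": 100 * above / total,
    }


def diagnose(summary, cutoff_pct=50):
    """Flag a ruleset when most entities fall outside the range (cutoff is an assumption)."""
    if summary["below_pct"] > cutoff_pct:
        return "possible under-matching"
    if summary["above_pct"] > cutoff_pct:
        return "possible over-tokenization"
    return "ok"


# Hypothetical sample: one token count per Organization entity.
organization_token_counts = [1, 2, 2, 2, 3, 2, 1, 4, 2, 1]

summary = summarize_token_counts(organization_token_counts)
print(summary)
print(diagnose(summary))
```

With counts like the sample above, every entity falls below the range, which mirrors the under-matching warning on my Organization entityType.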

Match Rules

The Match Rules view allows you to see how your rules are performing in comparison to one another. Blue boxes represent diverse match sets, which are ideal; red boxes represent similar match sets, which means the rules are not adding much value. Sometimes rules that consider completely different attributes find the same matches. In these cases, consider combining the rules or eliminating one of them.
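To make the idea of "similar" versus "diverse" match sets concrete, here is a small sketch that compares the pairs found by each rule. I am using Jaccard similarity as a stand-in measure; I do not know how the Dynamic Match Rule Analyzer actually scores overlap. The rule names, entity IDs, and the 0.8 threshold are all hypothetical.

```python
from itertools import combinations


def normalize(pairs):
    """Treat (a, b) and (b, a) as the same match by storing each pair as a frozenset."""
    return {frozenset(pair) for pair in pairs}


def jaccard(set_a, set_b):
    """Share of matches the two rules have in common (0.0 = none, 1.0 = identical)."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def compare_rules(matches_by_rule, threshold=0.8):
    """Label every pair of rules as 'similar' or 'diverse' (threshold is an assumption)."""
    report = []
    for name_a, name_b in combinations(sorted(matches_by_rule), 2):
        score = jaccard(
            normalize(matches_by_rule[name_a]),
            normalize(matches_by_rule[name_b]),
        )
        label = "similar" if score >= threshold else "diverse"
        report.append((name_a, name_b, round(score, 2), label))
    return report


# Hypothetical match pairs found by three rules.
matches_by_rule = {
    "NameAndAddress": [("e1", "e2"), ("e3", "e4"), ("e5", "e6")],
    "PhoneExact": [("e2", "e1"), ("e3", "e4"), ("e5", "e6")],
    "TaxId": [("e7", "e8")],
}

for row in compare_rules(matches_by_rule):
    print(row)
```

In this sample, NameAndAddress and PhoneExact look at different attributes but find exactly the same pairs, so one of them is a candidate for combining or removal, while TaxId contributes matches that neither of the others finds.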

Because the Dynamic Match Rule Analyzer depends on your data to determine the performance of your match rules, you should make it a habit to check your results quarterly, or at least yearly. As new data sets come in and data stewards work through potential match backlogs, the results will change. I like to save screenshots of the results to see how they change over time. In the top right corner you can access a profiling report of your match rule data, which I document as well.

Repeat

Once you have made improvements to your match rules, be sure to visit these reports again. You’ll be surprised at how much the results can change, and you will likely find more improvements to make.

Reltio Master Data Management is a cloud-native, data-driven system that works for you to consolidate and enrich thousands of person profiles to empower your business. When working with such a complex system, there are a lot of questions. To read answers from other users on the topic of Match Tuning Best Practices, or to post your own questions, join the community. And keep an eye out for other webinars on this topic.
