Reltio Connect

Reltio Community Show: Relevance Based Matching - Webinar

By Chris Detzel posted 01-20-2022 17:21

Relevance based matching allows you to weight the quality of your matches and determine the outcome based on that weight. Learn how you can reduce the complexity and number of match rules you maintain while improving your match quality. For more information, head to the Reltio Community and get answers to the questions that matter to you: https://community.reltio.com/home

Download the PPT here.

Transcript for Relevance Based Matching:

Joel Snipes (00:28):

All right. Let me jump into it here. So today I want to talk about Relevance Based Matching. We've had a couple of webinars or community shows in the past about matching between me and [Suchen 00:00:40]. And this is really a deep dive into one of the newer features in our matching repertoire. So let me start by introducing what Relevance Based Matching is, especially in the context of how matching worked previously. So I call the previous match strategy binary matching, as opposed to what I'm going to be talking about today, relevance matching. In the previous way of matching, everything was logic based. So every operand evaluated to a true or false, and you needed all trues at the end to have a match considered. Whereas with Relevance Based Matching, every operand evaluates to a numerical score, and those numerical scores are weighted. And if the final value of the score of the match rule at the end of evaluation is over a certain threshold, an action will occur. So there's really one of two outcomes in a binary match rule.

Joel Snipes (01:48):

You can either match or not match, whereas in a relevance-based rule, you can have multiple outcomes. You can choose whether to immediately merge, have a potential match, or even have a potential match with a label of maybe weak match, or something, so the data stewards can prioritize. So there are a lot of advantages to Relevance Based Matching in terms of what you can do inside of a single rule. And they do work a little differently, but there are a lot of things that are still the same.

Joel Snipes (02:16):

So we're still going to be tokenizing all of our attributes. We still use the same comparator classes and dictionaries. And at the end of the day, there are still auto matches or merges and suspect matches, just like before. So that's generally an introduction to how it's different from our base matching platform. And to give an overview: each attribute in your match rule, so say you have a first name, a last name, an address line one, a city, a state, will have a weighting. And by default they're all weighted equally. So you can set a threshold score, say maybe 80% or 0.8.

Joel Snipes (03:03):

And if you break that threshold score, it'll auto merge. And one of the really cool things here is you can set a second threshold, maybe from 60% to 80%, where it won't auto merge, but it'll create a potential match. So now you have what would normally take two rules to configure in the binary strategy done in one, with a lot more flexibility. So if one particular attribute does not match, it's still possible to have a match outcome, whereas that wouldn't be the case in binary. So I hope that gives an overview of what relevance matching is versus our traditional matching from before.
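
A minimal sketch of the two-threshold idea described here, assuming the example cutoffs of 0.8 and 0.6 from the talk. The function name, the action strings, and the choice to treat the lower bound of each range as inclusive are illustrative assumptions, not Reltio's implementation.

```python
def action_for_score(score, merge_at=0.8, review_at=0.6):
    """Map one rule's relevance score (0.0 to 1.0) to a single outcome.

    Hypothetical helper: the cutoffs mirror the 80% / 60% example in the talk.
    """
    if score >= merge_at:
        return "auto_merge"        # strong enough to merge automatically
    if score >= review_at:
        return "potential_match"   # queued for a data steward to review
    return "no_action"             # below every threshold


for s in (0.9, 0.7, 0.5):
    print(s, action_for_score(s))
```

With a binary rule, the same coverage would need one exact rule for auto merge and a second, looser rule for suspect matches.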

Joel Snipes (03:48):

So why would you want to use Relevance Based Matching as opposed to our old system? One of the advantages is you get to maintain a lot fewer match rules, and that's for a couple of reasons. One is because previously a lot of customers would have maybe a match rule on first and last name that would use exact first and last name as criteria. And then that would be an auto merge. And then they'd have a second rule where first and last name are set to fuzzy. And that would be a suspect match, because we might not feel confident enough to take that fuzzy match and have it automatically merged without a data steward's oversight.

Joel Snipes (04:29):

With Relevance Based Matching, we can have one match rule cover both a potential match and an auto match. And that's one way you can reduce the number of rules you have to maintain. Another way is because it evaluates to a score instead of a Boolean. Previously, you might have one match rule that's the combination of name, address, and email and another that's name, address, and phone number.

Joel Snipes (04:54):

And they're very, very similar rules, right? It's just the email versus the phone number you're using to change how they evaluate. Well, you could have email and phone number both in one rule. And as long as the score is coming out high enough, even if maybe email matched and phone didn't, or phone matched and email didn't, if you set your thresholds correctly, you can cover both of those scenarios with only one rule. So there's quite a lot of opportunity to reduce the number of rules you have to maintain, and that's going to make your life easier in terms of your Reltio admins and what they have to keep up with and maintain going forward.

Joel Snipes (05:37):

The other advantage is those dynamic rules. So there's a lot more dynamism when you aren't evaluating simply true or false. So within a rule, you can actually tune the rule without changing what you're evaluating, just the end score. So previously you would have to take attributes in or out to change the outcome of a rule set. Now you can leave the same attributes and just tune the threshold at which you decide to merge or decide to match. So if you feel like you're under matching when your threshold is set to 80%, you can drop that down to 75% or 70% and change nothing else about your rule to cast a slightly wider net.

Joel Snipes (06:25):

And the last big advantage of Relevance Based Matching is, because you can have multiple outcomes, say you have a rule that only evaluates to a potential match. You could actually put those in two buckets, a strong potential match and a weak potential match. That lets your data stewards go through the strong potential matches first, get that low-hanging fruit, get that easy value early, before they work through the weak matches, which might not have much value. So that prioritization can be really, really great in terms of making the best of your time and getting the value a little more quickly.

Joel Snipes (07:04):

So there are two ways to configure a Relevance Based Match rule, just like any match rule. It can be configured through the UI inside the data modeler, and this screenshot to the left gives you a preview of what that could look like, or it can be configured in your L3 through the API. My recommendation to you would be to start with the data modeler in the UI, because it's a lot easier to use. It builds all that syntax for you. And after you save it, it'll automatically be added to your L3 and you can customize it from there.

Joel Snipes (07:38):

There are a few features that are only available if you're manually editing the JSON, that you can't do from the UI. But if you go ahead and get that base rule built, you can go back and tweak it with things like weightings. So in this particular example, we see first name, last name, and suffix are being used to match. First name and last name don't have a weighting assigned inside this weights array. So if no weighting is assigned, they're given a weighting of 1.0, or 100%. By default, everything has a weighting of 1.0.

Joel Snipes (08:20):

Now, in this example, a suffix like a junior or senior doesn't have the same amount of value in determining if two individuals are the same as first and last name do. So the decision was made to include suffix, but give it a very light weight of 0.2. So suffix carries only 20% of the value that first or last name carry. And this is a great example of something you can do with Relevance Based Matching that you wouldn't be able to do with the previous version of matching.

Joel Snipes (08:57):

Previously, you would have just had the exact rule, and if this evaluated to false, this was thrown out the window altogether. So if you had a scenario where you had a junior and somebody who didn't include their junior in another crosswalk, they would not have matched, but in this example they will. And then in the lower part here, you can see the action thresholds. So it's going to auto merge at an 80% to 100% match and become a potential match from 0.4 to 0.8. On the left side, you can see this is actually managed by a little UI. I don't know what you call this bar, but you can drag and drop it to the point you want. So from here, I'm going to jump into the UI and actually configure a rule from scratch for you.
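
To make the slide's example concrete, here is a rough sketch of the rule as a plain Python dictionary. The key names and layout are illustrative assumptions and are not Reltio's actual L3 schema. The sketch only captures what is stated in the talk: three attributes, a 0.2 weight on suffix, a default weight of 1.0 for anything unlisted, and the two action ranges.

```python
# Hypothetical, simplified representation of the rule on the slide.
demo_rule = {
    "type": "relevance_based",
    "attributes": ["FirstName", "LastName", "Suffix"],
    "weights": {"Suffix": 0.2},  # FirstName / LastName are left unweighted
    "action_thresholds": [
        {"action": "auto_merge", "range": (0.8, 1.0)},
        {"action": "potential_match", "range": (0.4, 0.8)},
    ],
}


def effective_weights(rule):
    """Fill in the default weight of 1.0 for attributes with no explicit weight."""
    return {attr: rule["weights"].get(attr, 1.0) for attr in rule["attributes"]}


print(effective_weights(demo_rule))
```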

Chris Detzel (09:45):

So quick question or thought. So for context, this level of matching, is it used to reconcile disparate data sources in Reltio? Is that right?

Joel Snipes (09:54):

That's right. Yeah. You're bringing together, say, a Salesforce system and a HubSpot system that you've loaded. And you want to find which of the individuals exist both in your sales system and your marketing system.

Chris Detzel (10:06):

Okay. And then, so Danny says, "seems like a great improvement," but then another question from Steve is, if one record has senior and the other has junior, would they still match incorrectly if you have the weight for the suffix very low?

Joel Snipes (10:25):

So they will match incorrectly, right? So for the suffix, if one's junior and one's senior, it will be no match at all, because in this example we're using the exact match token. So it's still evaluating true or false. And what's going to happen is, we're going to have 100% plus 100% plus 0% for suffix. And that's going to be divided by basically the highest possible value, which would be 2.2. And that score will land in one of these two categories. I think in that scenario, the score would probably still be in that 80% to 100% range. So they would auto merge.
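
The arithmetic in this answer can be worked through in a few lines. This is a sketch of the calculation as described in the talk (each comparison is true or false, matched attributes contribute their full weight, and the sum is divided by the highest possible value), not Reltio's internal scoring code. The function and attribute names are hypothetical.

```python
def relevance_score(weights, matched):
    """Sum the weights of matched attributes, divided by the maximum possible sum."""
    total = sum(weights.values())
    earned = sum(w for attr, w in weights.items() if matched.get(attr))
    return earned / total


weights = {"first_name": 1.0, "last_name": 1.0, "suffix": 0.2}
# Junior vs. Senior: the names match exactly, the suffix does not.
matched = {"first_name": True, "last_name": True, "suffix": False}

score = relevance_score(weights, matched)
print(round(score, 3))  # (1.0 + 1.0 + 0.0) / 2.2 = 0.909
```

At roughly 0.91, the pair lands inside the 0.8 to 1.0 auto-merge range, which is why the thresholds or the suffix weight would need tuning to keep juniors and seniors apart.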

Joel Snipes (11:17):

And as a tuning exercise, you could see that landing in that range and maybe tighten this up and move this up to 0.95 to 1, that sort of thing. So that's a really good question. You have to think about how you weight things and what your action thresholds are. So if you're getting incorrect junior and senior merges, you might want to increase the weight of suffix, or tighten your action thresholds, but it's something that can happen that wouldn't happen with binary matching.

Chris Detzel (11:47):

Okay. And one last question before the demo. Well, it's not the last one, but this one. Let's say first name is compared between the records; then the result will be between zero and one. But the same is between 0 and 0.2 for suffix. Is that the understanding? Is that kind of right?

Joel Snipes (12:12):

So I think the question, if I understand it correctly, is what happens if these two thresholds overlap? What if this one was from 0.4 to 0.9 and this one's from 0.8 to 1.0? So we have an overlap from 0.8 to 0.9. If there's an overlap in the two thresholds, it goes with the potential match over the auto merge. So it plays it safe and doesn't over match for you.
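
The overlap behavior described here can be sketched as follows. The data layout and function name are hypothetical. The only rule taken from the talk is that when a score falls into both ranges, the potential match wins over the auto merge.

```python
def resolve_action(score, thresholds):
    """Pick the action for a score, preferring the safer outcome on overlap."""
    hits = [action for action, (low, high) in thresholds if low <= score <= high]
    if "potential_match" in hits:
        return "potential_match"  # plays it safe when ranges overlap
    if "auto_merge" in hits:
        return "auto_merge"
    return None                   # score is outside every range


# Overlapping ranges from the example: 0.4-0.9 and 0.8-1.0.
thresholds = [("potential_match", (0.4, 0.9)), ("auto_merge", (0.8, 1.0))]

print(resolve_action(0.85, thresholds))
print(resolve_action(0.95, thresholds))
```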

Chris Detzel (12:40):

Okay. There are two other questions, but let's get going a little bit and then I'll ask those here in a minute. I promise Manuel and-

Joel Snipes (12:48):

All right. Sounds good.

Chris Detzel (12:50):

All right.

Speaker 3 (12:51):

Just one last question on that. Will we be able to see the actual value of the weight by using [auto emerge 00:13:01] or any other API?

Joel Snipes (13:05):

Yes, you will. It's not visible in the UI yet, unfortunately. So I'm going to show you an API that'll let you see that.

Chris Detzel (13:14):

Cool. Thanks, [Shool 00:13:16].

Speaker 3 (13:15):

Thank you.

Joel Snipes (13:16):

You're welcome. Yep. Still a newer feature. So the first thing I'm going to do is jump into my demo tenant here, and I am going to configure a relevance-based match rule from scratch. So you do this by clicking the chocolate bar on the top right and going to Console. When Console has loaded, we're going to go to the data modeler. I'm going to make a match rule for contacts, so I'll go to match rules here and hit create new. I'm going to call this demo rule. My scope is internal. So I only want to match within my own tenant, not within data tenants or anything like that. So I'm going to leave this set to internal, and I'm only going to consider the operational value. So I'm going to leave this checked as well. For match type, this is where you switch between the traditional binary rules and relevance. Relevance is actually the default now.

Joel Snipes (14:24):

It starts checked here, and a recommendation I would make: when you're designing rules and you're testing, do not start with automatic merge turned on. It's easier to tune and evaluate rules when they're only creating potential matches. So I'm going to create two potential match actions. I'm going to call this one, let's say, super strong matches. And this one I'll probably eventually come back to and turn into an auto match. But for now I'm going to leave it as a potential match so that I have a chance to tune things before records are actually merged and I have to go and un-merge them. And the second one I'm going to label as weak matches.

Joel Snipes (15:16):

So strong matches. I'm going to start with 85% as my threshold, and for my weak matches, I'm going to set 0.84, 84%, as my high end. And my low end, maybe let's make it 0.4. So something you'll see here is the UI will kindly tell you not to leave a gap in between the two ranges. So that's a good best practice to remember. You want your two ranges to touch, and you can actually create multiple ranges. So if you wanted to create really weak matches for some reason, let's see, let's call these super weak. You could even have a third one that would cover the range in between these two.
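
The "no gaps between ranges" check the UI performs can be approximated with a short helper. This is an illustrative sketch with a hypothetical function name, not the data modeler's actual validation. It reports any stretch of scores that falls between two configured ranges.

```python
def find_gaps(ranges):
    """Return (low, high) stretches that no configured range covers.

    `ranges` is a list of (low, high) tuples, one per action threshold.
    """
    ordered = sorted(ranges)
    gaps = []
    for (_, prev_high), (next_low, _) in zip(ordered, ordered[1:]):
        if next_low > prev_high:
            gaps.append((prev_high, next_low))
    return gaps


# 0.85-1.0 and 0.4-0.84 leave scores between 0.84 and 0.85 uncovered.
print(find_gaps([(0.85, 1.0), (0.4, 0.84)]))
# Ranges that touch at 0.8 leave no gap.
print(find_gaps([(0.8, 1.0), (0.4, 0.8)]))
```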

Joel Snipes (16:18):

There we go. But for now, I'm just going to have the two ranges. There we go. Get this back to 0.8, and I'm going to have strong matches and weak matches. So scroll down a bit further. And the next step is to choose which attributes are going to be considered. So I'm going to consider first name, last name, address line one, and zip five for my example. When matching within the US only, I like zip five as a simpler way of effectively matching on city and state. So now that I've decided on these attributes, I'm going to hit build match rule.

Joel Snipes (17:09):

And here is the starting form of my match rule. So the first step I'm going to do is select a comparator. So for first name, let's say, I'm going to make this fuzzy. There are lots of nicknames and things out in the world. So I'm going to choose fuzzy as my operator. For my match token, I'm going to choose the double metaphone match token. So this finds names that sound alike, that sort of thing. And I'll choose a double metaphone comparator to match. You always want your match token to match your comparator. And I'm going to apply the Reltio name dictionary. This checks for nickname matches, that sort of thing, which is super handy. For last name, I'm going to choose the exact match operator, and I'm going to use the exact match token and basic string comparator.

Joel Snipes (18:24):

I'm not going to apply the name dictionary to the last name. Address line one: I'm going to choose the address line match token and the address line comparator. Zip five: I'm going to use the exact match token and string comparator. All right. So I have chosen a token and a comparator for all of my attributes now, and I'm ready to build my match rule. So let's go ahead and save it.

Joel Snipes (19:07):

So if I go back to my tenant and search for match rules, I will not find my demo rule evaluating anything there. My demo rule didn't pop up with any potential matches. And that is because I have not rebuilt my match table. So a good rule to remember: anytime you make a change to your match rules, go ahead and rebuild your match table. You can get there from the tenant management pane in the console, and there's even a link right here to take you there even more quickly. So let me kick off this job. I'm going to make it distributed with a few parts, so it processes a little quicker, and I'm going to apply it only to contact, because that's where I made my match rule. All right. This should take just a few minutes. This would be a good time for a question while we wait.

Chris Detzel (20:11):

Oh, yep. So two questions. Is it possible to have two systems providing different sets of fields? Like system one: first name, last name, suffix, age. And system number two: first name, last name, height. Can we use relevance-based matching in that way?

Joel Snipes (20:28):

Yeah. So if one system doesn't supply all of the attributes, Relevance Based Matching could still provide a match where binary would not. So that would be a cool scenario for that.

Chris Detzel (20:41):

Okay. And the last question is, what happens when you give the wrong comparator for a token class? Does it only affect performance, or can mismatches occur on the records?

Joel Snipes (20:55):

Yeah, I think with the wrong comparator, you're most likely going to under match, because they're not going to be in alignment. You're going to get mismatches, and you definitely could get some poor performance with bad tokenization, that sort of thing. While we're waiting, there is a really good document on comparators and token classes. Let's see if I can find it on the fly.

Joel Snipes (21:35):

Yeah. I think this is it. Yeah. And so this is a table of the different comparators that are available. And in the notes here, it'll tell you which token class or classes work best with each of these. So the basic string comparator works really well with the exact match token. We're just trying to see if two strings are similar. But a fuzzy comparator could use, I can never pronounce this, Levenshtein distance, or could use the metaphone, or the double metaphone, or the soundex. There are all sorts of options when you go that route. So it's pretty difficult to memorize. But if you use this document, it's pretty helpful for keeping track of what works together. I'm going to put this in chat real quick.

Chris Detzel (22:25):

Yeah. Cool. And I'll make sure that gets shared. We do have one more question, maybe two, but at least one for now. So Manuel asked, can we assign a variable weight to a fuzzy match? So depending on how much of the string is matched.

Joel Snipes (22:40):

Good question. So it does not work that way yet, but it would be awesome if it did. So it's still going to evaluate whether there was a fuzzy match or not, and then apply that weighting. So if it did fuzzy match, you get the weighting. If it didn't, you get zero. That would be a great opportunity for an enhancement, where it would have a dynamic weighting based on how tight that fuzzy match was.

Chris Detzel (23:11):

Manuel, I'll put this in the chat here in a bit, but go to the ideation portal and submit your idea. And then what I would do, if I were you, is post that on the community and get a bunch of other people to vote for it. Is it ready? I just want to make sure, because there's a question or two already in there. So if it's not-

Joel Snipes (23:34):

Oh, just finished.

Chris Detzel (23:36):

Okay.

Joel Snipes (23:37):

It's up to you, whether you want to take another question.

Chris Detzel (23:41):

So [Shatun 00:23:43], I don't know if this is an additional question, but he does say it's from a previous question, but by default, when we have a match rule, can we restrict the match to match with only one entity? So for example, I have five entities and I need to match only with two entities leaving the other three entities aside.

Joel Snipes (24:04):

That's a tricky one. So you only want it to ever merge once. So the way you would prevent that from happening, if it's possible, would be to use something like equals. I don't think there's a count on crosswalks, because you would need a count on crosswalks, and you would say, "Hey, if it already has more than one crosswalk, don't merge again." And I'm not sure if that would be possible. That would be a good question for the community, because somebody's going to have to think about that and then play with it a little bit to see if they can get it working.

Chris Detzel (24:46):

So check down. I was, it's funny. I was going to say that Joel is maybe post that particular question on the community and then we can see if either Joel or somebody else could go in and just play with it and mess with it and maybe give you a better answer if that makes sense. And that'd be good for the contest starting tomorrow. So if you haven't heard about it, I'll say it at the end. Keep going, Joel. Thanks.

Joel Snipes (25:08):

All right. So in the spirit of live demos, let's see if my match rule evaluated correctly. All right, I'm going to look for match rules. Demo rule, look at that. I have 817 records that the demo rule matched. So let's look at Pat Murphy here. All right. And I'm going to go to the potential match view. This is the new match view that's recently come out. I really like to click this show comparison view. It makes it feel more like the old one. And we can see that these two records are incredibly similar. That's because I loaded the same data set twice, and it evaluated as a match on the demo rule.

Joel Snipes (26:06):

What is conspicuously missing is what we set up in the data modeler. When we configured the demo rule, I wanted to know if this is a super strong match or a weak match, and it does not show here. The way you will find out if it is a super strong or weak match while you're tuning your match rules is to grab the URL here at the top, open Postman, grab a fresh token, and use the entities API followed by the ID of your record, followed by slash underscore matches. I'm going to just put this in chat, but it's just the entities API followed by underscore matches. So you probably have this in your Postman collection already.
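
As a rough sketch of the request described here, the helper below builds the "entities, then record ID, then `_matches`" path. Only that path shape comes from the talk. The base URL layout, the tenant segment, and the example values are placeholder assumptions, so check them against your own environment and the API documentation before use.

```python
from urllib.parse import quote


def matches_url(base_url, tenant_id, entity_id):
    """Build the URL for an entity's matches: .../entities/<id>/_matches.

    `base_url` and `tenant_id` are hypothetical placeholders for illustration.
    """
    base = base_url.rstrip("/")
    return f"{base}/{quote(tenant_id)}/entities/{quote(entity_id)}/_matches"


url = matches_url("https://example.invalid/reltio/api/", "myTenant", "abc123")
print(url)
```

The request itself would then be sent with a fresh access token, as in the Postman walkthrough, and the response inspected for the relevance score and match action label.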

Joel Snipes (27:02):

So I can see there is one object under my demo match rule group. We can see a copy of the record in here, and at the very bottom here, we can see it got a relevance score of 100%, or 1.0. And that's because I loaded the same data set twice. These are exact duplicates, and the match action label, that is tricky to say, came out as super strong. So if it had evaluated differently, it might have come out as a weak match, but this is how you check the score. I think eventually this will probably be in the UI so you can see what the evaluation came to without having to do this. But for now, this is unfortunately what we have to do.

Joel Snipes (28:00):

And one other way to look at a match and evaluate it, and this is a tip I gave out in the previous demo, is the entities API's verify matches API. And this one will show you, attribute by attribute, how it was evaluated. Instead of passing the URI as a parameter, you pass it as a part of the body. So I'm going to grab the URI of my target record and the match trigger. Hopefully this text isn't too small. I'm going to try and zoom in a little bit and see if... Okay, Postman doesn't support zoom, does it?

Joel Snipes (28:50):

I can move this to the browser, that'll help. Here we go. So this is the verify matches API, and it shows all the rules that it matched under. So these are all turned off for now, but the match table's maintained, so they're still coming up. But here's the demo rule I just set up. We can see all the tokens generated by this entity and by the other entity, and the tokens that overlapped. We see the relevance score attribute by attribute. So at the highest level, it's a one; across this grouping, it's a one.

Joel Snipes (29:45):

And then each individual attribute is a one. So you can see how it rolls up from the individual attribute, to the group, to the rule overall, to see how your relevance score was calculated and what outcome came about. We see the match action is a potential match. So the matches API and the verify matches API, both well documented in the doc portal, are very handy for this kind of thing. I'm going to throw a copy of this in the chat, just in case anyone wants it. All right.

Joel Snipes (30:32):

So now that I have my potential matches for relevance-based matching, something I want to show that might be a cool prelude to my upcoming webinar is how to do the potential match workflow. So I have these 817 potential matches, and I'm going to segment them somehow. I'm going to segment by city, say, and I'm going to choose Detroit. So Detroit is a target market for my company, and I want my data stewards to evaluate the matches in Detroit first.

Joel Snipes (31:12):

So I have my 24 potential matches for my demo rule. And I want to queue these up to be evaluated. The way you would do that is, once you have the entities in your search perspective that you want to target, you click this dot, dot, dot over here in the top right and request match review. Now what that's going to do is queue all of these up for a workflow, and you'll get a notification at the bottom when it's done. So match reviews were successfully submitted.

Joel Snipes (31:46):

And because I'm a one-man show, I'm going to submit the workflows and also be the person in charge of processing them. So this is my inbox. And as a data steward, I could click one of these and begin evaluating. So it nicely puts the names, the addresses, all the key attributes right next to each other. Looking here at this Randy Mot, I see that these two people seem to have the exact same data, down to the dots and T's. So I'm going to make a decision by clicking the task action and decide to merge these. So I have taken you from building a relevance-based rule, to evaluating the effectiveness of your rule with the APIs, to queuing up a workflow for your data steward, and finally resulting in a few matches at the end of the day. So that is a full life cycle of developing a Relevance Based Match rule. I think now might be a good time to take a couple more questions.

Chris Detzel (33:03):

Great. We'll have some, so I'm not sure if you'll be able to answer this question or not, but is match score not planned to become visible in the UI in the next release?

Joel Snipes (33:17):

That is a great question that I should have looked up before this demo, but I'm afraid I'm not sure. I'd have to go check the release docs.

Chris Detzel (33:24):

Okay. And then by the way, we have a community show coming up on the release, I think on Tuesday. So if you check out the community and the events section, you'll see you can RSVP for that. And that's a good question to ask there. So I know that Gino answered this, but I want maybe a more detailed answer. So let's say I have a profile with two different persons matched due to a bad match rule. Now I'm changing the survivorship rule, and in turn the OV value gets changed. Is matching going to recapture or rematch the profile with the new OV values against the rest of the profiles?

Joel Snipes (34:11):

And the answer to that will depend on whether this flag is checked. So if you match by operational value, it's only going to consider the OV, so a change in survivorship could result in a change of OV, which could result in a change in matches. If this were disabled, all the crosswalks would be considered anyway, so there'd be no change in the outcomes. I think most of my customers stick with this. It's a good best practice, because turning this off uses a lot of resources and slows down your matching. You have to maintain a lot more tokens. So I'd say in most cases, updating your survivorship could change your potential matches.

Chris Detzel (35:00):

Okay. And it looks like, I think, Gino answered this one. Is weight mandatory?

Joel Snipes (35:08):

No. So that's a good question. It defaults to one, and you can't even maintain it in the default rule builder. But if you go to the advanced editor, you can see the JSON. So you don't actually even have to pull the L3. And I had some sample code, let's see. If you go to the advanced editor, you can update the JSON for the rule right here and add the weights. So that's something I meant to show you that I didn't actually get to. So let me do that real quick.

Chris Detzel (35:43):

I just thought you were going to have a short demo.

Joel Snipes (35:53):

So in this example, I want to weight the address heavier than the name for some reason. So I set both the weights of the names to 0.8, whereas the address is undefined, which makes it a 1.0 by default. So if I add these weights in the advanced editor, though, you'll notice that I cannot go back to the rules builder anymore. So you want to figure out what attributes you want to consider and get all that sorted out first, before you go adding in the weights, because you're not going to be able to go back once you do, unless you remove the weights, of course. I can take this out, and now I can go back to the pretty UI editor.

Chris Detzel (36:35):

Great. Some more questions here. Angie, you have to make me read all of this, but I'll try. So recently we added a rule and will begin testing. So auto is 0.85 to 1, 0.75 to 0.85 is suspect review one, and 0.60 to 0.75 is suspect review two. If I understand correctly, these three buckets are not really visible. It will go either to auto merge or to suspect, and which threshold it fell into is not visible on the review. So, for example, I cannot assign only the review one bucket to a steward. Can you reply to that?

Joel Snipes (37:18):

Yeah, that's a great question and I know exactly... I think I know where your head's at there. If I had, let me go to an example with multiple potential matches actually. So I'm going to do that by-

Chris Detzel (37:33):

By the way, Gino's on a roll, I'm just scrolling here and you start to answer some of them. So thanks.

Joel Snipes (37:45):

So Craig Berry here has three potential matches. Now, if I were a data steward, I would love to have these matches ranked in order of their relevance score. Why would I evaluate the 0.6 match before I evaluate the 0.9? The 0.9 is the better match. But right now in the UI, there is nowhere it displays that relevance score, nor does it sort by it. So I would definitely go back to that upcoming release. There might be a feature enhancement coming here, and another consideration would be a custom workflow. So you might be able to configure a custom workflow in such a way that it assigns certain buckets to certain data stewards. There's nothing out of the box for that right now, but it's definitely a good candidate for an enhancement request.
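
Until the UI sorts by score, one workaround is to rank the API results yourself. The sketch below assumes you have already pulled each potential match into a simple dictionary. The field names `relevance` and `label` and the sample values are hypothetical stand-ins, not the actual response format.

```python
def prioritize(matches):
    """Order potential matches so the highest relevance score is reviewed first."""
    return sorted(matches, key=lambda m: m["relevance"], reverse=True)


# Made-up sample data for illustration only.
potential_matches = [
    {"entity": "A", "relevance": 0.60, "label": "weak matches"},
    {"entity": "B", "relevance": 0.90, "label": "super strong matches"},
    {"entity": "C", "relevance": 0.75, "label": "weak matches"},
]

for m in prioritize(potential_matches):
    print(m["entity"], m["relevance"], m["label"])
```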

Gino (38:53):

Hey Joel, this is Gino. A quick question for you. You're looking at the early access view of the potential matches. If you were in the current GA view, would it show the score there?

Joel Snipes (39:11):

It will not. That's actually why I was using the early access one. I was hoping it would be in here. And then I figured I should just get used to it. I'll double check.

Gino (39:20):

The other idea I had on Angie's request was that, although you can't do it by the type, if you knew that the score was between 75 and 85, right. In the facet that you were in before, where you can look at the match scores, you can just filter on the 75 and 85, and that would have a similar effect, right? That's the number, there's a match score facet.

Joel Snipes (39:56):

Oh man, I'm learning something on my own webinar. Look at that.

Chris Detzel (40:00):

Community show.

Joel Snipes (40:01):

A community show. Sorry.

Gino (40:02):

I think it's already very restricted, because you have the number of potential matches.

Joel Snipes (40:06):

Oh you're right.

Gino (40:07):

Maybe it's because of the rule. Maybe there's nothing in there with that rule. I don't know.

Joel Snipes (40:16):

I think this might be the match score from the binary way, where they have the base match.

Gino (40:23):

Got it.

Joel Snipes (40:24):

I think it might be that one.

Gino (40:25):

Okay.

Joel Snipes (40:27):

But that would've, yeah... So the binary match has a different way of weighting the quality of a match. And I don't think they're in the same facet yet.

Gino (40:40):

Okay. Sorry. Sorry, Angie. I thought that would've done it, but apparently it's not.

Chris Detzel (40:48):

So some more questions for you, Joel. Are the matching rules by source system? So please assume data coming from different sources, having different sets of attributes.

Joel Snipes (41:02):

So it doesn't consider the different source systems. So if you have an incomplete data set from one system, in the old way of matching you'd have to configure a rule, or a few rules, specifically for that. You'd have to consider which attributes were available and make a rule to bring that set into the fold. With Relevance Based Matching, as long as enough of the attributes that are in a match rule are populated by the data set that it can get over that threshold, it's possible that it can match even with a, I guess, incomplete set.

Joel Snipes (41:36):

If first name wasn't included, last name and address only, it's possible that with last name and address, we could get a 0.8 and still end up in the potential match range. Whereas with the binary match rule, if first name was missing and we weren't using exact [inaudible 00:41:58], it would always evaluate to false. So you should definitely consider these Relevance Based rules if you have sources with really limited data sets.
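
The difference Joel describes can be sketched with a toy scorer. This is a simplified illustration, not how Reltio computes relevance internally: the weights, the 0.75 threshold, the exact-equality comparison, and the function names are all assumptions chosen so that last name plus address lands on the 0.8 from his example.

```python
# Hypothetical weights for a three-attribute rule (they sum to 1.0).
WEIGHTS = {"first_name": 0.2, "last_name": 0.4, "address": 0.4}

# Hypothetical lower bound of the potential match range.
POTENTIAL_MATCH_THRESHOLD = 0.75


def attribute_matches(a: dict, b: dict, attr: str) -> bool:
    """An attribute counts only if both records populate it with the same value."""
    return a.get(attr) is not None and a.get(attr) == b.get(attr)


def relevance_score(a: dict, b: dict) -> float:
    """Sum the weights of the attributes that match; missing ones add nothing."""
    return sum(w for attr, w in WEIGHTS.items() if attribute_matches(a, b, attr))


def binary_match(a: dict, b: dict) -> bool:
    """Binary style: every operand must be true, so one missing value fails the rule."""
    return all(attribute_matches(a, b, attr) for attr in WEIGHTS)


if __name__ == "__main__":
    # Made-up records; the second source does not send first name.
    full = {"first_name": "Craig", "last_name": "Berry", "address": "1 Main St"}
    sparse = {"last_name": "Berry", "address": "1 Main St"}

    score = relevance_score(full, sparse)
    print(score, score >= POTENTIAL_MATCH_THRESHOLD)  # still a potential match
    print(binary_match(full, sparse))                 # binary rule fails
```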

Chris Detzel (42:11):

Great. On the potential matches screen, is there a way to filter by specific match rules so others are excluded?

Joel Snipes (42:23):

So when you're on... You can filter by match rules in the advanced search to choose which records come up. I only have one match rule turned on in this tenant right now, but for example, if these two records matched on two or three different rules, all two or three rules would show here under this and say, "Hey, it matched on three rules, not just this one." But there isn't a way to filter, like, can I just see this particular rule for now? That's not a feature; that would fall in enhancement request territory for now.

Chris Detzel (43:04):

Great. Also, what's a good roadmap or best practice to transition from binary to relevance and see the impact of relevance matching that demonstrates an improvement over what we're seeing now.

Joel Snipes (43:18):

Yeah. So the first thing I would do is I would look for rules that evaluate to a potential match and auto match that are very similar and look for a way to combine those. You're going to be managing half as many tokens that way, and you're going to have better performance. So that would be the first thing I would look for, where I can turn two rules into one. The next thing I would look for would be rules that are similar except for one attribute. So there are a lot of pseudo identifiers in the world like email, or phone number that almost perfectly identify individuals and a common best practice before was to build different rules around different identifiers.

Joel Snipes (44:08):

With relevance based matching, you could consider all the pseudo identifiers at once and just weight them and consolidate there. And another great way (I'm going to go off script here) to look into how to build a good relevance based rule is Match IQ, which builds relevance based rules by default. So let's see if this is turned on for my tenant, but you can start training a model for a relevance based rule. And this could be its own little community show, but pick a few attributes and have the model decide how to weight and balance out the rule for you. And then once the model's built its rule, spend some time to evaluate how that AI rule is performing compared to your existing rules. And maybe you can replace them.

Chris Detzel (45:11):

That's a nice little roadmap. I like that. Great question, Sandro. So Mark says, and I think Steven answered this, but let's make sure. Can match rules specify filter criteria? So role only valid for domestic versus international.

Joel Snipes (45:28):

Yes. That's a great question. I'm going to jump to the docs here to explain what you can do with that. So with binary rules, you can use the equals filter, which is effectively like a WHERE clause in a SQL statement to say where, where equals, where the role is international. And then the rule will only evaluate international entities. So with relevance based matching, you get two options. There is the strict constraint and the weighted constraint. The strict constraint works the same way as the binary one, where it just filters out. If you want to only consider international entities, you set equals type to international, and we are only going to consider international entities.

Joel Snipes (46:25):

The second option is weighted. And if you take a weighted constraint, it is treated just like a match attribute. So if it's international, that helps its case for being a match, but it doesn't have to be international to be a match. It just adds its weighted score to the total calculation. And it's just one factor among many. So when you're doing relevance based rules, you have to think: am I only trying to evaluate these international ones by this rule? Then I want strict. Or is international just one factor of many on whether this should be a match? Then you'd want to go the weighted route.
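
The two constraint styles can be contrasted in a small sketch. It only mirrors the behavior described in this answer (strict filters pairs out, weighted adds to the score); the function signature, the parameter names, and the weights are assumptions for illustration and do not reflect Reltio's match rule configuration syntax.

```python
from typing import Optional


def score_pair(a: dict, b: dict, weights: dict,
               strict: Optional[dict] = None,
               weighted: Optional[dict] = None) -> Optional[float]:
    """Score a pair of records under hypothetical strict/weighted constraints.

    strict:   {attr: required_value}; acts like a filter. If either record
              fails it, the rule does not evaluate the pair (returns None).
    weighted: {attr: (required_value, weight)}; adds its weight when both
              records satisfy it, and is simply ignored otherwise.
    """
    for attr, required in (strict or {}).items():
        if a.get(attr) != required or b.get(attr) != required:
            return None  # filtered out, like a WHERE clause

    # Ordinary match attributes: add the weight when both values agree.
    score = 0.0
    for attr, weight in weights.items():
        if a.get(attr) is not None and a.get(attr) == b.get(attr):
            score += weight

    # Weighted constraints: one more factor in the total, never a gate.
    for attr, (required, weight) in (weighted or {}).items():
        if a.get(attr) == required and b.get(attr) == required:
            score += weight
    return score


if __name__ == "__main__":
    # Made-up records and weights.
    intl = {"name": "Acme", "type": "international"}
    dom = {"name": "Acme", "type": "domestic"}
    w = {"name": 0.75}

    print(score_pair(intl, intl, w, strict={"type": "international"}))
    print(score_pair(dom, dom, w, strict={"type": "international"}))
    print(score_pair(intl, intl, w, weighted={"type": ("international", 0.25)}))
    print(score_pair(dom, dom, w, weighted={"type": ("international", 0.25)}))
```

With the strict constraint the domestic pair is never scored; with the weighted constraint it is scored, just without the extra 0.25.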

Chris Detzel (47:12):

Thanks, Joel. Newman says one key enabler would be to have our matching scoring available on the UI like B match. So it will simplify things for stewards. I think that's just a comment. And then thinking aloud, say the master data has two email ID fields. So if a source system sends one email ID, can it try to match with either of the emails?

Joel Snipes (47:41):

So most Reltio models have an email nest and you might have two emails within that nest. You can absolutely match across the many to many emails from [ancient 00:47:55] entity. Now, if you have two entirely separate email attributes, right, you have one called work email, and one called home email or something. And they're simple attributes, they're different attributes. I don't think you can match across different attributes at this time.

Chris Detzel (48:15):

That would probably be a good, sounds like another, ask that exact same question on the community. And then maybe somebody can go in and Joel or somebody else and go in and test it out and see, and then have a better response so that we, if that makes sense. Does that help Joel? I mean, would that be something you could potentially do later or did you pretty much know that's not going to work?

Joel Snipes (48:41):

That would be wise. Yeah. I think that would be a good question to put, because I think you want to consider not just, can you do it, but should you do it? Maybe you want to reconsider your data model a little bit, but I'd be interested to see what constraints brought you to that data modeling decision to begin with, too. Lot to consider.

Chris Detzel (49:01):

Yeah. So can you mix binary and relevance rules and how does Reltio handle the behavior? So does it apply binary rules first, then the relevance rules, which would be applied to remaining entities that weren't already on a binary rule? Does that make sense?

Joel Snipes (49:22):

Yeah. So you can have both in the same tenant and they aren't applied in any particular order. So if one evaluates to an auto match before the other, they're a match. If both of them evaluate to a potential match, they'll both show a potential match on the screen. No preference, or order, or anything like that.

Chris Detzel (49:47):

Cool. I really liked how you went into our docs and showed people stuff there. I think you seem to use it a lot and know exactly where stuff is.

Joel Snipes (49:58):

I spend too much time in the docs.

Chris Detzel (50:00):

That's probably good. So one question is: we are evaluating a relevance match rule for a new source. How can we get the relevance match score for all source records without going through a verify match API call, the underscore verify match API call? Is there a way?

Joel Snipes (50:25):

That's going to be a good question, I don't know, off the top of my head, but-

Chris Detzel (50:28):

Okay.

Joel Snipes (50:28):

That's a cool use case, and I would like to know the answer to that too.

Chris Detzel (50:33):

Yeah. So ask that on community. This way we can find the right people to answer and even if Joel wants to go in. Sandro man, like usual, he's throwing out the questions, love it. If binary identified potential match, but relevance reconciles to auto match, I'm assuming it will auto match. So might be a good way to transition as the goal is to get more auto matches and reduce the potential match pool.

Joel Snipes (51:02):

That's right, Sandro. Yeah. As long as one rule evaluates to auto, that record will auto merge. So that's absolutely right.
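
The mixing behavior from the last two answers (no ordering between binary and relevance rules, and any single auto result wins) can be summarized in a short sketch. The outcome labels and the function name are assumptions for illustration, not Reltio identifiers.

```python
def combined_outcome(rule_outcomes: list) -> str:
    """Combine per-rule results for one pair of records.

    Order does not matter: if any rule (binary or relevance) says auto merge,
    the pair merges; otherwise any potential match keeps it in the review pool.
    """
    if "auto_merge" in rule_outcomes:
        return "auto_merge"
    if "potential_match" in rule_outcomes:
        return "potential_match"
    return "no_match"


if __name__ == "__main__":
    # Binary rule says potential, relevance rule says auto: the pair merges.
    print(combined_outcome(["potential_match", "auto_merge"]))
    # Same results in the other order give the same answer.
    print(combined_outcome(["auto_merge", "potential_match"]))
```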

Chris Detzel (51:12):

Great. I think that's all the questions. If you stop sharing your screen, by the way, tomorrow, we do have another community webinar, certainly excited about that. See my screen, Joel, is that right?

Joel Snipes (51:27):

Yes.

Chris Detzel (51:27):

Thank you. Great. So tomorrow's community webinar is... Sorry. Community show is introduction to survivorship strategies, rules, and operational values, OVCL. That one is going to be really exciting. Different folks are going to be on that. I bet Joel shows up to that, because he's probably highly interested in it too. The nice thing about these is our employees learn too. Remember tomorrow, and I'll show this tomorrow and I'll even show the link to where it's at. We're giving away some swag and there'll be some rules of the road on community.reltio.com. One caveat, if you're from India or Russia, I cannot ship those out yet on that, but we will have some options still for you so that you can participate.

Chris Detzel (52:20):

Joel, once again, thank you so much for coming and bringing some value to the community. Really appreciate it. Hopefully you guys enjoyed this. Put in the chat what you thought. And I like to always get good or potential feedback that we need to make these better. But Joel, this was good. Lots of great questions, lots of participation. And this will be recorded and my hope is posted by tomorrow, but for sure by next week. Aaron says, "this is very informative, seems like a great new set of tools, great session and great capability." Thank you everyone. And we'll see you in the community and I'll see most of you or some of you tomorrow. Thank you again. I'm going to stay here for a minute or two, but welcome to the new year and the new Reltio episode community shows and we'll have some more episodes starting tomorrow. As a matter of fact, tomorrow starts the first part of the series. So I'm trying to think about Netflix type talk. So Joel, I've got to get in that mindset too. This is great. Really enjoyed it. I mean a lot of fun.

Joel Snipes (53:36):

Thank you Chris.

John Caiafa (53:36):

I appreciate you. Everybody appreciates you, pulling everything together. Thank you very much.

Chris Detzel (53:41):

No worries. I was telling Joel earlier today, I'm a little nervous, I haven't done this in a month. So it's great to be back and lots of great content coming. I mean.

#Matching
#Merging
#Relevancebasedmatching
#CommunityWebinar
