Transform Dirty Data- Reltio Cleansers 101 - Show

By Chris Detzel posted 03-11-2022 16:27

  


Reltio Connected Cloud deals with a large amount of data that accumulates on a daily basis. The data is not considered useful if you cannot make good use of it. A common approach to deal with large volumes of data is to regularly perform data cleansing and data standardization.

Join Ashley Branham, Configuration Specialist & Kim Toomey, Solutions Consultant to learn more about Reltio’s data cleansing and standardization tools used by customers to improve overall data quality throughout their organizations.

#cleanser
#Reltiocleanser
#communitywebinar

Transcript:
Chris Detzel (00:05):
Welcome, everyone. My name is Chris Detzel and I'm a director of customer community and engagement. I'm seeing a lot of long timers, so you pretty much know who I am. And we also have Kim Toomey, solutions consultant, and Ashley Branham, senior technical solution specialist.

Chris Detzel (00:22):
Today's topic on today's community show is Transform Dirty Data-Reltio Cleansers 101, so I'm certainly excited. Please keep yourself on mute, ask questions in the chat. I'll make sure that they get answers. I have a feeling that this will be a lively conversation today. The show will be recorded and shared out to the Reltio community probably by tomorrow, but for sure by next week.

Chris Detzel (00:50):
And then, a couple of shows that are coming up. One is today's show on Reltio cleansers. On the 23rd, we have a show on Enrich Your Data with the BvD Connector. And then, something new we have and it's not necessarily for everyone, but if you're interested, we do have a Reltio... it's kind of a content CAB that is really focused in on our documentation. If you want to be a part of that, we are asking for some folks to join that community. And then, I think we're looking for two to three more folks to join, to have a say in some of the documentation we have and whether it's looking good, because we're changing some of that.

Chris Detzel (01:36):
And then, lastly, on the 14th, we're having an Ask Me Anything with Manish Sood, founder and CTO. We'll have swag and some fun things there, more to come on that. And I'm also looking at a couple of other community shows, for sure, in April, just haven't got those on the calendar yet.

Chris Detzel (02:00):
I'm going to stop sharing, Kim, and I'm going to let you share. So, Kim, take it away.

Kim Toomey (02:05):
Yeah. Thank you, Chris. Let's get this pulled up here. So, as Chris mentioned, I'm on our solutions consulting team here, as well as Ashley. Ashley and I are going to walk through some of the Reltio cleansers that are available in the product today. We're going to start with a few slides, and then take you into the product as well. And as Chris mentioned, please feel free to ask questions, keep this interactive, and engaging, and look forward to the conversation this morning.

Kim Toomey (02:38):
So, just to level set a few things that we're going to cover, in general, data quality within MDM, and how do we think about that as an organization? What type of cleansers we have out-of-the-box with Reltio? And so, we're going to deep dive into things like our address cleansing, standardization processes, phone and email cleansers, and then string cleansers, and some additional considerations as well.

Kim Toomey (03:06):
Really, when we start talking about cleansing your data in general, Reltio, and kind of in the MDM space in general, we understand that the foundation of that data landscape is really reliant on having highly accurate data, and your MDM solution really should be uniquely positioned to solve and correct any of that dirty data that might be entering your ecosystem from some of those downstream sources. MDM, really, as an aggregator of all of the critical business data is intended to make sure that once we're reconciling everything, you're making the best business decisions that you can by cleaning up data or pieces of data that might have been entered incorrectly from a user on a web form or something in... maybe your customer success organization when somebody calls in. So, that's really what we're focused on today, is one of those foundational elements here when we're talking about big data.

Kim Toomey (04:11):
Within Reltio, we have a few ways that we think about cleaning up some of that data. The first is cleansing. This is really identifying that incorrect, irrelevant, or incomplete data, and then actually replacing or modifying that where it's appropriate. And we're also going to take a look even kind of in parallel at some of our new data quality dashboards within the product, that might help you identify some of those pieces, so that as a data steward, you can either go into Reltio directly and make those changes, or identify different types of cleansers that you might need to implement, or even look at moving downstream in the process to say, "How do we stop or prevent some of this from happening in the first place?"

Kim Toomey (04:59):
The next component that we tend to look at is, how can we standardize pieces of that data to a standard format that you, as the customer, get to define, one that's going to follow a certain format and rules for consistency that, again, you can leverage throughout your organization, and Reltio can help you with.

Kim Toomey (05:18):
And then, lastly, is this concept of also enriching some of that data. Specifically, when we look at some of our out-of-the-box cleansers for your address data, phone numbers, and email addresses, we're not just kind of cleansing and standardizing, but as you'll see in the product, we're really enriching those fields with different values, whether that's pulling out of our third party database, which is powered by Loqate for address cleansing and pulling in things like latitude and longitude on an address, or parsing out different fields in an email address. All of this, which again we'll take a look at in the product, becomes new attributes that you can use when you're talking about matching or, again, looking at sending data into different systems, or wanting to aggregate customers based on something like an email domain. All of this becomes much easier and standardized within the Reltio product out-of-the-box.

Kim Toomey (06:22):
Just a little snapshot. And again, we'll look at the slide as well, but this is our new data quality dashboard, which is going to allow you to drill down into some of those unique attributes that are populated by these cleansers. For example, we can look at things like a verification status on an address, and start to make decisions based on those values. Again, we'll take a deep dive into what that really looks like in the product, but just a little teaser to get you thinking and looking at some of that this morning.

Kim Toomey (06:59):
All right. Let's take a look at some of those native cleansers within Reltio. Again, just a few more slides, and then we'll kind of move into our live product demo here. First and foremost, and probably the most robust kind of out-of-the-box cleanser that we have is our address validation and cleansing. This is available globally. All of our customers out-of-the-box are going to get US non-CASS addresses, and of course you can subscribe to a global subscription or anything in between as your business needs that kind of third party location data to be enriched on the fly. And so, essentially, what's going to happen when you input any address data into the system, whether you're sending that in bulk or one-offs in real time, is that on the fly, the Reltio system is going to cleanse and standardize those addresses.

Kim Toomey (07:56):
And when we talk about cleansing, we are truly, really hard coding that new address into Reltio, so that when we think about the actual kind of data lineage of everything that's going to happen to that entity as soon as it is sent into Reltio, we're going to cleanse, and then you move into a potential match and merge process. You do want to really be using that highly cleansed and accurate data to make significantly better matching decisions. Whether you're talking about an organization or a contact or specific locations, you're now working off that same set of data to make kind of those better merge decisions.

Kim Toomey (08:39):
And you don't have to just use a complete address, you can use any of those individual components that get parsed out as new individual attributes on that address field. One other kind of key thing to note here is, there are APIs available to leverage this feature outside of Reltio. So, while this is native to what you're seeing, when you're interacting with Reltio and sending data in, just know that we do have customers that are using that API, that's available to do address cleansing and standardization, not directly within Reltio, but in other applications as well.

Kim Toomey (09:25):
Ashley, feel free to chime in with any details or color commentary as you feel.

Chris Detzel (09:32):
I have a couple of questions already.

Kim Toomey (09:33):
Go for it.

Chris Detzel (09:35):
Are the original address and validated address persisted in the data layer and exposed?

Kim Toomey (09:45):
[crosstalk 00:09:45] Go ahead, Ashley. Yeah.

Ashley Branham (09:46):
I can take that one. So, yes, both the original input and the Loqate output are going to be stored in Cassandra, and you'll actually be able to see both of them on the source screen too. And so, you'll have complete visibility on where the original one originated from, as well as the value that came back from Loqate.

Chris Detzel (10:03):
Great. How does this work if someone has multiple addresses such as summer home, and how will that impact the data quality?

Kim Toomey (10:16):
Yeah, absolutely. By default, every attribute in Reltio is multi-value. Meaning, you can store multiple addresses on an entity record. Every address, if you have a primary residence, or a secondary residence, or shipping and billing addresses, they're all going to be cleansed based on the input that we've seen, and then you can store, and again, we'll see this in the UI, essentially an address type, if you will.

Kim Toomey (10:46):
We can store each of those address values, they're all going to get cleansed and standardized, and then remain as those nested attributes, storing multiple values on the profile.

Chris Detzel (11:00):
Great. Thanks, Kim and Ashley.

Kim Toomey (11:02):
Yeah, absolutely.

Kim Toomey (11:07):
The next cleanser out-of-the-box that we'll take a look at is our phone cleanse, and essentially, similarly to Loqate, we're going to parse out different pieces of the phone number that's been sent into Reltio and enrich that attribute with additional values, like what line type it is: is it a toll-free number, is it a fixed or mobile line? There's also information about the geo code based on where the area code is for that phone number, and things about your formatting for that phone number.

Kim Toomey (11:43):
Again, as we mentioned with addresses, these are all multi-value, so you can store multiple phone numbers on a record, just the same as an address, or really any other attribute within Reltio. And again, when we take a look in the product, these are going to be stored as nested values, so you have access to all of these different components of your phone number. Should you want to use those in matching, you have a lot of flexibility there as well.
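To make the idea of parsing a phone number into nested values concrete, here is a minimal Python sketch. It is not Reltio's phone cleanser: it only handles US/Canada (NANP) numbers, the sub-attribute names such as `AreaCode` and `LineType` are hypothetical, and it can only recognize toll-free numbers, not distinguish fixed from mobile lines.

```python
import re

# Toll-free area codes in the North American Numbering Plan.
TOLL_FREE_AREA_CODES = {"800", "833", "844", "855", "866", "877", "888"}


def parse_nanp_phone(raw: str) -> dict:
    """Split a US/Canada phone number into nested-style sub-attributes.

    The returned keys are illustrative names, not Reltio's data model.
    """
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        return {"Number": raw, "ValidationStatus": "INVALID"}

    area, exchange, line = digits[:3], digits[3:6], digits[6:]
    return {
        "Number": raw,  # original input is kept alongside the parsed parts
        "ValidationStatus": "VALID",
        "CountryCode": "1",
        "AreaCode": area,
        "LocalNumber": exchange + line,
        "FormattedNumber": f"({area}) {exchange}-{line}",
        "LineType": "TOLL_FREE" if area in TOLL_FREE_AREA_CODES else "UNKNOWN",
    }
```

Because each parsed piece is its own field, something like `AreaCode` could be used on its own in a match rule without touching the raw input.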

Kim Toomey (12:17):
And then, lastly, our email cleanse. This works essentially the same way as we're parsing out that data, so that you get information about the domain, the username, is it a private domain? Is it the Gmails and the Hotmails of the world? So, as a data steward and as a business, you have a lot of flexibility in how you want to interact with the data that you've been collecting for those entity records. One quick note about email cleanse that comes up a lot: this out-of-the-box does not do deliverability. Is this a valid email inbox that I can deliver to? We can work with you if there are ways that you want to accomplish that in other third party tools, but the out-of-the-box cleanser is really looking at parsing and a valid format of that email address.
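The distinction between format validation and deliverability can be sketched in a few lines of Python. This is an illustration only, not the Reltio email cleanser: the output keys, the short list of public domains, and the deliberately simple format pattern are all assumptions. Like the cleanser described above, it never contacts a mail server, so it says nothing about whether the inbox exists.

```python
import re

# Hypothetical list of public mailbox providers; a real list would be longer.
PUBLIC_DOMAINS = {"gmail.com", "hotmail.com", "yahoo.com", "outlook.com"}

# Deliberately simple format check: exactly one "@", no whitespace,
# and at least one dot in the domain part.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def parse_email(raw: str) -> dict:
    """Check the format of an email address and split it into parts."""
    email = raw.strip().lower()
    if not EMAIL_PATTERN.match(email):
        return {"Email": raw, "ValidationStatus": "INVALID"}

    username, domain = email.split("@")
    return {
        "Email": email,
        "ValidationStatus": "VALID",  # valid format only, not deliverability
        "Username": username,
        "Domain": domain,
        "DomainType": "PUBLIC" if domain in PUBLIC_DOMAINS else "PRIVATE",
    }
```

With the domain parsed out as its own value, grouping customers by email domain becomes a simple equality comparison.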

Kim Toomey (13:18):
I'm going to pass it over to Ashley to touch on some of the string cleansers.

Ashley Branham (13:22):
Yeah. So, there's string cleansers. There's a couple of different flavors that we've got of these, and they're more of the cleansers that you can customize to your needs. And so, we've got some directed towards people, and some towards organization data. Then, we've also got some that are more of a DIY version where you can pull in S3 buckets to use your own noise word lists, or even call out to other APIs via an HTTP cleanse.

Ashley Branham (13:48):
These basic string cleansers are really focused on actually helping the quality of your data, whether we want to strip out any special characters, asterisks, or any just bad values that we may have. And then, we can also take those cleansed values and use our pattern-based cleanser to build new attribute values. So, it really comes in handy when we're looking to build new names or initial fields, and we can really get that full profile of a user just by using some of our cleansers, even if we have different attributes for those names.

Chris Detzel (14:20):
Quick question, and I'm not sure if either one of you know this question, but is deliverability coming? And on that note, for email cleanser... Well, you can answer that question, and then I'll go to the next one. Sorry.

Kim Toomey (14:36):
Yeah. I do not know if email deliverability as part of that cleanser is coming or not, but we can certainly talk to the product team and find out.

Chris Detzel (14:47):
Great. And then, it's kind of a same kind of question, but this is for the email cleanser. Is the deliverability on the roadmap for the future capabilities?

Kim Toomey (14:59):
Quick thinking.

Chris Detzel (14:59):
Yeah. Yeah. Got it. Okay. Just being a little bit more specific. Yeah. And Matthew and Sandro, I can get with the PM and see what they're thinking on that.

Ashley Branham (15:12):
Also, on that same note: the HTTP cleanse is something that, if you guys did have your own email validation provider, you could use their APIs and leverage the HTTP cleanse to call out to that service and have it still utilized in Reltio, without it being considered custom or anything.

Ashley Branham (15:36):
Is there a standard special characters list, and can we add to it? Yes, there is a list of characters that we can remove with string-based cleansers. Essentially, it's going to be your same regex patterns, if you want to take out any asterisks, slashes, ampersands. If we want to get into removing other special values, such as whole words and more complex things, we can use the S3 file option and provide a noise list where anything that would fall into that list, such as Limited, LTD, some of the more custom ones, can be removed as well. But your string-based pattern cleanser does come with your standard special characters list that you would want to escape.
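The two steps described here, stripping special characters with a regex and then dropping whole words from a noise list, can be sketched as follows. This is a rough illustration in Python, not the Reltio string cleanser: the character class and the noise words are made-up examples, and in the setup described above the noise list would come from a file in S3 rather than a hard-coded set.

```python
import re

# Example character class: asterisks, slashes, backslashes, ampersands, hashes.
SPECIAL_CHARS = re.compile(r"[*/\\&#]")

# Example noise words; assumed to be loaded from a customer-provided list.
NOISE_WORDS = {"ltd", "limited", "inc", "llc"}


def scrub(value: str, noise_words: set = NOISE_WORDS) -> str:
    """Remove special characters, then drop any whole word in the noise list."""
    no_specials = SPECIAL_CHARS.sub(" ", value)
    kept = [
        word
        for word in no_specials.split()
        # Compare case-insensitively and ignore trailing punctuation ("Ltd.").
        if word.strip(".,").lower() not in noise_words
    ]
    return " ".join(kept)
```

For instance, `scrub("*Acme & Sons* Ltd.")` yields `Acme Sons`: the asterisks and ampersand go in the first pass, and `Ltd.` goes in the second.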

Chris Detzel (16:25):
Thanks, Ashley.

Ashley Branham (16:25):
And we can probably jump to the next one, if we don't have any more questions on strings. And so, this is a lot of code, so a little warning. But essentially, our cleanse functions, whether they be our phone or email or address, have two main components: you're going to have your mapping section and your info section. Your mapping section really lets you define which attributes you want to receive the output. For example, I might take the address cleanser and provide it only the address input, but Loqate will actually parse that out for us into separate attributes. And so, that's what we can define in the mapping section.

Ashley Branham (17:09):
The info section is where it really gets fun: we can decide how we want our cleansers to behave, and we can come in and provide different parameters, different inputs and outputs. And if you guys can see on the screen, I know it's probably very tiny, but on the very bottom of it, where we see params and scrub on the right hand side, those are examples of some of the string function cleansers removing those characters. So, anything that would come in with those characters will be stripped out of that value, whether it be in the first position or in the middle of it, all of that doesn't matter. It'll find it, take it out, and you'll have a clean value at the end.
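The slide itself is not reproduced in this transcript, so the following is only a Python model of the idea: a configuration that names a cleanse function, maps record attributes to its inputs and outputs, and passes parameters such as the characters to scrub. The key names (`cleanseFunction`, `inputMapping`, `outputMapping`, `params`, `scrub`) and the `StringScrub` function are assumptions made for illustration and should not be read as the actual L3 syntax.

```python
def string_scrub(inputs: dict, params: dict) -> dict:
    """Remove every character listed in params["scrub"], wherever it occurs."""
    value = inputs["value"]
    for ch in params.get("scrub", ""):
        value = value.replace(ch, "")
    return {"value": value.strip()}


def run_cleanser(record: dict, config: dict, functions: dict) -> dict:
    """Apply one configured cleanse function to a flat record."""
    fn = functions[config["cleanseFunction"]]
    # Input mapping: cleanser input name -> record attribute to read.
    inputs = {
        name: record.get(attr, "")
        for name, attr in config["inputMapping"].items()
    }
    outputs = fn(inputs, config.get("params", {}))
    # Output mapping: cleanser output name -> record attribute to write.
    cleansed = dict(record)
    for name, attr in config["outputMapping"].items():
        if name in outputs:
            cleansed[attr] = outputs[name]
    return cleansed


# Hypothetical configuration: scrub "*" and "#" out of the Name attribute.
EXAMPLE_CONFIG = {
    "cleanseFunction": "StringScrub",
    "inputMapping": {"value": "Name"},
    "outputMapping": {"value": "Name"},
    "params": {"scrub": "*#"},
}
```

Calling `run_cleanser({"Name": "*Ac#me Corp"}, EXAMPLE_CONFIG, {"StringScrub": string_scrub})` returns a record whose `Name` is `Acme Corp`, regardless of where in the value the scrubbed characters appeared.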

Ashley Branham (17:46):
We want to go on to the next one. Custom cleansers. This is a little bit more of an advanced topic, but we do want you guys to know that that is an option. And so, if none of the available cleansers work for you, we do have the ability to build a custom one within Java. And so, if you're familiar with our LCA processes, we do provide a framework for our cleansing, and you can build it within Java, test it locally, and then have the jar executed whenever any data is updated within Reltio.

Ashley Branham (18:21):
One thing I always like to caution about this is that these are a last resort, for when we've tried everything else with our cleansers and it just doesn't fit your use case. They are protected and reviewed by engineering. And so, we don't want any bad code in your system, making things perform poorly, but it is an option.

Chris Detzel (18:40):
Quick question. Can we integrate an alternative to Loqate, like Melissa or Address Doctor?

Ashley Branham (18:50):
Absolutely. So, if any of those have an API, the HTTP cleanse would be a great option for you, because that allows you to execute any APIs and receive that JSON or that enriched data back into Reltio. So, if you have an existing partnership with them using that address cleanse, then we can utilize that as well.

Chris Detzel (19:13):
Thank you.

Ashley Branham (19:17):
Do you have a couple more questions in?

Chris Detzel (19:20):
Is Informatica address cleansing also integratable?

Ashley Branham (19:27):
I'm not specifically familiar with those, but again, if they have an API that we can leverage, then absolutely. If it's not a straightforward API, then we can look into our custom cleansers in receiving any of that information that way.

Chris Detzel (19:45):
There are certainly a few phone/address cleansers. Which is considered the best-practice phone cleanser to use?

Ashley Branham (19:55):
And I'm not quite sure if I fully understand that. Kim, if you do, I know that we have address cleansing and phone cleansing, but we consider them more separate in our world.

Chris Detzel (20:06):
Sandro, I don't know if you want to...

Ashley Branham (20:08):
Ah, okay.

Chris Detzel (20:10):
Speak up a little bit on that or...

Sandro (20:12):
Yeah, sorry. I just meant the phone cleansers. When we had migrated to Reltio, there was a default phone cleanser that was in place and it wasn't doing what we wanted it to do and it was causing us some problems, so we've gone through Reltio support and they said, "Well, you should be using a different phone cleanser as part of your configuration." That's what we've been testing with, but I didn't think that was an obvious thing. And I was just wondering if that has been documented in terms of which is the best cleanser that we should be using, or does it depend on the customer?

Ashley Branham (20:43):
I would love to dig into that use case, but it definitely depends on the customer and the countries that you're using. We do have different versions of our cleanser, but I'm guessing, most likely, they changed the version that you were using within the physical config. But I will say that, typically, it is a fairly standard cleanser for most customers and that's probably one of the more unique use cases.

Sandro (21:06):
Okay. Okay.

Chris Detzel (21:08):
Thanks, Sandro. And last question for now, does Loqate support NCOA?

Kim Toomey (21:18):
I would have to look into a few... unless Ashley, you know off the top of your head.

Ashley Branham (21:23):
Unfortunately, I don't. I definitely have to [inaudible 00:21:26].

Kim Toomey (21:26):
Yeah.

Chris Detzel (21:26):
Yeah. That might be a good one to push out onto the community, then I can find the right expert to answer a question like that.

Speaker 5 (21:35):
I love that question. Thanks.

Chris Detzel (21:36):
Yeah. And David Sterns just said, it does not the last time he... But it's still a good question to ask, and it's something that I can push PMs to kind of think about something like that, right?

Kim Toomey (21:50):
Okay. Absolutely.

Chris Detzel (21:50):
Thank you.

Kim Toomey (21:50):
Thanks. Yeah. All right. Are we good on slides, Ashley?

Ashley Branham (21:55):
I do have one more if we want to quickly touch on troubleshooting, because I know that's typically... There's a lot of different factors that can affect how your data is cleansed, especially when it comes to addresses. The physical config is where we decide what process we want to use for cleansing, whether it be using CASS, using a different version of Loqate, and then especially within the L3.

Ashley Branham (22:18):
And so, even within the L3, there are multiple components that could also affect how your cleanse results are. So, if you're not quite sure on the results that you're getting or you want to do some tweaks, always look at the attribute definition with the matchFieldURIs, and survivorship groups, and RDM, because all of those can play into whether you're getting different results getting put into your cleanser or even different results after it's consolidating.

Ashley Branham (22:44):
So, just keep in mind that there are definitely ways to make your cleansers behave differently based on how you set your tenant up. And now I'm good on slides.

Kim Toomey (22:57):
All right. So, let me dive into Reltio here, and we'll take a look at a few different things live. First and foremost, if we kind of walk through what we looked at in the slides, I have our data quality dashboard pulled up here. A few things that I'm looking at just to level set is, I'm looking at all of our organization records right now. And within this list of all of my attributes for an organization here on the left hand side, I can search, so I can find my address values. And as you'll see, we have not just kind of the basic things, but all of the fields that are available in my data model of values that can be enriched and standardized based on that Loqate cleanse. For example, one of the things that we were looking at is the verification status that is returned from Loqate.

Kim Toomey (24:05):
And one thing that we can take a look at if there is time or questions at the end is actually this AVC value too that is returned from Loqate, specifically. And you'll see, they look like these codes, and we have kind of the keys to decipher what all of that means. But part of that, again, is a verification status for your address. So, maybe, again, as a data steward leveraging this dashboard, I get a quick snapshot of all of my addresses that maybe have been unverified or only partially verified. And maybe I want to do some additional cleanup on these addresses.

Kim Toomey (24:44):
By drilling in from that dashboard, I'm now brought into our advanced search screen where I can come in and quickly see a list of my almost 2000 organizations that don't have addresses that meet my business requirements for an accurate address. And again, I can click in to any of those profiles and I see my address data here, which I can expand. And again, we really only were able to partially verify this address based on what was input here from that system, and doing that look up against the US Postal Service address data for where these locations actually exist.

Kim Toomey (25:33):
Now, let's go ahead and... Oh, just a quick question that I saw pop up in the chat. The DQ dashboard... Again, if I navigate up to my dashboard here in the tab, you should see this other icon here for data quality. So, I can toggle over to this and that's going to bring up that dashboard here as well.

Kim Toomey (26:01):
All right. So, I'm going to go ahead and just create a quick organization record, to show you how some of this works on the fly. We're going to go ahead and use Reltio for an example. And again, I've just pulled our basic address and phone number information straight from Google Maps, so if I want to create my address here, again, you can see all of the fields that are available to input that address, and I'm only going to input a few fields and we'll see kind of all of the enrichment that occurs as I do that. Let me just grab some data here.

Chris Detzel (26:50):
A quick question. Does DQ come OOTB, or is it a separate license? Does it come with...

Kim Toomey (26:58):
Nope. All of the data quality features in that dashboard are part of your-

Chris Detzel (27:05):
Out-of-the-box?

Kim Toomey (27:06):
Out-of-the-box basic license. Yep.

Chris Detzel (27:08):
Thank you.

Kim Toomey (27:09):
Yeah, absolutely. And we're in Redwood City, and lastly here 94. All right. And we're going to do just a couple other... got our address, we're going to create a phone number here, and this will stay as our business line and email. We've got sales@reltio.com. All right.

Chris Detzel (27:40):
Quick question. I know you're about to get in there, but...

Kim Toomey (27:42):
Yeah, that's great.

Chris Detzel (27:46):
I think it's important because the data quality piece is always going to be a big one, but does data quality dashboard need to be enabled for our tenant?

Kim Toomey (27:56):
Right now, it's early access. So, that was released in our 2022.1 release just a few weeks ago. I would say, reach out to probably your customer success manager to help you get that enabled if it's something you're interested in.

Chris Detzel (28:16):
But the data quality dashboard is early access, that's right, and only right now... Yep, that's right. Gillian, thanks. Do we need to configure anything for data quality?

Kim Toomey (28:29):
Nope. You have options to configure things like validation functions and rules, which we weren't going to touch on today. Otherwise, there's nothing that needs to be configured to actually populate those dashboards beyond the early access piece.

Chris Detzel (28:47):
And I like the idea of having a show around data quality and what that might look like. It gives me an idea to push somebody to do that, because I think a lot of people are interested in that piece.

Kim Toomey (29:01):
Yeah, absolutely.

Chris Detzel (29:04):
Thank you.

Kim Toomey (29:04):
Yeah. All right. Let me make this a little bit bigger here on my screen. We put in some basic information on address, basic address fields, and this is what was returned on the fly from Loqate. I have all kinds of great information, different census details, which was released towards the end of last year, I believe, so that's relatively new.

Kim Toomey (29:29):
Again, you'll see that AVC code, which can be leveraged in a variety of ways to ensure that you're working with really highly verified and accurate address data. We've parsed out some of those premise numbers and delivery addresses, et cetera. So, all kinds of good data... And again, now, you can actually leverage any of these values and attributes in any way that you would interact with something that was kind of input in a more standard or traditional manner.

Kim Toomey (30:07):
And similar kind of expansions on that phone number that we input: we understand now it's a toll-free number, and there'd be all kinds of information if this was a specific area code. I'm in Portland, Oregon, we're 503. In some of those geo fields, you'll get more information about where that phone number is kind of tied to from a geography standpoint. And then, again, email as well.

Chris Detzel (30:46):
Will we have the E.164 phone format available?

Kim Toomey (30:54):
Ashley, are you familiar with that format? That's not something I am familiar with.

Chris Detzel (31:00):
Ashley, do you know? You're on mute.

Ashley Branham (31:06):
I don't believe that one is... I think the E.164 is the plus one format. And I want to say that the formatted number is... Yeah, the international code. I'll have to triple check. My gut check says no, that it's not in there. And if it is, it may be something that you just have to add an attribute for. But let me triple check that one real quick.

Chris Detzel (31:30):
Yeah. It's international code plus one, basically.

Ashley Branham (31:32):
Yeah, let me check.

Speaker 6 (31:37):
Yeah. So, Chris and Ashley, the question here is because, usually, the phone cleansers return the international codes, so we would like to have... It's a request from several clients. We would like to have the international format, if possible.

Chris Detzel (31:56):
Great. And by the way, I'm pushing the PM for April, to do a deeper dive into the address cleanse and that'll be a great question. Some of these questions are going to be great for him, and I'll make sure he listens to this show as well.

Kim Toomey (32:15):
Yeah. And I was just pulling up an international phone number, for example. Yeah, it's not quite that E.164 but again, we can support those international phone formats and parsing some of that out.

Chris Detzel (32:31):
Thank you.

Kim Toomey (32:32):
Yeah.

Kim Toomey (32:38):
All right. And this is another great example where you can see some of those multiple addresses stored on a profile. So, leveraging that type attribute, we can say here's a shipping address. And again, this is going to work exactly the same for an address, whether it's in the UK or any other country globally, as long as you've got Loqate licensed globally. Okay. It is really flexible based on where your customers and accounts are, and what kind of global customer base you might be working with.

Kim Toomey (33:23):
And similarly, you'll see this billing address that was input doesn't actually have any of the... 56 Market Square, we just saw that it was added on this market square street or market square, I guess, those UK addresses. And so, you can see the different AVC codes, for example, between these two locations where we have more details. We know that this is actually only partially verified, because we don't have the complete address down to an actual kind of premise number here.

Kim Toomey (34:03):
And again, those different values might be good indications, especially when you're looking at matching on some of these location addresses to say, "I only want to use an address if it's met a verification status above a certain threshold."
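A rule of the form "only use an address above a certain verification threshold" could be sketched like this in Python. The sketch assumes the verification code follows the four-part, hyphen-separated layout commonly documented for Loqate, where the first segment starts with a status letter (for example V for verified, P for partially verified) followed by a match-level digit, and the last segment is a 0 to 100 match score. The sample codes, the threshold values, and the function names are made up for illustration, and this is not how Reltio match rules are configured.

```python
from typing import NamedTuple


class Avc(NamedTuple):
    status: str       # first letter of the code, e.g. "V" or "P"
    match_level: int  # digit after the status; higher means more precise
    match_score: int  # last segment, 0-100


def parse_avc(code: str) -> Avc:
    """Parse a code such as "V44-I44-P6-100" under the assumed layout."""
    parts = code.split("-")
    if len(parts) != 4 or len(parts[0]) < 2:
        raise ValueError(f"unexpected verification code: {code!r}")
    verification = parts[0]
    return Avc(
        status=verification[0],
        match_level=int(verification[1]),
        match_score=int(parts[3]),
    )


def usable_for_matching(code: str, min_level: int = 4, min_score: int = 90) -> bool:
    """True only for fully verified addresses at or above both thresholds."""
    avc = parse_avc(code)
    return (
        avc.status == "V"
        and avc.match_level >= min_level
        and avc.match_score >= min_score
    )
```

Under these assumptions, a partially verified address like the billing address in the demo would be excluded from address-based matching while still being stored on the profile.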

Kim Toomey (34:27):
All right. Any questions or... Ashley, do you want me to turn the screen share over to you for anything in product with the string cleansers?

Ashley Branham (34:40):
If we want to dive in to some deeper code, we definitely can. I'll let you guys tell us how technical we want to get today, if we want to see any more live examples for concatenation and string cleansers.

Chris Detzel (34:51):
Well, I think that's always a good thing. We do have one other question, though. If you use Loqate API outside of Reltio, I'm assuming it's an additional cost, is that right?

Kim Toomey (35:03):
I don't think so. I think those API calls would just go towards your allotment of API calls that you have with your tenant subscription in general. I'll, again, double check on that, but I believe that is really included as part of that Loqate product through Reltio.

Chris Detzel (35:29):
Yeah, it would be good to know.

Kim Toomey (35:30):
Yeah. We'll follow up with that, Sandro, but I'm pretty sure that there's no additional cost to leverage that.

Chris Detzel (35:40):
Great. And Ashley, did you want to show one or two things? I mean, I think anytime you can show a few things, it would be great.

Ashley Branham (35:48):
Yeah, of course.

Chris Detzel (35:50):
I know it's on the fly.

Ashley Branham (35:53):
Story of my life. Okay. So, I just jumped down to a different one of our tenants, and just like Kim was doing, I was walking through just entering in a couple of basic different fields. The string cleansers are really good whenever we want to build full names, that's a pretty common use case that we've got. And so, in creating a new record, I'm just going to put in three separate attributes. And then, if we want to get fancy with this, we can also see how it's going to strip out some different ones. But just as a very base example, you can see, I got an additional attribute just by entering in the original input of first and last name.

Ashley Branham (36:31):
The same concept can be used to take out any different strings or any of the fancy characters that we don't like. And once we take a look at this, the cleanser itself is going to be providing a different value with its own crosswalk. And so, you guys will have complete distinction on what values came from, which location. As we can see, I have gotten... Oh, let me spread this out a little bit.

Ashley Branham (37:06):
I've got my new value and this has come from our Reltio source. And then, we've also got another source, which is our full name builder. And this shows that even though I have my original source from Reltio, Reltio is also providing a different source, and you guys can decide if this is a full name builder, if this is just a string cleanser, or whatever you want to call this, that is producing a different source for that.

Ashley Branham (37:31):
Same concept applies for Loqate, so whether we talk about how we've got the uncleansed value, which was originally inputted from your source system, and then your Loqate value, those are also going to be provided with a new crosswalk as well, so you can see the cleansed version and the original version.

Ashley Branham (37:52):
You guys tell me, do we want to get into the weeds of things? Do we want to just see some cleansed examples? And as you can... Oh, now, that I could refresh in here, you can see that the original asterisks and ampersands are still in Reltio, but they are not being chosen as my operational value because I only want to use my pretty version of the names for the golden record in itself.
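A rough Python model of what Ashley shows on screen: the original and the cleansed value are both stored, each tagged with the source it came from, and only one becomes the operational value. The source names and the priority rule here are assumptions for the sketch, not the tenant's actual survivorship configuration.

```python
# Each value keeps the source (crosswalk) it came from.
first_name_values = [
    {"value": "*Ash&ley", "source": "Reltio"},        # original input, kept as-is
    {"value": "Ashley", "source": "StringCleanser"},  # hypothetical cleanser source
]

# Assumed survivorship rule for this sketch: earlier sources win.
SOURCE_PRIORITY = ["StringCleanser", "Reltio"]


def operational_value(values: list) -> str:
    """Pick the value from the highest-priority source; nothing is deleted."""
    best = min(values, key=lambda v: SOURCE_PRIORITY.index(v["source"]))
    return best["value"]


print(operational_value(first_name_values))  # Ashley
```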

Chris Detzel (38:18):
Yeah. I'm not seeing any feedback, but there is a question that I'm not sure if you'll know, but is USPS database backend for Loqate? I think that's the question.

Kim Toomey (38:29):
Yes. So, out-of-the-box with kind of your standard tenant subscription with Reltio, it's the non-CASS USPS database from Loqate.

Chris Detzel (38:43):
Okay. And there's other questions coming in. Can we call sequence of cleansers on a single attribute?

Ashley Branham (38:53):
Absolutely. And so, that is a very useful use case, especially when it comes to cleansers... whoops, string cleansers. And so, the same thing is happening here. If we want to look behind the scenes, we'll have, I think, three or four cleansers chained together, and doing that allows us to strip out any of the bad characters. If we want to do any capitalizations, we can do that as well. And then, finally, once everything is cleansed how we want it, we can chain it with a pattern-based cleanser to have this pretty, beautiful new attribute.

Ashley Branham (39:25):
And that is going to be done in JSON. If you guys want to take a look at that, we can. I'm always hesitant to show code, because I don't want to scare anybody off, but chaining is something that we absolutely do.
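In the product this chain is configured in JSON, which is not reproduced here. The Python below only models the idea of running cleansing steps in a fixed order: strip unwanted characters, apply casing, then fill a pattern. The function names and the list of allowed characters are assumptions for the sketch.

```python
import re


def strip_bad_characters(value: str) -> str:
    """Remove anything that is not a letter, space, hyphen, or apostrophe."""
    return re.sub(r"[^A-Za-z '\-]", "", value).strip()


def title_case(value: str) -> str:
    """Capitalize the first letter of each word."""
    return value.title()


def run_chain(value: str, steps: list) -> str:
    """Apply each step in the order it is listed."""
    for step in steps:
        value = step(value)
    return value


chain = [strip_bad_characters, title_case]
first = run_chain("*jo&hn", chain)
last = run_chain("sm#ith ", chain)

# Pattern-based step: combine the cleansed parts into one new attribute.
full_name = "{first} {last}".format(first=first, last=last)
print(full_name)  # John Smith
```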

Chris Detzel (39:39):
Great. And there's a couple of more questions. And if we have time, we can go on the code. Ashley, you don't have to be scared on these calls to go into the code. We do it a lot.

Chris Detzel (39:50):
Can the string cleansing for full name be contingent on whether the full name has already been entered? Example is, if it's blank, it triggers the string cleanser. Otherwise, it leaves it as it is?

Ashley Branham (40:04):
Yeah. So, there are some force parameters that we can utilize. And basically, we'll say, if this is empty, force the cleanse; if not, leave it as is.
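The behavior described here, cleanse only when the target is empty, can be modeled in a few lines of Python. The actual parameter name is not given in the discussion, so this sketch shows the logic only; the attribute names are assumptions.

```python
def cleanse_full_name(record: dict) -> dict:
    """Build FullName only when it is empty; otherwise leave it untouched."""
    if (record.get("FullName") or "").strip():
        return record  # a value was already entered, so do nothing
    parts = [record.get("FirstName", ""), record.get("LastName", "")]
    built = " ".join(p for p in parts if p)
    return {**record, "FullName": built}


blank = {"FirstName": "Jane", "LastName": "Doe", "FullName": ""}
entered = {"FirstName": "Jane", "LastName": "Doe", "FullName": "Dr. J. Doe"}

print(cleanse_full_name(blank)["FullName"])    # Jane Doe
print(cleanse_full_name(entered)["FullName"])  # Dr. J. Doe
```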

Chris Detzel (40:14):
Great. And if we want to use string cleanser to auto-populate an attribute based on another existing attribute values, will the cleanser calculate or populate value at the crosswalk level or at the golden profile level?

Ashley Branham (40:31):
The golden profile level will always be determined by your survivorship. And so, if you have an OV value, let's say for first name, the cleanser itself of the other attribute will choose this OV value. It's not going to go choose this poor value, because it's not the operational value, and I'm not going to build my new, full name based on that original one. I'm going to build it based on the operational value, which is going to be the one that's cleansed already.
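To make the point about inputs concrete, here is a small Python sketch in which the full name is derived from the operational values rather than from the raw ones. The record layout and the `ov` flag are a simplification assumed for the example.

```python
record = {
    "FirstName": [
        {"value": "j*ane", "ov": False},  # original, uncleansed value
        {"value": "Jane", "ov": True},    # cleansed value chosen by survivorship
    ],
    "LastName": [
        {"value": "DOE", "ov": False},
        {"value": "Doe", "ov": True},
    ],
}


def ov(attribute_values: list) -> str:
    """Return the value flagged as the operational value."""
    return next(v["value"] for v in attribute_values if v["ov"])


# The derived attribute is built from operational values only.
full_name = f"{ov(record['FirstName'])} {ov(record['LastName'])}"
print(full_name)  # Jane Doe
```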

Chris Detzel (41:00):
Great. Any other questions or does anybody want to go deeper into one of the questions asked? I hear nothing, then I'm just going to... All right. Any last words, Kim, Ashley? Please show I3 code.

Kim Toomey (41:25):
The L3. L3.

Chris Detzel (41:27):
L3, sorry. L3 code.

Kim Toomey (41:30):
Okay. All right. Go for it, Ashley.

Ashley Branham (41:31):
Okay.

Chris Detzel (41:32):
And as you're kind of going in there, Ashley, quick question. Is Reltio maintaining its own copy of the USPS reference data that needs to be refreshed on a regular basis?

Kim Toomey (41:47):
No. All of that is actually Loqate's, and that's what we're really leveraging, so we're not copying the database from Loqate. We are working with the APIs between Reltio and Loqate to directly pull that address data from the USPS database.

Chris Detzel (42:07):
Yeah. Great. And then, when you show the L3 code, can it be for custom cleansers? Is that right?

Ashley Branham (42:13):
I don't actually have any custom cleansers available to show, but if you guys had a certain use case, I can definitely follow up. I would always love to see what's making you choose a custom cleanser versus trying to use a customized version of our cleansers, just because JARs are always heavy and harder to maintain, and it's a little bit more of a deeper process.

Chris Detzel (42:37):
Yeah, makes sense.

Ashley Branham (42:40):
All right. So, I do have an L3 pulled up. This is going to be a consumer example. The cleanse configuration lives on the entity type definition. It's standard for everyone. As you guys can see, I've got my two sections: I've got my mapping and my infos. Mapping is typically used for addresses. It is very commonly used to take any inputs from address line one and parse them out into any other fun attributes that come back from Loqate.

Ashley Branham (43:11):
We simply just get to decide what attribute from Loqate gets populated into what attribute in Reltio. But the real fun stuff comes in when we get into the info section. And I know we had talked about concatenating different cleansers together in a sequence version. And so, that's what I'm going to dive into first. We've got a standard definition. I'm going to be building a full name here, and I'm going to sequence together a chain of different cleansers. With the first one being, I want to scrub out any of these characters from my first name attribute.

Ashley Branham (43:53):
If you guys have any questions, interrupt me as we go through here. I'm going to repeat that same process for the middle name and last name, so that way, I've got a clean first name, clean last name. After it's been cleaned, I'm then going to go through and give it title casing, so that way, I'm taking the pretty version and making it even prettier. Same format, different parameters, still chained together.

Ashley Branham (44:25):
I do want to point out that whenever you are chaining the cleansers together, it is very important to use a different source as an output for each one. That is because, by default, the cleansers themselves are going to ignore anything from the Reltio cleanser source. It helps prevent loops behind the scenes. And so, the output of the scrubbed version is going to be my string cleanser version or string function cleanser. And then, the output of my case cleanser will be my case cleanser source. That way, I can ensure that whenever I come in here to concatenate everything together, it is being considered as part of that concatenation.
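The reason for giving each step its own output source can be modeled in Python. The sketch assumes, as described above, that a step skips any value written by the default cleanser source; the source names and the `run_step` helper are hypothetical and stand in for the L3 configuration.

```python
DEFAULT_CLEANSER_SOURCE = "ReltioCleanser"  # assumed name for the default source


def visible_inputs(values: list) -> list:
    """A step only reads values that did not come from the default cleanser source."""
    return [v for v in values if v["source"] != DEFAULT_CLEANSER_SOURCE]


def run_step(values: list, func, output_source: str) -> list:
    """Apply func to every visible input and tag the result with output_source."""
    return [
        {"value": func(v["value"]), "source": output_source}
        for v in visible_inputs(values)
    ]


def scrub(value: str) -> str:
    return value.replace("*", "")


raw = [{"value": "j*ohn", "source": "CRM"}]

# Distinct output sources: the second step can see the first step's result.
scrubbed = run_step(raw, scrub, "StringFunctionCleanser")
cased = run_step(scrubbed, str.title, "CaseCleanser")
print(cased)

# Same default source for the first step: the second step sees nothing.
hidden = run_step(run_step(raw, scrub, DEFAULT_CLEANSER_SOURCE), str.title, "CaseCleanser")
print(hidden)  # []
```

The same filter that hides intermediate results is what stops a cleanser from re-processing its own output in a loop.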

Ashley Branham (45:12):
To recap, I've taken the first name, last name, I've scrubbed out my parameter, scrubbed out any bad characters. I, then, title case both of those attributes. And then, finally, I'm going to concatenate both of them together using the full name builder for a first and last name, new attribute that's being outputted.

Chris Detzel (45:33):
Awesome. I do have a couple of more questions. I knew some would pop up. Does company name and person's name use the same cleanser?

Ashley Branham (45:45):
It depends on what cleanser you're talking about. Are we talking about just a string cleanser, as far as stripping these out? Or are we talking about the matched token cleansers that are more geared towards organizations and people?

Chris Detzel (46:05):
JP, I don't know if you want to take yourself off the mute.

Ashley Branham (46:07):
Sorry, I don't see the chat coming up.

Chris Detzel (46:09):
Yeah, no worries. Let's go to the next question. Oh, go ahead, JP. Yeah.

JP (46:18):
Yeah. No, it's for Chrish [inaudible 00:46:20]. I believe we're talking from a string cleansing point of view. Let's say, if we've got an organization name, we can allow certain specific characters like hashtag or [inaudible 00:46:31] slash in the organization name. But if you talk about a person name, we don't expect such characters to be a part of the association, right? So, are these cleansers common when we're trying to scrub these specific characters? Are these specific characters common to both person name and the organization name?

Ashley Branham (46:49):
Absolutely. And so, if we take it from that example, if we're talking about straight cleansing and I want to strip out these values from the person name, but not the organization name, those are automatically going to be treated differently just because they're defined as separate pieces in the configuration. Whenever I'm cleaning the first name, I'm specifying which attribute I want to cleanse, and I would not use the same situation for an organization name.

Ashley Branham (47:16):
After the data's been cleansed, if we're looking at matching, there are different match token cleansers that are going to be used for organization name versus match name. But again, those are also defined in your match rules, and are different for each attribute.

Chris Detzel (47:34):
Great. Two more questions, [crosstalk 00:47:36] I think. Go ahead.

Ashley Branham (47:38):
Oh, I was just making sure that I answered your question, if we were on the same page.

Chris Detzel (47:41):
Can a single [crosstalk 00:47:45]. And thanks for that question. Can a single cleanse mapping be invoked by multiple entity types?

Ashley Branham (47:54):
No, and that's simply because the cleanse mapping itself is defined per entity type. You can copy these over. For example, if we're talking about the addresses, a lot of times, I'll just copy what I have for individual, do a find-and-replace on this to organization, and then use that same address mapping for organization, but we do need to define them in both places because they are entity specific.
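The copy-and-replace approach can be sketched in Python. The mapping below is a made-up fragment, not the real L3 schema; only the idea of rewriting the entity type inside each URI is the point.

```python
import json

# Illustrative fragment only; the keys and URIs are assumptions.
individual_mapping = {
    "uri": "configuration/entityTypes/Individual/cleanse/address",
    "input": ["configuration/entityTypes/Individual/attributes/AddressLine1"],
}


def copy_for_entity_type(mapping: dict, old_type: str, new_type: str) -> dict:
    """Return a copy of the mapping with every entity type reference rewritten."""
    text = json.dumps(mapping)
    text = text.replace(f"entityTypes/{old_type}/", f"entityTypes/{new_type}/")
    return json.loads(text)


organization_mapping = copy_for_entity_type(
    individual_mapping, "Individual", "Organization"
)
print(organization_mapping["uri"])
```

Both mappings still have to be saved on their own entity types; the helper only saves the retyping.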

Chris Detzel (48:16):
I just thought you were going to say no and leave it at that, but nice explanation. That was great.

Chris Detzel (48:25):
Can we have a country filter in the cleanser? Example, distinct full name by roles, country, and last name plus first name, or first name plus... then comma last name plus last name.

Ashley Branham (48:40):
Let me look at that chat real quick.

Chris Detzel (48:42):
Yeah.

Ashley Branham (48:43):
Sorry, that's a lot. Okay. A filter in the cleanser... filter by country. And so, we basically are looking to format it differently based on the country location. Off the top of my head, I don't know if we can have a filter with-

Speaker 6 (49:07):
In Europe, different countries use different approaches to the full name. Some countries use the last name and the comma and the first name, the other countries use the first name and the last name. So, is it possible to have a filter there for these specific countries? We will use one format, and for the other countries, using another one, is it possible?

Ashley Branham (49:35):
I think we can add filters in, but I want to double check because I haven't played with this on firsthand, but I'm happy to follow-

Speaker 6 (49:42):
If you have any general process to do this, it would solve a lot of different things.

Ashley Branham (49:50):
And you guys want an actual attribute stored in that, not just the data label pattern?

Speaker 6 (49:54):
We have an attribute for that. Yeah.

Ashley Branham (49:55):
Okay. And currently, some-

Speaker 6 (50:00):
[inaudible 00:50:00] when we load the profiles, and then we have a process that fills that attribute depending on the country. So, if we could have that embedded here, it would be great. Yeah.

Ashley Branham (50:14):
And if that isn't an option as a worst case scenario, we could always have two different attributes for each of the different country formats that we want.

Speaker 6 (50:25):
There will be several formats.

Ashley Branham (50:27):
Okay. Yeah, I'm happy to follow up with you and see if that is a possibility.
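Whether the cleanser configuration supports a country filter is left open in this exchange. The rule itself, one name format per country with a default, looks like this in Python. The country codes are placeholders and say nothing about real naming conventions.

```python
# Placeholder country codes mapped to assumed name formats.
NAME_FORMATS = {
    "XX": "{last}, {first}",
    "YY": "{first} {last}",
}
DEFAULT_FORMAT = "{first} {last}"


def format_full_name(first: str, last: str, country: str) -> str:
    """Format the full name with the pattern configured for the country."""
    pattern = NAME_FORMATS.get(country, DEFAULT_FORMAT)
    return pattern.format(first=first, last=last)


print(format_full_name("Ana", "Silva", "XX"))  # Silva, Ana
print(format_full_name("Ana", "Silva", "YY"))  # Ana Silva
```

Supporting several formats then only means adding entries to the lookup table, rather than adding one attribute per format.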

Chris Detzel (50:31):
And then-

JP (50:35):
Hey, Ashley... Just go ahead, please.

Chris Detzel (50:39):
No, go ahead.

JP (50:42):
Okay. Right. I've got a question. I mean, we have already discussed that on a single attribute, we can have multiple string cleanse functions, right? So, is there any specific order in which the cleansed forms will be executed or is it random?

Ashley Branham (50:53):
It is. It would be executed in the order that you define within the L3. And so, that's why it's very important that you guys see here that I'm first cleansing out the poor attributes before I title and concatenate it together. The chain defines the order that this cleanse is going to be executed for this full name builder.

JP (51:16):
Thank you.

Ashley Branham (51:16):
Mm-hmm (affirmative)

Chris Detzel (51:20):
And I think, last question, is there a string cleanser to force case? Example, cleanse to all uppercase?

Ashley Branham (51:25):
Absolutely. And so, title is just one example; we have an upper, a lower, and maybe a couple more. We'll have to look into the documentation to see what parameters are possible for the string function cleanser, but there are definitely a couple of choices.
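As a plain Python analogy for the casing options mentioned here, the mode names below are illustrative and not the cleanser's documented parameter values.

```python
# Illustrative mode names mapped to Python's built-in string methods.
CASE_FUNCTIONS = {
    "upper": str.upper,
    "lower": str.lower,
    "title": str.title,
}


def force_case(value: str, mode: str) -> str:
    """Apply the requested casing to the value."""
    return CASE_FUNCTIONS[mode](value)


print(force_case("jane doe", "upper"))  # JANE DOE
print(force_case("JANE DOE", "title"))  # Jane Doe
```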

Chris Detzel (51:42):
Great. Looks like there are no other questions, but there's some follow up on some of the questions. One is, I think I did mention this, I am pushing the PM to, again, do a little bit deeper dive, so some of these questions can be answered by him.

Chris Detzel (51:59):
But Ashley, you mentioned you can go a little bit deeper in some of the one or two of the others, so I'll give you a copy of the chat so that you can kind of look at them and then we can answer them. The other piece is, I assume that there'll be some other questions that might come up. My recommendation is that we start posting some of those on the community. This way, I could either have Ashley, Kim, or the PM go answer some of those particular questions. And then, we can address them on the next community show, which I think is going to be another good one.

Chris Detzel (52:36):
So, Kim, Ashley, thank you so much. This is really a good 101. We probably went a little bit deeper than 101, which is always good. And other than that, thank you everyone for another community show, for coming and asking your questions, that's what these are about. If you have other questions, please go to community.reltio.com. Also, go to the events section and look at what's coming up.

Chris Detzel (53:04):
By the way, I do record these, post them out to the community. Hopefully, everybody liked it, so thank you, everyone. I'm going to stay on for another minute or so. Glad you found it helpful. Yeah, these are always good.

Kim Toomey (53:19):
Yeah.

Chris Detzel (53:20):
Glad that you two just volunteered or... I asked, but...

Ashley Branham (53:27):
Volunteered is a very...

Kim Toomey (53:28):
Yeah.

Chris Detzel (53:32):
But it's just a great way to talk to a lot of different people at one time about this. And hopefully, future folks that want to kind of come back and rewatch these things like that.