Reltio Connect

Reltio Google BigQuery Connector - Show

By Chris Detzel posted 03-05-2023 12:46

Data management is a crucial aspect of business operations, yet it can be challenging to identify a single source of truth, integrate large amounts of data from various sources, and ensure data accuracy. In a recent episode of the Reltio Community Show, Jon Ulloa discusses a new product that aims to overcome these challenges - the Reltio Google BigQuery Connector.

The Google BigQuery (GBQ) analytics warehouse is a valuable tool for organizing data and making accurate, data-driven decisions in near real-time. However, customers often face challenges in managing data integration, configuration, and validation across different departments. The GBQ Connector helps customers overcome these challenges by providing a seamless user experience, exporting and visualizing data, and offering real-time updates.

One example of how the Connector can help businesses is financial institutions onboarding customers while complying with regulations. The Connector facilitates the process of validating customer information while protecting organizations from malware and ransomware attacks. It categorizes events through microservices and transforms them into compatible formats for GBQ, making it easy to match incoming requests to an actual source of truth in near real-time.

The GBQ Connector also helps customers manage costs and archive data through routines, and it works with AWS and Azure tenants. It improves performance and saves costs by using Google's higher-performance Write API. Customers can use the Connector by following six basic steps outlined in the documentation, and it provides five views and five raw tables, including options for JSON view and table schemas. It also allows for tracking interactions and for matching and merging entities, including a new on-the-fly merge option. The Connector is low or no code, providing a 360-degree view of data in near real-time.

The demo shows the configuration of a tenant in both the legacy column-based and the new JSON formats, discussing data schemas, entity types, and relationship types. The connector includes reporting templates and options to include operational values. The data is organized and synced with GBQ, and the legacy view has business rules to align formatting and schema with GBQ's. However, the new JSON format provides a one-to-one mapping, is cost-efficient, and is highly performant.

Jon demonstrates changes made to a query and how it is reflected in two different versions. Custom views created on top of out-of-the-box tables show the latest version. A reporting template can be modified for different data sets, including queries and drill-downs for understanding merges, matches, and potential matches. The system generates a report that displays the latest activity and potential matches for entities. It provides real-time or near-real-time visibility of the data pipeline and updates various tables.

While the GBQ connector has future features for workflow data types and activity log data types, it requires specific set-up for security purposes and may require creating a support ticket if there is an incomplete event or anything that doesn't match the expected GBQ element. The Connector is compatible with GBQ and allows for sharing data outside of Looker Studio and connecting to other BI tools like Power BI.

The Reltio Google BigQuery Connector provides a solution for accurate data management, allowing customers to organize their data and make accurate, data-driven decisions in near real-time. The connector facilitates the process of validating customer information while protecting organizations from malware and ransomware attacks and provides a seamless user experience. With a 360-degree view of data and real-time updates, businesses can streamline their operations and make informed decisions.


Chris Detzel (00:00:07):

Welcome, everyone, to another Reltio Community show. My name is Chris Detzel and today we have Jon Ulloa on the phone, again. He was here last week, talked a little bit about our new Snowflake Connector. But today he'll be talking about our new Reltio-Google BigQuery Connector. Jon is a principal product manager here at Reltio and I'm the director of our customer community and engagement.


By the way, thank you everyone for all the questions and things like that from last week. So the rules of the show are simple. Keep yourself on mute. All questions should be asked in chat, and/or feel free to take yourself off mute at times and ask your question. Call is being recorded and will be posted to community as usual. I will post all those links on where it will be here after Jon starts.


There is a survey taken at the end of the call. The Zoom will pop up, there'll be a pop-up and just please take that survey at the end. Matter of fact, some of the shows that you've asked for in that survey are actually on the community now. So we do have some shows coming up. Today's show will be on the Google BigQuery Connector. We're having a continuation of our how to search my data with Reltio APIs as an ask me anything. Really excited about that. And then on the 15th we'll have one on understanding Reltio API performance.


And then this is the one that somebody asked for and seemed like there needs to be a little bit more understanding of the ABCs of crosswalks, understanding their purpose and use. Then we have another one coming up on the 26th of April, but there will be a lot more in between, but I'm just working with some of the product managers to get some more going, but this one is BYOK: Taking control of your data encryption with Reltio Shield. So you'll have to click on that link to see exactly what that is.


But feel free to go to the, click on the events or upcoming events, and register. I'm going to stop sharing, Jon, and I'm going to let you take control and here we go.

Jon Ulloa (00:02:22):

Hey, Chris, thank you so much and hey everybody, thank you so much for joining today. Excited to talk about our new Reltio GBQ connector that was just released in the past release cycle. Yeah, again, like Chris reiterated, I want this to be an open forum here so please feel free to include your questions in the chat. Feel free to let Chris know and Chris will go ahead and moderate that, ask some questions in between.


But excited to present what this connector is about, go over a little bit of the architecture and some of the features, benefits, problems we're trying to solve here at Reltio for our customers, and then hopefully we'll have some time for a demo. We'll be taking you through what this connector looks like, feels like and how it operates from Reltio to our target analytics warehouse, GBQ. So that being said, let me go ahead and share my screen. All right, can you all see my screen?

Chris Detzel (00:03:31):


Jon Ulloa (00:03:33):

Excellent. Okay, so what I want to talk about is three key challenges here that we're trying to solve. One, as many in the room might be familiar with, is this idea of a single source of truth. Getting accurate insights, and that starts with getting accurate data to be able to get those insights. And that's a huge challenge for organizations today. With the various data silos that exist, it's really hard to get a hold of what a single source of truth is, and to get that confidence to be able to go in front of your leadership and be confident about some numbers that might be reflecting your initiatives.


And that is true here today. What we're trying to accomplish is with the Reltio MDM platform is to do this, but a part of that experience is also exporting that single source of truth to where you can access it, where your team can access it, where your partners can access it, to be able to share that accurate dimension of insights and data.


The second thing that I want to talk about is this idea that data is constantly flooding in and it's not stopping. It is growing at a tremendous rate and every organization is trying to be data-driven right now. And that involves collecting large amounts of data, whether that's zero party data, first party data, second party data, third party data, it's going to be up to 100 party data in 10 years. And this data is coming in from all sorts of integrations, with partners and internally, and it's going to be hard to manage for our customers.


And so where should this data live and how do you create a data strategy to be able to share this, right? In near realtime. That's the key here, right? And to be able to get up to date of what's going on with all these different transactions, especially in an eCommerce world, it is quite overwhelming, right? So to be able to understand your customer and to be able to get that latest and greatest view is absolutely key when you think about presenting data in this world we live in today.


And then the third thing I want to call out is integration headwinds, to be able to get this data to the right place and to access it. There's lots of hurdles and steps and different offerings that our customers have here today to be able to get this data, right? There's a lot of configuration stuff, there's a lot of validation that needs to happen, and a lot of partnership and project management that needs to occur, and kind of unblocking so many different departments to be able to actually see what is being said about data integration.


And we're talking about time to value here, right? And so customers are often overwhelmed with this and they want to be able just to maybe put it on autopilot and just say, "Take care of this for me because this is too much to implement," right? And that's a lot of what we're seeing from our customers nowadays.


So when we think about a connector, an analytics connector, what Reltio here is trying to do is partner with Google and Google BigQuery, because this is a very popular analytics warehouse that many of our customers and the market are using. And when you think about a data strategy, a lot of our customers are thinking about, okay, so lots of data flowing in, lots of different silos, how do I essentially organize this data to be able to take action on it at a fast velocity and also a trusted velocity?


And so there's a trend here that we see with customers, especially as part of a data strategy, to use their analytics warehouse as that main repository, right? And so what we're trying to do from a Reltio standpoint, since we are looking to provide the single source of truth when it comes to different data types like entities, relations, interactions, matches and merges, right? We want to be able to be connected to where it all begins. And when you think about the GBQ Connector, that's exactly what we're trying to do, is take this data out of the Reltio platform and persistently sync it into the GBQ warehouse.


And with that, it's persistent but it's also near realtime and it's enabling these customers to go ahead and make these data-driven insights that they require from accurate data. And that's essentially what we're trying to do with analytics, report and data science teams. Trying to enable them to make the right decisions with Reltio data, and that's pretty much the gist of what we're trying to talk about this Reltio GBQ Connector.


So a lot going on on this slide here but I wanted to talk a little bit about the different benefits that this connector can offer you. So when you think about the Reltio platform, we take in data from legacy systems, third party data sources, various integrations and applications, and that's done through our APIs and our data loader UI, and Reltio does its vetting. It cleanses, it de-duplicates and enriches this data, but where the GBQ Connector that Reltio just came out with comes in is exporting that data to the business user audience to be able to get this rapidly, but also accurately, to use this for analytics inquiries but also to report on data visualizations, right?


So we see here the end state of what this connector is trying to do. We're trying to get you the right data, avoid having you guys spend all this time data prepping and organizing this data and getting it in the right format, manage that experience so you don't have to. And then ultimately get to an aha moment of visualization, right? What can I do with this data? Give me a visual depiction of what we can do.


And so we've integrated it with the local native BI tool here, Google Studio, to represent that. And that's going to be part of the demo here today. But what we're really trying to hammer in here is this near realtime experience and how important that is, and Chris, I know I mentioned that we could have another session on how important this is, and so many use cases out there. I think I'll just pick one. I think last time I talked about logistics and this time I'll talk about financial services. So before Reltio I was a product manager in a larger financial services institution, and when you think about digital onboarding experiences and how customers that are looking to make investments have so many different options today, and it's no longer you have to go into a bank, it's all done online to open up an account.


But there's still rules and regulations that customers need to follow. KYC and AML rules particularly are important, right? So the institution needs to trust that customer, to make sure they're a legitimate client and it's safe to invest with that individual. So throughout this experience in the eCommerce world our customers expect decisions to be made like that, and when they have so many different options to go ahead and sign up for an account online it's important to make this experience seamless, like what they would expect when they put in an Amazon order.


So when they're starting out and applying to be a part of a financial institution, invest with them, essentially pass their money to them, the institution also needs to do some background checks as part of that. I was part of developing a tool to not only validate that experience or validate the customer for that institution but at the same time they could be inserting malware or ransomware attacks. You've heard about that, that was going on in the industry a couple years ago and still is prominent today.


So the idea of being able to pass, let's say, a passport or a driver's license, and to protect that organization from malware coming in, but at the same time trying to get that back to the customer and say, "Hey, okay, you're good to go. Your passport cleared. Your driver's license cleared," takes a lot of work under the hood. And there's so many different departments and protocols that need to be followed but reflecting back on that experience, the hardest thing to do was actually match this incoming request to join the institution to an actual single source of truth, right?


So when you're getting all this metadata coming in about the passport, about the information with the customer, you're also trying to validate that against a database with a single source of truth. And that validation needs to be in near realtime because the customer's not going to hang there for 15, 20 seconds. They're going to think that something's wrong with the page and they're going to move on to the competitor. The institution loses a lot of money it spent to go out and partner with that individual.


So that's just a case of near realtime trends and experiences with data that we're facing today. Just wanted to share the importance of this and what we're trying to do with this GBQ Connector, is essentially facilitate these kinds of experiences. Okay. So said a lot here. Hopefully this is clear in terms of what we're trying to do. I'm going to move on. Chris, feel free to sprinkle in any questions you've had, but I'll go onto the architecture.

Chris Detzel (00:13:36):

Yeah, keep going. That was a great use case, I loved it. Thank you.

Jon Ulloa (00:13:40):

Awesome. Okay, so this is an architecture in terms of GCP tenants here at Reltio. How do we connect to a GCP project, right, to deliver this data with the Google BigQuery Connector? There's pretty much two options here. You see the separation of a Reltio GCP project. We see these platform services where your tenants live and there's lots of activities that are going on in this environment. You're loading new data every day. You have data stewards making some changes to some match rules, new entities created, relations deleted, what have you. We set up an event stream to essentially listen to those changes as they're happening and we're essentially categorizing these events through microservices, which is the GBQ Connector, and transforming them into compatible formats to land in your GCP project, within Google BigQuery, in your project and dataset, right?


And so what we're doing here is we're getting all these events and putting them into tables, views and routines to effectively manage a near realtime ingestion of data from Reltio into your Google BigQuery account. And that's essentially, in a nutshell, what it's doing. There's some permissions that need to be set up here to be able to write into your project, right? And that's done through either you using our Reltio service account, which is on the left side, or we can also take in your service account, right? But we need to have some permissions to be able to do so, and after those permissions are granted and we enable this tenant for you it pretty much starts working like that. After the initial data load everything is working for you and you're getting data in easily-formatted tables, ready to be queried against.
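The event categorization Jon describes can be pictured with a minimal sketch. The event type names and table names below are hypothetical placeholders chosen for illustration; they are not taken from Reltio's documentation, and the real connector runs as managed microservices rather than a single function.

```python
from collections import defaultdict

# Hypothetical event-type prefixes mapped to hypothetical raw table names.
TABLE_FOR_PREFIX = {
    "ENTITY": "entities",
    "RELATION": "relations",
    "INTERACTION": "interactions",
    "MATCH": "matches",
    "MERGE": "merges",
}


def route_events(events):
    """Group change events by destination table; unknown types go to 'unrouted'."""
    batches = defaultdict(list)
    for event in events:
        # The prefix before the first underscore decides the destination.
        prefix = event["type"].split("_", 1)[0]
        batches[TABLE_FOR_PREFIX.get(prefix, "unrouted")].append(event)
    return dict(batches)


events = [
    {"type": "ENTITY_CREATED", "uri": "entities/1"},
    {"type": "ENTITY_CHANGED", "uri": "entities/1"},
    {"type": "RELATION_REMOVED", "uri": "relations/9"},
    {"type": "WORKFLOW_STARTED", "uri": "workflow/3"},
]
batches = route_events(events)
```

Keeping an explicit "unrouted" bucket mirrors the point made in the summary that an event which does not match the expected GBQ element may need a support ticket rather than being silently dropped.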


And when we talk about routines we think about the fact that there's a lot of events that are flowing in, right? Let's think about cost concerns about getting all this information flooded into your GBQ project that's going to go ahead and put a lot of bills on you. And so we have routines here to go ahead and compact and archive this data so that you're not going to get hit with these costs, right?
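To make the cost idea concrete, here is a minimal sketch of what a compaction pass could look like. It assumes each raw row carries a `uri` and a numeric `version` timestamp, and that a row may be archived once it is older than a cutoff and is no longer the latest version of its object. The function name, the cutoff rule, and the row shape are all assumptions for illustration; the transcript does not describe how the connector's routines actually decide what to archive.

```python
def plan_compaction(rows, cutoff):
    """Split raw rows into (keep, archive).

    A row is archived when its version is older than `cutoff` and it is
    not the latest version recorded for its URI.
    """
    latest = {}
    for row in rows:
        if row["uri"] not in latest or row["version"] > latest[row["uri"]]:
            latest[row["uri"]] = row["version"]

    keep, archive = [], []
    for row in rows:
        is_latest = row["version"] == latest[row["uri"]]
        if is_latest or row["version"] >= cutoff:
            keep.append(row)
        else:
            archive.append(row)
    return keep, archive


rows = [
    {"uri": "entities/1", "version": 100},
    {"uri": "entities/1", "version": 200},
    {"uri": "entities/1", "version": 300},
    {"uri": "entities/2", "version": 50},
]
keep, archive = plan_compaction(rows, cutoff=250)
```

The latest version of every object is always kept, even when it is older than the cutoff, so anything built on "latest version" semantics would still see every object.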


So we're thinking about the cost management aspect of getting this realtime experience but we're also thinking about getting you this actual data, so you can go through your analytics applications. And so these are just two different ways that we can move forward from an architecture standpoint about how the GBQ Connector works, and I do want to do a quick shout-out here that also for anybody that's a GBQ customer but maybe doesn't have a GCP tenant, this also works with AWS and Azure tenants as well to be able to get this information into your Google BigQuery account.


We have separation between your tenant right here and then a project, and what we require to go ahead and facilitate this is a little bit different. It's just getting some key information as a kind of intermediary approach, but everything else just pretty much flows the way we just talked about, right? And a cool thing I do want to mention here in terms of performance, for those that maybe have the legacy analytics connector that we had here in the past, is that we are now using the highly performant Google Write API. Before we were using the insertAll API.


So this way here in the picture, I just want to go ahead and call that out because that is a huge performance gain, and it's also saving on costs at the same time for you guys. So making some good performance and cost savings features here with this new product.


And so I mentioned integration headwinds. Customers want to go ahead and see the value of this. There's really, pretty much, just six basic steps for each way you want to go about getting access to this connector and getting access to this data. If you're using our service account you've just got to get that information through our documentation. You create a project and you create a dataset and you provide your tenant environment, your tenant ID, the project name, the dataset name. Put it into a ticket and if you're set up, you know, contractually for this product, then you receive a go-ahead to be using this product, and now it's enabled.


And then there's some validation steps, which we have in the documentation, and then you just transfer your data and then everything is working, and you don't have to do any more work, right? It's a very low code, no code, out of the box situation. And if you're using your service account, the only difference is you just give us that information instead of going to get our information: key credentials and API access to facilitate that. That's pretty much the only difference. Everything else is pretty seamless here and that's what we're really trying to aim for with this new product.


So this is just a look at what the table structure and the data schema would look like. We do have two options here. We have this JSON view/table schemas option which we're really excited about, and this is what you would see here today. So, when you talk about out of the box, what do you get using this connector, you get essentially five views and five raw tables. The raw tables are the ones that, especially if you're getting this data in to start, are managing all the different event versions that are coming in to facilitate these views that you see here on the screen.


The data schemas are the same for the views and the tables so, from an entity standpoint, you're getting that information about their attributes and their crosswalks, what entity type it is, is it an organization, is it a contact, for example. And you're getting some event details in terms of who created this event, who updated it, and what time this is occurring. The only difference with the relation view or [inaudible 00:19:46] is that we have a start and end object to facilitate that lineage of that relation.


And when we talk about interactions, if you guys have used interaction data, we have members of the interaction. So if an organization submitted a payment to an employee, right? You'd be able to track that in the member view. And then from a match perspective, this is what a lot of people are interested in with this use case, the match and the merge aspect of it, is that you get potential matches, you get what's not a match, and you get manual matches. And you get the latest version of these considerations, the rules associated to that and the timestamp of when a potential match occurred.


So today, in the Reltio platform, it is available but you only can see so much on the UI. You get a subset of these potential matches. But when you think about a use case for people to look at, they want to go into production using some match rules that they've tested on the lower environments. They want to go ahead and play around with that in lower environments, understand those matching rules and potential matches and see how they actually orchestrate in lower environments to see if they feel confident moving that into production. And so you're getting all the potential matches, not just the subset, with this export, which is pretty neat.


And then from a merge standpoint, you're getting what happened after maybe a potential match was taken action on. And this is historical merge tree information: who was the winner, who was the loser after a merge, what were the match rules associated to that? Was it an auto merge? Was it a manual merge? And now, pretty exciting, we have a new type of merge that we're also categorizing, on the fly, and we'll talk about that a little bit in the demo here.


But on the fly merge is essentially a process where, not like auto when there's two entities and you have that documented within the Reltio platform. There's one entity and one entity that's becoming an entity but not quite there yet. So you'll be able to capture this on the fly that you already understand that this phantom entity is associated to this other entity, and we're also categorizing them now as an on the fly merge, which is also helpful because a lot of our customers are looking into that, whether that's for entitlement concerns or also just trying to get better at analyzing this information before it gets too downstream.


And so we have this information as well. So pretty exciting, I'll definitely go ahead and show you in the demo what this looks like with some real data, but that's essentially the data schema in a nutshell and stuff.

Chris Detzel (00:22:29):

Right. Couple questions, Jon. Frank, thanks for the question. How does or will the connector resolve new entity types and relationship types in Reltio? So harmonize the graph node model with the relational model tables and views in GBQ post instantiation. Something like that. Does that make sense?

Jon Ulloa (00:22:50):

Yeah, it does. Yeah, yeah. This one we'll cover in the demo, you'll be able to see exactly how everything updates. And, yeah, it's super interesting. So hopefully the demo answers your question, Frank, and if not feel free to follow up and we can talk about it.

Chris Detzel (00:23:06):

Great. And then one other question on entities. Are the crosswalk level attributes like OV and non-OV available within the crosswalk array?

Jon Ulloa (00:23:18):

Yeah. So we'll go over this, look into the actual real data, what it looks like in the demo, but we do have the option to include OV and non-OV; that's a configuration setting. We also have an option, if you're saying, I just want to have the operational values, right? The single source of truth. I don't want to worry about the non-OV. There's a configuration option within the GBQ Connector to give you that. But we offer OV and non-OV as default.

Chris Detzel (00:23:43):

Great, thank you. That's all the questions for now.

Jon Ulloa (00:23:46):

Awesome. Cool. So, yeah, I talked a little bit about this, that part of the connector experience is that it's a managed pipeline. You don't have to worry about it. This is pretty much out of the box. We provide the out of the box schema, the formatting, the transformation into your GBQ project and dataset so you don't have to. You don't have to worry about this aspect of things and, frankly, if the JSON format, this new feature here is selected, when you think about if you're adding a new attribute as part of your data model within the Reltio platform, that essentially is taken care of by Reltio. So you no longer have to submit a support ticket whenever there's a data model change to be able to understand and get the latest on that information, right? This is now managed by Reltio.


So when you talk about out of the box, this is another thing that Reltio does on behalf of you. And then, yeah, core tenant data, so I mentioned we do matches, merges, relations, interactions and entities, and this is to give you that 360 view of the data, near realtime. I think I beat that drum pretty prominently. And then we also have the reporting template which will be part of this demo to show you how to visualize data and how to start building your own reports off of the data that we provide you. And then this aspect of the JSON and column-based formatting.


So column-based is for our legacy Reltio Analytics customers; what we had there was a column-based format. And I'll show you in the demo what that looks like, and we have this backwards compatibility in case they want to migrate to the new connector, but we would definitely recommend the JSON format, which we go with in the demo as well. Both are easy to manage in the receipt, whether that's for backwards compatibility purposes or for just net new and starting out with it.


Yeah, so again, what this connector's doing is enabling these accurate insights. It's doing this rapidly and it's taking over the integration so you don't have to worry about that. And I think that's pretty much what I wanted to cover here before the demo. Any other questions, Chris, or can we go hop into this demo?

Chris Detzel (00:26:04):

I will be. Seeing is believing.

Jon Ulloa (00:26:07):

Gotcha. Okie-dokie. So here, hope you guys can see my screen, we have the Google BigQuery open here. I'm going to take you through a variety of stages of this demo. We have a tenant here that's already configured, both on the legacy format and the new JSON format. So I wanted [inaudible 00:26:30] time just to kind of show you what the legacy version looked like and just to show you what we've done in between. So I'll start with that. This is an option that's available with this GBQ connector. So we'll start by running this command here.


Okay, so you'll see that we give a URI, a type, an ID, created time, who's it created by, updated time, updated by. If there's a deletion event, the event time, insert time, we have a version time. This is important in both of the formats here because this is how we manage the latest object versions with our new views. We have this in the legacy compatible format as well.


And then you see now that the columns start exploding, this is column-based format here, so we have crosswalks, value, attributes, URI, update. All this stuff is going to go ahead and explode all the way out. Hundreds of columns for you. And that's how the data is organized. And it's the same for attributes, it'll go on and on off of this. And what I'm going to do now, just show you what the view looks like. It's pretty similar. Yeah.

Chris Detzel (00:28:05):

Out of the box, with L3 changes, how would it sync with the GBQ data model? Is it manual or automated? What will be the process?

Jon Ulloa (00:28:14):

Yeah, so that's a good question. With the legacy view we have a variety of business rules that we apply, force functions, to make the formatting and the schema match as close to one-to-one as possible. And within the documentation we note exceptions, right? For example, if there's an underscore in the Reltio UI or in an attribute, like my underscore field, it would be my dash field, something like that, right? So there's some discrepancies in that. We have business rules to go ahead and try to resolve as much as possible.


But this is why we're really promoting the new JSON format because that is now a one-to-one mapping of the Reltio data format that you see in the Reltio UI or APIs, right? To what you see in GBQ. GBQ recently added JSON as a column datatype itself, and we took advantage of that. That's essentially what this new feature's about, that now we don't have to do any manual business rules to make sure that that is aligned with Reltio data. It's now done automatically. Hopefully that answers that question.


Okay, awesome. Yeah, so this is the view that we have here. Among the legacy views we see that it's going to be pretty much the same. What you would get here that's different is that when there's multiple versions it will come up as the latest version. But that's it. You get all these columns that are blown out here. Okay. So I'm going to be done with legacy pretty much for the most part of this demo. I just want to make anybody aware that we do have backwards compatibility that we're keeping in mind through the migration, if you guys choose to do migration, to avoid any hiccups with trying to rework anything that's already set up.


We have all this for you to consider. But we also want to recommend that you switch to the new format because this is just more highly performant and cost-efficient and provides that one-to-one mapping. So moving forward to the JSON format schema here, I'm going to show you. This is a different tenant and it will start going into the raw tables. Oh. Let's see. Okay, my bad. All right. So query response time's a lot quicker on this new format here, but you'll go ahead and see the URI, the same thing, the type of the entity. This is the contact, the ID, the created time, update time. All this is very familiar, right? Event time. But then we have the version here, but we now store these attributes and the crosswalks in JSON, right? So you have all these attributes, your crosswalks here that are going to be associated with this.


Somebody was asking do we include the OV values, true, and we do. We also have this pulse rate and this would all be in the attributes and the crosswalks sections. So you look at the JSON, here you can see all the attributes, crosswalks kind of bundled in here. The way we're designing it is that it does match that 1:1 mapping of Reltio so you're not going to be getting data discrepancies. So when you think about data prep you don't need to do any additional work to go ahead and get those to match in Reltio, right? And that's kind of the point of what we're doing here with this new format. It's also more performant.
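
To make the attributes JSON concrete, here is a minimal Python sketch of pulling only the OV (operational value) entries out of one row. The row layout, the column names, and the sample values are assumptions for illustration; the per-value `ov` flag follows the shape Reltio uses in its entity payloads, but check your own tenant's data before relying on it.

```python
import json

# Hypothetical row from the raw entities table. Column names and values are
# assumptions for illustration, not the connector's documented schema.
row = {
    "uri": "entities/abc123",
    "type": "configuration/entityTypes/Contact",
    "version": 1,
    "attributes": json.dumps({
        "FirstName": [{"value": "Jane", "ov": True}],
        "LastName": [{"value": "Doe", "ov": True}],
        "Age": [{"value": "45", "ov": True}, {"value": "44", "ov": False}],
    }),
}


def ov_values(attributes_json: str) -> dict:
    """Return {attribute name: [values whose ov flag is true]}."""
    attributes = json.loads(attributes_json)
    result = {}
    for name, entries in attributes.items():
        winners = [entry["value"] for entry in entries if entry.get("ov")]
        if winners:
            result[name] = winners
    return result


print(ov_values(row["attributes"]))
```

Because the JSON is stored as-is, the same filtering could be done in the warehouse with JSON functions instead of in client code.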


So that's what the raw entities look like. I'll just quickly go over what the view looks like for entities. That's going to be pretty much the same kind of structure here. It's a view, so it's a virtual query, right? We recommend that you use these views to actually query events. Like I mentioned we do use version times so that it represents the latest object version. There's a lot of events that might occur in the entity. We'll go through this in the demo today, but you would see this version time and it would always be the latest version. The raw tables include all the version history, as I explained to you, with that compaction algorithm I mentioned to help you with your cost savings. But the views themselves are what you want to use for your queries because this is the latest object version. This is the latest data for your insights that you want to go ahead and experience.
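
The difference between the raw table (every version) and the view (latest version only) can be sketched in a few lines of Python. This is only an illustration of the idea, not the connector's actual view definition; the `uri` and `version` field names and the sample rows are assumptions.

```python
from typing import Iterable


def latest_versions(rows: Iterable[dict]) -> dict:
    """Keep only the highest-version row for each entity URI."""
    latest = {}
    for row in rows:
        current = latest.get(row["uri"])
        if current is None or row["version"] > current["version"]:
            latest[row["uri"]] = row
    return latest


# Hypothetical raw-table rows: the full version history of two entities.
raw_rows = [
    {"uri": "entities/a1", "version": 1, "age": 45},
    {"uri": "entities/a1", "version": 2, "age": 55},
    {"uri": "entities/b2", "version": 1, "age": 30},
]

view = latest_versions(raw_rows)
print(view["entities/a1"])
```

Querying the view is the equivalent of running this reduction for you, which is why it is the recommended starting point for reports.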


Nice. So now I'm just going to quickly go over the schemas and the relations here. And I'm just going to call out, we talked about just in terms of the start, [inaudible 00:33:36] rights, that aspect of it. That's what we're showing here in the relations and that's target and object. That's what we're looking for of a relation between these two entities, for example. That's what you're getting out of that relation view up here.


All right. Then for interactions, I slightly digress because I'm going to go back to the old tenant, and this is the legacy view. There's no interactions on the demo tenant I have, but I just wanted to show you the member stuff. There'll be a part of this situation. Yeah, interaction type, payments, rights. And then the members of the transaction, pretty much.


So again, that's what we're seeing here for interactions. You have your member stuff, that's important, and for interactions that would be in JSON format, you would get the same kind of thing except you're not getting the column-based, you're getting the JSON, the crosswalks and attributes package, but you would understand an interaction member through that. So now we're switching back to the merges. I want to show you this real quick.


Yeah, like I mentioned, you have the merge key information, deleted, winner ID, loser ID, the match rules, first name, last name, this is a match rule and where a merge happens. The version, timestamp, inserted time, whether it is a type. This is a manual merge, I actually did this for the demo. There's auto. And then this is on the fly aspect of it, right? So this was done with a match rule here, with stamped identity, and this was created, and this is an on the fly. So this is the table structure we have here for merges.


Right, last but not least we'll do the matches. Here, entity ID, whether it was deleted, false retrieval. When you think about deleted, this is an aspect of when a match becomes a merge. We don't need to represent that anymore in the match because it's now on merge, right? So we're reflecting that here. The version, the potential matches. This is what you're looking for here, the match rules associated, potential match, whether if there's a match more enabled, put that there, and then somebody classifies it as not a match, and then if it's a manual match it's just off in a rule.


That's in a nutshell what you're getting to start with but now I wanted to switch gears to actually showing an integration within the tenant. So here I have this tenant open in the UI. Might create a new individual here. Excuse me. I'm a Loony Tunes fan so going to just create one for Elmer. What else do we want to do here? What else do I show? For example, change. Call him 35. Say he's a hunter. Let's get down. I'm putting in some random facts just to show you that we're capturing all the attributes through here too, as well. Sales values, 100,000. And I guess this is more a leads situation so I think the tenant is insurance, so lead on somebody that's trying to buy a policy. This is the aspect of that. And call him married for now, and he has this income. And then credit score stays. This is all starts with a credit score, right? And his discount or something.


Okay, so we're adding this Elmer Fudd, right, here? I'm going to save this. All right, so I have this new ID right here. I'm going to go back here and what we're going to do is show you how quick this updated. So I'll just place this here for now. When I select this statement here, this is the raw entity table for URI I just created so I want to show you if it's updated how quick it is. Let's see if it actually did update. It did. So that's super quick, right? We see this entity ID matches this one. It's created by me at this time. Deleted events and then we have these attributes here, 45, income, credit score top notch. And Elmer Fudd, right? So this is exactly, we're capturing all the attributes here, the crosswalk is from Reltio. JSON just to... Yeah, creation source is Reltio. I just made it with Reltio.


That's pretty exciting. Now I want to show you the entity view here. This is going to become important when we actually change a bit. Little bit longer on the virtual query but everything shows up here. It reflects in the view, this is what it is. And now what I'm going to do here is change this entity and let's say he's 55, actually, and he has 125,000. That's really what it is, it's the change, you get a raise at work, like that. He got a raise at work, he's a hunter. He's still in as a hunter. All right, so I just made this change, right? You'll see how it reflects over here.


This is the same query, actually I'll just use... Okay. Right, so now we see that there's two versions here, all right? The first version that I had, version one, and then version two. And I see through 45, this changed to 55. So 75,000 changed to 125,000. Right? And so that's pretty important to show because when you look at the view you'll just get the latest version. Version two. Because that's the most up to date. That's the right age and that's the income after the raise. That's what we have here.
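
As a rough illustration of comparing the two versions in the raw table, the sketch below diffs two attribute snapshots. The flat dictionaries are a simplification assumed for this example (the real rows hold nested JSON), and the values mirror the ones read out in the demo.

```python
def changed_attributes(old: dict, new: dict) -> dict:
    """Return {attribute: (old value, new value)} for attributes that differ."""
    keys = sorted(set(old) | set(new))
    return {
        key: (old.get(key), new.get(key))
        for key in keys
        if old.get(key) != new.get(key)
    }


# Simplified, hypothetical snapshots of version one and version two.
version_one = {"Age": 45, "Income": 75000, "Occupation": "Hunter"}
version_two = {"Age": 55, "Income": 125000, "Occupation": "Hunter"}

print(changed_attributes(version_one, version_two))
```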


So super fast, super performant, and that's it for the near realtime experience of the demo. But I want to try to jump into... Let's see, we have 15 minutes left. I'm going to try to hurry up here. I'm trying to show you here, now we have this report. These are custom views that I created on top of the out of the box tables and views that we provide here at Reltio, part of the GBQ Connector. I created a view here and we'll just... Just to show you this is part of a report. I want to understand maybe a match KPI report.


And what's important is that we show essentially some aspects of the potential match. We want to see this in a report and so I'm organizing by type, match rule, the source ID and if there's a match associated to that source ID, and then a match score, right? And these are queries and SQL queries that we provide to you as part of the documentation. So you can actually recreate this report yourself. Just showing you what exists in this report.
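
The shape of such a match report query can be sketched locally. The example below uses Python's built-in `sqlite3` as a stand-in for BigQuery, and the table name, column names, rule names, and scores are all hypothetical; the real queries are the ones provided in the connector documentation.

```python
import sqlite3

# In-memory SQLite database standing in for a BigQuery dataset.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE potential_matches (
        entity_type TEXT,
        match_rule  TEXT,
        source_id   TEXT,
        match_id    TEXT,
        match_score REAL
    )
    """
)
conn.executemany(
    "INSERT INTO potential_matches VALUES (?, ?, ?, ?, ?)",
    [
        ("Contact", "first_last_name_exact", "e1", "e2", 80.0),
        ("Contact", "first_last_name_exact", "e3", "e4", 90.0),
        ("Contact", "email_exact", "e5", "e6", 100.0),
    ],
)

# Count potential matches and average the score per entity type and rule.
report = conn.execute(
    """
    SELECT entity_type, match_rule, COUNT(*) AS matches, AVG(match_score) AS avg_score
    FROM potential_matches
    GROUP BY entity_type, match_rule
    ORDER BY entity_type, match_rule
    """
).fetchall()

for line in report:
    print(line)
```

The grouping and averaging carry over to BigQuery SQL, although table references and JSON handling would differ there.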


You want to look at the merges too. So, again, what type of merge, auto, manual, on the fly. The match rules associated to it, the ID and your count of merges, right? Counting merges throughout the dataset. I also want to understand what wasn't merged because I'm trying to get some totals here. Trying to understand total entities. Trying to understand what wasn't merged, what was merged, and what is a potential match for future. So I'm trying to get these totals to represent this and these are my entity type, all the entities that don't have a merge to it.


And then last but not least, this is the entity view. Entities that are updated, the start and end date, and the number of entities per the contact type. So using just these few views that are going to come out of the box for our application, but also showing you how this gets updated to a snazzy little reporting template. So we have this reporting template here, this is something that we provide to you as part of the product here, and you'll see this is off of a different tenant, but we want to go ahead and showcase the entities, entities with no merges, potential matches, auto merges and manual merges, right?


And we have merge rules and match rules and entity types of filter. We have this match rule, the number of merges associated to the match rule, and then a match rule for potential matches as well, and if there's an average score associated to it. More importantly here we have these entity ID drill-downs. So when you think about this you're trying to understand a particular drill-down to a specific entity and what just happened with the entity. What was the match rule associated to, the merge rule and the entity type.


You see entities here represent this number, entities with no merges, this number is different by four. So these are the four entities that were merged and these are no merges. And the matches are separated because they're potential matches for a merge. We track them separately on this side. Just again trying to showcase that aspect of it and the merges reflect this number of merges here.


And so this is a template that you were given to work with here and really all you do to make this relevant for your data is you go in here and these are the four queries that I just went over. And you essentially go in, I'm not going to replace them all because I already did it, but you go into your projects, your dataset, have this insurance, and then you pick, it's actually these tables I just created. I forget which one it was, I'm not going to add it, but you just add it. And this is all the data they connect to after the submission step two. This is pretty exciting with Google BigQuery, right? Lots of different applications.


But once you do that for all four of these, you essentially get this. This is my demo that we're doing this BigQuery application on, the insurance. Now this is the total number of entities, entities with no merges. Potential matches and the auto merges and the merges here. So when you think about entities you subtract this no merges, then you're dealing with entities that have some kind of merge. The majority of them were auto merged. Some were manual merges. And the difference here is those on the fly merges that we've just added. So you have, again, the number of merges associated to the match rules, these match rules are going to be associated to your data model and your tenant, right? So you see here it's different. The other one was manual, this has a bunch of other ones based on the certain rules that are put in.
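
The arithmetic behind those report tiles can be written out as a short sketch. It follows the reading given in the walkthrough (total entities minus entities with no merges, then minus auto and manual merges, leaves the on-the-fly merges); the numbers are made up, and whether each tile counts entities or merge events in a given tenant is an assumption to verify.

```python
def on_the_fly_merges(
    total_entities: int,
    entities_without_merges: int,
    auto_merges: int,
    manual_merges: int,
) -> int:
    """Derive the on-the-fly count as the remainder of the merged total."""
    merged = total_entities - entities_without_merges
    remainder = merged - auto_merges - manual_merges
    if remainder < 0:
        raise ValueError("tile values are inconsistent with each other")
    return remainder


# Hypothetical tile values, not taken from the demo tenant.
print(on_the_fly_merges(total_entities=100, entities_without_merges=60,
                        auto_merges=30, manual_merges=7))
```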


And then again, you drill down to the merge entities and the match and this is all actually populated as the latest merge and the latest match that was generated. So it's ordered by that. And this represents a snapshot of the data any time. Any time you'd be able to look at this report, this is the latest activity that's occurred. It's consistent with the connector value proposition.


So that's what the report looks like and what I want to do now is I'm going to add a new entity, a match rule, and then we look at that potential match and then display the report change. So I'm going to go ahead and add a new entity. And it's Elmer Fudd again. And let's say that he's 55, he's a hunter, he's male. Whatever. Let me get down to something here. Right, he's put in a sales value and he has different income. But it's the same name. And I'll say that he's now married. And then his credit score is lower or whatever. All right.


So I'm going to save this and I have a new entity ID. I go in here. This is the entity ID. I'll just run this. See, already showed up as a potential match to this entity that I just created. And you can look at this also in this potential match section here. Go here, Elmer Fudd, hunter, Elmer Fudd, hunter. Looks like it's recommending I merge based on a rule here. Last name, exact, right? This is the rule. And when we go to this report, let's see, refresh this. It's a new match, so let's add it here. Let's just go back here again to look at this one for PW, supposed to be.


I think it's for one. There you go. XLZPW. Right? So it shows up in near realtime. This is something we just created. That's pretty impressive, right? And then what I'm going to do now is merge it. I'm going to merge this guy because I think that they're the same person, based on this rule. So I'm going to go ahead and do that. It is now merged. I'll place this information here. This one. There we go. Merged. This is the winner ID, and this is the loser ID. And I'll see how this guy shows up now on the matches, right?


Again, I mentioned if you're doing a flag on the deletion, is it still a match. I'm going to look at that view real quick. Two versions. One was deleted, one was not, right? And we have this potential match but now this is a deleted aspect. So you look at the latest matches view and we should actually see nothing. Right. So it's orchestrated here, fundamentally updating throughout all these different tables based on the activities that you see in the Reltio platform, which is great.
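
Why the latest matches view comes back empty after the merge can be sketched as follows: take the newest version of each match pair, then drop the pairs flagged as deleted. The field names and IDs are hypothetical and the logic is a simplified guess at what the view does, not its actual definition.

```python
def active_matches(match_rows: list) -> list:
    """Return the latest version of each match pair that is not deleted."""
    latest = {}
    for row in match_rows:
        key = (row["entity_id"], row["match_id"])
        if key not in latest or row["version"] > latest[key]["version"]:
            latest[key] = row
    return [row for row in latest.values() if not row["deleted"]]


# Hypothetical raw match rows: version 1 is the potential match, version 2
# records that the match was removed because the entities were merged.
match_rows = [
    {"entity_id": "e1", "match_id": "e2", "version": 1, "deleted": False},
    {"entity_id": "e1", "match_id": "e2", "version": 2, "deleted": True},
]

print(active_matches(match_rows[:1]))  # before the merge: one potential match
print(active_matches(match_rows))      # after the merge: nothing left
```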


And then last but not least you see this... Let's go ahead and change this up. Include all of this now. Let's see how this goes to the merge. Click on refresh data here. I'm going to go into the rules. So this rule here is first and last name, NOI and you see it pops up, this is the winner ID, right? And XLX, then the XLX. And it shows up all the way to the top because this is the most recent merge that happened. So that's essentially in a nutshell how this works, just showing the near realtime experience, look at your matches and merges and see how this updates in the report. It's pretty exciting stuff.


I think we have five minutes. Chris, I wanted to show this API, this entity in the [inaudible 00:53:41] monitoring API, it's probably the most exciting feature, but I'll just put the-

Chris Detzel (00:53:44):

Start with that first from now on.

Jon Ulloa (00:53:47):

Now I'll get started, maybe you've got to invite me to another one. But just a quick plug on that is that we now have a monitoring feature that allows us to look at the events that occurred within that specific entity. So there's a lot of things that happened with this, then the XLX entity, Elmer Fudd, right? We created a new entity, we updated it, we saw there was a potential match and it got merged. You can actually see all that event flow in terms of events and timestamps through this API, and that essentially, what that does is that gives you that visibility of troubleshooting what happened with this.


If there's an issue on the GBQ platform or there's an issue on Reltio you'd be able to see all the steps in terms of what the step says is data pipeline process. And when it says that then that means that it's definitely in the GBQ element. But if you see something that doesn't say that or you see an incomplete event that means that there might be something wrong that we need to investigate. You can go ahead and create a support ticket for that.
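
The troubleshooting check described here could look something like the sketch below. The monitoring API's real response format is not shown in the session, so the event shape, the field names, and the `DATA_PIPELINE_PROCESSED` label are all hypothetical placeholders.

```python
# Hypothetical label for the final step; the actual wording in the
# monitoring API response may differ.
FINAL_STEP = "DATA_PIPELINE_PROCESSED"


def reached_bigquery(events: list) -> bool:
    """True if the most recent event for the entity is the final pipeline step."""
    if not events:
        return False
    # ISO 8601 timestamps in the same timezone sort correctly as strings.
    ordered = sorted(events, key=lambda event: event["timestamp"])
    return ordered[-1]["step"] == FINAL_STEP


# Hypothetical event flows for two entities.
complete_flow = [
    {"step": "EVENT_RECEIVED", "timestamp": "2023-03-01T10:00:00Z"},
    {"step": FINAL_STEP, "timestamp": "2023-03-01T10:00:02Z"},
]
incomplete_flow = [
    {"step": "EVENT_RECEIVED", "timestamp": "2023-03-01T10:05:00Z"},
]

for name, flow in [("complete", complete_flow), ("incomplete", incomplete_flow)]:
    status = "in GBQ" if reached_bigquery(flow) else "consider a support ticket"
    print(name, "->", status)
```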


So we're providing that visibility, right? That everything should be working as expected but in case not, for your peace of mind you can look at that flow and we're opening that up to you guys to check it out. So it's pretty cool, it's pretty neat. I'll have to probably get back to you to show that. I know we don't have a lot of time today. Wanted to just leave some time for any questions, Chris, that might come in.

Chris Detzel (00:55:14):

There's a lot of questions. I don't know that we'll get to them all but let's start. Will there be any table in GBQ for workflows like DCR?

Jon Ulloa (00:55:22):

Love the question. Yeah, that's on my roadmap. We are looking to get that out in our next release. We're looking to get out workflow datatypes, working to get out history log datatypes for that year-over-year comparison of how many merges we saw on the profiles this year versus last year, and we're looking to get out some activity log datatypes as well to see from an audit perspective who's doing what in the platform and looking to see on usage there.

Chris Detzel (00:55:54):

Great. So, Sean's new to Reltio. He has several questions, I'm not sure we'll get to them all, but why do we need a GCP project for Reltio? I think you did cover it but he just needs more details on what services are needed and how to connect. How is the connector deployed?

Jon Ulloa (00:56:13):

Yeah, so the connector is good for any kind of tenant within Reltio. AWS, Azure, GCP. There's just a different architecture for security purposes to ensure that there's no commingling of different environments. Since this is a Google BigQuery Connector there's a different setup when you do have a Google tenant with us versus an AWS or Azure. But the same thing is needed for the connector to connect to your Google BigQuery project and dataset. We need to be able to load the data into that dataset, [inaudible 00:56:50] project and do some authentication. Read our service account to allow us to get into your Google BigQuery instance.

Chris Detzel (00:56:59):

Great. And he has three other questions, I'm going to skip those for now because I think they're not going to be quick, but one other question: are you planning to include a CSV flat format version in the query results, or is that a GBQ enhancement?

Jon Ulloa (00:57:18):

That's a GBQ enhancement.

Chris Detzel (00:57:20):

Okay. Do we also-

Jon Ulloa (00:57:21):

Yeah, we have the AHA-

Chris Detzel (00:57:21):

Sorry, go ahead. The AHA stuff.

Jon Ulloa (00:57:24):

Yeah, AHA, the plug for AHA, if you have the enhancement request. Put it back.

Chris Detzel (00:57:31):

Good. And can we also integrate or push this data into Google 4 or GA4, Google Analytics 4?

Jon Ulloa (00:57:39):

Yeah, so I think what's good about this is that it goes into your Google BigQuery instance, so I would say whatever Google BigQuery is compatible with, you can definitely integrate into other Google services. I think I showed some of the different data things that you can pass through.

Chris Detzel (00:57:58):

Yeah. And then we'll just skip to the last one because I don't know if it's going to be long but what are the options if we need to share this data outside of Google Studio and use another reporting tool like Power BI, et cetera?

Jon Ulloa (00:58:10):

Yeah. So there's a share process here and that's going to be part of our documentation, to show you how you can share that within your organization to facilitate those types of use cases.

Chris Detzel (00:58:22):

Great. And lastly-

Shadra (00:58:24):

I meant if you wanted to use this data but not use Looker Studio, we may not want to use Looker Studio, and should we want to connect, let's say, Power BI directly to GBQ, and access the same data, be able to run similar queries, and build our reports there instead.

Jon Ulloa (00:58:46):

Hey, Shadra, how's it going? Long time no talk.

Shadra (00:58:50):


Jon Ulloa (00:58:51):

Yeah. So the way that this works is it's native to Looker Studio but the thing is that if Looker Studio can connect to BI, a tool like Tableau or Microsoft BI, then if that integration exists then it can work. So that's just something you have to investigate. But these are SQL queries, so you can look at what's under the hood of the report. We show those queries so you can see if that can go ahead and blend into your Tableau or BI instance.

Shadra (00:59:23):

All right, sounds good.

Chris Detzel (00:59:25):

All right everyone, we're out of time.

Shadra (00:59:27):

Thanks, Jon.

Chris Detzel (00:59:29):

And, so Sean, I will actually push those questions out to the Reltio community, so thank you everyone for coming. After this show you will get a pop-up from Zoom. Please take the survey and all your feedback is very helpful. Until next week, to another show, Jon, thank you so much. Jon, this was really good, really exciting stuff here on the two connectors, specifically this one, GBQ. So, thank you everyone for coming, and that's a wrap.

Jon Ulloa (01:00:02):

Appreciate it, Chris, thank you so much.

Chris Detzel (01:00:03):

Take care, bye.

Jon Ulloa (01:00:04):

Yeah, bye.