Reltio Connect

  • 1.  Best Practices For Handling Events in Reltio DLQ

    Posted 11-03-2022 21:06
    What are folks doing to handle entries in the Reltio Dead Letter Queue (DLQ)?

    1. How are the specific entries in the queue identified and read?
    2. What approach is used to address the errors that occur, especially as the issue would appear to be in the Reltio stack?
    3. Are you using the 200-attempt threshold before entries are written to the DLQ?
    4. What have you been able to automate?
    5. What role is Reltio (Support or otherwise) playing in managing the queues, identifying issues, resolving the root cause and reprocessing the previously queued entries?
    Thanks,

    Mark

    ------------------------------
    Mark Burlock
    Dodge Data & Analytics
    Hamilton NJ
    ------------------------------


  • 2.  RE: Best Practices For Handling Events in Reltio DLQ

    Reltio Employee
    Posted 11-04-2022 10:01
    @Mark Burlock, I can answer your question number 5. I will also have @Saurabh Agarwal answer some of the others.

    Support does not have any proactive mechanisms in place, because the DLQ is a collection of failed messages that have already been retried over a period of time. IRS tasks should rebuild the missing messages to ensure consistency of the tenant.

    With that said, I do have some documentation that could be helpful. 

    Monitoring Queues: 

    https://docs.reltio.com/en/explore/get-your-bearings-in-reltio/console/tenant-management-applications/tenant-management/external-queues-overview/monitoring-queues 


    The internal queues are used to process data internally within Reltio. The external queue is used for real-time integration with systems external to Reltio.

    For the internal queues, the monitoring dashboards show the queue size, the processing speed of the CRUD and Match queues, the time spent by events in the queue, the number of messages that failed, and the size of the DeadLetter queue.

    The External queue dashboard displays the number of events or messages sent to the external queue.



    ------------------------------
    Chris Detzel
    Director of Customer Community and Engagement
    Reltio
    ------------------------------



  • 3.  RE: Best Practices For Handling Events in Reltio DLQ

    Reltio Employee
    Posted 11-04-2022 11:20
    Hi Mark,

    ------------------------------
    Saurabh Agarwal
    ------------------------------



  • 4.  RE: Best Practices For Handling Events in Reltio DLQ

    Posted 11-04-2022 12:31
    @Saurabh Agarwal

    We currently have just shy of 500 entries in the DLQ and that number has been relatively consistent since early October.

    Visibility into the entries in the DLQ would let us know which entities may be impacted, so that we can research further and take possible corrective action. A failure to ever process the DLQ entries results in a loss of data. A significant delay results in time-sensitive data changes being "missed".

    Because we are running on AWS, DLQ entries are lost after 14 days, which makes visibility even more important before the entries are lost forever.
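
To make the 14-day window concrete, here is a minimal sketch of how long an entry has left before it ages out. It assumes only the 14-day retention figure mentioned above; the function name `time_until_expiry` and the timestamps are hypothetical and not part of any Reltio or AWS API.

```python
from datetime import datetime, timedelta, timezone

# Retention window taken from the post above (14 days on AWS).
RETENTION = timedelta(days=14)

def time_until_expiry(enqueued_at, now, retention=RETENTION):
    """Return the time left before a message ages out (zero if already expired)."""
    remaining = enqueued_at + retention - now
    return max(remaining, timedelta(0))

# Hypothetical example: an entry queued on 11/2, checked on 11/7.
enqueued = datetime(2022, 11, 2, 12, 0, tzinfo=timezone.utc)
now = datetime(2022, 11, 7, 12, 0, tzinfo=timezone.utc)
print(time_until_expiry(enqueued, now))  # 9 days, 0:00:00
```

In this example, any investigation that takes longer than the remaining nine days would start after the entry is already gone.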

    With the 500 entries in the DLQ and no further visibility, I don't know whether it's one entity, 500 entities or something in between.

    It would also be helpful to understand the types of errors that are occurring, especially before they are fixed by Engineering, so that we might understand how to minimize or mitigate their occurrence.

    Thanks,

    Mark


    ------------------------------
    Mark Burlock
    Dodge Data & Analytics
    Hamilton NJ
    ------------------------------



  • 5.  RE: Best Practices For Handling Events in Reltio DLQ

    Reltio Employee
    Posted 11-07-2022 10:28
    Hi Mark,

    Are you referring to events that are transmitted to external queues for consumption by your systems? For these events, Reltio's infrastructure is designed to guarantee that all the events are transmitted to the external queues.

    Events in the DLQ are events that failed to sync between Reltio's primary and secondary databases and may cause temporary inconsistency in a customer's tenant data. These temporary inconsistencies are resolved by the IRS. Investigating events in the DLQ is Reltio's responsibility, and customers do not need to investigate them. No corrective actions are required from you on these events.

    Thanks,

    ------------------------------
    Saurabh Agarwal
    ------------------------------



  • 6.  RE: Best Practices For Handling Events in Reltio DLQ

    Posted 11-07-2022 14:56
    Hi Saurabh,

    We are seeing that requests successfully made to Reltio are not making it out to the external queue for consumption by our systems.

    It is unclear where the message is being dropped. We are currently seeing a max time in the queues of just over 2 days, a dynamic failed message count that does go down to zero, and a continual inventory in the DLQ with some drops, which I believe are primarily related to the AWS 14-day retention of the queue entries.

    We intentionally modified an attribute for a specific entity ID for testing purposes on 11/2 and have seen NO indication of the change reflected in the external queue intended for consumption. At the same time, we made the same change to a different entity ID and all is working as expected, as is the case for about 99.9% of our requests.

    There were earlier event entries for that entity (up through 10/4), but they are no longer coming through.
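
A check like the one described above can be sketched as a small reconciliation step on the consumer side. This is only an illustration under stated assumptions: the messages are assumed to have already been read from the external queue as JSON strings, the `uri` field name is an assumption about the event payload rather than a confirmed Reltio schema, and the entity IDs are made up.

```python
import json

def missing_entities(expected_uris, raw_messages):
    """Return the expected entity URIs for which no event was seen."""
    seen = set()
    for raw in raw_messages:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        # "uri" is an assumed field name for the changed entity.
        uri = event.get("uri") if isinstance(event, dict) else None
        if uri:
            seen.add(uri)
    return sorted(set(expected_uris) - seen)

# Hypothetical test: two entities were changed, only one event arrived.
changed = ["entities/AAA111", "entities/BBB222"]
received = ['{"type": "ENTITY_CHANGED", "uri": "entities/BBB222"}']
print(missing_entities(changed, received))  # ['entities/AAA111']
```

Run against the entities changed in a given window, the output is the list to hand to Support, rather than a single hand-checked entity.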

    I strongly suspect that the data we are waiting to find is in the DLQ and will remain there until it ages out. 

    We did experience some errors a few weeks ago where crosswalk and nested sub-attribute limits were exceeded, but the changes which we are now "missing" occurred after those changes.

    Originally we were expecting that these would be addressed by Reltio, but our concern is that some appear to be dropping.

    Do you know if customers can have direct read access to the Reltio DLQ?

    We do have a ticket opened with Reltio Support, but I posted to the community in the hopes of gaining a better understanding of what might be going on and also what other customers are experiencing.

    Thanks,

    Mark






    ------------------------------
    Mark Burlock
    Dodge Data & Analytics
    Hamilton NJ
    ------------------------------



  • 7.  RE: Best Practices For Handling Events in Reltio DLQ

    Reltio Employee
    Posted 11-10-2022 10:27
    Hi Mark,

    I suggest that we let the support team do their initial investigation to identify the reason for the issues you are seeing. The support team will connect with the product and engineering teams if they need assistance.

    We are unable to provide you access to the Reltio DLQ.

    Thanks,

    ------------------------------
    Saurabh Agarwal
    ------------------------------