Saturday, December 1, 2018

Reasons For Data Variance Between Analytics Systems

I'm almost certain that at some point in your career, you've been asked to troubleshoot the age-old problem of comparing numbers between different analytics systems. I've been asked about this several times, so I'm writing this post to share what, in my experience, are some of the reasons for these differences, along with some potential ways to close the gaps. I'll walk through a recent engagement with a client, some of the issues we faced, and a potential solution the team is evaluating.

A few people have suggested not pursuing this type of analysis and looking only at trends, so I want to clarify that we did exactly that, and the trends did not line up in this case. We all know that no two systems are going to match perfectly, but in a scenario like this you do need a way to validate your orders and establish the web analytics tool as the ultimate source of truth. That can only be done by going through this painful exercise at least once, which is where things stood when I was first looped into the project.

The Challenge and Context

My client reported a 25% variance between eCommerce orders reported on the website and in their backend system, with the website being lower. The issue started after they redesigned the UI (a Single Page App) and updated the backend platform; the difference was caught about two weeks after the new UI went live. We had used DTM to slightly revamp the implementation and add additional attributes, and asked the platform development team to fire the same Direct Call Rule on the confirmation page that was already in place in the previous version.
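For context, a Direct Call Rule is invoked from page code via DTM's `_satellite.track()`. Here is a minimal sketch of the kind of call involved; the rule name "order-confirmation" and the guard function are illustrative assumptions, not the client's actual code:

```javascript
// Sketch only: "order-confirmation" is a hypothetical Direct Call Rule name.
function fireOrderConfirmation(satellite) {
  // Guard against DTM not being loaded yet -- a silent failure here
  // is exactly how order beacons go missing.
  if (satellite && typeof satellite.track === "function") {
    satellite.track("order-confirmation");
    return true;
  }
  return false;
}

// On the page this would be: fireOrderConfirmation(window._satellite);
// Simulated here with a stand-in object:
var fired = [];
fireOrderConfirmation({ track: function (name) { fired.push(name); } }); // fires the rule
fireOrderConfirmation(undefined); // library missing: returns false, nothing fires
```

The point of the explicit return value is that a missed fire becomes observable instead of vanishing silently.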

The client wanted us to come up with a solution to 'backfill' these historical missing orders in Adobe Analytics and also apply the right kind of marketing attribution to them, which was always going to be challenging given that time had already passed. The basic issue is that once a Visitor ID has been to the site, visited a bunch of pages and set certain eVars, backfilling data for that user would 'break' the pathing flow and the visitor profile, and any downstream conversion metrics would also be thrown off. This Adobe article on timestamps explains the issue better than I can.

What is an Acceptable Variance between Systems?

Honestly, the answer to that question depends on the client, the system and the metric we're looking at. For deduplicated metrics such as Visits, Visitors or even Orders, anything below 5% is acceptable in my experience. For duplicated metrics such as Page Views or Clicks, I've seen clients accept anything below 10-15% when comparing a client-side system (Adobe Analytics) with a backend CRM system (my use case).
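To be explicit about what I mean by "variance" here, we expressed the gap as a percentage relative to the backend system's count (the order numbers below are made up for illustration):

```javascript
// Percent variance of the web analytics count relative to the backend count.
function percentVariance(webCount, backendCount) {
  return (Math.abs(webCount - backendCount) / backendCount) * 100;
}

percentVariance(7500, 10000); // 25: the size of gap my client reported
percentVariance(100, 100);    // 0: perfect (and unrealistic) agreement
```

Picking the backend system as the denominator matters: it was the source of truth for successful order submissions, so the percentage reads as "share of real orders the web tool missed."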

The other thing to note is that if we're comparing Adobe Analytics to Google Analytics or any other web analytics system, a lower variance should be expected. Keep in mind that both client-side solutions should fire as close to each other as possible on the page for the variance to stay low.

Analysis and Approach to Pinpoint the Variance 

During the course of the investigation, we looked into the following factors that could have contributed to the variance.

Before we begin, make sure the records you download from both systems cover the same time frame and time zone, and share the same dimensions and filters as much as possible. No two systems will ever be apples to apples, but the comparison files should be as close to each other as you can make them in all respects. With that in place, here are the factors we examined, keeping the Orders metric in mind while comparing a web analytics solution such as Adobe Analytics against a backend CRM system.
  • If one system tracks all the payment methods and the other doesn't, you will see a discrepancy in overall orders.
  • There may be conflicting JavaScript code deployed on the confirmation pages (privacy scripts, plug-ins etc.) that is not compatible with scripts used by other payment methods or with the Analytics tags. This was actually one of the causes of the difference in our case.
  • Legacy browsers may not run JavaScript tags, which is another source of variance.
  • Internal IP filters or bot filters applied at the web analytics tool level won't be present in the CRM system.
  • Our Direct Call Rule was firing at least 8-10 other marketing tags, including Google Analytics, and we suspected that in some cases Adobe Analytics didn't fire because of that, so try to limit the number of tags on key pages such as the confirmation page.
  • Most important is the order of execution of the Adobe Analytics tag and whether it is able to successfully grab all elements present in the data layer. In our case, we noticed that Adobe Analytics was not able to grab elements from the data layer 100% of the time.
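The last bullet is essentially a race condition, and one mitigation we discussed was a readiness check: poll the data layer for the required order fields before firing the beacon, instead of assuming they exist when the rule runs. This is a minimal sketch under my own naming assumptions, not the client's implementation:

```javascript
// Poll for required data layer keys before firing; give up after maxTries.
function whenDataLayerReady(getDataLayer, requiredKeys, onReady, onTimeout, maxTries) {
  var tries = 0;
  (function poll() {
    var dl = getDataLayer();
    var ready = dl && requiredKeys.every(function (k) { return dl[k] != null; });
    if (ready) return onReady(dl);            // all fields present: safe to fire
    if (++tries >= maxTries) return onTimeout(); // fields never arrived: log it
    setTimeout(poll, 100);                    // retry every 100ms
  })();
}

// Usage sketch: fire the beacon only once orderId and total exist.
var result = null;
whenDataLayerReady(
  function () { return { orderId: "A1", total: 49.99 }; }, // stand-in data layer
  ["orderId", "total"],
  function (dl) { result = dl.orderId; },  // e.g. _satellite.track("order-confirmation")
  function () { result = "timeout"; },     // e.g. send a monitoring ping
  3
);
```

If the timeout path is instrumented in production, it also gives you a direct count of how often the beacon would have been lost.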

What was the Root Cause of the Variance?

The main reason for the variance was that the website did not fire the Direct Call Rule in DTM required to track the orders in Adobe 100% of the time, whereas the CRM system always captured the order if it was a successful submission. We were able to nail this down when we looked at the orders in Google Analytics, which ALSO reported a 25% variance against the CRM system.

Potential Solutions to Backfill Missing Traffic

Partial: Data Insertion API

The Data Insertion API is a well-known and common method of sending data to Adobe Analytics server side. We leveraged it to backfill the historical orders, and it worked, but the other requirement around marketing channel attribution could not be met this way. If your client also wants to fix the marketing channel attribution problem, this method is not going to work.

Regardless, I still want to quickly cover how we went about testing some orders by using Postman to send a POST request to Adobe Analytics. Note that we need to pass an epoch-based timestamp, which is a required field when backfilling historical (or any) data.
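A request along these lines is sketched below. The tag names follow the Data Insertion API's XML format, but the report suite ID, visitor ID, product string and timestamp are made-up placeholders; check the Data Insertion API documentation for your own tracking server namespace:

```xml
<!-- POSTed to the Data Insertion endpoint, e.g. https://<namespace>.sc.omtrdc.net/b/ss//6 -->
<!-- All values below are hypothetical placeholders. -->
<request>
  <scXmlVer>1.0</scXmlVer>
  <reportSuiteID>examplersid</reportSuiteID>
  <visitorID>1234567890123456</visitorID>
  <timestamp>1538352000</timestamp> <!-- epoch timestamp: required for backfilling -->
  <pageName>order confirmation</pageName>
  <events>purchase</events>
  <products>;EXAMPLE-SKU;1;49.99</products>
  <purchaseID>EXAMPLE-ORDER-1001</purchaseID>
</request>
```

The purchaseID is worth setting even on a backfill, since Adobe uses it to deduplicate purchase events.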

Complete: Data Reprocessing

I don't have much experience with this, but based on what I've heard, a custom consulting engagement with Adobe's engineering team to reprocess the data can fix the attribution issue. This is still being evaluated, and I will provide an update when we're done with the project.

How Can We Avoid Issues Like This?

The only way to avoid such issues in the future is to thoroughly test in a development environment before anything goes to production. The same rigor applied in production is needed to test key metrics in the test environment, especially during major redesigns, so that issues are nipped in the bud. On an ongoing basis, you can also create anomaly detection dashboards in Adobe Analytics or other tools designed to catch noticeable drops or inflations. There are other ways to avoid these issues, but that is a separate blog post in itself.
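To illustrate the idea behind such a dashboard, here is a toy check that flags a day whose order count sits too many standard deviations from the recent history. This is not Adobe's anomaly detection algorithm, and the window size, threshold and numbers are all arbitrary assumptions:

```javascript
// Flag today's count if it sits more than k standard deviations from the
// mean of the trailing history (a simple z-score style check).
function isAnomalous(history, today, k) {
  var mean = history.reduce(function (a, b) { return a + b; }, 0) / history.length;
  var variance = history.reduce(function (a, b) {
    return a + Math.pow(b - mean, 2);
  }, 0) / history.length;
  return Math.abs(today - mean) > k * Math.sqrt(variance);
}

// Two weeks of daily orders hovering around 100, then a sudden 25% drop:
var history = [98, 102, 101, 97, 103, 99, 100, 104, 96, 101, 100, 98, 102, 99];
isAnomalous(history, 75, 3);  // true: flag for investigation
isAnomalous(history, 101, 3); // false: within normal range
```

A check like this running daily would have surfaced our 25% drop in a day or two rather than the two weeks it actually took.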

My hope with this post is to help you understand some of the reasons why traffic differs between systems, and to leave you better prepared when undertaking such a task. I wrote about something similar way back in 2007, when I started my career, that might also be relevant. Have you run into a similar issue with your clients?


Sharan S said...

Great post! I was not aware of the Data Insertion API feature to historically load order data. We had similar data variance issues on the order confirmation page and the business was losing confidence in clickstream data from Adobe. So apart from placing anomaly detection on these important pages, what we made sure is to have our QA team create a test environment that is an exact replica of Production (which needed some effort and approvals). We checked for variances against back-end CRM systems for a couple of months. In those two months of thorough testing, we fixed unwanted tags, made sure the Adobe beacons were firing 100% of the time with no race conditions and capturing all elements from the data layer, performed cross-browser/platform testing, regression testing (peak traffic) and tried to cover all the test case scenarios. With this the variance reduced, and only then did we have the confidence to push the code into Production. I think with time, thorough testing and the right environment, such variances can be avoided.

Rohan Kapoor said...

Sharan, very well said and articulated. I hope this is what all clients do from the start. Thanks for sharing!