Saturday, December 19, 2020

Overview of Adobe Privacy Service and Data Repair API

Imagine a world where user data and information flows freely without any regulations and guidelines. Information ranging from your date of birth and health records to your credit card number is accessible to anyone on the Internet. Well, given that we're still in the middle of a pandemic, let's not make things worse and get back to reality.

We know that web analytics and data management platforms (barring CDPs) are not supposed to store PII or PHI. They are designed to capture anonymous behavioral user data, with the ability to store demographic data against indirectly identifiable encrypted IDs. However, it's virtually impossible to control what data gets sent into these tools, and in my personal experience working with a lot of different clients and tools, I've seen PII such as email addresses, phone numbers and even SSNs still being passed accidentally. I don't think we can ever completely guarantee that this won't happen again, but we can certainly put the right guardrails in place before data is captured.
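As a rough illustration of what such a guardrail could look like, here is a minimal sketch that scans a value for obvious PII before it is sent to an analytics tool. The patterns and the function names (find_pii, scrub) are my own assumptions for illustration, are US-centric, and would need tuning for real data; they are not part of any Adobe product.

```python
import re

# Illustrative patterns only (assumptions); real guardrails need tuning.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(value: str) -> list[str]:
    """Return the names of the patterns that match somewhere in the value."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(value)]


def scrub(value: str, placeholder: str = "[redacted]") -> str:
    """Replace every match of every pattern with a placeholder."""
    for pattern in PII_PATTERNS.values():
        value = pattern.sub(placeholder, value)
    return value


# Hypothetical search term typed by a user into a site search box.
search_term = "order status jane@example.com"
if find_pii(search_term):
    search_term = scrub(search_term)
print(search_term)
```

A check like this would sit in the data layer or tag manager logic, before the value is assigned to a variable such as an eVar.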

GDPR and CCPA (among other regulations) put more power in the hands of individuals to know what data is collected on them, understand how it's used, opt out of its collection and have it deleted. This article explains the impact of these two regulations on marketers, but before that it is also important to understand the difference between a data subject, a data processor and a data controller, as explained here.

In this post, I will cover how Adobe Analytics customers can obfuscate or delete data, using the Adobe Privacy Service (UI) and the Data Repair API respectively. Since this article primarily covers how these two tools can be used to execute deletion requests, please review the official documentation on both for the finer details which I won't cover. So, let's dive in!

Adobe Privacy Service

The Adobe Privacy Service provides Adobe customers with a user interface and APIs to help manage their customer data requests AFTER the data has been collected, as opposed to an opt-in service such as OneTrust which blocks data from being captured in real time. The Privacy Service allows you to selectively delete data from Adobe solutions such as Adobe Analytics, Target etc. based on user identifiers or namespaces such as ECID or encrypted user ids. Please note that the Privacy API should not be used to delete PII captured accidentally; it should only be used to serve delete requests from data subjects. Also note that I will specifically cover the privacy UI in this post, but there is also a Privacy Service API which allows you to accomplish the same tasks programmatically.

Use Case

The primary use case for leveraging the Privacy Service is to either access or delete data for users who explicitly reach out to a brand to request a copy of all their personal data or ask for their personal data to be deleted. A good example of how users can do so is shown here.


Overview of Privacy Labels


In order to access or delete any data from Adobe Analytics, Adobe Experience Platform and other solutions, the first step is to add privacy labels to each attribute which contains sensitive data. The labels are classified into three primary categories as covered here, so please review them as they are outside the scope of this article. I will use two Adobe Analytics variables, Internal Search Term and User Name, as examples to perform a data deletion request first through the Privacy Service UI and then through the Data Repair API.

The first step is to go to the Data Governance section by visiting Analytics > Admin > Data Governance within Adobe Analytics. You'll first land on the page which lets you select your report suite and also shows the data retention policy, i.e. how long your data will be retained.

Once you select your report suite, you can see that in my case the Internal Search Term variable (eVar1) has an I1 label, which essentially means that this variable contains PII. That PII was captured for some of the user names stored in eVar50, which has the label I2 (indirectly identifiable). Please note that only the variables which are labelled will be part of the delete requests; unlabelled variables will be left as-is.


We also need to pick some data governance labels, where I'm specifying that I want to delete all data at the person level. The labels are explained in more detail for eVar50 below. Please note that in order to delete data tied to a custom variable, you will need to define a custom namespace (profile_id in my case) which will be the primary identifier used to delete IDs. You can name it anything or use an OOTB namespace such as ECID.


Privacy Job Request


A privacy job request can be made by visiting the privacy page (requires IMS authentication). There is an option to pick the regulation type (GDPR, CCPA etc.) and delete IDs either manually or in bulk by uploading IDs in a JSON file (covered below). I will cover how to do it using both methods but before I do so, let's take a look at the data captured in eVar1 and eVar50. One thing to note is that only data tied to user ids "abc123" and "def456" will be obfuscated as I will only be processing delete requests for these ids and the rest of the data will be left as-is.


Here are the two methods by which you can send a delete request from the privacy UI.


I'll first process a delete request using the manual method. Please note that you can process a request for any of the solutions mentioned, but in my case I'll be deleting (obfuscating) data from Adobe Analytics for the user id "abc123" captured in eVar50, which is tied to the namespace "profile_id". Any value I enter manually this way will be obfuscated, but this is not a scalable approach if you want to delete ids in bulk.


Once you click create, you will see that a job id is created which contains my user name "abc123". 


I'll now process a delete request using the JSON upload method, which allows you to upload up to 1000 IDs per request, with a limit of 10,000 IDs per day. In my case, I only have 1 ID to delete, "def456". Below is what my JSON looks like. Note that you need to include your IMS org, specify the namespace and list the ids in the "users" array, among other attributes.
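For readers who want to generate such a file rather than hand-edit it, here is a minimal sketch that builds one payload per batch of 1000 IDs. The field names (companyContexts, users, userIDs, include, regulation) and the "analytics" type follow my reading of the Privacy Service documentation and are assumptions here, as are the function name and the placeholder org id, so please verify them against the official docs before uploading anything.

```python
import json

MAX_IDS_PER_FILE = 1000  # per-upload limit mentioned in this post


def build_delete_payloads(ims_org, namespace, user_ids, regulation="gdpr"):
    """Split user_ids into batches and build one delete payload per batch."""
    payloads = []
    for start in range(0, len(user_ids), MAX_IDS_PER_FILE):
        batch = user_ids[start:start + MAX_IDS_PER_FILE]
        payloads.append({
            # Assumed structure; check the official documentation.
            "companyContexts": [{"namespace": "imsOrgID", "value": ims_org}],
            "users": [
                {
                    "key": uid,
                    "action": ["delete"],
                    "userIDs": [
                        {"namespace": namespace, "value": uid, "type": "analytics"}
                    ],
                }
                for uid in batch
            ],
            "include": ["Analytics"],
            "regulation": regulation,
        })
    return payloads


# "EXAMPLE@AdobeOrg" is a placeholder, not a real IMS org id.
payloads = build_delete_payloads("EXAMPLE@AdobeOrg", "profile_id", ["def456"])
print(json.dumps(payloads[0], indent=2))
```

Each element of payloads can be written to its own .json file and uploaded separately, which keeps every upload within the 1000-ID limit.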


I also had my network tab open while doing this so you can see the request actually makes an API call to the Privacy Service and sends over the necessary fields to process a delete request.
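If you'd rather make that call yourself, the request can be sketched roughly as follows. The endpoint URL and header names are taken from my understanding of the Privacy Service API and should be treated as assumptions; the token, client id and org id are placeholders. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the Privacy Service API reference.
PRIVACY_JOBS_URL = "https://platform.adobe.io/data/core/privacy/jobs"


def build_privacy_request(payload, access_token, api_key, ims_org):
    """Build (but do not send) a POST request carrying a privacy job payload."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,        # client id from the I/O Console
        "x-gw-ims-org-id": ims_org,  # assumed header name for the IMS org
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        PRIVACY_JOBS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder credentials and an empty payload, for illustration only.
request = build_privacy_request({"users": []}, "TOKEN", "CLIENT_ID", "EXAMPLE@AdobeOrg")
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would actually submit the job.
```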


Impact on Adobe Analytics Data


Once the requests have been fully processed, which typically takes 5-7 days, you can see that the status is marked as "complete". I've purposely hidden the job id and requester email address.


As far as data in Analytics is concerned, you will see that the two variables in question now contain the word "privacy-" followed by a unique identifier for every record tied to the user IDs I sent in the request.
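One way to double-check the result is to export the variable and confirm that none of the deleted IDs remain. The sketch below assumes the export has already been loaded as a list of dictionaries; the row values and the helper names are hypothetical.

```python
OBFUSCATION_PREFIX = "privacy-"  # prefix seen in the obfuscated records


def is_obfuscated(value: str) -> bool:
    """True if the value looks like an obfuscated record."""
    return value.startswith(OBFUSCATION_PREFIX)


def remaining_ids(rows, deleted_ids, column="evar50"):
    """Return the deleted IDs that still appear in the given column."""
    present = {row[column] for row in rows}
    return sorted(present & set(deleted_ids))


# Hypothetical export rows; the identifiers after "privacy-" are made up.
rows = [
    {"evar50": "privacy-0001"},
    {"evar50": "privacy-0002"},
    {"evar50": "ghi789"},
]
print(remaining_ids(rows, ["abc123", "def456"]))
print(sum(is_obfuscated(row["evar50"]) for row in rows))
```

An empty list from remaining_ids means the requested IDs no longer appear in that column.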



There's a lot of information available in the official Adobe document but I've only covered information relevant to my use case.


Data Repair API

The Data Repair API gives Adobe Analytics customers access to APIs which allow them to delete any data they want to remove. The API scans all rows of data for a particular report suite and deletes all data in the custom variables defined as part of the API request.

Use Case

The primary use case for leveraging the Data Repair API is to completely delete data from Adobe Analytics variables. The typical scenario is when a customer may have inadvertently captured PII data in an Analytics variable.

Data Repair API Request

The official documentation covers all information around the prerequisites (admin console, console.io access token, global company id etc.) and caveats so I highly recommend that you read that. In this section, I'll cover how I went about sending the API requests using Postman.

Populate the access token, client id (from I/O Console) and global company ID (from Analytics) as headers in the /serverCallEstimate GET call.


Pass the start and end dates as query string parameters in the same request. You will see that the API response generates a validation token along with specifying the total number of server calls.
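Outside Postman, the same estimate call can be sketched as below. The base URL, path layout, query parameter names (dateRangeStart, dateRangeEnd) and the x-proxy-global-company-id header are my recollection of the Data Repair API documentation and are assumptions, so confirm them there; the company id, report suite id and dates are placeholders. The sketch only builds the URL and headers and does not send anything.

```python
from urllib.parse import urlencode

BASE_URL = "https://analytics.adobe.io/api"  # assumed base URL


def estimate_url(global_company_id, rsid, start, end):
    """Build the /serverCallEstimate URL for a report suite and date range."""
    query = urlencode({"dateRangeStart": start, "dateRangeEnd": end})
    return f"{BASE_URL}/{global_company_id}/datarepair/v1/{rsid}/serverCallEstimate?{query}"


def repair_headers(access_token, client_id, global_company_id):
    """Headers described in this post: access token, client id, company id."""
    return {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": client_id,
        "x-proxy-global-company-id": global_company_id,  # assumed header name
    }


# Placeholder identifiers and dates.
url = estimate_url("examplco", "examplersid", "2020-12-01", "2020-12-15")
headers = repair_headers("TOKEN", "CLIENT_ID", "examplco")
print(url)
```

The validation token in the response is what the delete request needs next.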

The next request is the actual delete POST request, where we specify the variable that needs to be deleted. You can add multiple variables as part of the same request, but I only sent a request for eVar50 in this example. Also, take a look at the response, which provides us with a job ID and status.
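Here is a minimal sketch of how that POST could be assembled in code, including several variables in one call. The /job path, the validationToken query parameter and the {"variables": {...: {"action": "delete"}}} body shape reflect my understanding of the Data Repair API and are assumptions to verify against the official documentation; all identifiers are placeholders. Nothing is sent.

```python
import json
from urllib.parse import urlencode


def build_repair_job(global_company_id, rsid, validation_token, start, end, variables):
    """Return the (url, json_body) pair for a delete job covering the variables."""
    query = urlencode({
        "validationToken": validation_token,  # from the /serverCallEstimate response
        "dateRangeStart": start,
        "dateRangeEnd": end,
    })
    url = f"https://analytics.adobe.io/api/{global_company_id}/datarepair/v1/{rsid}/job?{query}"
    # One entry per variable, each with a delete action (assumed body shape).
    body = {"variables": {name: {"action": "delete"} for name in variables}}
    return url, json.dumps(body)


# Placeholder identifiers; "VALIDATION_TOKEN" stands in for the real token.
job_url, job_body = build_repair_job(
    "examplco", "examplersid", "VALIDATION_TOKEN", "2020-12-01", "2020-12-15", ["evar50"]
)
print(job_url)
print(job_body)
```

Passing ["evar1", "evar50"] instead would cover both variables in a single request, which avoids sending one request per variable.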

So, that is the extent of what I will cover, but a lot of other useful information is available in the official Adobe documentation.

Impact on Adobe Analytics Data

Once the request is processed, all data is deleted from Adobe Analytics as shown below.


What Else Would I Like to See Added

Even though I'm happy that we finally have a productized solution for deleting PII data, I would like to see the following enhancements added to it:

  • A user interface to make these requests in addition to APIs
  • Ability to add regex or conditional logic built into the UI to selectively delete data instead of deleting everything 
  • Make this API available for other solutions including AAM and others
  • This is not a big deal, but when I tried to send multiple requests (one for each variable), I ran into the following error. It would be nice if users were able to send multiple requests without waiting. Regardless, you can get around this by sending a delete request for multiple variables as part of the same call.

All in all, it is a great tool for customers who want to delete data using a productized solution which is much cheaper to use. Given that I learnt about this API recently, I wanted to share my learnings with everyone. Hope you found this post informative. 
