Step One of Refocusing Your Organization Around CLV: De-silo Your Data

The following is an excerpt from our new book, The Chance of a Lifetime: How to Use Customer Lifetime Value Reporting to Grow Your Retail Business. To read the book in its entirety, get it here.

At first glance, this might look like an intimidating step. You’ve got point-of-sale data, purchase history, online browsing behavior, demographic data, social media metrics, email engagement metrics, and so on. It’s a lot to pull together.

Prioritize Data Sources

We believe that your most valuable data by far is individual-level purchase data. Why? The actual exchange of money shows strong intention and is the fundamental starting point for how cash will flow in the future.

Within the purchase history, there are nuances of transactional behavior that tend to be highly correlated with CLV. The treasure trove of informative data points embedded just in your online and offline transactional history includes things like:

  • What products they buy
  • Product purchase history
  • Channels of entry
  • Promotions
  • Shipping and payment methods

Other types of data that may be useful but aren't as high-priority as transactional history are things like customer demographic data (geo-location, age, gender, marital status, income, etc.) and non-purchase engagement behavior (e.g., how they interact with your loyalty program, email open and click data, site browse data).

But you don't need to unify ALL of your data before you can reap any benefit. Taking a pragmatic approach, focus strategically on the data sources that are most important for CLV modeling. You can enrich the high-priority data by adding new dimensions over time.

Let’s say you have a transactional file that looks something like this:

User ID    Order Date    Revenue
12345      2019-04-16    $299
23456      2019-03-27    $144


And then you have a users file that looks something like this:

Email address      City
abc@custora.com    Brooklyn
xyz@custora.com    Columbus


To create a unified profile of each user, we need some way of linking our users back to their purchase behavior. "Stitching it together" means identifying the common identifiers that can be used across data sources.

For the users file here, that could mean adding a column for the user ID.
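As a minimal sketch of that stitching step, the Python below joins the two sample files on a shared user ID. The pairing of each email address with a user ID is hypothetical and chosen only for illustration, as are the field names and the `build_profiles` helper.

```python
# Transactions keyed by user ID, mirroring the sample transactional file.
transactions = [
    {"user_id": "12345", "order_date": "2019-04-16", "revenue": 299},
    {"user_id": "23456", "order_date": "2019-03-27", "revenue": 144},
]

# Users file with the added user_id column.
# Assumption: which email belongs to which ID is invented for this example.
users = [
    {"user_id": "12345", "email": "abc@custora.com", "city": "Brooklyn"},
    {"user_id": "23456", "email": "xyz@custora.com", "city": "Columbus"},
]


def build_profiles(users, transactions):
    """Return one profile per user, with that user's orders attached."""
    profiles = {u["user_id"]: {**u, "orders": []} for u in users}
    for t in transactions:
        profile = profiles.get(t["user_id"])
        if profile is not None:  # ignore orders that match no known user
            profile["orders"].append(
                {"order_date": t["order_date"], "revenue": t["revenue"]}
            )
    return profiles


profiles = build_profiles(users, transactions)
```

With a common identifier in place, each profile carries both the demographic fields and the purchase history, which is the shape a CLV model typically needs.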

Cleanse Your Data

There are three major categories of data cleansing: data standardization, data validation, and data deduplication and consolidation.


1. Data Standardization 

Data standardization creates uniformity by grouping like values in a set. In the world of e-commerce, there are countless instances where slight variations in data will carry the same operational value.

For example, when inputting a shipping address, Jayne Dough might write out “street” for her first purchase but then, for her second purchase, she abbreviates it as “st,” thus creating two records.

Recognizing that these data points represent the same person ensures that data is organized based on standardized criteria rather than meaningless differences.
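One way to picture standardization is a small normalization function. The sketch below is illustrative only: the abbreviation map, the `standardize_address` helper, and the sample address are all assumptions, and a production system would need a far more complete rule set (for instance, "St" can also mean "Saint").

```python
import re

# Hypothetical abbreviation map; a real one would cover many more variants.
STREET_ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def standardize_address(address):
    """Lowercase, strip punctuation, and expand common abbreviations."""
    tokens = re.sub(r"[^\w\s]", "", address.lower()).split()
    return " ".join(STREET_ABBREVIATIONS.get(tok, tok) for tok in tokens)


# Two spellings of the same (made-up) address collapse to one value.
first = standardize_address("12 Main Street")
second = standardize_address("12 Main St.")
```

Because both inputs reduce to the same string, they can be treated as one address rather than two records.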

2. Data Validation

Data validation processes guarantee that data makes sense against all governing business rules. An obvious glitch in data might appear if, for instance, the assigned date of return is actually earlier than the date of purchase. These are the kinds of things that need to be fixed before modeling can begin.
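The return-date rule above can be expressed as a simple check. This is a sketch under assumed field names (`purchased_on`, `returned_on`) and made-up order records, not a complete validation suite.

```python
from datetime import date


def find_invalid_returns(orders):
    """Return orders whose return date is earlier than the purchase date."""
    return [
        o for o in orders
        if o["returned_on"] is not None and o["returned_on"] < o["purchased_on"]
    ]


# Hypothetical records: A2 was "returned" before it was bought.
orders = [
    {"order_id": "A1", "purchased_on": date(2019, 4, 16), "returned_on": date(2019, 4, 20)},
    {"order_id": "A2", "purchased_on": date(2019, 3, 27), "returned_on": date(2019, 3, 1)},
    {"order_id": "A3", "purchased_on": date(2019, 3, 27), "returned_on": None},
]
invalid = find_invalid_returns(orders)
```

Flagged records can then be corrected or excluded before they reach the model.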


3. Data Deduplication and Consolidation

Data deduplication and consolidation eliminate redundant pieces of information and provide a retailer with a single definitive set of records. Should the same customer check out as a guest three times and input slightly different variations of her name or address each time, the model should recognize these variable inputs as coming from the same person and consolidate them into a single user profile.
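As a rough illustration, the sketch below consolidates guest checkouts by exact match on a normalized email address. That matching key is an assumption made to keep the example short; real deduplication often relies on fuzzy matching across name and address as well. The records and the `consolidate` helper are hypothetical.

```python
from collections import defaultdict


def consolidate(checkouts):
    """Group guest checkouts by normalized email into one profile each."""
    grouped = defaultdict(list)
    for c in checkouts:
        key = c["email"].strip().lower()  # assumed matching key
        grouped[key].append(c)
    return [
        {
            "email": email,
            "name_variants": sorted({r["name"] for r in records}),
            "order_count": len(records),
        }
        for email, records in grouped.items()
    ]


# Three guest checkouts by the same (made-up) customer.
checkouts = [
    {"email": "jayne@example.com", "name": "Jayne Dough"},
    {"email": "Jayne@Example.com ", "name": "J. Dough"},
    {"email": "jayne@example.com", "name": "Jane Dough"},
]
profiles = consolidate(checkouts)
```

The three checkouts end up in a single profile that keeps the observed name variants alongside the order count.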

Best Practice for Data Unification

Focus on Scalability

Making CLV usable for your organization is not a one-time exercise. You're going to want to keep those CLV scores refreshed and updated. From a de-siloing perspective, that means building scalable ETL (extract-transform-load) processes, which are essentially the pipelines that move and unify data.

If you'd like to learn more about how to prep your data for predictive modeling, see our webinar, Jumping the 3 Big Hurdles to Predictive Modeling, hosted by Custora Head of Product Marketing Jordan Elkind.

And to learn more about how Customer Lifetime Value reporting can help you efficiently and effectively grow your business across customer lifecycle phases, download our new book, The Chance of a Lifetime.
