Beyond Attribution to Action: Building a Personalization System



Personalization may not be as controversial, discussed, or intriguing as attribution, but it's ultimately rather more important. I didn't really intend it this way, but framing attribution as a problem around lift also makes attribution and personalization more closely related. Every marketing touch is a kind of personalization – singling out a distinct pathway to conversion that may generate incremental lift. Personalization is the set of techniques you use to tune a specific touch to generate maximum lift. So personalization and attribution aren't like "hot" and "cold" – the same concept along a subjective scale. They are more like wealth (measuring how much you have and how you got it) and investing (the techniques for maximizing what you have). Not surprisingly, as with wealth and investing, it tends to be the heaviest-spending (richest) enterprises that focus their attention on attribution (wealth). The less wealthy are too busy trying to make things better (investing) to worry so much about assigning credit and blame!

Having said that, I don’t want to undersell attribution. While I’m deeply skeptical about the general way in which it’s executed, understanding true campaign performance is an essential part of digital analytics. Still, personalization is more important. More important to the business and the business model since, as I’ve been saying over and over in this series, personalization is the heart of digital. As a direct channel, digital rewards personalization in almost every form and the sites and companies that have created distinct competitive advantage using digital have almost always found unique new ways to deliver a highly personalized experience.

So why isn’t personalization more common?

Both personalization and attribution must be operationalized to be effective. There's not a lot of point in building an attribution model unless you can maintain it and use it to optimize campaign spend. Still, people do exactly that all the time. And while it's no picnic to operationalize attribution, it's fairly easy to get a handle on. Personalization is much more demanding. There's no point whatsoever to doing personalization analytics without the ability to actually personalize touchpoints, and the requirements to operationalize personalization can be numerous and complex.

The absolute (and obvious) necessity of operationalizing personalization coupled with the daunting technical challenges that are often involved form a significant barrier to delivering digital personalization.

So in this post, I wanted to outline what it takes to deliver on digital personalization. In my previous posts, I surveyed the type of analytics required in considerable detail. That’s the fun part. The operational part is the challenge.

At the highest level, here are the core components you need to deliver personalization:

Analytics Warehouse (Exploratory Modeling Environment)

If you go back and review all the different personalization tactics, you'll find some that are almost analytics-free. So yes, it is possible to deliver personalization without deep analytics and the warehouse that can drive them. But if you take that route, you'll be limiting – and probably crippling – the underlying capability in the program. It doesn't take much thought to see why this is almost always true. If you build personalization rules by hand, you're certain to limit the number of specific cases you address. That, in turn, forces you to focus on cases where the segment sizes are very large so that you have significant impact. This, in turn, ensures that the personalization is shallow.

Personalization at scale is a modeling exercise that requires significant computing power. It also requires a wide range of integrated data sources. Put those two things together, and you're talking about an analytics warehouse.

What does that analytics warehouse need to contain? Typically, you’ll integrate the digital data that tells you what customers and prospects are looking at and thinking about right now with sufficient traditional customer and segmentation data to tell you who they are and what relationship they have with you. Enough data, in other words, to support a rich, two-tiered segmentation. That isn’t a coincidence. It’s not that personalization techniques are identical with two-tiered segmentation. They aren’t. Our digital segmentation is just one type of personalization model. But the data necessary to support most other types of personalization is essentially the same.
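As a hedged illustration of that data integration, here is a minimal Python sketch of merging "what they're doing right now" (digital behavior) with "who they are" (traditional customer data) into a two-tiered record. All field names, IDs, and segment labels are invented for the example, not taken from any specific system:

```python
# Hypothetical sketch: joining current digital behavior with traditional
# customer data to support a two-tiered segmentation. Every field name
# and segment label here is illustrative.

digital_sessions = [
    {"customer_id": "C1", "pages_viewed": 12, "visit_intent": "research"},
    {"customer_id": "C2", "pages_viewed": 2,  "visit_intent": "support"},
]

customer_master = {
    "C1": {"tenure_years": 5, "value_tier": "high"},
    "C2": {"tenure_years": 1, "value_tier": "low"},
}

def build_two_tier_record(session, master):
    """Merge 'what the visit is about' with 'who the customer is'."""
    profile = master.get(session["customer_id"], {})
    return {
        **session,
        # Tier 1: the relationship segment (who they are)
        "value_tier": profile.get("value_tier", "unknown"),
        # Tier 2: the behavioral segment (what this visit is about)
        "visit_segment": session["visit_intent"],
    }

records = [build_two_tier_record(s, customer_master) for s in digital_sessions]
```

The point of the sketch is only the shape of the join: behavioral data keyed into customer data, producing records rich enough to segment on both dimensions at once.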


Production Modeling Environment

If you’re building significant personalization into your business (and you should be), then you’ll have to supplement the analytics warehouse with a production system that provides more rigorous control over the environment in which the models are actually run. Think about it this way: the analytics warehouse is a kind of playground for your analysts and data scientists. They are going to be running large-scale data exploration jobs. The sort of jobs that can kill almost any system. When you start to operationalize personalization systems, you’ll be running models on a regular cadence. Those jobs will HAVE to run, since they are supporting operational systems. Now you’ve got a conflict between the core purpose of the analytics warehouse and the operational necessities that drive the business. Guess which wins? Over time, this can create a warehouse where the analysts simply don’t have the flexibility they need to analyze data. I’ve worked on some traditional relational database warehouses where, because of operational constraints, the queries were so locked down that it was impossible to do even basic data exploration.

It isn't just a matter of competing processor time, either. The analytics warehouse will, by design, keep data at a fairly low level of structure. But as you discover key aggregations that work (such as a two-tiered segmentation), it won't make sense to re-impose that structure on the raw data every single time you want to apply it. One of the biggest mistakes I see companies make in their Hadoop systems is to insist that data live only at the lowest level of detail. This is a conceptual error in almost every respect and a nearly fatal misunderstanding. A key goal when you do analysis on these systems is to find mid-level data structures worth retaining. While this is true even within the confines of analytics exploration, it also has broader implications for creating a production environment. Some mid-level structures will need to be maintained in essentially static format, and having a stable production environment makes that far easier.
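To make the "mid-level structure" idea concrete, here is a small sketch, with an assumed event schema, of rolling raw event-level data up into a per-customer, per-week engagement table. The rollup, not the raw detail, is what would be persisted in stable form in the production environment:

```python
# Illustrative sketch of a mid-level structure: aggregating raw events
# into a (customer, week) rollup worth retaining, rather than re-deriving
# it from lowest-level detail on every run. The schema is assumed.
from collections import defaultdict

raw_events = [
    {"customer_id": "C1", "week": "2015-W01", "event": "page_view"},
    {"customer_id": "C1", "week": "2015-W01", "event": "search"},
    {"customer_id": "C2", "week": "2015-W01", "event": "page_view"},
]

def build_weekly_engagement(events):
    """Count events per (customer, week) to form a reusable rollup."""
    rollup = defaultdict(int)
    for e in events:
        rollup[(e["customer_id"], e["week"])] += 1
    return dict(rollup)

weekly = build_weekly_engagement(raw_events)
```

In a real system the aggregation would be far richer (recency, frequency, event mix), but the design choice is the same: compute it once, keep it in essentially static form, and let downstream models consume the rollup.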

I know that it's a huge temptation to blend the analytics warehouse with the production system. Sometimes you won't have a choice. But if your enterprise is getting serious about personalization, you're eventually going to have to split these two functions and create a system that provides a stable, protected environment for executing the models you've discovered.

Customer Profile Repository

An analytics warehouse is where you build the models and the production modeling environment is where you operationalize them. But there's really a third piece to the puzzle, which is where you deposit every model result and customer fact that you need to support personalization decisions. Might this be the same as the Production Modeling Environment? Yes, but there are strong reasons why it might be something completely different. The production modeling environment will include huge amounts of data. Leaving that data even potentially accessible to the outside world is risky. What's more, the production modeling environment is a place where you're still executing large-scale, complex analysis jobs. Those large-scale, high-intensity jobs make it a poor system to support data exchange across the enterprise. There are also architectural reasons why the Customer Profile Repository may need to be separate. The type of system that can provide easy, transactional lookups for customer profile data points and batch jobs isn't necessarily the type of system best for instantiating large-scale statistical models.

There are competing technologies for creating an analytics warehouse, but they are all, ultimately, the same thing. It’s not quite the same story when we come to the Customer Profile Repository because there are fundamentally different types of systems competing to house this information. The DMP, the Customer Data Warehouse, the CMS, CEPs, and custom personalization engines all provide a potential resting place for customer profile data. It isn’t an easy choice. The system(s) you pick are likely to be determined by the number of different personalization strategies you’re pursuing and the channels you’re striving to personalize. Housing your customer profile repository in your CMS may be fine for driving Website personalization (it may also be insufficient even there depending on the type of personalization you’re delivering) but it’s unhelpful for driving email or call-center personalization.

The ideal customer profile repository would provide very fast access via Web service to any application that needs customer profile data. As with many IT problems, however, the ideal solution is not always the right solution.
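A minimal sketch of the repository's contract, assuming a simple key-value design: deposit model results and customer facts, then serve fast transactional lookups. In a real deployment this would sit behind a Web service endpoint; the class and field names here are hypothetical.

```python
# Hedged sketch of a Customer Profile Repository: a key-value store of
# model scores and customer facts with fast lookups. A production
# version would be a service; this illustrates only the interface.

class ProfileRepository:
    def __init__(self):
        self._profiles = {}

    def put(self, customer_id, profile):
        """Deposit model results and customer facts for one customer."""
        self._profiles[customer_id] = dict(profile)

    def get(self, customer_id):
        """Fast transactional lookup for any consuming channel;
        unknown customers get an empty profile rather than an error."""
        return self._profiles.get(customer_id, {})

repo = ProfileRepository()
repo.put("C1", {"churn_score": 0.82, "value_tier": "high"})
profile = repo.get("C1")
```

The important design point is what's absent: no heavy analysis runs here, so website, email, and call-center systems can all query it without contending with modeling jobs.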


Personalization Engine

A personalization engine drives the decision-making around personalization within a channel or campaign. Think about it this way: in the analytics warehouse we’ve built segmentation models, propensity scores, decision-trees, etc. But all of these, however fancy, are just classifications of customers. To get personal, you need to match up those classifications with offers or creative. You need to decide that customers with a high score in dimension X get campaign Y (or vice versa). That’s the job of the personalization engine.
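That matching job can be sketched very simply: an ordered set of rules that map a customer's classification onto a campaign. The thresholds, segment names, and campaign names below are invented for illustration; a real engine would manage hundreds of such rules with priorities and testing built in.

```python
# Minimal sketch of a personalization engine's core decision: mapping
# model classifications (scores, segments) onto offers or creative.
# All thresholds, fields, and campaign names are hypothetical.

rules = [
    # (predicate on the customer profile, campaign to serve) - first match wins
    (lambda p: p.get("churn_score", 0) > 0.7, "retention_offer"),
    (lambda p: p.get("value_tier") == "high", "loyalty_upsell"),
]

DEFAULT_CAMPAIGN = "generic_home_page"

def decide(profile):
    """Return the campaign matched to this customer's classification."""
    for predicate, campaign in rules:
        if predicate(profile):
            return campaign
    return DEFAULT_CAMPAIGN

chosen = decide({"churn_score": 0.82, "value_tier": "high"})
```

Note that the rules consume model output (a churn score, a value tier) rather than raw behavior: the warehouse classifies, the engine decides.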

As with the Customer Profile Repository, there are a LOT of different ways to create a personalization engine. Every CMS is, to a greater or lesser extent, a personalization engine. A/B and MVT solutions are all, again to a greater or lesser extent, personalization engines. Complex Event Processors (CEPs) are often created as personalization engines. There are personalization engines embedded or available inside many existing channel solutions – traditional mail, email, call-center and Web alike. And, of course, there are black box solutions that provide some level of integration across modeling, production, repository and personalization engine.

In my next post, I plan to dive deeper into this spectrum of personalization engines and discuss the range of options: from simple rule-based approaches inside Web testing tools, to platforms that allow you to integrate custom models and scoring, to black-box solutions. I'm of the opinion that for most enterprises that are serious about personalization, there actually is one right place to be on that spectrum and I'll explain what that answer is and why other approaches are generally inadequate.


Delivery Engine

Depending on your choice of personalization engine, it may or may not come tightly coupled with a delivery engine. The delivery engine is what delivers the experience to the customer. For a website, the delivery engine might be the CMS, it might be a testing tool, or it might be a pure personalization engine. In many cases, the delivery system will be fixed. You may, for example, have an email system with a pre-existing method of delivering alternative versions. You'll need to move your personalization strategies into the existing delivery engine. Sometimes, the personalization engine may sit on top of an existing platform but provide its own built-in delivery engine (testing tools are a good example of this, as are black-box solutions).

If you aren’t working inside an integrated platform, pulling off the integration between your models, your repository, your personalization engine and your delivery engine(s) is likely to be as much work as all the rest combined.

Part of the reason that integration can be such a challenge is that the most effective personalization often requires deep integration with your operational systems. For industries like travel, hospitality, social networking and ecommerce, personalization of the operational components of the touch – things like search results, checkouts, and product configurations – will be far more important and impactful than personalization of the home page or other relatively static creative elements. Particularly in mature industries, the operational systems that drive those components are often the realm of big iron; untouchable systems with millions of lines of code that almost nobody understands and everybody is afraid to touch. When that's the case, you have to find creative ways to interface with these systems, jumping through hoops so that they don't have to change the limited interfaces they happen to have.



Not every channel is created equal when it comes to personalization. It’s much easier to personalize channels like email or direct mail where you have, from an IT perspective, plenty of time to model, decision and deliver personalization. For channels like the Web, personalization is just plain harder. This is particularly true when your personalization strategies require you to integrate current behavior with either past behavior or customer data. There’s a fairly straightforward hierarchy here:

  • Easiest: Personalization without real-time
  • Moderate: Real-time personalization that doesn’t require on-the-fly modeling
  • Difficult: Real-time personalization with on-the-fly modeling

As I've already mentioned, email is a good example of the first. Personalizing the home page for return visitors is a good example of the second. Changing search results based on what the visitor has just searched for AND who the customer is, is an example of the third.

If you’re trying to support this last, most difficult scenario, it will likely have a big impact on the decisions you make around technology stack. The analytics warehousing components are largely unaffected, but every other component in the personalization stack may need to be re-tooled, extended or supplemented. The personalization engine is likely to be a CEP. The repository will likely have to be in-memory; often requiring a slimmed down version of a full customer record. The delivery engine will have to be very tightly integrated to achieve the requisite sub-second response time. It’s hard and it’s certainly going to be expensive.
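To illustrate what a slimmed, in-memory profile buys you in that hardest scenario, here is a toy sketch of re-ranking search results using both the current session signal and a few hot fields pared down from the full customer record. The fields, weights, and catalog are all invented; a real system would be a CEP or similar engine working under sub-second constraints:

```python
# Toy sketch of real-time decisioning against a slimmed in-memory
# profile: combine what the visitor just searched for with who they
# are. All fields, weights, and items are illustrative assumptions.

slim_profiles = {
    # A few hot fields kept in memory, not the full customer record
    "C1": {"value_tier": "high", "last_search_category": "laptops"},
}

def rerank_results(customer_id, query_category, results):
    """Boost items matching the live query, then the stored profile."""
    profile = slim_profiles.get(customer_id, {})

    def score(item):
        s = 0
        if item["category"] == query_category:
            s += 2  # current behavior dominates the ranking
        if item["category"] == profile.get("last_search_category"):
            s += 1  # the stored profile acts as a tiebreaker
        return s

    return sorted(results, key=score, reverse=True)

results = [{"sku": "A", "category": "phones"},
           {"sku": "B", "category": "laptops"}]
ranked = rerank_results("C1", "laptops", results)
```

The hard part in production isn't the scoring logic, which is trivial here, but keeping the profile lookup and re-rank inside the page's response-time budget, which is what forces the in-memory, slimmed-down design.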

Is it worth it?

I think it is. If that’s where your business really lives, personalization is the only path in the digital realm that provides true competitive advantage.

Republished with author's permission from original post.

Gary Angel
Gary is the CEO of Digital Mortar. DM is the leading platform for in-store customer journey analytics. It provides near real-time reporting and analysis of how stores performed including full in-store funnel analysis, segmented customer journey analysis, staff evaluation and optimization, and compliance reporting. Prior to founding Digital Mortar, Gary led Ernst & Young's Digital Analytics practice. His previous company, Semphonic, was acquired by EY in 2013.

