There’s been a lot of talk about the future of personalization in email and, while there are many approaches to personalization, one thing is for sure: you need data. The trouble with data, however, is that it’s hard. It’s messy. It’s incomplete. Before you go any further with your personalized email strategy, there are a few things you should understand about your data.
Most Data Gives an Incomplete Picture
Is your business the number one place where your shoppers spend their money? Unless you’re Amazon, probably not. Put yourself in a consumer mindset and answer this question – how much do you spend each month with your favorite retailer? $50? $100? $500? Chances are, it’s only a fraction of your disposable income. How much do you spend with your second favorite retailer? Your third? By the time you get to your tenth favorite, it’s likely a place you shop only a few times a year. So how on earth is this retailer – or any retailer – supposed to create a picture of you as a shopper based on such limited transactions? No retailer is ever exposed to more than a slice of the total picture.
Data Models are Sensitive
If you don’t see a customer very often, you have to use models to predict their evolving tastes, moods and behaviors. These models can be quite sensitive, which is why platforms like IBM Watson will never be ubiquitous. If you think of IBM Global Services as a contractor, Watson is like a hammer in the toolkit. Yes, you need the hammer to build the house, but you can’t confuse it with a complete solution. Retail data models are so sensitive that 90 percent of the money you’d spend on Watson would go to professional services tuning it to your specific case.
Think about it: if a model can predict consumer shopping trends and behavior at LL Bean, will that same model be able to predict behavior at another (very different) retailer like Staples? Probably not. Data models are extremely sensitive and are tuned to the particular customer environment. The solution is to parameterize the model, much as you’d adjust a sound mixing board for each song, because it’s very unlikely that one setting will fit every retailer’s data.
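One way to picture the “mixing board” idea is a small set of per-retailer knobs that the same model code reads at runtime. This is a minimal sketch; the parameter names and values below are invented for illustration, not drawn from any specific product.

```python
from dataclasses import dataclass

@dataclass
class RetailerModelParams:
    purchase_halflife_days: float    # how fast old purchases stop mattering
    price_sensitivity_weight: float  # how much price drives this audience
    outlier_revenue_cap: float       # ignore single orders above this value
    min_orders_to_score: int         # don't score near-anonymous shoppers

# The same model, "mixed" very differently for two hypothetical retailers:
ll_bean = RetailerModelParams(365.0, 0.3, 5_000.0, 2)    # seasonal, loyal base
staples = RetailerModelParams(90.0, 0.8, 20_000.0, 1)    # frequent, price-driven

def recency_weight(days_ago: float, params: RetailerModelParams) -> float:
    """Exponential decay: a purchase loses half its weight every half-life."""
    return 0.5 ** (days_ago / params.purchase_halflife_days)
```

With this shape, tuning a new retailer means turning knobs, not rewriting the model – a year-old purchase still carries half its weight at LL Bean but almost none at Staples.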
Data is Messy
Much like data models, data signals are also very sensitive. There is so much noise and so many human errors within a system that it’s hard to get a ‘clean’ view of the data. A great example of how data can really mess up customer profiles is the fact that, during the holiday season, Neiman Marcus sold a $1.5 million rose gold private plane on its website as part of its fantasy gift collection. Let’s pretend Neiman Marcus is doing an A/B test and, in one group, someone buys a plane. That’s a huge revenue skew that would make it look like test group A had a 1,000 percent lift over the control population. This type of scenario happens fairly frequently. When Mariah Carey shows up at Neiman Marcus and spends a million dollars – is that because she got an email from the store? Probably not, but who knows. Retailers need to understand how to control for this noise and craziness when it happens.
Another scenario involves retailers whose customers buy up large amounts of inventory only to resell it elsewhere. That’s a funny-looking customer in the data – this person shops the same luxury retailer every day and buys across all product categories and all sizes. It’s hard to make sense of this customer’s profile – do they really wear both a size XL and a small?
These factors make the data messy and hard to control, and they can drown your signal if you’re not careful. When training the model, you need to filter out the useless junk that won’t help you predict the future – especially if you want near-term results from per-customer predictions. Otherwise, Neiman Marcus would be trying to sell every customer an airplane.
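The filtering step above can be sketched as a pair of pre-training checks: cap freak orders, and flag reseller-shaped accounts. The thresholds and the order format here are invented for illustration; real pipelines would tune them per retailer.

```python
ORDER_CAP = 50_000       # drop single orders above this (the "airplane" filter)
RESELLER_SIZE_SPAN = 4   # buying 4+ distinct sizes looks like a reseller

def clean_orders(orders):
    """Remove extreme orders before they reach the training set."""
    return [o for o in orders if o["amount"] <= ORDER_CAP]

def looks_like_reseller(customer_orders):
    """A shopper who buys across many sizes probably isn't one person."""
    sizes = {o.get("size") for o in customer_orders if o.get("size")}
    return len(sizes) >= RESELLER_SIZE_SPAN

orders = [
    {"amount": 89, "size": "M"},
    {"amount": 1_500_000, "size": None},  # the rose gold plane
    {"amount": 120, "size": "S"},
]

training_set = clean_orders(orders)  # the plane never reaches the model
```

Excluded customers can still get email – they just shouldn’t steer what the model predicts for everyone else.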