Questions I have always wanted to ask – About Predictive Analytics



A couple of weeks ago I was at the eMetrics Marketing Optimization Summit in Toronto and took part in a panel on Predictive Analytics. After the panel, quite a few people in the audience asked me how to get started with predictive analytics, which tool to use, where it is useful, and a bunch of other questions. I thought I would write a post based on those conversations, both to share my views on the topic and to serve as a reference for those starting in this field. This is not a technical or statistical write-up; rather, it is a general overview of the topic and some of the benefits that can be derived from it.

Q1. What is predictive analytics? Why should I use predictive analytics?

While one can get a number of variations on the definition of predictive analytics, from a digital marketing or business perspective it comes down to one key component – customer behavior. In simple terms, predictive analytics is a set of methodologies and processes that assist us in anticipating customer behavior. The customer behavior of interest could be anything – spend, buying habits, page views, response to a certain trigger or something else.

There are many reasons why one would want to get into the predictive analytics world. First and foremost, predictive analytics helps us filter out the noise and create triggers that we can then act on. Second, and this is especially true in the world of web analytics and multi-channel analytics, the various data elements form complex patterns, and it becomes almost impossible to create simple rule-based decision support systems. And lastly, predictive analytics allows you to allocate your investments wisely.

Q2. What are the key factors to consider while doing predictive analytics?

To create a good predictive model a few things have to be thought through before the modeling can start.

Clear objective(s): An ambiguous statement of the business problem can have a deeply detrimental impact on your efforts to create a good predictive model. Therefore, you should spend some time thinking about what you would like to predict, that is, the target you want to model.

Two pieces of advice when thinking about using predictive modeling. One, you should have a good reason for wanting to create a predictive model for a certain behavior. Ask yourself this question – What am I going to do with the results from the predictive model? If the answer is ‘not actionable’ then it might not be worth the effort to go into predictive modeling.

The other factor to be aware of is the cost of a wrong decision. All predictive models have a degree of error associated with them. So if you were to make a decision that resulted in an unwanted outcome, how would it impact your company or the people associated with the outcome?

Data: Having the right data is one of the absolute keys to creating a model that delivers value to your business. Most tools still operate under the GIGO (garbage in, garbage out) principle. So if your inputs to the model are irrelevant or not properly coded, you will still have a model but one that is going to give you a lot of false positives and negatives (see cost of wrong decision above). Things to keep in mind regarding data:

  • Inputs – for the behavior you are trying to model, think about and list out the various possible predictors – transactional, firmographic, demographic, pricing, web data etc.

  • Accessibility – Do you have readily available data sources that can provide you with the information you need to create the model? If not, do you have alternative sources which can be used to create proxy variables?

  • Properties – It is really critical to understand the nuances of the data you are working with so that it can be treated accordingly. A preliminary analysis should be done to ascertain the amount of skewness, the percent missing, collinearity, how dense or sparse a variable is, and whether the values differ by orders of magnitude.

  • Curse of dimensionality – Having too many variables and extraneous inputs can often hurt your model by injecting noise into your predictions. So it is worth thinking about whether you need a ‘feature selection’ step in your model to reduce the dimensions you will finally work with.

  • 3rd party data – This is a bonus. If you have access to external information – competitive, industry, market, macroeconomic or demographic – it is worth considering whether you can incorporate it into your model and, if so, whether it will add value in terms of developing a better prediction.
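The preliminary checks listed under 'Properties' can be sketched in a few lines. This is only an illustrative sketch using the Python standard library: the `spend` and `orders` columns are made-up values, missing entries are assumed to be coded as `None`, and the helper names are hypothetical rather than part of any particular tool.

```python
import math

def percent_missing(values):
    """Share of entries that are None, as a percentage."""
    return 100.0 * sum(v is None for v in values) / len(values)

def skewness(values):
    """Population skewness (third standardized moment) of the non-missing values."""
    xs = [v for v in values if v is not None]
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return 0.0
    return (sum((x - mean) ** 3 for x in xs) / n) / var ** 1.5

def pearson(xs, ys):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)

# Hypothetical inputs: monthly spend and number of orders per customer.
spend = [20.0, 25.0, None, 30.0, 400.0, 22.0]
orders = [2, 3, 1, 3, 40, 2]

print("percent missing (spend):", round(percent_missing(spend), 1))
print("skewness (spend):", round(skewness(spend), 2))
print("correlation spend vs. orders:", round(pearson(spend, orders), 3))
```

A strongly skewed variable or a pair of highly correlated inputs, as in this toy data, is a hint that a transformation or a feature selection step may be worth considering before modeling.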

Methodology: Once you know what you want to predict and what kind of data you have available, you can start thinking about the methodology – what statistical method do you need to use to get to what you want? There is a plethora of methodologies to choose from, ranging from the more common ones such as regression, association analysis, decision trees, and clustering to more complex modeling techniques such as forecasting, linkage analysis, and survival modeling. What is appropriate depends on a host of factors, including what you are trying to predict, the properties of your data, and how you plan to use the results.

Tools: As is the case with methodologies, there are a lot of tools available in the market that can do predictive analytics to varying degrees. Which tool is right for you depends on a host of considerations, including how much money you have to spend on a tool, your in-house analytical capabilities, and other infrastructural and contextual conditions. I do want to leave you with a couple of points when it comes to tools:

  • Complexity not always better & costly not always better: To create a good predictive model, one does not always have to use a neural network or genetic algorithm; nor is it necessary to invest in a tool that costs an arm and a leg. This is why it is critical to be clear about your objectives and know exactly what data you have. Sometimes, a simple regression can get you the answer you are looking for and you can do it using Excel.

  • Data visualization extremely important: I cannot stress this enough. Proper visualization is critical for more than one reason. It helps others quickly grasp the fundamentals of what you are trying to convey – it is easier to ‘read’ a graph than a page full of numbers. It makes your work an easier sell. And in general, use of good visualization leads to better decisions as folks understand the nuances of the relationships better.
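To make the 'simple regression' point concrete, here is a minimal sketch of a one-variable least-squares fit written with nothing but the Python standard library. The `emails` and `orders` figures are invented for illustration, and `fit_line` and `r_squared` are hypothetical helper names; a spreadsheet trendline would give the same kind of fit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: emails sent to a customer per month vs. orders placed.
emails = [1, 2, 3, 4, 5]
orders = [11, 15, 15, 19, 20]

intercept, slope = fit_line(emails, orders)
print("intercept:", round(intercept, 2), "slope:", round(slope, 2))
print("R^2:", round(r_squared(emails, orders, intercept, slope), 3))

# Apply the fitted line to a new, unseen input.
print("predicted orders at 6 emails:", round(intercept + slope * 6, 1))
```

If a fit this simple answers the business question, there is little reason to reach for a more elaborate technique.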

Putting it into action: Once the model is created, a few things need to happen to really get the full benefit out of it. First, the model and its benefits should be socialized with all the key stakeholders. In addition, the key results should be made accessible at all the decision points so that they can be incorporated into the strategy and planning stage. The model should then be deployed so that you can begin using it by applying it to new data and generating predictions. And lastly, a refresh process should be set up, as even the best models become useless with time because of changes in the ecosystem and the variables involved.

People: While one does not need to have a Ph.D in statistics to do predictive modeling, it is critical and indeed beneficial to have someone who is analytically proficient and has base statistical knowledge. Misinterpreting the results from a good model is as bad as creating a bad model.

Q3. What are the ways I can leverage predictive analytics for my business?

While you can apply predictive modeling to a variety of scenarios, here are some of the more common applications from a business & marketing point of view.

Response modeling: Response models are very useful in identifying customers likely to respond to a particular offer. With the help of response models one can save money by targeting fewer people while getting a better than average response rate. Normally, these types of models are built on top of existing customer data within the enterprise.
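The idea of targeting fewer people at a better than average response rate can be checked with a small calculation. The sketch below assumes you already have a model score and a known past response (1 or 0) for each customer; the numbers are invented and `top_fraction_lift` is a hypothetical helper, not a function from any specific tool.

```python
def top_fraction_lift(scores, responded, fraction):
    """Compare the response rate of the top-scored customers with the overall rate.

    Returns (top_rate, overall_rate, lift), where lift = top_rate / overall_rate.
    """
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))
    top = ranked[:cutoff]
    top_rate = sum(r for _, r in top) / len(top)
    overall_rate = sum(responded) / len(responded)
    return top_rate, overall_rate, top_rate / overall_rate

# Hypothetical model scores and observed responses for ten customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

top_rate, overall_rate, lift = top_fraction_lift(scores, responded, 0.3)
print("top 30% response rate:", round(top_rate, 3))
print("overall response rate:", round(overall_rate, 3))
print("lift:", round(lift, 2))
```

A lift above 1 means that mailing only the top-scored group would have produced a higher response rate than mailing everyone.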

Up-sell and cross-sell modeling: Predictive models can be built to detect the association between customer interactions and purchases, which in turn can be used to identify cross-sell and up-sell opportunities – the likelihood of a customer needing additional products (an excellent example here is Amazon and how they ‘suggest’ related products/books based on your purchase).

Risk assessment modeling: Risk modeling is very useful in quantifying uncertainties or likelihood of events that may adversely affect a business. While risk taking is necessary to an extent in order to stay competitive and profitable, predictive models can help Management in assessing which risks to take and in supporting informed decisions.

Attrition modeling: Attrition models (also called churn or retention models) can assist a business in identifying the customers most likely to churn; in other words, they estimate the probability of a customer switching to another provider. Knowing this, the company can focus its retention efforts on the specific customers with a high likelihood of leaving.

Segmentation: Segmentation is a powerful way to identify product purchasing trends and spending patterns, and to uncover pain points related to a specific behavior. The advantage of using predictive techniques like clustering for creating segments (as opposed to rule-based segments) is that the result is objective and reveals natural patterns or groups within the customer data, and so is much more insightful.
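As a rough illustration of clustering-based segmentation, here is a bare-bones k-means sketch in plain Python. It is an assumption-laden toy: the two features per customer are invented and presumed to be on comparable scales, and the starting centroids are simply the first k points, which a real tool would choose more carefully.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers; returns (centroids, labels)."""
    centroids = [tuple(p) for p in points[:k]]  # simplistic start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (squared distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, labels

# Hypothetical customers described by two already-scaled features,
# e.g. (purchase frequency, average basket size).
customers = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8.5, 9.5)]

centroids, labels = kmeans(customers, 2)
print("segment per customer:", labels)
print("segment centers:", centroids)
```

The segments here come out of the data itself rather than from a hand-written rule such as "spend above some threshold", which is the contrast drawn above.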

Q4. Okay, I get it. Where should I start? Also, I am specifically interested in using predictive analytics on web data – can you provide some pointers in that regard?

If you are just starting out with the use of predictive analytics in your organization, it is best to pursue the lowest hanging fruit first. Depending on where your organization is in its analytical maturity and the key priorities for your business, you might start out with a simple response model or a model to create an email targeting list. Develop the model, socialize and deploy it, and show Management the benefits of using the model – either in relation to historic ROIs and response rates or in relation to a control group created for this purpose.
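The control group comparison can be as simple as the sketch below, which compares the response rate of a model-selected group against a control group and adds a standard two-proportion z statistic as a rough check that the gap is not just noise. All counts are invented for illustration, and the function name is hypothetical.

```python
import math

def two_proportion_z(resp_a, n_a, resp_b, n_b):
    """z statistic for the difference between two response rates (pooled standard error)."""
    p_a = resp_a / n_a
    p_b = resp_b / n_b
    pooled = (resp_a + resp_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical campaign results.
model_contacted, model_responded = 5000, 250      # list chosen by the model
control_contacted, control_responded = 5000, 150  # control group

model_rate = model_responded / model_contacted
control_rate = control_responded / control_contacted
lift = model_rate / control_rate
z = two_proportion_z(model_responded, model_contacted,
                     control_responded, control_contacted)

print("model response rate:", round(model_rate, 3))
print("control response rate:", round(control_rate, 3))
print("lift over control:", round(lift, 2))
print("z statistic:", round(z, 2))
```

As a common rule of thumb, a z statistic well above 2 suggests the difference is unlikely to be chance alone, which makes the case to Management easier to defend.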

Specific to web data, predictive models can help you on a variety of fronts. They can help you increase your conversions and page views, drive cross-channel behavior (store to online and/or vice versa), and handle bid management and cost management for your display ads. When properly combined with offline media data and transactional information, they can also provide you with a far better attribution model for your sales.

Ok now your turn. Did this overview help you? If you are already using predictive analytics in your organization, do you agree with what I said? Disagree? What else would you add in terms of giving advice to someone just starting out in the predictive modeling area?

Please share your feedback via comments. Thanks.

Note: The views expressed herein are my own and do not necessarily represent the views of any organizations that I might be affiliated with.

Ned Kumar
NK Consulting
Ned Kumar has over fifteen years of experience in customer analytics and strategy, with expertise in both online and offline channels. He currently serves as a Strategist and as a corporate advisor for search optimization, interactive marketing and multi-channel analytics. His current interests and engagements include collaborative thinking, social networks, social crm, and innovation.


  1. Ned,

    I thoroughly enjoyed your take on predictive analytics, and was especially overjoyed to see you focus on data. You really nailed this one.

    I spent 10+ years developing and testing custom models mostly for electric utilities, and I was often amazed at how a new creative/messaging test or segmentation approach would often lead to new models. Eventually, we ended up with several, and sometimes dozens of models, all for the same product or service.

    The various models would be applied to a given segment or population to not only define goodness-of-fit and risk, but to ascertain which campaign or positioning statement would be most likely to resonate with a customer or prospect. In other words, the model(s) not only predicted response, revenue, lifetime value and risk, but also provided guidance on the “mix” of campaigns and messaging that should be used.

    In some cases the customer’s behavior was so radically altered by different approaches that one model would score them in the bottom decile, while another – using a different approach – would score them in the top decile.

    It often took several campaigns to obtain this type of functionality and maturation, and the real challenge was getting new clients to embrace and stick with the program. Once they saw the results there was literally no turning back, but it’s often difficult to look beyond this fiscal year’s returns, especially if it means investing a significant amount of time and money along the way.

    Thanks for the article, Ned. Really struck a chord with some of my past pleasures and experiences.

    Jeramy Fishel
    Sr. Marketing Research, Thought Leadership

  2. Jeramy,
    Thanks for your wonderful comment. I am glad you enjoyed the post.

    The approach you mention using multiple models is a good one — especially if you are doing customer modeling. There are so many variables (and here I don’t mean fields in a database but factors that are in a state of flux and can change) in predictive modeling that it is often a good idea to use some sort of triangulation to have a better grasp of where a customer should belong.

    At the end of the day, I keep saying to folks that this is not a one-time effort – as you mention, there has to be a commitment to continue on this path for a while if one is to really get an ROI from analytics & predictive modeling.


