Finding the Right Problem to Tackle: When Web Analytics Technologies Chase Problems

Though I was probably a bit too nervous to really appreciate it, the WAA lined up a pretty cool speaker for the Awards Gala this past Tuesday – Ryan Zander of Sportvision. If you are at all into sports, you certainly know of Sportvision’s work (the yellow first down line, the digitally mapped strike zone and pitch trajectory, the glowing hockey puck) even if, like me, you’ve never heard of the company. But while computer graphics are certainly fun and interesting, they have nothing really to do with analytics. Fortunately, that isn’t all Sportvision does. They’ve used the data collected by their graphics and tracking systems to build an intriguing analytics practice.

Ryan presented a number of outstanding visualizations of the data they collect. One striking example was a heat-map (think Clicktale) of the pitch locations that sure Hall-of-Fame Yankee closer Mariano Rivera achieves in different counts. With no strikes, he paints the inside corner with high precision. With one strike, he tends to come even further inside. With two strikes, his hot-zone is just off the outside part of the plate. Beautiful. Not only is the visualization striking, its uses are clear. We may not have needed visual proof of Rivera’s deadly efficiency (Yankee haters like me have known that for years), but it’s obviously a way both to evaluate pitchers (if you’re a Manager or GM) and to plan against them if you’re a hitter.

Perhaps even more compelling as an exercise in data analysis was an examination of Oakland A’s pitcher Brandon McCarthy’s release point. By tracking the release point of every pitch, Sportvision was able to show that when McCarthy lowers his release point, he gets more ground balls and becomes a more effective pitcher. As a tool for evaluating the mysterious workings of pitcher mechanics (and the surprisingly large difference small changes can make), it’s pretty amazing.

It was Sportvision’s latest and most ambitious effort that really got me thinking (which is surely the goal of any really good presentation). They’ve created a system that tracks every player on every play (digitizing player and ball movements). This system creates a true digital map of the action, allowing Sportvision to make the first non-subjective measurements of fielding efficiency. This system is complex. It requires a network of trackers, sophisticated tracking software, and complex big-data analytics. I’m sure it’s very expensive. I’m also fairly confident that it could give a good answer to the basic question: “How well does Player X field his position?”

What I’m much less sure of is whether that question is worth the effort of answering. There are folks at Semphonic who know (and care) far more about baseball than I do. But as a fairly casual fan, I have some doubts about whether fielding efficiency is really that interesting. My gut instinct is that the difference between a good and an average fielder at the Major League level is modest (baseball folks love to think about value as the incremental difference between a player and his likely replacement – Wins Above Replacement, or WAR – and I think it’s a pretty reasonable approach). What’s more, I suspect that pitching and hitting each dwarf fielding in this regard. Further, while hitting and pitching tendencies are inherently valuable in shaping immediate tactics, fielding is much less so. Yes, players are positioned, but the basic mechanics of player positioning are fairly well understood, and most players have a deep sense of, at least, their own range. Nor is it possible for a batter to take conscious advantage of fielder limitations in the vast majority of circumstances.

So if I were a GM, there are a number of data analysis projects that I think would be far more important than measuring fielding efficiency. I’d probably rather have a model to optimize farm system progression or predict deterioration curves for aging veterans. I’m willing to bet that with a combination of lifestyle, demographic, mechanical, and psychographic data, I could build a pretty good model of age deterioration that would significantly out-perform most GMs’ mental math. Double that for farm system progression, where I suspect many organizations are markedly inefficient. An analysis that tackled either of these issues would, I’m guessing, be far, far more impactful for the organization than fielding efficiency. These problems might not yield sexy visualizations, but they would yield true competitive advantage.

So it seemed to me that maybe Sportvision chased a pretty small problem with a very large technology solution. Or maybe not. Maybe fielding efficiency is quite a bit more important than a tyro like me believes. And they may well be pursuing analytics across all of the ideas I’ve suggested and many more (I’d actually be kind of surprised if they aren’t). I’m not really all that concerned about baseball analytics one way or another.

The thing is, it got me thinking about what the big analytic problems are in digital measurement and whether those are, in fact, the opportunities we (collectively) are chasing. In particular, I was wondering if there are cases where we are chasing small problems with a technology big stick.

I think that might be the case, for example, with Campaign Attribution. Campaign Attribution has become a very popular analysis in the last year or two and it’s one that can involve significant expense, considerable integration work, and expensive additional technology. Like fielding efficiency, it’s certainly not without value. If you’re spending boatloads of money on digital campaigns across multiple channels, have significant multi-touch customer behavior, and you’re using last-click behavior to optimize the mix, there’s a pretty good chance that campaign attribution analysis will drive real improvement. On the other hand, there may be relatively light-weight alternatives to getting the same information. With minor tweaks in your Web analytics infrastructure and some inexpensive reporting, you can get a pretty good sense of the multi-touch behavior in your system and the difference between last click and other basic attribution models. This isn’t the same thing as a full attribution model, but it may drive much the same organizational learning.
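The gap between last-click and a basic multi-touch model is easy to illustrate. The sketch below (hypothetical channel names and conversion paths, not any particular vendor's model) credits the same four conversions two different ways:

```python
# Minimal sketch: compare last-click vs. linear (even-split) attribution
# over multi-touch conversion paths. Channels and paths are hypothetical.
from collections import defaultdict

# Each conversion path is the ordered list of channels the customer touched.
paths = [
    ["display", "email", "search"],
    ["search"],
    ["social", "email", "search"],
    ["email", "search"],
]

last_click = defaultdict(float)
linear = defaultdict(float)

for path in paths:
    last_click[path[-1]] += 1.0             # all credit to the final touch
    for channel in path:
        linear[channel] += 1.0 / len(path)  # credit split evenly across touches

print("last-click:", dict(last_click))
print("linear:    ", dict(linear))
```

Here last-click hands all four conversions to search, while the even split surfaces the assist channels – exactly the kind of directional learning a lightweight report can deliver without a full attribution system.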

There are, in addition, two other analytic approaches that are worth considering as alternatives. Media Mix Modeling provides an easier and, in some respects, more actionable analysis than full attribution modeling. By carefully testing the mix of channels in your program, you can create an optimization model for all your spending – even in cases where the customer journey can’t be tracked. Since most attribution models will leave gaping holes in the actual customer experience, Media Mix Modeling has some significant advantages (and it doesn’t require any additional technology infrastructure). Media Mix Modeling is by no means a one-to-one replacement for attribution analysis (we often suggest it as a complementary technique). But it is a potentially attractive and much lower-cost alternative to a full campaign attribution system.
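To make the optimization step concrete, here's a minimal sketch: assuming you've already fitted (or, here, simply assumed) a diminishing-returns response curve per channel, a simple search finds the budget split that maximizes predicted conversions. The curve shape, parameters, and budget are illustrative, not fitted to real data:

```python
# Minimal sketch of the media-mix idea: given per-channel response curves
# with diminishing returns, search budget splits for the mix that
# maximizes predicted response. All parameters are hypothetical.
import math

BUDGET = 100_000  # total spend to allocate (hypothetical)

def response(spend, scale):
    # log-shaped curve: each extra dollar returns less than the last
    return scale * math.log1p(spend)

best_split, best_total = None, -1.0
for pct in range(0, 101):  # % of budget to channel A, remainder to channel B
    a = BUDGET * pct / 100
    b = BUDGET - a
    total = response(a, scale=120) + response(b, scale=80)
    if total > best_total:
        best_split, best_total = pct, total

print(f"best split: {best_split}% to channel A, {100 - best_split}% to B")
```

Note that nothing here requires tracking an individual customer journey – the inputs are aggregate spend and response, which is precisely why the technique works where attribution can't.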

Another approach to the problem is to focus on building customer-level lifetime value models. In this approach, you measure the incremental lift of campaigns on individual customers by comparing their actual results to predicted results. This approach doesn’t work so well in situations with mostly new visitors, anonymous transactors, or very occasional shoppers. However, where you have existing customers and the wherewithal to predict their likely future value, this approach can answer at a deep level the question of whether or not a marketing campaign actually drove incremental value.
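A toy version of that incremental-lift calculation, with made-up customers and predicted values, might look like this:

```python
# Minimal sketch: measure campaign lift as actual value minus a
# per-customer predicted baseline, netting out the baseline model's
# error via the unexposed group. All customers and values are hypothetical.
customers = [
    # (customer_id, predicted_value, actual_value, exposed_to_campaign)
    ("c1", 120.0, 150.0, True),
    ("c2", 200.0, 205.0, True),
    ("c3",  90.0,  88.0, False),
    ("c4", 150.0, 149.0, False),
]

def mean_lift(rows):
    lifts = [actual - predicted for _, predicted, actual, _ in rows]
    return sum(lifts) / len(lifts)

exposed     = [r for r in customers if r[3]]
not_exposed = [r for r in customers if not r[3]]

# Incremental effect: lift among exposed customers, net of the lift
# (i.e., prediction error) among customers the campaign never touched
incremental = mean_lift(exposed) - mean_lift(not_exposed)
print(f"incremental value per exposed customer: {incremental:.2f}")
```

The hard part, of course, is the predicted baseline itself – the comparison step above is trivial once you have a credible per-customer forecast.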

My next example – full traffic counting – may seem a bit surprising. By full traffic counting I mean nothing more than counting every page view to a site. There’s hardly anything more basic than full traffic counting, and it’s something we all do. And therein lies the problem, because I’m not convinced it’s something we all need to do. If you’re selling advertising on your Website, then obviously it’s mandatory that you count views. But the systems to do this are, in most cases, not Web analytics systems. For low-volume sites, the cost of full traffic counting is minimal. But if your site is churning out millions of page views each day, you are likely paying a pretty good chunk of change to count all those views.

What, exactly, do we get out of those views? Given that there is no accounting purpose to such things, why do they add value over the collection of a large sample? In some cases, you’ll want to tie that behavior to specific customers. Fair enough. But if all we are talking about is reporting and analysis, the case for comprehensive collection is much less clear. Let’s suppose you’re paying $400K for Web analytics on a CPM basis. If you took a 1/10 sample of your site traffic, your cost would probably drop to something like $80K (since your CPM rate will go up). That’s enough money to pay for a couple of analysts. What would you lose? If your analytics solution is sampling anyway (and that’s often the case), you’d lose nothing. But even if your analytics solution isn’t sampling, there’s only a very small set of problems that can’t be tackled by a good 1/10 sample. This sampling strategy can be applied in a variety of ways. If you want to track all opens of a Mobile App but want to do GUI analysis of detailed scrolls, you can comprehensively collect the former and sample the latter. Sampling is nearly always an appropriate solution when you want to track actions exclusively for UI analysis, and you’ll still save lots of money.
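One practical wrinkle: sample visitors rather than raw page views, so a sampled visitor's whole journey stays intact. A minimal sketch of deterministic visitor sampling, with the rough cost arithmetic from above repeated as an illustration (the doubled unit rate is an assumption, not a quoted price):

```python
# Minimal sketch: deterministic 1-in-10 visitor sampling by hashing the
# visitor ID. The same visitor is always in (or out of) the sample, so
# complete journeys are preserved. Visitor IDs here are synthetic.
import hashlib

def in_sample(visitor_id: str, rate: int = 10) -> bool:
    digest = hashlib.md5(visitor_id.encode()).hexdigest()
    return int(digest, 16) % rate == 0  # ~1/rate of visitors, stable per ID

visitors = [f"visitor-{i}" for i in range(10_000)]
kept = sum(in_sample(v) for v in visitors)
print(f"kept {kept} of {len(visitors)} visitors (~10%)")

# Rough cost sketch from the text: 1/10 the volume at a (hypothetically)
# doubled unit rate still cuts the bill from ~$400K to ~$80K.
full_cost = 400_000
sampled_cost = full_cost / 10 * 2
print(f"estimated sampled cost: ${sampled_cost:,.0f}")
```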

Multi-variate testing is another technology that I sometimes think is chasing a problem. I’m all in favor of testing, but, as I’ve written before, I have a pretty deep skepticism about the virtues of MVT vs. simple A/B testing. It seems so cool to be testing multiple creative variations at once that it’s easy to forget that good creative has to originate in specific hypotheses about particular problems facing particular customer segments. It’s also easy to forget how expensive it is to develop that good creative. As with each of the other cases, there’s clearly at least some set of use-cases where MVT drives real value versus the alternatives. I just think that set of cases is much smaller than is generally recognized.
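For comparison, the simple A/B alternative really is lightweight. A minimal sketch of a standard two-proportion z-test on hypothetical visit and conversion counts (the figures are invented for illustration):

```python
# Minimal sketch: a plain two-variant (A/B) comparison using a
# two-proportion z-test. Visit and conversion counts are hypothetical.
import math

def z_score(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)       # pooled conversion rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

z = z_score(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}")  # |z| > 1.96 -> significant at the 5% level
```

A test this simple can be run against a single well-motivated hypothesis, which is exactly the discipline the MVT sales pitch tends to skip.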

Which brings us to my last candidate for a big-stick technology chasing a small business problem: real-time reporting. While the need for real-time reporting certainly exists in a few verticals, for most organizations it’s simply not useful. Current Web analytics solutions differ widely in their average latency. If you’re a high-volume publisher without an alternative solution, this may be a significant decision-point. But for almost everybody else, real-time is a difference that makes no difference. Most numbers can’t be meaningfully interpreted in very short time-frames. Most businesses can’t react in anything like real-time. Put these two together and you have little demand for real-time reporting.

All of my examples share some similar characteristics: cool technology, great demo material, and real value in at least some set of circumstances. Each can also (and maybe more often) involve significant costs without necessarily providing commensurate benefits to the organization. In a world of limited resources, chasing the wrong problem with a big technology can cripple your digital analytics effort. Taking this back to baseball, if you spent $126 million on Barry Zito, you’d better pray you drafted Tim Lincecum and Matt Cain to help offset the damage!

Republished with author's permission from original post.

Gary Angel
Gary is the CEO of Digital Mortar. DM is the leading platform for in-store customer journey analytics. It provides near real-time reporting and analysis of how stores performed including full in-store funnel analysis, segmented customer journey analysis, staff evaluation and optimization, and compliance reporting. Prior to founding Digital Mortar, Gary led Ernst & Young's Digital Analytics practice. His previous company, Semphonic, was acquired by EY in 2013.


