Online Opinion Research: The Sampling Problem Revisited



In one of my first posts on Social Media Measurement, I pointed out the hidden difficulties of sampling social data and the impact this has on the use of Social Media metrics for consumer and brand research. When you can’t pull a representative sample, your research results are biased and, in some cases, unusable. Because social media measurement has pretensions to providing universal coverage, most people don’t realize they are sampling the data and have paid little or no attention to the sampling issues there. That certainly isn’t true for online opinion research. Everybody understands that online opinion research is based on a sample. But how good a sample is it? Is the quality of our sample changing over time? And, finally, what are the limitations on the uses of online opinion research given the limitations of sampling?

These points become particularly important if you try to, as I’ve suggested you should, use online opinion research for more than site satisfaction tracking. When you’re trying to explore the drivers of consumer choice, understanding the “shape” of your sample is going to be critical.

Let me give an obvious and ubiquitous example. Suppose traffic to my web store (and subsequent conversion) is declining and I want to use opinion research to find out why. I can create a series of questions for visitors to my Website to find out why some people didn’t end up purchasing. But suppose the real problem is that a segment of my previous customers has simply stopped needing my product. If that’s true, I’ll never find it out from asking visitors to my Website. The consumer audience I’ve lost has no reason to visit the site and be sampled.

This most basic online sampling limitation (site visitors) may seem almost too obvious to note, but its implications are profound and more easily missed than you might think. With online survey research, your sample is always based on the population of site visitors. This makes online survey research inappropriate for brand awareness, brand tracking, and early-stage consideration research.

Here is a related and critical fact about online opinion research that is easily forgotten: as you gear up your PPC program, improve your SEO, add functionality or Content, or take down your Display advertising, you are changing the population you are sampling on your Website. In traditional opinion research, your sample and your marketing are entirely unrelated. That’s simply not the case when it comes to online surveys.

With that in mind, it's fair to ask how representative an online sample is of the population of site visitors. I'm really not sure. I see two very common attitudes about this question. Most common, perhaps, is a blithe assurance that it's all fine. People who aren't schooled in opinion research generally don't appreciate how hard getting a good sample is and don't really recognize how damaging a bad sample can be to most uses of the data. Almost as common is a deep but largely unsubstantiated concern that online samples skew toward the "happy" and the "unhappy."

If all you're doing with your online opinion research is trending site satisfaction, this might not actually matter very much. As long as the skew is relatively constant (which is at least possible), then a sample skewed toward strongly emotional visitors may not matter. Of course, given that your sample is influenced by your site marketing, there may be no stupider use of online opinion research than trending topline satisfaction scores!

Regardless, if you’re trying to use opinion research to understand customer drivers of choice, then oversampling the strongly opinionated population can lead you significantly astray.

There are several ways to check the quality of an online sample. The simplest one, a method I generally recommend, is to integrate the sample with your behavioral data (Web analytics) and measure the representativeness of your sample versus key behavioral metrics. Doing this will not only allow you to measure sample skew, it will allow you to oversample or weight your sample using behavioral metrics to study under-represented populations.
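As a rough sketch of what that integration looks like in practice, the snippet below compares the distribution of a behavioral metric between all site visitors and survey respondents, then derives a representation ratio and a post-stratification weight per segment. The segment names and counts are hypothetical illustration data, not figures from the article.

```python
import pandas as pd

# Hypothetical data: all site visitors vs. survey respondents,
# bucketed by an engagement metric pulled from web analytics.
all_visitors = pd.DataFrame({
    "engagement": ["low", "medium", "high"],
    "visitors":   [7000, 2500, 500],
})
respondents = pd.DataFrame({
    "engagement": ["low", "medium", "high"],
    "responses":  [80, 70, 50],
})

skew = all_visitors.merge(respondents, on="engagement")
skew["pop_share"]    = skew["visitors"]  / skew["visitors"].sum()
skew["sample_share"] = skew["responses"] / skew["responses"].sum()

# Representation ratio: >1 means the segment is over-sampled, <1 under-sampled.
skew["rep_ratio"] = skew["sample_share"] / skew["pop_share"]

# Post-stratification weight to re-balance responses toward the visitor mix.
skew["weight"] = skew["pop_share"] / skew["sample_share"]
print(skew[["engagement", "rep_ratio", "weight"]])
```

In this made-up example, the highly engaged segment is over-represented five-fold, which is exactly the kind of skew toward the behaviorally engaged discussed below; the weights let you correct for it when analyzing responses.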

If you're sampling a known population (such as registered customers), you may also be able to measure the sample against known exogenous data (such as customer demographics, relationships, account size, purchase history, call volume, etc.). When available, this method has some distinct advantages: it lets you measure sample skew and potentially oversample or weight your sample against key consumer characteristics, which is clearly better than relying on behavioral metrics alone.
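When the population mix is known, a simple check is a chi-square goodness-of-fit comparison between the sample and the known proportions, plus per-segment weights. The demographic segments and counts below are hypothetical, chosen only to illustrate the mechanics.

```python
# Hypothetical: known demographic mix of the registered customer base
# vs. the demographic counts observed among survey respondents.
population_share = {"18-34": 0.30, "35-54": 0.45, "55+": 0.25}
sample_counts    = {"18-34": 90,   "35-54": 60,   "55+": 50}

n = sum(sample_counts.values())

# Chi-square goodness-of-fit statistic: sum of (observed - expected)^2 / expected.
# Larger values mean the sample deviates more from the known population mix.
chi_sq = sum(
    (sample_counts[g] - n * p) ** 2 / (n * p)
    for g, p in population_share.items()
)

# Per-segment weights that re-align the sample with the known population.
weights = {g: (p * n) / sample_counts[g] for g, p in population_share.items()}
print(f"chi-square = {chi_sq:.1f}")
print(weights)
```

Comparing the statistic against a chi-square critical value (degrees of freedom = segments minus one) tells you whether the skew is large enough to worry about; the weights then correct for it in any driver analysis.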

I don't think there is one right answer to the quality of online sampling. Not only will it vary based on offer methodology, choice of survey instrument, industry vertical, and strength of brand, it may change over time. In my WAA session in Philadelphia, several of us wondered whether surveys have become too pervasive on the Web, damaging every organization's ability to pull a representative sample. In the offline world, the dramatic growth of telemarketing made phone surveys much harder to do well; too many customers simply opt out of the medium, making it almost impossible to get a good sample.

In our experience here at Semphonic, most online surveys we’ve tested have done a fairly creditable job of sampling the “known” population. They have also mostly oversampled the behaviorally very engaged (likely translating into the most and least happy). You should not take either of these statements as a given. The use of anecdotal experience as a predictor of wider truths is a classic sampling error and one to which consumers of case studies are all too prone! We haven’t studied this problem with anything like a large enough sample to be broadly predictive.

Test it for yourself – it’s the only way to be sure.

Republished with author's permission from original post.

Gary Angel
Gary is the CEO of Digital Mortar. DM is the leading platform for in-store customer journey analytics. It provides near real-time reporting and analysis of how stores performed including full in-store funnel analysis, segmented customer journey analysis, staff evaluation and optimization, and compliance reporting. Prior to founding Digital Mortar, Gary led Ernst & Young's Digital Analytics practice. His previous company, Semphonic, was acquired by EY in 2013.

