Sampling and Social Media

Sampling is one of the core techniques in primary research. In traditional opinion research, the sample is the thing. It’s not just that a valid sample is critical to useful research; it’s also that a good sample is surprisingly difficult to get. But for Social Media measurement, most people don’t think of “sampling” as part of the problem. Social Media is generally considered to be more like Web analytics, where we analyze the complete set of ALL behaviors. After all, tools like Radian6 and BuzzMetrics are designed to capture everything, aren’t they?

Well no, actually they aren’t.

Sampling is surprisingly important at several levels of Social Media Measurement, and limitations on samples have a powerful impact on what types of analysis are appropriate with social data. So it may be no surprise that the role of sampling in Social Media was one of the most interesting parts of my panel this past week at eMetrics.

Let’s start at the top level and work down. Listening tools in social media measurement DON’T collect everything. Collecting everything is impossible. Every tool vendor makes a distinct set of decisions about what (and how often) to collect. They decide which sites to scan and how often to scan them. They also make decisions about what data to collect. Not every vendor, for example, will collect comments on blog posts. Some do, some don’t. All vendors are further limited by the closed nature of communities like Facebook that restrict what can be collected.

So at the very top of the social measurement process there’s a sample. It’s a sample that’s not designed to be representative. Instead, it’s designed to be as comprehensive as is practical. Most vendors choose to collect what they view as the largest feasible collection of information. For some sources, this comes very close to being comprehensive, but for other sources, not so much.

There’s really not a whole lot you can do about this type of sampling except to determine whether the differences by vendor are important to you or not. Still, it’s useful to be aware that you aren’t starting with either a comprehensive collection or a representative sample when it comes to social measurement. This has real implications for how you can use the data and how you should think about your findings.

What about the next step in the process of social media measurement – the step we describe as “culling”? In the culling phase, you subset your data to find the verbatims of interest to you. In most listening tools on the market, you do this by creating keyword profiles that select any verbatims that match the keyword logic you specify (using a combination of Boolean and text operators like NEAR).

This approach holds true EVEN for the vast majority of machine-learning tools. These tools can classify (taxonomically and by sentiment) your verbatims, but they work on a subset of the verbatims that is chosen using keyword profiles.
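To make the mechanics concrete, here’s a minimal sketch of keyword-based culling in Python. The brand “Acme”, the profile logic, and the NEAR window are all hypothetical, and commercial listening tools implement far richer operators, but the principle is the same: only verbatims matching the profile survive into your dataset.

```python
import re

def near(text, a, b, window=5):
    """True if words a and b appear within `window` words of each other."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

def matches_profile(verbatim):
    """Hypothetical profile: ("acme" OR "acmecorp") AND ("widget" NEAR "price")."""
    text = verbatim.lower()
    brand = "acme" in text or "acmecorp" in text
    return brand and near(verbatim, "widget", "price")

verbatims = [
    "Acme's new widget has a great price point",
    "Loving my new phone",  # culled out: no brand match
    "AcmeCorp support was slow, but the widget price was fair",
]
culled = [v for v in verbatims if matches_profile(v)]
print(culled)  # only the two Acme mentions survive into the dataset
```

Whatever a tool’s classification layer does afterward, it never sees the verbatims this filter dropped.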

As with the top of the funnel (collection by the vendor), this subset is usually chosen to be comprehensive, not representative. However, the way you configure this subset will have profound implications for all of your subsequent research.

Suppose, for example, that you create a profile based on every case where your “brand” is mentioned. If you then try to use this profile to do product/feature research, you have a subset biased heavily toward your products. That may be fine or it may be fatal depending on the type of analysis you want to do.

If you want to do competitive analysis, you need to be sure that you set up EVERY competitor in exactly the same way. If you build a rich profile containing your product names and sub-brands, then you need to match that profile exactly for your competitors. If you don’t, you’ve biased all your results with a poor sample.

Or imagine that you want to understand the share of mentions by key topics such as Customer Support, Price Comparisons and Feature Mentions. If you’ve eliminated a significant number of Price Comparisons by setting up exclusionary rules to weed out “sales” posts, you’ve biased your sample.

What’s particularly tricky about bias at this level is that there is virtually no way to detect it – particularly when the bias, as in my last example, is exclusionary. The data simply never shows up in the reports.
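A quick simulation makes the point. Everything here is assumed for illustration: a true topic mix of 40/35/25 and a “sales” exclusion rule that happens to knock out 60% of Price Comparison verbatims but almost nothing else.

```python
import random
from collections import Counter

random.seed(42)

# Hypothetical true topic mix for 1,000 verbatims: 40% / 35% / 25%.
topics = (["Customer Support"] * 400 +
          ["Price Comparison"] * 350 +
          ["Feature Mention"] * 250)

# Price-comparison chatter tends to use sales language, so an exclusion
# rule aimed at "sales" posts hits it disproportionately. These drop
# rates are assumptions for illustration, not measured values.
DROP_RATE = {"Customer Support": 0.02,
             "Price Comparison": 0.60,
             "Feature Mention": 0.05}

surviving = [t for t in topics if random.random() >= DROP_RATE[t]]

def shares(pop):
    return {t: round(100 * n / len(pop)) for t, n in Counter(pop).items()}

print("True shares:    ", shares(topics))     # 40 / 35 / 25
print("Reported shares:", shares(surviving))  # Price Comparison collapses
```

The reported shares still sum to 100% and nothing in the output flags the missing verbatims; the only way to catch the problem is to audit what the exclusion rules removed.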

The vast majority of profiles that we look at in Social Media Measurement introduce significant biases in the subsets they create. Doing so is nearly unavoidable. What’s far more worrisome is that almost no one who is using the data understands what’s been done or what the implications are.

If you’re using machine classification and sentiment analysis, you may be done with sampling after these two levels. On the other hand, if you are using human readers for sentiment analysis and classification, or for the isolation of key verbatims, you’ve got at least one more sampling problem before you’re done.

Many organizations, having realized that the sentiment analysis contained in keyword-based listening systems is somewhat worse than useless, have opted for listening agencies that use human readers to classify sentiment. I thought one of the most surprising aspects of our panel was that both Michael and Christopher were deeply skeptical of the quality (not just the scalability) of this approach. Their objections were concentrated on the problem of human interpretation and the difficulties in achieving consistent sentiment analysis with human readers.

But there’s also a sampling issue here. It’s impractical for most large enterprises to pay for the reading of every verbatim included by the profiles created in the culling process. If your volume of social chatter is small enough, it may not be an issue, but for a large or socially-oriented brand, comprehensive readership is impossible. So you have to sample the subset produced by “culling”.

This sample raises its own set of issues, because it is fully intended to be a representative sample.

But here’s the question: representative of what? If your subset contains 10,000 verbatims in a month, it seems that by taking 1 in every 10 you could create a representative sample of verbatims. And so you could. But let’s suppose that your 10,000 verbatims represented the following:

9,100 Twitter Mentions

800 Blog Mentions

100 Press Mentions

With a 1 in 10 sample, you’d likely end up with something close to the following:

910 Tweets

80 Blogs

10 Press Mentions

I’ve sampled everything perfectly, but I’ve reduced the size of my source populations so that I can no longer draw any conclusions about Press Mentions and only very shaky conclusions about Blogs. For a simple proportion, the 95% margin of error is roughly ±31 points at n=10 and still about ±11 points at n=80, which is far too wide to say anything useful about sentiment in those sources.

This implies that if I want to understand sentiment by source, I should oversample less common sources to make sure that I have sufficient volume for analysis. Unfortunately, the likely set of my interests doesn’t end with source. If I want to understand sentiment by influencer level, for example, I have a different population I need to oversample for.
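A standard fix is stratified sampling with a minimum count per stratum. Here’s a rough sketch, assuming the 10,000-verbatim month above and a floor of 100 verbatims per source; the numbers and the `min_per_stratum` parameter are illustrative, not a recommendation.

```python
import random
from collections import Counter

random.seed(7)

# Each element stands in for one verbatim, labeled by its source
# (the month of 10,000 verbatims from the example above).
population = ["twitter"] * 9100 + ["blog"] * 800 + ["press"] * 100

def proportional_sample(pop, rate=0.10):
    """The naive 1-in-10 sample: rare sources shrink to nothing."""
    return [v for v in pop if random.random() < rate]

def stratified_sample(pop, rate=0.10, min_per_stratum=100):
    """Oversample rare strata so each source stays analyzable."""
    by_source = {}
    for v in pop:
        by_source.setdefault(v, []).append(v)
    sample = []
    for source, items in by_source.items():
        n = max(int(rate * len(items)), min_per_stratum)
        n = min(n, len(items))  # can't take more verbatims than exist
        sample.extend(random.sample(items, n))
    return sample

print(Counter(proportional_sample(population)))  # roughly 910 / 80 / 10
print(Counter(stratified_sample(population)))    # 910 / 100 / 100
```

The catch is that an oversampled stratum no longer reflects its true share of the population, so you have to weight each stratum back down before reporting any overall numbers; that reweighting is one more place for a subtle error to creep in.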

What’s more, if I keep drawing samples for every report and then trending them, sooner or later I’m going to get a bad sample. Suppose I have a 95% confidence that my sample will be ±5% of the real number. If I’m pulling a weekly sample, there’s a good chance that sometime during the year my sample is going to be significantly off – creating either alarm bells or misguided back-patting. Oh, and there’s no immediate way to know that the sample is off unless you repeat the whole process several times.
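“A good chance” actually understates it. Assuming the weekly samples are independent, the arithmetic is simple:

```python
# If each weekly sample independently has a 95% chance of landing
# within +/-5% of the true value, then over a 52-week year:
p_all_good = 0.95 ** 52
print(f"All 52 weeks in bounds: {p_all_good:.1%}")      # ~7.0%
print(f"At least one bad week:  {1 - p_all_good:.1%}")  # ~93.0%
```

In other words, a misleading data point in your trend line is close to a statistical certainty over a year of weekly reports.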

In short, human readership MAY add to the quality of sentiment analysis even as it lessens the quality of the reporting. The cost/benefit of the trade-off is likely to be determined by the degree to which human readership forces sampling and the extent to which an organization needs to slice its data by sub-categories. Since nearly every meaningful report or analysis involves sub-categories, I suspect that human readership – when it demands sampling – is a poor solution for social measurement.

I’ll have more to say in future posts about the whole idea of sentiment analysis. I’m not convinced that social media measurement is the proper channel for measuring either brand awareness or brand sentiment. Much of the reason for my skepticism comes down to the fact that Social Media measurement isn’t based on a valid sample at any level. This doesn’t mean Social Media measurement isn’t interesting or important. It does mean that it can’t fulfill every function equally well, and of the functions that are most problematic, brand sentiment may be at the top of the list!

Many thanks to Michael Healy and Christopher Berry for their thoughts at the panel (and Marshall Sponder as well since he and I talked on Friday). We had a great turnout – which was nice to see – and the discussion was lively and interesting.

Unfortunately, my own sample of the Conference was a poor one. I was just crushed by meetings on Wednesday when I was speaking at eMetrics, missed Thursday, and found the IMC Conference pretty much DOA on Friday (I’m thinking that particular Sub-Conference might need a bullet in the head to put it out of its misery). On the plus side, I’ll be back on the East Coast in a couple of weeks and there’s still plenty of time to register for the WAA Symposium in Philadelphia – should be a terrific event – and I’m looking forward to my “nouveau” panel!

Republished with author's permission from original post.

Gary Angel
Gary is the CEO of Digital Mortar. DM is the leading platform for in-store customer journey analytics. It provides near real-time reporting and analysis of how stores performed including full in-store funnel analysis, segmented customer journey analysis, staff evaluation and optimization, and compliance reporting. Prior to founding Digital Mortar, Gary led Ernst & Young's Digital Analytics practice. His previous company, Semphonic, was acquired by EY in 2013.
