Thoughts on Sentiment Analysis from eMetrics


I’ve caught less of eMetrics than I was hoping, as the accumulated pile of work in my office kept me extremely busy these past few days. It’s a big advantage to have a conference in your home city. It’s a big disadvantage too. It’s hard to resist the office, and I’ve got two large analysis projects on my plate that simply demanded my attention.

A shame, because I missed some great speakers. That said, I wanted to write out a few notes from the very interesting and enjoyable session I shared with Michael Healy on Sentiment Analysis. I’m also going to be writing a post on Comscore’s big product announcement: Digital Analytix. This may be the single most important new product introduction in our field since Google Analytics, and it has the potential to significantly alter the Web analytics vendor landscape.

Before I go there, however, here’s a quick overview of the sentiment session.

Point 1: What’s the Business Purpose of Sentiment Analysis?
There’s little point to the standard use of sentiment analysis: positive/negative/neutral at the brand level. Why? First, the raw statistics, like all global measurements, are nearly impossible to interpret. We often say that there are no site-wide KPIs in web analytics (and we mean it). Similarly, there are no brand-wide KPIs in social. Everything, to be meaningful, needs further classification. If your positive sentiment is up, it can easily be because people like your mass-media campaign, or because people are responding to your viral efforts, or because they like your products better, or because the tool you use improved its analysis system, or because of data quality issues. Unless you’ve classified the data, you have no way of knowing which of these explanations (or a host of others) might be true. Second, both Michael and I agreed that the tools themselves are simply not up to the job, especially the tools in widespread use for Social Listening. Michael is very deep into this type of analysis and he speaks convincingly about how hard the problem of sentiment analysis is and how relatively primitive the current generation of tools is.

That being said, there are deep uses for sentiment analysis. It can be used in the context of careful classification for audience research purposes. It can be used to cull out potentially important posts (“sponge-worthy” in the Seinfeld lexicon) from a larger brand stream, and it can be used to drive alerts or research into potentially interesting topics.
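To make the classification point from this section concrete, here is a minimal Python sketch. The mentions, topic names, and sentiment labels are all invented for illustration (they are not data from the session), and the labels are assumed to come from hand classification. It simply compares a single brand-level positive share with the same measure broken out by topic.

```python
from collections import defaultdict

# Hypothetical, hand-labeled mentions as (topic, sentiment) pairs.
# Topics and labels are illustrative assumptions, not real data.
mentions = [
    ("tv_campaign", "positive"),
    ("tv_campaign", "positive"),
    ("tv_campaign", "positive"),
    ("product", "negative"),
    ("product", "negative"),
    ("product", "positive"),
    ("customer_service", "neutral"),
    ("customer_service", "negative"),
]


def positive_share(labels):
    """Fraction of sentiment labels that are 'positive'."""
    labels = list(labels)
    return sum(1 for label in labels if label == "positive") / len(labels)


def share_by_topic(rows):
    """Positive share computed separately for each topic."""
    by_topic = defaultdict(list)
    for topic, sentiment in rows:
        by_topic[topic].append(sentiment)
    return {topic: positive_share(labels) for topic, labels in by_topic.items()}


# One brand-wide number versus the classified breakdown.
brand_level = positive_share(sentiment for _, sentiment in mentions)
topic_level = share_by_topic(mentions)

print(f"brand-level positive share: {brand_level:.2f}")
for topic, share in topic_level.items():
    print(f"  {topic}: {share:.2f}")
```

In this made-up sample the brand-level number sits at one half, while the campaign is entirely positive and the product mentions are mostly negative. The single number hides exactly the distinction you would want to act on.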

Point 2: What’s the right way to do sentiment analysis?
Other than “Not at all”, which may be the best answer, human classification is likely to be your best bet. I know of several companies that rely on human classification for sentiment and it does work significantly better than current machine classifications. That being said, it’s a time-consuming and expensive process to manually classify social mentions, so you really have to want that data. In the long run, automation is essential to scale but it will take better technology than the mainstream products currently deployed. This is one of those places where a company needs to be willing to partner with a smaller technology provider and really work and grow with them.

Would it be worthwhile? That brings me to Point #3.

Point 3: What’s the right way to start using Social Data?
As impactful as Social Media Dashboarding and the integration of Social Data into Scorecards is, it isn’t necessarily the right first step. In Web Analytics, almost every organization does reporting before analysis. In Social Media, they should do it the other way around. Why?

It’s pretty simple really. It takes time and effort to get comfortable with Social Media data collection, culling and classification. The best way to really understand your data is to analyze it in depth. Dashboards and reports don’t do that – and they tend to leave serious problems in the data buried and undetected. With the Web Analytics vendors, a reasonable job in implementation will get you a pretty clean and understandable data stream. That’s just not true in Social Media.

Companies need to walk before they can run. In Web analytics that means reporting before analysis, but in Social Media it’s just the opposite.

Fortunately, there’s a tremendous amount of competitive, brand and audience segmentation information trapped in social media data. It isn’t going to emerge easily from scorecards, but it can be tapped with hands-on analysis using more advanced text-classification tools. This data is rich in the type of color that makes it possible to connect with your customers, build better segmentations, identify competitive opportunities and risks and make all your marketing more relevant – really the entire panoply of classic marketing research.

Point 4: What’s the most comparable research instrument?
Social data isn’t nearly as accurate or scientific as survey data, and because it’s so uncontrolled it probably never will be. No matter how good our tools get, the environment of social media just doesn’t lend itself to careful data control. So it would be wrong to think of Social data as the world’s largest survey instrument. On the other hand, it can be far deeper and richer than Survey data. It’s closer, in many respects, to a focus group. Unlike a focus group, you can’t steer the conversation to the key points of interest and it’s much harder to do segmentation, but it’s similar in that what you are getting is deep color and consumer attitudes that often have to be subjectively interpreted. There is more scope for analysis in Social data, and it is broader and potentially more exploratory than focus group data, but I think there is an opportunity to learn from the way focus group data is consolidated and presented to the organization.

Ultimately, I think social is its own distinct entity. Just as I tell our clients that OpinionLab and ForeSee aren’t competitors but complementary research instruments that tackle fundamentally different problems, I think Social data will prove to be a distinct and extremely important avenue for marketing research.

Point 5: So what to think about tools?
Tools are a big decision and a big pain point for Social Media analysis. One of the most important points I’d make here is that organizations don’t need to (and probably shouldn’t) standardize on a single tool. The market is too nascent and the functional requirements too distinct to make it likely that one tool will meet all your needs.

The distinction I’ve often made between measurement and monitoring is a core part of this decision. For monitoring, the ability to collect from a wide stream of data, quickly identify potentially interesting posts, and then engage in organizational conversation and workflow are all critical elements. For measurement and analysis, sophisticated and flexible classification of text is by far the most important capability. I think most organizations that want to do measurement will also want to do monitoring (though not necessarily vice versa), so it’s important not to get caught up in the “we have to pick a single tool” mantra.

It’s also possible to go the “roll your own solution” route by collecting data off of key platforms (Twitter makes this particularly easy) and then either creating your own processing solutions or customizing them using commercial text analysis systems (SAS for example). There’s a considerable investment (in time at the very least) involved in this approach, so it goes back to how important this type of analysis is to you. If it’s worth being on the bleeding-edge, then a “roll your own solution” approach is worth considering.

Point 6: And how does any of this tie in to Digital Database Marketing?
In my introduction at the session, I mentioned that the two topics I’ve personally been most involved in over the last 18 months are Social Media analytics and Digital Database Marketing. No connection, right? Not so.

In an earlier post from my ongoing series on Database Marketing I walked through the role of survey research in traditional database marketing. Survey research provides the essential link between the segments you can target (the data you have) and the attitudes, needs, and desires of that segment. Without that bridge, you’re left building campaign and marketing strategies for target segments on gut instinct. That’s not right.

Properly used, Social data can provide a similar type of bridge in the digital world. It’s not a perfect instrument because Social Data isn’t embedded in the targeting data you have. In traditional database marketing, many of the targeting variables were the identical demographics being collected in the survey. In Digital that’s not true. The Digital behaviors we use for targeting aren’t embedded in either Social or Survey data.

The trick is to use the type of meta-data I’ve been talking about to map Web site behaviors to topic classifications in the social space. It’s in the classification of the data that the potential for mapping, analysis, and usage actually resides. If you know that viewing Page X on the Website indicates early-stage interest in, for example, an investment product, you can use that knowledge to fine-tune your understanding of the issues, attitudes, preferences, and characteristics of the Social audience discussing that investment product. It’s from that knowledge that targeting and creative testing strategies should emerge.
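As a rough illustration of that mapping, the Python sketch below joins page meta-data to classified social mentions through a shared topic label. Everything in it is a hypothetical assumption made for the example: the page paths, the topic and stage labels, the themes, and the helper name `themes_for_page`. It is one way the idea could be expressed, not a description of any particular tool or of a method presented at the session.

```python
from collections import Counter

# Hypothetical page meta-data: page path -> (topic, buying stage).
PAGE_META = {
    "/products/index-funds/overview": ("index_funds", "early"),
    "/products/index-funds/apply": ("index_funds", "late"),
    "/products/annuities/overview": ("annuities", "early"),
}

# Hypothetical social mentions, already classified by topic and theme.
SOCIAL_MENTIONS = [
    {"topic": "index_funds", "theme": "fees"},
    {"topic": "index_funds", "theme": "fees"},
    {"topic": "index_funds", "theme": "risk"},
    {"topic": "annuities", "theme": "complexity"},
]


def themes_for_page(page, page_meta, mentions):
    """Return the page's topic and stage, plus the social themes
    discussed for that topic, most frequent first."""
    topic, stage = page_meta[page]
    counts = Counter(m["theme"] for m in mentions if m["topic"] == topic)
    return topic, stage, counts.most_common()


topic, stage, themes = themes_for_page(
    "/products/index-funds/overview", PAGE_META, SOCIAL_MENTIONS
)
print(topic, stage, themes)
```

The shared topic classification is what does the work here. The Web behavior tells you who is showing early-stage interest, and the social themes for the same topic suggest what that audience cares about, which is the raw material for targeting and creative tests.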

I’m sure I haven’t captured more than half of the discussion but I’m hoping that at least I’ve remembered the most significant half. It was a thoroughly enjoyable session and many thanks to Michael for sharing it with me.

Republished with author's permission from original post.

Gary Angel
Gary is the CEO of Digital Mortar. DM is the leading platform for in-store customer journey analytics. It provides near real-time reporting and analysis of how stores performed including full in-store funnel analysis, segmented customer journey analysis, staff evaluation and optimization, and compliance reporting. Prior to founding Digital Mortar, Gary led Ernst & Young's Digital Analytics practice. His previous company, Semphonic, was acquired by EY in 2013.


  1. Great Article Gary…

    I liked your line ” In Web analytics, we do first reporting and then analysis but it should be other way around in social media analysis”

    Lack of tools intelligence to decide tonality of the sentiment is the biggest hurdle. It requires a lot of human effort to clean social media data.

