We *heart* Unstructured Data.



This is the second in a series on how CX practitioners can move past focusing on the ‘bigness’ of data and get ahead with smart data techniques. In case you missed it, here’s the first post: Forget Big(ger) Data: It’s Time to Get Smart(er).


One way to help smart data drive more decisions is by putting qualitative components into your voice of the customer (VOC) program. Qualitative feedback is easy to collect, it’s cost-effective and it gets you closer to the root cause behind the rating. However, it still relies on a single source of data: surveys.

An alternate approach to becoming ‘smarter’ about data and getting at the why, how and when of customer experience (CX) is using unstructured sources that are already available. Although we know information is everywhere, unstructured data in particular is not dealt with very effectively. To bring back an earlier theme, some data sources truly are big data. Voice transcriptions, the Twitter fire hose, chat sessions and real-time IoT are absolutely big data. Other data sources aren’t really that big by today’s standards; they simply aren’t ready to use. Why? Because they’re unstructured and sit in different source systems without a pre-existing way to extract and analyze them (which is a topic for another day).

What I want to focus on here is the nature of dealing with unstructured data. That’s where text analytics comes in. Text automation and analysis provide a way of harnessing unstructured information and incorporating it into insights and planning. I’ve been working with text for the last 12 years and have watched the field change a tremendous amount. In the earliest commercial offerings, text software was very time-consuming to work with. It was heavily geared toward predictive modeling and typically used methods based on statistics and machine learning. Capabilities were far less robust for the most common analytic application today – survey verbatim classification through building taxonomies. (I remember working with one insurer whose team spent over 3,000 man-hours building out text analytics just for categorizing claims notes.)
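
To make "classification through building taxonomies" concrete, here is a minimal sketch of the idea, not how any commercial engine works. The category names, keywords and the excluded phrase are all hypothetical; a real taxonomy would be far larger and tuned against your own verbatims.

```python
import re

# Hypothetical taxonomy: category -> trigger keywords (illustrative only).
TAXONOMY = {
    "wait_time": {"wait", "waiting", "hold", "queue"},
    "billing": {"bill", "charge", "fee", "refund"},
    "staff": {"agent", "rep", "teller", "staff"},
}

# Hypothetical exclusions: phrases removed before matching so they do not
# trigger a category by accident ("free of charge" is not a billing issue).
EXCLUSIONS = ("free of charge",)


def classify(verbatim, taxonomy=TAXONOMY, exclusions=EXCLUSIONS):
    """Return the sorted list of taxonomy categories a verbatim falls into."""
    text = verbatim.lower()
    for phrase in exclusions:
        text = text.replace(phrase, " ")
    tokens = set(re.findall(r"[a-z']+", text))
    # A category matches when at least one of its keywords appears.
    return sorted(cat for cat, keywords in taxonomy.items() if tokens & keywords)


verbatims = [
    "I was on hold for 40 minutes before an agent picked up.",
    "There was a fee on my bill that nobody could explain.",
    "The upgrade was free of charge.",
]
for v in verbatims:
    print(classify(v), "<-", v)
```

Keyword matching like this is the simplest possible form of a taxonomy. Most of the effort in practice goes into growing the keyword sets and exclusions as new verbatims come in.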

Clearly, times and adoption rates have changed. Today’s off-the-shelf options deliver fairly decent models within 1-2 days of tweaking. In fact, I’d estimate about half of CX practitioners already have text analytics deployed within their VOC platform and, in most cases, it’s tightly integrated into an online portal. But I’d argue that, as a whole, text analytics still are used in too limited a fashion. Approximately two-thirds of companies I’ve consulted with deploy text mining only against survey open-ends, completely missing other rich sources of smart data that deliver a broader customer view.

Give Your Platform a Boost

When you’re searching around for additional data to include in your text analytics environment, try to prioritize. Customer emails, online feedback and community forums are three of the cleaner sources for augmenting VOC verbatims. Each is easy to use and commercial tools usually have some functionality already built in. Moving up the complexity chain, chat sessions also are a tremendously rich source for understanding CX. Sessions are a record of the interaction itself that can be tied directly to a survey. Ironically, the data I go to least often are call center or agent notes. They sound great in theory but in most cases there’s wide variability in accuracy and usage. Agents don’t use the same acronyms, keep notes the same way or capture information about the same things during calls. I encourage you to look there but it’s not going to be the quickest way to generate value from text analysis.

If you want to systematically deploy text as a smart data initiative, your first step should be moving past whatever default text mining capability is in the portal. Instead, there are three things to look for:

  1. A solution that’s optimized for multiple sources: Some text engines are strong in social media analysis, others are optimized for short-term survey feedback like you might get from an SMS survey. What you really need is a full-featured, stand-alone text analytics function … something designed from the ground up to be source-agnostic with a strong NLP engine. Skip simple word counts; go straight to context and sentiment. Getting sentiment accuracy is an ongoing, iterative process. Expect to spend time fine-tuning it. Do a gut check by comparing sentiment output to what you see in survey responses. You also want something flexible in handling data. Text is hard to work with, so find a solution that allows ongoing flexibility in categorization, vocabulary and exclusions. Depending on the strength of your Data Science organization, you can investigate open source tool kits like Stanford’s CoreNLP, OpenNLP or even R to customize to your environment. Many of these have fairly easy-to-use implementations and offer a high degree of customizability.
  2. A solution that integrates with core CX reporting: Integrate insights into regular reporting, particularly dashboards and push reports, so staff continue to see value. If you skip this, you’ll likely see your text engine fading from visibility and your budget shrinking exponentially. We know text analytics is subjective (read: ‘it’s not perfect’), so the more you tune, the better fit you’ll get. That fit is critical in building credibility as stakeholders compare text output to other CX insights.
  3. A solution that offers both business and use cases: Text analytics for its own sake is only interesting to an analyst. The strongest ROI in executive sponsorship comes when you use non-survey data to deepen what the company knows about VOC, as well as when you analyze it with customer complaint data and email data. In our practice, we’re starting to use unstructured interaction data to predict customer behaviors … without the use of VOC data. (I talked about this in a session at THE Market Research Event last month.)
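
The sentiment gut check mentioned in the first point can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (`id`, `rating`, `sentiment`) is hypothetical, the sentiment labels stand in for whatever your text engine outputs, and the 0-10 rating cut-offs borrow the usual NPS promoter/detractor bands.

```python
def rating_bucket(rating):
    """Map a 0-10 survey rating to a coarse sentiment bucket (assumed cut-offs)."""
    if rating >= 9:
        return "positive"
    if rating <= 6:
        return "negative"
    return "neutral"


def gut_check(records):
    """Compare engine sentiment to survey ratings.

    Returns the share of records where the two agree, plus the
    disagreeing records so an analyst can review them by hand.
    """
    if not records:
        raise ValueError("need at least one record")
    mismatches = [r for r in records if rating_bucket(r["rating"]) != r["sentiment"]]
    agreement = 1 - len(mismatches) / len(records)
    return agreement, mismatches


# Hypothetical sample: survey rating plus the sentiment label from the text engine.
records = [
    {"id": 1, "rating": 10, "sentiment": "positive"},
    {"id": 2, "rating": 2, "sentiment": "negative"},
    {"id": 3, "rating": 8, "sentiment": "neutral"},
    {"id": 4, "rating": 3, "sentiment": "positive"},  # likely mis-scored; worth a look
]
agreement, mismatches = gut_check(records)
print(f"agreement: {agreement:.0%}")
print("review these ids:", [r["id"] for r in mismatches])
```

The mismatch list is usually more useful than the agreement rate itself, because it tells you which phrases or topics the engine needs tuning on.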

Here’s Proof It Works

Sound too complex? It’s not. Check out these examples:

One of our financial services clients was struggling to understand how to use call center process data for root cause analysis. The company had a mandate to improve transactional NPS from the top down and needed to meet an aggressive year-end goal. We helped aggregate survey scores and unstructured data by pulling together verbatims, coaching feedback and agent insights. Text analytics identified root causes that were included in action planning and immediately impacted NPS. By including text results in dashboards and metric management, the company successfully integrated unstructured insights into a comprehensive improvement process. Key finding: One factor driving call center volume was a previously unknown issue with workforce staffing in the branches during the lunch hour. Low staffing levels drove long wait times, increased customer frustration and higher call center volume. When the company changed branch staffing during peak arrival times, NPS improved rapidly, with a 12-point gain expected once all changes were implemented.

We recently conducted a project for a leading communications provider to try to predict NPS among customers who don’t take a survey. In the engagement, we partitioned hundreds of thousands of chat transcripts into meaningful categories and separated customer from agent comments. We matched results of individual surveys to a specific chat session, resulting in tens of thousands of matched records. The net result? We were able to predict NPS with a surprising degree of accuracy, based completely on text analytics. Using these insights, the organization scored chat sessions in near real-time to identify opportunities for immediate intervention and service recovery. The company launched a small pilot and found attrition rates fell 8% for a test group compared to a control.
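
The data preparation behind a project like this can be sketched roughly as follows. This is a simplified illustration, not the pipeline we used: the `Speaker: text` transcript format, the session ids and the sample data are all assumptions. It covers only separating customer comments, matching chats to surveys and computing NPS (percent promoters, rated 9-10, minus percent detractors, rated 0-6); the predictive model itself is out of scope here.

```python
def customer_text(transcript):
    """Keep only the customer's side of a chat.

    Assumes each line looks like 'Speaker: text' (hypothetical format).
    """
    kept = []
    for line in transcript.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip().lower() == "customer":
            kept.append(text.strip())
    return " ".join(kept)


def match_sessions(chats, surveys):
    """Join chats to survey ratings on session id; unmatched records are dropped."""
    return [
        {"session_id": sid, "text": customer_text(chats[sid]), "rating": rating}
        for sid, rating in surveys.items()
        if sid in chats
    ]


def nps(ratings):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical sample data keyed by session id.
chats = {
    "s1": "Agent: Hi, how can I help?\nCustomer: My internet keeps dropping.\n"
          "Agent: Sorry about that.\nCustomer: This is the third time this month.",
    "s2": "Agent: Hello!\nCustomer: Thanks, that fixed it right away.",
    "s3": "Agent: Hello!\nCustomer: Just checking my balance.",
}
surveys = {"s1": 3, "s2": 10, "s4": 8}  # s3 has no survey, s4 has no chat

matched = match_sessions(chats, surveys)
for row in matched:
    print(row)
print("NPS of matched sessions:", nps([row["rating"] for row in matched]))
```

The matched records are what a model would be trained on: customer-only text as the input and the survey rating as the target. Unsurveyed sessions such as `s3` are the ones you would then score.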

Next Up

In the final blog in this series, I’ll share more information about how to implement a real-time intervention strategy using text analytics. In the meantime, I recently conducted a webinar called “Predictive Analytics: Evolving from Big Data to Smart Data.” If you’d like to watch the recording, click here.

Read the rest of our Smart Data blog series:
Part I: Forget Big(ger) Data: It’s Time to Get Smart(er).
Part II: Smart Data: Integration to Action

Image source: Thinkstock

John Georgesen, Ph.D.
John Georgesen, Ph.D., is Senior Director, Analytics at Concentrix. He specializes in designing customer experience (CX) programs that drive tangible improvements. With 20 years of applied experience, John is a recognized innovator in the field of customer experience management.