How Reviews-Focused NLP Facilitates Discovery of Customer Insights


When used effectively, new technological innovations can shape the way companies understand and manage the customer experience. Natural Language Processing (NLP) technology is one such innovation that can facilitate the discovery of significant customer insights in a short amount of time.

Why Natural Language Processing is Useful

Natural language processing is the ability of a computer program to understand human language as it is written and spoken. Today, companies are using this technological innovation to interpret customer feedback (from online reviews and social media comments, for example) and unlock insights.

Using advanced machine learning techniques, NLP models can read through thousands of reviews, comments, and feedback in the time it would take a human to read through just a few. The right NLP technology will provide valuable summaries, trends, and statistics that can be applied to support data-driven decision-making and business innovations.

For example, a company might begin to notice a negative trend in reviews that talk specifically about one of their store locations.

Diving deeper, they may find that the program regularly surfaces the negative keyword “smelly.” This leads them to a number of reviews mentioning a dumpster near the entrance. The company can then take action and relocate the dumpster to the back of the building, resolving this recurring customer annoyance.

What Kind of NLP Solutions are Available?

Build your own model: The most customizable approach is to create your own in-house machine learning model. This is somewhat unrealistic, except perhaps for the largest of companies, because it requires a dedicated team of software engineers and data scientists to build and maintain.

Use a generic solution: Another approach is a generic out-of-the-box solution, such as those offered by Amazon (Amazon Comprehend) or IBM (Watson). These are designed to be easy to use even without programming skills.

Use a tailored solution: NLP models are also often built to provide specific solutions. A balanced approach for most use cases is to work with a company that offers a product that leverages advanced machine learning technology and is specifically tailored to, say, customer conversations and Voice of the Customer data.

Managing Customer Reviews and Feedback? How NLP Can Help

Whenever customers review a company, they share information that is more useful than what you’ll get from star ratings or satisfaction scores alone. The value lies in the textual information contained in the review: a type of unstructured data that an NLP solution can distill into insights, helping brands achieve a more accurate, complete, and unified view of the customer.

When it comes to analyzing review data, Natural Language Processing involves three core tasks: keyword extraction, sentiment analysis, and classification.

Keyword Extraction

Keyword extraction is a task that consists of extracting relevant terms from the review text.

The definition of what constitutes a relevant term can vary widely based on the data type and user needs. For example, over a corpus of news text, it is common to extract person, business, and location names. For review text, this can be much broader.

Let’s look at the example review below. It is taken from the restaurant domain, but the same concepts apply to reviews in other domains, such as retail or healthcare:

“The waiter forgot our drinks at first, but they were worth the wait. So unique and tasty!”

Initially, extracting only nouns may sound sufficient, but human language is wonderfully diverse and messy, so in practice many relevant pieces of information surface as different parts of speech. The simple approach of extracting only important nouns would only capture waiter, drinks, and possibly wait from our example above. Most analyses would benefit from capturing additional terms.

The example above contains the adjectives unique and tasty and the verb forgot. It is not a stretch to imagine other reviews that contain important adverbs such as quickly or professionally.
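To make the difference in coverage concrete, here is a minimal Python sketch. It assumes a tiny hand-written part-of-speech lookup (the `POS_TAGS` table and the `extract_keywords` helper are hypothetical and stand in for a trained tagger) and compares noun-only extraction with a broader set of parts of speech on the example review.

```python
import re

# Hypothetical, hand-labeled part-of-speech lookup for the example review.
# A real system would use a trained tagger; this table is an assumption
# made purely for illustration.
POS_TAGS = {
    "waiter": "NOUN", "drinks": "NOUN", "wait": "NOUN",
    "forgot": "VERB",
    "unique": "ADJ", "tasty": "ADJ",
}

def extract_keywords(text, allowed_pos):
    """Return words whose tag is in allowed_pos, in order of first appearance."""
    tokens = re.findall(r"[a-z']+", text.lower())
    seen = []
    for tok in tokens:
        if POS_TAGS.get(tok) in allowed_pos and tok not in seen:
            seen.append(tok)
    return seen

review = ("The waiter forgot our drinks at first, but they were worth the wait. "
          "So unique and tasty!")

nouns_only = extract_keywords(review, {"NOUN"})
broader = extract_keywords(review, {"NOUN", "VERB", "ADJ"})
print(nouns_only)  # ['waiter', 'drinks', 'wait']
print(broader)     # ['waiter', 'forgot', 'drinks', 'wait', 'unique', 'tasty']
```

The noun-only pass never sees forgot, unique, or tasty, which are arguably the most informative words in the review.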

Takeaway: The informed user of a Natural Language Processing system should be aware of what types of keywords their system can or cannot extract, and decide what level of coverage is optimal for their needs.

Sentiment Analysis

Sentiment analysis is another Natural Language Processing task that assigns a sentiment prediction to a word or piece of text.

When applied to reviews, this in effect analyzes whether the writer of the review is pleased or not with the topics they are writing about.

Some research directions explore predicting more specific emotional qualities, such as angry, fearful, happy, sad, etc., but the overwhelming majority of systems use either a binary positive vs. negative sentiment, or sometimes include a neutral sentiment option in between.

Again, an example review will be used to highlight different approaches to this task. In it, the reviewer mentions both a burger they loved and the service they received.

The simplest approach is to assign a single sentiment label to the entire review. For very short reviews, this approach may be effective, but it is insufficient for reviews that mention both good and bad attributes. As seen with the example review above, marking the entire review this way does not meaningfully capture the whole picture.
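A rough sketch of labeling a whole review at once shows how detail gets lost. Everything here is an assumption for illustration: the `LEXICON` weights are made up, the `review_sentiment` helper is hypothetical, and the review text is a stand-in written for this sketch rather than the article's example. Production systems typically use trained models, not word lists.

```python
import re

# Hypothetical toy sentiment lexicon; the words and weights are assumptions.
LEXICON = {"amazing": 1, "loved": 1, "slow": -1, "rude": -1}

def review_sentiment(text):
    """Assign one label to the whole review by summing word polarities."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Hypothetical stand-in review written for this sketch.
review = "The burger was amazing. I loved it, but the service was slow."
label = review_sentiment(review)
print(label)  # positive
```

The single label "positive" hides the complaint about slow service entirely.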

The next step is to predict the sentiment for each sentence in the review. However, as the example review shows, it is not uncommon for sentiment to be mixed within a single sentence. Some systems will simply return the sentiment label “neutral” or “mixed”, but this is not informative unless it tells you what specifically was positive and what was negative.

A more advanced strategy is to use Natural Language Processing tools to extract a chunk of the sentence, usually a keyword and its immediately surrounding context.

This way, the model can separate the positive chunk and the negative chunk from a mixed sentence and run prediction on each chunk separately. This performs well under ideal conditions, but due to the diversity and complexity of language use, it fails in many real-world cases.

In our example review, we know that the sentiment for burger should be positive because the reviewer loved it. However, because those two words occur in separate sentences, the connection will likely be missed by this approach.
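The chunk-based strategy and its failure mode can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: the lexicon, the keyword list, the splitting rule (punctuation plus the word "but"), and the review text are all hypothetical.

```python
import re

# Hypothetical toy lexicon and keyword list; both are illustrative assumptions.
LEXICON = {"loved": 1, "tasty": 1, "slow": -1}
KEYWORDS = {"burger", "service"}

def split_chunks(text):
    """Split text into rough chunks at punctuation and the word 'but'."""
    parts = re.split(r"[.!?,;]|\bbut\b", text.lower())
    return [p.strip() for p in parts if p.strip()]

def chunk_sentiments(text):
    """Score each chunk on its own and attach the score to keywords inside it."""
    results = {}
    for chunk in split_chunks(text):
        tokens = re.findall(r"[a-z']+", chunk)
        score = sum(LEXICON.get(t, 0) for t in tokens)
        for t in tokens:
            if t in KEYWORDS:
                results[t] = score
    return results

# Hypothetical stand-in review: "burger" and "loved" sit in separate sentences.
review = "I ordered the burger. Loved it, but the service was slow."
scores = chunk_sentiments(review)
print(scores)  # {'burger': 0, 'service': -1}
```

The chunk containing service is scored correctly as negative, but burger comes out neutral because the word that carries its sentiment landed in a different chunk.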

Finally, the most fine-grained approach is to directly mark each keyword for sentiment.

This can be difficult to achieve because the model needs to see the entire text of the review to look for clues as to whether the sentiment is positive or negative, but at the same time, it needs to know which keyword to focus on for prediction. It would need to know that loved indicates positive sentiment for burger, but that it does not affect the prediction for the word service, despite being in the same sentence. Assuming the model is smart enough to overcome these hurdles, this is by far the most useful level of analysis for the end user.

Takeaway: Review-level sentiment analysis forces complex, nuanced, or longer statements into a single box, throwing away finer sentiment details. Sentiment analysis is most informative and useful when it can make a separate prediction over every keyword.


Classification

Classification refers to the task of assigning a word or piece of text to one class from a predefined set. This is also sometimes called categorization. It is not the same task as clustering, which groups items without predefined classes.

In many ways, classification parallels the task of sentiment analysis, except instead of the classes being positive and negative, they may be things like product, service, value, location, etc. Like sentiment analysis, the main consideration in classification is granularity.

The simplest approach is to assign the class label to the entire review. Some models assign only a single label, while multi-label classification is able to assign more than one.

Using the example review, the single-label approach might only assign it the label “food.” Because the review touches on more than one topic, this fails to capture a lot of information. The multi-label approach would ideally assign the review to both the “food” and “service” categories. This is an improvement, but it still does not specify which parts of the review point to these classes.
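The contrast between single-label and multi-label review classification can be sketched as follows. The `CATEGORY_TERMS` map, the helper functions, and the review text are hypothetical and chosen only for illustration; real classifiers are typically learned from labeled data rather than built from a lookup table.

```python
import re
from collections import Counter

# Hypothetical keyword-to-category map; real systems learn this from data.
CATEGORY_TERMS = {
    "burger": "food", "fries": "food", "tasty": "food",
    "service": "service", "waiter": "service",
}

def category_counts(text):
    """Count how many terms in the text point to each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(CATEGORY_TERMS[t] for t in tokens if t in CATEGORY_TERMS)

def single_label(text):
    """Return only the most frequent category, or None if nothing matched."""
    counts = category_counts(text)
    return counts.most_common(1)[0][0] if counts else None

def multi_label(text):
    """Return every category with at least one matching term, sorted."""
    return sorted(category_counts(text))

# Hypothetical stand-in review written for this sketch.
review = "The burger and fries were tasty. I loved them, but the service was slow."
print(single_label(review))  # food
print(multi_label(review))   # ['food', 'service']
```

Neither output says which words triggered which category, which is the gap that keyword-level classification fills.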

The most fine-grained approach to classification is keyword-level classification. Because most keywords only logically belong to a single class, multi-label classification is usually not relevant for this granularity.

The benefits of this approach compound over many reviews by allowing the user to select a certain category and see the exact breakdown of which keywords are driving the category.
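As a sketch of how that breakdown might be computed, the snippet below aggregates keyword-level results across reviews. The `mentions` list is made-up sample data, and its tuple layout (keyword, category, sentiment) is an assumed output format for an upstream model, not a documented one.

```python
from collections import Counter, defaultdict

# Hypothetical keyword-level output from an upstream model:
# one (keyword, category, sentiment) tuple per keyword mention.
mentions = [
    ("burger", "food", "positive"),
    ("fries", "food", "positive"),
    ("burger", "food", "negative"),
    ("service", "service", "negative"),
    ("waiter", "service", "negative"),
    ("burger", "food", "positive"),
]

def keyword_breakdown(mentions):
    """Group mentions by category and count how often each keyword appears."""
    breakdown = defaultdict(Counter)
    for keyword, category, _sentiment in mentions:
        breakdown[category][keyword] += 1
    return breakdown

breakdown = keyword_breakdown(mentions)
top_food = breakdown["food"].most_common()
print(top_food)  # [('burger', 3), ('fries', 1)]
```

Selecting the "food" category then shows at a glance that burger drives most of its mentions. The same grouping could be extended to count sentiment per keyword.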

Takeaway: Among deep learning-based models, keyword-level classification provides the most information and contextual awareness.

Final Thoughts

Online reviews provide a wealth of insights for a business but can be labor-intensive to read through and digest.

There are many ways to try to automate this task. Currently, the leading approaches and online reputation management software solutions use deep learning models trained on online review data. The models best suited to this application are able to extract many different kinds of keywords, predict their sentiment, and classify them into relevant categories, which allows businesses to improve operations, make better decisions, and elevate the customer experience with data.

(Image credit: Customer Experience Analytics by ReviewTrackers)

Migs Bassig
Migs is the Content Strategist for ReviewTrackers. He loves sharing his marketing knowledge to help businesses succeed, and has helped brands like Intel, Dell, Honda, and Acer communicate more effectively to audiences. His work has appeared on Forbes, the New York Times, CNN, and Ad Age, among others.

