A CX No-No: Cross-Culture NPS Comparisons

July 12, 2018


I was prepared for the worst.

It was a short domestic flight in France so I thought I would gamble on EasyJet.

I had some experience with “ultra low cost” airlines in the States, so I assumed that EasyJet would be the same. I braced for the worst: an old, rickety airplane, an upcharge for every small amenity, and sparse, sub-par customer service.

I was pleasantly surprised. While they did charge for baggage and amenities, they were very upfront about everything from booking onward. The plane was older, but clean and freshly painted. The customer service was helpful, knowledgeable, friendly, and polite. We even got “speedy” boarding because I was traveling with my family. All in all, our EasyJet experience was much better than what I regularly get from the majority of domestic carriers in the United States.

Does EasyJet do well in NPS? According to npsbenchmarks.com, EasyJet scores a -16. Not good. This is far below mainline European carriers such as British Airways and Lufthansa. But when your point of comparison is US domestic providers, they kick it out of the football arena (or at least through that net thing at the end of the field). Did they have a good day? Maybe. But based on my experience traveling in both the US and Europe, air travel in Europe is a dream compared to the United States.

According to the same site, all US providers are on the positive side of NPS, with Southwest at 62, JetBlue at 59, Delta at 41, United at 10, and American Airlines at 3. Would it be fair to conclude EasyJet has worse service than all the mainstream US providers? Based on NPS alone you might be tempted to say yes.

I would argue it is an unfair and unwise comparison. In fact, this is one of three fundamental reasons why cross-cultural comparisons of many attitudinal metrics (including NPS) are problematic.

Reason 1: Your Experience Sets the Baseline

First, as illustrated in my EasyJet example, your past experience will strongly influence the baseline for your future comparisons. If you have always had nearly flawless experience with shipping in the United States and Europe and then you move to a developing country, of course you are going to be disappointed. Reverse the situation and you will be ecstatic.

This baseline effect is one of the reasons why older car buyers are generally more pleased with the experience of purchasing a vehicle than younger buyers. I remember the youth-oriented Scion brand getting trounced in JD Power rankings when they launched. Their scores were much worse than even the Toyota parent brand. What was really strange was Toyota invested heavily in training and new customer friendly processes for a brand which was housed within Toyota dealerships. Was the Scion customer experience bad? Nope, they just focused on younger buyers who had different (higher) expectations.

This “experience” gap has been usefully exploited by startups such as Lemonade, Uber, Airbnb, and others to disrupt whole industries. Your baseline experience will influence your metrics. Are they real differences? They are to your customers. Can you compare them? You should do so with great care.

Reason 2: Lack of Language Equivalency

Another major issue is the concept of language equivalency. The word “good” or the phrase “would recommend” in English does not always carry the same hedonic weight when translated into other languages. Take, for example, the word “malo” in Spanish. I am not an expert in Spanish, but my understanding is that it means not good: probably not as bad as “terrible,” but probably not as good as “poor.” I am sure there are better examples, but you get the idea.

Even the most careful screening and testing of Likert-based anchor points may not work out, as there may not be exact hedonic equivalents in other languages. There are also other subtle language differences that may introduce bias.

The NPS scale canon holds it should go from 0 (Definitely not recommend) to 10 (Definitely recommend), from low (on the left) to high (on the right). This is very good for Western participants, but what about other cultures? Many Middle Eastern languages, such as Arabic and Hebrew, are written right to left. Some Asian languages can also be written top to bottom. Does this influence how respondents may react to a Western-based left-to-right approach? Probably. But there is still one even larger issue.

Reason 3: Cultural Response Bias

Different cultures tend to respond differently to Likert scale questions in general. For example, some cultures tend to be more generous in their grading, while others are harsher graders. In the research I have conducted in the US (and in research from others globally), Hispanic responders tend to be much more lenient (they give higher scores), while Asian responders are much harsher graders. Is this due to the actual service provided? Nope. These are simply cultural differences in response style. This is further exacerbated when we expand to look across different countries for some kind of global comparison. Can this be corrected statistically? Perhaps, but I have encountered no practitioners who have taken the trouble to do so (be happy to hear from you if you have!).

Another unhappy psychometric issue is that customer experience ratings are also impacted by where you live. Without fail, those who live in more densely populated areas are much harsher graders on nearly everything. This means that customers in Hong Kong will always score lower than those in Cheyenne, Wyoming, even if the experience was identical.
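For what it is worth, one simple correction that could be tried is to standardize scores within each group, so every rating is expressed relative to that group's own average and spread. The sketch below is only an illustration of that idea, not a method I have validated: the function name `standardize_within_group` and the ratings are hypothetical, and the approach assumes that differences between group averages are purely response style, which means it will also erase any real difference in service.

```python
from statistics import mean, pstdev

def standardize_within_group(scores_by_group):
    """Convert raw ratings to z-scores relative to each group's own mean and spread."""
    adjusted = {}
    for group, scores in scores_by_group.items():
        mu = mean(scores)
        sigma = pstdev(scores)
        # Guard against a group where everyone gave the same rating.
        adjusted[group] = [0.0 if sigma == 0 else (s - mu) / sigma for s in scores]
    return adjusted

# Hypothetical ratings: same spread, but one group grades two points harsher.
ratings = {
    "lenient_group": [8, 9, 10],
    "harsh_group": [6, 7, 8],
}
z = standardize_within_group(ratings)
```

After standardizing, a 10 from the lenient group and an 8 from the harsh group land on the same adjusted value, because each is the top score within its own group.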

What To Do Instead

So should you just give up hope of comparing different regions of the world? While I would not recommend direct comparisons on NPS or other “NPS”-like measures, there are many other practical options. After all, large multi-national organizations must have a way to understand the health of their customer experience globally and where to invest. The good news is that there are better ways to judge where to allocate your time and effort than simple (and misleading) direct comparisons between geographies.

Idea 1: Link Attitudes to Outcomes

A good way of doing this is by conducting Linkage Analysis within each geography. In linkage analysis you connect the exogenous attitudinal variables (perceptions of price, service, product, etc.) to business outcome variables. Since many times this is done at an aggregate level, it is useful to have mediator variables such as NPS used in the analysis. In this way you know what “score” is good in each geography by connecting it to actual business outcomes. What is important in Turkey may not be in Brazil. Knowing what drives outcomes is much more important than a simple index for comparison. While statistically sound, some front-line operators might not trust the perceived voodoo of statistical analysis that underlies this approach. If this is an issue, simpler approaches can also be applied.
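As a very stripped-down sketch of the idea, you could fit a separate straight line in each geography relating unit-level NPS to an outcome such as retention. Real linkage analysis usually involves more variables and a proper modeling framework; here the helper `fit_line`, the geography names, and all the numbers are hypothetical and exist only to show that the same NPS movement can mean different things in different places.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical unit-level data per geography: NPS and retention rate.
data = {
    "geo_1": {"nps": [10, 20, 30, 40], "retention": [0.70, 0.75, 0.80, 0.85]},
    "geo_2": {"nps": [40, 50, 60, 70], "retention": [0.70, 0.72, 0.74, 0.76]},
}

# Fit each geography on its own; never pool the raw scores across geographies.
links = {geo: fit_line(d["nps"], d["retention"]) for geo, d in data.items()}

for geo, (slope, intercept) in links.items():
    print(f"{geo}: each NPS point is associated with {slope:.3f} retention")
```

In this made-up data, geo_2 has the higher raw NPS, yet a one-point gain in geo_1 is associated with a larger retention gain. That is the kind of within-geography reading a raw comparison would miss.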

Idea 2: Look at Improvement vs. Raw Scores

One very simple approach is to look at the amount of improvement a geographic unit has made over a period of time. In this way you are not looking at the score by itself, but at the change in the score over time. While not perfect (ceiling effects tend to put a damper on the party over time), it is simple to apply and everyone can understand it.
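A minimal sketch of this, assuming you have the raw 0 to 10 ratings for each region in two periods, is below. It uses the standard NPS calculation (percent of 9s and 10s minus percent of 0s through 6s); the region names and ratings are hypothetical.

```python
def nps(ratings):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)

def improvement_by_region(before, after):
    """Change in NPS for each region between two periods."""
    return {region: nps(after[region]) - nps(before[region]) for region in before}

# Hypothetical ratings for two regions in two periods.
before = {"region_x": [9, 10, 7, 6, 9], "region_y": [6, 5, 7, 9, 4]}
after = {"region_x": [9, 10, 8, 7, 9], "region_y": [7, 8, 9, 9, 6]}

change = improvement_by_region(before, after)
```

In this example region_x still has the higher raw score, but region_y improved three times as much, which is the comparison you would actually want to reward.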

Idea 3: Focus on Antecedent Behaviors

A third approach is to not focus on attitudinal measures at all, but focus on behaviors. How many cases were closed? How many action plans were implemented? How many complaints were registered? These are all antecedents to an attitudinal construct and usually are influential on business outcome variables (e.g., retention, share of wallet, etc.). While not perfect either, these behavioral measures are not plagued (as much) by the cultural issues.

Idea 4: Get to Language Sentiment

Probably the best approach if cross-cultural comparisons are needed is to start with the true voice of the customer: the verbatim. Build up text taxonomies of positive and negative feedback in the native language. You can then build indices of the ratio of positive to negative feedback relative to the culture and language in which the experience is embedded. Many text analytics providers offer great solutions for this today. It will take a while for your stakeholders to get comfortable with this approach, but it has the added benefit of being a bit harder to game by “coaching” customers to provide a specific answer. If you want to get really sophisticated, hook this in with the linkage approach (Idea #1) and you have a very robust approach.
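To make the ratio idea concrete, here is a toy sketch that counts positive and negative terms using a separate word list for each language. A real text analytics tool does far more than match keywords; the function `sentiment_ratio`, the tiny taxonomies, and the comments below are all hypothetical.

```python
def sentiment_ratio(verbatims, positive_terms, negative_terms):
    """Ratio of positive to negative term mentions; None if no negatives were found."""
    pos = neg = 0
    for text in verbatims:
        # Lowercase and strip basic punctuation before matching against the taxonomy.
        words = [w.strip(".,!?¡¿;:") for w in text.lower().split()]
        pos += sum(1 for w in words if w in positive_terms)
        neg += sum(1 for w in words if w in negative_terms)
    return pos / neg if neg else None

# Hypothetical native-language taxonomies and comments.
english = sentiment_ratio(
    ["Friendly crew and clean plane", "Rude agent, late departure", "Great boarding, friendly staff"],
    positive_terms={"friendly", "clean", "great"},
    negative_terms={"rude", "late"},
)
spanish = sentiment_ratio(
    ["Personal amable y avión limpio", "Salida tardía"],
    positive_terms={"amable", "limpio"},
    negative_terms={"tardía"},
)
```

Each ratio is built from terms native to that language, so the index is read relative to its own culture rather than against a translated scale.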

Practically Speaking

Country and global managers need to make comparisons. This is a business reality. If you really need to do these comparisons, I would strongly advocate a transition to one of the four ideas above. At the very least, you should educate your management about the perils of cross-cultural NPS comparisons. Just as what is considered spicy in Calcutta is very different from what is considered spicy in Cincinnati, so too is your Customer Experience and how it is measured.

4 COMMENTS

Duncan Stuart July 15, 2018 At 3:57 pm
Relying on NPS is part of the problem. It is a very rough measure at best – and doesn’t enable true comparisons between different kinds of services, or different cultures: here the wisdom of the researcher, in this case yourself, has had to make the interpretation.
Apart from cultural nuances, I’ve come across big skews in banking NPS data. Would you recommend Bank X to a friend or family? The answer is: No Way – not because of the bank, which is fantastic, but because in life there are two things I won’t do to a friend: sell them a used car (it’s doomed to break down and wreck our friendship) or recommend a financial service (it is doomed to break down, lose money and wreck our friendship).
Truly great service is usually characterised by imaginative, insightful service delivery that recognises the mindset and needs of the customer in a refreshing, unique way. A favorite hotel in Thailand offers me a hot towel for my face, and a drink of mango-juice when I check-in – and this little touch is transformative: it literally soothes my furrowed brow, and marks the triumphant ushering from 14 hours flying and airports and traffic to a new psychological place: arrival at a sanctuary.
NPS scores will never tell a service provider how to deliver magic customer service, and with the cultural and sector skews pushing and pulling the meaning of the data around, the whole exercise in seeking comparative measures, and benchmarking against industry standards is rather moot.
Your comparison of UK versus US scores for airline service serves as an excellent example. There’s a point where if we compensate strongly enough to account for inter-cultural differences, then we may as well be making up the data.
For that reason the lack of insight provided by NPS begs the question: why do we need such measures – why don’t we strive for insights?
Michael Lowenstein July 16, 2018 At 1:35 pm
I’m in violent agreement with Duncan Stuart on many of the points he makes about NPS. Not only is it an unreliable and superficial performance measure, when historic regional rating differences are added into the mix (having done global service research for many multinationals, I’ve seen this for years), a real actionability witches brew is created. My pity goes out to any research analyst or manager tasked with the responsibility of interpreting these kinds of results and explaining them to corporate leadership.
Dave Fish July 16, 2018 At 3:22 pm
Yea, I agree with both of you (Michael and Duncan). Unfortunately it is not always the best solution that wins…it is the one that is best marketed. NPS is here and execs focus on it. On the whole it is good they are focusing on CX…not so much on NPS. I have always tried to use NPS as an entry into larger conversations around CX. For example I had a large auto client whose new CEO was fixated on NPS (he came from another vertical). So we put the NPS in and then a bunch of other stuff that actually helped provide insight. Sometimes doing work in the corporate world is not so dissimilar to that of the political world. Sometimes you just gotta hold your nose to get to what really matters ;) Thank you both for your readership and comments.
Michael Lowenstein July 16, 2018 At 8:57 pm
Yes, but… as I’ve been reporting for years (such as http://customerthink.com/comcasts-nps-gamble-can-the-metric-help-fix-the-customer-experience-culture/), if you start with a metric that is virtually non-actionable, i.e. “…we put the NPS in and then a bunch of other stuff that actually helped provide insight,” what is the point? Regional scoring differences only compound the felony, making the already difficult to interpret virtually impossible to interpret. In my view, this is a pretty good way to undermine management and cultural focus on CX and EX.