I just read a wonderful, simple book about statistics that inspired me to think, once again, about support metrics. The book is The Numbers Game by Michael Blastland and Andrew Dilnot, a British journalist and economist who set out to de-mystify the statistics bandied about mostly by politicians. Much of their clear and often funny advice can be used in many other contexts, including the one we all care about: technical support.
1. Know what you are counting
Are “cases per day” incoming (new) cases? Closed cases? “Resolved” cases (we gave the customer the solution, but s/he has not confirmed it’s working yet)? Cases that were “touched” or worked? Make sure each concept is carefully defined so everyone measures the same thing. The people who write the reports will thank you, and the entire organization will put much more trust in the metrics.
And I strongly advise using the simplest possible definitions. If I may be permitted a rant so early in the newsletter: stay away from “resolved” cases and “worked” cases. No one really knows what they are, and they are oh-so-easy to fudge.
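If your case data lives in a system you can query, spelling the definitions out in a few lines of code keeps everyone honest. Here is a minimal sketch (with made-up case records and dates) showing how “new cases today” and “closed cases today” can be two different counts for the very same day:

```python
from datetime import date

# Hypothetical case records; in practice the statuses and timestamps would
# come from your case-tracking system. The dates here are made up.
cases = [
    {"id": 1, "opened": date(2009, 3, 2), "closed": None},
    {"id": 2, "opened": date(2009, 3, 2), "closed": None},
    {"id": 3, "opened": date(2009, 3, 2), "closed": date(2009, 3, 2)},
    {"id": 4, "opened": date(2009, 2, 27), "closed": date(2009, 3, 2)},
]

today = date(2009, 3, 2)

new_today = sum(c["opened"] == today for c in cases)     # incoming cases
closed_today = sum(c["closed"] == today for c in cases)  # cases closed today

print(f"new cases today: {new_today}, closed cases today: {closed_today}")
# "New" and "closed" give different answers for the same day. Pick one
# definition (or report both, clearly labeled) and use it in every report.
```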
2. Use benchmarks or percentages to anchor the numbers
Is 1000 cases per day a lot? Gee, I don’t know. How many do you normally get? How many customers do you have? If you normally get 10,000 cases per day, 1000 would be a very slow day. On the other hand, if you normally get 100, something went very wrong… You (should) know what’s normal for you, but other people looking at your support metrics do not: offer a comparison to the average or a trend over time.
Is 7 cases per day per person a lot? It depends. If you run a high-complexity support organization, that would be a lot (really: I have customers who resolve about 3 cases per day per head and are thrilled with their high productivity!) If you run a low-complexity group, that would be very low.
Let’s go back to the first example. Is 1000 cases per day a lot? It would be sickeningly high if you have 1000 customers (everyone called?!) but normal for some products if you have 10,000 customers who log about 2 cases per customer per month, or 120,000 customers who log 2 cases a year…
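If you want to make that anchoring routine, a tiny calculation that converts raw daily volume into cases per customer per month does the trick. The sketch below uses made-up customer counts and assumes roughly 20 business days per month:

```python
# Express a raw daily case count as cases per customer per month, so readers
# can anchor it against the size of the installed base.
BUSINESS_DAYS_PER_MONTH = 20  # assumption, not a universal rule

def cases_per_customer_per_month(cases_per_day, customer_count):
    """Convert a raw daily case count into a per-customer monthly rate."""
    return cases_per_day * BUSINESS_DAYS_PER_MONTH / customer_count

for customers in (1_000, 10_000, 120_000):
    rate = cases_per_customer_per_month(1_000, customers)
    print(f"{customers:>7,} customers: {rate:.2f} cases per customer per month")

# 1,000 customers   -> 20.00 (alarmingly high)
# 10,000 customers  ->  2.00 (normal for some products)
# 120,000 customers ->  0.17 (about 2 cases per customer per year)
```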
3. Expect clusters and coincidences – without an underlying cause
You got 40 installation cases yesterday when you normally get 2. Is it time to scream bloody murder about the installation process? Maybe, but look at the details first. Especially on a small base, you can see spikes come and go for no apparent reason. Don’t get too excited without digging into the specifics.
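If you want to convince yourself that chance alone produces spikes on a small base, a quick simulation helps. The sketch below uses made-up numbers and a constant underlying rate of about 2 cases a day, yet the busiest day of the simulated year still comes in at several times the average:

```python
import random

random.seed(7)  # fixed seed so the illustration is reproducible

CUSTOMERS = 1_000        # assumption: a fairly small installed base
P_CASE_PER_DAY = 0.002   # each customer has a 0.2% chance of a case per day
DAYS = 250               # roughly one business year

daily_counts = [
    sum(random.random() < P_CASE_PER_DAY for _ in range(CUSTOMERS))
    for _ in range(DAYS)
]

average = sum(daily_counts) / DAYS
print(f"average per day: {average:.1f}, busiest day: {max(daily_counts)}")
# Even though the underlying rate never changes (about 2 cases a day), the
# busiest day typically lands at 3 to 4 times the average: a spike with no
# cause at all.
```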
4. What goes up often comes down
Customer satisfaction ratings were at an all-time low, so you implemented a new coaching program, and sure enough the numbers came back up. This proves that the coaching program is a success, right? Maybe, but perhaps it was just normal variation: extremes tend to drift back toward the average on their own. Now if you can show that coached individuals got higher scores while uncoached ones stagnated, you would have a better “proof.” Ditto if you can see that the improvements are stable and long-lasting.
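A simple before-and-after comparison of the two groups is easy to put together. The scores below are fabricated for illustration:

```python
import statistics

# Hypothetical before/after satisfaction scores (1-5 scale) for two groups.
coached   = {"before": [3.1, 3.3, 2.9, 3.0], "after": [3.8, 3.9, 3.6, 3.7]}
uncoached = {"before": [3.2, 3.0, 3.1, 3.3], "after": [3.2, 3.1, 3.2, 3.2]}

for name, scores in (("coached", coached), ("uncoached", uncoached)):
    delta = statistics.mean(scores["after"]) - statistics.mean(scores["before"])
    print(f"{name:>9}: average score changed by {delta:+.2f}")

# If only the coached group improves, and the improvement holds up over several
# periods, the coaching program deserves far more credit than a simple
# "scores went back up" would suggest.
```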
5. Averages are simple, but deceiving
As we were taught in that soporific statistics class years ago, long-tail distributions distort averages. So if you have 99 support reps closing 7 cases a day and one rep, perhaps allowed to cherry-pick for whatever reason, closing 80 (I have a customer that matches this exact profile), all support reps but one will have below-average productivity: the average is 7.73, and 99 of the 100 reps fall below it. Don’t just look at averages: check the distribution of the data and use other measures, such as the median.
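Here is the same calculation spelled out, with the median alongside the mean (the numbers are the profile described above):

```python
import statistics

# 99 reps closing 7 cases a day, plus one cherry-picker closing 80.
closed_per_day = [7] * 99 + [80]

mean = statistics.mean(closed_per_day)      # 7.73, pulled up by the one outlier
median = statistics.median(closed_per_day)  # 7, what a typical rep actually does
below_mean = sum(c < mean for c in closed_per_day)

print(f"mean={mean:.2f}, median={median}, reps below the mean: {below_mean} of 100")
```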
6. Setting targets distorts results
If cases must be responded to within an hour, many will get a response between 50 and 60 minutes. If the target is 2 hours, the peak will be right before the 2-hour mark. Perhaps that’s not much of a problem for you, but think of the customers who would much prefer a response earlier in the timeframe.
And then there’s the cheating: if you have a target of X cases per rep per day, some reps may be tempted to cherry-pick the easy ones, and perhaps to invent a few extras (it’s so easy to just create a new case for that ten-second, “I just needed to check on something else” query from a customer). If you set targets, guard against manipulation of the results.
And watch out for reshaping the definition of the target itself. For instance, many support organizations seem to invent a new “resolved” status so they can meet their (self-imposed) resolution targets despite customers’ opposition to closing cases outright. That can put the entire organization on a slippery slope toward cheating.
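If you want to check whether your own response times bunch up just under the deadline, a crude histogram is enough. The timings and the one-hour target below are made up for illustration:

```python
from collections import Counter

TARGET_MINUTES = 60   # hypothetical 1-hour first-response target
BIN_MINUTES = 10

# Hypothetical first-response times in minutes, as pulled from the case system.
response_times = [12, 55, 58, 31, 59, 54, 57, 48, 56, 22, 58, 61, 53, 57, 44, 59]

# Bucket the times into 10-minute bins and print a crude text histogram.
bins = Counter((t // BIN_MINUTES) * BIN_MINUTES for t in response_times)
for start in sorted(bins):
    note = "  <= just under the target" if start + BIN_MINUTES == TARGET_MINUTES else ""
    print(f"{start:3d}-{start + BIN_MINUTES:3d} min: {'#' * bins[start]}{note}")
```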
7. Sampling may be hazardous to your data integrity
I love sampling. For instance, I often advise my customers to sample the cases worked by just two or three reps for a day to get a feel for case distribution, or for the time it takes to resolve cases (assuming there is no measurement mechanism in place), and the results are usually very good despite the small size of the sample.
Now, sampling only works with properly randomized samples. If you sample only Monday’s cases and Monday happens to be “naïve customer day” (say), your sample won’t be representative. The same goes if you only sample cases from the Australia office, which just happens to take all the emergency after-hours cases from US customers, or if you sample cases from the backlog (being-worked) queue, which by definition holds cases a little more complex than the average.
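If you pull samples from a system rather than by hand, drawing them at random across the whole case population, and then sanity-checking them for obvious skew, avoids the “Monday only” and “Australia only” traps. A sketch with fabricated case records:

```python
import random
from collections import Counter

# Hypothetical case records; in practice you'd pull id, office, and weekday
# from the case-tracking system for the period you want to examine.
offices = ["US", "UK", "Australia"]
weekdays = ["Mon", "Tue", "Wed", "Thu", "Fri"]
all_cases = [
    {"id": i, "office": random.choice(offices), "weekday": random.choice(weekdays)}
    for i in range(5_000)
]

# Draw the sample at random across the whole population, not from one
# convenient slice (a single day, a single office, the backlog queue).
sample = random.sample(all_cases, k=50)

# Sanity-check the sample for obvious skew before trusting it.
print(Counter(c["office"] for c in sample))
print(Counter(c["weekday"] for c in sample))
```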
8. Don’t draw straight lines
Case volume increased 20% this month, so it will keep increasing 20% every month. No (see also #3). Case volume increased 20% this month and customer satisfaction also increased 20%, so clearly increases in case volume cause increases in customer satisfaction. Silly, isn’t it? But we might be tempted to draw the same conclusion, in the opposite direction, if customer satisfaction had gone down instead of up, wouldn’t we?
9. Compare the comparable
Jane resolved 20 cases yesterday and Joe resolved 2. Let’s hire more reps like Jane! OK, but Joe is the only tier 3 rep we have; isn’t he valuable to the entire organization? Or: 5000 customers used the self-service support web site today, so that means we avoided (deflected) 5000 cases, right? Nope. For starters, we don’t know what they did on the web site (did they all log new cases? Were they all checking on existing cases?). Can you track whether those same 5000 visitors later log any cases? That would be a much better approach than comparing cases to visits.
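If the web site and the case-tracking system share customer IDs, that comparison is straightforward to set up. A sketch with made-up IDs:

```python
# Hypothetical data: customer IDs seen on the self-service site, and customer
# IDs that logged a case in the following week. In practice these come from
# web analytics and the case-tracking system, respectively.
self_service_visitors = {"c101", "c102", "c103", "c104", "c105"}
logged_a_case_anyway = {"c102", "c105", "c999"}

came_back = self_service_visitors & logged_a_case_anyway
stayed_away = self_service_visitors - logged_a_case_anyway

print(f"visited self-service and still logged a case: {len(came_back)}")
print(f"visited self-service and did not log a case:  {len(stayed_away)}")
# Only the second group is even a candidate for "deflection", and even then
# you would want to compare it against customers who never visited the site.
```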
Here’s another angle. Reps used to resolve 5 cases per day; now they resolve 4. Did they get lazy? Perhaps incoming cases are truly more complex, thanks to a better self-service offering – or a new product, which is either more complex or less familiar to the reps. Be careful when you make comparisons.
There is more on this topic in the FT Works booklet Best Practices for Support Metrics.