Loyalty – Building a better mousetrap

January 15, 2015

Ask someone how many loyalty programs they are a member of and they can probably name 3 or 4 big programs they use regularly. However, in the US, loyalty program membership is now over 23 per household and between 2008 and 2012 it grew by 10 percent per year. These are great numbers until you realise that only around 1/3 of these memberships are being actively used.

More worryingly, a 2013 study by McKinsey suggested that many companies with loyalty programs actually underperformed the market, with “loyalty-focused companies surveyed [growing] revenues at a weighted-average rate of 4.4 percent per year – compared to 5.5 percent for companies with lower loyalty focus.”

This doesn’t mean loyalty doesn’t work – it does and McKinsey acknowledge that. What it does mean though is that like anything, there is no quick fix; no silver bullet. A “me too” loyalty program is likely to add little long term value if it isn’t designed well and there are all too many of these in the market.

However, what is interesting is that when talking about what a good loyalty program looks like, the discussion frequently looks at program features like partnerships or reward value without really considering what actually makes a loyalty program work – and why they also fail.

If we can understand what makes someone use and continue to use a loyalty program – what’s going on in their brain – then we can truly design a program that works harder and is more rewarding. As the famous quote says, “Build a better mousetrap and the world will beat a path to your door.”

A good place to start with this mousetrap is back in May 1938, in a cold Minnesota that had just experienced one of its heaviest snowfalls ever for that month, with over 12 inches of snow falling in a single day. That same month, psychologist B F Skinner published his now famous book The Behavior of Organisms – and it’s that book, published almost 80 years ago, which provides some answers as to both why loyalty programs work and why they don’t.

Before discussing that though, it’s worth also considering something happening in the here and now.

Consider for a moment how often you check your phone for new email or to look at your Facebook account. Whatever number you come up with, you’ll probably be way off the mark because half the time we do it almost on auto-pilot. One article suggests we check our phones over 1500 times per week – that’s more than once every 5 minutes during waking hours.
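As a quick sanity check on that figure, the arithmetic can be sketched as follows. The 16 waking hours per day is an assumption made here for illustration; the article cited does not state it.

```python
# Rough check of "more than once every 5 minutes during waking hours".
# ASSUMPTION: 16 waking hours per day (not a figure from the cited article).
CHECKS_PER_WEEK = 1500
WAKING_HOURS_PER_DAY = 16


def minutes_between_checks(checks_per_week, waking_hours_per_day):
    """Average gap, in waking minutes, between two phone checks."""
    waking_minutes_per_week = waking_hours_per_day * 60 * 7
    return waking_minutes_per_week / checks_per_week


gap = minutes_between_checks(CHECKS_PER_WEEK, WAKING_HOURS_PER_DAY)
print(f"One check every {gap:.2f} waking minutes")
```

Under that assumption the gap comes out just under four and a half minutes, which is consistent with the claim.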

It’s not just the checking, however; we are also at the beck and call of these devices.

A research study by Loughborough University in the UK found that people, on average, take just 1 minute and 44 seconds to respond to a new email notification – with 70% of these alerts getting a reaction within 6 seconds and 85% within 2 minutes.

We talk about the mobile phone being the remote control of life… it could equally be said that the mobile phone is actually the remote control of us.

What this all means, however, is that we are essentially re-wiring our brains. Our brains are wired to protect us from danger and to help us survive; when we see something in the corner of our eye, we respond. These are known as our orientating responses, and they are now constantly being triggered by the ping of a mobile or the flash of a notification, meaning we’re becoming ever quicker at responding and at anticipating a response.

There is more to this phenomenon however and this is where that cold May in Minnesota comes in.

The book published by B F Skinner introduced the world to the concept of operant conditioning. Simply put, operant conditioning describes any voluntary behaviour that is shaped by its consequences, and it implies that a creature (including you and me) will repeat an activity that produces positive rewards. Underpinning operant conditioning are a number of reinforcements which, when repeated, serve to further ingrain the behaviour.

Skinner based his research on observing animals such as rats and pigeons which, when placed into a specific environment (known as the Skinner Box), would carry out a repeated behaviour based on the rewards offered (e.g. pressing a lever to get food). Using this environment, Skinner was able to vary how and when the reward – called a positive reinforcer – was delivered in order to measure its ability to influence and maintain ongoing behaviour.

From this research, Skinner came up with three schedules of reinforcement, defined as continuous, interval and ratio based.

Continuous – Defined as a constant delivery of reinforcement for an action; every time a specific action is performed, the subject instantly and always receives a reinforcement. With this type, the reinforced behaviour is prone to extinction: the behaviour can become inconsequential (i.e., producing neither favourable nor unfavourable consequences) and so starts to occur less frequently.
Interval – Based on the time intervals between reinforcements. These can be fixed time periods (FI) or variable (VI), with the variable being based on an average time that has passed since the last reinforcement. Neither of these is directly linked to the person’s actual behaviour and so they typically produce slow, methodical responses.
Ratio – Based on the behaviour of the person, and can also be fixed or variable. The fixed ratio (FR) is based on a specific number of responses (e.g. a coffee stamp card), whereas the variable ratio (VR) is based on a particular average number of responses (e.g. a slot machine that pays out 10% of the time on average, but with no guarantee of when).
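To make the three families concrete, here is a minimal Python sketch of how each schedule might decide which responses get reinforced. It is an illustration rather than Skinner's own formulation: the function names are invented for this example, the variable ratio is approximated by rewarding each response with probability 1/mean_ratio, and the variable interval draws its waiting times from an exponential distribution. Both of those modelling choices are assumptions.

```python
import random


def continuous(n_responses):
    """Every single response is reinforced."""
    return [True] * n_responses


def fixed_ratio(n_responses, ratio):
    """Reinforce every `ratio`-th response (e.g. the 10th coffee stamp)."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses, mean_ratio, rng):
    """Reinforce each response with probability 1/mean_ratio (assumed model),
    so rewards arrive every `mean_ratio` responses on average, unpredictably."""
    return [rng.random() < 1 / mean_ratio for _ in range(n_responses)]


def fixed_interval(response_times, interval):
    """Reinforce the first response made once `interval` time units have
    passed since the previous reinforcement."""
    rewarded, last_reward = [], 0.0
    for t in response_times:
        hit = t - last_reward >= interval
        if hit:
            last_reward = t
        rewarded.append(hit)
    return rewarded


def variable_interval(response_times, mean_interval, rng):
    """Like fixed_interval, but the required wait is redrawn after each
    reinforcement (exponential waits are an assumption of this sketch)."""
    rewarded, last_reward = [], 0.0
    wait = rng.expovariate(1 / mean_interval)
    for t in response_times:
        hit = t - last_reward >= wait
        if hit:
            last_reward = t
            wait = rng.expovariate(1 / mean_interval)
        rewarded.append(hit)
    return rewarded


rng = random.Random(42)            # seeded so runs are repeatable
times = list(range(1, 101))        # one response per time unit
print("continuous:       ", sum(continuous(100)), "rewards")
print("fixed ratio 10:   ", sum(fixed_ratio(100, 10)), "rewards")
print("variable ratio 10:", sum(variable_ratio(100, 10, rng)), "rewards")
print("fixed interval 10:", sum(fixed_interval(times, 10)), "rewards")
print("variable interval:", sum(variable_interval(times, 10, rng)), "rewards")
```

The fixed schedules always pay out on the same responses, whereas the variable ones pay out a similar number of times on average but at positions that cannot be predicted in advance.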

It’s these reinforcement schedules that are key to understanding why we’re so quick to react and respond to that email notification – and why some loyalty programs work and others do not.

Simplistically, where the reinforcement schedule is predictable, whether by being continuous or at set intervals, the behaviour becomes, in Skinner’s words, “extinct” – it’s passive and inconsequential and decreases or stops altogether over time. On the other hand, where the reinforcement is on a variable ratio – where both the timing and the value can’t be predicted – we get the highest rates of response, and the higher the ratio, the higher the response rate tends to be.
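The difference between a predictable and an unpredictable schedule can be made visible by measuring how many responses fall between one reward and the next. The sketch below does only that; it does not model how people or animals actually respond. The 1-in-10 ratio and the per-response probability used to approximate a variable ratio are assumptions chosen for illustration.

```python
import random
import statistics


def reward_gaps(rewarded):
    """Number of responses needed to reach each successive reward."""
    gaps, count = [], 0
    for hit in rewarded:
        count += 1
        if hit:
            gaps.append(count)
            count = 0
    return gaps


# Fixed ratio: exactly every 10th response is rewarded.
fixed = [i % 10 == 0 for i in range(1, 1001)]

# Variable ratio (assumed model): each response has a 10% chance of reward.
rng = random.Random(0)
variable = [rng.random() < 0.1 for _ in range(1000)]

fixed_gaps = reward_gaps(fixed)
variable_gaps = reward_gaps(variable)

print("fixed ratio gap spread:   ", statistics.pstdev(fixed_gaps))
print("variable ratio gap spread:", statistics.pstdev(variable_gaps))
```

The fixed schedule has no spread at all, so the next reward is always known; the variable schedule has a wide spread, so the next response could always be the one that pays.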

In the book The End of Absence: Reclaiming What We’ve Lost in a World of Constant Connection, author Michael Harris highlights this saying “Animals, including humans become obsessed with reward systems that only occasionally and randomly give up the goods. We continue the conditioned behaviour for longer when the reward is taken away because surely the sugar lump is coming up next time.” This “variable interval reinforcement schedule” is really the critical factor in repeatable, ongoing behaviour.

This outcome can be seen in how we interact with email as discussed previously. One behavioural psychology training course describes this effect with email saying:-

“Receiving a message serves as a reinforcer, or reward for, checking. You might check your email at 9:00 a.m. and have 5 new messages, at 11:00 a.m. and have none, and then at 3:00 p.m. and have 7. As long as you periodically continue to receive messages, your checking behavior will continue; however, this behavior can be influenced by the number of messages received. If you don’t receive any messages for 5 days, you may check less often. On the contrary, if you receive several messages each time you check your email, you will probably check more often. In this case, your behavior is an effect of variable-interval schedules of reinforcement. You receive a reward (new messages) for a behavior (checking your email), and the reward is presented on a variable schedule (you can’t predict when it is coming).”

This is also something that the gambling industry relies on to keep punters coming. Co-author of the book Mind Hacks and lecturer at the University of Sheffield, Dr Tom Stafford, discusses this saying:-

“Both slot machines and email follow something called a ‘variable interval reinforcement schedule’ which has been established as the way to train in the strongest habits. This means that rather than reward an action every time it is performed, you reward it sometimes, but not in a predictable way. So with email, usually when I check it there is nothing interesting, but every so often there’s something wonderful – an invite out, or maybe some juicy gossip – and I get a reward.”

Another industry that uses Variable Ratio (VR) reinforcement schedules is video gaming. The research paper “Video game structural characteristics – A new psychological taxonomy” describes how powerful operant conditioning techniques can be for players saying:-

“Players respond rapidly and persistently to the reward features in video games, such as XP and points, rare items, and meta-game rewards. These features are core components of the variable reinforcement schedule, which is known to create a persistent pattern of responding to a stimulus over time that is resistant to behavioural extinction.”

Video games build on this, however, using a number of different schedules – continuous, interval and ratio – in combination to create what is called a compound schedule, which may superimpose two or more different and overlapping schedules to gain maximum effect. Just as a gamer is accomplishing one mission they have already started on another; in this way, using overlapping and compound schedules, game designers keep players involved, which leads to sticky and sometimes “addiction”-like behaviours.

In the research study “Understanding and Assisting Excessive Players of Video Games”, authors King and Delfabbro (2009a) found that overlapping quests and objectives (i.e., concurrent schedules of reinforcement) in video games kept players playing for longer periods than games without these features. The report also detailed how the use of VR schedules could cause game players to carry out behaviours that are repetitive or boring simply to chase the reward, saying:-

“The variable-ratio reinforcement schedules in video games and participants’ need to complete goals often produced what was termed ‘grinding’ behaviour. Grinding refers to the repetition of an action or series of actions in a video game in order to obtain a reward.”

Speaking about this behaviour, one player stated how he “played the same level 10 times to get the full set of armour. [It] gets frustrating but you have to do it if you want the items.”

This is really interesting because it suggests that the power of the right mix of reinforcement schedules can actually mask the more mundane actions required to achieve it.

Whilst people obviously consider themselves unique, with their own individual decision-making processes, the reality is that people tend to respond consistently to the same kinds of environment. Keeping with the video game theme, research has shown that it’s the design of the game mechanics, more than the individual gamer’s characteristics, that drives usage. The research paper entitled “The role of Structural Characteristics in Problem Video Game Playing” pointed this out saying:-

“In particular, ‘structural characteristics’, defined as those features that facilitate the acquisition, development, and maintenance of playing behaviour irrespective of the individual’s psychological, physiological or socioeconomic status, have been shown to play an important role in explaining the appeal of gambling activities.”

Further examining what makes video games “sticky”, the psychology book Mind at Play by Loftus and Loftus (1983) showed that the appeal of video games was a blend of variable-ratio and fixed-interval schedules which were intended by designers to be “addictive”. They noted that key aspects of this are that players are:-

Often reinforced almost immediately for correct play
Given rewards for good game play that are of large [perceived] magnitude (i.e., the provision of 150 points appearing more significant than 15 points)
Rewarded on numerous concurrent reinforcement schedules

So, back to the question at hand – if we can understand what makes someone use and continue to use a loyalty program – like they do a video game – then we can truly design a program that works harder and is more rewarding.

Loyalty programs, in part, already rely on the principle of positive reinforcement whereby when an event or stimulus is presented (e.g. points) as a consequence of a behaviour then the behaviour goes on to increase. It’s this behavioural psychology that underpins much of the change we see within customer loyalty.

However for the majority of loyalty programs, whether explicitly designed in or not, there is only one reinforcement schedule which is the continuous issuance of points in response to a purchase. From the customer perspective, every time I buy I get points which is much the same as the animal in the Skinner box which gets food every time the lever is pressed.

The problem for these loyalty programs is that we already know that this continuous reinforcement schedule is the least likely to result in long term ongoing behaviour – we’re essentially designing in program extinction from the get go.

The second issue is that as previously discussed, our brains are increasingly becoming used to managing constant distraction – honing our orientating responses. In a world where there is always another notification to respond to, another Facebook post to view, another email to read, any loyalty program has got to be able to cut through to compete with this. With so many things fighting for our attention, the larger the gap between the behaviour and the stimulus, the more likely our brain will not link these two activities and the more likely we won’t get the full benefit of this positive reinforcement.

This starts to manifest itself within loyalty program behaviour – customers will typically continue to swipe their card at point of sale because this is a learned (or prompted) behaviour – but it’s done passively. There is no stimulus driving this current behaviour and so it’s less likely we’re able to influence it at this point in time. Trying to get customers to then change or uplift their actual behaviour doesn’t work because there is no linkage between the behaviour and the reward/stimulus – the behaviour has become inconsequential.

To create a loyalty program that works for consumers and works for brands, we have to ensure that the ‘structural characteristics’ of the program provide a number of different engagement mechanics – reinforcement schedules – so that we create a persistent pattern of responding to a stimulus over time that is resistant to behavioural extinction.

As the research shows, this behavioural design works in gambling and it works in video games – some would argue it works too well. However there is already evidence that it works within loyalty programs. Some programs that do work well have many of these characteristics with a good blend of compound schedules.

For example, in a classic frequent flyer program there can be a continuous reinforcement schedule around miles earned for flights, but these are overlapped with fixed ratio schedules such as collecting towards tiering and benefits like companion tickets. Whilst these are not necessarily using the most powerful variable ratio (VR) reinforcement schedule, they still manage to engage members through compound reinforcement schedules.
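As a rough illustration of such a compound schedule, the sketch below overlays a continuous reward (miles on every flight) with a fixed ratio reward (a companion ticket every N flights). The class name, the 500 miles per flight and the 10-flight threshold are all hypothetical values chosen for the example, not the rules of any real program.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrequentFlyerAccount:
    # HYPOTHETICAL values, chosen only to illustrate the two schedules.
    miles_per_flight: int = 500
    flights_per_companion_ticket: int = 10
    miles: int = 0
    flights: int = 0
    companion_tickets: int = 0

    def record_flight(self) -> List[str]:
        """Apply both schedules to one flight and list the rewards earned."""
        rewards = []
        self.flights += 1

        # Continuous schedule: every flight earns miles.
        self.miles += self.miles_per_flight
        rewards.append(f"{self.miles_per_flight} miles")

        # Fixed ratio schedule: every Nth flight earns a companion ticket.
        if self.flights % self.flights_per_companion_ticket == 0:
            self.companion_tickets += 1
            rewards.append("companion ticket")
        return rewards


account = FrequentFlyerAccount()
history = [account.record_flight() for _ in range(10)]
print(history[0])    # rewards from the first flight
print(history[-1])   # rewards from the tenth flight
print(account.miles, "miles,", account.companion_tickets, "companion ticket(s)")
```

A variable ratio element could be layered on in the same way, for example by adding a randomly triggered bonus inside record_flight.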

This also brings into focus gamification – that new entrant into loyalty program design – and starts to explain why, when implemented well, it can truly accelerate program engagement. Whether expressed as access to time-limited deals, unlocking of recognition or achievement of challenges, gamification allows the loyalty marketer to superimpose additional, overlapping reinforcement schedules, including variable ratio, to gain maximum impact and benefit. This isn’t just theory either – we’ve seen examples where simply switching a recognition mechanic from a fixed ratio to a variable ratio has resulted in a 33% increase in ongoing usage.

Yet loyalty programs continue to be launched and continue to underperform.

As loyalty marketers, it’s imperative that we understand why loyalty programs work, why consumers respond to them and how to make them work better for all. In the words of B F Skinner, “A failure is not always a mistake, it may simply be the best one can do under the circumstances. The real mistake is to stop trying.”

Republished with author's permission from original post.

1 COMMENT

Clive February 24, 2015 At 5:13 am
I have around 30 loyalty programmes (more than half of which are currently active) across hotel groups, airlines, retail, survey sites etc., so thought I would share my opinion on the subject. Reading through the article I was expecting to see what I think is the most important element of the loyalty programme, the ultimate reward. The ‘reward’ in the context of the article seems to be the earning of points, and the conditioning of collectors in doing so. But in my opinion the real reward is in the spending of points, and most importantly the value received when you do.
Whilst earning points is largely predicated on a fixed ratio, i.e. you spend money or undertake an activity and earn points it can also have a variable ratio element with double point promotions for example. But that in itself is not going to make a successful loyalty programme, the real secret is to ensure there is value, variety and a variable ratio when it comes to spending points.
If we take the most successful retail loyalty programme in the UK, the Tesco Clubcard scheme, this is generally a straightforward spend and collect scheme with a few variable offers. But on the reward side they are paired with many companies in the leisure sector offering days out, hotel stays, train tickets and air miles. The Tesco tie-up with both the British Airways Avios scheme and the Virgin Atlantic Flying Club scheme offers great arbitrage opportunities. Converting Tesco points to BA or Virgin miles means effectively buying air miles at around 0.5p each, when you can easily get 1p of value when booking flights. Add in a variable ratio with an occasional 25% conversion bonus and you have a winner.
So in my opinion it is the value and variable ratio when spending points that makes a successful loyalty programme. If you have that then you can overcome the fixed ratio nature of the collecting stage.
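The conversion arithmetic in the comment above can be sketched as follows, using only the figures quoted there (0.5p cost per mile, 1p of redemption value, a 25% bonus). The function names are invented for this example, and the model assumes the bonus simply adds 25% more miles for the same points.

```python
def effective_cost_per_mile(base_cost_pence, bonus_rate=0.0):
    """Cost per mile once a conversion bonus adds extra miles (assumed model)."""
    return base_cost_pence / (1 + bonus_rate)


def value_multiple(value_pence, cost_pence):
    """How many pence of flight value each penny spent on miles returns."""
    return value_pence / cost_pence


BASE_COST = 0.5      # pence per mile, as quoted in the comment
FLIGHT_VALUE = 1.0   # pence of value per mile, as quoted in the comment
BONUS = 0.25         # the occasional 25% conversion bonus

cost_with_bonus = effective_cost_per_mile(BASE_COST, BONUS)
print(f"Without bonus: {value_multiple(FLIGHT_VALUE, BASE_COST):.2f}x value")
print(f"With bonus:    {value_multiple(FLIGHT_VALUE, cost_with_bonus):.2f}x value "
      f"(cost {cost_with_bonus:.2f}p per mile)")
```

On those figures the everyday conversion doubles the value of the points, and the occasional bonus lifts that to two and a half times.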