The technology behind recommendation engines
At any moment a suggested news story, vacation spot, movie, music, or book can pop up on our screens. Sales are driven by predicting the right match between items and users. Even as far back as 2013, 35% of the items customers purchased on Amazon and 75% of what people watched on Netflix came from recommended content.
Recommendations that are far off target can alienate users and send them to a different site. For suggestions to resonate with users, the engine needs a minimum amount of reliable data.
Here is a list of six top challenges faced when building a recommendation engine.
1. User cold start – Recommendation engines often use a form of collaborative filtering, which collects and analyzes information about user preferences. For example, Amazon scans a user’s purchased or rated items and pairs them with similar items, an approach called “item-based collaborative filtering”. If the user is new, or has very little activity on the web site, there can be too little data to make a prediction. In this case the system may rely on the preferences of similar users, where similarity is inferred from friends and family, location, device, or even operating system, none of which is necessarily a good indicator of personal taste. This problem is significant for e-commerce companies: if the early recommendations are off, users may leave the site before the company has had time to learn their preferences and make better recommendations.
2. Product cold start – When a new item is added to inventory, there are no reviews, ratings, or previous purchases so the system has no information to use as a basis for recommendation. In this case the system needs to identify items that it considers similar and assume that the new product will have the same fans as similar products.
3. Excessive computing power – A very popular item, such as the latest model of the iPhone, can generate so much interaction data that processing it requires substantial computing resources. In this case, data scientists may have to use only a subset of the data to limit computation cost or time, which can negatively impact predictions.
4. Time delays – The relevance of a suggestion can depend on timing, and the importance of speed differs by application. For example, movie and e-commerce recommendation systems can learn at a slower pace, while the data behind a travel recommendation application is bound to change frequently, forcing the engine to make predictions in close to real time based on new data. In this case, recommendations can fail because they just aren’t fast enough.
5. New item – Content-based recommendation systems make matches based on keywords for users and items. However, they typically are limited by a user’s existing tastes and cannot adjust for new items where there is no history of the user’s preferences. This approach keeps recommending items based on the user’s previous selections. This is especially problematic for news sites that need to recommend new relevant news articles.
6. Keyword confusion – Text analysis can introduce mistakes when the algorithm needs to identify keywords that are written differently for the same item. For example, a Jeep can be described as a car, SUV, or recreational vehicle. This can be especially problematic when defining different music genres. Recommendation engines are also challenged when the solution is multilingual and requires translating and comparing words and phrases in different languages.
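The item-based collaborative filtering and user cold start described in the first challenge can be sketched in a few lines. This is a minimal illustration, not how Amazon or any production system does it: the RATINGS data, the function names, and the fallback to the most-rated items for a user with no history are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not real data.
RATINGS = {
    "ana": {"tent": 5, "stove": 4, "boots": 5},
    "ben": {"tent": 4, "stove": 5},
    "cy": {"boots": 4, "novel": 5},
    "dee": {"novel": 4, "lamp": 5},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    vectors = defaultdict(dict)
    for user, items in ratings.items():
        for item, score in items.items():
            vectors[item][user] = score
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Recommend unseen items similar to the ones the user already rated."""
    vectors = item_vectors(ratings)
    seen = ratings.get(user, {})
    if not seen:
        # User cold start: no history, so fall back to the most-rated items
        # (an assumed fallback; ties are broken alphabetically).
        popular = sorted(vectors, key=lambda i: (-len(vectors[i]), i))
        return popular[:top_n]
    scores = defaultdict(float)
    for candidate, cand_vec in vectors.items():
        if candidate in seen:
            continue
        # Weight each similarity by how much the user liked the seen item.
        for item, rating in seen.items():
            scores[candidate] += cosine(cand_vec, vectors[item]) * rating
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return [i for i in ranked if scores[i] > 0][:top_n]
```

With this toy data, recommend("ben", RATINGS) suggests boots, because another user rated boots alongside the tent and stove that ben already rated. A user with no ratings at all only gets the popularity fallback, which shows why early recommendations for new users tend to be generic.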
Depending on the use case, recommendation engines need to adapt quickly to new trends and scale up quickly to process more data and output faster predictions. The best recommendation engines typically combine collaborative filtering with content-based recommendations, which helps them overcome the weaknesses of relying on a single approach.
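One way to picture such a combination is a weighted blend of the two scores. The sketch below is only an illustration: the function names, the use of Jaccard overlap as the content-based score, and the weighting rule n / (n + k) are assumptions chosen for the example, not a standard formula.

```python
def jaccard(a, b):
    """Overlap between two keyword sets, in the range [0, 1]."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def hybrid_score(collab_score, item_keywords, profile_keywords, n_ratings, k=5):
    """Blend a collaborative score with a content-based keyword match.

    collab_score is assumed to be normalized to [0, 1] already.
    n_ratings is how many ratings the item has received so far.
    The weight n / (n + k) is an illustrative choice: with no ratings the
    content score decides alone, and as ratings accumulate the
    collaborative score takes over.
    """
    weight = n_ratings / (n_ratings + k)
    content_score = jaccard(item_keywords, profile_keywords)
    return weight * collab_score + (1 - weight) * content_score
```

For a brand-new product with no ratings, hybrid_score(0.0, {"suv", "offroad"}, {"suv", "offroad"}, n_ratings=0) returns 1.0, so the item can still be recommended on keyword match alone. That is how a hybrid can soften the product cold start problem.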
One way for developers to improve the accuracy of their recommendation engines is to use off-the-shelf, pre-trained ML models that have already been tested and debugged. Another is to invest in MLOps tools that speed up the process of putting models into production and that monitor those models regularly.
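As a rough idea of what regular monitoring can involve, the sketch below computes precision at k, a common ranking metric, and flags a model whose recent scores fall below a threshold. The function names, the threshold of 0.2, and the three-day window are hypothetical values chosen for illustration and do not come from any particular MLOps product.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items the user actually engaged with."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


def needs_attention(daily_scores, threshold=0.2, window=3):
    """Return True when the mean of the last `window` scores is below threshold."""
    recent = daily_scores[-window:]
    return bool(recent) and sum(recent) / len(recent) < threshold
```

For example, precision_at_k(["a", "b", "c", "d"], {"a", "c"}, 2) is 0.5, because one of the top two recommendations was relevant. A run of low daily scores would make needs_attention return True, signaling that the model may need retraining or adjustment.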
Recommendations are here to stay, but to be productive and useful they require a minimum amount of reliable data. If we want to be offered content that truly interests us, we need to collaborate and make our preferences more transparent to the recommendation engines that are trying so hard to anticipate our needs. To make these engines more robust, we need to use the best programming tools, computational resources, and platforms to monitor them and make quick adjustments.
Source of statistics: https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers