In the 21st century, aka the era of customer-centricity, it’s hard to think of a high-quality software product that does not use personal recommendations. This functionality is actively used for cross-selling and upselling and brings immense benefits to both the company and the customer.
What is a recommendation system? A recommender system suggests items that a particular user is likely to want, which makes online shopping and similar experiences much more pleasant. Fast-food chains offer us new menu items based on the ones we have already tried, browsers track our search history and show corresponding ads, and streaming services predict which show is going to be your next favorite.
All of that is possible thanks to the steady growth of computational power and the development of machine learning tooling, including the Python machine learning ecosystem.
"Why do I need to use machine learning, after all?" Developers often ask this question at an early stage of developing a software product architecture. There are many important things to consider: what part of the app the intellectual module will be responsible for, what kind of load is expected and (probably the most important question) whether it is possible to avoid using ML.
There is an unspoken rule: if the task can be solved without machine learning algorithms and artificial neural networks, then do not use them. Judging by the experience of many well-established companies, a well-chosen conventional algorithm can often solve the task faster and better than a machine learning (ML) or deep learning (DL) approach.
The method’s efficiency depends on the data that is needed to teach the mathematical model and on the model’s learning time which may take up to several days. So it may actually turn out that the simplest algorithm can efficiently resolve the issue with no ML applied.
But every rule has an exception. Here, the exception is recommendation systems.
The thing is, deterministic, hand-written algorithms are not always suitable for the task. For example, if the system is very dynamic, meaning that the number of users or objects grows very quickly, then adding new objects and searching for associations between users may take too long or even be infeasible with such an approach. Machine learning, however, can handle this task automatically.
Given data about the users and objects, a machine can find the associations on its own and propose a new object within a short time.
Recommendation systems using machine learning are used for both scientific and business purposes. In business, such systems are designed to give the customer information about the service or the item that they might be interested in. Depending on the business type, the machine learning algorithms used for recommendation systems can serve different functions: propose movies and shows, new clothes and goods, restaurants, etc.
At the core of every recommender engine is a table where the rows represent the users of the system and the columns represent the items available in it.
At the intersection of a user and an item, there is a cell with a rating. This rating shows the user’s level of interest in the item. The problem is that most users rarely rate the items available in the system. Because of that, the resulting table turns out to be very sparse: most of its cells are empty.
A recommendation system therefore has to work with the data that is available and fill in the missing values with estimated ratings. In addition, not all services give users the opportunity to rate their products or services, which complicates the task.
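To make the table concrete, here is a minimal numpy sketch of a user-item rating matrix. The ratings are invented for illustration, and filling gaps with the item mean is only a naive baseline assumed here, not the method a production system would necessarily use.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are items.
# np.nan marks "no rating yet"; all values are made up.
ratings = np.array([
    [5.0, np.nan, np.nan, 1.0],
    [np.nan, 4.0, np.nan, np.nan],
    [np.nan, np.nan, np.nan, 2.0],
])

known = ~np.isnan(ratings)       # True where a rating exists
sparsity = 1.0 - known.mean()    # share of empty cells in the table

# Naive baseline for the estimated rating: the item's mean rating,
# falling back to the global mean for items nobody has rated.
global_mean = np.nanmean(ratings)
counts = known.sum(axis=0)
sums = np.where(known, ratings, 0.0).sum(axis=0)
item_means = np.where(counts > 0, sums / np.maximum(counts, 1), global_mean)

# Keep real ratings, fill the gaps with the per-item estimate.
filled = np.where(known, ratings, item_means)

print(f"sparsity: {sparsity:.2f}")
print(filled)
```

Even in this tiny example most cells are empty, which is why the estimation step matters more than the table itself.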
For these reasons, user ratings can be collected in two ways: explicitly and implicitly.
With the explicit method, the user rates an item directly and unambiguously, showing clear interest (or lack of it) in a certain product or service. An example would be rating a product on a five-point scale.
With the second, implicit method, the system infers a rating from the user’s actions, e.g. by counting the purchases of a specific item or measuring how long a certain video was watched. This approach is the most popular and is almost all-purpose. It works well both on its own and in conjunction with the explicit method, making the latter more precise.
When comparing the two methods, the explicit one is simpler in terms of both data analysis and implementation. This is because with an implicit approach, we can only make an assumption about whether the user liked a product or not.
For example, viewing a video does not always mean that a person is interested in it, and a few purchased products do not guarantee interest either, as they could simply be gifts for another person.
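The difference between the two kinds of feedback can be sketched in a few lines of Python. The event names, the weights in `EVENT_WEIGHTS`, and both helper functions are hypothetical; they only illustrate that an implicit score is an assumption about interest, while an explicit rating can be used as is.

```python
# Assumed weights: how strongly each observed action hints at interest.
EVENT_WEIGHTS = {"view": 0.2, "cart": 0.5, "purchase": 1.0}


def implicit_score(events):
    """Estimate interest in [0, 1] from the events of one user-item pair."""
    if not events:
        return 0.0
    # Take the strongest signal; unknown event names count as no signal.
    return max(EVENT_WEIGHTS.get(event, 0.0) for event in events)


def combined_rating(events, explicit=None, scale=5.0):
    """Prefer an explicit rating when present, else scale the implicit guess."""
    if explicit is not None:
        return float(explicit)
    return implicit_score(events) * scale


print(combined_rating(["view", "cart"]))          # inferred from behaviour
print(combined_rating(["purchase"], explicit=2))  # the user's own rating wins
```

In the second call the purchase suggests high interest, but the explicit two-star rating overrides it, which matches the gift scenario described above.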
Depending on the architecture of the software product, the data structure, and the analysis method, there are several types of recommendation systems. The main ones are described below.
Non-personalized recommendation is the simplest of the available algorithms. Systems based on it recommend products on the following principle: if everyone likes it, you will like it as well.
Such an approach is not very precise, and the system often shows completely irrelevant items. This algorithm is suitable for systems that do not require user registration or that use it alongside another algorithm. Non-personalized recommendations are mostly used for first-time users, when the system knows nothing about their preferences yet.
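A non-personalized recommender can be as small as a popularity counter. The sketch below assumes interactions arrive as `(user, item)` pairs; the data and the `most_popular` helper are invented for illustration.

```python
from collections import Counter


def most_popular(interactions, k=3):
    """Return the k items with the most interactions, most popular first."""
    counts = Counter(item for _user, item in interactions)
    return [item for item, _count in counts.most_common(k)]


# Hypothetical interaction log: (user, item) pairs.
interactions = [
    ("u1", "pizza"), ("u2", "pizza"), ("u3", "burger"),
    ("u1", "burger"), ("u2", "salad"), ("u3", "pizza"),
]

top = most_popular(interactions, k=2)
print(top)
```

Every visitor gets the same list, which is exactly why the results can be irrelevant for any individual user.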
Collaborative filtering is the most popular approach; its development began in the 1990s. With collaborative filtering, recommendations are calculated on the basis of related users or items.
The results are obtained by comparing many users with one another and tend to be quite accurate.
Even though this method delivers very good results, its practical implementation is quite complex because of the large number of calculations needed to find similar users. The algorithm has high asymptotic complexity (a measure of how its cost grows with the size of the input), so memory consumption and computation time grow rapidly as users and items are added. As a result, the approach copes poorly with dynamic systems and is best suited to relatively static ones, such as food matching.
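A minimal sketch of user-based collaborative filtering with numpy is shown below. It assumes cosine similarity as the measure of "related users" and treats 0 as "not rated"; both are illustrative choices, and the ratings are made up. Note that the similarity matrix compares every user with every other user, which is the source of the cost described above.

```python
import numpy as np


def cosine_similarity_matrix(ratings):
    """Pairwise cosine similarity between the rows (users) of `ratings`."""
    norms = np.linalg.norm(ratings, axis=1, keepdims=True)
    normalized = ratings / np.maximum(norms, 1e-12)  # avoid division by zero
    return normalized @ normalized.T


def recommend(ratings, user, k=1):
    """Rank items the user has not rated by similarity-weighted ratings."""
    sims = cosine_similarity_matrix(ratings)[user].copy()
    sims[user] = 0.0                      # ignore the user's own row
    scores = sims @ ratings               # weighted sum of other users' ratings
    scores[ratings[user] > 0] = -np.inf   # never re-recommend rated items
    return np.argsort(scores)[::-1][:k]


# Hypothetical ratings, 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 0],
    [5, 5, 4, 0],
    [0, 0, 1, 5],
], dtype=float)

print(recommend(ratings, user=0, k=1))
```

User 0 shares tastes with user 1 and nothing with user 2, so the item that only user 1 rated highly comes out on top.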
Luckily, there is a quicker algorithm that has become really popular in recent years: matrix factorization. This approach is based on the idea that a matrix can be represented, at least approximately, as the product of two other matrices. In ML recommendation systems, these are the user matrix and the object matrix. Each of these matrices captures hidden characteristics of the users or the objects. The approach is considered very efficient because it can identify the “hidden” preferences of the users and deliver precise results. For now, matrix factorization is one of the most efficient ways to implement recommendations.
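The idea can be sketched with plain gradient descent in numpy: learn a user matrix and an object matrix whose product matches the observed ratings, then read predictions for the empty cells from that product. The `factorize` function, its hyperparameters, and the toy data are assumptions made for this example; real systems typically rely on tuned library implementations.

```python
import numpy as np


def factorize(ratings, mask, n_factors=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Approximate ratings ~ users @ items.T using only the observed cells."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    users = rng.normal(scale=0.1, size=(n_users, n_factors))
    items = rng.normal(scale=0.1, size=(n_items, n_factors))
    for _ in range(steps):
        error = mask * (ratings - users @ items.T)  # zero where unobserved
        users_grad = error @ items - reg * users    # descent direction + L2 penalty
        items_grad = error.T @ users - reg * items
        users += lr * users_grad
        items += lr * items_grad
    return users, items


# Made-up ratings with two obvious taste groups.
true_ratings = np.array([
    [5, 4, 1, 1],
    [5, 4, 1, 1],
    [1, 1, 5, 4],
    [1, 1, 5, 4],
], dtype=float)

mask = np.ones_like(true_ratings)
mask[0, 1] = 0.0   # pretend user 0 never rated item 1
mask[2, 3] = 0.0   # pretend user 2 never rated item 3
observed = true_ratings * mask

users, items = factorize(observed, mask)
predicted = users @ items.T
rmse = np.sqrt(((mask * (observed - predicted)) ** 2).sum() / mask.sum())

print(f"RMSE on observed cells: {rmse:.3f}")
print(f"predicted rating for hidden cell (0, 1): {predicted[0, 1]:.2f}")
```

Each row of `users` and `items` holds two latent factors, which play the role of the "hidden" characteristics: the model never sees the two hidden cells, yet it can estimate them from the structure shared with similar users.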
In practice, a single algorithm rarely delivers the needed result on its own, so it is better to use hybrid (mixed) recommendations, where one algorithm compensates for the weaknesses of another.
Any system based on machine learning principles needs initial data to train its weights. When data is scarce or absent, the system cannot deliver results. In recommendation systems, this is known as the “cold start” problem. Each algorithm requires a certain amount of data to work properly. The problem appears when the service is launched in production: the number of users is minimal, and the chosen algorithms are not yet applicable.
There are a few solutions to the problem. First, items can be recommended by the date they were added to the cart, or chosen at random from the whole item set in the hope that the user will like some of them. Second, the recommendation system can be switched on only after the service has been running for a while and enough data has accumulated to analyze.
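A cold-start fallback can be sketched as a simple switch: use the personalized model once a user has enough history, and fall back to newest-first or random picks before that. Everything here is hypothetical, including the `cold_start_recommend` name, the `min_history` threshold, and the assumption that each item is stored with an `added_on` date.

```python
import random
from datetime import date


def cold_start_recommend(history, catalog, personalized, k=2,
                         min_history=3, strategy="newest", seed=None):
    """Use `personalized` when enough history exists, else a simple fallback."""
    if len(history) >= min_history:
        return personalized(history)[:k]
    # Not enough data yet: only consider items the user has not touched.
    unseen = [(item, added_on) for item, added_on in catalog if item not in history]
    if strategy == "random":
        rng = random.Random(seed)
        picked = rng.sample(unseen, min(k, len(unseen)))
    else:
        # Newest items first, based on the assumed `added_on` date.
        picked = sorted(unseen, key=lambda pair: pair[1], reverse=True)[:k]
    return [item for item, _added_on in picked]


# Hypothetical catalog of (item, added_on) pairs.
catalog = [
    ("a", date(2024, 1, 1)),
    ("b", date(2024, 3, 1)),
    ("c", date(2024, 2, 1)),
]


def fake_model(history):
    """Stand-in for a trained recommender."""
    return ["x", "y", "z"]


print(cold_start_recommend([], catalog, fake_model))               # new user
print(cold_start_recommend(["a", "b", "c"], catalog, fake_model))  # known user
```

The threshold decides when the service has "enough data"; choosing it is a product decision as much as a technical one.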
Machine learning is a powerful tool for solving numerous tasks. Features such as personalized recommendations, associations, and classification of various items can become a valuable asset for almost any system. In future articles, we will take a closer look at machine learning, and specifically at recommendation systems implemented in Python with the numpy and scikit-learn libraries: their working principles and possible ways of implementing them in real-life projects.