It's easy for an e-commerce recommendation system to score 60, but hard to score 80 or 90


Peng Changping graduated from the Institute of Automation, Chinese Academy of Sciences. He has more than ten years of cutting-edge research and industrial experience in machine learning and recommender systems, and has published many papers at RecSys, CIKM and other international academic conferences on recommender systems. He is now the head of recommendation and advertising algorithms at Jingdong (JD).

With the development of the Internet, recommendation systems are everywhere and have become the revenue engine of many e-commerce platforms. JD's personalized recommendation system has likewise brought great benefits to the company. As recommender systems play an ever more important role in information distribution, we also explore how large-scale machine learning, deep learning and related technologies are applied to Jingdong's product search and recommendation, and what conditions an efficient and valuable recommender system should meet.

How recommendation systems drive business growth

In the digital information age, the recommendation system has become standard technology for consumer-facing (to-C) Internet products, and recommendation algorithms play a crucial role in raising business revenue. Platforms such as Amazon and Netflix gain huge business value from their recommender systems. According to statistics, the recommender system generates more than $1 billion of business value for Netflix every year, and about 40% of Amazon's revenue comes from personalized recommendation.

For e-commerce, a personalized recommendation system lets a platform show each of its many users a different storefront ("thousands of people, thousands of faces"). In essence, when a user's purchase intention is not yet clear, it uses machine learning or deep learning algorithms to build a model of the user's interests from user features, product features and scene features. With that model it finds the goods the user is interested in among a huge catalog, shortens the distance between users and goods, and improves purchase efficiency and product experience. Peng Changping believes that personalized recommendation is an effective distribution mechanism in scenarios with an extremely rich set of candidates. He explained how the Jingdong recommendation system drives business growth from two aspects: the quantity and the quality of goods.

First, quantity. The number of SKUs in e-commerce is far more than the human brain can handle. For example, "jam" alone has more than 100,000 SKUs on Jingdong. Scholars from Stanford University once ran an experiment in an offline supermarket. Group A was offered 24 kinds of jam, and only 3% of the shoppers who stopped at the shelf bought one. Group B was offered 6 kinds of jam, and 30% of the shoppers who stopped at the shelf bought one, a purchase rate 10 times that of group A (30% / 3% = 10). "Less is more": in an e-commerce scenario with too many candidates, personalized recommendation, in which "goods look for people", helps users narrow the field to a small number of suitable choices.

Second, quality. Personalized recommendation carries the platform's values. The Jingdong recommendation system integrates information on brand, attributes, price, reviews, logistics and so on, and mainly promotes products that are "good", "economical" and "fast". As a result, while users get a better shopping experience, user stickiness also increases, forming a virtuous circle that brings better revenue.

As large-scale machine learning, deep learning and related technologies have matured, they have been widely adopted in product recommendation. Peng Changping believes that, in industry today, the recommendation system is where machine learning algorithms are applied most widely, most deeply and most successfully. In almost every stage, data-driven and algorithm-driven models are replacing decisions made on gut feeling.

Perhaps the most familiar application of deep learning in recommender systems is click-through rate and conversion rate estimation, but he also gave several other examples. First, recall. It is hard for a single recall model to solve every problem, so JD uses many types of deep learning models in recall, including vector-based, tree-based and graph-based ones. Second, the product knowledge graph. Understanding the text, pictures and videos of goods, and the relationships between goods, depends almost entirely on NLP, CV and other machine learning algorithms. Third, rerank. Recommendation is a multi-objective optimization problem: on top of the predicted click-through rate, the list must be optimized globally for user experience and browsing depth. Guiding users to keep scrolling down is a business scenario that is very well suited to deep reinforcement learning.
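To make the vector-based recall mentioned above more concrete, here is a minimal sketch. It is not JD's implementation: it assumes users and goods have already been embedded in the same vector space, and the item names, the three-dimensional vectors and the function name vector_recall are all hypothetical. A production system would typically use an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_recall(user_vec: Vector,
                  item_vecs: Dict[str, Vector],
                  k: int) -> List[Tuple[str, float]]:
    """Return the k items whose embeddings are closest to the user embedding."""
    scored = [(item_id, cosine(user_vec, vec)) for item_id, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings (assumption: produced by some upstream model).
ITEM_VECS = {
    "strawberry_jam": [0.9, 0.1, 0.0],
    "blueberry_jam": [0.8, 0.2, 0.1],
    "phone_case": [0.0, 0.1, 0.9],
}

top_items = vector_recall([1.0, 0.0, 0.0], ITEM_VECS, k=2)
print(top_items)
```

In this toy data the two jam items are recalled ahead of the phone case, because their vectors point in nearly the same direction as the user vector.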

What are the characteristics of a high-quality recommendation system?

Because user groups, business scenarios, regions and cultures differ, recommendation systems themselves come in a thousand varieties, and the systems of different platforms differ in countless details. Peng Changping said that, compared with media content platforms such as video, news feeds and live streaming, the recommendation system of Jingdong e-commerce is relatively easy to bring to 60 points, but very hard to bring to 80 or 90 points.

In terms of framework, all recommender systems do the same things: understand the user, understand the item, and match the two. The pipeline consists of stages such as product selection, recall, click-through rate estimation and rerank. The particular difficulties of e-commerce recommendation, however, lie in the following three aspects:

First, on the user side: on a content platform, users' needs stay relatively stable for a long time, and content is consumed entirely online. Shopping demand, by contrast, arises and is fulfilled both online and offline, and the online part is often just the transaction. The offline part is hard to track and digitize, so identifying and stimulating user needs is a major challenge in e-commerce;

Second, on the item side: producers on a content platform can publish fresh content every day, in different forms, around the same interest topic. In shopping, once a user has made a purchase, the same kind of goods can no longer be recommended, so the system has to do more to expand and stimulate the user's demand;

Third, in terms of the actions the recommender system wants users to take: a content platform mainly meets users' entertainment needs, and the cost of consuming a poor recommendation is very low. In shopping, the recommendation system expects users to click and browse, to develop a desire for a product (what Chinese shoppers call "planting grass"), and even to spend money. If item quality is poor or the recommendations are not accurate enough, users will abandon the platform's recommendation feature, or even leave the platform altogether.

So what are the characteristics of an efficient and valuable recommendation system? Peng Changping believes that a good recommendation system puts the items users like in front of them even when they have not actively expressed a need. Such a system has to meet the following three conditions:

First, it meets users' needs, which shows in users being willing to browse and staying longer;

Second, it is growth oriented: it expands users' interests, drives the growth of high-quality goods or content providers, and is friendly to new users and new merchants;

Third, it promotes survival of the fittest.

To achieve these three points, the recommender system has to work on several fronts. First, it learns from user behavior feedback and item information, so that the model adapts its matching to the data. Second, there is no silver bullet in the recall stage, so recall has to combine several different types of algorithms, and the models at every stage need strong generalization ability so that they also work for cold-start users and items. Third, the objective function has to reflect the platform's values, which in most cases means multi-objective optimization.
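Since no single recall algorithm is a silver bullet, the candidates of several recall channels have to be merged before ranking. The following is a minimal sketch of one simple way to do that, a round-robin interleave with de-duplication. The channel names, item ids and the function merge_recall_channels are hypothetical, and the article does not say how JD actually merges its channels.

```python
from itertools import zip_longest
from typing import Dict, List


def merge_recall_channels(channels: Dict[str, List[str]], limit: int) -> List[str]:
    """Interleave the ranked outputs of several recall channels.

    Takes the best item of every channel first, then the second best, and so
    on, skipping items that an earlier channel already contributed.
    """
    merged: List[str] = []
    seen = set()
    for tier in zip_longest(*channels.values()):
        for item in tier:
            if len(merged) >= limit:
                return merged
            if item is None or item in seen:  # None pads exhausted channels
                continue
            seen.add(item)
            merged.append(item)
    return merged


# Hypothetical channel outputs, each already sorted by its own score.
channels = {
    "item_cf": ["a", "b", "c"],
    "embedding": ["b", "d"],
    "knowledge_graph": ["e", "a", "f"],
}

candidates = merge_recall_channels(channels, limit=5)
print(candidates)  # ['a', 'b', 'e', 'd', 'c']
```

Interleaving gives every channel a share of the candidate set, which is one way to keep a niche channel from being crowded out by a dominant one. Real systems often use per-channel quotas or a learned merge instead.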

Application practice of e-commerce recommendation system

A recommender system is a kind of information filtering system that predicts a user's "rating" of or "preference" for an item. Its goal is to produce meaningful recommendations of goods or content that users are interested in. On an Internet full of massive information and data, finding valuable content without a recommendation system is like looking for a needle in a haystack. A recommender system can sift through large amounts of dynamically generated information, provide personalized content and services to users, and effectively relieve information overload. With the explosive growth of digital information and Internet users, the recommendation system is more important than ever.

The development of the Jingdong recommendation system has gone through the following four stages:

1、 Meeting user needs. The earliest system was adapted from the search system. It treated the products a user had recently browsed as the user's needs, and item-based collaborative filtering (item-based CF) was the most important means of recall.

2、 Expanding user demand. At this stage the richness of recall was improved from as many angles as possible, in both data and algorithms. To that end, JD set up a project called "recall kaleidoscope" to keep improving the diversity and coverage of recall. In the ranking stage, the optimization goal shifted from click-through rate and conversion rate, which emphasize how well an item matches the user, to the user's scroll depth and the novelty and diversity of what is shown.

3、 Session-level global optimization and merchant ecosystem optimization. In this stage, JD's optimization focuses on rerank, which treats the user's preceding browsing behavior within a session as one complete list. Rerank is a process of list generation and list evaluation, that is, it optimizes the user's overall views and clicks across the whole list. The other direction is the introduction of an ecosystem optimization mechanism. The model quantifies the long-term value that an interaction between a user and a product brings to the user and to the merchant, and feeds that estimated value into the ranking mechanism.

4、 Joint optimization across user groups and business lines. As Jingdong's business has developed, its user base has expanded from a relatively homogeneous group to a very diverse one, and users in third- to sixth-tier cities now account for more than 60%. Whether inside the Jingdong app, or in the Jingdong Express Edition and Jingxi apps, which are tailored to lower-tier markets, the expansion and customization of user groups has become a new source of rapid growth, and the "thousands of people, thousands of faces" recommendation algorithm faces more challenges. In this stage, knowledge graphs, transfer learning and related technologies play an important role.
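The item-based CF recall of the first stage can be sketched as follows. This is a minimal illustration under stated assumptions, not JD's code: similarity is taken to be the cosine of co-occurrence counts over users' browsing sets, and the browsing log, item names and function names are made up.

```python
import math
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def item_similarities(user_items: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """sim(a, b) = co_count(a, b) / sqrt(count(a) * count(b))."""
    item_count: Dict[str, int] = defaultdict(int)
    co_count: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for items in user_items.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            co_count[a][b] += 1
            co_count[b][a] += 1
    return {
        a: {b: c / math.sqrt(item_count[a] * item_count[b]) for b, c in related.items()}
        for a, related in co_count.items()
    }


def recommend(recent_items: List[str],
              sims: Dict[str, Dict[str, float]],
              k: int) -> List[Tuple[str, float]]:
    """Score unseen items by summed similarity to the recently browsed items."""
    scores: Dict[str, float] = defaultdict(float)
    for item in recent_items:
        for other, sim in sims.get(item, {}).items():
            if other not in recent_items:
                scores[other] += sim
    # Sort by score descending, then by item id so ties are deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))[:k]


# Hypothetical browsing log: user id -> set of browsed items.
BROWSING = {
    "u1": {"phone", "phone_case", "charger"},
    "u2": {"phone", "phone_case"},
    "u3": {"phone", "charger"},
    "u4": {"jam", "bread"},
}

SIMS = item_similarities(BROWSING)
recs = recommend(["phone"], SIMS, k=2)
print(recs)
```

A user who has just browsed "phone" gets "charger" and "phone_case" back, while "jam" never appears because no user browsed it together with a phone. This also shows the limitation that motivated the later stages: recall of this kind stays close to what the user has already looked at.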

At each stage, the Jingdong recommendation system has put a lot of effort into improving the accuracy, precision and coverage of its recommendations. Peng Changping said that improving several seemingly contradictory objectives at the same time requires work along three dimensions: diversifying recall algorithms, moving from optimization at the level of a single user-item pair to global optimization at the session level, and ecosystem optimization that protects the growth of high-quality merchants. JD has done the following work from these three perspectives:

1、 Recall kaleidoscope: in terms of recall granularity, JD has built hierarchical representations of both users and items at coarse and fine granularities, and matches the two at each granularity. In terms of recall algorithms, Boolean matching models, embedding-based retrieval and knowledge-based retrieval each account for a large share of the recommendation results.

2、 Session global optimization: for a single recommendation candidate, accuracy and surprise are in conflict, but from the perspective of maximizing total clicks over the session the two are unified. In practice, the CTR model has changed from pointwise to listwise.

3、 Merchant ecosystem optimization: quality grading and a cold-start mechanism for new merchants and new products effectively guarantee the exposure and order volume of high-quality merchants and products on the platform. A steady stream of new merchants and new product launches is an important driving force for improving coverage and surprise.
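The move from pointwise to listwise optimization can be illustrated with a toy list evaluator. This is only a sketch under invented assumptions: the per-slot view probabilities, the rule that an item loses half its click probability when it directly follows an item of the same category, and all names and numbers are hypothetical. A real listwise model learns such context effects from data, and cannot enumerate every permutation as this example does.

```python
from itertools import permutations
from typing import Dict, List, Sequence

# Assumed probability that the user looks at each slot of the list.
POSITION_VIEW_PROB = [1.0, 0.7, 0.5]


def expected_clicks(order: Sequence[str],
                    ctr: Dict[str, float],
                    category: Dict[str, str],
                    repeat_penalty: float = 0.5) -> float:
    """List evaluation: expected clicks of the whole list, not of single items."""
    total = 0.0
    for pos, item in enumerate(order):
        p_click = ctr[item]
        # Assumed context effect: a repeat of the previous category is less attractive.
        if pos > 0 and category[order[pos - 1]] == category[item]:
            p_click *= repeat_penalty
        view = POSITION_VIEW_PROB[min(pos, len(POSITION_VIEW_PROB) - 1)]
        total += view * p_click
    return total


def best_list(candidates: Sequence[str],
              ctr: Dict[str, float],
              category: Dict[str, str],
              size: int) -> List[str]:
    """List generation by brute force: score every ordering, keep the best."""
    orders = permutations(sorted(candidates), size)
    return list(max(orders, key=lambda order: expected_clicks(order, ctr, category)))


ctr = {"jam_a": 0.30, "jam_b": 0.28, "bread": 0.20}
category = {"jam_a": "jam", "jam_b": "jam", "bread": "bakery"}

pointwise = sorted(ctr, key=ctr.get, reverse=True)  # rank each item on its own CTR
listwise = best_list(list(ctr), ctr, category, size=3)
print(pointwise, expected_clicks(pointwise, ctr, category))
print(listwise, expected_clicks(listwise, ctr, category))
```

Under these assumptions the pointwise order puts the two jams next to each other, while the listwise order places the bread between them and earns more expected clicks in total. The less "accurate" item in slot two raises the value of the whole list, which is the sense in which accuracy and surprise stop being in conflict at the session level.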

According to Peng Changping, the JD platform has many sub-scenes, and each sub-scene contains many specialized search and recommendation features. Joint optimization of recommendation across these sub-scenes relies mainly on transfer learning. The user behavior data in any one sub-scene is sparse, yet each sub-scene has its own distinctive behavior patterns. JD trains the model on data from the main scene and several sub-scenes together, and has designed a multi-layer network structure that lets the model transfer knowledge from both the main scene and similar sub-scenes. Through transfer learning, a single sub-scene model can be applied across multiple terminals such as the Jingdong app, the Jingxi app, the Jingdong Express Edition app, WeChat shopping and QQ shopping.
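The idea of sharing knowledge between the main scene and sparse sub-scenes can be sketched with a deliberately simplified model. The article describes a multi-layer network; the version below replaces it with a linear click model whose weight for each feature is the sum of a shared part and a scene-specific part, which is an assumption made purely for illustration. The class name, scene names and features are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict

Features = Dict[str, float]


class SharedPlusSceneModel:
    """Logistic click model: weight = shared weight + scene-specific offset."""

    def __init__(self, learning_rate: float = 0.1) -> None:
        self.learning_rate = learning_rate
        self.shared: Dict[str, float] = defaultdict(float)
        self.scene: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))

    def predict(self, scene: str, x: Features) -> float:
        z = sum((self.shared[f] + self.scene[scene][f]) * v for f, v in x.items())
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, scene: str, x: Features, clicked: int) -> None:
        """One SGD step on log loss; both weight groups receive the gradient."""
        error = self.predict(scene, x) - clicked
        for f, v in x.items():
            step = self.learning_rate * error * v
            self.shared[f] -= step        # knowledge every scene can reuse
            self.scene[scene][f] -= step  # knowledge specific to this scene


model = SharedPlusSceneModel()
for _ in range(50):
    # Only the main scene has behavior data in this toy example.
    model.update("main", {"discount": 1.0}, clicked=1)
    model.update("main", {"out_of_stock": 1.0}, clicked=0)

p_main = model.predict("main", {"discount": 1.0})
p_new = model.predict("new_sub_scene", {"discount": 1.0})
print(p_main, p_new)
```

The sub-scene has never produced a training example, yet its prediction for a discounted item is already above 0.5, because the shared weights carry over what was learned in the main scene. Once the sub-scene collects its own data, its scene-specific offsets can move away from the shared behavior.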

As competition among e-commerce platforms grows fiercer, attracting new users and raising the activity and stickiness of existing users are key factors in a platform's development. Continuous iteration and upgrading of the recommendation system is therefore particularly important. In the future, the Jingdong recommendation system will be optimized in three technical directions: shopping-guide content recommendation, scene recommendation and the ecosystem optimization mechanism.

On shopping-guide content recommendation: with the rise of e-commerce content, represented by live-stream selling, the Jingdong platform has accumulated a large number of content producers, and their high-quality content now sits alongside goods as candidate items for the recommendation system. Different types of material and different optimization objectives pose a greater challenge to the algorithm, while the richer content gives users a better experience of both "browsing" and "buying".

On scene recommendation: when it comes to the experience of "browsing", many people are deeply impressed by the room-style layouts of IKEA stores. JD is building an understanding of the scenarios in which users consume goods, so that it can recommend the complete set of goods a scenario requires, present them to users in a more three-dimensional way, and provide a scene-based shopping experience online.

Finally, on the ecosystem optimization mechanism: the work ahead is to strengthen, within the recommendation system, both the survival-of-the-fittest mechanism for merchants and the growth mechanism for high-quality new merchants and new products.

Technical problems and breakthroughs

Although recommendation systems have greatly relieved information overload and met users' personalized needs, some problems still hinder their development. Peng Changping believes that the biggest difficulty is still "data". It shows up in two ways: first, how to acquire data comprehensively and process it quickly; second, how the model can learn more efficiently from massive data.

To acquire data comprehensively and process it quickly, one first has to be clear about what "comprehensive" and "fast" mean. "Comprehensive" requires fusing online and offline omni-channel data from every touchpoint where the platform interacts with users. "Fast" requires a near-real-time streaming data pipeline that shortens the time from data to model and speeds up model parameter updates. As IoT terminals diversify and on-device computing power improves, combining on-device computing with cloud computing can further improve how promptly the recommendation system responds to user feedback.
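One small building block of such a near-real-time pipeline is a feature that is updated per event instead of being recomputed in a nightly batch. The sketch below shows an exponentially time-decayed counter, for example "recent clicks in a category". The class name, the 60-second half-life and the timestamps are assumptions for illustration, and the article does not describe JD's actual streaming features.

```python
from dataclasses import dataclass


@dataclass
class DecayedCounter:
    """Event counter whose value halves every `half_life_s` seconds."""

    half_life_s: float
    value: float = 0.0
    last_ts: float = 0.0

    def _decay_factor(self, ts: float) -> float:
        elapsed = max(0.0, ts - self.last_ts)  # ignore out-of-order timestamps
        return 0.5 ** (elapsed / self.half_life_s)

    def add(self, ts: float, weight: float = 1.0) -> None:
        """Apply one streaming event (e.g. a click) observed at time ts."""
        self.value = self.value * self._decay_factor(ts) + weight
        self.last_ts = max(self.last_ts, ts)

    def read(self, ts: float) -> float:
        """Feature value as seen by the model at time ts (does not mutate state)."""
        return self.value * self._decay_factor(ts)


clicks_on_jam = DecayedCounter(half_life_s=60.0)
clicks_on_jam.add(ts=0.0)    # click at t = 0 s
clicks_on_jam.add(ts=60.0)   # click at t = 60 s: old click now counts 0.5
print(clicks_on_jam.read(ts=60.0))   # 1.5
print(clicks_on_jam.read(ts=120.0))  # 0.75
```

Because each update touches only a single number and a timestamp, the feature stays fresh at the cost of constant work per event, which is what lets recent feedback influence the next recommendation request.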

Faced with massive and complex data, the system needs more than raw capability, meaning absolute computing power, the absolute volume of data it can process, and the ability to serve TB-scale complex models. The model structure itself also has to adapt better to massive data. On this latter issue, Peng Changping said he is optimistic about the maturing of AutoML technology. For example, in JD's current neural architecture search (NAS) work, the results already match model structures that professional algorithm engineers had optimized over a long period. He believes that in the near future it can replace the "alchemists" who tune model structures by hand.

Peng Changping believes that:

An industrial recommender system has no single core technology. In a recommender system the algorithm is in the driving seat and people are relatively passive, so both users and merchants have a low tolerance for algorithm errors. Only when the system collects data as completely and as efficiently as possible, adopts more efficient algorithms and polishes every detail can users and merchants come to trust it.

As technology advances, every field, from clothing, food and housing to transportation and entertainment, will enter a state of oversupply. It is foreseeable that, with the spread of 5G and IoT, people interacting with electronic devices will rely more and more on recommendation technology. That may no longer mean a platform-level recommendation system: everyone may need a personalized recommendation "assistant" in every field.

Live broadcast notice

If you would like to talk with Mr. Peng Changping directly, here is your chance!

At 20:00 next Monday (September 7), Peng Changping will join the InfoQ online open class to give a talk titled "Application practice of Jingdong e-commerce recommendation system". Anyone interested in developing user interest in e-commerce scenarios should not miss it!
