This article is from the OPPO Internet Technology Team. Please credit the source and author when reprinting. Follow our public account: OPPO_tech
Editor's note: This article introduces the machine learning application scenarios in advertising, and the different requirements placed on algorithms in each scenario. It can serve as an introductory piece for understanding what algorithms do in advertising.
Data plays an important role throughout the advertising pipeline, but extracting its maximum value ultimately depends on the algorithms applied at each key step.
Let's first look at what machine learning can do across the advertising pipeline, and then discuss what we need to learn and understand for the planning that follows.
Ranking of advertisements
A very important part of the advertising technology stack is ad ranking. So when it comes to applying algorithms in advertising, the first thing that comes to mind is using machine learning to rank ads optimally.
Ad ranking involves several factors: the bid, context matching, and CTR estimation. Combining these elements into an optimal ordering is the problem to solve, and among them CTR prediction is the single most important problem in the advertising field.
Many advertising systems may never implement context understanding, and their bidding logic may be rough, but CTR estimation is the most important problem and must be solved first. In essence, CTR prediction computes, for each ad in the candidate pool and each user about to receive an impression, the probability of a click, so it can be framed as a regression-style problem: predicting a probability (in practice, usually binary classification with a probabilistic output).
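As a sketch of how such a pCTR model is used at serving time, here is a minimal logistic-regression scorer over sparse features. The weights and feature names are hypothetical, invented purely for illustration, not from any real system:

```python
import math

def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_ctr(weights, features):
    """Score one sparse (user, ad, context) feature vector; returns P(click)."""
    z = sum(weights.get(f, 0.0) * v for f, v in features.items())
    return sigmoid(z)

# Hypothetical learned weights for a few sparse cross-features.
weights = {"bias": -3.0, "user_interest=games": 1.2, "ad_category=games": 0.8}
features = {"bias": 1.0, "user_interest=games": 1.0, "ad_category=games": 1.0}

pctr = predict_ctr(weights, features)  # sigmoid(-1.0), roughly 0.27
```

In a real system the feature vector would be extremely high-dimensional and sparse, but the scoring step is exactly this dot product plus sigmoid.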
Continuing the topic above: context understanding, or context matching, is in short the computation of how well environmental factors match the ad, that is, a relevance score. Of course, a matching score is only one way to approach context understanding, and content-based relevance models are only one family of solutions.
There are other ways to solve this problem. For example, when there are enough samples, instead of computing content relevance we can borrow the idea of recommender systems: we have a large volume of historical (ad context, ad exposure) pairs, together with the CTR observed under each combination. This is a very typical association-analysis scenario: relate context and ad through historical behavior data rather than through content-based relevance.
From the standpoint of the goal, what we ultimately want is for users to click the ad, so both content relevance analysis and behavioral association analysis are valid paths.
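Once each candidate has a bid and a predicted CTR, one common way to combine them into a ranking score (a standard choice, though not necessarily the exact formula any given platform uses) is expected revenue per thousand impressions, eCPM. A minimal sketch with made-up candidates:

```python
def ecpm(bid_cpc, pctr):
    """Expected revenue per mille: bid per click x predicted CTR x 1000 impressions."""
    return bid_cpc * pctr * 1000

# Hypothetical candidate ads: ad C wins despite the lowest bid,
# because its predicted CTR more than compensates.
candidates = [
    {"ad": "A", "bid": 0.50, "pctr": 0.020},  # eCPM 10.0
    {"ad": "B", "bid": 0.80, "pctr": 0.010},  # eCPM  8.0
    {"ad": "C", "bid": 0.30, "pctr": 0.040},  # eCPM 12.0
]
ranked = sorted(candidates, key=lambda c: ecpm(c["bid"], c["pctr"]), reverse=True)
# Ranking order: C, A, B
```

This is why CTR prediction sits at the core of ranking: with a fixed bid, the quality of pCTR directly decides which ad wins the auction.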
Lookalike population expansion
Crowd expansion is a typical demand scenario in advertising. Put bluntly: what do we do when there are not enough target users? Predict, and expand the audience.
Lookalike essentially takes a set of core users (a "core user" here meaning a member of a verified high-conversion population) and computes similar users, where "similar" means similar with respect to the conversion goal rather than strict content similarity, so as to expand the audience.
Usually the seed users are those selected by targeting, but there is a gap between the size of the targeted group and the exposure volume actually needed. Another common scenario is that advertisers import their own accumulated high-conversion core population, which is the most accurate targeting of all, and the platform is then responsible for finding similar people.
Back to machine learning: you can treat this as a similar-user computation, a binary judgment with a probability output; at its simplest, logistic regression (LR) meets the need. But it is not that simple, because the training set may contain millions or even tens of millions of samples, and if you are not careful the feature dimensionality can reach tens of millions or hundreds of millions, at which point you have many more engineering problems to solve.
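One simple, admittedly naive, way to realize "similar to the seed population" is to average the seed users' sparse feature vectors into a profile and score every candidate by cosine similarity against it. All feature names and values below are illustrative assumptions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse feature dicts."""
    dot = sum(a.get(k, 0.0) * v for k, v in b.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(users):
    """Average the sparse feature vectors of the seed (core) users."""
    acc = {}
    for u in users:
        for k, v in u.items():
            acc[k] = acc.get(k, 0.0) + v
    return {k: v / len(users) for k, v in acc.items()}

# Hypothetical seed users who already converted, and one candidate.
seeds = [{"games": 1.0, "finance": 0.0}, {"games": 0.8, "sports": 0.5}]
profile = centroid(seeds)
candidate = {"games": 0.9, "sports": 0.3}
score = cosine(profile, candidate)  # high score -> include in expanded audience
```

A production lookalike system would more likely train a classifier (e.g. LR) with seeds as positives, but the similarity view above captures the intent of the technique in a few lines.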
User tags
We know that targeting is the audience-recall stage of an advertising system, ranging from basic attributes such as gender up to higher-level commercial interests, such as whether you want to lose weight or whether you want to borrow money. The concrete form of all of these is user tags.
Every mature advertising platform has a complete and reasonably accurate tag system, and every user in the system carries at least a few tags. As a result, there is always a suitable advertising scenario in which to make the corresponding recommendation.
The essence of advertising is the distribution of traffic and data. That may sound blunt, but it is the ultimate purpose of tagging different groups of people.
Building tags ranges from simple to hard. Take the most basic gender tag: if you have a scenario where you can obtain the user's ID number, it is trivial; if not, you have to guess. That makes it a typical binary-classification scenario (sometimes three classes: on Weibo, for example, besides male and female there are accounts with institutional attributes).
Beyond that there are hundreds of other tags, at all levels and in all vertical domains, some of which can be assigned via behavioral rules. As long as the rules are sound, the precision of rule-based labeling is predictable. However, explicit behaviors are few, so rules cannot cover enough users for large-scale exposure; in other words, the recall is too low, which in machine-learning terms means poor generalization ability.
So from a machine-learning perspective, this is a typical discriminative-model scenario, and a massively multi-class one. Of course, you can also convert it into binary classification: a yes-or-no answer for each tag.
In practice, most systems simply compute a probability per tag and then apply further thresholds and judgments. And because tags are built from user behavior, we inevitably face large amounts of text, so natural language processing is indispensable.
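The per-tag binary formulation described above can be sketched as independent probability thresholds, one per tag (one-vs-rest). The tag names and scores below are invented for illustration:

```python
def assign_tags(tag_scores, threshold=0.5):
    """One-vs-rest tagging: each tag is an independent yes/no decision
    on the probability produced by that tag's own binary model."""
    return sorted(tag for tag, p in tag_scores.items() if p >= threshold)

# Hypothetical per-tag probabilities for one user, each from its own model.
scores = {"fitness": 0.72, "loans": 0.10, "travel": 0.55}

assign_tags(scores)                 # ['fitness', 'travel']
assign_tags(scores, threshold=0.6)  # ['fitness']
```

Keeping the probability around (rather than only the yes/no answer) is what allows the downstream "further calculation and judgment" the text mentions, such as per-tag thresholds tuned for precision or recall.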
Anomaly analysis and anti-cheating
So-called anomaly analysis belongs broadly to anti-cheating. For example, how do you handle the fake volume produced by large-scale machine clicks on ads? Even without machines, there are always people who click ads for fun. Worse still are ad formats such as class-two e-commerce, where the landing page collects the user's order information.
(Note: class-one e-commerce refers to the familiar platforms such as Taobao, JD.com, and Tmall. Strictly speaking, it means e-commerce promoted through online storefronts, with online payment, a typical mall structure, and an online-shelf sales model. The core of class-two e-commerce is promotion with cash on delivery, usually without a mall or shelf in the strict sense. The typical model is a single-product landing page: order online, then pay on delivery.)
Filling in a nonexistent phone number is common, as is a number that reaches someone other than the buyer. In class-two e-commerce, a package may arrive at the right house number with the recipient's name written as "Jay Chou." Do you hesitate to deliver it? If it is rejected, the return postage is simply wasted.
The advertiser is the one who loses sleep. Every click deducts money, and every rejected order wastes not only fulfillment effort but also the round-trip postage. For a small business, that is hard to bear.
Therefore, class-two e-commerce must above all control the dirty-order rate (a "dirty order" being an undeliverable address or any kind of rejected package), and at the CPC layer, large volumes of malicious invalid clicks are equally unacceptable. All of this pressure ultimately lands on the platform, and the platform must solve the problem.
We need to distinguish users who have already behaved maliciously toward ads (easy, given their historical record), but also those who merely have the potential to.
From the machine-learning perspective, this looks like a typical classification scenario, yet it is not. After all, bad actors are always a tiny minority; if a platform's users were mostly bad actors, the platform would have no business left to run.
This is a needle-in-a-haystack job: out of tens of millions or even hundreds of millions of users, pick out the few hundred thousand, at most a few million, "possible bad guys." It is a typical scenario of severely imbalanced positive and negative samples, a big taboo in classification and one of the hardest situations to handle.
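A standard first remedy for such imbalance (one option among several, alongside over/under-sampling) is to weight each class inversely to its frequency when training, so that errors on the rare class cost far more. A minimal sketch of that reweighting, using the same heuristic scikit-learn calls "balanced":

```python
def balanced_weights(labels):
    """Weight each class inversely to its frequency: w_c = n / (k * count_c),
    where n is the sample count and k the number of classes."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {y: n / (k * c) for y, c in counts.items()}

# Illustrative data: 1 fraudulent click among 999 legitimate ones.
labels = [0] * 999 + [1] * 1
weights = balanced_weights(labels)
# weights[1] == 500.0, and weights[1] / weights[0] == 999:
# a mistake on the rare class is weighted as heavily as its scarcity.
```

These weights would then be passed into the loss function of whatever classifier is used, so the model cannot achieve low loss by simply predicting "legitimate" for everyone.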