# 1、 Introduction

Guaranteed delivery (GD) is a common way to purchase brand display advertising. Existing technical solutions usually abstract and model the problem at crowd granularity. This modeling approach ignores the differences in behavior among users within the same crowd, and it cannot accurately enforce constraints at user granularity.

Academic research on contract advertising traffic allocation usually abstracts the problem as a bipartite graph matching between the contract side and the supply side. However, current allocation strategies stay at the granularity of crowds and tags, which requires the division of crowds and tags to be orthogonal. Guaranteed allocation at the crowd level also has several other limitations.

First, because allocation happens only at the crowd level, it is impossible to use accurate user behavior prediction to match each user's personalized behavior to the right advertisement. This reduces the advertiser's return on investment and, in turn, the future income of the advertising platform. Second, advertisers usually have complex requirements for controlling how often users see an ad, such as frequency capping at user granularity. A typical case: to improve UV reach under a fixed budget, advertisers often limit the exposure frequency for a single UV. For these reasons, traditional allocation at crowd and tag granularity is inefficient and hard to apply to current contract advertising products.

In this paper, we build a large-scale distributed allocation algorithm for contract advertising and introduce a personalized delivery metric for users. Taking user interaction behavior into account, we allocate contract ads at user granularity. Our algorithm can handle complex constraints such as advertising priority, display frequency control, and ad placement capacity limits. On this basis, we also develop a real-time budget smoothing strategy to further optimize advertising performance (such as cost per click, CPC). Our system already runs offline computing tasks at the billion scale and is applied online in Alimama brand display advertising. We also present offline and online experimental results to validate the approach.

# 2、 Problem definition

This is a classic online assignment problem. When the data scale is small and the global information is known, the Hungarian algorithm, network flow algorithms, and mixed integer programming can all obtain the optimal solution. For an Internet advertising delivery system, however, the data scale is very large, and because of performance requirements it is neither possible nor practical to obtain global information for each request.

The above is the definition and description of the problem. Taobao has more than 100 million daily users, and the number of daily ads is in the hundreds of thousands. At this scale, the difficulty lies in solving the problem while still providing a high-concurrency, low-latency advertising service. We therefore solve the problem in a large-scale distributed way in the offline phase, and in the online phase we develop an independent pacing module that smooths delivery to adapt to changes in traffic.
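To make the small-scale case concrete, here is a minimal sketch that finds the exact optimum of a tiny assignment problem by trying every one-to-one matching. The value matrix and the function name `best_assignment` are hypothetical and not taken from our system. The enumeration grows factorially with the number of requests, which is why exact methods of this kind stop being usable at advertising scale.

```python
from itertools import permutations


def best_assignment(value):
    """Exhaustively find the one-to-one assignment of requests (rows)
    to ads (columns) that maximizes the total value.

    value[i][j] is a hypothetical score for showing ad j on request i.
    """
    n = len(value)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        # perm[i] is the ad assigned to request i
        total = sum(value[i][perm[i]] for i in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm


# Toy 3x3 instance with integer scores (illustrative only).
value = [
    [9, 2, 1],
    [8, 7, 3],
    [1, 6, 5],
]
total, perm = best_assignment(value)
print(total, perm)
```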

# 3、 System implementation

**3.1 system description**

The figure above shows the overall architecture and data flow of our system, which mainly consists of the CRM management system where advertisers sign orders, the offline algorithm, and the online engine. The CRM system is the bridge between the platform and advertisers. It is mainly used to define order information, including crowd selection, traffic volume, online contract signing, and creative binding. The offline processing framework synchronizes this contract information and, together with the offline logs, runs the distributed offline algorithm through an allocation optimization model built on a parameter server (PS) architecture. The results are imported into the model and synchronized to the online system. Finally, when a real-time request arrives, Merge, the online engine, calls the RTP service, the pacing service, and the allocation model service. After the offline allocation model is applied, the final advertisement is selected and returned to the front-end client for display.

**3.2 offline optimization**

The offline part of the algorithm is divided into two stages. In the first stage, the original problem is transformed into its dual form, and the dual variables are obtained by solving in parallel. In the second stage, to handle the effect of delivery priority, we propose a parallel acceleration scheme based on offline topological sorting, which greatly improves the efficiency of solving large-scale data sets.

3.2.1 stage 1 distributed solution

For our problem, the number of users is orders of magnitude larger than the number of contracts, so the computational cost of solving the contract-side duals is much higher than that of the supply side. We compute the supply-side dual variables on the worker nodes of the PS in a distributed way. For the contract side, the update is done on the server, and workers obtain the latest dual variables by pull. Looking at the dual variable constraint on the contract side, we find that the size of the equation to solve matches the size of the advertiser's target audience, which is too large for ordinary computation. We therefore use an approximate method to calculate and update this dual variable. Note that this variable increases monotonically during the update process.

The pseudocode above describes how the dual variables are updated. For the supply-side dual variables, one property is easy to observe:

(Equation image not recovered.)

Moreover, the objective of this equation is monotone, so the solution can be found by bisection. We omit the details here.
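As an illustration of the bisection step, here is a minimal sketch for solving a monotone equation. The equation itself is not reproduced in this article, so the function `share` below uses a SHALE-like form that is assumed purely for illustration, and all names and numbers (`theta`, `alpha`, `volume`) are hypothetical.

```python
def solve_monotone(f, target, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with f(x) close to target by bisection.

    Assumes f is non-increasing on [lo, hi] and that the target is bracketed.
    """
    if f(lo) < target or f(hi) > target:
        raise ValueError("target is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid  # f is at or below the target, so the root lies to the left
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


# Hypothetical per-user data: two eligible contracts.
theta = [0.6, 0.8]   # assumed target allocation ratios
alpha = [1.0, 2.0]   # assumed contract-side dual variables
volume = [1.0, 1.0]  # assumed contract weights


def share(beta):
    """Total allocation for one user as a function of its dual variable beta."""
    return sum(
        max(0.0, t * (1.0 + (a - beta) / v))
        for t, a, v in zip(theta, alpha, volume)
    )


beta = solve_monotone(share, target=1.0, lo=0.0, hi=10.0)
print(round(beta, 4))
```

Because each term in `share` is non-increasing in `beta`, the sum is too, which is the only property the bisection relies on.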

3.2.2 stage 2 parallel priority acceleration

In the original procedure, contracts are processed separately and sequentially. When the data scale is small, the amount of computation is acceptable. When the number of contracts reaches a certain scale, such as tens of thousands, the procedure becomes clearly inefficient. However, the properties above tell us that for any two orthogonal crowds, the calculations do not overlap regardless of priority, so they can run in parallel. As a simple example, take two user groups in Beijing and Shanghai: even if their priorities differ, the groups do not overlap, so the allocation for one does not affect the inventory of the other, and the two tasks can be computed in parallel.

Under this premise, when we build the bipartite graph offline, we can derive the actual order priorities for each user, because we store the contract ads attached to each user. As shown in the figure above, we can then build a DAG of all orders from the original allocation order. Finally, topological sorting gives us the optimized parallel execution sequence. Taking the figure above as an example, the original update order is [A] → ... → [H]. After constructing the DAG and sorting it topologically, we get [A, B] → [C, D, E] → [F, G] → [H], so only 4 batches need to be processed. Compared with the original 8 batches, efficiency roughly doubles. In real experiments, the improvement from this optimization is even more pronounced.
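The independence argument can be sketched as follows: two contracts need an ordering only if they share at least one user and have different priorities. This is a minimal illustration, not our production code. The data layout (`user_contracts`, `priority`), the contract names, and the convention that a smaller number means higher priority are all assumptions.

```python
from itertools import combinations


def precedence_edges(user_contracts, priority):
    """Return the set of edges (first, second): contract `first` must be
    allocated before `second` because they compete for a shared user.

    user_contracts: dict mapping user id -> list of eligible contract ids.
    priority: dict mapping contract id -> rank (smaller = higher priority).
    Contracts that never share a user get no edge and can run in parallel.
    """
    edges = set()
    for contracts in user_contracts.values():
        ordered = sorted(set(contracts), key=lambda c: (priority[c], c))
        for first, second in combinations(ordered, 2):
            # Assumption: equal-priority contracts need no ordering.
            if priority[first] < priority[second]:
                edges.add((first, second))
    return edges


# Hypothetical example: Beijing and Shanghai audiences do not overlap.
user_contracts = {
    "beijing_user": ["bj_high", "bj_low"],
    "shanghai_user": ["sh_high", "sh_low"],
}
priority = {"bj_high": 1, "bj_low": 2, "sh_high": 1, "sh_low": 3}
edges = precedence_edges(user_contracts, priority)
print(sorted(edges))
```

No edge connects a Beijing contract to a Shanghai contract, so the two groups can be processed concurrently.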
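The batching step can be sketched with a level-by-level topological sort (Kahn's algorithm). The figure is not reproduced here, so the edge list below is a hypothetical DAG chosen only because it yields the same four batches as the example. The function name `topo_batches` is also an assumption.

```python
from collections import defaultdict


def topo_batches(nodes, edges):
    """Group DAG nodes into batches so that every node appears in a later
    batch than all of its predecessors. Nodes in one batch are independent
    and can be processed in parallel.
    """
    indegree = {n: 0 for n in nodes}
    children = defaultdict(list)
    for u, v in edges:
        children[u].append(v)
        indegree[v] += 1

    batch = sorted(n for n in nodes if indegree[n] == 0)
    batches, processed = [], 0
    while batch:
        batches.append(batch)
        processed += len(batch)
        next_batch = []
        for u in batch:
            for v in children[u]:
                indegree[v] -= 1
                if indegree[v] == 0:  # all predecessors are done
                    next_batch.append(v)
        batch = sorted(next_batch)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle")
    return batches


# Hypothetical precedence edges between orders A..H.
nodes = list("ABCDEFGH")
edges = [
    ("A", "C"), ("A", "D"), ("B", "D"), ("B", "E"),
    ("C", "F"), ("D", "F"), ("E", "G"), ("F", "H"), ("G", "H"),
]
print(topo_batches(nodes, edges))
```

With these edges, eight sequential updates collapse into four parallel batches.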

**3.3 online pacing**

In earlier display delivery systems, especially for guaranteed contracts, many systems deliver online with roulette selection or a greedy algorithm. Since the dual variables of our model are fixed before delivery starts, the delivery probabilities cannot change afterwards, which causes a problem: the delivery result depends heavily on the traffic forecast, which can lead to online under-delivery or over-delivery. To adapt to changes in traffic distribution in time, we add a pacing control module to contract delivery. We observe that minute-level traffic fluctuation is usually smaller than day-level fluctuation, so we can apply real-time control at the minute level.

Unlike the HWM and SHALE algorithms, we improve the online delivery mode. The original roulette approach essentially selects ads at random with respect to performance, which loses information about the display effect. If we instead greedily choose the best ad, there is a risk of under-delivery caused by misallocation. By combining pacing with SHALE, we introduce real-time pacing control so that our method adapts better to changes in online traffic.
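A minute-level control loop of this kind can be sketched as a simple multiplicative feedback rule. This is an illustrative sketch only: the function name `update_pass_rate`, the `gain` and `floor` parameters, and the update rule itself are assumptions, not the controller used in our system.

```python
def update_pass_rate(pass_rate, delivered, target, gain=0.5, floor=0.01):
    """Adjust the fraction of eligible requests a contract may serve.

    pass_rate: current pass rate in (0, 1].
    delivered: impressions actually served in the last minute.
    target: impressions planned for that minute.
    The rate rises when delivery lags the plan and falls when it runs ahead.
    """
    if target <= 0:
        return floor
    error = (target - delivered) / target  # > 0 means under-delivery
    new_rate = pass_rate * (1.0 + gain * error)
    return min(1.0, max(floor, new_rate))  # clamp to [floor, 1.0]


# Hypothetical minute: 120 impressions served against a plan of 100.
rate = update_pass_rate(0.5, delivered=120, target=100)
print(round(rate, 3))
```

The clamp keeps a small amount of traffic flowing even after heavy over-delivery, so the controller can still observe demand in the next minute.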
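The difference between the two baseline selection modes can be sketched as follows. The ad names, allocation probabilities, and click scores are hypothetical, and the functions illustrate the generic techniques rather than our online engine.

```python
import random


def roulette_pick(probs, rng):
    """Pick an ad at random in proportion to its allocation probability.
    Ignores how well each ad is expected to perform on this request.
    """
    if not probs:
        raise ValueError("no candidate ads")
    r = rng.random() * sum(probs.values())
    acc = 0.0
    for ad, p in probs.items():
        acc += p
        if r < acc:
            return ad
    return ad  # guard against floating-point rounding on the last item


def greedy_pick(scores):
    """Always pick the ad with the highest predicted score.
    Ads with low scores may never be served and can under-deliver.
    """
    return max(scores, key=scores.get)


rng = random.Random(0)                      # seeded for repeatability
probs = {"ad_a": 0.7, "ad_b": 0.3}          # hypothetical offline allocation
scores = {"ad_a": 0.012, "ad_b": 0.031}     # hypothetical predicted click rates
picks = [roulette_pick(probs, rng) for _ in range(10000)]
print(picks.count("ad_a"), greedy_pick(scores))
```

Roulette follows the planned shares but wastes the click signal, while greedy follows the click signal but abandons the planned shares. This tension is what the pacing control is meant to balance.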

# 4、 Experimental results

We perform offline validation on a Taobao data set collected from contract ads in the Taobao banner and "Guess You Like" placements. In the offline phase, we verify the correctness and efficiency of solving the dual variables. We then verify the correctness of our offline model and online pacing service with a real online A/B test.

**4.1 offline experiment**

We compare against the classical GD algorithms HWM and SHALE. In addition to solving time, we evaluate the algorithms on four indicators: delivery completion rate, penalty term, L2 norm, and average cost per click. The four indicators are defined as follows:

The offline evaluation is shown in the table and figure below. Across the cases tested, our system reduces CPC by nearly 50% compared with the other methods. Although we lose some ground on completion rate, penalty term, and L2 norm, these losses are acceptable given the reduction in CPC. In terms of running time, HWM is the fastest because it runs only one iteration.

However, compared with SHALE and our method, HWM performs relatively poorly and cannot be used online. Because we consider a multi-objective allocation, SHALE is slightly better than our method on completion rate, but our method improves the delivery metric significantly. As mentioned above, one of the bottlenecks in the advertising allocation problem is the time needed to solve the dual variables α and β. We adopt the parallelization scheme and the accelerated scheduling method based on topological sorting of the DAG, which computes α and β more quickly. The advantage of this acceleration is more pronounced on large data sets.

Further analysis shows that accelerated scheduling with topological sorting of the DAG greatly improves solving performance. We test on real daily tasks over 7 days. We use originbatch to denote the number of iterations without DAG acceleration and reduce batch to denote the number of iterations after acceleration. Compared with the serial mode, tasks using the DAG run about 14 times faster.

We also compare different settings of the learning rate hyperparameter. Convergence is slow when the learning rate is small. As the learning rate increases, convergence speeds up significantly, but when it is too large, the solution oscillates around the optimum. In the extreme case of L = 2.0, the result stays far from the optimal solution. In our system, we usually set the learning rate between 0.5 and 0.7.

Finally, we test the effect of the parameter λ on the delivery completion rate and the average cost per click. The average CPC decreases monotonically as λ increases. In other words, if you want more platform revenue, increasing λ seems to be a good choice. However, when λ is too large, it lowers not only the click cost but also the completion rate, which leads to under-delivery. We therefore generally keep the loss in completion rate within 1% to 3%.
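The qualitative behavior described above (slow for small rates, fast for moderate rates, oscillation for large rates) can be reproduced on a toy problem. The sketch below runs plain gradient descent on f(x) = 0.5 * x^2, which is not our allocation objective. The function name `distance_trace` is hypothetical, and the specific thresholds of this toy problem do not carry over to the real system.

```python
def distance_trace(lr, steps=10, x0=1.0):
    """Run gradient descent on f(x) = 0.5 * x**2 (gradient is x) and
    return the distance |x| to the optimum x = 0 after each step.
    """
    x = x0
    trace = []
    for _ in range(steps):
        x = x - lr * x  # each step multiplies x by (1 - lr)
        trace.append(abs(x))
    return trace


for lr in (0.1, 0.6, 1.5, 2.0):
    print(lr, distance_trace(lr)[-1])
```

On this toy problem each step scales the distance by |1 - lr|, so a rate of 0.1 shrinks it slowly, 0.6 shrinks it quickly, 1.5 overshoots and flips sign each step while still converging, and 2.0 bounces between two points without ever getting closer.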

**4.2 online experiment**

We ran a real online test of our scheme in a Mobile Taobao scenario and used the real online click-through rate to evaluate our delivery metric. In the online A/B experiment, a budget-bucket test mechanism ensures that each experiment runs under the same budget. We compare our method with a benchmark algorithm based on greedy delivery and with the traditional SHALE algorithm. To verify the effect of pacing, we add the SHALE-CLICK algorithm, which considers the click effect but not pacing and delivers by roulette.

In Figure A, the delivery curve produced by our strategy is the smoothest. The greedy benchmark algorithm spends more than half of the budget at the very beginning. Next best are the SHALE and SHALE-CLICK algorithms, where roulette selection smooths delivery to some extent. However, the online traffic distribution is not fully consistent with the offline traffic distribution, so the result is not optimal.

Figure B shows the real-time cumulative average click-through rate of the different algorithms in each bucket. The SHALE-CLICK algorithm, which models clicks, performs better than the greedy benchmark and SHALE. The greedy algorithm performs worst because it delivers too fast, which limits the room for traffic optimization. With pacing smoothing, our algorithm performs best and shows a clear advantage throughout delivery. As shown in Figure C, the click-through rate of our method is 22.7% higher than the baseline, 21.2% higher than SHALE, and 10.6% higher than SHALE-CLICK. These results demonstrate the effectiveness of our algorithm.

# 5、 Summary

By incorporating the interaction metric between users and ads into the allocation objective, the offline distributed algorithm and the online real-time control algorithm greatly improve the delivery efficiency and contract completion of guaranteed contract advertising. The online bucket experiment further shows that our algorithm brings more revenue to the contract advertising platform and improves the brand marketing effect for advertisers. This work was accepted as a long paper at ICDM'19. Please refer to "Large-Scale Personalized Delivery for Guaranteed Display Advertising with Real-Time Pacing".

Author: Ling, Zheng


This article is from Alibaba cloud partner “alitech”. If you need to reprint it, please contact the original author.