Baidu app fluency whole-process quality monitoring practice (2): fluency metric selection

Time:2021-6-16

Preface

In part (1) of this series, we covered why fluency monitoring is necessary, along with some of the fluency metrics and evaluation methods used in the industry. Part (2) introduces how the fluency metric for the Baidu app was selected.

Baidu app fluency metric selection

1. Non-fluent scenarios and their weighting

Actual testing showed that web browsing in the Baidu app suffers from two types of fluency problems: “jitter / frame drop” and “stuck”.

Let's illustrate the difference between the two problems with a clock-pointer example, as shown in the following figure. Suppose that in the ideal state the pointer advances one grid per ideal frame time (16.6 ms), and takes 8 steps to complete a circle and return to zero.
[Figure: pointer-walking example]

1.1 Jitter / frame drop:

The frame-length sequence is: the first two frames are ideal frames (16.6 ms each), and the last three frames each last two ideal frames (16.6 × 2 ms each). Actual pointer walking effect: 0,1,1,2,2,4,4,6. To the user, the pointer first moves two steps at a constant speed and then advances in a stop-and-go fashion (frame dropping). When browsing a web page, especially while the page is scrolling, this makes the page feel as if it is “shaking”.

1.2 Stuck:

The frame-length sequence is: the first two frames are ideal frames (16.6 ms each), and the last frame spans six ideal frames (16.6 × 6 ms). Actual pointer walking effect: 0,1,1,1,1,2. To the user, the pointer first moves two steps at a constant speed, then stays still (stuck), and then suddenly jumps.

To sum up, both problems can hurt the user experience, and in the Baidu app web browsing scenario both are non-fluency problems that our monitoring needs to recall. When the total duration of the two kinds of non-fluency is the same (in the examples above, each lasts 6 ideal frames), we want both to be recalled as fairly as possible, so that both kinds of problems can be optimized and solved.
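The “same total duration” condition can be checked with a minimal sketch. It assumes the two clock examples are encoded as frame lengths in units of one ideal frame (16.6 ms); the function name `non_smooth_duration` and the list encoding are illustrative, not part of the Baidu app tooling.

```python
def non_smooth_duration(frames):
    """Total length, in ideal-frame units, of every frame longer than one ideal frame."""
    return sum(f for f in frames if f > 1)

# Frame lengths as multiples of the ideal frame time (16.6 ms).
jitter_frames = [1, 1, 2, 2, 2]  # two ideal frames, then three double-length frames
stuck_frames = [1, 1, 6]         # two ideal frames, then one frame spanning six ideal frames

print(non_smooth_duration(jitter_frames))  # 6
print(non_smooth_duration(stuck_frames))   # 6
```

Both cases accumulate the same 6 ideal frames of non-smooth time, so a fair metric should score them equally.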

When selecting the specific algorithm later on, we follow this idea and try to balance the recall of the two kinds of non-fluency. The reason is that any bias, once amplified by the large volume of data accumulated in production, will skew the final data more and more toward one type of problem.

2. Choosing the thresholds for stutter detection

In the previous section, “stuck” and “jitter / frame drop” were described using hypothetical scenarios. In actual use of the product, what frame-length threshold should be used to decide whether non-fluency has occurred?

Based on long-term testing experience, our QA colleagues initially proposed two thresholds: 30 ms and 70 ms.

  1. 30 ms corresponds to one dropped frame (a frame length of 16.6 ms × 2) in actual rendering, which causes slight stutter or jitter: when the frame-length sequence within 1 s is [33, 33, 33, …, 33], no significant stutter is visible to the naked eye, but when the interface plays an animation or the user scrolls, the whole interface jitters.
  2. 70 ms corresponds to roughly three dropped frames (a frame length of 16.6 ms × 4) in actual rendering, which causes a visible short stutter: when the frame-length sequence within 1 s is [16.6, 16.6, 16.6, …, 70, 80, 90] and the interface plays an animation or the user scrolls, the last 3 frames make the user feel the interface pause and then recover.

The developers provided a demo that can force a specified frame length for a period of time. After several rounds of experiments with multiple participants, run together with the product colleagues, the conclusions were:

  1. When the frame length is 30 ms for five consecutive frames, jitter is clearly perceptible while scrolling the web browsing interface.
  2. When a single frame lasts 70 ms, the user can feel the interface stumble while scrolling the web browsing interface.

Therefore, our monitoring thresholds are 30 ms and 70 ms. To keep the terminology uniform, we chose not to name the thresholds after different scenarios or degrees of severity, such as “jitter”, “slight stutter” and “pause”. Instead, we simply define a frame longer than 30 ms as a “stutter” and a frame longer than 70 ms as a “big stutter”.

The reason for using two thresholds is to focus the optimization target further. When big stutters are the obvious problem in a scenario, the 70 ms threshold is used to extract the big-stutter metric; when jitter is the obvious problem, the 30 ms threshold is used to extract the stutter metric. Combined with the idea from section 1 that “when the total duration of the two kinds of non-fluency is the same, we want both to be recalled as fairly as possible”, this gives us both fair recall and the ability to distinguish and focus quickly.
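A minimal sketch of how the two thresholds could label individual frames. The names `classify_frame` and `count_labels` are hypothetical, and reporting a frame over 70 ms only under its more severe label is an assumption made here for illustration; the article does not say whether a big stutter is also counted as a stutter.

```python
STUTTER_MS = 30      # "stutter" threshold from the text
BIG_STUTTER_MS = 70  # "big stutter" threshold from the text

def classify_frame(frame_ms):
    """Label one frame; a big stutter gets only the more severe label (assumption)."""
    if frame_ms > BIG_STUTTER_MS:
        return "big stutter"
    if frame_ms > STUTTER_MS:
        return "stutter"
    return "smooth"

def count_labels(frames_ms):
    """Count how many frames fall under each label."""
    counts = {"smooth": 0, "stutter": 0, "big stutter": 0}
    for frame_ms in frames_ms:
        counts[classify_frame(frame_ms)] += 1
    return counts

frames = [16.6, 16.6, 16.6, 33, 33, 70, 80, 90]
print(count_labels(frames))
# {'smooth': 3, 'stutter': 3, 'big stutter': 2}
```

Note that a frame of exactly 70 ms is labelled “stutter” here, because the text defines a big stutter as more than 70 ms.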

Note: the thresholds above were set by the Baidu app team for the scrolling scenario of web browsing, based on experience and manual experiments. Different product forms and different concerns call for different thresholds. For a threshold-selection method that is more accurate or better suited to your own product, you can refer to the conclusions of the scientific research on “response delay” and “completion delay” in Zhu Shaomin's talk “How to get the real quality of intelligent terminal”.
[Figures: research conclusions on response delay and completion delay]

3. Choosing the calculation scheme for the fluency metric: converted stutter rate per second

Having ruled out FPS, which part (1) showed has many drawbacks for fluency tracking, we now look among the other metrics for a reliable algorithm for monitoring the stutter rate.

In addition to SF, SM and long frames mentioned in part (1), we add some further candidate calculation methods built on these variables. In discussing the algorithms, we first assume a stutter threshold of 30 ms so that results can be calculated for the example cases (PerfDog Jank keeps its own original threshold criteria). We also give the following definitions:

  1. Long frame = stutter (a frame whose length exceeds the threshold)
  2. Stutter time = sum of the long-frame lengths within 1 s = total stutter time within 1 s
  3. Converted stutter frames = stutter time / 16.6 ms
  4. Converted stutter rate per second = converted stutter frames / 60 = stutter time / 16.6 ms / 60 ≈ stutter time / 1000 ms (taking 16.6 ms ≈ 1000 ms / 60)
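The definitions above can be sketched in a few lines. This is an illustration under stated assumptions rather than the Baidu app implementation: the function name `converted_stutter_rate`, the default 30 ms threshold argument, and the two sample seconds (the jitter and stuck examples from section 1, padded with ideal frames) are all chosen here for demonstration.

```python
import math

def converted_stutter_rate(frames_ms, threshold_ms=30.0):
    """Converted stutter rate for one second of frames: stutter time / 1000 ms."""
    # Stutter time = sum of the lengths of all long frames in this second.
    stutter_time_ms = sum(f for f in frames_ms if f > threshold_ms)
    return stutter_time_ms / 1000.0

IDEAL_MS = 16.6

# Roughly one second each: 54 ideal frames plus 6 ideal frames' worth of long frames.
jitter_second = [IDEAL_MS] * 54 + [2 * IDEAL_MS] * 3  # three double-length frames
stuck_second = [IDEAL_MS] * 54 + [6 * IDEAL_MS]       # one frame spanning six ideal frames

jitter_rate = converted_stutter_rate(jitter_second)
stuck_rate = converted_stutter_rate(stuck_second)

print(f"{jitter_rate:.2%}")  # 9.96%
print(f"{stuck_rate:.2%}")   # 9.96%
print(math.isclose(jitter_rate, stuck_rate))  # True
```

Because the metric sums long-frame time instead of counting long frames, three 33.2 ms frames and one 99.6 ms frame produce the same rate, which is the fair recall described in section 1.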

For each candidate algorithm, we first compare the results for the two cases, “jitter / frame drop” and “stuck”, using their frame-length sequences within 1 s:
[Table: comparison of the candidate calculation methods for the two cases]
Besides “whether the two types of non-fluency can be recalled fairly”, we also considered “whether the boundary values are 0 and 100%”. The reason is that, if fair recall is achieved and the result also stays within 0 to 100%, a simple percentage expresses the degree of non-fluency clearly, is easy for everyone to understand, and makes it quick to evaluate overall production data.
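The boundary behaviour can be illustrated for the “stutter time / 1000 ms” calculation. This is a sketch under assumptions: the helper name `converted_stutter_rate`, the 30 ms threshold, and the three sample seconds are hypothetical and chosen only to hit the two extremes.

```python
def converted_stutter_rate(frames_ms, threshold_ms=30.0):
    """Stutter time within one second divided by 1000 ms."""
    return sum(f for f in frames_ms if f > threshold_ms) / 1000.0

smooth_second = [1000.0 / 60] * 60  # 60 ideal frames, no long frame
frozen_second = [1000.0]            # a single frame covering the whole second
jitter_second = [50.0] * 20         # every frame exceeds the threshold

print(converted_stutter_rate(smooth_second))  # 0.0
print(converted_stutter_rate(frozen_second))  # 1.0
print(converted_stutter_rate(jitter_second))  # 1.0
```

Since the long frames within one second cannot add up to more than that second, the result stays between 0 (fully smooth) and 1 (the whole second spent in long frames).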

To sum up, based on the comparison results above, we finally chose the “converted stutter rate per second” (= converted stutter frames per second / 60 = stutter time per second (in ms) / 1000 ms). This calculation is also simple.

4. Calculating the fluency metric by stage: stage converted stutter rate

Having chosen the “converted stutter rate per second”, how do we implement it in the product? The simplest approach is to split the data by second, calculate the converted stutter rate of each second, and then take the overall mean, that is, the mean of the per-second converted stutter rates. However, the “monitoring precautions” in part (1) recommend monitoring by scenario and by stage, so our calculation is divided into stages. For example, in the Baidu app search scenario we can separate a “search results page” stage from a “common landing page” stage, and further split each into a “scrolling stage” and a “static stage”, so that monitoring and optimization can be targeted. Here we introduce the “stage converted stutter rate”, defined as the overall converted stutter rate within the current stage.

In the ideal state, the stage converted stutter rate

= stage stutter time / total stage time

= stage stutter time / (1000 ms × total seconds in the stage)

= (stage stutter time / 1000 ms) / total seconds in the stage

= (sum of “stutter time / 1000 ms” for each second in the stage) / total seconds in the stage

= the mean of the “converted stutter rate per second” over the stage

In other words, the “stage converted stutter rate” should equal the mean of the “converted stutter rate per second”.
In practice, however:

  1. At run time, the unit monitoring interval within each stage cannot be held to exactly 1 s (so the denominator may be inaccurate when each second is calculated separately); it fluctuates slightly above or below 1 s, introducing a small error;
  2. When a second reaches the ideal 60 frames, a value of 0 still has to be uploaded for that second so that it is included in the average; with “stage stutter time / total stage time”, data only needs to be uploaded when a stutter actually occurs.

Therefore, the scheme finally chosen for the Baidu app fluency statistics is the “stage converted stutter rate” (= stage stutter time / total stage time).
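The difference between the two calculations can be sketched as follows. The function names and the 4-second sample stage are hypothetical, and the sketch only models the second caveat above (all-smooth seconds not being uploaded), not the timing fluctuation of the first.

```python
def stage_stutter_rate(stutter_ms_per_second, stage_seconds):
    """Stage converted stutter rate = stage stutter time / total stage time."""
    return sum(stutter_ms_per_second) / (1000.0 * stage_seconds)

def mean_per_second_rate(stutter_ms_per_second, upload_zero_seconds):
    """Mean of the per-second rates; all-smooth seconds count only if uploaded as 0."""
    rates = [ms / 1000.0 for ms in stutter_ms_per_second
             if upload_zero_seconds or ms > 0]
    return sum(rates) / len(rates) if rates else 0.0

# Hypothetical 4 s stage: stutter time (ms) observed in each second.
stutter_ms = [0.0, 250.0, 0.0, 250.0]

print(stage_stutter_rate(stutter_ms, stage_seconds=4))                # 0.125
print(mean_per_second_rate(stutter_ms, upload_zero_seconds=True))     # 0.125
print(mean_per_second_rate(stutter_ms, upload_zero_seconds=False))    # 0.25
```

When every second is uploaded, the per-second mean matches the stage rate; if all-smooth seconds are skipped, the per-second mean is inflated, whereas the stage-level ratio only needs the stutter time and the total stage duration.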

References

  1. Tencent PerfDog: PerfDog help documentation
  2. Zhu Shaomin: “How to get the real quality of intelligent terminal”, MTSC China Internet Test and Development Conference, Shenzhen

The author of this paper:
MQA-sherryshare

