[ymfe] how to reach 60fps

Time:2020-9-16

Wang Yu

He joined Qunar in 2016 and currently works as a front-end engineer in the front-end architecture group (YMFE) of Qunar’s platform division. See the team blog, YMFE (http://ymfe.tech), for more technical articles.

Not long ago, I gave a talk at YMFE Conf about building smooth animation. This article is the written version of that talk. You can see the corresponding slides here ().

What is FPS and what does 60fps mean?

FPS (frames per second) is the number of times the screen is refreshed, or the number of animation frames updated, in one second. Most devices refresh the screen 60 times per second, so to stay in step with the device an animation should also update 60 frames per second, which leaves roughly 16.7 ms to produce each frame. When the rate falls below 60fps, the animation is said to drop frames, and if many frames are dropped the user clearly perceives stutter. A high frame rate means more coherent animation and smoother scrolling, and therefore a better user experience.
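
As a rough illustration of what 60fps implies, the sketch below computes the time budget for one frame and estimates a frame rate from a list of frame timestamps, such as the ones a requestAnimationFrame callback receives. The function names frameBudgetMs and estimateFps are hypothetical and chosen only for this example.

```javascript
// Time available to produce one frame at a given refresh rate, in ms.
function frameBudgetMs(fps) {
  return 1000 / fps;
}

// Estimate frames per second from an ascending list of frame timestamps (ms).
// N timestamps delimit N - 1 frames.
function estimateFps(timestamps) {
  if (timestamps.length < 2) return 0;
  const elapsed = timestamps[timestamps.length - 1] - timestamps[0];
  if (elapsed <= 0) return 0;
  return ((timestamps.length - 1) * 1000) / elapsed;
}

console.log(frameBudgetMs(60).toFixed(1)); // "16.7"
```

An estimate that stays well below the refresh rate of the device suggests that frames are being dropped.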

This article first walks through the rendering process of modern browsers, then discusses techniques and caveats for building smooth animation at each stage of that process.

The structure of this article is as follows:

  • From HTML / CSS to web page
  • What the browser does in each frame
  • Techniques and caveats for building smooth animation

From HTML / CSS to web page

To manipulate the DOM efficiently and produce smooth animation, it is necessary to understand how the browser turns HTML, CSS, JavaScript and other resources into a web page. The process is described below.

After receiving the HTML document, the browser parses it and builds the Document Object Model (DOM) tree, which records every node of the current document. At the same time, the browser uses inline style tags and externally loaded CSS files to build the CSS Object Model (CSSOM) tree, which records the style rules for each node. The DOM tree and the CSSOM tree are then combined into a render tree, which records the actual style of every visible node on the page. We say “actual style” because CSS may contain declarations such as width: 50% or color: inherit, so the browser has to work from top to bottom and compute the actual style of each node from its parent.
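
To make the top-down computation concrete, here is a toy model, not how a real browser is implemented. It resolves only two properties (color and width), and the node shape ({ name, style, children }) and the function names are assumptions made for this sketch. A real render tree would also leave out invisible nodes.

```javascript
// Toy model: compute a node's actual style from its declared style and the
// already-computed style of its parent.
function computeStyle(declared, parentComputed) {
  const computed = {};
  // color: "inherit" (or no declaration) takes the parent's computed color.
  computed.color =
    declared.color === undefined || declared.color === 'inherit'
      ? parentComputed.color
      : declared.color;
  // width: a percentage is resolved against the parent's width in pixels.
  const width = declared.width;
  if (typeof width === 'string' && width.endsWith('%')) {
    computed.width = (parentComputed.width * parseFloat(width)) / 100;
  } else if (typeof width === 'number') {
    computed.width = width;
  } else {
    computed.width = parentComputed.width; // simplified default
  }
  return computed;
}

// Walk the tree top-down so each parent is computed before its children.
function buildRenderTree(node, parentComputed) {
  const computed = computeStyle(node.style || {}, parentComputed);
  return {
    name: node.name,
    computed,
    children: (node.children || []).map((c) => buildRenderTree(c, computed)),
  };
}
```

For example, a child declared with width: '50%' under a parent whose computed width is 800 ends up with a computed width of 400.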

The whole procedure is shown in the following figure:

△ Construction of the render tree (image from Chrome Developers)

  • DOM tree: records the structure and content of the document
  • CSSOM tree: records the style rules that apply to DOM nodes
  • Render tree: records the actual style of each visible node in the DOM

Even with the render tree in hand, the browser does not start drawing right away. A page contains many elements, and redrawing the whole page when only a small part of it has changed would be wasteful. To draw efficiently, browsers introduce the concept of layers: DOM nodes are assigned to different layers according to certain rules, and when a node changes, the browser redraws only the affected layers instead of all of them. The layer is the unit of drawing.

The process after subdivision is roughly as follows:

△ Web page rendering process

Drawing means the browser calls drawing APIs to fill in the pixels of each layer. The browser calls drawing APIs similar to moveTo and lineTo to draw each layer, producing a set of pixels similar to a bitmap. These bitmaps are then uploaded to the GPU, which merges them into the final picture shown on the screen.

To sum up, the browser renders a web page in the following steps:

  1. Parse the HTML / CSS to generate the DOM tree and the CSSOM tree
  2. Combine the DOM tree and the CSSOM tree to get the render tree
  3. Divide the render tree into multiple layers and draw the layers
  4. Upload the data of each layer to the GPU
  5. The GPU merges the layers to get the final picture displayed on the screen

As you can imagine, the browser’s internal implementation is far more complex than this discussion suggests. The above is only a high-level description of how the browser renders a page, and JavaScript has not been brought in yet, but it is enough to give a general understanding of the rendering process.

What the browser does in each frame

JavaScript modifies the DOM tree and the CSSOM tree through APIs, and CSS animations and transitions also change the render tree. Every time the render tree changes, the browser has to recalculate styles. Style calculation can involve many DOM nodes, because some styles are inherited and others are relative to the parent node.

In each frame, the browser may perform some or all of the following steps:

△ What the browser may do in each frame

A brief explanation of the steps in the figure above:

  • JavaScript: while JavaScript code runs, it may add DOM nodes or modify node styles, which affects the DOM tree and the CSSOM tree and, ultimately, the render tree. CSS animations and CSS transitions also modify the render tree.
  • Recalculate Style: this step computes the final style of each node from the CSS selectors that match it.
  • Layout: once the style of each node is known, the browser can calculate the node’s actual size and its position on the screen. Because inheritance and relative units may be involved, a change to one node can affect many others. For example, changing the width of <body> affects many of the elements inside it.
  • Update Layer Tree: the layer tree records the stacking relationship between layers, which determines which elements end up on top and which underneath.
  • Paint: fill in the pixels, drawing text, borders, shadows and so on. Painting is done layer by layer; each layer that needs painting yields a bitmap that records its visual appearance.
  • Composite Layers: once the layers are painted, they are merged in the correct stacking order to produce the complete picture shown on the screen.

You can see these steps clearly in Chrome DevTools.

Some steps can be skipped

If you modify a property that affects the size or position of an element, such as width, height or top, the browser has to run layout again, then repaint, then composite the layers to get a new frame. In other words, all of the steps above are performed.

However, if you only modify a property such as color, which affects neither the size nor the position of a node, the layout step is not needed: the browser only has to repaint. In this case the Layout step is skipped.

△ No reflow needed

Similarly, if you modify a property that does not require repainting, both Layout and Paint can be skipped, and the browser only has to composite the layers to get a new picture.

△ No reflow or repaint needed

Without layout and paint, a frame naturally takes less time: the browser has less work to do, which improves performance. Seen this way, modifications to the DOM tree and to node properties or styles do not all cost the same. Some trigger reflow and repaint, while others skip those steps entirely.

Rules

From this we can draw the following rules:

  • Layout: modifying the size or position of DOM nodes triggers layout, which in turn leads to paint and composite. Examples are changing width, margin, border and similar styles. Reading properties such as clientWidth after such a change forces the layout to be computed immediately.
  • Paint: properties that affect the color of DOM nodes cause a repaint, such as color, background and box-shadow.
  • Composite: at present, changes to opacity, transform and filter need only the composite step. For these properties the GPU only has to adjust the layers before merging them. For opacity, it changes the alpha channel of the layer; for the other two, it can operate on the layer directly, for example with matrix operations, to obtain the transformed layer.
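
The rules above can be written down as a small lookup table. This is only a sketch: it covers just the properties named in the list, and the names FIRST_STAGE, PIPELINE and stagesFor are made up for this example. Browser behavior changes over time, so treat the table as illustrative rather than authoritative.

```javascript
// Earliest pipeline stage triggered by a change to a CSS property.
// Only the properties mentioned in the rules above are included.
const FIRST_STAGE = new Map([
  ['width', 'layout'],
  ['margin', 'layout'],
  ['border', 'layout'],
  ['color', 'paint'],
  ['background', 'paint'],
  ['box-shadow', 'paint'],
  ['opacity', 'composite'],
  ['transform', 'composite'],
  ['filter', 'composite'],
]);

const PIPELINE = ['layout', 'paint', 'composite'];

// Returns the stages the browser has to run after `property` changes,
// or null when the property is not in the table.
function stagesFor(property) {
  const first = FIRST_STAGE.get(property);
  if (first === undefined) return null;
  return PIPELINE.slice(PIPELINE.indexOf(first));
}
```

For example, stagesFor('color') yields paint and composite, while stagesFor('transform') yields composite only.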

References

Paul Irish lists the operations that trigger a reflow; see What forces layout / reflow ().

In addition, at https://csstriggers.com/, members of the Chrome team list which of these steps a change to each CSS property triggers.

In practice, you can consult these two lists together with the debugging tools to avoid unnecessary reflows and repaints.

Techniques and caveats for building smooth animation

The previous part covered the basics of the browser rendering process, to help readers who are unclear about it see how a web page is rendered from a high level.

Understanding these basics pays off when writing and optimizing code for coherent animation and smooth scrolling. Based on how the browser renders, and on the steps it performs in each frame, the rest of this article offers some practical optimizations and points out some pitfalls.

The content is organized into five points:

  1. Avoid unnecessary reflows
  2. Avoid unnecessary repaints
  3. Use the GPU to accelerate rendering
  4. Build smoother animation
  5. Handle scrolling events correctly

Avoid unnecessary reflows

Every front-end engineer has been told that the DOM is slow, that manipulating the DOM from script is expensive, that DOM changes should be batched, and so on. Many books discuss DOM manipulation. High Performance JavaScript () is highly recommended; I think every front-end engineer should read it.

Although a lot has already been written about DOM manipulation, there is one point I still want to mention: avoid forced synchronous layout. The term comes up so often that it is worth discussing on its own.

Avoid forced synchronous layout

A forced synchronous layout occurs when JavaScript changes a property of a DOM element and then reads a property of the DOM. Put simply, a dirty DOM is read. For example, change the width of an element and then read its width through clientWidth. To return the true width, the browser has to recalculate style at that moment; that is, the Recalculate Style and Layout steps are run again, synchronously.

Consider the following example: a group of DOM elements must have their height set equal to their width. A newcomer might quickly write the following code:

Solution 1 – simple and crude:
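
A minimal sketch of what such naive code might look like. The function name squareNaive and the .box selector in the usage comment are hypothetical.

```javascript
// Naive version: every iteration reads layout (clientWidth) right after the
// previous iteration wrote a style, so the browser has to recalculate the
// layout on each read.
function squareNaive(elements) {
  for (const el of elements) {
    el.style.height = el.clientWidth + 'px'; // read, then write
  }
}

// In a page this could be called as:
//   squareNaive(document.querySelectorAll('.box'));
```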

When this code runs, the DOM is dirty (it has just been changed) at the start of each iteration after the first, so the layout is recalculated in order to return the true size. The loop therefore causes many forced synchronous layouts, which is very inefficient and must be avoided.

△ Forced synchronous layout

This inefficiency is easy to spot in Chrome DevTools: the browser performs many Recalculate Style and Layout operations (also known as reflow), and the frame takes a long time.

Solution 2 – separate read and write:

This problem is easily solved with two loops: the first reads the width of each DOM element and saves the results, and the second sets the height of each element.
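
A minimal sketch of the two-pass approach. The function name squareBatched and the .box selector in the usage comment are hypothetical.

```javascript
// Batched version: all reads happen before any write, so the layout is
// computed at most once for the whole group.
function squareBatched(elements) {
  const list = Array.from(elements);
  // Pass 1: read only. Nothing is modified, so no read sees a dirty DOM.
  const widths = list.map((el) => el.clientWidth);
  // Pass 2: write only.
  list.forEach((el, i) => {
    el.style.height = widths[i] + 'px';
  });
}

// In a page this could be called as:
//   squareBatched(document.querySelectorAll('.box'));
```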

△ After separating reads and writes

Separating reads from writes, so that one pass only reads and the other only writes, effectively avoids forced synchronous layout.

Real projects are often not this simple. Sometimes, even after reads and writes have been separated, a read still unavoidably follows a write. In that case, you can put the write inside requestAnimationFrame, and the browser will perform the DOM write in the next frame. requestAnimationFrame is explained in detail later.
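
A sketch of deferring the write, under the assumption that the value read in the current frame is still valid one frame later. The function name matchHeightToWidth is hypothetical, and the raf parameter exists only so the function can be exercised outside a browser; in a page the default requestAnimationFrame is used.

```javascript
// Read in the current frame, write in the next one.
function matchHeightToWidth(el, raf = requestAnimationFrame) {
  const width = el.clientWidth; // read now, while the layout is still clean
  raf(() => {
    el.style.height = width + 'px'; // write when the next frame begins
  });
}
```

Any code that reads the DOM later in the same frame still sees a clean layout, because the write has not happened yet.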

Further reading

  • “High Performance JavaScript” by Nicholas C. Zakas () explains more about DOM operations, including how to minimize repaints and reflows and how to use CSS selectors efficiently.
  • What forces layout / reflow (): this gist lists the operations that cause a forced synchronous layout.

Avoid unnecessary repaints

Before starting, let us review when a repaint is needed:

  1. When a property of a DOM node that triggers a repaint (color, background, etc.) is modified, the node is repainted
  2. When such a property is modified on any other element in the same layer as a DOM node, the whole layer is repainted
  3. When an image finishes loading, and on every frame of an animated GIF

After you check the Paint flashing option in the Rendering tab of Chrome DevTools, you can see which areas of the page are being repainted.

Avoid repainting fixed-position elements while scrolling

A common scenario is a page with a fixed navigation header or sidebar. The problem is that on every scroll, the position of the fixed element relative to the content area changes. This amounts to an element changing position within a layer, so the layer must be repainted to reflect the new scroll position. As a result, every scroll causes a repaint.

For example, Tencent’s home page has several fixed-position elements. Unfortunately, these fixed-position elements are on the same layer as the rest of the page.

After scrolling, the entire document has to be repainted, because the position of the fixed elements relative to the document has changed. The solution is to promote the fixed elements to a separate layer. Using transform: translateZ(0); forces the element onto its own layer; this is explained in detail later.

Note: Chrome automatically promotes fixed elements to separate layers on high-DPI screens, but not on low-DPI screens. As a result, many developers see no problem when testing on a MacBook Pro, while users on low-DPI screens do.

Promote some elements to separate layers to avoid large-area repaints

A CSS hack such as transform: translateZ(0); promotes an element to a separate layer. Before doing this, consider why you want to. The purpose of creating a new layer should be to avoid the large-area repaint that a change to one element would otherwise cause. For example, if a color change in a small label causes a large area to be repainted, promote the label to its own layer.
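
The same promotion can be applied from script when the need for it is only known at run time. This is a minimal sketch with hypothetical helper names. It overwrites any transform already set inline on the element, which a real implementation would have to take into account.

```javascript
// Promote a small, frequently repainted element to its own layer.
// Note: this replaces any existing inline transform on the element.
function promoteToLayer(el) {
  el.style.transform = 'translateZ(0)';
}

// Undo the promotion once it is no longer needed, since every extra layer
// costs memory.
function demoteLayer(el) {
  el.style.transform = '';
}
```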

Consider a panel in which the text in the content area is constantly flashing (its color keeps changing). If the text is promoted to a separate layer with transform: translateZ(0);, a change in the text color only causes that layer to be repainted, not the whole panel. This is the right way to use transform: translateZ(0);. So if a page has small DOM nodes that need to be repainted frequently, consider promoting them to a separate layer. You can see the demo here: avoid large area redrawing ().

Handle animated images correctly

While a page is loading, a loading indicator is often shown for a better user experience, but what should happen to it once the page has loaded? One wrong approach is to give it a lower z-index so that it is covered. Unfortunately, even though the indicator is no longer visible, the browser still repaints it on every frame. So for animated images such as loading indicators, when they no longer need to be shown it is best to hide them completely with display: none; or visibility: hidden;, or to remove them from the DOM altogether.

Using GPU to accelerate web page rendering

Front-end engineers have probably heard of hardware acceleration, which usually means using the GPU to speed up page rendering. Early browsers relied entirely on the CPU to render pages. Now that GPUs are more powerful and widespread, the vast majority of devices that run a browser include one, and browsers can use it to speed up rendering.

A GPU contains hundreds or even thousands of cores, each with a relatively simple structure, which makes it well suited to large-scale parallel computation. Compositing layers involves processing a large number of pixels, which the GPU does more efficiently than the CPU. Here is a video () that clearly shows the difference between a CPU and a GPU.

Articles often claim that a hack such as transform: translateZ(0); forces hardware acceleration on and therefore improves performance. This is a misconception. Let us look at what hardware acceleration really is.

What is hardware acceleration

The GPU can store a certain number of textures, each a rectangular set of pixels that usually corresponds to one layer of the web page, and it can transform these pixels (translate, rotate, scale) very efficiently. An animation can take advantage of this: if the layer for a new frame can be obtained simply by transforming the pixels already in the GPU, then the whole animation runs efficiently on the GPU with no repaint at all.

Once the transformed layer is available, it only has to be composited with the other layers to produce the whole picture shown on the screen. This capability of the GPU is what is usually called hardware acceleration.

Hardware acceleration also has preconditions. Blindly applying transform: translateZ(0); without understanding the principle only makes things worse. The essence of hardware acceleration is that the layer for the next frame can be obtained by a transformation on the GPU. Some operations, however, cannot be done by the GPU: if the animation changes the width or color of DOM nodes, the layer still has to be repainted in software on the CPU, and in that case the hardware acceleration mechanism cannot be used.

Using transform: translateZ(0); forces the browser to create a new layer, and every layer consumes extra memory. Too many layers consume a lot of memory, which can exhaust the device’s memory and even crash the application. In addition, these layers have to be uploaded to the GPU for compositing; with too many layers, the bandwidth between CPU and GPU becomes a bottleneck and performance suffers.

At present, only changes to filter, transform and opacity can be carried out on the GPU. As mentioned above, animations should use these properties wherever possible.

More examples of using this GPU capability follow. First, a point that needs attention:

Avoid creating new layers unnecessarily

A real case:

△ Each list item is a layer

This is a city selection page. Every list item was forced onto a separate layer with transform: translateZ(0);. I scrolled the list and recorded a timeline.

△ Before optimization

As the figure above shows, performance is quite poor. A lot of time is spent compositing layers: every frame has to merge thousands of list items, which is not cheap.

To show the effect of misusing transform: translateZ(0);, I removed the property and recorded again. With the property removed, the timeline is all green and there is no performance problem.

△ After optimization

So when we talk about hardware acceleration, we need to know what it is, how it works, and what it can and cannot do. Used sensibly, the GPU can help us build a 60fps experience.

Build smoother animation

As mentioned above, animating with transform and opacity is the most efficient approach (support for filter is not yet good enough). So whenever you need animation, first consider whether these two properties can do the job.

Avoid animating with properties that trigger layout

Sometimes it seems impossible to do the job with these two properties, but with some thought you can often find a way. Consider the following animation:

Demo: expand card ()

The obvious approach is to modify the top, left, width and height of each card. This does achieve the effect, but changing these properties triggers layout and then paint, which is bound to cause stutter in a complex application. Here is a way to complete this animation using transform.

The idea is to use getBoundingClientRect to measure the size and position of the element in the initial and final states of the animation, and then use transform to transition between them.
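
A sketch of this idea under stated assumptions: the card has no transform of its own, applyFinalLayout is a caller-supplied function that switches the card to its final layout (for instance by adding a class), and the 300ms duration is arbitrary. The function names are hypothetical. The extra getBoundingClientRect call near the end is one common way to make the browser register the inverted state before the transition starts; it costs one forced layout at the start of the animation, not one per frame.

```javascript
// Given the card's rectangle before (first) and after (last) the layout
// change, return the transform that makes the card in its final layout look
// as if it were still at its initial position and size. Assumes
// transform-origin is the top-left corner.
function invertTransform(first, last) {
  const dx = first.left - last.left;
  const dy = first.top - last.top;
  const sx = first.width / last.width;
  const sy = first.height / last.height;
  return `translate(${dx}px, ${dy}px) scale(${sx}, ${sy})`;
}

function expandCard(card, applyFinalLayout) {
  const first = card.getBoundingClientRect(); // 1. measure the initial state
  applyFinalLayout(card);                     // 2. jump to the final layout
  const last = card.getBoundingClientRect();  // 3. measure the final state
  card.style.transformOrigin = '0 0';
  card.style.transition = 'none';
  card.style.transform = invertTransform(first, last); // 4. look like the start
  card.getBoundingClientRect();               // flush styles so step 4 takes effect
  card.style.transition = 'transform 300ms ease';
  card.style.transform = 'none';              // 5. animate to the final state
}
```

From step 5 on, only transform changes, so each frame of the animation should need compositing only.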

Handled this way, an animation that would otherwise need top, left, width and height is done entirely with transform, which greatly improves its performance.

Use transform, filter and opacity to complete the animation

Animating with these three properties avoids a repaint on every frame of the animation. If the animation changes any other property, repaints cannot be avoided, so use these properties wherever possible: translate for movement, scale for size, and opacity for fades and color changes.

Here is a case study. The login screen of Instagram’s Android app has a color gradient effect that many people will have seen.

△ Background color gradient effect on the Instagram login page

The effect is quick to implement by continuously changing the background color. Testing shows, however, that low-end devices stutter and CPU usage soars, because changing the background color causes the page to be repainted. To get the same effect without repainting, we can use two divs with different background colors, stacked on top of each other, and animate their opacity. Blending the two divs at varying opacity produces the color transition, and because the whole animation uses only opacity, repainting is avoided entirely.
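
A simplified sketch of the opacity approach. It assumes the two divs are already stacked with different background colors and animates only the upper one, leaving the lower one fully opaque; the function names and the triangle-wave timing are choices made for this example. The raf parameter exists only so the code can be exercised outside a browser. A CSS animation on opacity would be an equally valid way to drive it.

```javascript
// Opacity of the upper div as a function of time: a triangle wave going
// 0 -> 1 -> 0 over one period, so the visible color moves from the lower
// color to the upper color and back.
function topLayerOpacity(elapsedMs, periodMs) {
  const phase = (elapsedMs % periodMs) / periodMs; // 0 <= phase < 1
  return phase < 0.5 ? phase * 2 : (1 - phase) * 2;
}

// Drive the cross-fade: only the opacity of `top` ever changes.
function startCrossFade(top, periodMs, raf = requestAnimationFrame) {
  function frame(now) {
    top.style.opacity = String(topLayerOpacity(now, periodMs));
    raf(frame);
  }
  raf(frame);
}
```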

You can see an example here: gradient with background vs gradient with opacity ()

Do not mix transform, filter and opacity with other properties that may trigger reflow or repaint. Animating with transform, filter and opacity performs well, but if properties that trigger reflow or repaint are mixed into the same animation, that performance is lost.

Use requestAnimationFrame to drive animation

Most of the animations discussed so far use CSS animations and CSS transitions. A CSS animation is usually defined in advance and cannot be controlled flexibly, so sometimes JavaScript is used to drive the animation. Newcomers often reach for setTimeout. The problem is that a setTimeout callback is called whenever the main thread is idle, with no regard for frame boundaries. Imagine the following scenario: the setTimeout callback fires in the middle of a frame and causes styles to be recalculated, resulting in a long frame. setTimeout / setInterval have the following limitations:

  1. They are called even when the page is not visible (wasting power)
  2. Their execution frequency is not fixed (they may fire several times within one frame, causing unnecessary reflows / repaints)

setTimeout / setInterval callbacks are called periodically even when the current page is not active. In addition, because the timing of the calls is not tied to frames, the same callback may run several times within one frame. If the callback triggers a repaint, there will be several repaints in that frame, which is unnecessary and leads to dropped frames.

requestAnimationFrame, an API designed specifically for driving animation, has the following advantages:

  1. Ensure that the callback is called in the next frame
  2. Adjust the execution frequency according to the refresh rate of the machine
  3. The callback is not executed when the current web page is not visible

Although requestAnimationFrame has existed for many years, it is still widely misunderstood. The most serious misunderstanding is that using requestAnimationFrame avoids reflow and repaint, or that the browser applies special optimizations to make the animation smoother. This is wrong. The browser guarantees only the three points above. A forced synchronous layout inside a requestAnimationFrame callback still triggers a reflow.

When writing JavaScript-driven animation, you can use requestAnimationFrame to defer DOM writes to the next frame. DOM reads later in the current frame then do not cause a forced synchronous layout, and the browser only needs to reflow once, at the start of the next frame.
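
A minimal sketch of an animation driven by requestAnimationFrame. The function name slide and its parameters are hypothetical, and the raf parameter exists only so the loop can be exercised outside a browser. The callback writes only transform, in line with the advice above.

```javascript
// Move `el` horizontally from 0 to `distancePx` over `durationMs`.
function slide(el, distancePx, durationMs, raf = requestAnimationFrame) {
  let start = null;
  function frame(now) {
    if (start === null) start = now; // timestamp of the first frame
    const progress = Math.min((now - start) / durationMs, 1);
    el.style.transform = `translateX(${progress * distancePx}px)`;
    if (progress < 1) raf(frame); // schedule the next frame only while running
  }
  raf(frame);
}
```

Because progress is derived from the timestamp rather than from a frame counter, the animation still ends on time if some frames are dropped.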

Handle scrolling events correctly

Modern browsers use a separate thread, called the compositor thread, to handle scrolling and input. It communicates with the GPU, telling it how to move layers in order to scroll the page. If the page has listeners for events such as touchmove and mousemove, the compositor thread has to wait for the main thread to run those listeners, because they may call preventDefault to prevent scrolling.

One of the most important pieces of advice for optimizing scroll, touchmove, mousemove and similar events is to control how often the callbacks for such high-frequency events run. Controlling frequency naturally brings two functions to mind: debounce and throttle. They are often confused, so a brief introduction is worthwhile.

Use debounce or throttle to control the trigger frequency of high frequency events

Debounce and throttle are two similar (but different) techniques for controlling how often a function runs in response to an event.

debounce

Multiple consecutive calls are merged into a single call, made after the last one.

Imagine you are in an elevator and the door is about to close. Another person arrives, and the door reopens. A moment later the door starts to close again, another person arrives, and it reopens once more. The elevator keeps delaying until nobody has arrived for a certain period of time, and only then closes its door.

throttle

Throttle limits a frequently called function to a given call rate. It guarantees that the function is called at most once within a given time window, no matter how often the event fires. For example, while scrolling you may need to check the current scroll position to show or hide a back-to-top button. You can use throttle to limit the scroll callback to one execution every 300ms.

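Pay attention to how these two functions are used. They accept a function and return a throttled or debounced version of it, so it is the returned function that must be registered as the event listener. Calling throttle or debounce inside the listener on every event does not limit anything.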
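
Minimal sketches of both helpers, written for illustration rather than taken from any particular library. This throttle runs on the leading edge only, which is one of several possible designs. In the usage comments, updateButton is a hypothetical handler.

```javascript
// Debounce: run `fn` once, `waitMs` after the last call in a burst.
function debounce(fn, waitMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer); // cancel the pending call, if any
    timer = setTimeout(() => fn.apply(this, args), waitMs);
  };
}

// Throttle: run `fn` at most once per `intervalMs` (leading edge only).
function throttle(fn, intervalMs) {
  let last = -Infinity;
  return function (...args) {
    const now = Date.now();
    if (now - last >= intervalMs) {
      last = now;
      fn.apply(this, args);
    }
  };
}

// Wrong: a new throttled function is created on every event and never
// called, so updateButton never runs and nothing is rate limited.
//   window.addEventListener('scroll', () => throttle(updateButton, 300));
// Right: create the throttled function once and register it as the listener.
//   window.addEventListener('scroll', throttle(updateButton, 300));
```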

Use requestAnimationFrame to schedule scroll event callbacks

If an event listener manipulates the DOM, it may take a long time to run. While the listener is running, the compositor thread that communicates with the GPU is not notified and does not know how to scroll the page, which causes stutter. For such synchronous events (where the browser waits for the listener to finish), read the size and position information you need from the DOM when the event fires, and schedule the DOM writes in requestAnimationFrame. The listener then returns sooner, and the reflow that a later DOM read would otherwise force is avoided.

In addition, sometimes you want the handler to run once in each frame, which throttle cannot guarantee. requestAnimationFrame can ensure that the callback runs in every frame. Note that some events fire several times within one frame, so when using requestAnimationFrame you should check whether a callback has already been scheduled for the current frame.
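
One way to perform that check is a flag that is set when a callback is scheduled and cleared when it runs. The sketch below is an illustration with hypothetical names; the raf parameter exists only so it can be exercised outside a browser.

```javascript
// Wrap `handler` so that it runs at most once per frame, with the most
// recent event, however many times the event fires within that frame.
function oncePerFrame(handler, raf = requestAnimationFrame) {
  let scheduled = false;
  let latestEvent = null;
  return function (event) {
    latestEvent = event;
    if (scheduled) return; // a callback is already queued for this frame
    scheduled = true;
    raf(() => {
      scheduled = false;
      handler(latestEvent);
    });
  };
}

// In a page, with a hypothetical updatePosition handler:
//   window.addEventListener('scroll', oncePerFrame(updatePosition));
```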

Summary

These two articles briefly describe the rendering process of the browser and then, based on how the browser renders, analyze what to watch out for in order to achieve smooth animation, and offer several practical techniques.

However, the rules keep changing as browsers are updated. What was a performance bottleneck a year ago may not be one today. During development, use the debugging tools to analyze each reflow and repaint and the time spent in each stage, and find the real problem, rather than just memorizing rules.

Welcome to share your knowledge with us.