To keep the focus on the core principles, the processes described below have been simplified.
I Complete process
The figure above shows how Android renders a layout with hardware rendering. This process can be simplified into two steps: application-side rendering and system-side rendering.
- Application side
The UI elements are organized in a tree structure. All views of the application window are drawn onto a canvas through the three stages of measure, layout and draw. The result is a DisplayList, a buffer that caches the drawing commands.
This stage runs on the CPU.
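The three stages above can be sketched in plain Java. This is only an illustrative model: `SketchView`, `ViewTreeSketch` and the vertical-stacking layout rule are hypothetical and are not the real `android.view.View` API. The point is that draw records commands into a list instead of touching pixels.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a view; NOT the real android.view.View.
class SketchView {
    final String name;
    final int wantedWidth, wantedHeight;
    final List<SketchView> children = new ArrayList<>();
    int measuredWidth, measuredHeight; // filled in by the measure pass
    int left, top;                     // filled in by the layout pass

    SketchView(String name, int wantedWidth, int wantedHeight) {
        this.name = name;
        this.wantedWidth = wantedWidth;
        this.wantedHeight = wantedHeight;
    }

    SketchView add(SketchView child) { children.add(child); return this; }

    // Measure: children first. Assumed rule: a parent is as wide as its
    // widest child and as tall as its children stacked vertically.
    void measure() {
        int width = wantedWidth, stacked = 0;
        for (SketchView c : children) {
            c.measure();
            width = Math.max(width, c.measuredWidth);
            stacked += c.measuredHeight;
        }
        measuredWidth = width;
        measuredHeight = Math.max(wantedHeight, stacked);
    }

    // Layout: place each child below the previous one.
    void layout(int left, int top) {
        this.left = left;
        this.top = top;
        int y = top;
        for (SketchView c : children) { c.layout(left, y); y += c.measuredHeight; }
    }

    // Draw: record a command for this view, then for its children.
    void draw(List<String> displayList) {
        displayList.add("drawRect " + name + " " + left + "," + top
                + " " + measuredWidth + "x" + measuredHeight);
        for (SketchView c : children) c.draw(displayList);
    }
}

class ViewTreeSketch {
    // Runs measure, layout and draw in order and returns the recorded commands.
    static List<String> renderFrame(SketchView root) {
        root.measure();
        root.layout(0, 0);
        List<String> displayList = new ArrayList<>();
        root.draw(displayList);
        return displayList;
    }

    public static void main(String[] args) {
        SketchView root = new SketchView("root", 0, 0)
                .add(new SketchView("title", 200, 40))
                .add(new SketchView("body", 300, 100));
        renderFrame(root).forEach(System.out::println);
    }
}
```

The recorded list plays the role of the DisplayList: it can be replayed later without running measure and layout again.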
- System side
To improve rendering performance, Android has supported hardware acceleration since version 3.0, handing part of the rendering work to the GPU. The GPU is mainly responsible for rasterization, which can be understood simply as converting an image described in a vector format into a bitmap (pixels) that the display device can output.
Therefore, on the system side, the DisplayList is first converted into OpenGL commands that the GPU understands, and the GPU then uses the OpenGL ES API to render the image to the Surface. What follows can be understood, in simplified form, as a producer-consumer model that exchanges graphics data buffers:
BufferQueue is a queue of graphics data buffers shared between a producer and a consumer. As the producer, Surface gives the application the ability to render images to the screen: the graphics buffer produced by GPU rendering is handed to the BufferQueue through the Surface. As the consumer, SurfaceFlinger receives data buffers from multiple sources. When the BufferQueue notifies it, SurfaceFlinger takes out the available graphics buffers and sends them to HWC, either directly or after compositing them itself, depending on HWC.
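The producer-consumer exchange can be sketched as two small queues. This is a single-threaded toy model: `SketchBufferQueue` is hypothetical, and although the method names follow the usual dequeue/queue/acquire/release vocabulary, the signatures are simplified assumptions and not the platform API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, single-threaded model of a buffer queue.
class SketchBufferQueue {
    private final Deque<int[]> free = new ArrayDeque<>();   // buffers the producer may fill
    private final Deque<int[]> queued = new ArrayDeque<>(); // filled buffers awaiting the consumer

    SketchBufferQueue(int bufferCount, int pixelsPerBuffer) {
        for (int i = 0; i < bufferCount; i++) free.add(new int[pixelsPerBuffer]);
    }

    // Producer side (the Surface in the text): take a free buffer, or null if none is left.
    int[] dequeueBuffer() { return free.poll(); }

    // Producer side: hand a filled buffer over to the consumer.
    void queueBuffer(int[] filled) { queued.add(filled); }

    // Consumer side (SurfaceFlinger in the text): take the oldest filled buffer, or null.
    int[] acquireBuffer() { return queued.poll(); }

    // Consumer side: return a buffer that is no longer needed so it can be reused.
    void releaseBuffer(int[] done) { free.add(done); }
}

class BufferQueueDemo {
    public static void main(String[] args) {
        SketchBufferQueue queue = new SketchBufferQueue(2, 4);
        int[] buffer = queue.dequeueBuffer(); // producer obtains a buffer
        buffer[0] = 0xFF0000;                 // "renders" into it
        queue.queueBuffer(buffer);            // and queues it
        int[] frame = queue.acquireBuffer();  // consumer picks it up
        System.out.println(Integer.toHexString(frame[0]));
        queue.releaseBuffer(frame);           // buffer becomes reusable
    }
}
```

Because the buffers circulate rather than being copied, the producer stalls when no free buffer is left, which is the situation the buffering discussion in Project Butter addresses.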
HWC (Hardware Composer) is the hardware composition module. It composites all layers and writes the result directly to the frame buffer. The frame buffer data is eventually converted into the appropriate signals and sent to the display to produce the final image.
Vsync in the figure above refers to synchronization, not to the Vsync signal in the following figure.
II How the display shows images
The following covers the basic hardware background for how a display shows images.
An image can only be shown through the display control unit of the graphics display subsystem. A common graphics display subsystem is the raster scanning subsystem; this section describes how it puts images on the display.
The raster scanning subsystem has two important components: the frame buffer memory (i.e. video memory) and the display controller.
The frame buffer memory stores the color (or gray level) of each pixel. The display controller can access it directly at any time to refresh the screen. The frame buffer mentioned in the first section lives here.
The display controller can be regarded as a local processor independent of the CPU (although it is in fact still controlled by the CPU). Its main function is to read the image bitmap data in video memory independently and repeatedly according to the configured display mode, convert it into R, G and B color signals, and send these to the display together with synchronization signals in order to refresh the screen.
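What the display controller does in one refresh can be imitated in software. The sketch below assumes each pixel is stored as a packed 0xRRGGBB integer in row-major order; that layout and the name `ScanoutSketch` are assumptions made for illustration, since real hardware formats vary.

```java
// Hypothetical software imitation of one raster scan over the frame buffer.
class ScanoutSketch {
    // Reads the frame buffer row by row, left to right, and splits every
    // pixel into its {R, G, B} components in scan order.
    static int[][] scanOut(int[] frameBuffer, int width, int height) {
        int[][] signals = new int[width * height][];
        int n = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int pixel = frameBuffer[y * width + x]; // assumed packed 0xRRGGBB
                signals[n++] = new int[] {
                    (pixel >> 16) & 0xFF, // red
                    (pixel >> 8) & 0xFF,  // green
                    pixel & 0xFF          // blue
                };
            }
        }
        return signals;
    }

    public static void main(String[] args) {
        int[] frameBuffer = { 0xFF0000, 0x00FF00, 0x0000FF, 0x102030 };
        for (int[] rgb : scanOut(frameBuffer, 2, 2)) {
            System.out.println(rgb[0] + " " + rgb[1] + " " + rgb[2]);
        }
    }
}
```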
Computing pixel data and writing it into the corresponding frame buffer cells is computationally expensive. To reduce the burden on the CPU, a separate display module was introduced: the graphics card. In addition to the display controller and frame buffer memory (video memory), it adds an independent display processor (the GPU) and a display processor memory area (mainly used to hold programs and data temporarily during display processing).
III OpenGL rendering pipeline
Through the OpenGL rendering pipeline, a bitmap of a given size can be sampled and shown on displays of different sizes and resolutions.
The basic flow of the OpenGL rendering pipeline is as follows: rasterization maps primitives to the corresponding pixels on the final screen, and the fragment shader shades each pixel unit to be rendered.
Android aside, if you are interested in learning how to use OpenGL to load pictures, process them (apply various effects) and render them, see this code demonstration (GitHub link).
As for rendering a bitmap: the bitmap holds pixel data, which can be converted into the texture that OpenGL requires; sampling is then performed to render it to the Surface.
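To make "sampling" concrete, here is a CPU-side sketch of nearest-neighbor sampling, which picks for every destination pixel the closest source pixel. In OpenGL this work is done by the GPU's texture units (nearest filtering corresponds to `GL_NEAREST`); the class and method names below are hypothetical.

```java
// Hypothetical CPU imitation of nearest-neighbor texture sampling.
class SamplingSketch {
    // Resamples a srcW x srcH bitmap to dstW x dstH by sampling the source
    // at the center of every destination pixel.
    static int[] sampleNearest(int[] src, int srcW, int srcH, int dstW, int dstH) {
        int[] dst = new int[dstW * dstH];
        for (int y = 0; y < dstH; y++) {
            int sy = Math.min(srcH - 1, (int) ((y + 0.5) * srcH / dstH));
            for (int x = 0; x < dstW; x++) {
                int sx = Math.min(srcW - 1, (int) ((x + 0.5) * srcW / dstW));
                dst[y * dstW + x] = src[sy * srcW + sx];
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] bitmap = { 1, 2, 3, 4 }; // a 2x2 "bitmap"
        int[] scaled = sampleNearest(bitmap, 2, 2, 4, 4);
        for (int y = 0; y < 4; y++) {
            System.out.println(scaled[y * 4] + " " + scaled[y * 4 + 1] + " "
                    + scaled[y * 4 + 2] + " " + scaled[y * 4 + 3]);
        }
    }
}
```

The same bitmap can therefore fill a surface of any size; only the mapping from destination pixel to source pixel changes.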
IV Project Butter
To make Android screen rendering smoother, the display system was reworked starting with Android 4.1 in an effort called Project Butter.
Project Butter introduces three core elements: Vsync, Triple Buffer, and Choreographer.
- Vsync
A timed interrupt mechanism lets the CPU start processing the data of the next frame as soon as the Vsync signal is received (in combination with the Handler's synchronization barrier mechanism), which keeps the frame rate in step with the refresh rate.
It has two advantages. When the frame rate is higher than the refresh rate, the frame data produced by the GPU is held back while it waits for the next Vsync, so every refresh shows genuinely new data and the picture does not tear. When the frame rate is lower than the refresh rate, the priority of rendering tasks can be raised to reduce frame loss.
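The timing can be worked through with a little arithmetic. The sketch below assumes a 60 Hz display, so one frame lasts 1,000,000,000 / 60 = 16,666,666 ns (about 16.6 ms), and it assumes that drawing starts on the first Vsync after the request and that the result is shown on the first Vsync at or after drawing finishes. `VsyncSketch` and its methods are hypothetical names.

```java
// Hypothetical timing model; all times are in nanoseconds.
class VsyncSketch {
    static final long REFRESH_HZ = 60; // assumed refresh rate
    static final long FRAME_NANOS = 1_000_000_000L / REFRESH_HZ; // 16_666_666 ns

    // First Vsync strictly after the given time.
    static long nextVsync(long timeNanos) {
        return (timeNanos / FRAME_NANOS + 1) * FRAME_NANOS;
    }

    // Time at which a frame requested at requestNanos becomes visible,
    // given that drawing it takes drawNanos.
    static long displayedAt(long requestNanos, long drawNanos) {
        long start = nextVsync(requestNanos); // drawing waits for a Vsync
        long done = start + drawNanos;
        // Round up to a Vsync boundary: the finished frame waits for a Vsync too.
        return ((done + FRAME_NANOS - 1) / FRAME_NANOS) * FRAME_NANOS;
    }

    public static void main(String[] args) {
        long request = 5_000_000L; // requested 5 ms into the current frame
        System.out.println(displayedAt(request, 10_000_000L)); // draw fits in one frame
        System.out.println(displayedAt(request, 20_000_000L)); // draw overruns one frame
    }
}
```

With these numbers, the 10 ms draw is shown at the second Vsync (33,333,332 ns) and the 20 ms draw at the third (49,999,998 ns), so an overrun costs exactly one dropped frame.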
- Triple Buffer
Triple buffering mechanism: with double buffering, when the GPU or CPU computation overruns, the screen is showing one buffer and the GPU or CPU still at work occupies the other, so either the CPU or the GPU must sit idle for a frame. Triple buffering fills this gap and further improves smoothness.
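The effect can be reproduced with a small simulation. The model is deliberately crude and entirely an assumption for illustration: the CPU may start a frame only on a Vsync and only if a buffer is free, the GPU processes frames in order after the CPU, one buffer is always on screen, and the display flips to the oldest finished frame on each Vsync. `BufferingSketch` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simulation: counts how many new frames reach the screen.
class BufferingSketch {
    static int framesShown(int bufferCount, int vsyncs, int period, int cpuCost, int gpuCost) {
        int free = bufferCount - 1;                // one buffer is always on screen
        long cpuBusyUntil = 0, gpuBusyUntil = 0;
        Deque<Long> pending = new ArrayDeque<>();  // GPU finish times, oldest first
        int shown = 0;
        for (int k = 0; k < vsyncs; k++) {
            long now = (long) k * period;
            // Display: flip to the oldest finished frame, otherwise repeat the old one.
            if (!pending.isEmpty() && pending.peek() <= now) {
                pending.poll();
                free++;  // the buffer that was on screen becomes free
                shown++;
            }
            // CPU: start the next frame only if it is idle and a buffer is free.
            if (cpuBusyUntil <= now && free > 0) {
                free--;
                cpuBusyUntil = now + cpuCost;
                // The GPU starts once the CPU is done and the previous frame has left the GPU.
                gpuBusyUntil = Math.max(cpuBusyUntil, gpuBusyUntil) + gpuCost;
                pending.add(gpuBusyUntil);
            }
        }
        return shown;
    }

    public static void main(String[] args) {
        // Vsync every 16 time units; CPU 6 + GPU 12 = 18 units, slightly over one frame.
        System.out.println("double: " + framesShown(2, 10, 16, 6, 12));
        System.out.println("triple: " + framesShown(3, 10, 16, 6, 12));
    }
}
```

Under these assumptions, double buffering shows 4 new frames in 10 Vsyncs because the CPU finds no free buffer on every other Vsync, while triple buffering shows 8. In this model the gain comes at the cost of each frame reaching the screen one Vsync later.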
- Choreographer
It receives timing pulses (such as the Vsync signal) from the display subsystem and then schedules the work to render the next display frame. It coordinates three kinds of UI-related work: animation, input and drawing.
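The coordinating role can be sketched as a set of callback queues that are drained in a fixed order on each timing pulse. `SketchChoreographer`, its three callback types and the input, animation, drawing order are a simplified assumption for illustration; the real `android.view.Choreographer` has more callback types and a different API.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical model; NOT the real android.view.Choreographer.
class SketchChoreographer {
    // Declaration order is the order in which callbacks run within a frame.
    enum CallbackType { INPUT, ANIMATION, DRAWING }

    private final Map<CallbackType, List<Runnable>> queues = new EnumMap<>(CallbackType.class);
    private boolean vsyncRequested = false;

    // Queue work for the next frame and note that a Vsync is now wanted.
    void postCallback(CallbackType type, Runnable action) {
        queues.computeIfAbsent(type, t -> new ArrayList<>()).add(action);
        vsyncRequested = true;
    }

    // Called by the (simulated) display subsystem on every Vsync.
    void onVsync() {
        if (!vsyncRequested) return; // nothing asked for this frame
        vsyncRequested = false;
        for (CallbackType type : CallbackType.values()) {
            List<Runnable> due = queues.remove(type);
            if (due != null) due.forEach(Runnable::run);
        }
    }
}

class ChoreographerDemo {
    public static void main(String[] args) {
        SketchChoreographer choreographer = new SketchChoreographer();
        choreographer.postCallback(SketchChoreographer.CallbackType.DRAWING,
                () -> System.out.println("draw"));
        choreographer.postCallback(SketchChoreographer.CallbackType.INPUT,
                () -> System.out.println("input"));
        choreographer.onVsync(); // runs input first, then draw
    }
}
```

Running input before animation before drawing means each frame is drawn from state that already reflects the latest events, regardless of the order in which the work was posted.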
V Trigger of drawing
The first section briefly described how the data of a single frame travels from the application side through the system side to the screen once rendering has started.
Because drawing starts only after a Vsync signal is received, while a drawing request can be made at any time (for example by calling invalidate manually), this section fills in what happens between requesting a draw and the draw actually starting.
- Even when the interface remains unchanged, the underlying layer switches the picture of each frame at a fixed screen refresh interval (for example every 16.6 ms). An app, however, receives a callback and redraws only if it has registered for the next Vsync signal. If the interface remains unchanged, the app does not receive Vsync events, and the CPU/GPU do not go through the drawing process.
- When drawing is requested, the system adds a synchronization barrier that blocks the execution of synchronous messages in the MessageQueue. After that, both when Choreographer registers for the next Vsync signal and when it receives the Vsync signal and runs the callback, it sends asynchronous messages, which are executed with priority.
- Because the synchronization barrier is not removed until the Vsync signal arrives and drawing is carried out, synchronous messages can be delayed by up to one frame. To run a task as soon as possible, we can sometimes send an asynchronous message so that the task runs while waiting for the Vsync signal. Note, however, that an asynchronous task has no guaranteed ordering relative to the drawing task.
- When drawing is requested, the drawing task does not start immediately; it starts when the next Vsync signal arrives. When the CPU and GPU finish drawing, the interface is not refreshed immediately either; it is switched and displayed when the next Vsync signal arrives.
- Synchronization barrier + Vsync does not guarantee that drawing happens at a particular Vsync signal, because a time-consuming synchronous task may already be running before the synchronization barrier is added. There are therefore two causes of frame loss: (1) the main thread is executing a time-consuming task, so the drawing task cannot start; (2) the CPU and GPU take longer than a single frame to draw the view.
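The barrier behavior in the points above can be sketched with a toy queue. `SketchMessageQueue` is hypothetical: it ignores message timestamps, supports a single barrier, and only borrows the `postSyncBarrier`/`removeSyncBarrier` vocabulary. Its rule is that messages ahead of the barrier run normally, and once the barrier reaches the head only asynchronous messages may pass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a message queue with one synchronization barrier.
class SketchMessageQueue {
    static final class Message {
        final String name;
        final boolean async;
        Message(String name, boolean async) { this.name = name; this.async = async; }
    }

    private static final Message BARRIER = new Message("<barrier>", false);
    private final List<Message> queue = new ArrayList<>();

    void post(String name, boolean async) { queue.add(new Message(name, async)); }
    void postSyncBarrier() { queue.add(BARRIER); }
    void removeSyncBarrier() { queue.remove(BARRIER); }

    // Returns the name of the next runnable message, or null if none may run.
    String next() {
        if (queue.isEmpty()) return null;
        if (queue.get(0) != BARRIER) return queue.remove(0).name;
        // The barrier is at the head: skip synchronous messages, take the first async one.
        for (int i = 1; i < queue.size(); i++) {
            if (queue.get(i).async) return queue.remove(i).name;
        }
        return null;
    }
}

class BarrierDemo {
    public static void main(String[] args) {
        SketchMessageQueue queue = new SketchMessageQueue();
        queue.post("slowTask", false);     // already queued before drawing is requested
        queue.postSyncBarrier();           // drawing requested
        queue.post("userTask", false);     // synchronous: must wait
        queue.post("doFrame", true);       // asynchronous: may pass the barrier
        System.out.println(queue.next());  // slowTask
        System.out.println(queue.next());  // doFrame
        System.out.println(queue.next());  // null
        queue.removeSyncBarrier();
        System.out.println(queue.next());  // userTask
    }
}
```

The first `next()` call shows the first cause of frame loss: a task queued ahead of the barrier still runs before the frame callback, however long it takes.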