Haokan Video Android Refactoring: Refactoring Practice Around the Player



Reading guide: The player is the most important component of a short-video application and plays a vital role. For short-video apps, playback performance directly affects the core user experience, so a series of optimizations around the player is a cost-effective technology investment.

The full text is 3680 words and the expected reading time is 12 minutes.

1. Background introduction

As the most important component of a short-video application, the player plays a vital role: playback performance directly affects the core user experience, so optimization around the player is a cost-effective technology investment. After continuous refactoring and optimization, the Android version of Haokan Video now essentially achieves instant start: videos begin playing immediately, and scrolling fluency has also improved noticeably. Let's compare before and after the refactoring.

Before refactoring: an obvious pause before the video starts is easy to observe, especially on mid- and low-end devices.

[Video: playback before refactoring]

After refactoring: the pause when switching videos is almost imperceptible.

[Video: playback after refactoring]

This time I'll share some ideas and experience from refactoring the Android side of Haokan Video, focusing mainly on architecture and performance optimization.

2. History review: the single-player architecture

Haokan Video originated from the hao123 image-and-text information feed in 2016. Over the following four years it established a card-style playback mode within the feed. In Q3 2020, Haokan Video launched an immersive full-screen playback project, which was eventually rolled out fully.

For historical reasons, Haokan Video had always used a single-player architecture: one global player floats on top of all views and moves together with the horizontally scrolling ViewPager and the vertically paging RecyclerView (vertical paging uses PagerSnapHelper).

mViewPager.addOnPageChangeListener(new ViewPager.OnPageChangeListener() {
    @Override
    public void onPageScrolled(int position, float positionOffset, int positionOffsetPixels) {
        // keep the global player aligned with the horizontal page scroll
    }
    // onPageSelected / onPageScrollStateChanged omitted
});

mRecyclerView.addOnScrollListener(new RecyclerView.OnScrollListener() {
    @Override
    public void onScrolled(RecyclerView recyclerView, int dx, int dy) {
        // keep the global player aligned with the vertical list scroll
    }
});

[Figure: the old single-player architecture]

The core trait of the old architecture: the player has a very wide scope and a very long lifecycle (matching the current Activity), while components with narrower scopes and shorter lifecycles directly hold a reference to control and operate the player. A global singleton player sounds logically simple, but its implementation is not, and it has several obvious problems:

1. Severe business coupling and low development efficiency
  • The player and business code are tightly coupled: the core class exceeds 10,000 lines, maintenance cost is high, and the code is extremely unfriendly to newcomers. When the player initializes there are 221 views, the show/hide logic between views is complex, and the brace nesting runs very deep. The feed list carries only the video cover image, so third-party businesses such as ads and live streaming must not only render their holder but also independently create and control a player on a higher layer, which makes the code very complex.
  • Player state control is complex and chaotic: the Activity, Fragment, ViewPager, RecyclerView, RecyclerView.Adapter and RecyclerView.ViewHolder can each directly control the global singleton player. The lifecycle is hard to track, and locating related bugs and user feedback is very difficult.
2. Persistent performance problems
  • Because the player floats on top of all views, any business view that needs to be on top must be re-implemented inside the player, and some views in the RecyclerView.ViewHolder must exist both in the holder and in the player's view tree. On top of the legacy code, a large number of ANRs and stutters during player-view initialization show up online.


The complexity of the player itself made performance optimization difficult and risky. Moreover, for historical reasons, feed scrolling performance under the old architecture was very poor: stutter while scrolling was obvious on mid-range devices, far behind competing products. Take the flame graph as an example:

[Figure: flame graph of the old architecture]

  • Scrolling the feed list must synchronously move the player card (including resetting the player), which artificially slows down start-up.
  • Low-level components need to hold Activity-level handles, which easily causes memory leaks.
  • Businesses that cannot obtain the Activity handle directly dispatch a large number of messages and control logic through EventBus, leading to chaotic playback control (tangled EventBus events, conflicting component-lifecycle events, and so on). EventBus not only aggravates the risk of memory leaks but also causes a series of performance problems.

3. The refactoring project: a multi-player architecture

There's no making without breaking.

We finally decided to refactor the existing code around the player: sink the global singleton player into each holder, making business isolation and flexible player calls easy. While improving the soundness of the architecture (and with it the team's development efficiency), we also solved a number of performance problems. At project-approval time, architecture improvement alone might not have been convincing, but the improvement in basic experience brought by performance optimization should not be underestimated. For the client side, refactoring combined with performance optimization is often the perfect refactoring.

[Figure: the new multi-player architecture]

New architecture: multiple player instances; each holder owns its own player, which scrolls together with the holder; the player and the business are decoupled.

The defining feature of the new architecture is the reduced scope and lifecycle of the player:

  • Player state is managed self-consistently inside the holder. Live streaming, advertising and other businesses can implement their own logic (including playback control) entirely within the holder, reducing useless logic and code coupling.
  • Playback-related events are dispatched through Lifecycle, reducing the dependence on EventBus and the risks of component coupling and memory leaks.
  • Custom PagerSnapHelper and other components centrally optimize the core playback experience, such as feed-list start-up and preloading.
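The lifecycle-scoped dispatch idea can be sketched in plain Java (a hypothetical sketch: the names `PlayerEventHub` and `HolderScope` are ours, standing in for Android's Lifecycle machinery). Listeners register together with a holder's scope, and dead scopes are pruned on dispatch, so a destroyed holder can neither receive events nor be leaked by the bus:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: PlayerEventHub / HolderScope stand in for Android's Lifecycle.
class PlayerEventHub {
    interface Listener { void onPlayerEvent(String event); }

    // Mirrors a holder's lifecycle: once destroyed, its listeners are dead.
    static final class HolderScope {
        private boolean destroyed = false;
        void destroy() { destroyed = true; }
        boolean isDestroyed() { return destroyed; }
    }

    private static final class Entry {
        final HolderScope scope;
        final Listener listener;
        Entry(HolderScope scope, Listener listener) { this.scope = scope; this.listener = listener; }
    }

    private final List<Entry> entries = new ArrayList<>();

    void register(HolderScope scope, Listener listener) {
        entries.add(new Entry(scope, listener));
    }

    // Dispatch only to live scopes and prune dead ones, so nothing leaks.
    void dispatch(String event) {
        for (Iterator<Entry> it = entries.iterator(); it.hasNext(); ) {
            Entry e = it.next();
            if (e.scope.isDestroyed()) {
                it.remove();
            } else {
                e.listener.onPlayerEvent(event);
            }
        }
    }

    int liveListenerCount() {
        int n = 0;
        for (Entry e : entries) if (!e.scope.isDestroyed()) n++;
        return n;
    }
}
```

Unlike a global EventBus, forgetting to unregister here costs nothing: the destroyed scope is dropped on the next dispatch.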

"More, faster, better, cheaper": it is fair to say the new architecture achieves only the first three, but for a client app, deliberately "saving memory" here is not a good idea. One player occupies about 10 MB of virtual memory, and in most cases the app holds 2-3 player instances at once. Trading 20-30 MB of "space" for "time" (development time saved plus performance gains) looks like a clearly profitable deal.

[Figure: flame graph of the new architecture]

Whether judged from the flame graph or from online statistics of frame-drop rate, business stutter and ANR, the new architecture clearly delivers better scrolling performance and experience. Moreover, the remaining time-consuming parts are now comparatively easy to fix or mitigate.

Frame-drop-rate comparison before and after refactoring:

[Figure: frame-drop-rate comparison before and after refactoring]

Optimizing start-up time

Video start-up time is critical for short-video apps: if users must wait through a long buffer before the video starts playing, the probability that they leave the app increases. Under the old architecture, the single player instance made dedicated start-up optimization difficult; the new architecture provides architecture-level support for it.

1. When to create the player

According to the RecyclerView mechanism, while a video is playing, onBindViewHolder is called ahead of time for the holder of the next video to prepare its page and data. So we can initialize the player of the next video to be played in the RecyclerView.ViewHolder's onBind.

@Override
public void onBindViewHolder(@NonNull RecyclerView.ViewHolder holder, int position) {
    if (holder instanceof ImmersiveBaseHolder) {
        // create (but do not start) the player for the next video here
        ((ImmersiveBaseHolder) holder).onBind(getData(position), position);
    }
}
2. When to start the player

Usually, we determine the scroll state of the list in RecyclerView's onScrollStateChanged: when the RecyclerView stops scrolling we start playback and end playback of the previous video.

mRecyclerView.addOnScrollListener(new RecyclerView.OnScrollListener() {
    @Override
    public void onScrollStateChanged(RecyclerView recyclerView, int newState) {
        if (newState == SCROLL_STATE_SETTLING) {
            // the fling has been released and the list is settling on its target page
        }
    }
});

This approach works, but can we start the player even earlier?

While our finger is on the screen, the view follows it up and down. The moment we release the finger, PagerSnapHelper calculates which video to snap to, and based on the velocity and the remaining scroll distance, SmoothScroller performs the inertial scrolling animation. Now consider: at the moment the finger is released, in other words as soon as we know which video will be played next, what happens if we start it right away?

An almost instant start.

From calling prepareAsync on the player to first-frame rendering (the onInfo callback with MEDIA_INFO_VIDEO_RENDERING_START) takes about 300-500 ms, while the time from the finger leaving the screen to the end of the scroll is close to 200-300 ms. Generally, with a start-up time of around 200 ms users essentially perceive playback as instant, so starting playback in advance greatly improves the user experience.

protected LinearSmoothScroller createSnapScroller(RecyclerView.LayoutManager layoutManager) {
    return new LinearSmoothScroller(mRecyclerView.getContext()) {
        @Override
        protected void onTargetFound(View targetView, RecyclerView.State state, Action action) {
            // the key point: by onTargetFound, the snapped-to holder has been located,
            // so the next video's player can start before the scroll animation finishes
            int nextPosition = state.getTargetScrollPosition();
        }
    };
}

Even on devices with merely average performance, a deeper optimization can be considered: immediately after creating the player in onBindViewHolder, call prepare on it, but do not call start. This kind of optimization requires a very firm grasp of the player's lifecycle; handled improperly, it easily leads to multiple videos playing at once or other hidden bugs, so be extra careful.
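The lifecycle discipline this requires can be sketched as a tiny state machine (a hypothetical sketch: the class and method names are ours, and a real implementation would wrap the actual MediaPlayer/ijkplayer calls). `start()` is legal only from `PREPARED`, which rules out double-play and play-after-release by construction:

```java
// Hypothetical sketch of the state discipline behind "prepare early, start late".
class PlayerStateMachine {
    enum State { IDLE, PREPARING, PREPARED, STARTED, RELEASED }

    private State state = State.IDLE;

    State state() { return state; }

    // Called from onBind: kick off prepareAsync, but do NOT start.
    boolean prepareAsync() {
        if (state != State.IDLE) return false;   // illegal: already in flight or released
        state = State.PREPARING;
        return true;
    }

    // Called from the onPrepared callback.
    void onPrepared() {
        if (state == State.PREPARING) state = State.PREPARED;
    }

    // Called only once the snap scroll has settled on this holder.
    boolean start() {
        if (state != State.PREPARED) return false; // guards double-play and play-after-release
        state = State.STARTED;
        return true;
    }

    void release() { state = State.RELEASED; }
}
```

Rejecting the illegal transitions explicitly, instead of trusting call order, is what keeps "two videos playing at once" from ever happening.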

3. Starting playback even earlier

Some readers may ask: why not start playback as soon as the holder appears on screen, for example in the holder's onAttachedToWindow?

That creates a problem: because playback starts too early, by the time the screen stops scrolling the video has already played for 1-2 seconds. This feels strange to users, who then never see the short video from its first frame; the experience gets worse, not better.

But the idea is not useless. It fits one special kind of item in the stream: a live stream inside the feed. Start the live stream in advance (say FLV or RTMP) but keep it muted, and restore the sound only when the scrolling ends. The effect is very good. Live content has two helpful properties: users do not need to watch from the first frame, and a live stream usually starts more slowly than a short video. Starting it in advance is a perfect fit for live streams, and this is how many apps implement it.
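A minimal sketch of the muted early start (the `LivePlayer` interface here is a hypothetical stand-in, not a real player API): the stream starts at volume zero as soon as the holder attaches, and only the volume changes when the scroll settles:

```java
// Hypothetical sketch: LivePlayer is a stand-in interface, not a real API.
class MutedLivePreload {
    interface LivePlayer {
        void start();
        void setVolume(float left, float right);
    }

    private final LivePlayer player;
    private boolean started = false;

    MutedLivePreload(LivePlayer player) { this.player = player; }

    // Called as soon as the live holder attaches to the window:
    // the stream starts rendering, but the user hears nothing mid-fling.
    void onAttachedToWindow() {
        if (started) return;   // idempotent: attach may fire more than once
        player.setVolume(0f, 0f);
        player.start();
        started = true;
    }

    // Called when the snap scroll settles on this holder: only restore the sound.
    void onScrollSettled() {
        player.setVolume(1f, 1f);
    }
}
```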

About the overall benefits of the new architecture

In terms of development efficiency, subjective feedback suggests at least a 20% improvement: code is easier to find and the "historical burden" is much lighter.

In terms of technical indicators, the start-up time perceived by users dropped by a solid 150 ms; as long as the network is not poor, videos start instantly. Follow-up optimization is also simpler, and we keep refining every detail.

In terms of business indicators, retention, videos played per capita and time spent all increased to varying degrees, and commercialization revenue also grew; the business benefit is very clear.

4. On player preloading

"Preloading" means downloading a certain length of the video in advance. When the video is about to play, the player only needs to download a small remainder, or can even start playing immediately.

1. How much to preload

Load too little and the instant-start effect is lost entirely; load too much and bandwidth is wasted (though if a user is very active and keeps swiping, the next video can even be preloaded in full). Generally, 300-500 KB is a common choice.

$ pip install qtfaststart
$ qtfaststart -l "used to be you.mp4"

ftyp (32 bytes)
moov (6891 bytes)
free (8 bytes)
mdat (3244183 bytes)

For a short video of ordinary length, 300 KB can contain quite a few frames of data. As shown above, the file header takes less than 100 KB. Now let's look at the frames.

$ ffprobe "used to be you.mp4" -show_frames | grep -E 'pict_type|coded_picture_number|pkt_size'

The first frame of the video is a key frame (I-frame) of about 30 KB; B-frames and P-frames are comparatively small, so it is easy to estimate that 300 KB can render a good number of frames, which basically meets our needs. If the video is long or the bitrate is high, the preload length should be increased accordingly. The best scheme is for the backend transcoding service to compute this value in advance, with the client loading the recommended size.

For the sake of playback fluency and audio-video synchronization, most players keep a local buffer before starting. Some kernels gate on frame count (say 20 frames), others on time (say 1-2 seconds), so playback does not necessarily start at exactly 300 KB. The precise value must be measured locally, and if necessary the player-kernel configuration adjusted. In most cases, with 300 KB buffered, Haokan Video's in-house player starts playing smoothly.
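Until the backend supplies a per-video recommended value, the client-side fallback amounts to simple arithmetic: container header plus enough media bytes to fill the kernel's start-up buffer, floored at the 300 KB default. A hypothetical sketch (the names are ours):

```java
// Hypothetical client-side fallback; the 300 KB floor follows the text above.
final class PreloadSize {
    private PreloadSize() {}

    static long estimateBytes(long headerBytes, long bitrateBitsPerSec, double bufferSeconds) {
        // media bytes needed to fill the kernel's start-up buffer
        long mediaBytes = (long) (bitrateBitsPerSec / 8.0 * bufferSeconds);
        return Math.max(headerBytes + mediaBytes, 300 * 1024L);
    }
}
```

For example, a ~100 KB moov at 1 Mbps with a 1.5 s buffer still lands on the 300 KB floor, while a 4 Mbps video with a 2 s buffer needs roughly 1.1 MB.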

2. When to preload

If preloading starts too late, it has little effect; if too early, it competes with the currently playing video for precious bandwidth, which can produce diminishing returns on start-up speed and even serious stuttering. Imagine quickly swiping through videos A, B and C: A and B are preloaded, C is playing, and D below C is also being preloaded. C's start-up time will not decrease; it will increase!

We must obey one principle: preloading must never affect playback of the current video. A simple scheme is to preload the next video only after the current one has buffered to a certain proportion. More refined schemes exist, such as dynamically comparing the buffered progress (the onBufferingUpdate callback) against the playback progress (getCurrentPosition): if the buffered time is enough to cover subsequent playback, the next video's preload can start earlier; and once the current video has finished playing (the onCompletion callback), you can safely load a larger length of the next one.
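The "buffered runway" check described above reduces to a single comparison. A hypothetical sketch (the names and the 5-second margin used in the example are ours, not the article's):

```java
// Hypothetical sketch of the "preloading must never hurt the current video" rule.
final class PreloadPolicy {
    private PreloadPolicy() {}

    /**
     * @param bufferedMs     buffered duration of the current video (from onBufferingUpdate)
     * @param positionMs     current playback position (from getCurrentPosition)
     * @param safetyMarginMs runway the current video must keep before preloading may start
     * @param completed      true once onCompletion fired for the current video
     */
    static boolean mayPreloadNext(long bufferedMs, long positionMs,
                                  long safetyMarginMs, boolean completed) {
        if (completed) return true;                        // current video can no longer stall
        return bufferedMs - positionMs >= safetyMarginMs;  // enough buffered runway left
    }
}
```

Evaluated on every buffering callback, this keeps preloading strictly behind the needs of the playing video.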

In addition, once a video is swiped away, its player and preload should stop immediately to release the now-useless bandwidth; otherwise, under network jitter, preload contention will significantly worsen the stutter.

As the rollout of Haokan Video's new architecture gradually ramped up, the playback stutter rate rose considerably. After an urgent investigation, we focused suspicion on the video preloading strategy. After re-examining and pinning down the preload details, the stutter rate fell back to its previous level while users' perceived start-up time did not regress: a great result.

3. About the preloading library AndroidVideoCache

This library has not been updated in recent years; it has many bugs and open issues and is not suitable for a production environment. However, it is a good resource for learning the design and implementation of a preload library. Its design confirms a well-known adage in computing:

Any problem in computer science can be solved by adding another level of indirection.

AndroidVideoCache sits as a middle layer between the player and the remote resource (CDN): on one side it caches video from the remote server locally; on the other, it runs a local server that responds to the player's requests.

[Figure: AndroidVideoCache as a local proxy between the player and the CDN]

However, the library does not support preloading a fixed number of bytes, only full downloads. We can implement a simple partial-preload capability ourselves first.

// additions to AndroidVideoCache's proxy server class; for learning only, not production-ready
static final int PRELOAD_CACHE_SIZE = 300 * 1024;

public void preload(Context context, String url, int preloadSize) {
    socketProcessor.submit(new PreloadProcessorRunnable(url, preloadSize));
}

private final class PreloadProcessorRunnable implements Runnable {
    private final String url;
    private int preloadSize = PRELOAD_CACHE_SIZE;

    public PreloadProcessorRunnable(String url, int preloadSize) {
        this.url = url;
        this.preloadSize = preloadSize;
    }

    @Override
    public void run() {
        processPreload(url, preloadSize);
    }
}

private void processPreload(String url, int preloadSize) {
    try {
        HttpProxyCacheServerClients clients = getClients(url);
        clients.processPreload(preloadSize);
    } catch (ProxyCacheException | IOException e) {
        // log and swallow: a failed preload must never affect playback
    }
}

public void stopPreload(String url) {
    try {
        HttpProxyCacheServerClients clients = getClientsWithoutNew(url);
        if (clients != null) {
            clients.shutdown();
        }
    } catch (ProxyCacheException e) {
        // ignore
    } catch (Exception e) {
        // ignore
    }
}

// in the per-url clients class
public void processPreload(int preloadSize) throws ProxyCacheException, IOException {
    startProcessRequest();
    try {
        proxyCache.processPreload(preloadSize);
    } finally {
        finishProcessRequest();
        ProxyLogUtil.d(TAG, "processPreload finishProcessRequest");
    }
}

// in the proxy-cache class: read from the source until preloadSize bytes are cached
public void processPreload(int preloadSize) throws IOException, ProxyCacheException {
    long cacheAvailable = cache.available();
    if (cacheAvailable < preloadSize) {
        byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
        int readBytes;
        long offset = cacheAvailable;
        while ((readBytes = read(buffer, offset, buffer.length)) != -1) {
            offset += readBytes;
            if (offset > preloadSize) {
                break;
            }
        }
        ProxyLogUtil.d(TAG, "preloaded url = " + source.getUrl()
                + ", offset = " + offset + ", preloadSize = " + preloadSize);
    }
}
// for learning only, not applicable to a production environment

This gives us basic preload and stopPreload capabilities. The implementation of this library is not complex; given time and manpower, we could develop a basic library of our own suited to the business.

Recently the concept of "on-device intelligence" has become very popular. Compared with the various attempts to enhance recommendation quality, such as on-device click-through-rate (CTR) estimation, tuning the player's preloading strategy may be the easiest entry point for on-device intelligence. And unlike the former, whose gains could arguably also be obtained on the recommendation and algorithm side, the latter can in theory truly deliver "more, faster, better, cheaper": significantly improving perceived start-up speed while guaranteeing no regression in stutter and no excessive bandwidth cost from preloading.

5. On player-induced stutter

For apps with many playback scenes, the player is bound to figure prominently in ANR and stutter data. Taking ijkplayer as an example, the release function is time-consuming; you can see it plainly just by capturing a trace.

[Figure: trace showing the time cost of release]

Creating and destroying the player are time-consuming operations that easily block the main thread, causing stutter and even ANR. The most intuitive fix is to execute release on a worker thread, and the effect is immediate. However, in an application with over a million daily active users, this adds many native errors such as SIGABRT and greatly raises the crash rate. The reason is simple: with the main thread and a worker thread operating the same player concurrently, thread conflicts are inevitable; one thread has already released the player while another is still calling its interface.

1 long ijkmp_get_duration(IjkMediaPlayer *mp)
2 {
3     assert(mp);
4     pthread_mutex_lock(&mp->mutex);
5     long retval = ijkmp_get_duration_l(mp);
6     pthread_mutex_unlock(&mp->mutex);
7     return retval;
8 }

Using addr2line or ndk-stack, we located a large number of crashes at line 5, where mp is a null pointer. It is not hard to guess why: since the release build omits the line-3 assert and crashes later instead, there must be a threading problem outside the protection of this lock. A simple mitigation is to add a null check, but even that cannot completely eliminate the crash.

static long ijkmp_get_duration_l(IjkMediaPlayer *mp)
{
    if (mp == NULL) {
        return 0;
    }
    return ffp_get_duration_l(mp->ffplayer);
}
// note: this scheme still has thread conflicts

The more elegant, complete solution is:

1. Put all operations on the same player, including creation and destruction, on the same worker thread; a HandlerThread is recommended.

2. Add a separate flag on the business side, such as isPlayerReleased; set it to true before the player is destroyed, and directly ignore all subsequent operations on the player.

Note that besides the business side actively calling the player, ijkplayer itself also has a thread driving the kernel, and the player periodically calls back into the business layer on its own.
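The two rules above can be sketched in plain Java (a hypothetical sketch: `SafePlayer` is our stand-in for a player wrapper, and a single-threaded executor plays the role of the Android HandlerThread the article recommends):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: SafePlayer stands in for a real player wrapper.
class SafePlayer {
    private final ExecutorService playerThread = Executors.newSingleThreadExecutor();
    private final AtomicBoolean released = new AtomicBoolean(false);
    private long durationMs = 0;   // touched only on playerThread

    // Rule 1: every operation funnels through the same thread, so create/release
    // can never race with calls like get_duration.
    private void post(Runnable op) {
        if (released.get()) return;            // rule 2: ignore calls after release
        playerThread.execute(() -> {
            if (!released.get()) op.run();     // re-check on the player thread
        });
    }

    void prepare(long fakeDurationMs) { post(() -> durationMs = fakeDurationMs); }

    void release() {
        released.set(true);                    // rule 2: reject all later calls first
        playerThread.execute(() -> durationMs = 0);  // the native release itself still runs
        playerThread.shutdown();
    }

    long durationMsForTest() {
        try {
            return playerThread.submit(() -> durationMs).get();
        } catch (Exception e) {
            return -1L;
        }
    }
}
```

After release(), any stray call simply becomes a no-op instead of a native crash; the FIFO ordering of the single thread replaces the mutex dance in the C code above.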

// ijkplayer's Java-layer event handler, which posts kernel callbacks back to the app
private static class EventHandler extends Handler {
    @Override
    public void handleMessage(Message msg) {
        switch (msg.what) {
            case MEDIA_PREPARED:
                // ...
                break;
            case MEDIA_BUFFERING_UPDATE:
                long bufferPosition = msg.arg1;
                if (bufferPosition < 0) {
                    bufferPosition = 0;
                }
                // ...
                break;
        }
    }
}

The isPlayerReleased check must be added here as well. After recompiling the player kernel, player-related crashes online have almost disappeared.

6. The significance of architecture and performance optimization

Large-scale performance optimization can significantly improve the user experience and, further, the core business metrics, especially user growth and monetization. Raw technical indicators such as cold-start speed and scrolling fluency cannot convince everyone, but if we can translate them into gains in business metrics, acceptance improves greatly. For example, start-up-speed optimization often improves retention, and fluency optimization may lift consumption metrics. Since technology serves the business, technology can very likely prove its value with data. If we work on an e-commerce app, we might raise order conversion by a tenth of a percent; on a feed or news app, we might increase videos viewed per capita, article reads and time spent, and may even directly raise commercialization revenue through greater exposure, consumption and usage time.

So we might as well set technology aside for a moment, regard ourselves as product managers, and treat a technical optimization like an ordinary business-iteration requirement: carefully design the A/B experiment, proactively add the experiment switch in the code, keep an eye on the experiment dashboard, check user feedback at any time, and report the gains with business data. Even if we cannot prove the user experience clearly improved, we can at least prove it did not get worse. Once the benefits of technical optimization take root in people's minds, even small optimizations immediately have a legitimate reason to exist.

Of course, not every optimization suits an A/B experiment. If business iteration is very frequent and the optimization cannot be completed quickly, endless, merciless code conflicts await us. "Bold hypotheses, careful verification": whenever the opportunity is right, we should design the A/B experiment and keep watching its dashboard and data, whether we are business, performance or architecture engineers.


———- END ———-

Baidu Geek Talk

The official Baidu technology account is now online.
