Short-term surge or future trend? The technology behind Tencent Cloud’s large-scale audio and video communication services


In this special period, remote work and “classes suspended, learning continues” have become unavoidable choices. Many video conferencing platforms have opened their services and features free of charge for a limited time. This has inevitably brought a sudden, large-scale surge in online video and collaboration demand and traffic. Facing the challenges of high concurrency, high availability and high performance, how does the technology behind these platforms hold up? Where is the trend heading? We invited Li Yutao, general manager of Tencent Cloud’s video communication business, to share Tencent Cloud’s technical optimizations in encoding and decoding, video network transmission and other areas.

By Tommy

Planning / LiveVideoStack

LiveVideoStack: Hello Tommy. It has been over a year since your last interview. What changes have taken place in Tencent’s video cloud products?

Li Yutao: Video cloud has changed a lot this year. First, on the business side, we now serve most of the live broadcast and on-demand customers in the market, and business volume has increased significantly. At present, overall bandwidth and the duration of audio and video calls have both roughly doubled.

In terms of the product matrix, while continuing to strengthen our traffic-based PaaS platform, we have launched a number of new products: a fast live broadcasting product with lower latency, the slow live broadcasting product used by the Leishenshan “cloud supervisors”, the Tencent Cloud Scissors and cloud directing station products for content production, and live SaaS products for enterprise training and commercial live broadcast scenarios. We have also upgraded our video AI capabilities into the “video intelligence” product family.

Within Tencent, video cloud has become the middle platform for the company’s video business. Most video products, such as WeChat, Enterprise WeChat, Tencent Conference and Tencent Education, already run on our video platform. Internally, we also actively promote open-source collaboration projects on video to improve efficiency and share technology. In addition, we are promoting open-source communities for next-generation audio and video coding, audio and video transmission protocols, and audio and video pre-processing algorithms.

Integrating live broadcast technology with the WeChat ecosystem to open up full-platform PaaS + SaaS services

LiveVideoStack: We have observed that mini programs now support launching courses, which greatly simplifies live teaching, and that users can quickly join Tencent Conference meetings from them. Beyond that, what can we expect from mini program multimedia services?

Li Yutao: The mini program cloud live broadcast solution combines Tencent Cloud live broadcast technology with the strengths of the WeChat ecosystem. It provides cloud live broadcast capability and live broadcast PaaS services across the whole platform, injects live broadcast capability into various business scenarios, and helps enterprises launch mini program live broadcasts quickly. We provide a mini program live broadcast plug-in officially certified by WeChat; by importing the plug-in, an enterprise can quickly and conveniently embed live broadcast capability into its own WeChat mini program. During the epidemic, live broadcasting went beyond the earlier pan-entertainment scenarios and entered people’s work and daily life.

In addition, we worked with our partners to launch a SaaS solution for enterprise internal training and similar scenarios: Tencent yunhuanju live broadcast. It gives enterprises a lighter, one-stop mini program SaaS live service, helps them build multi-dimensional portraits of their audience, and improves the results of online training and sales follow-up. During a live broadcast, PPT slides and other documents can be shown to the audience at the same time, and red envelopes and other functions make the session more interactive. After the broadcast, a playback video is generated automatically, and the PPT presentation plays in sync with the video during playback.

LiveVideoStack: Will Tencent Cloud’s video cloud capabilities, such as mini program live broadcast and H5 live broadcast, be opened to overseas users?

Li Yutao: At present, Tencent’s video cloud capabilities are already open to overseas users. With multi-center deployment, central data centers have been built in Hong Kong (China), Thailand, Singapore, Germany, Toronto, Silicon Valley, Russia, South Africa and South Korea, and construction is gradually expanding to cover more countries and regions.

To cope with complex domestic networks in overseas countries and the uneven quality of cross-border links, and to reduce stalling and provide stable, reliable service, Tencent Cloud has optimized the architecture, network, security, resources and other aspects of its overseas live broadcast scenarios.

Backed by Tencent Cloud’s overseas expansion strategy and long-term overseas investment, we have built more than 1,300 transmission nodes in more than 50 countries and regions, with a total bandwidth reserve of more than 100 Tbps, partnerships with more than 50 global operators, and more than 200 overseas acceleration points.

If you want to try it or use it, you can enable it directly in the cloud live broadcast console.

Tencent’s self-developed network transmission protocol + v265 coding technology comprehensively improve user experience

LiveVideoStack: Tencent Conference launched in December last year, and in early February it faced the challenge of “the whole country working remotely”. How did you give Tencent Conference better support in such a short time while ensuring that the quality of other services did not drop? What experience can you share?

Li Yutao: To ensure audio and video transmission quality for multi-terminal access in complex network environments, Tencent Conference uses the cloud flow control engine technology that we have built up over a long time in the audio and video field. It combines classic signal processing, psychoacoustics and deep learning theory to address the complex characteristics of the end-to-end audio and video communication link, and it inherits the key technologies in detection, route selection, scheduling and transmission accumulated over many years in large-scale services such as WeChat, QQ and Honor of Kings. The audio and video network environment between the cloud and the user terminal is probed and evaluated in real time, and the optimal network path is selected for transmission.
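
As a rough illustration of “probe in real time, then select the optimal path”, the sketch below scores candidate paths from probe measurements and picks the lowest-cost one. The `PathStats` fields, the weights and the function names are hypothetical choices made for this example; they are not taken from Tencent’s flow control engine.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PathStats:
    """Probe results for one candidate path (hypothetical fields)."""
    name: str
    rtt_ms: float      # measured round-trip time
    loss_rate: float   # fraction of probe packets lost, 0.0 to 1.0
    jitter_ms: float   # variation in one-way delay


def path_cost(p: PathStats, loss_weight: float = 1000.0,
              jitter_weight: float = 2.0) -> float:
    # Lower is better. The weights are illustrative assumptions:
    # 1% loss is penalized like 10 ms of extra RTT.
    return p.rtt_ms + loss_weight * p.loss_rate + jitter_weight * p.jitter_ms


def select_path(paths: Sequence[PathStats]) -> PathStats:
    """Return the candidate with the lowest cost."""
    if not paths:
        raise ValueError("no candidate paths")
    return min(paths, key=path_cost)


candidates = [
    PathStats("direct", rtt_ms=180.0, loss_rate=0.05, jitter_ms=30.0),
    PathStats("relay-a", rtt_ms=120.0, loss_rate=0.01, jitter_ms=10.0),
    PathStats("relay-b", rtt_ms=90.0, loss_rate=0.08, jitter_ms=25.0),
]
best = select_path(candidates)
print(best.name)
```

Note that `relay-b` has the lowest RTT but is not chosen, because its loss and jitter push its cost above that of `relay-a`. In a real system the probes would be refreshed continuously and the selection repeated as conditions change.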

At the same time, the network transmission protocol and v265 coding technology developed by Tencent reduce the packet loss rate during low-frequency video transmission, while keeping video sharp in screen sharing and across different terminals, which protects the user’s conference experience.

For quality evaluation, Tencent Conference also uses hundreds of indicators from Tencent Multimedia Laboratory that conform to ITU / 3GPP / AVS and other domestic and international standards. To measure QoE better, Tencent Multimedia Laboratory built a large-scale subjective quality database for audio and video, developed objective quality evaluation algorithms based on that data, and deployed them to the business lines to monitor user experience quality across the whole network in a closed loop.

In addition, users generally notice less environmental noise when using Tencent Conference. This also comes from Tencent Cloud’s accumulated audio and video technology in noise reduction. Given how varied remote meeting environments are, Tencent Conference targets the 3A problem (noise reduction, echo suppression, gain control) in multi-person, multi-scenario real-time communication systems, so that participants are shielded from noise and get a focused meeting environment.

LiveVideoStack: How does the Tencent Cloud video cloud team cope with the sharp increase in usage? For example, do you use “degraded services” to protect basic service for users, such as replacing low-latency WebRTC lines with higher-latency RTMP?

Li Yutao: Tencent video cloud live provides users with a large integrated solution: TRTC (Tencent Real-Time Communication) + WebRTC fast live broadcast (uplink via RTMP push or FLV / HLS / RTMP origin pull, downlink via standard WebRTC protocol output) + CDN (FLV / HLS / DASH). For scenarios such as education, conferencing and interaction, TRTC access (global delay < 300 ms) is preferred. At the same time, TRTC mixes the streams in real time and pushes them to the live broadcast platform over RTMP. When online concurrency exceeds a configured threshold (for example, 1 million), users can have part of the traffic switched automatically to WebRTC fast live broadcast (delay of about 500 ms) or to ordinary CDN access (the delay depends on the user’s GOP and CDN buffer configuration and is generally about 2 to 5 seconds). Video cloud CDN has built more than 1,300 transmission nodes in more than 50 countries and regions around the world, with a total bandwidth reserve of more than 100 Tbps, partnerships with more than 50 global operators, and more than 200 overseas acceleration points.
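
The threshold-based switching described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Tier` type, the `assign_tier` function and its parameters are hypothetical names, and only the delay figures and the example threshold of 1 million come from the interview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    typical_delay: str


# Delay figures as quoted in the interview.
TRTC = Tier("TRTC", "< 300 ms")
WEBRTC_FAST_LIVE = Tier("WebRTC fast live broadcast", "about 500 ms")
CDN = Tier("CDN (FLV / HLS / DASH)", "about 2-5 s")


def assign_tier(current_concurrency: int,
                threshold: int = 1_000_000,
                overflow: Tier = WEBRTC_FAST_LIVE) -> Tier:
    """Pick the access tier for the next viewer who joins.

    Below the threshold, viewers get the lowest-latency tier (TRTC).
    At or above it, new viewers are sent to the configured overflow tier.
    """
    if current_concurrency < 0:
        raise ValueError("concurrency cannot be negative")
    if current_concurrency < threshold:
        return TRTC
    return overflow


for viewers in (10_000, 999_999, 1_000_000, 3_000_000):
    tier = assign_tier(viewers)
    print(f"{viewers:>9} online -> {tier.name} ({tier.typical_delay})")
```

Passing `overflow=CDN` models a configuration that trades latency for capacity once the threshold is reached.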

LiveVideoStack: I know that some customers are already testing the SRT solution and WebRTC CDN products of Tencent video cloud. What are the advantages of SRT and WebRTC CDN compared with QUIC, RTMP and low-latency HLS?

Li Yutao: Compared with QUIC, SRT is specifically optimized for live scenarios. In transmission control it better combines real-time bit rate estimation with pacing, that is, the calculation of the sending interval. In addition, SRT can be configured to allow packet loss during transmission. Tencent video cloud supports selective packet dropping according to the coding characteristics of the audio and video, which keeps playback as smooth as possible without affecting the frame rate or the picture quality. Compared with RTMP and HLS, which are based on TCP, SRT effectively solves the problems of high delay and poor jitter resistance on long-distance links. In our tests, SRT shows clearly lower delay and less stalling than RTMP. In terms of extensibility, our SRT support covers all the TCP-based protocols, including RTMP / FLV / HLS, whereas QUIC is mainly used for FLV and HLS.
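
One way to picture “selective packet dropping according to coding characteristics” is shown below. This sketch is not SRT’s API and not Tencent’s implementation. It assumes a simplified frame model in which B-frames are non-reference frames that no other frame depends on, so they are the only ones discarded when a send queue exceeds its byte budget; the `Frame` type and `drop_to_budget` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    seq: int
    kind: str        # "I", "P", "B" or "audio"
    size_bytes: int


def drop_to_budget(queue: List[Frame], budget_bytes: int) -> List[Frame]:
    """Return the frames to send, oldest first.

    While the queue is over budget, drop the oldest B-frames.
    I-frames, P-frames and audio are never dropped, so the result
    may still exceed the budget if dropping B-frames is not enough.
    """
    total = sum(f.size_bytes for f in queue)
    dropped = set()
    for f in queue:
        if total <= budget_bytes:
            break
        if f.kind == "B":  # assumed non-reference in this simplified model
            dropped.add(f.seq)
            total -= f.size_bytes
    return [f for f in queue if f.seq not in dropped]


queue = [
    Frame(0, "I", 5000),
    Frame(1, "B", 800),
    Frame(2, "B", 800),
    Frame(3, "P", 2000),
    Frame(4, "audio", 200),
    Frame(5, "B", 800),
]
kept = drop_to_budget(queue, budget_bytes=8500)
print([f.seq for f in kept])
```

Dropping by frame type rather than at random keeps every frame that later frames reference, so the decoder never has to wait for the next keyframe to recover.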

LiveVideoStack: What’s new in video codecs?

Li Yutao: In September 2019, Tencent officially announced that it would join the Alliance for Open Media (AOMedia) to promote the commercialization of the AV1 video standard, becoming a board member and the only Chinese enterprise on the board. Recently, Tencent video cloud live (FLV / HLS / DASH) and VOD have added support for the AV1 / AVS2 standards. It is reported that Tencent Cloud is also the first public cloud vendor in China to support both live and on-demand video processing services. Tencent video cloud’s VVC / AVS3 support is in the development and engineering verification stage and will be opened to customers in 2020.

LiveVideoStack: On deep learning, from AI-enhanced codecs to content understanding and automatic editing, what is Tencent video cloud doing?

Li Yutao: Tencent video cloud’s Bright Eyes · Super HD intelligent dynamic coding technology draws on our accumulated work in intelligent scene recognition, industry-leading audio and video coding, image deep learning and image quality enhancement. It applies dynamic perceptual coding and image quality enhancement to video, improves the viewing experience, and delivers higher-definition streaming at a lower bit rate for live broadcast, on-demand and other industries. Compared with open-source software at the same image quality, Super HD saves more than 30% of bandwidth cost in on-demand services and more than 40% in live broadcasting. Alternatively, at the same bandwidth, it provides a sharper, higher-quality picture, so viewers can feel the change in “quality”. Super HD’s AI enhancement technologies, including 4K super-resolution, HDR, color enhancement, intelligent bullet screens and frame interpolation, have been successfully applied by customers in game live broadcasts, HD radio and television channels, and 4K / 8K live sports events.

Based on the latest research from Tencent’s laboratories, Tencent video cloud provides video content understanding, intelligent recognition, intelligent editing, intelligent moderation and other functions for radio and television new media, education, live broadcast and online video scenarios. Examples include automatic editing of CCTV’s coverage of the 70th anniversary National Day parade, real-time automatic generation of highlights from live game broadcasts, and real-time voice / OCR recognition in live streams with real-time moderation of pornographic, violent and politically sensitive content. Combining video content understanding with the atomic capabilities of editing, recognition and moderation, Tencent video cloud recently launched the Tencent Cloud Scissors and cloud directing station products. During the epidemic, they give online video creators and institutions, including media platforms, PGC / UPGC, MCNs, live platforms and e-sports content producers, a complete pipeline solution covering steps such as editing and stream pushing.

The next big opportunity for audio and video: a comprehensive shift from offline to online?

LiveVideoStack: From live broadcast, short video and live quiz shows to online education, video conferencing and enterprise collaboration, multimedia applications have spread from the Internet into industry, and from 2C to 2B. Where do you think the next big opportunity is?

Li Yutao: In the short term, as I mentioned earlier, the epidemic is accelerating the move of offline business, offline content and offline commerce online, including office collaboration, audio and video communication, video marketing and entertainment. One data point: over the past month, Tencent Cloud’s voice and video calls have averaged 3 billion minutes per day. VR had not really taken off before, but as offline businesses migrate online, scenarios such as VR house viewing and car viewing can become practical. There will also be more and more intelligent content production, virtual anchors and intelligent customer service based on audio and video AI.

In the medium and long term, we are optimistic about online office collaboration and online education. After working and attending classes from home during this period, people have felt how much technology helps in an emergency and have experienced increasingly rich and stable audio and video communication services, which can replace traditional offline communication to a certain extent. I believe that in the future people will come to accept this online way of learning and working.