Introduction
In recent years, the rapid development of the online education industry has made knowledge more accessible than ever. Through online education platforms of various forms, students and teachers can carry out teaching activities even when they are thousands of miles apart. With the help of rich online courseware, students can learn anytime and anywhere, truly breaking the constraints of time and space. Among the various forms of online courseware, video is naturally the most intuitive and expressive, so its market share keeps growing year by year.
Demand analysis of video processing
Video courseware producers in the online education field have to process large volumes of video content every day. The following figure shows a typical scenario:
(1) After a user uploads a video to the platform, the source file is temporarily stored in object storage.
(2) The platform preprocesses the video and adds a watermark.
(3) The platform transcodes the video into other formats and adjusts its resolution to fit different terminal devices.
(4) The processed files are saved back to object storage and synchronized to a CDN for accelerated delivery.
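The watermarking and transcoding steps above typically boil down to FFmpeg invocations. The sketch below builds such commands in Python; the file names, watermark image, and FFmpeg flags are illustrative assumptions, not details from the article's platform.

```python
import shlex

def watermark_cmd(src: str, logo: str, dst: str) -> list:
    # Step (2): overlay a logo image in the top-left corner (10 px margin).
    return ["ffmpeg", "-i", src, "-i", logo,
            "-filter_complex", "overlay=10:10", "-y", dst]

def transcode_cmd(src: str, dst: str, width: int, height: int) -> list:
    # Step (3): re-encode to H.264 at the requested resolution.
    return ["ffmpeg", "-i", src,
            "-vf", f"scale={width}:{height}",
            "-c:v", "libx264", "-y", dst]

if __name__ == "__main__":
    # In a real pipeline these would be passed to subprocess.run().
    print(shlex.join(watermark_cmd("in.mp4", "logo.png", "marked.mp4")))
    print(shlex.join(transcode_cmd("marked.mp4", "out.mp4", 1280, 720)))
```

Building the argument list separately from executing it makes each step easy to log and unit-test before it ever touches a real video file.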
Although the process looks simple, the technical challenge is considerable. The authors of video courseware come from the platform's broad user base: they may be in-house content producers, contracted teachers, or certified content-sharing users. Uploads arrive at no fixed frequency and tend to concentrate in a few time periods, so there are obvious peaks and troughs. At business peaks the demand for video processing is huge; some online education companies need to complete tens of thousands of transcoding jobs every day. For the technical team responsible for building the video processing system, such a business scenario poses a series of challenges:
(1) How to ensure high availability of the system during business peaks?
(2) How to get every uploaded video processed as soon as possible?
(3) How to keep resource costs as low as possible?
(4) How to respond efficiently to frequently changing requirements?
Based on these demands, let us analyze the feasible solutions in light of the characteristics of cloud computing.
Using a SaaS cloud service for video processing
As the product lines of the major cloud vendors keep expanding, it is easy to find out-of-the-box solutions for such typical video processing needs. Taking Alibaba Cloud as an example, its VOD (video-on-demand) product provides a one-stop solution that integrates video capture, editing, uploading, media asset management, transcoding, video review and analysis, and distribution acceleration.
For the technical team, this means a complete video processing system can be built from scratch without preparing any computing resources in advance, or even writing any code, and without worrying about resource planning. Such a solution is well suited to getting a system online quickly in the early stage of a business.
However, as the business grows, an out-of-the-box SaaS solution shows many limitations. For the following reasons, most technical teams eventually choose to build their own video processing system:
(1) Video processing services already implemented with FFmpeg are difficult to migrate directly to a SaaS solution because of their complex business logic.
(2) Advanced video processing needs can only be met with code, such as audio noise reduction, animated GIF watermark insertion, and frame capture at a fixed frequency.
(3) Large, high-resolution videos are an industry trend. Processing a large video, such as a 1080p file of more than 10 GB, often requires custom computation and optimization to keep processing timely.
(4) In many scenarios, a self-built video processing system has obvious cost advantages.
(5) Frequently changing business requirements call for more fine-grained iteration management of the whole system, such as a canary release strategy to reduce the risk of new versions.
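To make point (2) concrete, here is a hedged sketch of how two of those advanced requirements, fixed-frequency frame capture and an animated GIF watermark, might map onto FFmpeg invocations. The exact filter graphs and flags are illustrative and would need verification against the FFmpeg documentation for a production pipeline.

```python
def frame_capture_cmd(src: str, out_pattern: str, fps: float = 1.0) -> list:
    # Fixed-frequency frame cutting: the fps filter emits one frame
    # every 1/fps seconds into numbered image files.
    return ["ffmpeg", "-i", src, "-vf", f"fps={fps}", out_pattern]

def gif_watermark_cmd(src: str, gif: str, dst: str) -> list:
    # Animated GIF watermark: loop the GIF (input 0) and overlay it on
    # the video (input 1) near the top-right corner.
    return ["ffmpeg", "-ignore_loop", "0", "-i", gif, "-i", src,
            "-filter_complex", "[1][0]overlay=W-w-10:10:shortest=1",
            "-y", dst]

if __name__ == "__main__":
    print(frame_capture_cmd("lecture.mp4", "frame_%04d.png", fps=0.5))
    print(gif_watermark_cmd("lecture.mp4", "logo.gif", "branded.mp4"))
```

Requirements like these, expressed as arbitrary filter graphs in code, are precisely what a fixed-menu SaaS transcoding product struggles to accommodate.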
So how can we build a video processing system that is high-performance, highly available, flexible, and low-cost?
A solution based on a distributed cluster
The most typical solution is to request a set of cloud virtual machines and deploy the video processing application on each of them to form a scalable service. Each new upload triggers a processing task, which is distributed through load balancing or a message queue; the application node that receives the task is responsible for completing it.
With this architecture, the number of instances in the service cluster can be increased during business peaks, when users upload videos intensively, to raise processing capacity, and reduced during troughs to cut resource costs.
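The task-distribution pattern just described can be sketched in miniature with a shared queue and a pool of worker threads standing in for the VM instances; `process_video` below is a stand-in for the real transcoding logic, not code from the article.

```python
import queue
import threading

def process_video(name: str) -> str:
    # Placeholder for the real watermarking/transcoding work.
    return f"processed:{name}"

def run_cluster(tasks, workers: int = 4):
    # Upload events land on a shared queue; each worker (one per
    # "instance") pulls tasks until the queue is drained.
    q = queue.Queue()
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                task = q.get_nowait()
            except queue.Empty:
                return
            out = process_video(task)
            with lock:
                results.append(out)

    for t in tasks:
        q.put(t)
    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

if __name__ == "__main__":
    print(sorted(run_cluster(["a.mp4", "b.mp4", "c.mp4"])))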
This solution can implement all kinds of advanced video processing requirements through custom code, so it is highly flexible. With a horizontally scalable computing cluster and a load balancing mechanism, it can satisfy performance and cost requirements at the same time, and it is widely adopted. However, when operated at scale in production, it still exposes many problems:
(1) Heavy maintenance work.
The maintenance workload covers virtual machines, networking, load balancing components, the operating system, and the application itself, and it takes a lot of time and effort to keep the system highly available and stable. Take the simplest example: when an application instance fails, how do you detect the fault immediately and remove the instance from the cluster as quickly as possible, and how do you ensure that its unfinished tasks are reprocessed after removal? Answering these questions requires a complete monitoring mechanism plus fault isolation and recovery mechanisms, and may even involve optimizing the business logic in the code.
(2) Elastic scaling lags behind demand.
Elastic scaling of the computing cluster is usually triggered either by scheduled tasks or by metric thresholds (CPU utilization, memory utilization, and so on). Neither approach allows fine-grained management based on actual user behavior, so when task density fluctuates sharply, scaling lags. When user upload requests surge, adding an application instance goes through multiple stages: requesting cloud resources, initialization, deploying the application image, starting the application, and joining the load balancing list. Even with optimizations such as Kubernetes plus a reserved resource pool, this often takes more than ten minutes.
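A threshold-based scaling rule of the kind described above can be captured in a few lines. This is a generic sketch, not any particular cloud vendor's API; the thresholds and step size are illustrative defaults.

```python
def target_instances(current: int, cpu_pct: float,
                     scale_out_at: float = 70.0, scale_in_at: float = 30.0,
                     step: int = 1, min_n: int = 1, max_n: int = 20) -> int:
    # Add `step` instances when average CPU exceeds scale_out_at and
    # remove them when it drops below scale_in_at, clamped to
    # [min_n, max_n]. Note that a newly requested instance still needs
    # minutes to boot, deploy, and join the load balancer -- exactly
    # the lag this section describes.
    if cpu_pct > scale_out_at:
        return min(current + step, max_n)
    if cpu_pct < scale_in_at:
        return max(current - step, min_n)
    return current
```

Because the rule reacts only to aggregate metrics, it cannot see individual upload requests; that blindness is what forces the conservative over-provisioning discussed next.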
(3) Low resource utilization.
This lagging elasticity leads to deliberately conservative scaling strategies, which in turn waste a great deal of computing resources and drive up costs, as shown in the figure below:
Is there a solution that lets the technical team focus on implementing business logic while allocating resources precisely according to users' actual upload requests, so as to maximize resource utilization? With the rapid development of cloud computing, the major cloud vendors are actively exploring new, more "cloud-native" ways to solve cost and efficiency problems. Function Compute plus Serverless Workflow, provided by Alibaba Cloud, is a very representative solution in this field.
Alibaba Cloud Function Compute is an event-driven, fully managed computing service. With Function Compute, developers do not need to manage servers or other infrastructure; they just write and upload code. Function Compute automatically prepares computing resources, runs the code elastically and reliably, and provides log query, performance monitoring, alerting, and other features to keep the system running stably.
Compared with the traditional model, in which application servers keep running and serving continuously, the biggest difference of Function Compute is that computing resources are pulled up on demand to process a task and automatically reclaimed after the task completes. This truly matches the serverless concept: it maximizes resource utilization and reduces both the workload and the cost of system maintenance. Because there is no need to request computing resources in advance, users do not have to think about capacity evaluation or elastic scaling at all; they simply pay for the resources actually used.
The figure below shows how Function Compute works:
Users upload the code that implements the key business logic to the Function Compute platform, and function execution is triggered in an event-driven way. Function Compute supports a variety of mainstream programming languages, and existing code can be deployed to it in a few simple steps. For all supported languages, refer to the list of development languages.
Every allocation of computing resources is driven by an event, and an event usually corresponds to a task in the business. Function Compute supports many kinds of triggers. For example, the event source of the HTTP trigger is an HTTP request: after receiving a request, Function Compute allocates computing resources of the preset specification to process it, and after the request is handled it decides, according to the user's settings, whether to reclaim those resources immediately. The OSS trigger can monitor all kinds of events on Object Storage Service (OSS): when a user uploads a new file or modifies a file, function execution is triggered automatically. This model fits the video processing scenario perfectly. For more supported triggers, refer to the list of triggers.
Function Compute also heavily optimizes the scheduling of computing resources: in the face of a sudden surge of user requests, it can pull up a large number of computing resources within milliseconds to work in parallel and preserve the user experience.
Video processing with Function Compute
Given these characteristics of Function Compute, building a video processing system is very simple: just configure an OSS trigger and upload the core video processing code to Function Compute.
With this solution, users no longer need to deal with a series of complex problems such as resource management, load balancing, high availability, elastic scaling, and system monitoring. The Function Compute platform schedules computing resources optimally according to users' upload behavior and completes video processing tasks at low cost and high efficiency. For the concrete steps and code, refer to the Python demo of video processing, which shows how to convert user-uploaded videos into 640×480 MP4 files with Function Compute.
Each function has a specified entry point, from which Function Compute starts execution, much like the main() function in local development. Taking Python as an example, a minimal entry function looks like this:
def handler(event, context):
    return 'hello world'
When an event fires, execution starts from the entry function. The event parameter carries information about the event source; in the video processing scenario, for example, it carries the OSS bucket and the name of the uploaded file. The context parameter carries runtime information about the function, including the function name, timeout, and access credentials. With this information, the code can perform all kinds of predefined operations.
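The sketch below shows how an entry function might pull the bucket and object key out of an OSS trigger event. The field layout follows the commonly documented OSS event shape (a JSON payload with an "events" array containing an "oss" record); verify it against the current trigger documentation, and note that the download/transcode/upload steps are only indicated by a comment here.

```python
import json

def handler(event, context):
    # The OSS trigger delivers the event as a JSON document describing
    # the object that was uploaded or modified.
    evt = json.loads(event)
    record = evt["events"][0]
    bucket = record["oss"]["bucket"]["name"]
    key = record["oss"]["object"]["key"]
    # Here the real function would download the object from OSS,
    # run FFmpeg on it, and write the result back to OSS.
    return f"processing {key} from bucket {bucket}"
```

With the bucket and key in hand, the function has everything it needs to fetch exactly the file whose upload triggered it.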
Among the supported languages, scripting languages such as Node.js and Python come with rich libraries and offer high development efficiency; their runtime instances also start very quickly, so they can serve tasks that are especially latency-sensitive. They are the best-matched languages for Function Compute. Languages such as Java and Go cannot create a function simply by uploading source code the way scripting languages can; they must be compiled in advance, which makes them slightly more complex to use. However, with tools such as Funcraft provided by Function Compute, development and deployment efficiency can be greatly improved. Whichever language is used, it is recommended to download the official Funcraft tool to simplify development, building, and deployment; refer to the Funcraft documentation.
Languages like Java need to load more class libraries at virtual machine startup, so their runtime instances cannot start and enter the execution state within milliseconds, and they cannot be used directly in latency-critical scenarios. However, with the newer features of reserved instances and single-instance multi-concurrency in Function Compute, the impact of cold starts on the business can be eliminated and the cost of waiting for downstream responses reduced, so that Java on Function Compute can also serve scenarios with extremely strict latency requirements, such as an API gateway. Refer to the documentation on reserved instances and single-instance multi-concurrency.
With the scheme described above, we can easily perform all kinds of custom processing on short videos. However, a Function Compute instance is not unlimited in resource specification or total running time. At present an instance can have up to 3 GB of memory and 10 minutes of execution time, which means that a processing task that needs more than 3 GB of memory, or more than 10 minutes in total, will fail.
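These limits translate into a simple routing rule: a task that fits within them can run in a single function invocation, and anything larger needs the slice-and-merge approach described later. The check below uses the figures quoted in this article, which are assumptions to verify against current quotas rather than guaranteed values.

```python
# Limits as quoted in this article; verify against current quotas.
MEM_LIMIT_GB = 3
TIME_LIMIT_MIN = 10

def needs_workflow(est_mem_gb: float, est_minutes: float) -> bool:
    # A single Function Compute invocation suffices only when the task
    # fits both the memory and execution-time limits; otherwise route
    # the video through the orchestrated slice-and-merge path.
    return est_mem_gb > MEM_LIMIT_GB or est_minutes > TIME_LIMIT_MIN
```

A dispatcher function could run this check on each upload, based on the file size or an estimated duration, before choosing an execution path.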
In the 5G era, very large video courseware is a common requirement. How can such large videos be handled with Function Compute? This is where we bring in another weapon, Serverless Workflow, to complete the task together with Function Compute.
Serverless Workflow is a fully managed cloud service for coordinating the execution of multiple distributed tasks. Distributed tasks can be arranged sequentially, by branching, in parallel, and in other patterns. Serverless Workflow reliably coordinates task execution according to the defined steps, tracks the state transition of each step, and executes user-defined retry logic when necessary to ensure the workflow completes smoothly. It also provides logging and auditing to monitor workflow execution, making applications easy to diagnose and debug.
You can use Serverless Workflow to orchestrate a series of function resources, define the input and output of each step, and use built-in control steps to express complex logic, launch parallel execution, manage timeouts, or terminate the process. In addition, the console displays task status and execution order graphically, shows the real-time state of each step, and keeps a detailed history of every execution. By combining Serverless Workflow with Function Compute, we can break through the limits on execution time and resources and process video files of any size.
Processing large videos
In short, the basic idea for handling a large video is as follows:
(1) Slice the video first, keeping each slice small enough that a single Function Compute instance can process it quickly.
(2) Pull up a number of Function Compute instances to process the slices in parallel.
(3) Merge the processing results.
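The three steps above can be sketched as follows. Slices are expressed as (start, end) second ranges, so each one could be handled by a separate Function Compute instance (for example via FFmpeg's -ss/-t options); the slice duration, segment naming, and stubbed merge are illustrative assumptions.

```python
def make_slices(duration_s: float, slice_s: float):
    # Step (1): split the timeline into ranges no longer than slice_s.
    slices, start = [], 0.0
    while start < duration_s:
        end = min(start + slice_s, duration_s)
        slices.append((start, end))
        start = end
    return slices

def process_slice(rng):
    # Step (2): stand-in for one function invocation transcoding a range.
    return f"seg_{rng[0]:.0f}_{rng[1]:.0f}.mp4"

def merge(segments):
    # Step (3): real merging would feed these files to FFmpeg's
    # concat demuxer; here we just record the merge order.
    return "|".join(segments)

if __name__ == "__main__":
    segs = [process_slice(r) for r in make_slices(130, 60)]
    print(merge(segs))
```

In the real system, Serverless Workflow would fan out step (2) with a parallel step and run step (3) only after every slice has reported success, retrying failed slices as configured.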
The process of handling a video through Serverless Workflow plus Function Compute is as follows:
Through the visual interface provided by Serverless Workflow, we can easily inspect each step of a workflow execution and, together with a custom dashboard, achieve comprehensive monitoring of the whole video processing system. This reduces both maintenance and resource costs and greatly shortens project delivery time.
In online education, the demand for video processing is large, with high requirements on processing speed, concurrent throughput, resource utilization, and more. The combination of Function Compute and Serverless Workflow helps users easily build an elastic, highly available video processing architecture and is the optimal solution to these complex requirements. As cloud-native technology continues to develop, serverless technologies will reach into ever more business scenarios, with unlimited possibilities ahead!
This article is original content from Alibaba Cloud and may not be reproduced without permission.