Author: Zhao Qingjie (Lu Ling)
Source: Alibaba Cloud Native official account
1、 Achievements of the large-scale serverless adoption in the group
In 2020, we made a very big upgrade to the underlying infrastructure of serverless: computing was upgraded to the fourth-generation Shenlong (X-Dragon) architecture, storage to Pangu 2.0, and the network to the 100-Gbps Luoshen network. After the overall upgrade, performance doubled. The BaaS layer was also greatly expanded; for example, EventBridge and Serverless Workflow are now supported, further strengthening the system's capabilities.
In addition, we cooperated with more than a dozen BUs (business units) in the group to help business teams adopt serverless, including in core Double 11 application scenarios, and helped them pass the Double 11 peak-traffic test, proving that serverless remains very stable even in core application scenarios.
2、 Two backgrounds and two advantages: accelerating serverless adoption
1. Two backgrounds of serverless
Why could we achieve large-scale serverless adoption within the group so quickly? There are two preconditions, or backgrounds:
The first background is moving to the cloud, an important prerequisite for the group. Only on the cloud can the group enjoy the elasticity dividend; with a purely internal cloud, the later efficiency gains and cost reductions would be very hard to achieve. Therefore, on Double 11 of 2019, Alibaba moved 100% of its core systems to the cloud. With that premise in place, serverless can play a very important role.
The second background is comprehensive cloud-native transformation. It created a powerful family of cloud-native products, empowered the group's internal business, and helped the business achieve two main goals on the cloud: improving efficiency and reducing cost. In 2020, Tmall's Double 11 core systems were fully cloud native, with efficiency improved by 100% and cost reduced by 80%.
2. Two advantages of serverless
- Improve efficiency
A standard cloud-native application, from development through launch to operations, must complete all the work items marked in orange in the figure above before the microservice application can formally go live: first CI/CD code building, then the visible operations work. This is not just configuration and wiring; traffic assessment, security assessment, traffic management, and more must be carried out across the whole data link, which clearly demands a very high level of manpower. In addition, to improve resource utilization, different workloads must be co-located, which raises the bar further.
As you can see, for traditional cloud-native applications, the work items required to bring a microservice online are hard for developers and need multiple roles to complete. In the serverless era, however, developers only need to finish the coding in the blue box in the figure above; the serverless R&D platform handles all the remaining work items and takes the business live.
- Reduce cost
Improving efficiency mainly means saving labor cost, while reducing cost targets application resource utilization. Ordinary applications must reserve resources for peak traffic, which causes great waste during troughs. In the serverless scenario, we simply pay on demand and never reserve resources for the peak, which is serverless's biggest cost advantage.
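The peak-versus-trough cost logic can be sketched with a toy calculation. All prices and traffic numbers below are invented for illustration; they are not Alibaba Cloud pricing.

```python
# Toy comparison of "reserve for peak" vs "pay as you go" cost.
# Prices and the traffic profile are made up for demonstration only.

HOURS = 24

def reserved_cost(peak_instances, price_per_instance_hour):
    """Reserved capacity: pay for peak-sized capacity around the clock."""
    return peak_instances * price_per_instance_hour * HOURS

def on_demand_cost(hourly_instances, price_per_instance_hour):
    """Pay as you go: pay only for instances actually running each hour."""
    return sum(hourly_instances) * price_per_instance_hour

# Hypothetical day: 22 quiet hours at 10 instances, 2 peak hours at 100.
load = [10] * 22 + [100] * 2
price = 0.5  # made-up cost per instance-hour

reserved = reserved_cost(max(load), price)   # 100 * 0.5 * 24 = 1200.0
on_demand = on_demand_cost(load, price)      # (220 + 200) * 0.5 = 210.0
```

Under this artificial profile, paying on demand costs roughly a sixth of reserving for peak; the gap is exactly the trough waste described above.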
These two backgrounds and two advantages fit the trend of cloud technology, so business teams within the group took to serverless immediately. Some large BUs even raised serverless adoption to the campaign level to accelerate landing their business scenarios on serverless. The serverless scenarios implemented in the group are now very rich, covering core applications, personalized recommendation, video processing, AI inference, business inspection, and more.
3、 Serverless landing scenario: lightweight front-end applications
At present, front-end scenarios are where serverless is adopted fastest and most widely within the group, covering more than ten BUs including Taoxi (Taobao/Tmall), Gaode (Amap), Feizhu (Fliggy), Youku, and Xianyu. So why are front-end scenarios a good fit for serverless?
The figure above is the capability model of a full-stack engineer. Generally, three roles collaborate to release a micro application: the front-end engineer, the back-end development engineer, and the operations engineer. To improve efficiency, the full-stack engineer role has emerged in recent years. A full-stack engineer must cover all three roles: not only front-end application development, but also back-end, system-level development skills, plus attention to the underlying kernel and system resource management. That is obviously a very high bar for a front-end engineer.
In recent years, the rise of Node.js has made the back-end development role replaceable: with only front-end development skills, one person can now play both the front-end and back-end development roles. The operations engineer, however, could not be replaced.
The serverless platform solves the bottom layers of the triangle above, greatly lowering the threshold for a front-end engineer to become a full-stack engineer, which is very appealing to front-end business developers.
Another reason is that the business characteristics match. Most front-end applications have spiky traffic, which requires capacity evaluation in advance and thus carries an evaluation cost. Front-end scenarios also iterate quickly, with services launched and retired fast, so operations cost is high. Finally, they lack dynamic scaling, leaving resource fragmentation and waste. With serverless, the platform automatically takes care of all of these concerns, so serverless is very attractive for front-end scenarios.
1. Front-end landing scenarios
The figure above lists the main scenarios and technical points of front-end adoption:
BFF to SFF: BFF means Backend for Frontend, where front-end engineers also carry the main operations burden. In the serverless era this becomes SFF (Serverless for Frontend): operations is handed entirely to the serverless platform, and front-end engineers only need to write business code.
Client slimming: sink front-end business logic into the SFF layer, reuse that logic there, and hand operations to the serverless platform, making the client lighter and more efficient.
Cloud-client integration: one codebase serving multiple clients, a very popular development framework, which also needs SFF as support.
CSR/SSR: both server-side rendering and client-side rendering are served through serverless to speed up first-screen display. Serverless combined with a CDN can also act as a front-end acceleration solution.
NoCode: essentially packaging on top of the serverless platform. A front-end page is built by dragging and dropping a few components; each component can be packaged and aggregated with serverless to achieve the NoCode effect.
Mid- and back-office scenarios: mainly rich monolithic applications. A monolith can be fully hosted in serverless mode to bring mid- and back-office applications online, again saving operations effort and reducing cost.
2. Changes to front-end coding
What are the changes in coding after the application of serverless in the front-end scenario?
As front-end developers know, the front end is generally divided into three layers: state, view, and logic engine. Meanwhile, some abstracted business logic is sunk into cloud functions at the FaaS layer, and these cloud functions then provide services as FaaS APIs. In coding terms, we can abstract all kinds of actions, and each action can be backed by a FaaS function API.
Take a simple page as an example. On the left side of the page are rendering interfaces that fetch product details, the shipping address, and so on, all implemented on FaaS APIs. On the right is interaction logic, such as buying and adding to cart, which FaaS APIs can also handle.
In page design, a FaaS API is not tied to a single page; it can be reused across multiple pages. By reusing these APIs, or by drag-and-drop, a front-end page can be assembled, which is very convenient for the front end.
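As an illustration of what such FaaS APIs might look like, here is a minimal Python sketch of two cloud functions: one rendering API and one interaction API. The `(event, context)` handler signature, the field names, and the in-memory product table are all assumptions for illustration, not a specific platform's contract.

```python
import json

# Minimal sketch of FaaS-style cloud functions backing page actions.
# Handler signature and field names are illustrative, not a real platform API.

FAKE_PRODUCTS = {"1001": {"title": "Sample item", "price": 99}}

def _parse(event):
    """Accept either a JSON string or an already-parsed dict event."""
    return json.loads(event) if isinstance(event, str) else event

def get_product_detail(event, context=None):
    """Rendering API: return product details for the front-end page."""
    body = _parse(event)
    product = FAKE_PRODUCTS.get(body.get("product_id"))
    if product is None:
        return {"status": 404, "body": {}}
    return {"status": 200, "body": product}

def add_to_cart(event, context=None):
    """Interaction API: an 'add to cart' action reusable across pages."""
    body = _parse(event)
    return {"status": 200, "body": {"added": body.get("product_id")}}
```

Because each action is a standalone function with a plain request/response shape, the same API can back the product page, the cart page, or any other page that needs it.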
3. R&D efficiency of lightweight front-end applications: 1-5-10
After applying serverless on the front end, we summarize its effect on front-end R&D efficiency as 1-5-10, which means:
1-minute quick start: we collected the main scenarios and turned them into application templates. When a user or business team starts a new service, they simply pick the matching template, which generates the business code scaffolding; users then only write their own business function code to get started quickly.
5-minute launch: fully reuse the serverless operations platform, using its built-in capabilities such as gray (phased) release, together with the front-end gateway and traffic-shifting features, to complete canary testing.
10-minute troubleshooting: once a serverless function is live, business and system metrics are displayed. Metrics can be used to set alerts, and error logs are pushed to the user on the console, helping users quickly locate and analyze problems and grasp the health of the whole serverless function within 10 minutes.
4. Results of front-end serverless adoption
What results did front-end serverless adoption bring? We compared the performance and person-hours required by three apps under the traditional R&D model versus the FaaS scenario. Clearly, on top of the existing cloud-native baseline, efficiency improved by a further 38.89%, which is very significant for serverless and front-end applications. The serverless scenario now covers almost the whole group, helping business teams go serverless and achieve the two main goals of improving efficiency and reducing cost.
4、 Technology output and expanding into new scenarios
During the group's serverless rollout, we encountered many new business demands, such as: How can existing workloads be migrated quickly and cheaply? Can execution time be extended? Can resource quotas be raised? We proposed solutions to these problems and, based on them, abstracted several product features. The important ones are introduced below:
1. Custom container image
The main purpose of custom images is seamless migration of existing workloads: helping users move their business code to the serverless platform with zero code changes.
Migrating existing workloads is a major pain point. A team cannot sustain two R&D models side by side for long; that causes heavy internal friction. To move a business team onto the serverless R&D system, you must offer a thorough migration path: not only supporting serverless for new business, but also helping existing workloads migrate quickly at zero cost. That is why we launched the custom container feature.
Pain points of traditional monolithic web applications:
- Application modernization (fine-grained responsibility splitting, service governance, and so on) imposes operations burdens;
- Historical baggage makes serverless adoption hard: the dependencies and configuration of business code differ on and off the cloud;
- Capacity planning and self-built operations and monitoring systems are required;
- Resource utilization is low (low-traffic services monopolize resources).
Advantages of Function Compute + container images:
- Low-cost migration of monolithic applications;
- O&M-free operation;
- Automatic scaling with no capacity planning;
- 100% resource utilization, optimizing away idle cost.
The custom container feature lets traditional monolithic web applications (built on frameworks such as Spring Boot, WordPress, Flask, Express, or Rails) migrate to Function Compute as container images without any modification, avoiding the resource waste of low-traffic services monopolizing servers, while still enjoying no capacity planning, automatic scaling, O&M-free operation, and other benefits.
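To make the "zero code change" point concrete, here is a minimal stand-in for such a monolith using only Python's standard library. The port and routes are invented for illustration; the point is that the handler is ordinary web code with no serverless-specific API, so the same image runs on a VM or on a function platform.

```python
from wsgiref.simple_server import make_server

# A minimal single-process web app standing in for a legacy monolith.
# With the custom-image approach this code is packaged into a container
# image and deployed unchanged: it just keeps serving HTTP on a port.

def app(environ, start_response):
    """Plain WSGI handler: identical whether run on a VM or in a function."""
    if environ.get("PATH_INFO") == "/healthz":   # liveness-probe style route
        body = b"ok"
    else:
        body = b"hello from the monolith"
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]

# In the container entrypoint you would simply run:
#   make_server("", 9000, app).serve_forever()
```

Because the app is unmodified, migration cost reduces to writing a Dockerfile around the existing start command.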
2. Performance instances
Performance instances relax usage restrictions and open up more scenarios. For example, the code package limit grows from 50 MB to 500 MB, the execution time limit from 10 minutes to 2 hours, and the specifications are more than 4x higher than before, supporting large specifications such as 16-core/32 GB instances to help users run very heavy, long-running tasks.
Function Compute serves many scenarios, and in the process we received plenty of feedback: too many constraints, a high barrier to use, insufficient resources for compute-heavy scenarios, and so on. For these scenarios we launched the performance instance feature. The goal is to relax the usage restrictions of Function Compute and lower the barrier to entry, letting users configure execution time and other quotas flexibly and on demand.
The 16-core, 32 GB instances we currently support deliver the same computing power as ECS instances of the same specification, and can serve high-performance business scenarios such as AI inference and audio/video transcoding. This feature is very important for expanding application scenarios going forward.
Pain points:
- Elastic instances have many constraints and a certain usage threshold, such as limits on execution time and instance specifications;
- Traditional monolithic applications and compute-heavy scenarios such as audio/video require splitting and refactoring the business, adding burden;
- Elastic instances make no explicit commitment on vCPU, memory, bandwidth, and other resource dimensions.
Goals:
- Relax the usage restrictions of function compute and lower the barrier for enterprises;
- Stay compatible with traditional applications and compute-heavy scenarios;
- Give users a clear resource commitment.
Solution:
- Launch performance instances with higher specifications and clearer resource commitments;
- In the future, performance instances will carry higher stability SLAs and richer configuration options.
Applicable workloads are compute-heavy tasks, long-running tasks, and tasks insensitive to elastic scaling, for example:
- Audio/video transcoding;
- AI inference;
- Other compute scenarios requiring high specifications.
Beyond relaxing these restrictions, performance instances retain all current Function Compute capabilities: pay-as-you-go, reserved mode, multiple concurrent requests per instance, integration with multiple event sources, multi-availability-zone disaster recovery, automatic scaling, application build and deployment, O&M-free operation, and so on.
3. Link tracking
Link tracing covers link reconstruction, topology analysis, and problem localization.
A typical microservice cannot complete all of its work in one function; it depends on upstream and downstream services. When those services are healthy, link tracing is rarely needed. But how do you localize the problem when a downstream service misbehaves? That is when link tracing lets you quickly analyze upstream and downstream performance bottlenecks or pinpoint where the problem occurred.
Function Compute investigated many open-source solutions inside and outside the group. It currently supports the X-Trace capability, is compatible with open-source solutions, embraces open source, and provides OpenTracing-compatible product capabilities.
The figure above is a demo of link tracing. Through tracing, you can visually see the database access cost of back-end services, avoiding the troubleshooting difficulty caused by complex call relationships among a large number of services. Function Compute also supports function-code-level link analysis, helping users optimize cold starts, key code paths, and more.
From a business perspective, serverless products bring huge benefits, but the encapsulation also brings a stage-specific problem: the black box. By providing link tracing and exposing the inside of the black box, we let users improve their own business capabilities as well. This is a direction for improving the serverless user experience going forward; we will keep investing here and keep lowering the cost of using serverless.
- From a business perspective, serverless products bring great benefits, but encapsulation creates a black-box problem;
- Serverless connects to the cloud ecosystem, and the large number of cloud services makes call relationships complex;
- Serverless developers still need link reconstruction, topology analysis, problem localization, and so on.
Main advantages of FC + X-Trace:
- Function-code-level link analysis helps optimize key code paths such as cold start;
- Service-invocation-level link tracing helps connect cloud ecosystem services and analyze distributed links.
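To show what code-level spans add over plain logs, here is a toy span recorder in the spirit of OpenTracing. This is not the X-Trace or OpenTracing API itself; the class, span names, and structure are invented for illustration.

```python
import time

# Toy span recorder illustrating what an OpenTracing-style trace captures:
# parent/child relationships plus per-span timing. Not a real tracing API.

SPANS = []  # finished spans, appended in completion order

class Span:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent.name if parent else None
        self.start = time.perf_counter()
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        self.duration = time.perf_counter() - self.start
        SPANS.append(self)

with Span("handle_request") as root:       # the function invocation itself
    with Span("cold_start_init", root):    # code-level span: init cost
        time.sleep(0.01)
    with Span("query_database", root):     # service-level span: downstream call
        time.sleep(0.02)

# SPANS now holds both children and the root, each with a duration --
# enough to reconstruct the call tree and find the slowest hop.
```

Child spans finish before their parent, so a collector can rebuild the topology from the parent links and attribute latency to the exact code path or downstream service.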
4. Asynchronous configuration
In serverless scenarios we provide capabilities such as offline task processing and message queue consumption; such functions account for about 50% of Function Compute usage. With large volumes of message consumption come many asynchronous-configuration questions that business teams often raise: Where do these messages come from? Where do they go? Which services consume them? How long do they take? What is the consumption success rate? And so on. Making these questions visible and configurable is an important topic to solve.
The figure above shows how asynchronous configuration works. First, an asynchronous call is triggered from a user-specified event source, and Function Compute immediately returns a request ID. The function then executes, and its result can be delivered back to Function Compute or to the message queue service MNS. Triggers can then be configured on the event source, and these results or topics can be consumed again; for example, if a message fails processing, it can be configured for a second round of processing.
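The retry-then-dead-letter flow described above can be modeled in a few lines. This is a toy simulation: the queues, retry count, and handlers are invented for illustration, not Function Compute's actual API or defaults.

```python
# Toy model of asynchronous-invocation destinations: retry a failed
# invocation, then route the outcome to a success or failure destination.

success_queue, failure_queue = [], []

def invoke_async(handler, event, max_retries=2):
    """Run the function; on success publish downstream, else dead-letter."""
    for attempt in range(max_retries + 1):
        try:
            result = handler(event)
        except Exception as err:
            last_error = err
            continue                       # retry policy: try again
        success_queue.append({"event": event, "result": result})
        return True
    failure_queue.append({"event": event, "error": str(last_error)})
    return False

flaky_calls = {"n": 0}
def flaky_handler(event):
    """Fails on the first attempt, succeeds on the retry."""
    flaky_calls["n"] += 1
    if flaky_calls["n"] < 2:
        raise RuntimeError("transient failure")
    return "processed:" + event

invoke_async(flaky_handler, "msg-1")        # succeeds on the retry
invoke_async(lambda e: 1 / 0, "msg-2")      # always fails -> dead letter
```

Routing failures to a separate destination is what makes the "secondary processing" mentioned above possible: a consumer of the failure queue can inspect, fix, and replay the message.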
Typical application scenarios:
- First, the event closed loop: for example, analyzing delivery results (such as collecting monitoring metrics and configuring alerts). With production events, customers can use FC not only to consume events but also to actively produce them.
- Second, daily exception handling: for example, failure handling and retry strategies.
- Third, message recycling: users can customize the retention time and discard useless messages in time to save resources, a great optimization for asynchronous scenarios.
About the author:
Zhao Qingjie (Lu Ling) works on Alibaba Cloud's native serverless team, focusing on serverless, PaaS, distributed system architecture, and related areas, and is committed to building a new generation of serverless technology platform to make platform technology more inclusive. He previously worked at Baidu, where he was responsible for the company's largest PaaS platform, which carried 80% of its online business. He has rich experience in PaaS and back-end distributed system architecture.
This article is compiled from the Serverless Live series livestream on January 26.
Replay link: https://developer.aliyun.com/topic/serverless/practices