Application deployment architecture evolution [reprint]


Reproduced from 《You call this shit load balancing?》


Most of us have heard this classic interview question: “Describe the whole process from entering a keyword on Taobao to the web page finally being displayed. The more detail, the better.”

This question is actually hard. It touches a series of related concepts and mechanisms such as HTTP, TCP, gateways and LVS. If you master each of them, you will greatly expand your skill tree and understand how the network really operates. Even if you cannot master them all, knowing how traffic flows is very helpful when troubleshooting; I have used this knowledge to track down many problems. To explain the whole process clearly, I read a lot of material and consulted many people, and I believe I can now explain it well. The result turned out too long for one article, so I split it into two parts. This part introduces the overall architecture that traffic passes through on the back end. The next part analyzes individual nodes such as LVS in depth, which involves how switches and routers work.

Li Daniu started a business. Because there was little traffic in the early days, he deployed only a single Tomcat server and let clients send requests directly to it.

At first this deployment worked fine: business volume was small and a single machine could carry it. Later, Li Daniu’s business caught a market tailwind and grew rapidly, so the single machine gradually hit a performance bottleneck. Worse, because only one machine was deployed, whenever it went down the business dropped to zero. To remove the performance bottleneck and eliminate the single point of failure, Li Daniu decided to deploy several more machines (say three) and let the client call one of them at random. That way, even if one machine goes down, the others survive and the client can call them instead, so there is no downtime.

Now a question arises: which of the three machines should the client call? Letting the client choose is clearly inappropriate. To choose a specific server, the client must know the full list of servers and then pick one by polling or at random. If one of the servers goes down, the client cannot know in advance and may well connect to the dead server. So the choice is best made on the server side. How? There is a classic maxim in architecture design: there is no problem that cannot be solved by adding a layer, and if there is, add another layer. So we add a layer on the server side and call it LB (load balancer). The LB receives all client requests and decides which server each one goes to. In the industry, nginx is widely used as the LB.
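
The idea of a server-side layer that picks a live machine can be sketched as follows. This is a minimal illustration, assuming a round-robin policy and a health set that some separate health checker (not shown) keeps up to date. The class, method and server names are hypothetical, and this is not how nginx itself is implemented.

```python
class RoundRobinBalancer:
    """Picks back-end servers in turn, skipping ones marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # rotation pointer into self.servers

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting from the rotation pointer
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy back-end server available")


lb = RoundRobinBalancer(["tomcat-1", "tomcat-2", "tomcat-3"])
first_round = [lb.pick() for _ in range(3)]

lb.mark_down("tomcat-2")  # e.g. reported by a health check
after_failure = [lb.pick() for _ in range(4)]
```

Because the client only ever talks to the balancer, a dead server is skipped without the client having to know the server list.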

This architecture finally supported the rapid growth of the business, but before long Li Daniu found a problem: any traffic at all can reach the servers, which is obviously unsafe. Can we add a layer of authentication before traffic reaches the servers, and only let it through once authentication passes? We call this layer the gateway (to avoid a single point of failure, the gateway should also be deployed as a cluster).

Now all traffic must pass through the gateway layer before reaching the servers. Traffic is forwarded only after authentication passes; otherwise an error message is returned to the client. Besides authentication, the gateway also handles risk control (for example, blocking promotion abusers, the so-called “wool party”), protocol conversion (such as converting HTTP to Dubbo), traffic control and so on, to make sure that the traffic forwarded to the servers is as safe and controllable as possible.
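
The authenticate-then-forward behavior of the gateway can be sketched roughly like this. It assumes a token carried with each request; the token values, the request shape and the `tomcat` stand-in are all hypothetical, and real gateways add risk control, protocol conversion and rate limiting on top.

```python
VALID_TOKENS = {"token-alice", "token-bob"}  # hypothetical credentials


def gateway(request: dict, forward):
    """Forward the request only if it carries a known token."""
    if request.get("token") not in VALID_TOKENS:
        # Authentication failed: answer the client directly, never reach the server
        return {"status": 401, "body": "authentication failed"}
    return forward(request)


def tomcat(request: dict) -> dict:
    # Stand-in for the application server behind the gateway
    return {"status": 200, "body": f"handled {request['path']}"}


ok = gateway({"path": "/orders", "token": "token-alice"}, tomcat)
denied = gateway({"path": "/orders"}, tomcat)
```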

This design lasted for a long time, but later Li Daniu found it still had a problem. Both dynamic requests and requests for static resources (such as JS and CSS files) hit Tomcat, which puts great pressure on Tomcat when traffic is heavy. In fact, Tomcat is not as good as nginx at serving static resources: Tomcat loads the file from disk on every request, which hurts performance, while nginx has features such as proxy cache that greatly improve static resource handling.

Aside: proxy cache means that after nginx fetches a resource from the static resource server, it caches the resource locally (memory + disk). If a later request hits the cache, nginx returns the response directly from its local cache.
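
The cache-hit behavior can be illustrated with a small sketch. It assumes an in-memory dictionary only and ignores expiry and disk storage, which a real nginx proxy cache handles; the `ProxyCache` class and the origin function are hypothetical.

```python
class ProxyCache:
    """Returns cached responses and only contacts the origin on a miss."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.store = {}        # path -> cached response
        self.origin_hits = 0   # how many times the origin was contacted

    def get(self, path):
        if path not in self.store:
            # Cache miss: fetch once from the static resource server and keep it
            self.origin_hits += 1
            self.store[path] = self.fetch_from_origin(path)
        return self.store[path]


cache = ProxyCache(lambda path: f"contents of {path}")
first = cache.get("/static/app.js")   # miss, goes to the origin
second = cache.get("/static/app.js")  # hit, served from the local cache
```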

Therefore, Li Daniu made the following optimization: dynamic requests go through the gateway to Tomcat, and static requests go to the static resource server.

This is what we call dynamic/static separation: static requests are separated from dynamic ones so that Tomcat can focus on the dynamic requests it is good at. Because static resources benefit from nginx’s proxy cache and similar features, the processing capacity of the back end rises to a new level.

Note also that not all dynamic requests need to go through the gateway. For example, our operation center back office is used by internal employees, so its authentication differs from the gateway’s API authentication. We therefore deployed two operation center servers and had nginx forward operation center requests straight to them, bypassing the gateway.
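
The routing decisions described so far (static resources, operation center, everything else through the gateway) can be summarized in a short sketch. The `/operation` path prefix, the list of static extensions and the target names are assumptions made for illustration.

```python
import posixpath

# Assumed set of extensions treated as static resources
STATIC_EXTENSIONS = {".js", ".css", ".png", ".jpg"}


def route(url: str) -> str:
    """Return the name of the tier that should handle this URL."""
    path = url.split("?", 1)[0]  # ignore the query string
    if path == "/operation" or path.startswith("/operation/"):
        # Internal back office: goes straight to its own servers
        return "operation-center"
    ext = posixpath.splitext(path)[1].lower()
    if ext in STATIC_EXTENSIONS:
        return "static-server"
    # Every other (dynamic) request must pass the gateway first
    return "gateway"
```

For example, `route("/assets/app.css?v=3")` selects the static resource server, while `route("/api/orders")` goes through the gateway.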

Of course, to avoid a single point of failure, nginx itself also needs at least two machines, so our architecture becomes the following. Nginx is deployed on two machines as primary and standby. The standby nginx monitors the primary through the keepalived mechanism (heartbeat packets) and takes over as primary when it detects that the primary is down.
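
The primary/standby takeover can be pictured with a deliberately simplified sketch. It only models a heartbeat timeout with timestamps passed in by the caller; it is not the keepalived implementation or its protocol, and the class name and the 3-second timeout are assumptions.

```python
class StandbyNode:
    """Simplified standby: takes over when heartbeats from the primary stop."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_heartbeat = 0.0
        self.role = "standby"

    def on_heartbeat(self, now: float) -> None:
        # The primary is alive, so remember when we last heard from it
        self.last_heartbeat = now
        self.role = "standby"

    def tick(self, now: float) -> str:
        # Promote ourselves if the primary has been silent for too long
        if now - self.last_heartbeat > self.timeout_s:
            self.role = "primary"
        return self.role


standby = StandbyNode(timeout_s=3.0)
standby.on_heartbeat(now=1.0)
role_while_primary_alive = standby.tick(now=2.0)
role_after_silence = standby.tick(now=5.5)
```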

This architecture looks good, but note that nginx is a layer 7 (application layer) load balancer. To forward traffic it must first establish a TCP connection with the client and also establish a TCP connection with the upstream server it forwards to. Establishing a TCP connection consumes memory (the TCP socket, the receive and send buffers, and so on). To exchange data, both the client and the upstream server must first send it to nginx, which buffers it temporarily and then sends it to the other party over the TCP connection on the other side.


Therefore, the load capacity of nginx is limited by the machine’s I/O, CPU, memory and related resources. Once there are very many connections (millions, for example), nginx’s ability to withstand load drops sharply.

The analysis shows that nginx’s limited load capacity comes mainly from the fact that a layer 7 load balancer must maintain two TCP connections, one downstream and one upstream. Can we design a load balancer that, like a router, only forwards packets and does not need to establish connections? Since it would not have to maintain extra TCP connections, its load capacity should be much higher. This is how the layer 4 load balancer LVS came about. Here is a simple comparison of the two:


As you can see, LVS simply forwards packets without establishing connections with either side. Compared with nginx, LVS withstands load better and has high performance, reaching up to 60% of F5 hardware, while consuming relatively little memory and CPU.

So how does a layer 4 load balancer work?

When the load balancing device receives the first SYN request from the client, it selects the best server using its load balancing algorithm, modifies the destination IP address in the packet (changing it to the back-end server’s IP) and forwards the packet directly to that server. The TCP connection, that is, the three-way handshake, is established directly between the client and the server; the load balancing device only performs a forwarding action similar to a router. In some deployments, to make sure that the server’s reply packets return correctly through the load balancing device, the device may also rewrite the packet’s original source address while forwarding.
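
The forwarding step can be sketched as follows. This is a toy model, assuming round-robin selection and a per-flow table keyed by the client address and port; the `Packet` fields, class names and IP addresses are hypothetical, and real LVS works on kernel packets in several forwarding modes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    syn: bool = False


class Layer4Balancer:
    """Rewrites the destination IP per flow; never terminates TCP itself."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.flows = {}   # (client ip, client port) -> chosen back-end IP
        self._next = 0

    def forward(self, pkt: Packet):
        key = (pkt.src_ip, pkt.src_port)
        if pkt.syn and key not in self.flows:
            # First SYN of a new flow: choose a back end (round-robin here)
            self.flows[key] = self.backends[self._next % len(self.backends)]
            self._next += 1
        backend = self.flows.get(key)
        if backend is None:
            return None  # no known flow for this packet, drop it
        # Only the destination IP changes; the source stays the client's
        return replace(pkt, dst_ip=backend)


lvs = Layer4Balancer(["10.0.0.11", "10.0.0.12"])
out_syn = lvs.forward(Packet("203.0.113.5", 40000, "198.51.100.1", 80, syn=True))
out_data = lvs.forward(Packet("203.0.113.5", 40000, "198.51.100.1", 80))
out_other = lvs.forward(Packet("203.0.113.9", 40001, "198.51.100.1", 80, syn=True))
```

Later packets of the same flow reuse the table entry, so the whole TCP conversation reaches the same server even though the balancer holds no connection of its own.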

To sum up, we add a layer of LVS in front of nginx and let it take all of our traffic. Of course, to keep LVS available, we also deploy LVS as primary and standby. In addition, if nginx runs out of capacity, we can easily scale it out horizontally. Our architecture is therefore improved as follows:


Of course, a single LVS cannot cope when traffic is very heavy. What then? Add several more, and use DNS load balancing so that resolving the domain name returns one of them at random.


In this way, traffic can finally flow stably. Some readers may still have questions, so let’s look at them.

Since LVS avoids a single point of failure by deploying multiple instances, nginx can do the same, and nginx has supported layer 4 load balancing since version 1.9. So is LVS really necessary?

Without LVS, the architecture diagram looks like this:

Deploying multiple nginx instances is indeed feasible when traffic is not that large. However, LVS is a Linux kernel module and works in kernel space, while nginx works in user space and is relatively heavyweight. Nginx is therefore not as good as LVS in performance and stability, which is why we adopt the LVS + nginx deployment.

In addition, as you have probably noticed, when traffic is heavy the static resources should be deployed on a CDN, which automatically selects the node closest to the user to serve the response. Our final architecture is therefore as follows:


Architecture must be designed around the actual situation of the business; talking about architecture apart from the business is just empty talk. Every step of the evolution above was driven by business growth. For small and medium-sized companies with modest traffic, nginx alone is enough for load balancing. Once traffic grows rapidly, consider LVS + nginx. For traffic as huge as Meituan’s (tens of Gbps and tens of millions of concurrent connections), even LVS is not enough (in their tests LVS still dropped many packets), so they developed their own layer 4 load balancer, MGW.

Finally, after reading this article you should have a more thorough understanding of layering. There is no problem that layering cannot solve; if there is, add another layer. Layering lets each module do its own job, decouples functions and makes extension easy. TCP/IP, which you know well, is a good example: each layer is responsible only for its own work, and the upper layer does not care how the lower layer is implemented.


Talk about load balancing again

Reproduced in

First look at the architecture diagram shown in the previous article


Some readers asked whether nginx is superfluous here: can LVS forward directly to the site layer? That is, can we change to the following architecture?


The answer is no. Why? Some of the points I made earlier already imply the reason, but not very explicitly, so I will spell them out here.

LVS is a layer 4 load balancer.

Nginx is a layer 7 load balancer that can forward traffic according to the URL.

First, we need to understand why forwarding by URL matters so much. Suppose there are two clusters, “marketing” and “operation center”. With nginx this is simple: we decide which cluster to forward a request to according to its URL.


Since LVS cannot forward requests according to the URL, which cluster should LVS forward a request to after receiving it?

So why can’t LVS forward by URL? Because it is a layer 4 load balancer. What are layer 4 and layer 7? Here is a brief review of the OSI seven-layer reference model:


As you can see, layer 7 is the application layer and layer 4 is the transport layer. When a request is sent from the application layer, the transport layer, network layer and data link layer each add their own header on the way down. For example, suppose computer A wants to send the data “I’m deep” to computer B; the conversion at each layer is shown in the figure below.


However, the packets finally transmitted on the network (strictly speaking, frames at the data link layer; here collectively called packets) are limited in size, as shown in the figure below.


A packet transmitted on the network cannot exceed 14 + 20 + 20 + 1460 + 4 = 1518 bytes (Ethernet header, IP header, TCP header, payload and frame check sequence), and the application layer data (the payload) it carries cannot exceed 1460 bytes. So an HTTP request of 2000 bytes must be split into two packets before it can be transmitted. Next, let’s look at the format of an HTTP request:


If an HTTP POST request is larger than 1460 bytes (the maximum payload of one packet), it must be split into two or more packets before transmission. One packet then contains the URI and the others do not. A packet without the URI gives LVS nothing to match on, so how could LVS forward it to the right cluster by URL? Once you understand how TCP/IP works, the opening question is easy to answer: LVS is a layer 4 load balancer and cannot forward requests according to URLs.
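
The splitting described above can be checked with a short sketch. It assumes the 1460-byte payload limit from the text; the request content, host name and URI are made up for illustration.

```python
MSS = 1460  # maximum application payload per packet, as in the text


def split_into_segments(data: bytes, mss: int = MSS):
    """Cut the byte stream into chunks of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]


# A hypothetical POST request whose body pushes it past one packet
request = (
    b"POST /operation/report HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 1900\r\n\r\n" + b"x" * 1900
)

segments = split_into_segments(request)
# Which segments actually carry the URI?
has_uri = [b"/operation/report" in segment for segment in segments]
```

Only the first segment contains the request line, which is exactly why a device that looks at packets one by one cannot route on the URL.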

In fact, the key reason is that a layer 4 load balancer is only responsible for forwarding packets: it takes the packet header and checks the IP address to know where to forward, which is very efficient. Matching by URL would require extracting the application layer data and matching it against regular expressions, which obviously costs more. So let professionals do professional work: LVS carries all the traffic, and nginx forwards to the right cluster according to the URL, because it is a layer 7 load balancer and establishes TCP connections with both the downstream and the upstream.



So when a request is split across several packets, nginx, which has a TCP connection with the client, can first receive all of the packets and reassemble them into a complete message. It then selects a server according to the URL, establishes a TCP connection with it, and sends the data to that upstream server in turn.
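
A rough sketch of this reassemble-then-route step is shown below. The routing table, cluster names and request bytes are hypothetical, and the sketch joins the whole message for simplicity, whereas a real proxy only needs the request head before it can choose an upstream.

```python
# Hypothetical mapping from URL prefix to cluster
ROUTES = {
    "/marketing": "marketing-cluster",
    "/operation": "operation-center-cluster",
}


def reassemble(segments) -> bytes:
    """Join the TCP payloads in arrival order into one message."""
    return b"".join(segments)


def pick_cluster(raw_request: bytes) -> str:
    # The request line is the first line: METHOD SP URI SP VERSION
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    method, uri, version = request_line.split(" ")
    for prefix, cluster in ROUTES.items():
        if uri.startswith(prefix):
            return cluster
    return "default-cluster"


segments = [
    b"POST /operation/report HTTP/1.1\r\nHost: exam",
    b"ple.com\r\n\r\nbody",
]
cluster = pick_cluster(reassemble(segments))
```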

Note also that at large companies nginx alone is not enough for forwarding; OpenResty is used instead. What is OpenResty?

“OpenResty is a high-performance web platform based on Nginx and Lua. It integrates a large number of well-crafted Lua libraries, third-party modules and most of their dependencies. It is used to easily build dynamic web applications, web services and dynamic gateways that can handle ultra-high concurrency and are highly scalable.

The goal of OpenResty is to let your web service run directly inside the Nginx service, making full use of Nginx’s non-blocking I/O model to deliver consistently high-performance responses not only to HTTP client requests but also to remote back ends such as MySQL, PostgreSQL, Memcached and Redis.”


Note that the ability described above to interact with MySQL, Redis and other back ends is critical. Didn’t we say that nginx can decide which cluster to call according to the URL? Suppose all requests whose path contains operation must be forwarded to the operation center cluster; you would write a configuration like the following:

upstream backend {
  server 192.0.2.10:8080;  # placeholder address for an operation center server
}
server {
  location /operation {
    proxy_pass http://backend;
  }
}

We have many rules like this. Should we write them all into the nginx configuration file as above? Obviously that is not practical. A more reasonable approach is to store the rules (which URL maps to which cluster) in MySQL. When nginx starts, it loads the rules from MySQL and saves them in Redis and in a local cache. For each request it then looks up the matching rule by URL in the local cache (on a local miss it falls back to Redis, and when the Redis entry has expired it falls back to MySQL) and forwards the request to the matching cluster. Plain nginx cannot do this. OpenResty integrates Lua and brings in modules for talking to MySQL and Redis, so it can. The final architecture is therefore as follows (nginx replaced with OpenResty):
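
The lookup order described above (local cache, then Redis, then MySQL) can be sketched as follows. Plain dictionaries stand in for Redis and MySQL here, expiry is left out, and none of this uses the actual OpenResty Lua modules; the `RuleStore` class and the rule values are hypothetical.

```python
class RuleStore:
    """Looks up URL-to-cluster rules through three tiers of storage."""

    def __init__(self, mysql_rows):
        self.mysql = dict(mysql_rows)  # stand-in for the MySQL rules table
        self.redis = {}                # stand-in for the shared Redis cache
        self.local = {}                # per-process local cache
        self.mysql_reads = 0

    def lookup(self, prefix):
        if prefix in self.local:
            return self.local[prefix]
        if prefix in self.redis:
            cluster = self.redis[prefix]
        else:
            # Missing in Redis too: read the source of truth and refill Redis
            self.mysql_reads += 1
            cluster = self.mysql.get(prefix)
            if cluster is not None:
                self.redis[prefix] = cluster
        if cluster is not None:
            self.local[prefix] = cluster
        return cluster


rules = RuleStore({"/operation": "operation-center-cluster"})
first = rules.lookup("/operation")   # falls through to MySQL
second = rules.lookup("/operation")  # served from the local cache
```

With this arrangement a rule change only has to be written to the database, rather than edited into the configuration file of every instance.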


We previously wrote an article on the specific practice of using OpenResty as the access layer / gateway. Have a look if you are interested ^_^

Original link: 41385912/article/details/118886594