Design of load balancing architecture for high concurrency systems


In its early stages, a system is often a single-machine system: the application and the database live on one server. As the business grows and traffic increases, a single server hits a performance ceiling and can no longer support the load. At that point the database should be separated from the application server. If traffic keeps growing, the next steps are to shard the database into multiple databases and tables, and to load balance the application servers. This already falls into the category of distributed systems, whose core idea is a single word: "divide". If one server cannot handle the load, use two, three, four… Of course, dividing brings problems of its own, such as the familiar data consistency issues and call-chain monitoring, which are beyond the scope of this article; if you are interested, a web search will turn up plenty of material.

Many projects adopt "distributed" deployment to improve system performance, and load balancing is usually the strategy adopted in the first phase.

Load balancing

Load balancing means spreading the load (work tasks) across multiple operating units, such as FTP servers, web servers, and enterprise core application servers, so that they complete the work together. Load balancing is built on top of the existing network structure. It provides a transparent, cheap, and effective way to expand the bandwidth of servers and network equipment, strengthen data processing capacity, increase throughput, and improve network availability and flexibility.

Since load balancing is a form of the "divide" strategy, it involves a task allocator, task executors, and an allocation algorithm. The task allocator is what we usually call the load balancer, the task executors are the servers that process the tasks, and the allocation algorithm is the distribution strategy, such as round robin. Strictly speaking, calling the task allocator a "load balancer" is imprecise: that term emphasizes distributing tasks evenly so that every computing unit carries a balanced workload, whereas in practice tasks are more often allocated according to each unit's capacity or business role. Having every unit handle roughly the same number of tasks is only one part of what a distributed allocator does.
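The allocator/executor/algorithm split described above can be sketched in a few lines. The following is a minimal, illustrative round-robin allocator (the class name and server addresses are invented for the example, not taken from any real load balancer):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Minimal task allocator sketch: each request goes to the next server in turn."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._cycle = cycle(self._servers)  # endlessly repeats the server list

    def pick(self):
        # The allocation algorithm: plain round robin, ignoring server capacity.
        return next(self._cycle)

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([lb.pick() for _ in range(4)])
# → ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```

A production allocator would also weight servers by capacity or route by business rules, which is exactly the distinction the paragraph above draws between "even distribution" and real-world allocation.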

Take an HTTP request as an example: there are many points in its journey where load balancing can happen. The stage at which a system performs load balancing depends on its request volume, which is directly related to the familiar metrics QPS/TPS/DAU. If a system receives very few requests, load balancing is not necessary at all (although it is sometimes done purely for high availability, which is not discussed here). So which load balancers can an HTTP request pass through? The request flow is illustrated below.
[Figure: the load balancing stages an HTTP request passes through]

DNS load balancing

When a client sends a request to a URL (direct requests to an IP address are not considered here), the first step is to ask a DNS server to resolve the domain name into an IP address. For the same domain name, DNS can return different IP addresses depending on where the request comes from, and this feature can be used for load balancing. The fastest way for a client to fetch resources is to fetch them from the nearest location, so the system is deployed in data centers in different regions; after DNS resolution, each client requests the resources closest to it, which is much faster than requesting a remote data center. For example, a website can be deployed in both a Beijing data center and a Shenzhen data center. When users in Hebei visit the site, they are directed to Beijing, which is much faster than going to Shenzhen.
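The region-aware resolution described above can be simulated with a simple lookup table. This is only a sketch of the idea; the domain, regions, and IP addresses are all invented, and real GeoDNS providers use far richer geolocation data:

```python
# Hypothetical GeoDNS table: domain -> client region -> A records.
# All addresses are illustrative, not real deployments.
GEO_DNS = {
    "www.example.com": {
        "north": ["101.200.0.10", "101.200.0.11"],  # e.g. Beijing data center
        "south": ["113.108.0.10", "113.108.0.11"],  # e.g. Shenzhen data center
    }
}

def resolve(domain, client_region, default="north"):
    """Return the A records nearest the client, falling back to a default region."""
    records = GEO_DNS.get(domain, {})
    return records.get(client_region, records.get(default, []))

# A user in Hebei (northern China) is answered with the Beijing addresses.
print(resolve("www.example.com", "north"))
# → ['101.200.0.10', '101.200.0.11']
```

The key point is that the same domain name resolves differently per source region, which is what makes DNS-level load balancing possible at all.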

DNS load balancing happens only at domain-name resolution time, so its granularity is coarse and the load balancing algorithms it can support are limited. On the other hand, the scheme is simple to implement and very cheap, and to a certain extent it shortens response times and speeds up access. Because DNS records are cached for a long time, stale entries linger while updates propagate, which can cause access failures for some users.

Hardware load balancing

Once a request knows its target IP, it travels through layers of gateways and routers to reach the target data center. Up to that point it is in the realm of network transmission, which is generally hard to intervene in. Many data centers achieve load balancing with dedicated hardware, similar in spirit to routers and switches; it can be thought of as low-level network equipment. The most widely used product today is F5. Such hardware is typically built by large vendors, has been rigorously tested, and is very powerful, but it is also very expensive. Small and medium-sized companies generally have no need for such premium equipment.

The performance of hardware load balancers is very strong: they typically support concurrency on the order of millions per second, offer many load balancing algorithms, and usually come with security features such as firewalls and attack protection.

Software load balancing

Compared with hardware load balancing, software load balancing is far more common. The basic approach is to dedicate a load balancing server or cluster and install software that performs the distribution. LVS, the most commonly used layer-4 load balancer, works below the application layer and can therefore balance almost any application-layer protocol. LVS has been integrated into the Linux kernel as a module; the project implements an IP-based request load balancing and scheduling scheme inside the kernel. At layer 7, nginx can also act as a load balancer; it supports the HTTP and e-mail protocols, and there are nginx modules for layer-4 load balancing as well.
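As a concrete illustration of the nginx layer-7 approach, a minimal upstream configuration might look like the following. The pool name, addresses, and weights are invented for the sketch; a real deployment would use its own backend list:

```nginx
# Minimal sketch of nginx layer-7 load balancing (illustrative addresses).
upstream backend_pool {
    server 192.168.1.10:8080 weight=2;  # stronger machine receives twice the traffic
    server 192.168.1.11:8080;
    server 192.168.1.12:8080 backup;    # used only when the others are unavailable
}

server {
    listen 80;
    location / {
        proxy_pass http://backend_pool;           # distribute requests to the pool
        proxy_set_header Host $host;              # preserve the original Host header
        proxy_set_header X-Real-IP $remote_addr;  # pass the client IP to backends
    }
}
```

The `weight` parameter is a small example of the earlier point that real allocation follows each unit's capacity rather than pure even distribution.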

Compared with hardware, the throughput of software load balancing is much lower: even layer-4 LVS handles only hundreds of thousands of requests, and nginx tens of thousands. That is more than enough for an ordinary company's business, and by the time a company's traffic reaches the millions, it can presumably afford F5 hardware. The biggest advantages of software load balancing are flexible configuration, strong scalability, high customizability, and low cost, which is why it is the preferred solution for small and medium-sized companies.


Having said all that, the solutions above are all framed around HTTP requests, and each has its own strengths and shortcomings. When designing a system, adopting every one of them up front in pursuit of high performance is not necessarily a good idea; each system's architecture evolves gradually as the business grows. The usual progression is software load balancing -> hardware load balancing -> DNS load balancing. The hardware and DNS stages are sometimes swapped, but software always comes first. As traffic grows, the three schemes increasingly cooperate and complement one another; a service the size of WeChat could never meet its business requirements with hardware load balancing alone.

Which stage and scheme to adopt still depends on the actual request volume of the service. For example, if my QPS is currently around 10,000, nginx or LVS alone can handle it; when it rises to the millions, hardware plus software is worth trying; at tens of millions or higher, DNS load balancing across multiple data centers should be considered. No single scheme is perfect, but a mix of schemes can come close.
