Load balancing architecture for high concurrency system design


When a system is first developed, it is usually a stand-alone system: the application and the database share a single server. As the business grows and traffic increases, that single server hits a performance ceiling and struggles to support the business. At that point we should consider separating the database from the application server. If traffic keeps growing, we consider splitting the database into multiple databases and tables and putting the application servers behind load balancing. This already falls into the category of distributed systems. The core idea of a distributed system is a single word: “divide”. If one server cannot cope, use two, three, four… Of course, dividing brings its own problems, most commonly data consistency and call chain monitoring, which are outside the scope of today’s discussion. If you are interested, look them up on Baidu.

Many projects adopt “distributed” deployment to improve system performance, and the first step is often a load balancing strategy.

Load balancing

Load balancing means balancing the load (work tasks) by distributing it across multiple operating units, such as FTP servers, web servers, enterprise core application servers and other main task servers, so that they complete the work cooperatively. Load balancing is built on top of the existing network structure. It provides a transparent, cheap and effective way to expand the bandwidth of servers and network devices, strengthen network data processing capacity, increase throughput, and improve the availability and flexibility of the network.

Since load balancing is a form of the “divide” strategy, it involves a task allocator, task executors and an allocation algorithm. The task allocator is what we usually call the load balancer, the task executors are the servers that process the tasks, and the allocation algorithm is what we usually call the allocation strategy, such as round robin. Strictly speaking, calling the task allocator a load balancer is not quite accurate. The term load balancer emphasizes distributing tasks evenly so that every computing unit carries the same amount of work, whereas in practice tasks are more often allocated according to the performance or business role of each computing unit. Having every unit handle roughly the same number of tasks is only one part of what a load balancer does.
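To make the three roles concrete, here is a minimal sketch in Python. The class names, server names and weights are all hypothetical, and the weighted variant simply repeats each server according to its weight, which is one simple way (not the only one) to allocate by capacity rather than evenly.

```python
import itertools


class RoundRobinBalancer:
    """Task allocator that hands each task to the next server in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobinBalancer:
    """Task allocator that gives a server `weight` turns per rotation.

    `weights` maps a server name to a positive integer standing in for its
    relative capacity (hypothetical values chosen for illustration).
    """

    def __init__(self, weights):
        # Repeat each server name `weight` times, then rotate over that list.
        slots = [server for server, weight in weights.items() for _ in range(weight)]
        if not slots:
            raise ValueError("at least one server needs a positive weight")
        self._cycle = itertools.cycle(slots)

    def pick(self):
        return next(self._cycle)
```

With `RoundRobinBalancer(["app-1", "app-2"])` every server gets the same share, while `WeightedRoundRobinBalancer({"big": 2, "small": 1})` sends twice as many tasks to the larger machine.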

Take an HTTP request as an example. A single HTTP request can pass through several stages of load balancing. Which stages a system needs depends on its request volume, which is directly related to the commonly cited QPS / TPS / DAU figures. If the request volume is very small, load balancing is not needed at all. Of course, load balancing is sometimes added purely for high availability, which is not discussed here. So which load balancers can an HTTP request pass through? The HTTP request process is shown in the figure below.
[Figure: the HTTP request process]

DNS load balancing

When a client requests a URL (ignoring direct requests to an IP address), the first step is to ask a DNS server to resolve the domain name into an IP address. DNS resolution of the same domain name can return different IP addresses depending on where the request comes from, and this can be used for load balancing. The fastest way to serve a client is from the nearest resource, so the system is deployed in data centers in different regions. After DNS resolution, each client only requests the nearest data center, which is much faster than requesting a remote one. For example, a website can be deployed in a Beijing data center and a Shenzhen data center at the same time. When users in Hebei request the website, they are directed to the Beijing data center, which is much faster than visiting Shenzhen.
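The Beijing / Shenzhen example can be sketched as a lookup table. This is only an illustration of the idea, not how a real DNS server is implemented: the domain, the region table, the default data center and the IP addresses (taken from the documentation-only ranges) are all assumptions.

```python
# Hypothetical zone data: one IP per data center for each domain.
ZONES = {
    "www.example.com": {
        "beijing": "192.0.2.10",
        "shenzhen": "198.51.100.10",
    },
}

# Hypothetical mapping from a client's region to its nearest data center.
NEAREST_DATACENTER = {
    "hebei": "beijing",
    "beijing": "beijing",
    "guangdong": "shenzhen",
    "shenzhen": "shenzhen",
}

# Fallback for regions that are not in the table (an arbitrary choice).
DEFAULT_DATACENTER = "beijing"


def resolve(domain, client_region):
    """Return the IP of the data center nearest to the client's region."""
    records = ZONES[domain]  # raises KeyError for an unknown domain
    datacenter = NEAREST_DATACENTER.get(client_region.lower(), DEFAULT_DATACENTER)
    return records[datacenter]
```

Here `resolve("www.example.com", "Hebei")` returns the Beijing address, while a client in Guangdong gets the Shenzhen address for the same domain name.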

DNS load balancing only takes effect at domain name resolution time, so its granularity is very coarse and the load balancing algorithms available are limited. However, this scheme is relatively simple to implement and low in cost, and to a certain extent it shortens response time and speeds up access. Because DNS records are cached for a long time, there is a window after an update during which some clients still hold the old record, which can cause access errors for some users.
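The caching problem can be illustrated with a small sketch. The `CachingResolver` class is hypothetical, a plain dict stands in for the authoritative DNS server, and time is passed in explicitly so the behavior is easy to follow.

```python
class CachingResolver:
    """Toy resolver that caches each answer for `ttl_seconds`."""

    def __init__(self, authoritative, ttl_seconds):
        # `authoritative` is a dict of domain -> IP standing in for the real DNS server.
        self._authoritative = authoritative
        self._ttl = ttl_seconds
        self._cache = {}  # domain -> (ip, expires_at)

    def resolve(self, domain, now):
        cached = self._cache.get(domain)
        if cached is not None and now < cached[1]:
            # Still within the TTL: answer from the cache, even if the
            # authoritative record has changed in the meantime.
            return cached[0]
        ip = self._authoritative[domain]
        self._cache[domain] = (ip, now + self._ttl)
        return ip
```

If the record is changed right after a client has cached it, that client keeps receiving the old IP until the TTL runs out. A shorter TTL narrows this window at the cost of more DNS queries.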

Hardware load balancing

Once a request knows its target IP, it travels through layers of gateways and routers to the data center that hosts that IP. Everything up to that point belongs to network transmission, where it is generally difficult to intervene. Many data centers achieve load balancing with dedicated hardware, which is similar to routers and switches and can be thought of as underlying infrastructure. At present, F5 is the most commonly used hardware device. Such devices are generally produced by large companies. Their performance has been rigorously tested and their features are powerful, but they are very expensive. Small and medium-sized companies generally do not need this kind of equipment.

Hardware load balancers are very powerful: they generally support several million concurrent requests per second and many load balancing algorithms, and they usually come with security features such as firewalls and attack protection.

Software load balancing

Compared with hardware load balancing, software load balancing is now far more common across companies. The basic approach is to set up a dedicated load balancing server or cluster and install software with load balancing capability to distribute requests. LVS is the most commonly used layer 4 load balancer, and because it works at layer 4 it can balance traffic for almost any application-layer protocol. LVS has been integrated into the Linux kernel, where it implements IP-based load balancing and scheduling of requests. In addition, nginx can do load balancing at layer 7. Nginx supports the HTTP and e-mail protocols, and there are also nginx modules for layer 4 load balancing.
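The difference between the two layers is what the balancer can see when it makes its decision. The following sketch is only an illustration of that difference and does not reproduce how LVS or nginx work internally. The function names, backend names and route table are hypothetical.

```python
import zlib


def pick_layer4(client_ip, client_port, backends):
    """Layer 4 only sees addresses and ports, so hash the connection's source.

    The same client address and port always map to the same backend.
    """
    key = f"{client_ip}:{client_port}".encode()
    return backends[zlib.crc32(key) % len(backends)]


def pick_layer7(path, routes, default_pool):
    """Layer 7 can read the HTTP request, so route by URL path prefix.

    `routes` is a list of (prefix, pool_name) pairs checked in order.
    """
    for prefix, pool in routes:
        if path.startswith(prefix):
            return pool
    return default_pool
```

A layer 4 decision works for any protocol carried over the connection, which is why it is more general. A layer 7 decision can, for example, send `/static/` requests to one pool and `/api/` requests to another, but only for protocols the balancer understands.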

Compared with hardware, the throughput of software load balancing is much lower. Even LVS at layer 4 only reaches the tens of thousands, and nginx is likewise in the tens of thousands. However, this is enough for the business of most companies. By the time a company’s request volume reaches several million, it can probably afford F5 hardware. The biggest advantages of software load balancing are flexible configuration, strong scalability, easy customization and low cost, which makes it the preferred solution for small and medium-sized companies.


Having said all this, the solutions above all address the problem from the perspective of an HTTP request. Each has its own strengths and weaknesses. Adopting all of them at the initial design stage in pursuit of high performance is not necessarily a good thing. Every system changes its architecture gradually as the business grows, and the load balancing schemes are generally adopted in the order software load balancing > hardware load balancing > DNS load balancing. The order of hardware and DNS may sometimes be reversed, but software load balancing always comes first. As business volume grows, the three solutions increasingly complement one another. For a service like WeChat, for example, hardware load balancing alone could never meet the business requirements.

Which scheme to adopt at which stage depends on the request volume of the specific service. For example, if QPS is around 10,000, nginx or LVS can handle it. At the million level, a combination of hardware and software is worth trying. At 10 million or more, we need to consider deploying across multiple data centers with DNS load balancing. No single solution is perfect, but mixing several can get close.
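These rules of thumb can be written down as a small function. The thresholds are the rough figures from the paragraph above, not measured limits, and the function name and layer labels are made up for illustration.

```python
def suggest_layers(qps):
    """Suggest which load balancing layers to combine for a given QPS.

    Thresholds follow the rough figures in the text; real systems should be
    sized from measurements rather than from these cut-offs.
    """
    layers = ["software"]  # nginx or LVS comes first
    if qps >= 1_000_000:
        layers.append("hardware")  # add a hardware balancer such as F5
    if qps >= 10_000_000:
        layers.append("dns")  # add DNS load balancing across data centers
    return layers
```

For a service at about 10,000 QPS this returns only the software layer. At 10 million QPS or more it returns all three, reflecting that the schemes complement rather than replace one another.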
