The origin of traffic peak shaving
It mainly comes from Internet business scenarios. For example, when Spring Festival train tickets go on sale, a huge number of users rush to buy at the same moment; and in Alibaba's well-known Double 11 flash sales, hundreds of millions of users swarm in within a short time, producing enormous instantaneous traffic (high concurrency). Imagine 2 million people ready to grab a product at 12:00 midnight, while the stock is limited to only 100-500 units.
In that case only a few hundred users can actually buy the product. From a business perspective, however, the flash sale wants as many participants as possible: the more people who come to browse the product before the sale starts, the better.
But once the sale begins and users start placing orders, the flash-sale backend, with its limited servers, does not want millions of people initiating purchases at the same instant.
We all know that server processing resources are limited, so a traffic peak can easily bring servers down and leave users unable to access the site.
This is just like rush-hour congestion in commuting, and one solution to that problem is staggered travel to avoid the peak.
Similarly, online business scenarios such as flash sales need a comparable solution to safely absorb the traffic peak brought by rush purchases. This is the origin of traffic peak shaving.
How to implement traffic peak shaving
In essence, peak shaving means delaying user requests and filtering them layer by layer, following the principle that "the number of requests that finally reach the database should be as small as possible."
1. Peak shaving with a message queue
The easiest way to shave a traffic peak is to buffer the instantaneous traffic with a message queue, turning synchronous direct calls into asynchronous indirect pushes: a queue in the middle absorbs the instantaneous traffic peak at one end and pushes messages out smoothly at the other end.
Message queue middleware mainly solves problems such as application decoupling, asynchronous messaging, and traffic peak shaving. Common message queue systems widely used in production include ActiveMQ, RabbitMQ, ZeroMQ, Kafka, MetaQ, and RocketMQ.
Here the message queue acts like a reservoir: it holds back the upstream flood and reduces the peak flow into the downstream river, thus preventing the flood from causing disaster.
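The idea above can be sketched with a bounded in-process queue standing in for real middleware such as Kafka or RabbitMQ (the function names and capacity are illustrative, not a real API). Requests beyond the queue's capacity fail fast instead of hitting the backend, and a consumer drains the queue at its own sustainable pace:

```python
import queue
import threading

# A bounded queue stands in for the message middleware.
CAPACITY = 100
buffer = queue.Queue(maxsize=CAPACITY)

def submit_order(user_id):
    """Spike side: enqueue the request, or fail fast when the buffer is full."""
    try:
        buffer.put_nowait(user_id)
        return "queued"
    except queue.Full:
        return "rejected"            # traffic above capacity is shed here

# Simulate an instantaneous spike of 150 requests against capacity 100.
outcomes = [submit_order(i) for i in range(150)]

processed = []
def consume():
    """Backend side: drain the queue smoothly at its own pace."""
    while not buffer.empty():
        processed.append(buffer.get())   # stand-in for real order handling

worker = threading.Thread(target=consume)
worker.start()
worker.join()

print(outcomes.count("queued"), outcomes.count("rejected"), len(processed))
# 100 queued, 50 rejected, 100 processed downstream
```

In a real system the consumer would run continuously and the rejected requests would receive a friendly "sold out / try again" response, but the shape of the solution is the same: the queue absorbs the peak, the backend sees only a steady stream.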
2. Peak shaving funnel: layer-by-layer filtering
Another way to handle the flash-sale scenario is to filter requests hierarchically, screening out invalid requests along the way.
Layered filtering applies a "funnel" design to request handling, as shown in the following figure:
Like a funnel, it reduces the volume of data and requests layer by layer.
1) Core idea of layered filtering
- Filter out as many invalid requests as possible at each level.
- Use a CDN to filter out the bulk of image and static-resource requests.
- Then use a distributed cache such as Redis to filter read requests — a typical case of intercepting reads upstream.
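Intercepting reads upstream with a cache can be sketched as follows. A plain dict stands in for Redis, and `db_reads` counts how many requests actually fall through to the database (all names here are illustrative):

```python
# Cache layer intercepts repeated reads so they never reach the database.
cache = {}
db_reads = 0

def query_stock_db(item_id):
    """Stand-in for the real database query."""
    global db_reads
    db_reads += 1
    return 100                       # pretend the DB reports 100 units in stock

def get_stock(item_id):
    if item_id in cache:             # cache hit: the request stops at this layer
        return cache[item_id]
    stock = query_stock_db(item_id)  # cache miss: fall through to the DB
    cache[item_id] = stock
    return stock

# 10,000 users reading the same item's stock produce exactly one DB read.
for _ in range(10_000):
    stock = get_stock("item-42")
print(db_reads)   # 1
```

This is why the funnel works: the layers closest to the user (CDN, cache) answer the overwhelming majority of requests, and only a trickle reaches the database.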
2) Basic principles of layered filtering
- Partition write data reasonably by time, and filter out requests that have already expired.
- Rate-limit write requests, filtering out the requests that exceed the system's capacity.
- Do not enforce strong consistency checks on the reads involved, which removes the bottleneck that consistency checking would create.
- Perform strong consistency checks on write data, keeping only the last valid data.
- In the end, only valid requests reach the tip of the "funnel" (the database). For example, only when the user actually reaches the ordering and payment steps is strong data consistency required.
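The second principle, rate-limiting write requests, is commonly implemented with a token bucket. A minimal sketch (the class and parameters are illustrative, not taken from any particular library):

```python
import time

class TokenBucket:
    """Simple token-bucket limiter: writes above the refill rate are shed,
    with a small burst allowance equal to the bucket capacity."""

    def __init__(self, rate, capacity):
        self.rate = rate                  # tokens refilled per second
        self.capacity = capacity          # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                      # over capacity: filter the request out

bucket = TokenBucket(rate=1, capacity=5)
# A burst of 20 near-simultaneous write requests: only the 5 burst tokens pass.
results = [bucket.allow() for _ in range(20)]
print(results.count(True))   # 5
```

Requests that the limiter rejects can be answered immediately with a "system busy" message, so they never consume downstream resources.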
Summary of traffic peak shaving
1. For high-concurrency businesses like flash sales, the basic principle is to intercept requests upstream and reduce the pressure downstream. If requests are not intercepted early, they are likely to cause read-write lock contention, even deadlocks, and finally an avalanche.
2. Separate dynamic and static resources, and serve the static ones through a CDN.
3. Make full use of caches (Redis, etc.) to raise QPS and thus the throughput of the whole cluster.
4. Peak traffic is a major cause of system collapse, so a message queue such as Kafka is needed to absorb the instantaneous traffic peak at one end and push messages out smoothly at the other.