# Contents

1. Preface
2. Counter method
3. Sliding window
4. Leaky bucket algorithm
5. Token bucket algorithm

# Preface

In a highly concurrent system, rate limiting is very important. When a huge volume of traffic hits our server directly, an interface can quickly become unavailable, and if nothing is done the whole application may go down with it.

So what is rate limiting? As the name suggests, it limits traffic, just as a 1 GB mobile data plan stops working once the quota is used up. Through rate limiting we can control the QPS of the system and thereby protect it. This article introduces the common rate limiting algorithms and their respective characteristics.

# Counter method

The counter method is the simplest of the rate limiting algorithms. Suppose we stipulate that interface A may not be accessed more than 100 times per minute. We can set up a counter: every incoming request increments it by 1. If the counter exceeds 100 and the request arrives within one minute of the first request, too many requests have come in and the request is rejected; if more than one minute has passed since the first request and the counter is still within the limit, the counter is reset.
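A minimal Python sketch of this fixed-window counter (the class name, the `allow` method, and the default limits are illustrative, not taken from the original pseudocode):

```python
import time

class FixedWindowCounter:
    """Fixed-window rate limiter: at most `limit` requests per `window` seconds."""

    def __init__(self, limit=100, window=60.0):
        self.limit = limit                     # max requests per window
        self.window = window                   # window length in seconds
        self.count = 0                         # requests seen in the current window
        self.window_start = time.monotonic()   # when the current window began

    def allow(self):
        now = time.monotonic()
        if now - self.window_start > self.window:
            # More than one window has elapsed since the first request: reset.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False   # limit reached within the current window
```

Each call to `allow()` either admits or rejects the request; the counter and the window start are reset together once a full window has elapsed.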

Although this algorithm is simple, it has a fatal flaw: the boundary problem, where requests burst at the edge of the time window. Let's look at the following figure:

As the figure shows, if a malicious user sends 100 requests in the instant before 0:59 and another 100 at exactly 1:00, they have actually pushed 200 requests through within one second. Our rule allows at most 100 requests per minute, i.e. roughly 1.7 per second on average, yet by bursting around the reset point of the time window a user can far exceed that rate and may crush the application through this loophole.

The sharp-eyed reader will have noticed that the root cause is the low precision of our statistics. So how can we handle this well, or at least reduce the impact of the boundary problem? Let's look at the sliding window algorithm below.

# Sliding window

The sliding window, also known as the rolling window, was introduced to solve exactly this problem. If you have studied the TCP protocol, the term sliding window will be familiar. The following figure explains the algorithm well:

In the figure, the entire red rectangle represents one time window, which in our example is one minute. We then subdivide the window: here it is split into 6 cells, so each cell represents 10 seconds. Every 10 seconds the window slides one cell to the right, and every cell has its own independent counter. For example, when a request arrives at 0:35, the counter for the 0:30–0:39 cell is incremented by 1.

So how does the sliding window solve the boundary problem? As the figure shows, the 100 requests arriving just before 0:59 fall into the grey cell, while the requests arriving at 1:00 fall into the orange cell. When the clock reaches 1:00, the window slides one cell to the right; the total number of requests inside the window is then 200, exceeding the limit of 100, so the rate limit is correctly triggered.
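Under the same assumptions, a sliding window can be sketched in Python by keeping one counter per cell and discarding cells that have slid out of the window (the names and the injectable `clock` parameter are illustrative):

```python
import time

class SlidingWindowCounter:
    """Sliding-window limiter: `limit` requests per `window` seconds, split into `cells` cells."""

    def __init__(self, limit=100, window=60.0, cells=6, clock=time.monotonic):
        self.limit = limit
        self.cells = cells
        self.cell_len = window / cells   # e.g. 10 s per cell for a 1-minute window
        self.clock = clock               # injectable for testing
        self.counts = {}                 # cell index -> request count

    def allow(self):
        now_cell = int(self.clock() // self.cell_len)
        # Drop cells that have slid out of the current window.
        oldest = now_cell - self.cells + 1
        for idx in [i for i in self.counts if i < oldest]:
            del self.counts[idx]
        # Admit only if the total across all live cells is under the limit.
        if sum(self.counts.values()) < self.limit:
            self.counts[now_cell] = self.counts.get(now_cell, 0) + 1
            return True
        return False
```

With this structure, the 100 requests counted at 0:59 are still inside the window at 1:00, so a second burst is rejected rather than slipping through.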

Looking back at the counter algorithm, we can now see that it is really just a sliding window with a single cell: the one-minute window is never subdivided.

It follows that the more cells the sliding window is divided into, the smoother its movement and the more accurate the rate limiting statistics.

# Leaky bucket algorithm

To understand the leaky bucket algorithm, let's look at its schematic diagram:

The figure shows that the algorithm is very simple. We have a bucket of fixed capacity, with water flowing in and out. We cannot predict how much water will flow in or how fast, but the bucket releases water at a fixed rate, and once the bucket is full, any excess water overflows.

If we replace the water with the requests of a real application, we can see that the leaky bucket naturally limits the rate of requests: the interface processes them at a constant rate, so the leaky bucket has no boundary problem.
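A leaky bucket can be sketched in Python by tracking the current water level and draining it at a fixed rate on each call (names, defaults, and the injectable `clock` are illustrative):

```python
import time

class LeakyBucket:
    """Leaky bucket: requests queue up to `capacity`; water leaks out at `rate` per second."""

    def __init__(self, capacity=100, rate=10.0, clock=time.monotonic):
        self.capacity = capacity   # bucket size
        self.rate = rate           # leak (processing) rate, requests/second
        self.water = 0.0           # current amount of "water" in the bucket
        self.clock = clock         # injectable for testing
        self.last = clock()        # time of the last update

    def allow(self):
        now = self.clock()
        # Leak water for the time elapsed since the last request.
        self.water = max(0.0, self.water - (now - self.last) * self.rate)
        self.last = now
        if self.water < self.capacity:
            self.water += 1.0      # this request fits in the bucket
            return True
        return False               # bucket full: the request overflows
```

However the inflow bursts, the water level only ever drops at the fixed `rate`, which is what smooths the processing rate.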

# Token bucket algorithm

To understand the token bucket algorithm, let's again start from its schematic diagram:

The figure shows that the token bucket is slightly more complex than the leaky bucket. We have a bucket of fixed capacity that stores tokens. The bucket is empty at first and is filled with tokens at a fixed rate r until it reaches capacity; excess tokens are discarded. Every incoming request tries to remove one token from the bucket; if no token is available, the request is rejected.

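A token bucket can be sketched in Python in much the same way, except that the bucket refills rather than drains; here the bucket starts full so that an initial burst is possible, whereas the description above starts it empty (names, defaults, and the injectable `clock` are illustrative):

```python
import time

class TokenBucket:
    """Token bucket: tokens refill at `rate` per second up to `capacity`; each request costs one."""

    def __init__(self, capacity=100, rate=100 / 60.0, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate                # refill rate r, tokens/second
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self.clock = clock              # injectable for testing
        self.last = clock()             # time of the last refill

    def allow(self):
        now = self.clock()
        # Refill tokens for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False   # no token available: reject the request
```

Note the symmetry with the leaky bucket: one caps the backlog of pending work, the other caps the budget of permits.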

RateLimiter implementation

For a ready-made token bucket you can use the RateLimiter class from Google's Guava library directly.

# Related variants

If we study the algorithm carefully, we find that by default removing a token from the bucket takes no time. If a delay is imposed on removing tokens, the algorithm in effect becomes a leaky bucket. The SmoothWarmingUp class in Google's Guava library adopts this idea.

# Boundary problem

Let's revisit the boundary scenario. At 0:59 the bucket holds its full 100 tokens, so those 100 requests can pass instantly. But because tokens refill at a much lower rate, the bucket cannot be back at 100 tokens by 1:00, so another 100 requests cannot pass at that moment. The token bucket therefore handles the boundary problem well. The figure below compares the rate at the boundary for the counter (left) and the token bucket (right): although the token bucket allows bursts, the next burst cannot happen until enough tokens have accumulated in the bucket.
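The arithmetic behind this can be checked with a small sketch: with a hypothetical capacity of 100 tokens and refill rate r = 100/60 tokens per second, a bucket drained by a burst at 0:59 has only about 1.67 tokens back by 1:00 (function and parameter names are illustrative):

```python
def tokens_after_burst(capacity=100, rate=100 / 60.0, burst_at=59.0, check_at=60.0):
    """Tokens available at `check_at` if a burst emptied the bucket at `burst_at`."""
    elapsed = check_at - burst_at
    return min(capacity, elapsed * rate)   # refill is capped at the bucket capacity

# One second after the bucket was emptied, roughly 1.67 tokens have returned,
# nowhere near the 100 tokens a second full burst would need.
print(round(tokens_after_burst(), 2))   # prints 1.67
```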

# Summary

Counter vs sliding window

The counter algorithm is the simplest and can be regarded as a low-precision implementation of the sliding window. The sliding window must store one counter per cell, so it needs more memory: the higher the precision of the window, the more storage is required.

Leaky bucket algorithm vs token bucket algorithm

The most obvious difference between the two is that the token bucket allows a certain amount of burst traffic: because removing a token takes no time by default, a bucket holding 100 tokens can let 100 requests through instantly.

The token bucket algorithm is widely used in industry because it is simple to implement, allows some burst traffic, and is friendly to users. Of course, the choice always depends on the concrete scenario: there is no universally optimal algorithm, only the most appropriate one.