Implementing various rate limiting algorithms in Go



When developing highly concurrent systems, we may encounter interfaces that are accessed too frequently. To keep the system highly available and stable, we need to limit traffic. You might use Nginx to control requests, or implement this with some popular libraries. Rate limiting is a powerful weapon for high-concurrency systems. Before designing rate limiting algorithms, let's first understand what they are.

Rate limiting

The purpose of rate limiting is to protect the system by limiting the rate of concurrent requests, or the number of requests within a time window. Once the limit is reached, the system can deny service (redirect to an error page or report that the resource is unavailable), queue requests (e.g. flash sales, comments, orders), or degrade (return fallback or default data).

As shown in the figure:


A cartoon drawn by the author

As the cartoon shows, when traffic surges during a certain period, the access frequency of a service interface may become very high. If we do not limit the access frequency, the server may crash under the excessive pressure, and data loss may follow. Therefore, we need to limit the flow of requests.

A rate limiting algorithm helps us control the call frequency of each interface or function in a program. It acts a bit like a fuse, preventing the system from being paralyzed by excessive access frequency or concurrency. When calling some third-party interfaces, we may see response headers like this:

X-RateLimit-Limit: 60         // 60 requests per second
X-RateLimit-Remaining: 22     // how many requests are left
X-RateLimit-Reset: 1612184024 // timestamp when the limit resets

The HTTP response headers above tell the caller what the server's rate limit is, ensuring an upper bound on back-end interface access. Many algorithms exist to solve the rate limiting problem, each with its own purpose; the usual strategies are to reject excess requests or to queue them.

Generally speaking, the common rate limiting approaches are:

  • Counter
  • Sliding window
  • Leaky bucket
  • Token bucket


Counter

The counter is the simplest rate limiting algorithm. Its principle: within a time interval, count the requests and compare the count with the threshold to decide whether limiting is needed; once the time boundary is reached, clear the counter. It is like boarding a bus: the bus has a fixed number of seats, and if it is full, you are not allowed on, because otherwise the bus is overloaded and the driver gets fined if caught by the traffic police. For our system, the penalty is not a fine: it may crash outright.

  • Set a variable count in the program. When a request arrives, increment count by 1 and record the request time.
  • When the next request arrives, check whether count exceeds the configured frequency and whether the current request time and the first request time fall within 1 minute.
  • If they fall within 1 minute and the configured frequency is exceeded, there are too many requests and subsequent requests are rejected.
  • If the interval between this request and the first request is greater than the counting cycle, and count is still within the limit, reset count.

Code implementation:

package main

import (
	"log"
	"sync"
	"time"
)

type Counter struct {
	rate  int           // maximum number of requests allowed in the counting cycle
	begin time.Time     // start time of the current count
	cycle time.Duration // counting cycle
	count int           // requests received so far in the counting cycle
	lock  sync.Mutex
}

func (l *Counter) Allow() bool {
	l.lock.Lock()
	defer l.lock.Unlock()

	if l.count == l.rate-1 {
		now := time.Now()
		if now.Sub(l.begin) >= l.cycle {
			// within the allowed rate: reset the counter
			l.Reset(now)
			return true
		}
		return false
	}
	// rate limit not reached: count plus 1
	l.count++
	return true
}

func (l *Counter) Set(r int, cycle time.Duration) {
	l.rate = r
	l.begin = time.Now()
	l.cycle = cycle
	l.count = 0
}

func (l *Counter) Reset(t time.Time) {
	l.begin = t
	l.count = 0
}

func main() {
	var wg sync.WaitGroup
	var lr Counter
	lr.Set(3, time.Second) // at most 3 requests per second
	for i := 0; i < 10; i++ {
		wg.Add(1)
		log.Println("create request:", i)
		go func(i int) {
			defer wg.Done()
			if lr.Allow() {
				log.Println("response request:", i)
			}
		}(i)
		time.Sleep(200 * time.Millisecond)
	}
	wg.Wait()
}


2021/02/01 21:16:12 create request: 0
2021/02/01 21:16:12 response request: 0
2021/02/01 21:16:12 create request: 1
2021/02/01 21:16:12 response request: 1
2021/02/01 21:16:12 create request: 2
2021/02/01 21:16:13 create request: 3
2021/02/01 21:16:13 create request: 4
2021/02/01 21:16:13 create request: 5
2021/02/01 21:16:13 response request: 5
2021/02/01 21:16:13 create request: 6
2021/02/01 21:16:13 response request: 6
2021/02/01 21:16:13 create request: 7
2021/02/01 21:16:13 response request: 7
2021/02/01 21:16:14 create request: 8
2021/02/01 21:16:14 create request: 9

We create a request every 200ms, which is significantly faster than the limit of 3 per second. Running the program, requests 2, 3, 4, 8 and 9 are discarded, showing that rate limiting works.

Here comes the problem. Suppose an interface /query allows at most 200 requests per minute, and a user sends 200 requests in the last few milliseconds of the 59th second. When the 59th second ends, the Counter is cleared, and the user sends another 200 requests in the next second. Each minute-window individually conforms to our design, yet the user has actually sent 400 requests within roughly one second: twice the intended rate. This is the design defect of the counter method. At window boundaries the system may suffer a flood of requests from malicious users and may even be brought down.

As shown below:

There is a simple way to mitigate this time-boundary problem.

Sliding window

The sliding window addresses the boundary defect of the counter. The sliding window is a flow control technique; the term also appears in the TCP protocol. It divides time into fixed slices that move forward as time passes, counts requests over a fixed number of movable grids, and compares the total against the threshold.

As shown in the figure:


In the figure above, the red dashed line represents one time window (one minute). Each time window has 6 grids, and each grid is 10 seconds. Every 10 seconds, the time window slides one grid to the right, in the direction of the red arrow. Each grid has its own independent counter. If a request arrives at 0:45, the counter of the fifth grid (covering 0:40~0:50) is incremented by 1. When checking the limit, the counts of all grids are summed and compared with the configured frequency.

So how can sliding windows solve the problems we encountered above? Look at the following figure:


How does the sliding window solve the problem above? When the user sends 200 requests at 0:59 seconds, the counter of the sixth grid records +200. One second later the time window slides one grid to the right, but the window still contains the grid that recorded those 200 requests; if the user sends more requests, the limit is triggered and the new requests are rejected.

In fact, the counter is just a sliding window with a single grid. To make rate limiting more accurate, we simply divide the window into more grids: the number of grids determines the accuracy of the sliding window algorithm. However, the algorithm still rests on the concept of time slices, so it cannot fundamentally eliminate the boundary problem.

A related implementation: github.com/RussellLuo/slidingwindow

Leaky bucket

The principle of the leaky bucket algorithm: a leaky bucket of fixed capacity lets water drip out at a fixed rate. Anyone who has used a faucet knows that when it is turned on, water flows down into the bucket, while the leak at the bottom lets water out. If the faucet is opened too wide, the inflow exceeds the outflow, and the bucket eventually fills up and overflows.

As shown in the figure:


A bucket of fixed capacity has water flowing in and out. We cannot predict how much water will flow in, nor how fast, but the bucket lets water out at a fixed rate (the processing speed), achieving traffic shaping and flow control.

Code implementation:

type LeakyBucket struct {
    rate       float64 // fixed outflow rate per second
    capacity   float64 // capacity of the bucket
    water      float64 // current water volume in the bucket
    lastLeakMs int64   // timestamp of the last leak, in milliseconds

    lock sync.Mutex
}

func (l *LeakyBucket) Allow() bool {
    l.lock.Lock()
    defer l.lock.Unlock()

    now := time.Now().UnixNano() / 1e6
    leaked := float64(now-l.lastLeakMs) * l.rate / 1000 // water leaked since the last call
    l.water = l.water - leaked                          // calculate the remaining water
    l.water = math.Max(0, l.water)                      // the bucket may have run dry
    l.lastLeakMs = now
    if (l.water + 1) < l.capacity {
        // try to add water: the bucket is not full
        l.water++
        return true
    } else {
        // the bucket is full: refuse to add water
        return false
    }
}

func (l *LeakyBucket) Set(r, c float64) {
    l.rate = r
    l.capacity = c
    l.water = 0
    l.lastLeakMs = time.Now().UnixNano() / 1e6
}

The leaky bucket algorithm has the following characteristics:

  • The leaky bucket has a fixed capacity, and the outflow rate is a fixed constant (outgoing requests)
  • If the bucket is empty, no water drops flow out
  • Water can drip into the leaky bucket at any rate (incoming requests)
  • If the incoming water exceeds the capacity of the bucket, it overflows (new requests are rejected)

Because the leaky bucket enforces a constant outflow rate (the outflow rate is a fixed constant value), the maximum rate equals the outflow rate, so bursts of traffic cannot get through.
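A runnable usage sketch of the leaky bucket described above (the struct mirrors the earlier snippet so the program stands alone; the fractional capacity 3.5 is my own illustrative choice, made so the demo's outcome stays stable despite the tiny leak between calls):

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"time"
)

type LeakyBucket struct {
	rate       float64 // fixed outflow rate per second
	capacity   float64 // capacity of the bucket
	water      float64 // current water volume in the bucket
	lastLeakMs int64   // timestamp of the last leak, in milliseconds

	lock sync.Mutex
}

func (l *LeakyBucket) Allow() bool {
	l.lock.Lock()
	defer l.lock.Unlock()

	now := time.Now().UnixNano() / 1e6
	leaked := float64(now-l.lastLeakMs) * l.rate / 1000 // water leaked since the last call
	l.water = math.Max(0, l.water-leaked)               // remaining water, never negative
	l.lastLeakMs = now
	if (l.water + 1) < l.capacity {
		l.water++ // room left: add one drop (accept the request)
		return true
	}
	return false // bucket full: reject the request
}

func (l *LeakyBucket) Set(r, c float64) {
	l.rate = r
	l.capacity = c
	l.water = 0
	l.lastLeakMs = time.Now().UnixNano() / 1e6
}

func main() {
	var lb LeakyBucket
	lb.Set(2, 3.5) // leak 2 drops/s; capacity 3.5 keeps this demo deterministic
	for i := 0; i < 5; i++ {
		fmt.Println(i, lb.Allow()) // first three true, then false
	}
}
```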

Token Bucket

The token bucket (Token Bucket) is the most commonly used algorithm in network traffic shaping (Traffic Shaping) and rate limiting (Rate Limiting). Typically, the token bucket algorithm is used to control the amount of data sent to the network while allowing bursts.


We have a fixed bucket holding tokens (token). At first the bucket is empty, and the system adds tokens to the bucket at a fixed rate until the bucket is full; extra tokens are discarded. When a request arrives, a token is removed from the bucket; if the bucket is empty, the request is rejected or blocked.

Implementation code:

type TokenBucket struct {
    rate         int64 // fixed token refill rate, tokens per second
    capacity     int64 // capacity of the bucket
    tokens       int64 // current number of tokens in the bucket
    lastTokenSec int64 // timestamp of the last refill, in seconds

    lock sync.Mutex
}

func (l *TokenBucket) Allow() bool {
    l.lock.Lock()
    defer l.lock.Unlock()

    now := time.Now().Unix()
    l.tokens = l.tokens + (now-l.lastTokenSec)*l.rate // add tokens first
    if l.tokens > l.capacity {
        l.tokens = l.capacity // discard tokens beyond the capacity
    }
    l.lastTokenSec = now
    if l.tokens > 0 {
        // a token is available: take one
        l.tokens--
        return true
    } else {
        // no tokens: reject
        return false
    }
}

func (l *TokenBucket) Set(r, c int64) {
    l.rate = r
    l.capacity = c
    l.tokens = 0
    l.lastTokenSec = time.Now().Unix()
}

The token bucket has the following features:

  • Tokens are put into the token bucket at a fixed rate
  • The bucket stores at most B tokens; when the bucket is full, newly added tokens are discarded or rejected
  • If there are fewer than the required N tokens in the bucket, no token is removed and the request is throttled (discarded or left waiting)

The token bucket limits the average inflow rate while allowing bursts: as long as tokens remain, requests can be processed, and taking 3 or 4 tokens at once is supported. It therefore tolerates a certain degree of burst traffic.
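A runnable usage sketch of the token bucket described above (mirroring the earlier snippet so it stands alone; resetting lastTokenSec to 0 is purely a demo trick to pretend a long time has passed, so the bucket starts full and the burst behavior is visible):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type TokenBucket struct {
	rate         int64 // fixed token refill rate, tokens per second
	capacity     int64 // capacity of the bucket
	tokens       int64 // current number of tokens in the bucket
	lastTokenSec int64 // timestamp of the last refill, in seconds

	lock sync.Mutex
}

func (l *TokenBucket) Allow() bool {
	l.lock.Lock()
	defer l.lock.Unlock()

	now := time.Now().Unix()
	l.tokens += (now - l.lastTokenSec) * l.rate // refill first
	if l.tokens > l.capacity {
		l.tokens = l.capacity // discard tokens beyond the capacity
	}
	l.lastTokenSec = now
	if l.tokens > 0 {
		l.tokens-- // a token is available: take one
		return true
	}
	return false // no tokens: reject
}

func (l *TokenBucket) Set(r, c int64) {
	l.rate = r
	l.capacity = c
	l.tokens = 0
	l.lastTokenSec = time.Now().Unix()
}

func main() {
	var tb TokenBucket
	tb.Set(2, 5)        // refill 2 tokens/s, bucket holds 5
	tb.lastTokenSec = 0 // demo trick: pretend a long time passed, so the bucket fills up
	for i := 0; i < 6; i++ {
		fmt.Println(i, tb.Allow()) // a burst of five true, then false
	}
}
```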


In practice, the token bucket is the most commonly used. This article has introduced several common rate limiting algorithms.

