The Go connection pool implementation you must understand

Time: 2021-05-09

Problem introduction

As a Go developer, I have run into quite a few connection explosion problems in production (MySQL, Redis, Kafka, and so on).

The reason is that a Go service is a long-running process: after requesting a third-party service or resource, the connection must be closed explicitly, otherwise it stays open. And much of the time, developers simply forget to close it.
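
Taking HTTP as an example (a minimal sketch; the URL is just a placeholder), the response body must be closed, and ideally drained, so that the underlying connection can be released or reused:

package main

import (
  "io"
  "log"
  "net/http"
)

// fetch shows the pattern: always close (and ideally drain) the response body,
// otherwise the underlying connection is neither reused nor released.
func fetch(url string) error {
  resp, err := http.Get(url)
  if err != nil {
    return err
  }
  defer resp.Body.Close() // forgetting this leaks connections

  // Draining the body lets the connection go back to the pool for reuse.
  _, err = io.Copy(io.Discard, resp.Body)
  return err
}

func main() {
  if err := fetch("https://example.com"); err != nil {
    log.Fatal(err)
  }
}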

Sounds like a lot of trouble, doesn't it? That is why we have connection pools. As the name suggests, a connection pool manages connections: we take a connection from the pool, and return it to the pool once the request is done. The pool takes care of establishing, reusing, and recycling connections for us.

When designing and implementing a connection pool, we usually need to consider the following issues:

  • Is there a limit on the number of connections in the pool? How many connections may be established at most?
  • When a connection has been idle for a long time, should it be recycled?
  • When a request needs a connection but the pool has no idle connection and no new connection can be created, does the request have to queue?
  • Queuing raises further questions of its own: is there a limit on the queue length and on how long a request may wait?
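
To make these questions concrete, here is a hypothetical, minimal pool sketch (this is not how net/http does it): a buffered channel holds idle connections, and a semaphore channel caps the total, so callers "queue" by blocking on the semaphore until a slot frees up or their context expires.

package pool

import (
  "context"
  "net"
)

// Pool is a toy connection pool: idle caps how many idle connections are kept,
// and sem holds one slot per live connection, capping the total.
type Pool struct {
  idle chan net.Conn // idle connections waiting to be reused
  sem  chan struct{} // one slot per live connection
  dial func() (net.Conn, error)
}

func NewPool(maxTotal, maxIdle int, dial func() (net.Conn, error)) *Pool {
  return &Pool{
    idle: make(chan net.Conn, maxIdle),
    sem:  make(chan struct{}, maxTotal),
    dial: dial,
  }
}

// Get reuses an idle connection if one is available; otherwise it waits for a
// free slot (this is the "queuing") and dials a new connection.
func (p *Pool) Get(ctx context.Context) (net.Conn, error) {
  select {
  case c := <-p.idle:
    return c, nil
  default:
  }
  select {
  case c := <-p.idle: // an idle connection appeared while we were waiting
    return c, nil
  case p.sem <- struct{}{}: // acquired a slot: allowed to dial
    c, err := p.dial()
    if err != nil {
      <-p.sem // give the slot back on failure
      return nil, err
    }
    return c, nil
  case <-ctx.Done(): // bounds the queuing time
    return nil, ctx.Err()
  }
}

// Put returns a connection to the idle pool, or closes it (releasing its slot)
// when the idle pool is already full.
func (p *Pool) Put(c net.Conn) {
  select {
  case p.idle <- c:
  default:
    c.Close()
    <-p.sem
  }
}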

Implementation principle of the Go connection pool

We take the Go HTTP connection pool as an example to analyze how a connection pool is implemented.

The Transport struct

The Transport struct is defined (in simplified form) as follows:

type Transport struct {
  // Lock guarding the idle connection pool
  idleMu sync.Mutex
  // Idle connection pool; the key combines the protocol, target address, etc.
  idleConn map[connectMethodKey][]*persistConn // most recently used at end
  // Queue of requests waiting for an idle connection; slice-based, with no size limit
  idleConnWait map[connectMethodKey]wantConnQueue // waiting getConns

  // Lock guarding the per-host connection count and dial queue
  connsPerHostMu sync.Mutex
  // Number of connections per host
  connsPerHost map[connectMethodKey]int
  // Queue of requests waiting to establish a connection; also slice-based, with no size limit
  connsPerHostWait map[connectMethodKey]wantConnQueue // waiting getConns

  // Maximum number of idle connections across all hosts
  MaxIdleConns int
  // Maximum number of idle connections per target host; defaults to 2 (note this default!)
  MaxIdleConnsPerHost int
  // Maximum number of connections that may be established per host
  MaxConnsPerHost int
  // How long a connection may stay idle before it is closed
  IdleConnTimeout time.Duration

  // Disable keep-alive (long) connections and use a short connection per request
  DisableKeepAlives bool
}

As you can see, both the idle connection pool and the waiting queues are maps whose key combines the protocol and the target address. In other words, the limits on established and idle connections apply per protocol and target host.

Note that MaxIdleConnsPerHost defaults to 2, i.e. at most two idle connections are kept per target host. What does that lead to?

Under a traffic burst, a large number of connections are created almost instantly. But because of the per-host idle limit, most of them cannot enter the idle pool after use and are closed outright. The result is that connections are created and closed at a high rate, and the number of TIME_WAIT connections on the business machine spikes.

Some production architectures look like this: client ==> LVS ==> nginx ==> service. The LVS load balancer runs in DR mode, and LVS and nginx share a single VIP. From the client's point of view there is only one IP, i.e. only one target host, so the problem above becomes even more pronounced.

Finally, Transport also provides the DisableKeepAlives option, which disables long (keep-alive) connections and uses a short connection for each request to a third-party resource or service.
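
For reference, these limits can be raised when constructing the client. A minimal sketch (the numbers and URL are illustrative only, not recommendations):

package main

import (
  "net/http"
  "time"
)

func main() {
  // Tune the pool so that burst traffic to a single host is not throttled by
  // the default of only two idle connections per host.
  client := &http.Client{
    Timeout: 5 * time.Second,
    Transport: &http.Transport{
      MaxIdleConns:        200,              // idle connections kept across all hosts
      MaxIdleConnsPerHost: 100,              // the default is only 2
      MaxConnsPerHost:     200,              // 0 would mean no per-host limit
      IdleConnTimeout:     90 * time.Second, // close connections idle longer than this
      // DisableKeepAlives: true, // would force a short connection per request
    },
  }

  resp, err := client.Get("https://example.com")
  if err != nil {
    return
  }
  resp.Body.Close()
}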

Acquiring and recycling connections

The Transport struct provides the following two methods for acquiring and recycling connections:


func (t *Transport) getConn(treq *transportRequest, cm connectMethod) (pc *persistConn, err error) {}

func (t *Transport) tryPutIdleConn(pconn *persistConn) error {}

Getting a connection takes two steps: 1) try to obtain an idle connection; 2) otherwise, try to create a new one:

// Inside the getConn method (simplified)

if delivered := t.queueForIdleConn(w); delivered {
  return pc, nil
}
  
t.queueForDial(w)

Of course, a connection may not be immediately available, and the request has to wait in the queue. What happens then? The current goroutine simply blocks until a connection is delivered, or until the http.Client times out or the request is canceled:

select {
  case <-w.ready:
    return w.pc, w.err
    
  // Canceled, e.g. on timeout
  case <-req.Cancel:
    return nil, errRequestCanceledConn
  ……
}

var errRequestCanceledConn = errors.New("net/http: request canceled while waiting for connection") // TODO: unify?
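
Because a request can block here while queued, it is worth bounding the wait on the caller side. A minimal sketch using a context deadline (the URL and the two-second deadline are just placeholders):

package main

import (
  "context"
  "fmt"
  "net/http"
  "time"
)

// get attaches a deadline that covers the whole request, including any time
// spent waiting for a connection; if it expires while still queued, Do fails.
func get(client *http.Client, url string) error {
  ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
  defer cancel()

  req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
  if err != nil {
    return err
  }
  resp, err := client.Do(req)
  if err != nil {
    return err
  }
  return resp.Body.Close()
}

func main() {
  fmt.Println(get(http.DefaultClient, "https://example.com"))
}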

The logic of queuing for idle connections is as follows:

func (t *Transport) queueForIdleConn(w *wantConn) (delivered bool) {
  // If an idle timeout is configured, idle connections must be checked and closed once they expire
  var oldTime time.Time
  if t.IdleConnTimeout > 0 {
    oldTime = time.Now().Add(-t.IdleConnTimeout)
  }

  if list, ok := t.idleConn[w.key]; ok {
    stop := false
    for len(list) > 0 && !stop {
      pconn := list[len(list)-1]
      tooOld := !oldTime.IsZero() && pconn.idleAt.Round(0).Before(oldTime)
      // Timed out: drop it from the list and close it asynchronously
      if tooOld {
        list = list[:len(list)-1]
        go pconn.closeConnIfStillIdle()
        continue
      }

      // Deliver the connection to the waiting wantConn
      delivered = w.tryDeliver(pconn, nil)
      stop = true
    }
  }
  if delivered {
    return true
  }

  // No idle connection available: queue up and wait for one
  q := t.idleConnWait[w.key]
  q.pushBack(w)
  t.idleConnWait[w.key] = q
  return false
}

The logic of queuing for a new connection is as follows:

func (t *Transport) queueForDial(w *wantConn) {
  // No limit on the maximum number of connections: dial immediately
  if t.MaxConnsPerHost <= 0 {
    go t.dialConnFor(w)
    return
  }

  // Still below the per-host connection limit: count the new connection and dial immediately
  if n := t.connsPerHost[w.key]; n < t.MaxConnsPerHost {
    t.connsPerHost[w.key] = n + 1
    go t.dialConnFor(w)
    return
  }

  // Limit reached: queue up and wait for a connection slot
  q := t.connsPerHostWait[w.key]
  q.pushBack(w)
  t.connsPerHostWait[w.key] = q
}

After a new connection is established, tryDeliver is likewise called to hand the connection to the wantConn, closing the channel w.ready at the same time. This wakes up the goroutine that was blocked on w.ready above.


func (w *wantConn) tryDeliver(pc *persistConn, err error) bool {
  w.pc = pc
  w.err = err
  close(w.ready)
  return true
}

After the request has been processed, the connection is returned to the pool via tryPutIdleConn. If a goroutine is already queued waiting for an idle connection, the connection is handed over to it and reused directly. When recycling a connection, the code also checks whether the number of idle connections would exceed the limit:

func (t *Transport) tryPutIdleConn(pconn *persistConn) error {
  // Keep-alives disabled, or the per-host idle limit is negative: do not pool the connection
  if t.DisableKeepAlives || t.MaxIdleConnsPerHost < 0 {
    return errKeepAlivesDisabled
  }

  // If goroutines are queued waiting for an idle connection, hand this one over directly
  if q, ok := t.idleConnWait[key]; ok {
    for q.len() > 0 {
      w := q.popFront()
      if w.tryDeliver(pconn, nil) {
        done = true
        break
      }
    }
  }
  if done {
    return nil
  }

  // Too many idle connections for this host; the default limit is DefaultMaxIdleConnsPerHost = 2
  idles := t.idleConn[key]
  if len(idles) >= t.maxIdleConnsPerHost() {
    return errTooManyIdleHost
  }
}

Closing idle connections on timeout

How does the Go HTTP connection pool implement the timeout-close logic for idle connections? As the queueForIdleConn logic above shows, every time an idle connection is fetched it is checked for staleness, and closed if it has already timed out.

But if no request arrives and no connection needs to be obtained, do idle connections simply never time out? In fact, when an idle connection is added to the pool, Go also arms a timer; when the timer fires, the connection is closed automatically.


pconn.idleTimer = time.AfterFunc(t.IdleConnTimeout, pconn.closeConnIfStillIdle)

How the queue is implemented

How do you implement the queue? The simplest approach is a plain slice:

var queue []*wantConn

// enqueue
queue = append(queue, w)

// dequeue
v := queue[0]
queue[0] = nil
queue = queue[1:]

What is wrong with that? With frequent enqueue and dequeue operations, the space at the front of the underlying array can never be reused, because queue[1:] only moves the slice's start forward. That space is simply wasted until the slice eventually grows and reallocates.

When implementing its queue, Go uses two slices, head and tail: dequeues are served from the head slice, and enqueues append to the tail slice. When the head is exhausted during a dequeue, head and tail are swapped (and the tail is reset to length zero). This is how Go reuses the underlying array space.


func (q *wantConnQueue) pushBack(w *wantConn) {
  q.tail = append(q.tail, w)
}

func (q *wantConnQueue) popFront() *wantConn {
  if q.headPos >= len(q.head) {
    if len(q.tail) == 0 {
      return nil
    }
    // Pick up tail as new head, clear tail.
    q.head, q.headPos, q.tail = q.tail, 0, q.head[:0]
  }
  w := q.head[q.headPos]
  q.head[q.headPos] = nil
  q.headPos++
  return w
}
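
The same trick is easy to reuse outside net/http. A hypothetical generic version (Go 1.18+), just to show the head/tail swap in isolation:

package main

import "fmt"

// queue mirrors wantConnQueue: popFront consumes from head, pushBack appends
// to tail, and once head is exhausted the two slices are swapped so their
// underlying arrays keep being reused.
type queue[T any] struct {
  head    []T
  headPos int
  tail    []T
}

func (q *queue[T]) pushBack(v T) {
  q.tail = append(q.tail, v)
}

func (q *queue[T]) popFront() (T, bool) {
  if q.headPos >= len(q.head) {
    if len(q.tail) == 0 {
      var zero T
      return zero, false
    }
    // Reuse the spent head slice as the new (empty) tail.
    q.head, q.headPos, q.tail = q.tail, 0, q.head[:0]
  }
  v := q.head[q.headPos]
  var zero T
  q.head[q.headPos] = zero // drop the reference so it can be collected
  q.headPos++
  return v, true
}

func main() {
  var q queue[int]
  for i := 1; i <= 3; i++ {
    q.pushBack(i)
  }
  for v, ok := q.popFront(); ok; v, ok = q.popFront() {
    fmt.Println(v) // 1, 2, 3 in FIFO order
  }
}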

That is all for this article on the Go connection pool implementation you must understand. I hope the walkthrough helps you the next time you tune or debug a connection pool in Go.