A detailed analysis of the Redis network model source code

Time:2020-10-9

Preface

The Redis network model is built on I/O multiplexing. The source code wraps four multiplexing mechanisms: epoll, select, evport and kqueue. At compile time, one of the four is chosen automatically according to the target system. This article uses epoll as the example for analyzing the source code of the Redis I/O module.

The epoll system calls

The Redis network event module is written around three epoll system calls. Once these three calls are clear, the rest of the code is easy to follow.

epfd = epoll_create(1024);

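Creates an epoll instance.

Parameter: originally a hint for the number of file descriptors (fd) the epoll instance would monitor. Since Linux 2.6.8 the value is ignored, but it must be greater than zero.

Return: a file descriptor referring to the new epoll instance, or -1 on error.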

int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event)

Manages the events monitored by an epoll instance: registers, modifies and deletes them.

Parameters:
epfd: the file descriptor of the epoll instance;
op: one of three values: EPOLL_CTL_ADD (register), EPOLL_CTL_MOD (modify), EPOLL_CTL_DEL (delete);
fd: the file descriptor of the socket;
event: pointer to an epoll_event describing the events of interest.

An event is loosely comparable to a Channel in Java NIO. The epoll_event structure is defined as follows:

typedef union epoll_data {
  void *ptr;
  int fd; /* socket file descriptor */
  __uint32_t u32;
  __uint64_t u64;
} epoll_data_t;

struct epoll_event {
  __uint32_t events; /* epoll events: bitwise OR of the flags to monitor, such as EPOLLIN (fd readable) and EPOLLOUT (fd writable) */
  epoll_data_t data; /* User data variable */
};
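
As a small illustration of how the two fields are filled in, the sketch below builds an epoll_event for one descriptor. The helper name make_event and its flag parameters are made up for this example; only struct epoll_event, EPOLLIN and EPOLLOUT come from <sys/epoll.h>.

```c
#include <sys/epoll.h>

/* Hypothetical helper (not part of Redis or libc): describe which
 * events we are interested in for one descriptor. */
static struct epoll_event make_event(int fd, int want_read, int want_write) {
    struct epoll_event ev;

    ev.events = 0;
    if (want_read)  ev.events |= EPOLLIN;   /* notify when fd is readable */
    if (want_write) ev.events |= EPOLLOUT;  /* notify when fd is writable */

    /* data is a union: clear its widest member first, then store the fd,
     * so that no uninitialised bytes are handed to the kernel. */
    ev.data.u64 = 0;
    ev.data.fd = fd;
    return ev;
}
```

Because data is a union, only one of its members carries meaning at a time; a program that stores the fd there reads the same fd back when the event fires.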

int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

Waits for events to become ready, similar to the select method in Java NIO. Ready events are stored in the events array.

Parameters:
epfd: the file descriptor of the epoll instance;
events: array that receives the ready events;
maxevents: the maximum number of events returned per call;
timeout: how long to block waiting for ready events, in milliseconds; -1 blocks indefinitely and 0 returns immediately.
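
The three calls can be seen working together in the following minimal sketch, which is not taken from Redis. It assumes a Linux system, uses a pipe instead of a socket so that it needs no network, and the function name count_ready_on_pipe is invented for the example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of ready events reported for a
 * pipe that has one byte waiting, or -1 on error. */
static int count_ready_on_pipe(void) {
    int pipefd[2];
    struct epoll_event ev, ready[8];
    int epfd, n = -1;

    if (pipe(pipefd) == -1) return -1;

    epfd = epoll_create(1024);              /* the size is only a hint */
    if (epfd == -1) goto close_pipe;

    ev.events = EPOLLIN;                    /* interested in "readable" */
    ev.data.u64 = 0;
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1) goto close_all;

    /* Make the read end readable. */
    if (write(pipefd[1], "x", 1) != 1) goto close_all;

    /* Block for at most 1000 ms; ready[] receives at most 8 events. */
    n = epoll_wait(epfd, ready, 8, 1000);
    if (n == 1 && !(ready[0].data.fd == pipefd[0] &&
                    (ready[0].events & EPOLLIN)))
        n = -1;                             /* not the event we expected */

close_all:
    close(epfd);
close_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return n;
}
```

With one descriptor registered and one byte written, the call is expected to report a single ready event carrying the read end of the pipe.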

Source code analysis

Events

The Redis event system distinguishes two types of events:

  • File events: network socket events;
  • Time events: timed operations in Redis, such as the serverCron function.
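
A time event essentially pairs a due time with a callback, and the loop needs to know which one is due next. The sketch below shows one way such a search could look, assuming the time events are kept in an unsorted singly linked list. The type toyTimeEvent and the function toy_nearest_timer are invented for this example; the field names when_sec and when_ms follow the ones that appear in the listings later in this article.

```c
#include <stddef.h>

/* Simplified stand-in for a time event: only the due time and the link. */
typedef struct toyTimeEvent {
    long when_sec;                 /* due time, seconds part */
    long when_ms;                  /* due time, milliseconds part */
    struct toyTimeEvent *next;
} toyTimeEvent;

/* Walk the whole list and return the event that is due first,
 * or NULL when the list is empty. */
static toyTimeEvent *toy_nearest_timer(toyTimeEvent *head) {
    toyTimeEvent *te, *nearest = NULL;

    for (te = head; te != NULL; te = te->next) {
        if (nearest == NULL ||
            te->when_sec < nearest->when_sec ||
            (te->when_sec == nearest->when_sec &&
             te->when_ms < nearest->when_ms))
            nearest = te;
    }
    return nearest;
}
```

A linear scan like this costs O(n) per loop iteration, which is acceptable only while the number of timers stays small.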

Next, the source code is analyzed in two stages: event registration and event triggering.

Binding events

Setting up the event loop

In the initServer function (called from the main function in redis.c), an aeEventLoop object is initialized along with the redisDb objects. This article calls it the event handler object. Its key members are as follows:

struct aeEventLoop {
  aeFileEvent *events; // registered file events, indexed by fd
  aeFiredEvent *fired; // ready (fired) file events
  aeTimeEvent *timeEventHead; // head of the time event list
  ...
};
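
To make the layout more concrete, here is a simplified stand-in for such a structure together with a constructor. All toy* names are invented for this sketch. The choices to start with maxfd at -1 and every slot marked as "no events" are assumptions, picked to match the maxfd != -1 and mask == AE_NONE checks in the listings later in this article.

```c
#include <stdlib.h>

#define TOY_NONE 0                 /* nothing registered, like AE_NONE */

typedef struct toyFileEvent {
    int mask;                      /* events registered for this fd */
} toyFileEvent;

typedef struct toyFiredEvent {
    int fd;                        /* descriptor that became ready */
    int mask;                      /* which events fired on it */
} toyFiredEvent;

typedef struct toyEventLoop {
    int maxfd;                     /* highest registered fd, -1 if none */
    int setsize;                   /* number of slots in both arrays */
    toyFileEvent *events;          /* indexed directly by fd */
    toyFiredEvent *fired;          /* filled in by the poll step */
} toyEventLoop;

static void toy_free(toyEventLoop *el) {
    if (el == NULL) return;
    free(el->events);
    free(el->fired);
    free(el);
}

static toyEventLoop *toy_create(int setsize) {
    toyEventLoop *el;
    int i;

    if (setsize <= 0) return NULL;
    el = malloc(sizeof(*el));
    if (el == NULL) return NULL;

    el->events = malloc(sizeof(toyFileEvent) * (size_t)setsize);
    el->fired = malloc(sizeof(toyFiredEvent) * (size_t)setsize);
    if (el->events == NULL || el->fired == NULL) {
        toy_free(el);
        return NULL;
    }

    el->setsize = setsize;
    el->maxfd = -1;
    for (i = 0; i < setsize; i++)
        el->events[i].mask = TOY_NONE;   /* no fd is monitored yet */
    return el;
}
```

Indexing the events array by fd makes the lookup for a ready descriptor a plain array access, at the cost of sizing the array for the largest fd that may ever be registered.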

The event loop is initialized in the aeCreateEventLoop function in ae.c. Besides initializing the event loop itself, this function calls the following function to create an epoll instance.

/*
 * ae_epoll.c
 *Create a new epoll instance and assign it to the EventLoop
 */
static int aeApiCreate(aeEventLoop *eventLoop) {

  aeApiState *state = zmalloc(sizeof(aeApiState));

  if (!state) return -1;

  //Initialize event slot space
  state->events = zmalloc(sizeof(struct epoll_event)*eventLoop->setsize);
  if (!state->events) {
    zfree(state);
    return -1;
  }

  //Create epoll instance
  state->epfd = epoll_create(1024); /* 1024 is just a hint for the kernel */
  if (state->epfd == -1) {
    zfree(state->events);
    zfree(state);
    return -1;
  }

  //Assign value to EventLoop
  eventLoop->apidata = state;
  return 0;
}

This is where the epoll_create system call is made. The state variable here is an aeApiState structure, defined as follows:

/*
 *Event status
 */
typedef struct aeApiState {

  //Epoll instance descriptor
  int epfd;

  //Event slot
  struct epoll_event *events;

} aeApiState;

This state is stored in eventLoop->apidata.

Binding IP ports and descriptors

The TCP port is opened by the listenToPort function. Each bound IP address gets its own file descriptor in ipfd, because the server may have multiple IP addresses.

//Open the TCP listening port to wait for the command request from the client
if (server.port != 0 &&
  listenToPort(server.port,server.ipfd,&server.ipfd_count) == REDIS_ERR)
  exit(1);

Note: eventLoop and ipfd are referenced as server.el and server.ipfd. server is an instance of the redisServer structure and is a global variable in Redis.
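
The body of listenToPort is not shown in this article, and it handles more cases than a short example can (several bind addresses, for instance). As a rough idea of what opening a listening port involves underneath, the following sketch uses only the standard socket calls for a single IPv4 socket. The function name open_listener and the backlog of 128 are assumptions made for the example.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP listening socket on the given port,
 * on all local IPv4 addresses. Returns the descriptor, or -1 on error. */
static int open_listener(uint16_t port) {
    struct sockaddr_in addr;
    int yes = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1) return -1;

    /* Allow the port to be reused quickly after a restart. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) == -1)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);              /* 0 lets the kernel pick */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) goto fail;
    if (listen(fd, 128) == -1) goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

The returned descriptor plays the role of one entry in ipfd: it is the fd that is later registered with the event loop for readable events.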

Registering events

The following code binds an event handler to each file descriptor:

// In initServer:
for (j = 0; j < server.ipfd_count; j++) {
  if (aeCreateFileEvent(server.el, server.ipfd[j], AE_READABLE,
    acceptTcpHandler,NULL) == AE_ERR)
    {
      redisPanic(
        "Unrecoverable error creating server.ipfd file event.");
    }
}
// aeCreateFileEvent in ae.c
/*
 * Monitor the state of fd according to the value of the mask parameter;
 * when fd becomes ready, the proc function is executed.
 */
int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask,
    aeFileProc *proc, void *clientData)
{
  if (fd >= eventLoop->setsize) {
    errno = ERANGE;
    return AE_ERR;
  }

  //Extract file event structure
  aeFileEvent *fe = &eventLoop->events[fd];

  //Listen to the specified event of the specified FD
  if (aeApiAddEvent(eventLoop, fd, mask) == -1)
    return AE_ERR;

  //Set the file event type and the event handler
  fe->mask |= mask;
  if (mask & AE_READABLE) fe->rfileProc = proc;
  if (mask & AE_WRITABLE) fe->wfileProc = proc;

  //Private data
  fe->clientData = clientData;

  //If necessary, update the maximum FD of the event handler
  if (fd > eventLoop->maxfd)
    eventLoop->maxfd = fd;

  return AE_OK;
}

aeCreateFileEvent calls aeApiAddEvent, whose code is as follows:

/*
 * ae_epoll.c
 *Associate the given event to fd
 */
static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
  aeApiState *state = eventLoop->apidata;
  struct epoll_event ee;

  /* If the fd was already monitored for some event, we need a MOD
   * operation. Otherwise we need an ADD operation. */
  int op = eventLoop->events[fd].mask == AE_NONE ?
      EPOLL_CTL_ADD : EPOLL_CTL_MOD;

  //Register events to epoll
  ee.events = 0;
  mask |= eventLoop->events[fd].mask; /* Merge old events */
  if (mask & AE_READABLE) ee.events |= EPOLLIN;
  if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
  ee.data.u64 = 0; /* avoid valgrind warning */
  ee.data.fd = fd;

  if (epoll_ctl(state->epfd,op,fd,&ee) == -1) return -1;

  return 0;
}

This is where the epoll_ctl system call registers the event (file descriptor) with epoll. An epoll_event structure, ee, is filled in first and then registered through epoll_ctl.

In addition, aecreatefileevent performs the following two important operations:

  • The event handler acceptTcpHandler is stored in the event loop, referenced by eventLoop->events[fd].rfileProc (or wfileProc; the two correspond to read events and write events respectively);
  • The current event flags are added to eventLoop->events[fd].mask (mask is similar to the interest ops in Java NIO and represents the event type).
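
The choice between ADD and MOD, and the merging of the old mask, can be isolated in a few lines. EPOLL_CTL_MOD replaces the whole set of monitored events for a descriptor, which is why the previously registered flags have to be merged in before the call. In the sketch below the function plan_registration is invented for illustration, and the AE_* values are defined locally as stand-ins for the constants used in the listings.

```c
#include <sys/epoll.h>

/* Local stand-ins for the constants used in the listings. */
#define AE_NONE     0
#define AE_READABLE 1
#define AE_WRITABLE 2

/* Hypothetical helper: given what is already registered for an fd
 * (old_mask) and what the caller wants to add (new_mask), store the
 * mask to register in *merged and return the epoll_ctl operation. */
static int plan_registration(int old_mask, int new_mask, int *merged) {
    /* MOD overwrites the kernel's interest set, so keep the old bits. */
    *merged = old_mask | new_mask;

    /* First registration for this fd: ADD. Otherwise: MOD. */
    return old_mask == AE_NONE ? EPOLL_CTL_ADD : EPOLL_CTL_MOD;
}
```

For example, an fd that is already monitored for reads and now also needs write notifications ends up with a MOD operation carrying both flags.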

Event monitoring and execution

The main function in redis.c calls the aeMain function in ae.c, shown below:

/*
 *Main loop of event processor
 */
void aeMain(aeEventLoop *eventLoop) {

  eventLoop->stop = 0;

  while (!eventLoop->stop) {

    //If you have a function that needs to be executed before event processing, run it
    if (eventLoop->beforesleep != NULL)
      eventLoop->beforesleep(eventLoop);

    //Start processing events
    aeProcessEvents(eventLoop, AE_ALL_EVENTS);
  }
}

aeMain calls the aeProcessEvents function to handle events, as shown below:

/* Process every pending time event, then every pending file event
 * (that may be registered by time event callbacks just processed).
 *
 * Processes all time events that are due, as well as all ready file events.
 * The return value of the function is the number of events processed.
 */
int aeProcessEvents(aeEventLoop *eventLoop, int flags)
{
  int processed = 0, numevents;

  /* Nothing to do? return ASAP */
  if (!(flags & AE_TIME_EVENTS) && !(flags & AE_FILE_EVENTS)) return 0;

  if (eventLoop->maxfd != -1 ||
    ((flags & AE_TIME_EVENTS) && !(flags & AE_DONT_WAIT))) {
    int j;
    aeTimeEvent *shortest = NULL;
    struct timeval tv, *tvp;

    // Find the nearest time event
    if (flags & AE_TIME_EVENTS && !(flags & AE_DONT_WAIT))
      shortest = aeSearchNearestTimer(eventLoop);
    if (shortest) {
      // A time event exists, so the blocking time for file events is
      // the difference between the nearest time event and the current time
      long now_sec, now_ms;

      /* Calculate the time missing for the nearest
       * timer to fire. */
      // The interval is stored in the tv structure
      aeGetTime(&now_sec, &now_ms);
      tvp = &tv;
      tvp->tv_sec = shortest->when_sec - now_sec;
      if (shortest->when_ms < now_ms) {
        tvp->tv_usec = ((shortest->when_ms+1000) - now_ms)*1000;
        tvp->tv_sec --;
      } else {
        tvp->tv_usec = (shortest->when_ms - now_ms)*1000;
      }

      // If the time difference is less than 0, the event is already due: clamp seconds and microseconds to 0 (no blocking)
      if (tvp->tv_sec < 0) tvp->tv_sec = 0;
      if (tvp->tv_usec < 0) tvp->tv_usec = 0;
    } else {
      
      // No time event exists at this point,
      // so AE_DONT_WAIT decides whether to block and for how long

      /* If we have to check for events but need to return
       * ASAP because of AE_DONT_WAIT we need to set the timeout
       * to zero */
      if (flags & AE_DONT_WAIT) {
        //Set file events not to block
        tv.tv_sec = tv.tv_usec = 0;
        tvp = &tv;
      } else {
        /* Otherwise we can block */
        //File events can be blocked until an event arrives
        tvp = NULL; /* wait forever */
      }
    }

    // Process file events; the blocking time is determined by tvp
    numevents = aeApiPoll(eventLoop, tvp);
    for (j = 0; j < numevents; j++) {
      //Get events from ready array
      aeFileEvent *fe = &eventLoop->events[eventLoop->fired[j].fd];

      int mask = eventLoop->fired[j].mask;
      int fd = eventLoop->fired[j].fd;
      int rfired = 0;

      /* note the fe->mask & mask & ... code: maybe an already processed
       * event removed an element that fired and we still didn't
       * processed, so we check if the event is still valid. */
      //Read events
      if (fe->mask & mask & AE_READABLE) {
        // rfired records that the read handler ran, so that a handler
        // registered for both read and write is not called twice
        rfired = 1;
        fe->rfileProc(eventLoop,fd,fe->clientData,mask);
      }
      //Write events
      if (fe->mask & mask & AE_WRITABLE) {
        if (!rfired || fe->wfileProc != fe->rfileProc)
          fe->wfileProc(eventLoop,fd,fe->clientData,mask);
      }

      processed++;
    }
  }

  /* Check time events */
  // Process time events
  if (flags & AE_TIME_EVENTS)
    processed += processTimeEvents(eventLoop);

  return processed; 
}

This function consists of three main steps:

  • Determine the blocking time tvp from the nearest time event and the current time;
  • Call aeApiPoll, which writes all ready events to eventLoop->fired[] and returns the number of ready events;
  • Traverse eventLoop->fired[] and, for each ready event, execute the previously bound handler rfileProc or wfileProc.
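
The first step, working out how long the poll may block, can be tried in isolation. The sketch below reproduces the borrow arithmetic from the listing in a standalone function and adds the timeval-to-milliseconds conversion that aeApiPoll applies before calling epoll_wait. The function names time_until and to_epoll_timeout are invented here. One deliberate difference: the listing clamps tv_sec and tv_usec independently, whereas this sketch zeroes both fields when the timer is already overdue.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: time left until a timer due at
 * (when_sec, when_ms), measured from (now_sec, now_ms). */
static struct timeval time_until(long when_sec, long when_ms,
                                 long now_sec, long now_ms) {
    struct timeval tv;

    tv.tv_sec = when_sec - now_sec;
    if (when_ms < now_ms) {
        /* Borrow one second so the millisecond part stays positive. */
        tv.tv_usec = ((when_ms + 1000) - now_ms) * 1000;
        tv.tv_sec--;
    } else {
        tv.tv_usec = (when_ms - now_ms) * 1000;
    }

    /* Overdue timer: do not block at all (choice made for this sketch). */
    if (tv.tv_sec < 0) {
        tv.tv_sec = 0;
        tv.tv_usec = 0;
    }
    return tv;
}

/* epoll_wait takes milliseconds; a NULL timeval means "block forever". */
static int to_epoll_timeout(const struct timeval *tvp) {
    return tvp ? (int)(tvp->tv_sec * 1000 + tvp->tv_usec / 1000) : -1;
}
```

For a timer due at 10 s 200 ms with the clock at 9 s 700 ms, the result is 0 s 500000 µs, which converts to an epoll_wait timeout of 500.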

The aeApiPoll function in ae_epoll.c is as follows:

/*
 *Get executable events
 */
static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
  aeApiState *state = eventLoop->apidata;
  int retval, numevents = 0;

  // Wait for events, blocking for at most the time given by tvp
  retval = epoll_wait(state->epfd,state->events,eventLoop->setsize,
      tvp ? (tvp->tv_sec*1000 + tvp->tv_usec/1000) : -1);

  //At least one event is ready?
  if (retval > 0) {
    int j;

    // Translate each ready event into the corresponding mask
    // and add it to the fired array of eventLoop
    numevents = retval;
    for (j = 0; j < numevents; j++) {
      int mask = 0;
      struct epoll_event *e = state->events+j;

      if (e->events & EPOLLIN) mask |= AE_READABLE;
      if (e->events & EPOLLOUT) mask |= AE_WRITABLE;
      if (e->events & EPOLLERR) mask |= AE_WRITABLE;
      if (e->events & EPOLLHUP) mask |= AE_WRITABLE;

      eventLoop->fired[j].fd = e->data.fd;
      eventLoop->fired[j].mask = mask;
    }
  }
  
  //Returns the number of ready events
  return numevents;
}

After epoll_wait returns, the ready events have been written to the event slots at eventLoop->apidata->events. The loop that follows copies them into eventLoop->fired[]. Each ready event is an epoll_event structure; with e pointing to it, e->data.fd is the file descriptor and e->events holds its event flags. The flags are converted to a mask, and both fd and mask are written to eventLoop->fired[j].

Then, back in aeProcessEvents, the function pointed to by rfileProc or wfileProc is executed, such as the acceptTcpHandler registered above.
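
The dispatch rule is compact enough to try on its own. The following sketch is a reduced version of the loop body from the listing, without the eventLoop argument; the toy* names and the counting handler are invented for the example. It shows the effect of the rfired flag: when the same function is registered for both reading and writing and both fire at once, it runs only once.

```c
#define AE_READABLE 1
#define AE_WRITABLE 2

typedef void toyProc(int fd, void *clientData, int mask);

typedef struct toyFileEvent {
    int mask;                 /* events currently registered */
    toyProc *rfileProc;       /* handler for readable events */
    toyProc *wfileProc;       /* handler for writable events */
    void *clientData;         /* private data passed to the handlers */
} toyFileEvent;

/* Dispatch one fired event; returns how many handlers were invoked. */
static int toy_dispatch(toyFileEvent *fe, int fd, int fired_mask) {
    int calls = 0, rfired = 0;

    /* Only run a handler if the event is still registered. */
    if (fe->mask & fired_mask & AE_READABLE) {
        rfired = 1;
        fe->rfileProc(fd, fe->clientData, fired_mask);
        calls++;
    }
    if (fe->mask & fired_mask & AE_WRITABLE) {
        /* Skip the write handler if it is the function that just ran. */
        if (!rfired || fe->wfileProc != fe->rfileProc) {
            fe->wfileProc(fd, fe->clientData, fired_mask);
            calls++;
        }
    }
    return calls;
}

/* Example handler: counts its invocations through clientData. */
static void count_calls(int fd, void *clientData, int mask) {
    (void)fd;
    (void)mask;
    ++*(int *)clientData;
}
```

Registering count_calls for both directions and dispatching an event with both flags set increments the counter once, not twice.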

Summary

The Redis network module is essentially a simple Reactor pattern. This article analyzed the Redis source code and described how Redis accepts client connections by following the path "server registers events > client connection arrives > event becomes ready > event handler executes". The ideas behind Java NIO are essentially the same.
