Detailed explanation of the Redis event mechanism

Time:2020-11-27

Redis uses an event-driven mechanism to handle large amounts of network I/O. Instead of using a mature open-source library such as libevent or libev, it implements its own very concise event-driven library, ae_event.

The event-driven library in Redis focuses only on network I/O and timers. It handles the following two types of events:

  • File event: used to handle network I/O between the Redis server and its clients.
  • Time event: some operations in the Redis server (such as the serverCron function) need to run at a given point in time; time events are used to handle such timed operations.

The code of the event-driven library is mainly implemented in src/ae.c; its schematic diagram is as follows.

[Figure: schematic diagram of the event-driven library]

aeEventLoop manages the file event table and the time event list, and repeatedly processes ready file events and time events that have come due. Next, we introduce file events and time events in turn, and then walk through the related aeEventLoop source code.

File events

Redis has developed its own network event processor based on the Reactor pattern, namely the file event processor. The file event processor uses I/O multiplexing to listen on multiple sockets at the same time and associates different event handlers with the sockets. When a read or write event fires on a socket, the corresponding event handler is called.

The I/O multiplexing technologies used by Redis mainly include select, epoll, evport and kqueue. Each I/O multiplexing library corresponds to a separate file in the Redis source code, such as ae_select.c, ae_epoll.c and ae_kqueue.c. Redis selects the multiplexing technology according to the operating system and a priority order. Event frameworks such as netty and libevent generally adopt the same architecture.
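As a rough illustration of choosing a backend by platform, the sketch below picks a name at compile time. The helper mux_backend_name and the platform macros it tests are assumptions made for this example; Redis makes its own choice through its build configuration, which may differ in detail.

```c
/* Hypothetical helper, not part of Redis: reports which multiplexing
 * backend a build on this platform would pick in this sketch. */
static const char *mux_backend_name(void) {
#if defined(__sun)
    return "evport";   /* Solaris event ports */
#elif defined(__linux__)
    return "epoll";
#elif defined(__APPLE__) || defined(__FreeBSD__) || \
      defined(__OpenBSD__) || defined(__NetBSD__)
    return "kqueue";
#else
    return "select";   /* portable fallback */
#endif
}
```

Because the decision is made by the preprocessor, only one backend is compiled into the binary and the rest of the event loop can call a single, uniform set of functions.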

As shown in the figure below, the file event processor has four components: sockets, the I/O multiplexer, the file event dispatcher, and the event handlers.

[Figure: the four components of the file event processor]

A file event is an abstraction of a socket operation. When a socket is ready for an accept, read, write or close operation, a file event is generated. Because Redis usually holds connections on many sockets, multiple file events may occur concurrently.

The I/O multiplexer is responsible for listening on multiple sockets and passing the sockets that generated events to the file event dispatcher.

Although multiple file events may occur concurrently, the I/O multiplexer always places all sockets that generated events in the same queue (the fired table of aeEventLoop, described in a later section). The file event processor then handles the sockets in that queue, that is, the ready file events, in an orderly, synchronous, one-socket-at-a-time manner.
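The division of labour between multiplexer, dispatcher and handlers can be sketched in a few lines. This is a minimal illustration built on POSIX poll() rather than Redis's own code, and every name in it (mini_reactor, reactor_add, reactor_run_once, count_bytes) is invented for the example.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 16

typedef void handler_fn(int fd, void *data);

/* Hypothetical miniature reactor: one handler per watched descriptor. */
typedef struct {
    struct pollfd pfds[MAX_FDS];
    handler_fn *handlers[MAX_FDS];
    void *data[MAX_FDS];
    int count;
} mini_reactor;

/* Associate a read handler with a descriptor. */
static int reactor_add(mini_reactor *r, int fd, handler_fn *h, void *data) {
    if (r->count >= MAX_FDS) return -1;
    r->pfds[r->count].fd = fd;
    r->pfds[r->count].events = POLLIN;
    r->pfds[r->count].revents = 0;
    r->handlers[r->count] = h;
    r->data[r->count] = data;
    r->count++;
    return 0;
}

/* One loop iteration: the multiplexer (poll) reports ready descriptors,
 * then the dispatcher calls their handlers one at a time, synchronously. */
static int reactor_run_once(mini_reactor *r, int timeout_ms) {
    int dispatched = 0;
    int n = poll(r->pfds, (nfds_t)r->count, timeout_ms);
    if (n <= 0) return 0;
    for (int i = 0; i < r->count; i++) {
        if (r->pfds[i].revents & POLLIN) {
            r->handlers[i](r->pfds[i].fd, r->data[i]);
            dispatched++;
        }
    }
    return dispatched;
}

/* Example handler: adds the number of bytes read to a counter. */
static void count_bytes(int fd, void *data) {
    char buf[64];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0) *(int *)data += (int)n;
}
```

Registering the read end of a pipe with count_bytes, writing a few bytes to the other end and calling reactor_run_once once is enough to see a handler being dispatched. Handlers run strictly one after another, which is why a slow handler delays every other ready socket.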

[Figure: a Redis client connecting to the server and sending commands]

The process of a Redis client connecting to the server and sending commands is therefore as shown in the figure above.

  • When a client asks the server to establish a socket connection, the listening socket generates an AE_READABLE event, which triggers the connection accept handler. The handler answers the client's connection request, creates the client socket and the client state, and associates the AE_READABLE event of the client socket with the command request handler.
  • After the connection is established and the client sends a command to the server, the client socket generates an AE_READABLE event, which triggers the command request handler. The handler reads the client's command and passes it to the relevant code for execution.
  • Executing the command produces a command reply. To deliver the reply to the client, the server associates the AE_WRITABLE event of the client socket with the command reply handler. When the client is ready to read the reply, the client socket generates an AE_WRITABLE event, which triggers the command reply handler to write the whole reply to the socket.

Time events

Redis’s time events are divided into the following two categories:

  • Timed event: runs a piece of code once, after a specified time.
  • Periodic event: runs a piece of code repeatedly, at a specified interval.

The structure that defines a Redis time event is as follows.

typedef struct aeTimeEvent {
    /* Globally unique ID */
    long long id; /* time event identifier. */
    /* Seconds part of the UNIX timestamp at which the event is due */
    long when_sec; /* seconds */
    /* Milliseconds part of the time at which the event is due */
    long when_ms; /* milliseconds */
    /* Time event handler */
    aeTimeProc *timeProc;
    /* Callback invoked when the event is removed, used to release resources */
    aeEventFinalizerProc *finalizerProc;
    /* Private data passed to the handler */
    void *clientData;
    /* Previous node */
    struct aeTimeEvent *prev;
    /* Next node */
    struct aeTimeEvent *next;
} aeTimeEvent;

Whether a time event is a timed event or a periodic event depends on the return value of its time event handler:

  • If the return value is AE_NOMORE, the event is a timed event. It is deleted after it has fired and does not run again.
  • If the return value is anything other than AE_NOMORE, the event is a periodic event. When it fires, the server uses the handler's return value to update the event's when_sec and when_ms fields so that it comes due again after that many milliseconds.

Redis keeps all time events in an unordered linked list. Each time it processes time events, it traverses the whole list, finds every event that has come due, and calls the corresponding handler.
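The return-value convention can be tried out with a small stand-alone sketch. It uses a simulated clock passed in as a parameter, and all names (timer, fire_due_timers, run_once, every_100ms, TIMER_NOMORE) are hypothetical stand-ins rather than Redis identifiers.

```c
#include <stddef.h>

#define TIMER_NOMORE -1   /* plays the role of AE_NOMORE in this sketch */

typedef int timer_proc(void *data);

/* Hypothetical simplified time event in a singly linked, unsorted list. */
typedef struct timer {
    long long when_ms;    /* time at which the timer is due */
    timer_proc *proc;     /* handler; its return value decides the timer's fate */
    void *data;
    int active;           /* 0 once a one-shot timer has fired */
    struct timer *next;
} timer;

/* Walk the whole list, fire every due timer, and return how many fired. */
static int fire_due_timers(timer *head, long long now_ms) {
    int fired = 0;
    for (timer *t = head; t != NULL; t = t->next) {
        if (!t->active || t->when_ms > now_ms) continue;
        int ret = t->proc(t->data);
        fired++;
        if (ret == TIMER_NOMORE)
            t->active = 0;               /* timed event: never runs again */
        else
            t->when_ms = now_ms + ret;   /* periodic event: due again in ret ms */
    }
    return fired;
}

/* One-shot handler: counts the call and asks not to be rescheduled. */
static int run_once(void *data) { ++*(int *)data; return TIMER_NOMORE; }

/* Periodic handler: counts the call and asks to run again in 100 ms. */
static int every_100ms(void *data) { ++*(int *)data; return 100; }
```

Since the list is unsorted, every pass costs time proportional to the number of timers. That is a reasonable trade-off only when the list is short.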

Having introduced file events and time events, let's look at the concrete implementation of aeEventLoop.

Create event manager

The Redis server creates the event manager, an aeEventLoop object, in its initialization function initServer.

The aeCreateEventLoop function creates the event manager. It mainly initializes aeEventLoop fields such as events, fired, timeEventHead and apidata:

  • First, create the aeEventLoop object.
  • Initialize the not-ready file event table and the ready file event table. events points to the not-ready table and fired points to the ready table. The tables are filled in later, when specific events are added.
  • Initialize the time event list by setting the timeEventHead and timeEventNextId fields.
  • Call aeApiCreate to create the epoll instance and initialize apidata.

aeEventLoop *aeCreateEventLoop(int setsize) {
    aeEventLoop *eventLoop;
    int i;
    /*Create event state structure*/
    if ((eventLoop = zmalloc(sizeof(*eventLoop))) == NULL) goto err;
    /*Create not ready event table, ready event table*/
    eventLoop->events = zmalloc(sizeof(aeFileEvent)*setsize);
    eventLoop->fired = zmalloc(sizeof(aeFiredEvent)*setsize);
    if (eventLoop->events == NULL || eventLoop->fired == NULL) goto err;
    /*Set array size*/
    eventLoop->setsize = setsize;
    /*Initialize the time at which time events were last processed*/
    eventLoop->lastTime = time(NULL);
    /*Initialize time event structure*/
    eventLoop->timeEventHead = NULL;
    eventLoop->timeEventNextId = 0;
    eventLoop->stop = 0;
    eventLoop->maxfd = -1;
    eventLoop->beforesleep = NULL;
    eventLoop->aftersleep = NULL;
    /*Associate multiplexing IO with event manager*/
    if (aeApiCreate(eventLoop) == -1) goto err;
    /*Initialize listening events*/
    for (i = 0; i < setsize; i++)
        eventLoop->events[i].mask = AE_NONE;
    return eventLoop;
err:
   .....
}

The aeApiCreate function first creates an aeApiState object and allocates the epoll ready event table. It then calls epoll_create to create the epoll instance, and finally assigns the aeApiState to the apidata field.

In the aeApiState object, epfd stores the epoll file descriptor and events is an array of epoll ready events. When epoll events occur, the events and their descriptors are stored in this array. The array is allocated by the application, and the kernel fills it with the events that have occurred.

static int aeApiCreate(aeEventLoop *eventLoop) {
    aeApiState *state = zmalloc(sizeof(aeApiState));

    if (!state) return -1;
    /*Initializing epoll ready event table*/
    state->events = zmalloc(sizeof(struct epoll_event)*eventLoop->setsize);
    if (!state->events) {
        zfree(state);
        return -1;
    }
    /*Create epoll instance*/
    state->epfd = epoll_create(1024); /* 1024 is just a hint for the kernel */
    if (state->epfd == -1) {
        zfree(state->events);
        zfree(state);
        return -1;
    }
    /*Event manager associated with epoll*/
    eventLoop->apidata = state;
    return 0;
}

typedef struct aeApiState {
    /* epoll instance descriptor */
    int epfd;
    /*Store epoll ready event table*/
    struct epoll_event *events;
} aeApiState;

Create file event

aeFileEvent is the file event structure. Each event has a read handler and a write handler. Redis calls aeCreateFileEvent to register file events for the read and write events of different sockets.

typedef struct aeFileEvent {
    /* Mask of monitored event types: AE_READABLE and/or AE_WRITABLE */
    int mask;
    /* Read event handler */
    aeFileProc *rfileProc;
    /* Write event handler */
    aeFileProc *wfileProc;
    /* Private data passed to the handlers */
    void *clientData;
} aeFileEvent;

/* Function type of the event handlers, defined with typedef */
typedef void aeFileProc(struct aeEventLoop *eventLoop,
                        int fd, void *clientData, int mask);

For example, when Redis performs master-slave replication, the slave server needs to establish a connection to the master server. It initiates a socket connection and then calls aeCreateFileEvent to register an event handler, the syncWithMaster function, for the read and write events of that socket.

aeCreateFileEvent(server.el,fd,AE_READABLE|AE_WRITABLE,syncWithMaster,NULL);
/* Function definition conforming to aeFileProc */
void syncWithMaster(aeEventLoop *el, int fd, void *privdata, int mask) {....}

In aeCreateFileEvent, the fd parameter is the specific socket, proc is the handler to call when fd generates an event, and clientData is the data passed to the handler when it is called back.
aeCreateFileEvent does three main things:

  • Using fd as the index, look up the corresponding event in the events not-ready table.
  • Call aeApiAddEvent to register the event with the underlying I/O multiplexer, epoll in this case.
  • Fill in the event's callback, callback argument, event type and other fields.

int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask,
                       aeFileProc *proc, void *clientData)
{
    /* Look up the file event structure for fd; fd identifies the specific socket */
    aeFileEvent *fe = &eventLoop->events[fd];
    /*Listen to the specified event of the specified FD*/
    if (aeApiAddEvent(eventLoop, fd, mask) == -1)
        return AE_ERR;
    /*Set file event type and event handler*/
    fe->mask |= mask;
    if (mask & AE_READABLE) fe->rfileProc = proc;
    if (mask & AE_WRITABLE) fe->wfileProc = proc;
    /*Private data*/
    fe->clientData = clientData;
    if (fd > eventLoop->maxfd)
        eventLoop->maxfd = fd;
    return AE_OK;
}
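
The fd-indexed table and the way repeated registrations merge their masks can be shown in isolation. The following is a simplified sketch with invented names (event_table, table_register, on_read, on_write); the explicit bounds check on fd is an addition made for this example, since the listing above indexes the table without one.

```c
#include <stddef.h>

#define EV_NONE     0
#define EV_READABLE 1
#define EV_WRITABLE 2

typedef void file_proc(int fd, void *data, int mask);

/* Hypothetical simplified file event slot. */
typedef struct {
    int mask;
    file_proc *rproc;
    file_proc *wproc;
    void *data;
} file_event;

/* Hypothetical table: slot i describes file descriptor i. */
typedef struct {
    file_event *events;
    int setsize;
    int maxfd;
} event_table;

/* Register proc for the event types in mask, keeping earlier registrations. */
static int table_register(event_table *t, int fd, int mask,
                          file_proc *proc, void *data) {
    if (fd < 0 || fd >= t->setsize) return -1;   /* fd must fit in the table */
    file_event *fe = &t->events[fd];
    fe->mask |= mask;                            /* merge with existing events */
    if (mask & EV_READABLE) fe->rproc = proc;
    if (mask & EV_WRITABLE) fe->wproc = proc;
    fe->data = data;
    if (fd > t->maxfd) t->maxfd = fd;
    return 0;
}

/* Placeholder handlers used only to have distinct function pointers. */
static void on_read(int fd, void *data, int mask)  { (void)fd; (void)data; (void)mask; }
static void on_write(int fd, void *data, int mask) { (void)fd; (void)data; (void)mask; }
```

Indexing by fd makes lookup constant time, at the cost of sizing the table for the largest descriptor the process may ever use.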

As mentioned above, Redis has several underlying I/O multiplexing libraries, so aeApiAddEvent also has several implementations. The source code below is the epoll implementation. Its core operation is to call epoll_ctl to register the events of interest with epoll. For background on epoll, see "Java NIO source code analysis".

static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
    aeApiState *state = eventLoop->apidata;
    struct epoll_event ee = {0}; /* avoid valgrind warning */
    /* If fd is not yet associated with any event, this is an ADD operation.
     * If one or more events are already associated, this is a MOD operation. */
    int op = eventLoop->events[fd].mask == AE_NONE ?
            EPOLL_CTL_ADD : EPOLL_CTL_MOD;

    /*Register events to epoll*/
    ee.events = 0;
    mask |= eventLoop->events[fd].mask; /* Merge old events */
    if (mask & AE_READABLE) ee.events |= EPOLLIN;
    if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
    ee.data.fd = fd;
    /* Call the epoll_ctl system call to register the event with epoll */
    if (epoll_ctl(state->epfd,op,fd,&ee) == -1) return -1;
    return 0;
}
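
The ADD versus MOD distinction can be observed directly on Linux. The sketch below is Linux-only; the helper names watch_fd and demo_add_then_mod are made up for illustration, and passing the previously registered events in as a parameter is an assumption standing in for the lookup in the events table.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register `wanted` events for fd, merging them with the
 * `current` ones. current == 0 means fd is not yet known to epoll. */
static int watch_fd(int epfd, int fd, uint32_t current, uint32_t wanted) {
    struct epoll_event ee = {0};
    int op = (current == 0) ? EPOLL_CTL_ADD : EPOLL_CTL_MOD;
    ee.events = current | wanted;
    ee.data.fd = fd;
    return epoll_ctl(epfd, op, fd, &ee);
}

/* Registers the write end of a pipe for EPOLLOUT (ADD), then also for EPOLLIN
 * (MOD), and returns the number of descriptors epoll reports as ready. */
static int demo_add_then_mod(void) {
    int p[2];
    struct epoll_event ready[4];
    if (pipe(p) == -1) return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) { close(p[0]); close(p[1]); return -1; }
    int ok = watch_fd(epfd, p[1], 0, EPOLLOUT) == 0          /* first time: ADD */
          && watch_fd(epfd, p[1], EPOLLOUT, EPOLLIN) == 0;   /* afterwards: MOD */
    int n = ok ? epoll_wait(epfd, ready, 4, 0) : -1;
    close(epfd);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Using EPOLL_CTL_ADD a second time for the same descriptor would fail, which is why the caller has to know whether the descriptor is already registered.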

Event processing

Since Redis has both file events and time events, the server must schedule the two types: it has to decide when to handle file events, when to handle time events, and in what order.

The aeMain function calls aeProcessEvents in an infinite loop to handle all events.

void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        /*If there is a function that needs to be executed before event processing, execute it*/
        if (eventLoop->beforesleep != NULL)
            eventLoop->beforesleep(eventLoop);
        /*Start processing events*/
        aeProcessEvents(eventLoop, AE_ALL_EVENTS|AE_CALL_AFTER_SLEEP);
    }
}

Below is pseudocode for aeProcessEvents. It first finds the time event nearest to the current time and computes a timeout from it, then calls aeApiPoll to wait until the underlying I/O multiplexer reports ready events. After aeApiPoll returns, it handles all generated file events and all time events that have come due.

/* Pseudocode */
int aeProcessEvents(aeEventLoop *eventLoop, int flags) {
    /* Get the time event whose arrival time is nearest to the current time */
    time_event = aeSearchNearestTimer();
    /* Compute how many milliseconds remain until that event is due */
    remaining_ms = time_event.when - unix_ts_now();
    /* If the event is already due, remaining_ms is negative; set it to 0 */
    if (remaining_ms < 0) remaining_ms = 0;
    /* Build a timeval structure from remaining_ms */
    timeval = create_timeval_with_ms(remaining_ms);
    /* Block and wait for file events. The maximum blocking time is given by
     * timeval; if remaining_ms is 0, aeApiPoll returns immediately without
     * blocking. aeApiPoll calls epoll_wait to wait for I/O events. */
    aeApiPoll(timeval);
    /* Handle all generated file events */
    processFileEvents();
    /* Handle all time events that have come due */
    processTimeEvents();
}
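
The timeout calculation at the top of the pseudocode can be written as a small pure function. This is a sketch under assumptions: the name poll_timeout_ms is invented, due times are passed as a plain array of millisecond timestamps, and an empty array is taken to mean that the loop may block indefinitely (returned as -1).

```c
#include <stddef.h>

/* Hypothetical helper: how long may the loop block before the next timer?
 * Returns -1 when there are no timers, 0 when the nearest timer is already
 * due, otherwise the milliseconds remaining until it is due. */
static long long poll_timeout_ms(const long long *due_ms, size_t n,
                                 long long now_ms) {
    if (n == 0) return -1;
    long long nearest = due_ms[0];
    for (size_t i = 1; i < n; i++)
        if (due_ms[i] < nearest) nearest = due_ms[i];
    long long remaining = nearest - now_ms;
    return remaining < 0 ? 0 : remaining;
}
```

Clamping to zero matters: a negative timeout would be read by the multiplexer as "wait forever", which would delay a timer that is already overdue.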

Like aeApiAddEvent, aeApiPoll has several implementations. It does two things. First, it calls epoll_wait to block until epoll events are ready, using the timeout computed earlier from the nearest time event. Second, it copies the ready epoll events into the fired ready event table. aeApiPoll is the I/O multiplexer mentioned above. The process is shown in the figure below.

[Figure: the aeApiPoll process]

static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
    aeApiState *state = eventLoop->apidata;
    int retval, numevents = 0;

    /* Call epoll_wait. The timeout is computed from the nearest time event;
     * -1 (no timeval given) means block until an event arrives. */
    retval = epoll_wait(state->epfd,state->events,eventLoop->setsize,
            tvp ? (tvp->tv_sec*1000 + tvp->tv_usec/1000) : -1);
    /* At least one event is ready */
    if (retval > 0) {
        int j;

        /* Translate each ready epoll event into an AE mask and store it
         * in the fired array of the event loop */
        numevents = retval;
        for (j = 0; j < numevents; j++) {
            int mask = 0;
            struct epoll_event *e = state->events+j;

            if (e->events & EPOLLIN) mask |= AE_READABLE;
            if (e->events & EPOLLOUT) mask |= AE_WRITABLE;
            if (e->events & EPOLLERR) mask |= AE_WRITABLE;
            if (e->events & EPOLLHUP) mask |= AE_WRITABLE;
            /* Fill in the ready event table entry */
            eventLoop->fired[j].fd = e->data.fd;
            eventLoop->fired[j].mask = mask;
        }
    }
    /* Return the number of ready events */
    return numevents;
}
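
The timeout expression passed to epoll_wait is worth isolating, because epoll_wait takes whole milliseconds while a timeval carries microseconds. The helper name timeval_to_epoll_ms below is an assumption for this sketch; the arithmetic is the same as in the listing above.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: convert a timeval into an epoll_wait timeout.
 * NULL maps to -1 (block indefinitely); the microsecond part is
 * truncated to whole milliseconds. */
static int timeval_to_epoll_ms(const struct timeval *tvp) {
    if (tvp == NULL) return -1;
    return (int)(tvp->tv_sec * 1000 + tvp->tv_usec / 1000);
}
```

One consequence of the truncation is that any wait shorter than one millisecond becomes 0, so the call returns immediately instead of sleeping.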

processFileEvent is pseudocode for handling ready file events; it is the file event dispatcher mentioned above. It traverses the fired ready event table and, according to the event type, calls the handlers registered on each event: rfileProc for read events and wfileProc for write events.

/* Pseudocode */
int processFileEvent(aeEventLoop *eventLoop, int numevents) {
    int j, processed = 0;

    for (j = 0; j < numevents; j++) {
        /* Look up the registered file event for each fd in the ready array */
        aeFileEvent *fe = &eventLoop->events[eventLoop->fired[j].fd];
        int mask = eventLoop->fired[j].mask;
        int fd = eventLoop->fired[j].fd;
        int fired = 0; /* Number of handlers called for this fd */
        /* With AE_BARRIER set, the write handler runs before the read handler */
        int invert = fe->mask & AE_BARRIER;

        /* Read event: in the normal case the read handler runs first */
        if (!invert && fe->mask & mask & AE_READABLE) {
            fe->rfileProc(eventLoop,fd,fe->clientData,mask);
            fired++;
        }
        /* Write event: skip it if the same handler already ran for this fd */
        if (fe->mask & mask & AE_WRITABLE) {
            if (!fired || fe->wfileProc != fe->rfileProc) {
                fe->wfileProc(eventLoop,fd,fe->clientData,mask);
                fired++;
            }
        }
        /* Inverted case: the read handler runs after the write handler */
        if (invert && fe->mask & mask & AE_READABLE) {
            if (!fired || fe->wfileProc != fe->rfileProc) {
                fe->rfileProc(eventLoop,fd,fe->clientData,mask);
                fired++;
            }
        }
        processed++;
    }
    return processed;
}

processTimeEvents is the function that handles time events. It traverses the time event list of aeEventLoop and, if a time event has come due, calls its timeProc function. Whether the return value equals AE_NOMORE determines whether the event is periodic; if it is, its arrival time is updated.

static int processTimeEvents(aeEventLoop *eventLoop) {
    int processed = 0;
    aeTimeEvent *te;
    long long maxId;
    time_t now = time(NULL);
    ....
    eventLoop->lastTime = now;

    te = eventLoop->timeEventHead;
    maxId = eventLoop->timeEventNextId-1;
    /*Traversing the time event list*/
    while(te) {
        long now_sec, now_ms;
        long long id;

        /*Delete the time event that needs to be deleted*/
        if (te->id == AE_DELETED_EVENT_ID) {
            aeTimeEvent *next = te->next;
            if (te->prev)
                te->prev->next = te->next;
            else
                eventLoop->timeEventHead = te->next;
            if (te->next)
                te->next->prev = te->prev;
            if (te->finalizerProc)
                te->finalizerProc(eventLoop, te->clientData);
            zfree(te);
            te = next;
            continue;
        }

        /* If the id is greater than maxId, the event was created during this traversal, so skip it */
        if (te->id > maxId) {
            te = te->next;
            continue;
        }
        aeGetTime(&now_sec, &now_ms);
        /*Event has arrived, call its timeproc function*/
        if (now_sec > te->when_sec ||
            (now_sec == te->when_sec && now_ms >= te->when_ms))
        {
            int retval;

            id = te->id;
            retval = te->timeProc(eventLoop, id, te->clientData);
            processed++;
            /* A return value other than AE_NOMORE means a periodic event: update its when_sec and when_ms fields */
            if (retval != AE_NOMORE) {
                aeAddMillisecondsToNow(retval,&te->when_sec,&te->when_ms);
            } else {
                /*One time event, marked to be deleted, will be deleted in the next traversal*/
                te->id = AE_DELETED_EVENT_ID;
            }
        }
        te = te->next;
    }
    return processed;
}
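
The deferred deletion used in the traversal (mark the node first, unlink it on a later pass) can be isolated as follows. This is a sketch with invented names (node, push_front, sweep_deleted, DELETED_ID) and it uses malloc/free from the C standard library in place of Redis's allocator.

```c
#include <stdlib.h>

#define DELETED_ID -1   /* plays the role of AE_DELETED_EVENT_ID */

/* Hypothetical doubly linked list node carrying only an id. */
typedef struct node {
    long long id;
    struct node *prev, *next;
} node;

/* Insert a new node at the head of the list; returns NULL on allocation failure. */
static node *push_front(node **head, long long id) {
    node *n = malloc(sizeof(*n));
    if (n == NULL) return NULL;
    n->id = id;
    n->prev = NULL;
    n->next = *head;
    if (*head) (*head)->prev = n;
    *head = n;
    return n;
}

/* Unlink and free every node marked as deleted; returns how many were removed. */
static int sweep_deleted(node **head) {
    int removed = 0;
    node *n = *head;
    while (n) {
        node *next = n->next;            /* save before n may be freed */
        if (n->id == DELETED_ID) {
            if (n->prev) n->prev->next = n->next;
            else *head = n->next;        /* n was the head */
            if (n->next) n->next->prev = n->prev;
            free(n);
            removed++;
        }
        n = next;
    }
    return removed;
}
```

Marking instead of unlinking immediately means a handler can request the deletion of an event, including its own, while the list is being traversed, without invalidating the traversal.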

Delete event

When an event is no longer needed, it has to be deleted. For example, if an fd is monitored for both read and write events and the write events no longer need to be listened for, the write events of that fd can be deleted.

The aeDeleteFileEvent function works in the following steps:
1. Use fd to find the event in the not-ready table.
2. Clear the corresponding event flags for that fd.
3. Call aeApiDelEvent, so that the kernel stops epoll monitoring of the corresponding events in its red-black tree.
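
The bookkeeping side of these steps can be sketched without touching epoll. The names below (watch_slot, watch_table, watch_remove) are hypothetical, and the sketch assumes the table tracks the highest descriptor in use, as the maxfd field in the earlier listings suggests; the call into the multiplexer is left out.

```c
#define EV_NONE     0
#define EV_READABLE 1
#define EV_WRITABLE 2

/* Hypothetical slot: only the mask matters for this sketch. */
typedef struct { int mask; } watch_slot;

/* Hypothetical table indexed by fd, tracking the highest fd in use. */
typedef struct {
    watch_slot *events;
    int setsize;
    int maxfd;
} watch_table;

/* Stop watching the event types in mask for fd. */
static void watch_remove(watch_table *t, int fd, int mask) {
    if (fd < 0 || fd >= t->setsize) return;
    watch_slot *fe = &t->events[fd];
    if (fe->mask == EV_NONE) return;      /* nothing registered for this fd */
    fe->mask &= ~mask;                    /* clear only the requested bits */
    if (fd == t->maxfd && fe->mask == EV_NONE) {
        /* The highest fd is gone: scan downwards for the new maximum. */
        int j;
        for (j = t->maxfd - 1; j >= 0; j--)
            if (t->events[j].mask != EV_NONE) break;
        t->maxfd = j;                     /* -1 when the table is empty */
    }
}
```

Removing only the requested bits is what allows the write events of a descriptor to be dropped while its read events stay registered.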

Postscript

Next, we will look at how Redis master-slave replication works. Stay tuned.

Programmer Li Xiaobing’s blog
