Redis command execution process (II)

Time:2020-2-26

In the previous article, "Redis command execution process (I)", we looked at the overall flow of command execution in Redis and then analyzed the principles and implementation details of each stage: Redis startup, accepting socket connections, reading socket data into the input buffer, parsing the command, and executing it. In this article we look at how the SET and GET commands are implemented and how command results are sent back to the Redis client through the output buffer and the socket.

Implementation of the SET and GET commands

As mentioned earlier, the processCommand function resolves the matching redisCommand from the contents of the input buffer and then calls the call function, which invokes the proc function of that redisCommand. Each command has its own proc: for the redisCommand named set it is setCommand, and for get it is getCommand. Dispatching through a function pointer in this way gives C the kind of polymorphism that is common in Java.

void call(client *c, int flags) {
    ....
    c->cmd->proc(c);
    ....
}
//The redisCommand structure
struct redisCommand {
    char *name;
    //Function pointer to the command's implementation
    redisCommandProc *proc;
    ... // other definitions
};
//Function type of a command implementation, defined with typedef
typedef void redisCommandProc(client *c);
//Each command name maps to its own implementation function.
struct redisCommand redisCommandTable[] = {
    {"get",getCommand,2,"rF",0,NULL,1,1,1,0,0},
    {"set",setCommand,-3,"wm",0,NULL,1,1,1,0,0},
    {"hmset",hsetCommand,-4,"wmF",0,NULL,1,1,1,0,0},
    ... // entries for all other Redis commands
};
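
The dispatch pattern above can be reduced to a few lines. What follows is a minimal sketch, not Redis code: toy_client, toy_command, toy_lookup, toy_call and the two handlers are hypothetical names invented for illustration, and the table is searched linearly for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for client and redisCommand. */
typedef struct toy_client { const char *last_reply; } toy_client;
typedef void toy_command_proc(toy_client *c);

typedef struct toy_command {
    const char *name;
    toy_command_proc *proc;   /* handler selected at run time */
} toy_command;

static void toy_get(toy_client *c) { c->last_reply = "get handler ran"; }
static void toy_set(toy_client *c) { c->last_reply = "set handler ran"; }

static toy_command toy_table[] = {
    {"get", toy_get},
    {"set", toy_set},
};

/* Linear search for brevity; returns NULL for an unknown name. */
toy_command *toy_lookup(const char *name) {
    size_t n = sizeof(toy_table) / sizeof(toy_table[0]);
    for (size_t i = 0; i < n; i++)
        if (strcmp(toy_table[i].name, name) == 0) return &toy_table[i];
    return NULL;
}

/* Same shape as c->cmd->proc(c): the caller never names the handler.
 * Returns 0 if a handler ran, -1 if the command is unknown. */
int toy_call(toy_client *c, const char *name) {
    toy_command *cmd = toy_lookup(name);
    if (cmd == NULL) return -1;
    cmd->proc(c);
    return 0;
}
```

Calling toy_call(&c, "set") runs toy_set without the caller referring to it directly, which is the effect the proc pointer has in call.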

setCommand checks whether the SET command carries optional arguments such as NX, XX, EX or PX, and then calls setGenericCommand. Let's go straight to setGenericCommand.

The processing logic of setGenericCommand is as follows:

  • If an expiration time was given, parse and validate it; a value that is not positive is rejected with an error.
  • Check whether the command is SET NX or SET XX. With NX, if the key already exists, return immediately; with XX, if the key does not exist, return immediately.
  • Call setKey to add the key and value to the corresponding Redis database.
  • If an expiration time was given, call setExpire to set it.
  • Send the keyspace notifications.
  • Return the result to the client.

// t_string.c
void setGenericCommand(client *c, int flags, robj *key, robj *val, robj *expire, int unit, robj *ok_reply, robj *abort_reply) {
    long long milliseconds = 0; 
    /**
     *An expiration time was given; expire is a robj, so extract its integer value
     */
    if (expire) {
        if (getLongLongFromObjectOrReply(c, expire, &milliseconds, NULL) != C_OK)
            return;
        if (milliseconds <= 0) {
            addReplyErrorFormat(c,"invalid expire time in %s",c->cmd->name);
            return;
        }
        if (unit == UNIT_SECONDS) milliseconds *= 1000;
    }
    /**
     *Abort when NX is set and the key exists, or when XX is set and the key does not exist
     *lookupKeyWrite checks whether the key exists in the given database
     */
    if ((flags & OBJ_SET_NX && lookupKeyWrite(c->db,key) != NULL) ||
        (flags & OBJ_SET_XX && lookupKeyWrite(c->db,key) == NULL))
    {
        addReply(c, abort_reply ? abort_reply : shared.nullbulk);
        return;
    }
    /**
     *Add to data dictionary
     */
    setKey(c->db,key,val);
    server.dirty++;
    /**
     *Expiration time added to expiration dictionary
     */
    if (expire) setExpire(c,c->db,key,mstime()+milliseconds);
    /**
     *Key space notification
     */
    notifyKeyspaceEvent(NOTIFY_STRING,"set",key,c->db->id);
    if (expire) notifyKeyspaceEvent(NOTIFY_GENERIC,
        "expire",key,c->db->id);
    /**
     *Send the reply; addReply is explained in detail later in this article
     */
    addReply(c, ok_reply ? ok_reply : shared.ok);
}
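
The two decisions that setGenericCommand makes before touching the database can be isolated as pure functions. This is a sketch under stated assumptions: the SKETCH_* constants and both function names are hypothetical, and Redis defines its own OBJ_SET_* and UNIT_* values.

```c
#include <stdbool.h>

/* Hypothetical flag and unit values for this sketch only. */
enum { SKETCH_SET_NX = 1 << 0, SKETCH_SET_XX = 1 << 1 };
enum { SKETCH_UNIT_SECONDS, SKETCH_UNIT_MILLISECONDS };

/* Returns true when the write should proceed, false when it must be aborted. */
bool sketch_set_allowed(int flags, bool key_exists) {
    if ((flags & SKETCH_SET_NX) && key_exists) return false;   /* NX: only if absent */
    if ((flags & SKETCH_SET_XX) && !key_exists) return false;  /* XX: only if present */
    return true;
}

/* Converts a relative TTL to an absolute deadline in milliseconds.
 * Returns -1 for a TTL that is not positive, the case the real command
 * answers with an "invalid expire time" error. */
long long sketch_expire_deadline(long long ttl, int unit, long long now_ms) {
    if (ttl <= 0) return -1;
    if (unit == SKETCH_UNIT_SECONDS) ttl *= 1000;
    return now_ms + ttl;
}
```

Keeping the deadline absolute, as the call to setExpire with mstime()+milliseconds does, means a later lookup only has to compare the stored value against the current time.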

We will not go into the implementation of setKey and setExpire. In short, setKey adds the key and value to the dict hash table of the db, and setExpire adds the key and its expiration time to the expires hash table.
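
To make the two-table layout concrete, here is a toy model. It is only a sketch: fixed-size arrays stand in for the real hash tables, and two_db, two_set_key, two_set_expire and two_get_expire are hypothetical names, not Redis functions.

```c
#include <stddef.h>
#include <string.h>

#define TWO_CAP 8   /* arbitrary capacity for this sketch */

typedef struct { const char *key; const char *val; } two_entry;
typedef struct { const char *key; long long when_ms; } two_expire;

typedef struct two_db {
    two_entry  dict[TWO_CAP];     size_t dict_used;     /* key -> value */
    two_expire expires[TWO_CAP];  size_t expires_used;  /* key -> deadline */
} two_db;

/* Insert or overwrite a value in the main table. Returns 0, or -1 if full. */
int two_set_key(two_db *db, const char *key, const char *val) {
    for (size_t i = 0; i < db->dict_used; i++)
        if (strcmp(db->dict[i].key, key) == 0) { db->dict[i].val = val; return 0; }
    if (db->dict_used == TWO_CAP) return -1;
    db->dict[db->dict_used++] = (two_entry){key, val};
    return 0;
}

/* Record an absolute deadline in the separate expires table. */
int two_set_expire(two_db *db, const char *key, long long when_ms) {
    for (size_t i = 0; i < db->expires_used; i++)
        if (strcmp(db->expires[i].key, key) == 0) { db->expires[i].when_ms = when_ms; return 0; }
    if (db->expires_used == TWO_CAP) return -1;
    db->expires[db->expires_used++] = (two_expire){key, when_ms};
    return 0;
}

/* Returns the deadline, or -1 when the key has no entry in the expires table. */
long long two_get_expire(const two_db *db, const char *key) {
    for (size_t i = 0; i < db->expires_used; i++)
        if (strcmp(db->expires[i].key, key) == 0) return db->expires[i].when_ms;
    return -1;
}
```

Only keys with a deadline occupy a slot in the second table, so a key without an expiration costs nothing there and a lookup for it yields a negative value.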

Next, let's look at the implementation of getCommand. It likewise delegates to an underlying function, getGenericCommand.

getGenericCommand calls lookupKeyReadOrReply to look up the key in the dict hash table. If the key is not found, the null bulk reply has already been sent and the function returns C_OK directly. If it is found, the function checks the type of the value: for a value that is not a string it calls addReply to send a wrong-type error and returns C_ERR; for a string it calls addReplyBulk to add the value to the output buffer and returns C_OK.

int getGenericCommand(client *c) {
    robj *o;
    //Call lookupkeyreadorreply to find the corresponding key from the data dictionary
    if ((o = lookupKeyReadOrReply(c,c->argv[1],shared.nullbulk)) == NULL)
        return C_OK;
    //If the value is not a string, reply with a wrong-type error; otherwise call addReplyBulk to return the value
    if (o->type != OBJ_STRING) {
        addReply(c,shared.wrongtypeerr);
        return C_ERR;
    } else {
        addReplyBulk(c,o);
        return C_OK;
    }
}

lookupKeyReadWithFlags looks up the key in the redisDb. It first calls expireIfNeeded to determine whether the key has expired and needs to be deleted. If the key has expired, master and slave instances handle the case differently (elided in the excerpt below); otherwise it calls lookupKey to find the value in the dict hash table and return it, updating the keyspace hit and miss counters along the way. See the comments in the code for details.

/*
 *Look up a key for a read operation. Returns NULL if the key is not found or has logically expired. Side effects:
 *1. If the key has reached its expiration time, it is expired and deleted
 *2. The last access time of the key is updated
 *3. The global keyspace hit/miss statistics are updated
 *flags has two values: LOOKUP_NONE, and LOOKUP_NOTOUCH, which leaves the last access time unmodified
 */
robj *lookupKeyReadWithFlags(redisDb *db, robj *key, int flags) { // db.c
    robj *val;
    //Check if the key is out of date
    if (expireIfNeeded(db,key) == 1) {
        ... // special handling of this situation by master and slave
    }
    //Find key dictionary
    val = lookupKey(db,key,flags);
    //Update global cache hit rate
    if (val == NULL)
        server.stat_keyspace_misses++;
    else
        server.stat_keyspace_hits++;
    return val;
}

Before calling any of the lookupKey* functions, Redis calls expireIfNeeded to determine whether the key has expired, and then deletes it synchronously or asynchronously depending on whether lazy deletion is configured. For details on key deletion, see the article "Detailed explanation of Redis memory management mechanism and implementation".

There are two special cases in the logic that decides whether a key has expired:

  • If the current Redis is a slave instance in a master-slave setup, it only reports whether the key has expired and does not delete the key itself; it waits for the delete command sent by the master instance. If the current Redis is the master instance, it propagates the expiration by calling propagateExpire.
  • If a Lua script is currently executing, then because script execution is atomic and transactional, expiration is evaluated against the time at which the script started. In other words, a key that had not expired when the script started will not expire at any point during its execution.

/*
 *Call this function before calling any of the lookupKey* functions.
 *On a slave:
 *The slave does not actively expire and delete the key, but the return value still reports the key as expired.
 *On a master:
 *An expired key is actively deleted, and the deletion is propagated to the AOF and to the slaves.
 *A return value of 0 means the key is still valid; otherwise 1 is returned
 */
int expireIfNeeded(redisDb *db, robj *key) { // db.c
    //Get key expiration time
    mstime_t when = getExpire(db,key);
    mstime_t now;

    if (when < 0) return 0;

    /*
     *If a Lua script is currently executing, then because script execution is atomic, expiration is evaluated against the time at which the script started
     *That is, a key that had not expired when the script started will not expire during its entire execution.
     */
    now = server.lua_caller ? server.lua_time_start : mstime();

    //On a slave, just report whether the key has expired
    if (server.masterhost != NULL) return now > when;
    //On a master, return directly if the key has not expired
    if (now <= when) return 0;

    //The key has expired: delete it
    server.stat_expiredkeys++;
    //Propagate the expiration
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    //Send the keyspace event
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    //Delete asynchronously or synchronously depending on the lazy-free setting
    return server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                         dbSyncDelete(db,key);
}
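
The branching in expireIfNeeded can be summarized as a single decision function. The sketch below is a simplification under assumptions: expiry_action and sketch_expiry_action are hypothetical names, a negative script_start_ms stands for "no script running", and the actual deletion and propagation are left out.

```c
#include <stdbool.h>

typedef enum {
    EXPIRY_KEEP,         /* key is still valid (or has no expiration) */
    EXPIRY_REPORT_ONLY,  /* slave: report as expired, wait for the master's delete */
    EXPIRY_DELETE        /* master: delete the key and propagate the expiration */
} expiry_action;

/* when_ms < 0 means no expiration is set.
 * script_start_ms >= 0 means a script is running and time is frozen there. */
expiry_action sketch_expiry_action(long long when_ms, long long wall_now_ms,
                                   long long script_start_ms, bool is_slave) {
    if (when_ms < 0) return EXPIRY_KEEP;
    long long now = (script_start_ms >= 0) ? script_start_ms : wall_now_ms;
    if (now <= when_ms) return EXPIRY_KEEP;
    return is_slave ? EXPIRY_REPORT_ONLY : EXPIRY_DELETE;
}
```

For a key with deadline 50 and wall-clock time 100, a master deletes it, a slave only reports it, and a script that started at time 40 still sees it as valid.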

lookupKey looks up the key in the dict hash table of the redisDb through dictFind. If the key is found, it either updates the LRU last access time or calls updateLFU to update the LFU counter, depending on the server's maxmemory policy. These values are later used to choose which keys to evict when memory runs short.

robj *lookupKey(redisDb *db, robj *key, int flags) {
    //Dictfind gets the entry of the dictionary according to the key
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        //Get value
        robj *val = dictGetVal(de);
        //Only when no RDB or AOF child process is running and LOOKUP_NOTOUCH is not set
        if (server.rdb_child_pid == -1 &&
            server.aof_child_pid == -1 &&
            !(flags & LOOKUP_NOTOUCH))
        {
            //If the maxmemory policy is LFU-based
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                updateLFU(val);
            } else {
                //Update last access time
                val->lru = LRU_CLOCK();
            }
        }
        return val;
    } else {
        return NULL;
    }
}
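
The bookkeeping rule in lookupKey is easy to get backwards, so here it is as a standalone sketch. Everything in it is hypothetical (sketch_obj, sketch_touch, the SKETCH_LOOKUP_* values), and the LFU branch is a plain increment, which is a deliberate simplification of what updateLFU does.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch only. */
enum { SKETCH_LOOKUP_NONE = 0, SKETCH_LOOKUP_NOTOUCH = 1 << 0 };

typedef struct sketch_obj {
    unsigned last_access;   /* LRU-style clock value */
    unsigned access_count;  /* simplified stand-in for the LFU counter */
} sketch_obj;

/* Updates access metadata unless a child process is running or the caller
 * asked not to touch the object. Returns true if metadata was updated. */
bool sketch_touch(sketch_obj *o, int flags, bool child_running,
                  bool lfu_policy, unsigned clock) {
    if (child_running || (flags & SKETCH_LOOKUP_NOTOUCH)) return false;
    if (lfu_policy)
        o->access_count++;        /* simplification: plain increment */
    else
        o->last_access = clock;   /* remember when the key was last read */
    return true;
}
```

Skipping the update while a child process exists avoids writing to objects whose memory pages are shared with that child.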

Write command results to output buffer

Almost every redisCommand ends by calling addReply to return its result, which brings our analysis to the stage where a command returns data.

addReply does two things:

  • prepareClientToWrite decides whether data needs to be returned at all and adds the current client to the queue of clients waiting to have data written back.
  • _addReplyToBuffer and _addReplyObjectToList write the return value to the output buffer, where it waits to be written to the socket.

void addReply(client *c, robj *obj) {
    if (prepareClientToWrite(c) != C_OK) return;
    if (sdsEncodedObject(obj)) {
        //Add the response to the output buffer: try the fixed buffer first and, if that fails, append to the reply list
        if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != C_OK)
            _addReplyObjectToList(c,obj);
    } else if (obj->encoding == OBJ_ENCODING_INT) {
        ... // optimization of special cases
    } else {
        serverPanic("Wrong obj->encoding in addReply()");
    }
}

prepareClientToWrite first determines whether the current client needs data returned to it:

  • A client used for Lua script execution always needs the return value;
  • If the client has sent the CLIENT REPLY OFF or SKIP command, no return value is sent;
  • If the client represents the master instance in master-slave replication, no return value is sent;
  • If the client is a fake client used while loading the AOF, no return value is sent.

Then, if the client is not yet marked as pending write, it is marked as such and added to the queue of clients waiting for their replies to be written, that is, the clients_pending_write queue.

int prepareClientToWrite(client *c) {
    //Lua and module clients always accept replies
    if (c->flags & (CLIENT_LUA|CLIENT_MODULE)) return C_OK;
    //The client has sent CLIENT REPLY OFF or SKIP, so no reply is sent
    if (c->flags & (CLIENT_REPLY_OFF|CLIENT_REPLY_SKIP)) return C_ERR;

    //A client that represents the master gets no reply unless CLIENT_MASTER_FORCE_REPLY is set
    if ((c->flags & CLIENT_MASTER) &&
        !(c->flags & CLIENT_MASTER_FORCE_REPLY)) return C_ERR;
    //The fake client used for AOF loading has no socket and gets no reply
    if (c->fd <= 0) return C_ERR;

    //Add the client to the queue of clients waiting for replies to be written; the next event loop iteration writes them.
    if (!clientHasPendingReplies(c) &&
        !(c->flags & CLIENT_PENDING_WRITE) &&
        (c->replstate == REPL_STATE_NONE ||
         (c->replstate == SLAVE_STATE_ONLINE && !c->repl_put_online_on_ack)))
    {
        //Set flags and add clients to the clients pending write queue
        c->flags |= CLIENT_PENDING_WRITE;
        listAddNodeHead(server.clients_pending_write,c);
    }
    //Indicates that it has been queued to return data
    return C_OK;
}
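
The key property of prepareClientToWrite is that a client is queued at most once, however many replies it accumulates. A minimal sketch of that idea follows; sk_client, sk_queue, sk_prepare_to_write and the SK_* flags are hypothetical, the queue is a small fixed array, and the replication-state checks are omitted.

```c
#include <stddef.h>

#define SK_QUEUE_CAP 16   /* arbitrary capacity for this sketch */

/* Hypothetical flag values for this sketch only. */
enum { SK_PENDING_WRITE = 1 << 0, SK_REPLY_OFF = 1 << 1 };

typedef struct sk_client { int flags; int fd; } sk_client;
typedef struct sk_queue { sk_client *items[SK_QUEUE_CAP]; size_t len; } sk_queue;

/* Returns 0 if the caller may append reply data, -1 if the reply is dropped.
 * A full queue is not handled here; a real list would simply grow. */
int sk_prepare_to_write(sk_queue *q, sk_client *c) {
    if (c->flags & SK_REPLY_OFF) return -1;   /* client asked for no replies */
    if (c->fd <= 0) return -1;                /* fake client without a socket */
    if (!(c->flags & SK_PENDING_WRITE) && q->len < SK_QUEUE_CAP) {
        c->flags |= SK_PENDING_WRITE;         /* the flag prevents a second enqueue */
        q->items[q->len++] = c;
    }
    return 0;
}
```

Calling sk_prepare_to_write twice for the same client leaves the queue length at one, because the flag set by the first call short-circuits the second.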

Redis divides the output buffer, the space that holds response data waiting to be returned, into two parts: a fixed-size buffer and a linked list of response data. When the list is empty and the buffer has enough space, the response is added to the buffer. Otherwise a node is created and appended to the linked list. _addReplyToBuffer and _addReplyObjectToList write to these two areas respectively.

Together, the fixed buffer and the response list form a single queue. This layout saves memory and also avoids frequent memory allocation and release.
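
A small model of the two-part output buffer may help. This is a sketch, not the Redis implementation: out_buffer, out_node and the out_* functions are hypothetical names, and OUT_BUF_SIZE is deliberately tiny so that the overflow path is easy to trigger.

```c
#include <stdlib.h>
#include <string.h>

#define OUT_BUF_SIZE 16   /* hypothetical size, tiny on purpose */

typedef struct out_node {
    struct out_node *next;
    size_t len;
    char data[];          /* payload stored inline after the header */
} out_node;

typedef struct out_buffer {
    char buf[OUT_BUF_SIZE];
    size_t bufpos;        /* bytes used in the fixed buffer */
    out_node *head, *tail;
    size_t list_len;      /* nodes in the overflow list */
} out_buffer;

/* Fixed buffer first. Fails if the list is already in use (appending here
 * would reorder replies) or if the data does not fit. */
static int out_add_to_buffer(out_buffer *o, const char *s, size_t len) {
    if (o->list_len > 0) return -1;
    if (len > OUT_BUF_SIZE - o->bufpos) return -1;
    memcpy(o->buf + o->bufpos, s, len);
    o->bufpos += len;
    return 0;
}

static int out_add_to_list(out_buffer *o, const char *s, size_t len) {
    out_node *n = malloc(sizeof(*n) + len);
    if (n == NULL) return -1;
    n->next = NULL;
    n->len = len;
    memcpy(n->data, s, len);
    if (o->tail != NULL) o->tail->next = n; else o->head = n;
    o->tail = n;
    o->list_len++;
    return 0;
}

/* Returns 0 on success, -1 only if allocation fails. */
int out_add_reply(out_buffer *o, const char *s, size_t len) {
    if (out_add_to_buffer(o, s, len) == 0) return 0;
    return out_add_to_list(o, s, len);
}

void out_free(out_buffer *o) {
    out_node *n = o->head;
    while (n != NULL) {
        out_node *next = n->next;
        free(n);
        n = next;
    }
    o->head = o->tail = NULL;
    o->list_len = 0;
}
```

Short replies never allocate, which is where the memory and allocation savings come from. Once one reply has spilled into the list, later replies follow it there so that the order of responses is preserved.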

That covers writing the response to the output buffer. Next, let's look at how data is written from the output buffer to the socket.

Once prepareClientToWrite has added the client to the clients_pending_write queue, the event handling for this request is finished. Redis writes the response from the output buffer to the socket during the next iteration of the event loop.

Write command return value to socket from output buffer

As described in the article "Details of the Redis event mechanism", Redis calls the beforeSleep function between two iterations of the event loop to take care of various tasks, and the clients_pending_write list is handled there.

The aeMain function below is the main logic of the Redis event loop. You can see that beforesleep is called on every iteration.

void aeMain(aeEventLoop *eventLoop) { // ae.c
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        /*If there is a function that needs to be executed before the event is processed, execute it*/
        if (eventLoop->beforesleep != NULL)
            eventLoop->beforesleep(eventLoop);
        /*Start processing events*/
        aeProcessEvents(eventLoop, AE_ALL_EVENTS|AE_CALL_AFTER_SLEEP);
    }
}
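
The hook-per-iteration structure of aeMain can be tried out in isolation. The sketch below is a toy loop, not the ae library: mini_loop, mini_main, mini_flush_hook and the simulated event counter are all hypothetical.

```c
#include <stddef.h>

typedef struct mini_loop mini_loop;
typedef void mini_hook(mini_loop *loop);

struct mini_loop {
    int stop;
    mini_hook *beforesleep;  /* optional; runs at the top of every iteration */
    int hook_calls;          /* counts hook invocations, for demonstration */
    int events_left;         /* simulated amount of work before stopping */
};

/* Stand-in for event processing: consume one unit of work per iteration. */
static void mini_process_events(mini_loop *loop) {
    if (--loop->events_left <= 0) loop->stop = 1;
}

/* Stand-in for a hook that would flush pending client replies. */
void mini_flush_hook(mini_loop *loop) { loop->hook_calls++; }

void mini_main(mini_loop *loop) {
    loop->stop = 0;
    while (!loop->stop) {
        if (loop->beforesleep != NULL) loop->beforesleep(loop);
        mini_process_events(loop);
    }
}
```

Because the hook runs before each round of event processing, replies queued while handling one batch of events are flushed at the start of the next iteration, with no extra event needed to trigger them.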

The beforeSleep function calls handleClientsWithPendingWrites to process the clients_pending_write list.

handleClientsWithPendingWrites traverses the clients_pending_write list. For each client it first calls writeToClient to try to write the returned data from the output buffer to the socket. If some data could not be written, it calls aeCreateFileEvent to register sendReplyToClient as the write event handler and leaves the rest to the Redis event mechanism.

The advantage is that for clients with little data to return, there is no need to register a write event and wait for it to fire before writing to the socket. Instead, the data is written directly to the socket in the next iteration of the event loop, which speeds up the response.

On the other hand, if the clients_pending_write queue grows too long, processing it takes a long time and blocks normal event handling, which increases the latency of subsequent Redis commands.

//Try to write replies directly to the client sockets, without first registering a write event handler
int handleClientsWithPendingWrites(void) {
    listIter li;
    listNode *ln;
    //Number of clients in the pending write queue
    int processed = listLength(server.clients_pending_write);

    listRewind(server.clients_pending_write,&li);
    //Handle in turn
    while((ln = listNext(&li))) {
        client *c = listNodeValue(ln);
        c->flags &= ~CLIENT_PENDING_WRITE;
        listDelNode(server.clients_pending_write,ln);

        //Try to write the buffered data to the client socket; C_ERR means the client was freed, so skip the rest
        if (writeToClient(c->fd,c,0) == C_ERR) continue;

        //There is still data not written. You can only register to write event handler
        if (clientHasPendingReplies(c)) {
            int ae_flags = AE_WRITABLE;
            if (server.aof_state == AOF_ON &&
                server.aof_fsync == AOF_FSYNC_ALWAYS)
            {
                ae_flags |= AE_BARRIER;
            }
            //Register write event handler sendreplytoclient, waiting for execution
            if (aeCreateFileEvent(server.el, c->fd, ae_flags,
                sendReplyToClient, c) == AE_ERR)
            {
                    freeClientAsync(c);
            }
        }
    }
    return processed;
}
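
The "write first, register a handler only if needed" strategy can be simulated without sockets. In this sketch pw_client, pw_try_write and pw_flush_pending are hypothetical, and socket_room models how many bytes the kernel would accept right now.

```c
#include <stddef.h>

typedef struct pw_client {
    size_t pending;         /* bytes still waiting in the output buffer */
    size_t socket_room;     /* bytes the simulated socket accepts right now */
    int handler_installed;  /* 1 once a write event handler is registered */
} pw_client;

/* Simulated direct write: send as much as the socket will take. */
static void pw_try_write(pw_client *c) {
    size_t n = c->pending < c->socket_room ? c->pending : c->socket_room;
    c->pending -= n;
    c->socket_room -= n;
}

/* Tries a direct write for every client and falls back to a handler only
 * for those with data left over. Returns how many handlers were needed. */
size_t pw_flush_pending(pw_client *clients, size_t n) {
    size_t deferred = 0;
    for (size_t i = 0; i < n; i++) {
        pw_try_write(&clients[i]);
        if (clients[i].pending > 0) {
            clients[i].handler_installed = 1;
            deferred++;
        }
    }
    return deferred;
}
```

With one client holding 10 bytes and another holding 100, both facing 64 bytes of socket room, only the second ends up with a handler. The small reply goes out without any event registration at all.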

sendReplyToClient in turn calls writeToClient, which writes as much data as possible from buf and the reply list in the output buffer to the corresponding socket.

//Write the data in the output buffer to the socket; returns C_OK if the client is still valid afterwards and C_ERR if it was freed
int writeToClient(int fd, client *c, int handler_installed) {
    ssize_t nwritten = 0, totwritten = 0;
    size_t objlen;
    sds o;
    //Loop while there is still data to write
    while(clientHasPendingReplies(c)) {
        //If there is data in the buffer
        if (c->bufpos > 0) {
            //Write to the socket represented by FD
            nwritten = write(fd,c->buf+c->sentlen,c->bufpos-c->sentlen);
            if (nwritten <= 0) break;
            c->sentlen += nwritten;
            //Count how many bytes have been written in this call
            totwritten += nwritten;

            //If the data in the buffer has been sent, reset the flag bit to write the subsequent data of the response to the buffer
            if ((int)c->sentlen == c->bufpos) {
                c->bufpos = 0;
                c->sentlen = 0;
            }
        } else {
            //Buffer has no data, get it from reply queue
            o = listNodeValue(listFirst(c->reply));
            objlen = sdslen(o);

            if (objlen == 0) {
                listDelNode(c->reply,listFirst(c->reply));
                continue;
            }
            //Write the data in the queue to the socket
            nwritten = write(fd, o + c->sentlen, objlen - c->sentlen);
            if (nwritten <= 0) break;
            c->sentlen += nwritten;
            totwritten += nwritten;
            //Remove the node once the whole object has been sent
            if (c->sentlen == objlen) {
                listDelNode(c->reply,listFirst(c->reply));
                c->sentlen = 0;
                c->reply_bytes -= objlen;
                if (listLength(c->reply) == 0)
                    serverAssert(c->reply_bytes == 0);
            }
        }
        //Stop once the bytes written exceed NET_MAX_WRITES_PER_EVENT, unless memory is over maxmemory or the client is a slave
        if (totwritten > NET_MAX_WRITES_PER_EVENT &&
            (server.maxmemory == 0 ||
             zmalloc_used_memory() < server.maxmemory) &&
            !(c->flags & CLIENT_SLAVE)) break;
    }
    server.stat_net_output_bytes += totwritten;
    if (nwritten == -1) {
        if (errno == EAGAIN) {
            nwritten = 0;
        } else {
            serverLog(LL_VERBOSE,
                "Error writing to client: %s", strerror(errno));
            freeClient(c);
            return C_ERR;
        }
    }
    if (!clientHasPendingReplies(c)) {
        c->sentlen = 0;
        //If the content has all been output, delete the event handler
        if (handler_installed) aeDeleteFileEvent(server.el,c->fd,AE_WRITABLE);
        //If the client is flagged CLIENT_CLOSE_AFTER_REPLY, close it once all data has been sent
        if (c->flags & CLIENT_CLOSE_AFTER_REPLY) {
            freeClient(c);
            return C_ERR;
        }
    }
    return C_OK;
}
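
The core of writeToClient is a resumable write: remember how much has been sent, and pick up from there next time. The sketch below shows that idea with the POSIX write call on a single buffer. sketch_write_pending is a hypothetical helper; unlike the function above it retries on EINTR and has no per-event byte limit.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes as much of buf[*sentlen .. len) as the descriptor accepts and
 * advances *sentlen by the number of bytes written.
 * Returns 1 when everything was sent, 0 when data remains and the caller
 * should try again later, -1 on a real error. */
int sketch_write_pending(int fd, const char *buf, size_t len, size_t *sentlen) {
    while (*sentlen < len) {
        ssize_t n = write(fd, buf + *sentlen, len - *sentlen);
        if (n > 0) {
            *sentlen += (size_t)n;          /* partial writes are normal */
            continue;
        }
        if (n == 0) return 0;               /* nothing accepted this time */
        if (errno == EINTR) continue;       /* interrupted: retry (sketch-only choice) */
        if (errno == EAGAIN) return 0;      /* socket full: resume on a later event */
        return -1;                          /* any other error is fatal for the client */
    }
    return 1;
}
```

A return value of 0 corresponds to the situation in which handleClientsWithPendingWrites registers sendReplyToClient: the offset is kept, and the next call continues from it.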
