Redis Design and Implementation 10: Sorted Sets (One of the Five Data Types)

Time:2021-4-13

The sorted set (which we will call zset) has two encodings: the compressed list (ziplist) and the skip list (skiplist).

Encoding 1: ziplist

When a zset is stored in a ziplist, each member and its score sit next to each other, and the elements are ordered by score from smallest to largest.

For example, let's create a zset:

redis> ZADD key 26.1 z 1 a 2 b
(integer) 3

This zset is laid out in the ziplist as follows: a, 1, b, 2, z, 26.1 (each member is followed by its score, in ascending score order).


Now let's read the source code of the zscore command to see how a zset is stored in a ziplist.

int zsetScore(robj *zobj, sds member, double *score) {
    // ...
    if (zobj->encoding == OBJ_ENCODING_ZIPLIST) {
        if (zzlFind(zobj->ptr, member, score) == NULL) return C_ERR;
    }
    // ...
    return C_OK;
}

unsigned char *zzlFind(unsigned char *zl, sds ele, double *score) {
    //eptr points to the member entry, sptr points to the score entry
    unsigned char *eptr = ziplistIndex(zl,0), *sptr;

    //Traverse the ziplist
    while (eptr != NULL) {
        //Member and score are stored next to each other, so the entry after the member is its score
        sptr = ziplistNext(zl,eptr);
        serverAssert(sptr != NULL);

        //Compare the current member with the member being looked up
        if (ziplistCompare(eptr,(unsigned char*)ele,sdslen(ele))) {
            //If they are equal, read the score
            if (score != NULL) *score = zzlGetScore(sptr);
            return eptr;
        }

        //Not equal: move on to the next member
        eptr = ziplistNext(zl,sptr);
    }
    return NULL;
}

//Read the score stored in the entry that sptr points to
double zzlGetScore(unsigned char *sptr) {
    unsigned char *vstr;
    unsigned int vlen;
    long long vlong;
    char buf[128];
    double score;

    serverAssert(sptr != NULL);
    //ziplistGet reads the value at sptr. Which output parameter is filled depends on the entry's encoding:
    //a string is returned through vstr (with its length in vlen), an integer through vlong.
    serverAssert(ziplistGet(sptr,&vstr,&vlen,&vlong));

    if (vstr) {
        //A string entry holds a floating-point score in text form
        memcpy(buf,vstr,vlen);
        buf[vlen] = '\0';
        //String to float
        score = strtod(buf,NULL);
    } else {
        //The integer type is assigned directly
        score = vlong;
    }
    return score;
}
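
To make the member/score adjacency concrete, here is a minimal sketch that models the ziplist as a flat array of strings. It is not the Redis ziplist format: the array flat_zset and the function flat_zset_score are hypothetical names invented for this illustration, and every entry is kept as text, whereas a real ziplist can also store integers directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ziplist: a flat array in which each member
 * is immediately followed by its score, ordered by ascending score. */
static const char *flat_zset[] = { "a", "1", "b", "2", "z", "26.1" };
static const size_t flat_zset_len = sizeof(flat_zset) / sizeof(flat_zset[0]);

/* Linear scan in the spirit of zzlFind: step two entries at a time,
 * compare the member, and parse the neighbouring entry as the score.
 * Returns 1 and fills *score on a hit, 0 on a miss. */
int flat_zset_score(const char *member, double *score) {
    assert(member != NULL);
    for (size_t i = 0; i + 1 < flat_zset_len; i += 2) {
        if (strcmp(flat_zset[i], member) == 0) {
            if (score != NULL) *score = strtod(flat_zset[i + 1], NULL);
            return 1;
        }
    }
    return 0;
}
```

Looking up "z" visits the pairs for a and b first, which is why this encoding only suits small sets: every lookup is a linear scan.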

Encoding 2: skiplist

Skip list implementation

The skiplist encoding is implemented on top of a skip list.

Here is the structure of the skip list (picture from Redis Design and Implementation)

  1. The leftmost part of the picture is the zskiplist structure. Its definition is as follows (server.h):
typedef struct zskiplist {
    //Head and tail pointers, pointing to the head node and the tail node
    struct zskiplistNode *header, *tail;
    //Number of nodes in the skip list (the head node is not counted; even an empty skip list has a head node)
    unsigned long length;
    //The largest number of levels among all nodes
    int level;
} zskiplist;
  2. The four nodes on the right of the picture are skip list nodes (zskiplistNode). Their definition is as follows (server.h):
typedef struct zskiplistNode {
    //Member
    sds ele;
    //Score
    double score;
    //Backward pointer to the previous node
    struct zskiplistNode *backward;
    //A node may have several levels, and each level may point to a different node
    struct zskiplistLevel {
        //Forward pointer to the next node on this level
        struct zskiplistNode *forward;
        //Span: the distance between this node and the node that forward points to
        unsigned long span;
    } level[];
} zskiplistNode;
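
Because level[] is a flexible array member, the number of levels is fixed when the node is allocated. The sketch below shows one way such a node could be allocated in a single block. It is a simplified illustration, not the Redis code: demo_node, demo_node_create and demo_node_free are hypothetical names, and a plain C string replaces sds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified node for illustration: a plain C string replaces sds. */
typedef struct demo_node {
    char *ele;
    double score;
    struct demo_node *backward;
    struct demo_level {
        struct demo_node *forward;
        unsigned long span;
    } level[];              /* flexible array member, sized at allocation */
} demo_node;

/* Allocate a node with `levels` levels in one block.
 * Returns NULL if memory cannot be allocated. */
demo_node *demo_node_create(int levels, double score, const char *ele) {
    assert(levels >= 1 && ele != NULL);
    demo_node *n = malloc(sizeof(*n) + (size_t)levels * sizeof(struct demo_level));
    if (n == NULL) return NULL;

    size_t len = strlen(ele) + 1;
    n->ele = malloc(len);
    if (n->ele == NULL) { free(n); return NULL; }
    memcpy(n->ele, ele, len);

    n->score = score;
    n->backward = NULL;
    for (int i = 0; i < levels; i++) {
        n->level[i].forward = NULL;
        n->level[i].span = 0;
    }
    return n;
}

/* Release the node and the copy of the member it owns. */
void demo_node_free(demo_node *n) {
    if (n == NULL) return;
    free(n->ele);
    free(n);
}
```

A node created with three levels pays for exactly three forward/span pairs, so tall nodes cost more memory than short ones.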

The most important part of the skip list is its levels. Why?

Suppose zset stored its elements in an ordered linked list. To find an element we would have to traverse from the head, so the time complexity is \(O(n)\), which is inefficient.
(Figure: linked list)

How can we improve this? We can add a layer of index on top of the list.
(Figure: linked list with an index layer)
Now the traversal is faster. For example, to find 6 we first walk the index layer, see that 6 lies between 5 and 7, and then drop down to the bottom list to find 6.
Some readers will notice that with a large amount of data the search is still slow.
True. How do we solve that? Add more index layers!
(Figure: linked list with several more index layers)
And with that, the linked list has become a skip list, and the upper layers are its indexes. The resulting skip list has an average search time complexity of \(O(\log n)\).
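
The layered search described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not the Redis implementation: the names sk_node, sk_build and sk_search are hypothetical, the list holds the integers 1 to 8, and node heights are assigned deterministically (every second node gets two levels, every fourth gets three), whereas Redis chooses heights randomly.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_LEVEL 3
#define SK_COUNT 8

typedef struct sk_node {
    int value;
    struct sk_node *forward[SK_MAX_LEVEL];  /* forward[0] is the bottom list */
} sk_node;

static sk_node sk_head;             /* sentinel head, holds no value */
static sk_node sk_nodes[SK_COUNT];  /* nodes for the values 1..8 */

/* Link the values 1..8 with deterministic heights (demo only). */
void sk_build(void) {
    sk_node *last[SK_MAX_LEVEL];
    for (int l = 0; l < SK_MAX_LEVEL; l++) {
        sk_head.forward[l] = NULL;
        last[l] = &sk_head;
    }
    for (int k = 1; k <= SK_COUNT; k++) {
        sk_node *n = &sk_nodes[k - 1];
        int height = 1;
        if (k % 2 == 0) height = 2;
        if (k % 4 == 0) height = 3;
        n->value = k;
        for (int l = 0; l < SK_MAX_LEVEL; l++) n->forward[l] = NULL;
        for (int l = 0; l < height; l++) {
            last[l]->forward[l] = n;
            last[l] = n;
        }
    }
}

/* Search from the top level down. *steps (if not NULL) receives the number
 * of forward moves made. Returns the node, or NULL if target is absent. */
sk_node *sk_search(int target, int *steps) {
    sk_node *x = &sk_head;
    int moved = 0;
    for (int l = SK_MAX_LEVEL - 1; l >= 0; l--) {
        while (x->forward[l] != NULL && x->forward[l]->value <= target) {
            x = x->forward[l];
            moved++;
        }
    }
    if (steps != NULL) *steps = moved;
    return (x != &sk_head && x->value == target) ? x : NULL;
}
```

In this layout, finding 7 takes three forward moves (to 4, then 6, then 7), compared with seven moves along a plain linked list.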


Let's look at the core of the zrange command to get a feel for how a skip list is traversed.

zskiplistNode* zslGetElementByRank(zskiplist *zsl, unsigned long rank) {
    zskiplistNode *x;
    unsigned long traversed = 0;
    int i;
    //Start from the head node
    x = zsl->header;
    //Walk the levels from highest to lowest
    for (i = zsl->level-1; i >= 0; i--) {
        //Keep moving forward on this level as long as the next hop does not overshoot rank
        while (x->level[i].forward && (traversed + x->level[i].span) <= rank)
        {
            //Add the span of this hop to the number of nodes passed so far
            traversed += x->level[i].span;
            //Move forward
            x = x->level[i].forward;
        }
        //If rank has been reached, return the node; otherwise drop down one level and continue, until the bottom level
        if (traversed == rank) {
            return x;
        }
    }
    return NULL;
}
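
To see how the spans drive this loop, the sketch below wires up by hand a tiny skip list for the example zset (a=1, b=2, z=26.1) and runs the same traversal. It is an illustration only: the rk_ names are hypothetical, every node has a fixed two-slot level array instead of a flexible one, and the guard for rank 0 is an addition made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define RK_LEVELS 2

typedef struct rk_node {
    const char *ele;
    double score;
    struct {
        struct rk_node *forward;
        unsigned long span;
    } level[RK_LEVELS];
} rk_node;

/* Level 0 links header -> a -> b -> z with span 1 each.
 * Level 1 links header -> b with span 2 (it skips over a). */
static rk_node rk_z = { "z", 26.1, { { NULL, 0 }, { NULL, 0 } } };
static rk_node rk_b = { "b", 2.0, { { &rk_z, 1 }, { NULL, 0 } } };
static rk_node rk_a = { "a", 1.0, { { &rk_b, 1 }, { NULL, 0 } } };
static rk_node rk_header = { NULL, 0.0, { { &rk_a, 1 }, { &rk_b, 2 } } };

/* Same loop shape as zslGetElementByRank; rank is 1-based. */
rk_node *rk_get_by_rank(unsigned long rank) {
    rk_node *x = &rk_header;
    unsigned long traversed = 0;

    if (rank == 0) return NULL;  /* guard added for this sketch */

    for (int i = RK_LEVELS - 1; i >= 0; i--) {
        while (x->level[i].forward != NULL &&
               traversed + x->level[i].span <= rank) {
            traversed += x->level[i].span;
            x = x->level[i].forward;
        }
        if (traversed == rank) return x;
    }
    return NULL;
}
```

For rank 3, the loop jumps from the header straight to b on level 1 (traversed becomes 2), then drops to level 0 and takes one more hop to z.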

The structure of zset

A skiplist-encoded zset is defined as follows:

typedef struct zset {
    dict *dict;
    zskiplist *zsl;
} zset;

The structure contains a dictionary and a skip list. Why do we need a dictionary when we already have a skip list?
If the zscore command used only the skip list, the lookup would cost \(O(\log n)\). With the dictionary, it drops to \(O(1)\).

Some readers will object that adding a dictionary wastes a lot of space.
An extra dictionary does take more space; trading space for time is a common practice. However, the dictionary and the skip list share the member objects they point to, so the members are not stored twice.

The picture below shows an example zset. For clarity, the string objects that the dictionary and the skip list point to are drawn separately, but they are actually shared. (picture from Redis Design and Implementation)
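
The idea of pairing an ordered structure with a hash index over the same members can be sketched as follows. This is a toy model, not the Redis zset: all mz_ names are hypothetical, a fixed array filled in score order stands in for the skip list, a small open-addressing table stands in for the dict, and duplicate members are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MZ_BUCKETS 8
#define MZ_CAP 8

typedef struct mz_entry {
    const char *member;   /* the index below stores this same pointer */
    double score;
} mz_entry;

typedef struct mz_slot {
    const char *member;   /* shared with the ordered entry, not copied */
    const double *score;  /* points at the score held by the ordered entry */
} mz_slot;

typedef struct mini_zset {
    mz_entry ordered[MZ_CAP];   /* stands in for the skip list */
    size_t count;
    mz_slot index[MZ_BUCKETS];  /* stands in for the dict */
} mini_zset;

static unsigned long mz_hash(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

void mz_init(mini_zset *z) {
    *z = (mini_zset){0};
}

/* Append a member. The caller is assumed to add in ascending score order.
 * At most MZ_CAP - 1 members are accepted so probing always finds a free slot. */
int mz_add(mini_zset *z, const char *member, double score) {
    if (z->count >= MZ_CAP - 1) return 0;
    mz_entry *e = &z->ordered[z->count++];
    e->member = member;
    e->score = score;

    size_t i = mz_hash(member) % MZ_BUCKETS;
    while (z->index[i].member != NULL) i = (i + 1) % MZ_BUCKETS;
    z->index[i].member = e->member;
    z->index[i].score = &e->score;
    return 1;
}

/* Score lookup through the hash index; no ordered traversal is needed. */
int mz_score(const mini_zset *z, const char *member, double *score) {
    size_t i = mz_hash(member) % MZ_BUCKETS;
    while (z->index[i].member != NULL) {
        if (strcmp(z->index[i].member, member) == 0) {
            *score = *z->index[i].score;
            return 1;
        }
        i = (i + 1) % MZ_BUCKETS;
    }
    return 0;
}
```

The index stores only pointers into the ordered entries, which mirrors the point made above: the second structure costs extra bookkeeping, but the members themselves are not duplicated.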

Source code analysis

Let's see how zscore is implemented under the skiplist encoding.

int zsetScore(robj *zobj, sds member, double *score) {
    //The ziplist branch is omitted here
    // if ...
    else if (zobj->encoding == OBJ_ENCODING_SKIPLIST) {
        zset *zs = zobj->ptr;
        //Look the member up directly in the dict, time complexity O(1)
        dictEntry *de = dictFind(zs->dict, member);
        if (de == NULL) return C_ERR;
        *score = *(double*)dictGetVal(de);
    }
    
    // ...
    return C_OK;
}

Encoding conversion

A sorted set object uses the ziplist encoding when it satisfies both of the following conditions:

  • The number of elements in the sorted set is less than 128 (configurable via zset-max-ziplist-entries);
  • The length of every member in the sorted set is less than 64 bytes (configurable via zset-max-ziplist-value);

A sorted set object that fails either of these conditions uses the skiplist encoding.
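
The two conditions above can be written as a small decision function. This is a sketch of the rule as stated in this article, not the Redis source: the DEMO_ constants and demo_pick_encoding are hypothetical names, the limits are hard-coded to the default values quoted above, and the strict "less than" comparison follows the wording here, so the exact boundary behaviour in Redis may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Defaults quoted in the text; in Redis they come from the
 * zset-max-ziplist-entries and zset-max-ziplist-value settings. */
#define DEMO_MAX_ZIPLIST_ENTRIES 128
#define DEMO_MAX_ZIPLIST_VALUE 64

typedef enum { DEMO_ENC_ZIPLIST, DEMO_ENC_SKIPLIST } demo_encoding;

/* Pick an encoding for a set of `count` members: ziplist only if the set
 * is small AND every member is short; skiplist otherwise. */
demo_encoding demo_pick_encoding(const char *const *members, size_t count) {
    if (count >= DEMO_MAX_ZIPLIST_ENTRIES) return DEMO_ENC_SKIPLIST;
    for (size_t i = 0; i < count; i++) {
        if (strlen(members[i]) >= DEMO_MAX_ZIPLIST_VALUE) return DEMO_ENC_SKIPLIST;
    }
    return DEMO_ENC_ZIPLIST;
}
```

A single long member is enough to push the whole set to the skiplist encoding, even if the set has only a handful of elements.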
