Read the source code with Dabin – redis 9 – object coding list

Time:2020-12-6

Redis uses three list structures, ziplist, skiplist and quicklist, to implement the related objects. As the names suggest, ziplist focuses on saving space, skiplist focuses on search efficiency, and quicklist is a compromise between space and time.

In a typical doubly linked list, a node represents each value in the list. Each node has three properties: a pointer to the previous node, a pointer to the next node, and a pointer to the string held by the node. Each string value is in turn stored in three parts: an integer representing the length, an integer representing the number of free bytes left, and the string itself followed by a null character.

As you can see, each item in the linked list occupies a separate piece of memory, and the items are connected by address pointers (or references). This causes a lot of memory fragmentation, and the pointers themselves take up extra memory. This is the memory waste problem of the ordinary linked list.

In addition, looking up an arbitrary element in an ordinary linked list takes O(n) time, which is not acceptable for redis, which cares about efficiency. This is the low search efficiency problem of the ordinary linked list.

To address these two problems, redis designed the ziplist (compressed list), the skiplist (skip list) and the quicklist (quick linked list) as optimizations.

1 ziplist

The problem ziplist has to solve is memory waste. In other words, its design goal is to save space and improve storage efficiency.

To that end, redis designed it as a specially encoded doubly linked list. The entries are stored one after another at adjacent addresses, and a ziplist as a whole occupies one large block of memory. It is a list, but it is not a linked list.

In addition, to save memory in the details, ziplist adopts a variable-length encoding for stored values: roughly, large integers are stored with more bytes and small integers with fewer bytes.
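The idea of spending fewer bytes on smaller integers can be sketched as follows. This is only an illustration: the function names are hypothetical, and the thresholds (values 0 to 12 packed into the encoding byte, then 8, 16, 24, 32 and 64-bit payloads) are an assumption modelled on the integer encodings of ziplist.c rather than a copy of its code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of payload bytes a ziplist-style encoder
 * would spend on an integer. The thresholds are assumptions modelled on
 * the integer encodings described in ziplist.c. */
static unsigned int int_payload_bytes(int64_t v) {
    if (v >= 0 && v <= 12) return 0;                 /* fits inside the encoding byte */
    if (v >= INT8_MIN && v <= INT8_MAX) return 1;    /* 8-bit signed payload */
    if (v >= INT16_MIN && v <= INT16_MAX) return 2;  /* 16-bit signed payload */
    if (v >= -8388608 && v <= 8388607) return 3;     /* 24-bit signed payload */
    if (v >= INT32_MIN && v <= INT32_MAX) return 4;  /* 32-bit signed payload */
    return 8;                                        /* 64-bit signed payload */
}

/* Usage: small values cost nothing beyond the one-byte encoding field. */
static void int_payload_bytes_example(void) {
    assert(int_payload_bytes(7) == 0);
    assert(int_payload_bytes(1000) == 2);
}
```

A fixed-width design would spend 8 bytes on every integer; here only values outside the 32-bit range pay that price.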

Because of this compact storage, ziplist relies on many bit-level operations, which makes the code harder to read. That does not matter much here: one of the goals of this section is to understand which optimizations let ziplist save more space than an ordinary linked list.

Let’s take a look at the structure of compressed lists.

1.1 structure of compressed list

A compressed list can contain any number of entries, and each node can hold an array of bytes or an integer value.

Figure 1-1 shows the components of the compressed list

The related fields are described as follows:

  • zlbytes: 4 bytes, the total number of bytes occupied by the ziplist (including the 4 bytes of zlbytes itself).
  • zltail: 4 bytes, the offset in bytes of the last entry from the starting address of the list. With this field, the program can determine the address of the tail node without traversing the list.
  • zllen: 2 bytes, the number of nodes in the ziplist. Since this field is only 16 bits wide, the largest value it can hold is 2^16-1 (65535). Once the number of nodes reaches this value, the field can no longer be trusted and the entire compressed list must be traversed to obtain the actual number of nodes.
  • entry: a node of the ziplist. Its length varies with what it stores. Each entry has its own internal structure, which is explained in detail later.
  • zlend: 1 byte, the end marker of the ziplist, with a fixed value of 255 (0xFF).

Figure 1-2 shows a compressed list of five nodes:

  • The value of the zlbytes property of the list is 0xd2 (decimal 210), indicating that the total length of the compressed list is 210 bytes.
  • The value of the zltail property of the list is 0xb3 (decimal 179), indicating that the tail node is 179 bytes away from the starting address of the list. If the starting address of the list is p, then p + 179 is the address of entry5.
  • The value of the list zllen property is 0x5 (decimal 5), indicating that the compressed list contains five nodes.
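To make the header layout concrete, here is a minimal sketch that writes and reads the three header fields with the values from Figure 1-2. The helper names are hypothetical, the entry area is left empty, and the little-endian byte order of the header integers is stated as an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define ZL_HEADER_SIZE 10u  /* zlbytes (4) + zltail (4) + zllen (2) */
#define ZL_END 0xFF         /* fixed end marker */

/* Read a 32-bit field byte by byte so the code works on any host.
 * Assumption: header integers are stored little-endian. */
static uint32_t read_u32le(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t zl_bytes(const unsigned char *zl) { return read_u32le(zl); }
static uint32_t zl_tail_offset(const unsigned char *zl) { return read_u32le(zl + 4); }
static uint16_t zl_len(const unsigned char *zl) {
    return (uint16_t)(zl[8] | (zl[9] << 8));
}

/* Address of the tail entry: start address plus zltail. */
static const unsigned char *zl_tail_entry(const unsigned char *zl) {
    return zl + zl_tail_offset(zl);
}

/* Hypothetical helper: writes only the header and the end marker. */
static void zl_write_header(unsigned char *zl, uint32_t bytes,
                            uint32_t tail, uint16_t len) {
    assert(bytes >= ZL_HEADER_SIZE + 1u && tail < bytes);
    for (int i = 0; i < 4; i++) {
        zl[i] = (unsigned char)(bytes >> (8 * i));
        zl[4 + i] = (unsigned char)(tail >> (8 * i));
    }
    zl[8] = (unsigned char)(len & 0xFF);
    zl[9] = (unsigned char)(len >> 8);
    zl[bytes - 1] = ZL_END;
}
```

Calling zl_write_header(buf, 210, 179, 5) on a 210-byte buffer and then zl_tail_entry(buf) yields buf + 179, the address of entry5 in the figure, without visiting any of the other entries.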

1.2 structure of list nodes

The source code of the node structure is as follows (ziplist.c):

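typedef struct zlentry {
    unsigned int prevrawlensize, prevrawlen; /* bytes used to encode the previous entry's length; that length */
    unsigned int lensize, len;               /* bytes used to encode this entry's type/length; payload length */
    unsigned int headersize;                 /* prevrawlensize + lensize */
    unsigned char encoding;                  /* how the payload is encoded (string or integer type) */
    unsigned char *p;                        /* pointer to the start of the entry */
} zlentry;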

Figure 1-3 shows the structure of the compressed list node.

  • prevrawlen: the total number of bytes occupied by the previous node (prevrawlensize is the number of bytes used to store that length). This field is what allows a ziplist to be traversed from back to front.
  • encoding: records the type of data saved in the content of the node.
  • lensize and len: len records the length of the data saved by the node, and lensize records the number of bytes used to encode that length.
  • headersize: records the size of the node header, that is, prevrawlensize + lensize.
  • p: a pointer to the start of the node in memory.
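Back-to-front traversal relies on the previous-length field alone. The sketch below decodes that field and steps to the previous entry. The function names are hypothetical, and the layout (one byte for lengths below 254, otherwise a 0xFE marker followed by a 4-byte little-endian length) is an assumption based on the format described in ziplist.c.

```c
#include <assert.h>
#include <stddef.h>

#define PREVLEN_BIG_MARKER 0xFE /* assumed marker for the 5-byte form */

/* Decode the "previous entry length" field at the start of an entry.
 * Assumed layout: one byte if the length is below 254, otherwise the
 * marker 0xFE followed by a 4-byte little-endian length. */
static void decode_prevlen(const unsigned char *p,
                           unsigned int *prevrawlensize,
                           unsigned int *prevrawlen) {
    assert(p[0] != 0xFF); /* 0xFF is the list end marker, not an entry */
    if (p[0] < PREVLEN_BIG_MARKER) {
        *prevrawlensize = 1;
        *prevrawlen = p[0];
    } else {
        *prevrawlensize = 5;
        *prevrawlen = (unsigned int)p[1] | ((unsigned int)p[2] << 8) |
                      ((unsigned int)p[3] << 16) | ((unsigned int)p[4] << 24);
    }
}

/* Step backwards: the previous entry starts prevrawlen bytes earlier.
 * A length of 0 means p is already the first entry. */
static const unsigned char *prev_entry(const unsigned char *p) {
    unsigned int size, len;
    decode_prevlen(p, &size, &len);
    return len == 0 ? NULL : p - len;
}
```

No prev pointer is stored anywhere: a short previous entry costs a single byte of bookkeeping, where a pointer in an ordinary linked list would cost a full machine word.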

1.3 how compressed lists save memory

Recall the ordinary linked list from the beginning of the article. In an ordinary linked list, the string in each node consists of:

  • An integer representing the length
  • An integer representing the number of free bytes left
  • String itself
  • Null character at the end.

Take figure 1-4 as an example

Figure 1-4 shows three nodes of an ordinary linked list. Each of the three nodes actually stores only 1 byte of content. Counting every pointer and integer as 1 byte for simplicity, each node needs:

  • 3 pointers – 3 bytes
  • 2 integers – 2 bytes
  • Content string – 1 byte
  • Null character at the end – 1 byte

That is 7 bytes per node, so storing three bytes of data takes 21 bytes in total, 18 of which are overhead. On a real machine pointers and integers are wider than 1 byte, so the ratio is even worse. As you can see, this storage efficiency is very low.
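The byte counts above treat every field as a single byte. As a rough check with real field widths, the following sketch computes the bookkeeping cost of one node from sizeof. The struct and function names are hypothetical stand-ins for the node and string layout described in this article, not redis types, and allocator overhead is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node of an ordinary doubly linked list. */
struct plain_node {
    struct plain_node *prev;
    struct plain_node *next;
    char *value;              /* points to the string stored separately */
};

/* Hypothetical string header: length, free bytes, then the bytes. */
struct plain_str {
    unsigned int len;
    unsigned int free;
    char buf[];               /* content followed by '\0' */
};

/* Bytes spent on bookkeeping for one node holding content_len bytes. */
static size_t plain_overhead(size_t content_len) {
    size_t total = sizeof(struct plain_node) + sizeof(struct plain_str)
                   + content_len + 1;   /* +1 for the trailing '\0' */
    assert(total > content_len);
    return total - content_len;
}
```

On a typical 64-bit build, plain_overhead(1) comes to about 33 bytes (three 8-byte pointers, two 4-byte integers and the null character) for a single byte of content.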

On the other hand, the nodes of an ordinary linked list are linked through forward and backward pointers and their addresses are not contiguous, so memory fragmentation occurs easily when there are many nodes, which reduces memory utilization.

Finally, the smallest storage unit an ordinary linked list works with is the byte. When storing small integers or short strings, many bits in each byte go unused, which is a waste. Take the three nodes above: the integer that records the number of free bytes left only needs 1 bit for its value, but a whole byte is used to represent it, so the remaining 7 bits of that byte are wasted.

So, how does redis use ziplist to improve on the ordinary linked list? In two ways:

On the one hand, ziplist uses a single block of contiguous memory, which avoids memory fragmentation and improves memory utilization.

On the other hand, ziplist reduces the granularity of its storage unit from the byte to the bit, which effectively solves the problem of wasting bits within a byte when storing small data.
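One way to picture bit-level granularity is to pack a very small integer into the spare bits of the encoding byte itself. The sketch below assumes the "immediate" layout as understood from ziplist.c: a high nibble of 1111 marks a small integer and the low nibble holds value + 1. The macro and function names are hypothetical.

```c
#include <assert.h>

/* Assumed layout of the "immediate" integer encoding: the high nibble
 * 1111 tags the entry as a small integer, and the low nibble stores
 * value + 1 (0001 to 1101), so values 0..12 need no payload bytes. */
#define IMM_MIN 0xF1
#define IMM_MAX 0xFD

/* Pack a value in the range 0..12 into a single encoding byte. */
static unsigned char imm_encode(unsigned int value) {
    assert(value <= 12);
    return (unsigned char)(IMM_MIN + value);
}

/* Does this encoding byte carry an immediate integer? */
static int imm_is_immediate(unsigned char encoding) {
    return encoding >= IMM_MIN && encoding <= IMM_MAX;
}

/* Recover the value from the low four bits. */
static unsigned int imm_decode(unsigned char encoding) {
    assert(imm_is_immediate(encoding));
    return (unsigned int)(encoding & 0x0F) - 1;
}
```

With this scheme the type tag and the value share one byte, so a value such as 7 needs no separate payload at all.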

2 skiplist

Skiplist is an ordered data structure. Each node maintains multiple pointers to other nodes, which makes it possible to reach nodes quickly.

Skiplist is essentially a search structure: given a value, it quickly finds the position of that value.

Solutions to the search problem generally fall into two categories: balanced trees and hash tables. Interestingly, skiplist belongs to neither. In most cases, however, its efficiency is comparable to that of a balanced tree, and it is simpler to implement, so many programs use a skip list in place of a balanced tree.

This section does not introduce the definition and principle of the skip list; interested readers can refer to here.

Now that we know what a skip list is and what it does, let's take a look at how redis implements one.

The source code of the skip list can be found in server.h:

typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;

typedef struct zskiplistNode {
    robj *obj;
    double score;
    struct zskiplistNode *backward;
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned int span;
    } level[];
} zskiplistNode;

The skiplist in redis is not much different from an ordinary skiplist, but it differs in the following details:

  • Scores can repeat. That is, several nodes of a redis skiplist may have the same score, whereas an ordinary skiplist does not allow duplicate keys.
  • The first-level list is a doubly linked list rather than a singly linked one (see the backward pointer), so the elements in a range can also be obtained in reverse order.
  • When comparing nodes, redis compares the data itself in addition to the score. In the redis skiplist, the data content, not the score, is the unique identifier of an element. When several elements have the same score, they are ordered lexicographically by their data content.
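The ordering rule in the last point can be sketched as a comparison function. The name skip_cmp is hypothetical, and strcmp on plain C strings stands in for whatever string comparison redis uses internally, so treat this as an illustration of the rule rather than the actual implementation.

```c
#include <assert.h>
#include <string.h>

/* Ordering used to place a node: score first, then the member bytes.
 * strcmp on C strings is a stand-in and an assumption here; it is not
 * the comparison routine redis itself calls. */
static int skip_cmp(double score_a, const char *member_a,
                    double score_b, const char *member_b) {
    assert(member_a != NULL && member_b != NULL);
    if (score_a < score_b) return -1;
    if (score_a > score_b) return 1;
    return strcmp(member_a, member_b); /* tie on score: lexicographic */
}
```

Two elements with equal scores therefore still have a well-defined order, which is what lets duplicate scores coexist in one sorted structure.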

3 quicklist

For quicklist, quicklist.c contains the following description:

A doubly linked list of ziplists

In other words, it is a doubly linked list in which every node is a ziplist.

The related structures can be found in quicklist.h, as follows:

/* quicklistNode is a 32 byte struct describing a ziplist for a quicklist.
 * We use bit fields keep the quicklistNode at 32 bytes.
 * count: 16 bits, max 65536 (max zl bytes is 65k, so max count actually < 32k).
 * encoding: 2 bits, RAW=1, LZF=2.
 * container: 2 bits, NONE=1, ZIPLIST=2.
 * recompress: 1 bit, bool, true if node is temporarry decompressed for usage.
 * attempted_compress: 1 bit, boolean, used for verifying during testing.
 * extra: 12 bits, free for future use; pads out the remainder of 32 bits */
typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;
    unsigned int sz;             /* ziplist size in bytes */
    unsigned int count : 16;     /* count of items in ziplist */
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* NONE==1 or ZIPLIST==2 */
    unsigned int recompress : 1; /* was this node previous compressed? */
    unsigned int attempted_compress : 1; /* node can't compress; too small */
    unsigned int extra : 10; /* more bits to steal for future usage */
} quicklistNode;

/* quicklistLZF is a 4+N byte struct holding 'sz' followed by 'compressed'.
 * 'sz' is byte length of 'compressed' field.
 * 'compressed' is LZF data with total (compressed) length 'sz'
 * NOTE: uncompressed length is stored in quicklistNode->sz.
 * When quicklistNode->zl is compressed, node->zl points to a quicklistLZF */
typedef struct quicklistLZF {
    unsigned int sz; /* LZF size in bytes*/
    char compressed[];
} quicklistLZF;

/* quicklist is a 32 byte struct (on 64-bit systems) describing a quicklist.
 * 'count' is the number of total entries.
 * 'len' is the number of quicklist nodes.
 * 'compress' is: -1 if compression disabled, otherwise it's the number
 *                of quicklistNodes to leave uncompressed at ends of quicklist.
 * 'fill' is the user-requested (or default) fill factor. */
typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        /* total count of all entries in all ziplists */
    unsigned int len;           /* number of quicklistNodes */
    int fill : 16;              /* fill factor for individual nodes */
    unsigned int compress : 16; /* depth of end nodes not to compress;0=off */
} quicklist;

As mentioned in the introduction to linked lists above, a linked list is made up of multiple nodes. For quicklist, each node is a ziplist. The design of quicklist is what we described at the beginning of the article: a compromise between space and time.

Compared with the ordinary linked list, ziplist mainly optimizes two points: it reduces memory overhead and it reduces memory fragmentation. But everything has two sides. Ziplist solves the memory fragmentation problem of the ordinary linked list by using contiguous memory, and in doing so it introduces a new problem: it is hard to modify.

Since a ziplist is one block of contiguous memory, every insertion or deletion triggers a memory reallocation. When the ziplist is very large, each reallocation may copy a large amount of data, which hurts performance.

Therefore quicklist was created, combining the advantages of the doubly linked list and the ziplist.
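To illustrate how the two structures combine, the sketch below walks a list of nodes and uses each node's entry count to skip a whole packed block in one step. The toy_qnode type and toy_locate function are hypothetical and model only the links and the count field. The packed block itself is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a quicklist node: only the links and the entry
 * count are modelled; the packed block itself is omitted. */
struct toy_qnode {
    struct toy_qnode *prev, *next;
    unsigned int count;          /* entries stored in this node's block */
};

/* Find the node holding the entry at a list-wide index and report the
 * entry's position inside that node. Returns NULL if out of range. */
static struct toy_qnode *toy_locate(struct toy_qnode *head,
                                    unsigned long index,
                                    unsigned long *offset) {
    assert(offset != NULL);
    for (struct toy_qnode *n = head; n != NULL; n = n->next) {
        if (index < n->count) {
            *offset = index;
            return n;
        }
        index -= n->count;   /* skip the whole block in one step */
    }
    return NULL;
}
```

A modification then only has to reallocate the one small block that holds the target entry, while the pointer cost is paid once per block instead of once per entry.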

The basic idea of quicklist is to give the ziplist of each node an appropriate size, so that data copying does not degrade performance. This is again a matter of finding a balance point. First, consider storage efficiency:

  • The shorter the ziplist on each quicklist node, the more memory fragmentation there is. More fragmentation is likely to leave many small pieces of memory that cannot be used, which reduces storage efficiency.
  • The longer the ziplist on each quicklist node, the harder it is to allocate one large block of contiguous memory for it. There may be plenty of small free pieces of memory, yet no free block large enough for the ziplist. This also reduces storage efficiency.

It follows that the ziplist on a quicklist node needs to be kept at a reasonable length. What is reasonable depends on the application scenario. For this reason, redis provides a configuration parameter that users can tune to their situation:

list-max-ziplist-size -2

This parameter can be positive or negative.

When it is positive, the length of the ziplist on each quicklist node is limited by the number of data items. For example, a value of 2 means that the ziplist on each quicklist node contains at most two data items.

When it is negative, the length of the ziplist on each quicklist node is limited by the number of bytes it occupies. In this case the value must be in the range [-5, -1], and each value has a different meaning:

  • -1: the size of the ziplist on each quicklist node cannot exceed 4KB;
  • -2: the size of the ziplist on each quicklist node cannot exceed 8KB (the default value);
  • -3: the size of the ziplist on each quicklist node cannot exceed 16KB;
  • -4: the size of the ziplist on each quicklist node cannot exceed 32KB;
  • -5: the size of the ziplist on each quicklist node cannot exceed 64KB.
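The two interpretations of the parameter can be sketched as a pair of small functions. The names fill_byte_limit and node_accepts are hypothetical, and the sketch follows only the table above. Any extra safety limits redis applies, and the handling of a value of 0, are not modelled and are treated here as assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Byte limit implied by a negative list-max-ziplist-size value,
 * following the table above: -1 -> 4KB, -2 -> 8KB, ... -5 -> 64KB. */
static size_t fill_byte_limit(int fill) {
    assert(fill >= -5 && fill <= -1);
    return (size_t)4096 << (-fill - 1);
}

/* Hypothetical check: may a node that currently holds `count` entries
 * in `bytes` bytes accept one more entry of `entry_bytes` bytes?
 * Values outside the documented range are rejected (an assumption). */
static int node_accepts(int fill, size_t bytes, unsigned int count,
                        size_t entry_bytes) {
    if (fill > 0) return count + 1 <= (unsigned int)fill; /* limit by item count */
    if (fill == 0 || fill < -5) return 0;                 /* outside the range */
    return bytes + entry_bytes <= fill_byte_limit(fill);  /* limit by size */
}
```

With the default of -2, for example, a node holding 8000 bytes can still take a 192-byte entry but not a 193-byte one; the new entry would then go into a neighbouring or freshly created node.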

summary

  1. The ordinary linked list has two problems: low memory utilization and a tendency to fragment memory.
  2. Ziplist uses contiguous memory to reduce memory fragmentation and improve memory utilization.
  3. Skiplist achieves search efficiency comparable to that of a balanced tree with a relatively simple implementation.
  4. Quicklist combines the advantages of the ordinary linked list and the compressed list, and improves memory utilization as much as possible while maintaining performance.