Redis design and implementation 4: Dictionary Dict

Time:2021-4-19

In Redis, the dictionary is a foundational structure. The database keyspace, key expiration times and the hash type all use the dictionary as their underlying structure.

Structure of dictionary

Hashtable

The Redis dictionary is implemented as a hash table. The hash table is defined in dict.h/dictht as follows:

typedef struct dictht {
    //Hash table array, commonly known as the hash bucket
    dictEntry **table;
    //The length of the hash table
    unsigned long size;
    //Size mask of the hash table, used to compute the index value so that it stays in bounds. Always size - 1
    // h = dictHashKey(ht, he->key) & n.sizemask;
    unsigned long sizemask;
    //The number of nodes that the hash table has used
    unsigned long used;
} dictht;
  • table is the hash table array. Each element points to a node of type dict.h/dictEntry, and each dictEntry holds one key-value pair.
  • The size attribute records the length of the table array allocated from the system. Not every slot is necessarily in use; some space is reserved.
  • The sizemask attribute is used to compute index value = hash value & sizemask. This index determines which slot of table a key-value pair goes into. Its value is always size - 1. I don't actually understand why size - 1 isn't computed directly when the index is needed; if you know, please explain.
  • The used attribute records the number of nodes (key-value pairs) currently stored in the hash table. Because colliding nodes are chained in the same slot, used is not a count of occupied slots and can exceed size.
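As a minimal sketch of the index computation described above: when size is a power of two, masking with size - 1 keeps exactly the low bits of the hash, which gives the same result as hash % size. The helper name bucket_index is hypothetical and is not part of dict.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of Redis: map a hash value to a slot index.
 * Assumes size is a nonzero power of two, so size - 1 has all low bits set
 * and the result always lies in [0, size). */
static unsigned long bucket_index(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

For example, with size 8 the mask is binary 111, so hash value 13 (binary 1101) lands in slot 5.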

The following figure shows an empty hash table of size 4 that holds no key-value pairs.
Figure: an empty hash table

Hash node

The elements of the table array in the hash table dictht are hash nodes of type dictEntry. Each dictEntry holds one key-value pair.

typedef struct dictEntry {
    //Key
    void *key;
    //Value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    //Next hash node, used to chain nodes into a linked list when a hash collision occurs
    struct dictEntry *next;
} dictEntry;

The next pointer is used when a hash collision occurs: nodes that land in the same slot are linked into a list through it. Collisions are covered below.

Dictionaries

The Redis dictionary is defined in dict.h/dict:

typedef struct dict {
    //Type-specific functions (hashing, key comparison, copying, destruction)
    dictType *type;
    //Private data passed to the type-specific functions
    void *privdata;
    //Two hash tables; ht[1] is only used while rehashing (expanding or shrinking)
    dictht ht[2];
    //Rehash progress index; -1 when no rehash is in progress
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    //Number of iterators currently running
    unsigned long iterators; /* number of iterators currently running */
} dict;

//dictType is less a "type" than a table of functions (hashing, copying, comparing, destroying) that defines how one kind of dictionary treats its keys and values
typedef struct dictType {
    //Hash method, according to the key to calculate the hash value
    uint64_t (*hashFunction)(const void *key);
    //Copy key
    void *(*keyDup)(void *privdata, const void *key);
    //Copy value
    void *(*valDup)(void *privdata, const void *obj);
    //Key comparison
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    //Destroy key
    void (*keyDestructor)(void *privdata, void *key);
    //Destroy value
    void (*valDestructor)(void *privdata, void *obj);
} dictType;

The type attribute (a dictType) represents the dictionary type. In practice, a dictionary type is a set of functions for operating on key-value pairs.
privdata holds optional arguments that are passed to those dictType functions.
When creating a dictionary, you can pass in the dictType and privdata as needed.
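To illustrate how such a function table might be filled in, here is a reduced sketch for C-string keys. The names miniType, cstr_hash, cstr_compare and keys_equal are hypothetical, only two of the six callbacks are modelled, and the djb2 hash is a stand-in chosen for brevity; it is not the hash function Redis uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced type descriptor: hashing and key comparison only. */
typedef struct miniType {
    uint64_t (*hashFunction)(const void *key);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
} miniType;

/* djb2 string hash, used here purely as an illustrative stand-in. */
static uint64_t cstr_hash(const void *key) {
    const unsigned char *s = key;
    uint64_t h = 5381;
    while (*s) h = h * 33 + *s++;
    return h;
}

/* Returns nonzero when the two C strings are equal; privdata is unused. */
static int cstr_compare(void *privdata, const void *key1, const void *key2) {
    (void)privdata;
    return strcmp(key1, key2) == 0;
}

static miniType cstr_type = { cstr_hash, cstr_compare };

/* Pointer equality is checked first as a shortcut, then the type's
 * comparison callback decides. */
static int keys_equal(const miniType *type, void *privdata,
                      const void *a, const void *b) {
    assert(type != NULL && type->keyCompare != NULL);
    return a == b || type->keyCompare(privdata, a, b);
}
```

A dictionary holding a pointer to cstr_type never needs to know that its keys are strings; swapping in a different table changes how keys are hashed and compared without touching the dictionary code.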

dict.c

//Create a dictionary; type and privdata can be passed in
dict *dictCreate(dictType *type, void *privDataPtr) {
    dict *d = zmalloc(sizeof(*d));
    _dictInit(d,type,privDataPtr);
    return d;
}

//Initialize dictionary
int _dictInit(dict *d, dictType *type, void *privDataPtr) {
    _dictReset(&d->ht[0]);
    _dictReset(&d->ht[1]);
    d->type = type;
    d->privdata = privDataPtr;
    d->rehashidx = -1;
    d->iterators = 0;
    return DICT_OK;
}

The figure below shows a more complete dict structure in its ordinary state (no rehash in progress, no running iterators).
Figure: dict structure

Hash algorithm

When a new key-value pair is added to the dictionary, the key is hashed first to get a hash value, and the index value is then computed from the hash value and the hash table's sizemask.

//Use the dictionary's hash function to compute the hash value
hash = d->type->hashFunction(key);
//Use sizemask and hash to calculate the index value
idx = hash & d->ht[table].sizemask;
//The hash node is located by the index value
he = d->ht[table].table[idx];

Hash Collisions

A hash collision occurs when several different keys produce the same index value.

Redis resolves hash collisions by separate chaining. Every hash node has a next pointer. When the slot at the computed index already holds nodes, the new node is linked in at the head of that slot, in front of the existing nodes, so the slot forms a linked list.

The figure below shows the structure when {k1, v1} and {k2, v2} collide.
Suppose k1 and k2 both get index value 3. When k2 is added, table[3] already holds dictEntry{k1,v1}, so k2 is placed at the head: dictEntry{k2,v2}.next = dictEntry{k1,v1}, and table[3] points to dictEntry{k2,v2}.
Figure: chaining under a hash collision
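The collision case can be sketched with a small stand-alone list, mirroring the entry->next = ht->table[index] assignment that appears in dictAddRaw later in this article. The node type and the bucket_push and chain_length helpers are hypothetical names for illustration, and keys are plain string pointers rather than Redis objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    const char *key;
    const char *val;
    struct node *next;
} node;

/* Link a new node in at the head of slot idx.
 * Returns the new node, or NULL if allocation fails. */
static node *bucket_push(node **table, unsigned long idx,
                         const char *key, const char *val) {
    assert(table != NULL);
    node *n = malloc(sizeof(*n));
    if (n == NULL) return NULL;
    n->key = key;
    n->val = val;
    n->next = table[idx]; /* the old head becomes the second node */
    table[idx] = n;       /* the new node becomes the head */
    return n;
}

/* Count the nodes chained in one slot. */
static unsigned long chain_length(const node *head) {
    unsigned long len = 0;
    for (; head != NULL; head = head->next) len++;
    return len;
}
```

Inserting at the head takes constant time regardless of chain length, which is presumably why new nodes are not appended at the tail.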

rehash

As operations accumulate, the number of key-value pairs in the hash table grows and shrinks. A table that is too large wastes space, and one that is too small causes more hash collisions and degrades performance, so the hash table is expanded or shrunk to keep its size in a reasonable range.
Redis performs the rehash using ht[0] and ht[1]. The steps are as follows:

  1. Allocate space for ht[1]. The size depends on the operation:
    • Expansion: the first \(2^n\) greater than or equal to ht[0].used * 2. For example, if ht[0].used = 3, the target is 6 and the size allocated is \(2^3=8\).
    • Shrinking: the first \(2^n\) greater than or equal to ht[0].used. For example, if ht[0].used = 3, the size allocated is \(2^2=4\).
  2. Migrate the key-value pairs in ht[0] to ht[1], recomputing each index value. Because ht[1] has a different size, the nodes of one chain in ht[0] may be spread over different slots.
  3. Once all key-value pairs in ht[0] have been migrated, free ht[0], set ht[0] = ht[1], and reset ht[1] to an empty table in preparation for the next rehash.
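The size rule in step 1 comes down to finding the first power of two at or above a target. A minimal sketch follows, assuming a minimum table size of 4; the names next_power and INITIAL_SIZE are illustrative and are not copied from dict.c.

```c
#include <assert.h>
#include <limits.h>

#define INITIAL_SIZE 4UL /* assumed minimum table size */

/* Return the first power of two that is >= target, never below
 * INITIAL_SIZE. The target is assumed small enough that doubling
 * cannot overflow. */
static unsigned long next_power(unsigned long target) {
    assert(target <= ULONG_MAX / 2);
    unsigned long size = INITIAL_SIZE;
    while (size < target) size *= 2;
    return size;
}
```

With ht[0].used = 3, the expansion target is 3 * 2 = 6 and next_power(6) yields 8, matching the example in the list above.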

Progressive rehash

The migration in step 2 above is not done all at once. A small hash table can be migrated quickly in one go, but for a very large table, say millions of entries, the migration takes long enough to block commands.
Here is how Redis migrates the key-value pairs in ht[0] to ht[1] progressively:

  1. Allocate space for ht[1]; the dictionary now holds both ht[0] and ht[1].
  2. The dictionary tracks rehash progress in rehashidx. Setting it to 0 marks the start of the rehash.
  3. On every add, delete, update or lookup on the dictionary, besides performing the requested operation, the whole linked list in slot rehashidx of ht[0] is migrated to ht[1]. After the migration, rehashidx is incremented by 1.
  4. As the dictionary keeps being read and written, all key-value pairs in ht[0] eventually end up in ht[1]. Once the migration is complete, rehashidx is set to -1.

The advantage of progressive rehash is that it spreads the large migration cost over individual add, delete, update and lookup operations, avoiding the heavy performance hit of doing it all at once.
The disadvantage is that the migration is slow: ht[0] and ht[1] coexist for a long time, and space utilization is low in the meantime.
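The four steps can be sketched as a toy dictionary that stores bare integer keys. This is a simplified model under stated assumptions, not the Redis code: the names mdict, start_rehash, rehash_step, mdict_add and mdict_contains are hypothetical, the key itself serves as the hash value, and allocation failures are only asserted.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry { unsigned long key; struct entry *next; } entry;
typedef struct table { entry **buckets; unsigned long size, used; } table;
typedef struct mdict { table ht[2]; long rehashidx; } mdict;

/* Identity "hash" for brevity; size must be a power of two. */
static unsigned long slot(const table *t, unsigned long key) {
    return key & (t->size - 1);
}

static void table_init(table *t, unsigned long size) {
    t->buckets = size ? calloc(size, sizeof(entry *)) : NULL;
    assert(size == 0 || t->buckets != NULL); /* sketch: no recovery path */
    t->size = size;
    t->used = 0;
}

static void mdict_init(mdict *d, unsigned long size) {
    table_init(&d->ht[0], size);
    table_init(&d->ht[1], 0);
    d->rehashidx = -1;
}

/* Steps 1 and 2: allocate ht[1] and mark the rehash as started. */
static void start_rehash(mdict *d, unsigned long new_size) {
    assert(d->rehashidx == -1);
    table_init(&d->ht[1], new_size);
    d->rehashidx = 0;
}

/* Step 3: move one non-empty slot of ht[0] to ht[1].
 * Returns 1 while work remains, 0 once the rehash is finished or idle. */
static int rehash_step(mdict *d) {
    if (d->rehashidx == -1) return 0;
    table *src = &d->ht[0], *dst = &d->ht[1];
    if (src->used > 0) {
        /* Slots below rehashidx are already empty, so a node must lie ahead. */
        while (src->buckets[d->rehashidx] == NULL) d->rehashidx++;
        entry *e = src->buckets[d->rehashidx];
        while (e != NULL) {
            entry *next = e->next;
            unsigned long i = slot(dst, e->key); /* index is recomputed */
            e->next = dst->buckets[i];
            dst->buckets[i] = e;
            src->used--;
            dst->used++;
            e = next;
        }
        src->buckets[d->rehashidx] = NULL;
        d->rehashidx++;
    }
    if (src->used == 0) { /* Step 4: ht[1] takes over as ht[0]. */
        free(src->buckets);
        d->ht[0] = d->ht[1];
        table_init(&d->ht[1], 0);
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}

/* Every operation pays for one migration step before doing its own work. */
static void mdict_add(mdict *d, unsigned long key) {
    rehash_step(d);
    table *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    unsigned long i = slot(t, key);
    e->next = t->buckets[i];
    t->buckets[i] = e;
    t->used++;
}

static int mdict_contains(mdict *d, unsigned long key) {
    rehash_step(d);
    for (int t = 0; t <= 1; t++) {
        const table *ht = &d->ht[t];
        if (ht->size != 0) {
            for (entry *e = ht->buckets[slot(ht, key)]; e != NULL; e = e->next)
                if (e->key == key) return 1;
        }
        if (d->rehashidx == -1) break; /* ht[1] matters only while rehashing */
    }
    return 0;
}
```

In this model new keys go straight into ht[1] while a rehash is running, so ht[0] only ever shrinks and the migration is guaranteed to finish.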

The following series of figures shows the progressive rehash of a dictionary (figures from Redis Design and Implementation).

Common operations of dictionary

1. Find nodes

The lookup is implemented in dict.c:

#define dictHashKey(d, key) (d)->type->hashFunction(key)
// ...
dictEntry *dictFind(dict *d, const void *key) {
    dictEntry *he;
    uint64_t h, idx, table;
    //Empty dictionary returns null directly
    if (dictSize(d) == 0) return NULL;
    //If a rehash is in progress, perform one step of progressive rehash
    if (dictIsRehashing(d)) _dictRehashStep(d);
    //Compute the hash value with the dictionary's hash function
    h = dictHashKey(d, key);
    
    for (table = 0; table <= 1; table++) {
        //Calculate the index in the hash bucket
        idx = h & d->ht[table].sizemask;
        //Find the corresponding hash slot
        he = d->ht[table].table[idx];
        //The hash slot is a linked list
        while(he) {
            //Traverse the linked list to find the node whose key matches
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        //If no rehash is in progress, there is no need to search ht[1]
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}

2. Add node

int dictAdd(dict *d, void *key, void *val) {
    //Add the node through the low-level function
    dictEntry *entry = dictAddRaw(d, key, NULL);
    if (!entry) return DICT_ERR;
    //Setting values for nodes
    dictSetVal(d, entry, val);
    return DICT_OK;
}

//Low-level function that adds a node
dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing) {
    long index;
    dictEntry *entry;
    dictht *ht;

    //As usual, perform one step of progressive rehash
    if (dictIsRehashing(d)) _dictRehashStep(d);

    //Get the index for the new key; return NULL if the key already exists
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;

    //While rehashing, new nodes go into ht[1]
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry));
    //Put the node at the head of the corresponding hash slot
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    //Set the key of the node
    dictSetKey(d, entry, key);
    return entry;
}

3. Delete the node

Deletion is implemented in dict.c. The first half is much like lookup: find the matching node, then delete it.

static dictEntry *dictGenericDelete(dict *d, const void *key, int nofree) {
    uint64_t h, idx;
    dictEntry *he, *prevHe;
    int table;

    //If it is empty, return
    if (d->ht[0].used == 0 && d->ht[1].used == 0) return NULL;

    //As usual, perform one step of progressive rehash
    if (dictIsRehashing(d)) _dictRehashStep(d);
    //According to the dictionary hash algorithm, calculate the hash code
    h = dictHashKey(d, key);

    for (table = 0; table <= 1; table++) {
        //Calculate the index
        idx = h & d->ht[table].sizemask;
        //Hash slot found
        he = d->ht[table].table[idx];
        prevHe = NULL;
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                /* Unlink the element from the list */
                if (prevHe)
                    prevHe->next = he->next;
                else
                    d->ht[table].table[idx] = he->next;
                if (!nofree) {
                    //Release the node.
                    dictFreeKey(d, he);
                    dictFreeVal(d, he);
                    zfree(he);
                }
                d->ht[table].used--;
                return he;
            }
            prevHe = he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) break;
    }
    return NULL; /* not found */
}

The nofree parameter means “do not free the memory of the node that was found”. Why is this parameter needed?
In some cases, the value of the node has to be read before the node is deleted.
For example, before deleting from a skiplist-encoded zset, the corresponding score must be fetched first and then used to delete the node from the skip list. This will be covered in a later article.

int zsetDel(robj *zobj, sds ele) {
    if (zobj->encoding == OBJ_ENCODING_ZIPLIST) {
        // ...
    } else if (zobj->encoding == OBJ_ENCODING_SKIPLIST) {
        //skiplist encoding
        zset *zs = zobj->ptr;
        dictEntry *de;
        double score;

        //Unlink the hash node to delete, without freeing its memory yet
        de = dictUnlink(zs->dict,ele);
        if (de != NULL) {
            //Get the score of the node, needed to delete the skip list node
            /* Get the score in order to delete from the skiplist later. */
            score = *(double*)dictGetVal(de);
            //Free the memory of the unlinked hash node
            dictFreeUnlinkedEntry(zs->dict,de);
            //Delete the skip list node
            int retval = zslDelete(zs->zsl,score,ele,NULL);
            // ...
        }
    }
    // ...
}
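The unlink-then-free pattern can be shown in isolation on a plain singly linked list. This is a sketch under simplified assumptions: the item type and the list_push, list_unlink and list_delete_get_score helpers are hypothetical names, and the list stands in for a single hash slot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct item {
    const char *key;
    double score;
    struct item *next;
} item;

/* Push a new item at the head; returns it, or NULL if allocation fails. */
static item *list_push(item **head, const char *key, double score) {
    item *it = malloc(sizeof(*it));
    if (it == NULL) return NULL;
    it->key = key;
    it->score = score;
    it->next = *head;
    *head = it;
    return it;
}

/* Detach the item whose key matches and return it WITHOUT freeing it.
 * The caller now owns the item. Returns NULL if the key is absent. */
static item *list_unlink(item **head, const char *key) {
    assert(head != NULL && key != NULL);
    for (item **link = head; *link != NULL; link = &(*link)->next) {
        if (strcmp((*link)->key, key) == 0) {
            item *found = *link;
            *link = found->next;
            found->next = NULL;
            return found;
        }
    }
    return NULL;
}

/* Unlink first, read the score while the item is still valid, then free.
 * Returns 1 and stores the score if the key was found, 0 otherwise. */
static int list_delete_get_score(item **head, const char *key, double *score) {
    item *it = list_unlink(head, key);
    if (it == NULL) return 0;
    *score = it->score;
    free(it);
    return 1;
}
```

Splitting deletion into unlink and free means the key is searched for only once, instead of one lookup to read the value and a second one to delete.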
