- In-memory database: all operations are completed in memory, and memory access is very fast.
- Efficient data structures are used.
Redis has six underlying data structures: simple dynamic string (SDS), doubly linked list, compressed list (ziplist), hash table, skip list, and integer array. The correspondence between them and the data types is shown in the following figure:
To achieve fast access from key to value, Redis uses a hash table to store all key-value pairs.
A hash table is actually an array, and each element of the array is called a hash bucket. That is why we often say a hash table is made up of many hash buckets, and the key-value pairs are stored in these buckets.
We only need to compute the hash value of a key to know which hash bucket it falls into, and then we can access the corresponding entry.
Redis resolves hash conflicts with chained hashing: all the elements that map to the same hash bucket are stored in a linked list, connected one after another by pointers.
As more and more data is written to the hash table, hash conflicts become more frequent, the linked lists on some hash buckets grow too long, and searching for an element along such a chain becomes time-consuming, reducing efficiency.
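The bucket lookup and chaining described above can be sketched in Python (a toy illustration, not Redis's actual C implementation; the class and method names are made up):

```python
# Toy chained hash table: each bucket is a list of (key, value) entries.
class ChainedHashTable:
    def __init__(self, num_buckets=4):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # hash(key) modulo the bucket count picks the hash bucket
        return hash(key) % len(self.buckets)

    def set(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite the existing entry
                return
        bucket.append((key, value))        # chain a new entry onto the bucket

    def get(self, key):
        # walk the chain in the key's bucket until the entry is found
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None
```

With few buckets and many keys, the chains grow and every `get` degrades toward a linear scan, which is exactly the problem rehash addresses below.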
The rehash operation expands the hash table: it increases the number of hash buckets to reduce hash conflicts, and therefore the number of elements in any single bucket.
To make rehash efficient, Redis uses two global hash tables by default: hash table 1 and hash table 2. Initially, when data is first inserted, hash table 1 is used by default and hash table 2 has no space allocated. As the data gradually grows, Redis starts to rehash, in three steps:
1. Allocate more space to hash table 2, for example twice the size of hash table 1;
2. Remap and copy the data in hash table 1 into hash table 2;
3. Free the space of hash table 1.
This process looks simple, but the second step involves copying a large amount of data. If all the data in hash table 1 were migrated in one go, the Redis thread would be blocked and unable to serve other requests, and Redis would no longer be able to access data quickly.
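Done naively, the three steps amount to one blocking pass over every bucket. A sketch of that all-at-once form (illustrative Python, not Redis's code; `bulk_rehash` is a made-up name) makes the cost visible:

```python
# Naive, all-at-once rehash: the blocking migration that Redis avoids.
# A table is represented as a list of buckets, each a list of (key, value).
def bulk_rehash(table1):
    table2 = [[] for _ in range(len(table1) * 2)]    # step 1: allocate 2x space
    for bucket in table1:                            # step 2: remap every entry
        for key, value in bucket:
            table2[hash(key) % len(table2)].append((key, value))
    table1.clear()                                   # step 3: free hash table 1
    return table2
```

The middle loop touches every entry in the table, so its running time grows with the amount of stored data; on a large table, nothing else can run until it finishes.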
To avoid this problem, Redis uses progressive rehash.
In short, while copying data in the second step, Redis continues to serve client requests normally. When handling a request, it starts from the first index in hash table 1 and copies all the entries at that index into hash table 2; when the next request comes in, the entries at the next index of hash table 1 are copied, and so on. As shown in the figure below:
In this way, the cost of one huge copy is spread across many requests, which avoids a long blocking operation and keeps data access fast.
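A minimal sketch of this progressive scheme, with one bucket migrated per request (Redis's dict tracks the position with a `rehashidx` field; everything else here is a simplified toy, not the real dict.c implementation):

```python
# Progressive rehash sketch: every request migrates one bucket from
# table 1 to table 2, tracked by a rehash index (rehashidx in Redis).
class ProgressiveDict:
    def __init__(self, num_buckets=4):
        self.table1 = [[] for _ in range(num_buckets)]
        self.table2 = None
        self.rehashidx = -1          # -1 means no rehash is in progress

    def start_rehash(self):
        self.table2 = [[] for _ in range(len(self.table1) * 2)]
        self.rehashidx = 0

    def _rehash_step(self):
        # Migrate the entries of a single bucket, then advance the index.
        if self.rehashidx < 0:
            return
        for key, value in self.table1[self.rehashidx]:
            self.table2[hash(key) % len(self.table2)].append((key, value))
        self.table1[self.rehashidx] = []
        self.rehashidx += 1
        if self.rehashidx == len(self.table1):       # migration finished
            self.table1, self.table2 = self.table2, None
            self.rehashidx = -1

    def set(self, key, value):
        self._rehash_step()          # every request advances the migration
        for table in (self.table1, self.table2):
            if table is None:
                continue
            bucket = table[hash(key) % len(table)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)
                    return
        # New keys go to table 2 while a rehash is running.
        table = self.table2 if self.rehashidx >= 0 else self.table1
        table[hash(key) % len(table)].append((key, value))

    def get(self, key):
        self._rehash_step()
        for table in (self.table1, self.table2):     # check both tables
            if table is None:
                continue
            for k, v in table[hash(key) % len(table)]:
                if k == key:
                    return v
        return None
```

Note that while the rehash is in progress, lookups must check both tables, and new keys are inserted only into table 2 so that table 1 steadily drains.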
In addition to migrating data on key-value operations, Redis also runs a timed background task for rehash. If no key-value operations arrive, this task periodically (for example, every 100ms) moves some buckets of data to the new hash table, which shortens the overall rehash process.
A compressed list (ziplist) is similar to an array, in which each element holds one piece of data. Unlike an array, a compressed list has three fields in its header: zlbytes, zltail, and zllen, which record the length of the list in bytes, the offset of the tail of the list, and the number of entries in the list; there is also a zlend at the end of the compressed list, marking the end of the list.
In a compressed list, the first and the last element can be located directly through the three header fields, with O(1) complexity. Finding any other element is not as efficient: the list can only be searched entry by entry, so the complexity becomes O(N).
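A toy byte-level sketch of this layout helps show why the ends are O(1). (Real ziplists use variable-width encodings plus a previous-entry-length field; here each entry is just a 1-byte length followed by its payload, and the input list is assumed non-empty.)

```python
import struct

# Toy compressed list: a 10-byte header <zlbytes, zltail, zllen>, then the
# entries, then a 0xFF end marker (zlend).
def make_ziplist(items):
    body = b"".join(bytes([len(item)]) + item for item in items)
    zlbytes = 10 + len(body) + 1                    # total size in bytes
    zltail = 10 + len(body) - (1 + len(items[-1]))  # offset of the last entry
    return struct.pack("<IIH", zlbytes, zltail, len(items)) + body + b"\xff"

def first_entry(zl):
    # O(1): the first entry starts right after the fixed 10-byte header.
    length = zl[10]
    return zl[11:11 + length]

def last_entry(zl):
    # O(1): zltail points straight at the last entry, no traversal needed.
    zltail = struct.unpack_from("<I", zl, 4)[0]
    length = zl[zltail]
    return zl[zltail + 1:zltail + 1 + length]
```

Any middle entry, by contrast, can only be reached by decoding one length after another from the front, which is the O(N) scan mentioned above.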
A sorted linked list can only be searched element by element, which makes lookups very slow, so the skip list was introduced. Specifically, a skip list adds multiple levels of indexes on top of the linked list; by jumping through a few index positions, data can be located quickly, as shown in the figure below:
As you can see, a search jumps up and down across the multi-level indexes before finally locating the element, which is where the name "skip" list comes from. When the amount of data is large, the lookup complexity of a skip list is O(logN).
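The multi-level index idea can be sketched as follows (a classic textbook skip list in Python, not Redis's zskiplist; `MAX_LEVEL` and the 1/2 promotion probability are illustrative choices):

```python
import random

# Minimal skip list: each node carries a tower of forward pointers; higher
# levels skip over more nodes, giving O(log N) expected search time.
MAX_LEVEL = 4

class Node:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    def __init__(self):
        self.head = Node(None, MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        # Promote a new node to each higher level with probability 1/2.
        level = 1
        while level < MAX_LEVEL and random.random() < 0.5:
            level += 1
        return level

    def insert(self, value):
        update = [self.head] * MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node            # last node before the insert point
        level = self._random_level()
        self.level = max(self.level, level)
        new = Node(value, level)
        for i in range(level):          # splice into each of its levels
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, value):
        node = self.head
        # Start at the top index level and drop down a level on overshoot.
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.value == value
```

Each search runs right along a level until the next node would overshoot, then drops one level down, which is exactly the up-and-down jumping the figure illustrates.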
This work is licensed under a CC agreement. Reprints must credit the author and link to this article.