Redis provides two persistence methods: RDB, which is based on snapshots, and AOF, which is based on a command log. Each has its own advantages and disadvantages. This article introduces both, and I hope you will have a clearer and more complete understanding of them after reading it.
RDB snapshot persistence
Let's start with RDB snapshots. RDB is the default persistence mode of Redis, so we do not need to enable it separately. First, let's look at the configuration related to RDB:
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   save ""

# Automatic snapshot triggers: a time window in seconds followed by a number of
# changes. For example, "save 60 10000" generates a snapshot automatically if
# 10000 keys were changed within 60 seconds.
save 900 1
save 300 10
save 60 10000

# Whether Redis stops accepting writes when background snapshot generation fails
stop-writes-on-bgsave-error yes

# Whether to compress the stored data
rdbcompression yes

# Whether to verify the RDB file checksum during data recovery
rdbchecksum yes

# The filename where to dump the DB (the file generated by the RDB snapshot)
dbfilename dump.rdb

# The directory where the snapshot is written; the AOF file is also stored here
dir .
There is not much RDB configuration, and even less that needs adjusting. We usually only need to tune the snapshot trigger rules and the file storage path to suit our own traffic.
RDB persistence can be triggered in two ways: manually and automatically. Manual triggering uses one of the following two commands:
- save: blocks the current Redis server from responding to other commands until the RDB snapshot has been generated. Instances with a lot of memory will be blocked for a long time, so it is not recommended in production.
- bgsave: the Redis main process forks a child process, which is responsible for generating the RDB snapshot and exits automatically when it is done. bgsave blocks the main process only during the fork itself, which is usually very short, so this is the recommended command for manual triggering.
Besides manual commands, Redis also triggers RDB persistence automatically in the following cases:
- A save rule is set in the configuration, such as save 60 10000 in the configuration file above. Rules take the form "save m n", which means that bgsave is triggered automatically when there are at least n changes to the data set within m seconds.
- In master-slave replication, when a slave node performs a full copy, the master node automatically runs bgsave to generate an RDB file and sends it to the slave node.
- When the debug reload command is executed to reload Redis, a save operation is also triggered automatically.
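- By default, when the shutdown command is executed and AOF persistence is not enabled, Redis automatically saves an RDB snapshot before exiting. This is a blocking save rather than a background one, and it runs only if at least one save rule is configured.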
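The "save m n" rule from the list above can be sketched in a few lines of Python. This is only an illustration of the rule as the configuration comments describe it, not Redis's own code; the function name `should_bgsave` and the `SAVE_POINTS` list are made up for the example.

```python
# Hypothetical helper: evaluates "save <seconds> <changes>" rules.
# A snapshot is due when ANY rule has both of its conditions met.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]  # mirrors the config above


def should_bgsave(save_points, seconds_since_last_save, changes_since_last_save):
    """Return True if at least one (seconds, changes) rule is satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    # 61 seconds since the last save, 10000 changed keys: "save 60 10000" fires.
    print(should_bgsave(SAVE_POINTS, 61, 10000))
    # Plenty of changes, but only 30 seconds elapsed: no rule fires yet.
    print(should_bgsave(SAVE_POINTS, 30, 50000))
```

An empty rule list never triggers a snapshot, which corresponds to the `save ""` line in the configuration comments.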
Those are the ways RDB persistence is triggered. As you can see, the save command is rarely used; in most cases bgsave is used instead. There is more to bgsave than it seems, so let's look at how it works.
The bgsave command goes through the following steps:
- 1. The bgsave command is executed. The Redis main process checks whether an RDB or AOF child process is already running. If so, bgsave returns immediately without doing anything.
- 2. The parent process forks a child process. The parent is blocked during the fork; once the fork completes, it is no longer blocked and can accept other commands.
- 3. The child process writes a temporary snapshot file from the parent's in-memory data as of the fork, replaces the original RDB file with the new one when it is finished, and notifies the parent process that the RDB snapshot is complete.
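Steps 2 and 3 can be imitated with a small Python sketch (Unix only, because it relies on `os.fork`). It is a toy model under stated assumptions: the "database" is a plain dict, the snapshot is written as JSON rather than the real RDB binary format, and step 1 (refusing to start while another child is running) is left out. The names `bgsave` and `demo_bgsave` are invented for the example.

```python
import json
import os
import tempfile


def bgsave(data, rdb_path):
    """Fork a child that writes `data` to a temp file, then renames it into place."""
    pid = os.fork()
    if pid == 0:
        # Child: it sees the parent's memory exactly as it was at fork time.
        tmp_path = f"{rdb_path}.tmp-{os.getpid()}"
        try:
            with open(tmp_path, "w") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, rdb_path)  # atomically replace the old snapshot
            os._exit(0)
        except Exception:
            os._exit(1)
    # Parent: returns right after the fork and is free to serve other commands.
    return pid


def demo_bgsave():
    data = {"mykey1": "Hello"}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dump.json")
        pid = bgsave(data, path)
        data["mykey1"] = "changed after fork"  # not visible to the child
        _, status = os.waitpid(pid, 0)         # stands in for the "done" notification
        with open(path) as f:
            snapshot = json.load(f)
    return status, snapshot


if __name__ == "__main__":
    print(demo_bgsave())
```

The write made by the parent after the fork does not show up in the snapshot, which is the point-in-time behaviour that step 3 describes.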
That covers what happens behind the bgsave command, and most of what there is to say about RDB. Let's summarize the advantages and disadvantages of RDB persistence. Advantages of RDB:
- An RDB snapshot is the in-memory data of a Redis node at a point in time, which makes it well suited to backups: it can be uploaded to a remote server or file system for disaster recovery.
- Recovering data from RDB is much faster than from AOF.
Where there are advantages there are also disadvantages. The disadvantages of RDB are:
- RDB cannot provide real-time or second-level persistence. As we have seen, every bgsave has to fork a child process, which is a heavyweight operation, and running it frequently is too expensive.
- RDB files are stored in a specific binary format, and there have been several RDB format versions as Redis has evolved. An older version of Redis may be unable to read an RDB file written by a newer version.
If we have strict requirements on our data and cannot tolerate losing more than a second or so of it, RDB cannot meet them. Can Redis? The answer is yes, with the AOF persistence method described next.
AOF persistence mode
Redis does not enable AOF persistence by default, so we need to enable it ourselves. In the redis.conf configuration file, change appendonly no to appendonly yes to enable AOF persistence. Unlike RDB, AOF persists data by recording the operation commands that were executed. Here is what an AOF persistence file looks like:
*2
$6
SELECT
$1
0
*3
$3
set
$6
mykey1
$6
Hello
*3
$3
set
$4
key2
$5
hello
*1
$8
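Each entry in the listing above is a command in the Redis protocol text format: a `*N` line gives the number of arguments, and each argument is preceded by a `$L` line giving its length in bytes. A minimal Python sketch of that encoding is shown below; the function name `encode_command` is made up for illustration.

```python
def encode_command(*args):
    """Encode one command in the text protocol format that the AOF file stores."""
    parts = [b"*%d\r\n" % len(args)]               # number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n" % len(data))       # byte length of this argument
        parts.append(data + b"\r\n")               # the argument itself
    return b"".join(parts)


if __name__ == "__main__":
    # Prints the same lines as the "set key2 hello" entry in the file above.
    print(encode_command("set", "key2", "hello").decode("utf-8"))
```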
That is roughly what the file looks like; you can inspect the appendonly.aof file on your own Redis server. It also means that we can edit values in appendonly.aof ourselves, and the modified values will be loaded when Redis restarts. The file looks like a simple list of commands, but there is a lot going on between a command being executed and it arriving in appendonly.aof. The AOF persistence flow is as follows:
There are two very important operations in the AOF persistence process: one is appending the operation command to the AOF buffer (aof_buf), and the other is synchronizing the data in the AOF buffer to the AOF file. Let's look at each in detail:
1. Why write commands to the AOF buffer instead of directly to the AOF file?
Redis responds to commands in a single thread. If every write command were appended directly to the AOF file on disk, Redis's performance would depend entirely on the disk of the machine it runs on. To keep responses fast, Redis adds an AOF buffer layer: commands are first appended to the buffer in memory and written out to the file afterwards. This solves the performance problem but introduces another one: how is the AOF buffer synchronized to the AOF file, and who does it? That is the next operation we will talk about: fsync.
2. How is the AOF buffer data synchronized to the AOF file?
By default, it is up to the Linux system to decide when written data actually reaches the AOF file on disk. Linux schedules this write-back on a fairly long cycle, so if the system fails, all the data from one cycle may be lost. That is not what we want, so Linux provides the fsync call. fsync operates on a single file (the AOF file in this case) and blocks until the data has been written to the hard disk, which guarantees that it is persisted. Building on this, Redis provides the appendfsync option in redis.conf so that we can decide when to sync to disk. It has the following three values:
# appendfsync always
appendfsync everysec
# appendfsync no
- always: syncs the buffer to disk on every write command, which ensures that no data is lost. However, it significantly reduces Redis's throughput, to something like a few hundred TPS. This goes against what Redis is designed for, so it is not recommended.
- everysec: the default policy. Data is synced once per second. Although that sounds frequent, it has little impact on Redis's throughput, and it means that in the worst case we lose only about one second of data. This is the recommended policy because it balances performance and data safety.
- no: Redis does not call fsync itself; syncing the buffered data to the AOF file is left to the operating system. The operating system's sync cycle is not fixed and can be as long as 30 seconds, so more data will be lost if a failure occurs.
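To make the difference between the three policies concrete, here is a simplified Python model of an AOF buffer with a configurable fsync policy. It is a sketch under several assumptions: `AofWriter` and `count_fsyncs` are invented names, the clock is injected so the example is deterministic, and the once-per-second sync is done inline, whereas a real server would not want to block its main thread on it.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy model: commands go to an in-memory buffer, then to the file, then fsync."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy          # "always", "everysec" or "no"
        self.clock = clock
        self.buf = bytearray()        # plays the role of the AOF buffer
        self.last_fsync = clock()
        self.fsync_count = 0

    def append(self, command):
        self.buf += command           # cheap: memory only, no disk access

    def flush(self):
        view = memoryview(bytes(self.buf))
        while view:                   # write() hands the data to the OS page cache
            written = os.write(self.fd, view)
            view = view[written:]
        self.buf.clear()
        now = self.clock()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)         # blocks until the data is on disk
            self.last_fsync = now
            self.fsync_count += 1
        # policy "no": never fsync here, the OS decides when to write back

    def close(self):
        os.close(self.fd)


def count_fsyncs(policy, commands=5, step=0.5):
    """Run `commands` writes, `step` fake seconds apart, and count fsync calls."""
    now = [0.0]
    with tempfile.TemporaryDirectory() as d:
        writer = AofWriter(os.path.join(d, "appendonly.aof"), policy, clock=lambda: now[0])
        for _ in range(commands):
            now[0] += step
            writer.append(b"*3\r\n$3\r\nset\r\n$1\r\nk\r\n$1\r\nv\r\n")
            writer.flush()
        writer.close()
    return writer.fsync_count


if __name__ == "__main__":
    for policy in ("always", "everysec", "no"):
        print(policy, count_fsyncs(policy))
```

With five writes spaced half a second apart, the model calls fsync five times under always, twice under everysec, and never under no.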
Those are the three disk synchronization policies. But have you noticed a problem? AOF files are append-only, so as the server runs they grow larger and larger. An oversized AOF file affects the Redis server and even the host, and loading it takes too long when Redis restarts. None of this is friendly. How does Redis solve it? It introduces a rewrite mechanism to keep the AOF file from growing too large.
3. How does Redis rewrite AOF files?
AOF rewriting is the process of converting the data inside the Redis process into write commands and writing them to a new AOF file. The rewritten AOF file is smaller than the old one for the following reasons:
- Data that has already expired in the process is not written to the new file.
- The old AOF file contains commands that no longer have any effect, such as del key1, hdel key2, srem keys, set a 111, set a 222, and so on. The rewrite is generated directly from the in-process data, so the new AOF file keeps only the write commands needed for the final data.
- Multiple write commands can be merged into one. For example, lpush list a, lpush list b, lpush list c can be converted into lpush list a b c. To prevent a single oversized command from overflowing the client buffer, operations on list, set, hash, zset and similar types are split into multiple commands at a boundary of 64 elements.
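The three points above can be sketched as a function that turns an in-memory data set into a minimal list of commands. This is a toy model, not Redis's rewrite code: the data set is a dict mapping each key to a `(value, expire_at)` pair, only strings and lists are handled, and `rewrite_commands` is a made-up name. The sketch uses rpush for lists so that the stored order is reproduced, and re-emits the expiry of keys that are still alive.

```python
import time

CHUNK = 64  # maximum elements per generated command, as described above


def rewrite_commands(db, now=None):
    """Build the write commands needed to recreate `db`, skipping expired keys."""
    now = time.time() if now is None else now
    commands = []
    for key, (value, expire_at) in db.items():
        if expire_at is not None and expire_at <= now:
            continue                                  # expired: not written at all
        if isinstance(value, list):
            for i in range(0, len(value), CHUNK):     # split long lists into chunks
                commands.append(["rpush", key, *value[i:i + CHUNK]])
        else:
            commands.append(["set", key, value])      # only the final value survives
        if expire_at is not None:
            commands.append(["pexpireat", key, str(int(expire_at * 1000))])
    return commands


if __name__ == "__main__":
    db = {
        "a": ("222", None),                              # earlier "set a 111" is gone
        "gone": ("x", 50.0),                             # expired before now=100
        "list": ([str(i) for i in range(130)], None),    # 130 elements -> 3 commands
    }
    for command in rewrite_commands(db, now=100.0):
        print(command[:4])
```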
A smaller AOF file not only saves disk space but also loads faster when Redis recovers its data. Like RDB persistence, AOF rewriting can be triggered manually or automatically. To trigger it manually, simply call the bgrewriteaof command, which we will discuss in detail later. To trigger it automatically, we need to set the following options in redis.conf:
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
- auto-aof-rewrite-percentage: how much the current AOF file size (aof_current_size) has grown relative to the AOF file size after the last rewrite (aof_base_size), as a percentage. The default is 100, that is, a rewrite is triggered once the file has grown by 100% and is twice the size it was after the last rewrite.
- auto-aof-rewrite-min-size: the minimum AOF file size at which an automatic rewrite may run. The default is 64MB, that is, the AOF file must reach 64MB before a rewrite can be triggered.
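Put together, the two options amount to a check like the following Python sketch. The function name `should_rewrite` is hypothetical, and details such as integer arithmetic and the strict comparison against the minimum size are assumptions of this sketch rather than something stated in the article.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Return True when both automatic rewrite conditions hold."""
    if percentage <= 0 or current_size <= min_size:
        return False                      # feature disabled, or file still too small
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100   # growth since the last rewrite, in %
    return growth >= percentage


if __name__ == "__main__":
    print(should_rewrite(128 * MB, 64 * MB))  # doubled since the last rewrite
    print(should_rewrite(100 * MB, 64 * MB))  # grew by only about 56%
    print(should_rewrite(60 * MB, 20 * MB))   # tripled, but below the minimum size
```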
When both conditions are met, Redis triggers an AOF rewrite automatically. The details of AOF rewriting are similar to those of RDB snapshot generation. The flow is as follows:
AOF rewriting is also done by a child process. It is similar to RDB snapshot generation, except that it also sets up an AOF rewrite buffer (aof_rewrite_buf) to store the commands the main process receives while the rewrite is in progress. Once the new AOF file has been written, the buffered commands are appended to it, and finally the new AOF file replaces the old one. Note that during the rewrite, commands are still synced to the old AOF file on disk, so that no data is lost if the rewrite fails.
Redis persistent data recovery
Redis is memory-based: all data is stored in memory, so if the machine goes down or restarts for some other reason, all of our data would be lost. That is why persistence exists. When the server restarts, Redis loads the data from the persistence files, and our data is restored to what it was before the restart. How does Redis carry out data recovery? The flow is as follows:
Redis's data recovery process is relatively simple. It gives priority to the AOF file; if the AOF file does not exist, it tries to load the RDB file. Why does Redis load the AOF file first even though RDB recovery is faster? Personally, I think it is because the AOF data is more complete and the AOF format is more compatible than RDB. Note that if an RDB or AOF file exists but fails to load, the Redis service will fail to start.
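The priority order described above can be written as a small Python sketch. It follows the article's description only: it checks whether the files exist and ignores the appendonly setting, which a real server also takes into account. The function name `choose_recovery_file` and the default file names are assumptions based on the configuration shown earlier.

```python
import os
import tempfile


def choose_recovery_file(data_dir, aof_name="appendonly.aof", rdb_name="dump.rdb"):
    """Return the file to load on startup: AOF first, then RDB, else None."""
    aof_path = os.path.join(data_dir, aof_name)
    rdb_path = os.path.join(data_dir, rdb_name)
    if os.path.exists(aof_path):
        return aof_path           # AOF has priority when it exists
    if os.path.exists(rdb_path):
        return rdb_path           # fall back to the RDB snapshot
    return None                   # nothing to load: start with an empty data set


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(choose_recovery_file(d))
        open(os.path.join(d, "dump.rdb"), "w").close()
        print(os.path.basename(choose_recovery_file(d)))
        open(os.path.join(d, "appendonly.aof"), "w").close()
        print(os.path.basename(choose_recovery_file(d)))
```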
There are already many Redis tutorial series online written by more experienced people. If any of this overlaps with theirs, please forgive me. Writing original content takes effort, so I appreciate your support. If you find any mistakes in the article, please point them out. Thank you.