Introduction to the concept of Redis data persistence

Time:2022-6-2

I. Overview of data persistence

Redis is an in-memory database: all data is stored in memory. To avoid permanent data loss when the Redis process exits abnormally (for example, after a server power failure), the data held in memory (or the commands that produced it) must be saved to the hard disk regularly in some form. When Redis next restarts, it uses the persistence files to recover the data. For disaster recovery, the persistence files can also be copied to a remote location (for example, over NFS).

Redis provides two methods for persistence:
RDB persistence: periodically saves the Redis database records held in memory to disk (similar to a snapshot).

AOF persistence (append only file): appends the Redis operation log to a file, similar to MySQL's binlog (log-based persistence).

Because AOF persistence is closer to real time, that is, less data is lost when the process exits unexpectedly (it is generally set to sync once per second), AOF is the current mainstream persistence method. RDB persistence is generally still kept enabled as well (for clusters).

1. RDB persistence

(1) RDB persistence saves a snapshot of the data held in the current process's memory to the hard disk at specified intervals (hence the name snapshot persistence), stored in compressed binary form. The saved file has the suffix .rdb. When Redis restarts, it reads the snapshot file to recover the data.

RDB file, full name Redis DataBase
-One of the data persistence methods
-The default data persistence method
-Writes a snapshot of the in-memory dataset to the hard disk at specified intervals
-During recovery, the snapshot file is read directly into memory
·Define the RDB file name
-dbfilename "dump.rdb"

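(2) Trigger save

Tuning: how often data is saved from memory to the hard disk
-save 900 1  save after 900 seconds (15 minutes) if at least 1 key has changed
-save 300 10  save after 300 seconds (5 minutes) if at least 10 keys have changed
-save 60 10000  save after 60 seconds (1 minute) if at least 10000 keys have changed


Save manually
-save  blocks the server while the snapshot is written to disk; Redis accepts no new writes during the save
-bgsave  forks a child process that saves in the background, so the server is not blocked during the save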

vim /etc/redis/6379.conf

#----Line 219: bgsave is called when any of the following three save conditions is met
save 900 1
#After 900 seconds, if the Redis data has changed at least once, execute bgsave
save 300 10
#After 300 seconds, if the Redis data has changed at least 10 times, execute bgsave
save 60 10000
#After 60 seconds, if the Redis data has changed at least 10000 times, execute bgsave
#----Line 242 -- whether to enable RDB file compression
rdbcompression yes
#----Line 254 -- specify the RDB file name
dbfilename dump.rdb
#----Line 264 -- specify the directory where the RDB file and AOF file are located
dir /var/lib/redis/6379
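
The three save conditions in the configuration above can be read as "enough time has passed AND enough keys have changed", with any one rule being sufficient. The following is a minimal Python sketch of that check. The names SAVE_RULES and should_bgsave are hypothetical and chosen for illustration; this is not Redis source code.

```python
# Hypothetical illustration of the "save <seconds> <changes>" rules.
# Each tuple is (seconds, changes), copied from the configuration above.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """Return True if at least one rule has both of its thresholds met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


if __name__ == "__main__":
    # 5 minutes since the last save and 12 changed keys: the "300 10" rule fires.
    print(should_bgsave(300, 12))
    # 2 minutes and 500 changed keys: no rule is satisfied yet.
    print(should_bgsave(120, 500))
```

A busy instance therefore snapshots often (the 60-second rule), while a quiet one still snapshots eventually (the 900-second rule).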

(3) Other automatic triggering mechanisms
In addition to save m n, there are other situations that trigger bgsave:
In the master-slave replication scenario, if the slave node performs full replication, the master node will execute the bgsave command and send the RDB file to the slave node.

RDB persistence is also performed automatically when the shutdown command is executed.

(4) Execution process of bgsave

① The Redis parent process first checks whether save, or a child process of bgsave/bgrewriteaof, is currently running. If so, the bgsave command returns immediately. The child processes of bgsave and bgrewriteaof cannot run at the same time, mainly for performance reasons: two concurrent child processes performing heavy disk writes may cause serious performance problems.
② The parent process forks a child process. The parent process is blocked during the fork, and Redis cannot execute any client commands.
③ Once the fork completes, the bgsave command returns "Background saving started". The parent process is no longer blocked and can respond to other commands.
④ The child process creates the RDB file: it generates a temporary snapshot file from the memory snapshot of the parent process and atomically replaces the original file when it is finished.
⑤ The child process signals the parent process that it has finished, and the parent process updates its statistics.

Recovering data using RDB files
·Back up data
-Copy the dump.rdb file to another location
]# cp database-directory/dump.rdb backup-directory/

·Recover data
-Copy the backup file to the database directory and restart the Redis service
]# cp backup-directory/dump.rdb database-directory/

(5) Load on startup

RDB file loading is performed automatically when the server starts; there is no special command. However, because AOF has higher priority, Redis loads the AOF file first to recover data when AOF is enabled. Only when AOF is disabled does Redis detect and load the RDB file at startup. The server is blocked until the RDB file has finished loading.
When Redis (with AOF disabled) loads an RDB file, it verifies the file. If the file is damaged, an error is printed in the log and Redis fails to start.

AOF persistence
RDB persistence writes the process data to a file, whereas AOF persistence records every write and delete command executed by Redis in a separate log file. Query operations are not recorded. When Redis restarts, it executes the commands in the AOF file again to recover the data.
Compared with RDB, AOF is closer to real time, so it has become the mainstream persistence scheme.

2. Open AOF

The Redis server enables RDB and disables AOF by default. To enable AOF, change the configuration file:
(1) Modify the configuration file
vim /etc/redis/6379.conf

#----Line 700: change to yes to enable AOF
appendonly yes
#----Line 704: specify the AOF file name
appendfilename "appendonly.aof"
#----Line 796: whether to ignore a final command that may be incomplete
aof-load-truncated yes
#When Redis recovers, it ignores a final command that may be incomplete. The default value is yes. The last command in the AOF file may have been cut off while it was being written (for example, by a sudden power failure). With yes, Redis logs the problem and continues; with no, recovery fails
/etc/init.d/redis_6379 restart
#You need to cancel the password first

(2) Execution process

Since every Redis write command must be recorded, AOF does not need to be triggered. The following describes the execution process of AOF.
The AOF execution process consists of:
Command append: append the Redis write command to the buffer aof_buf
File write and file sync: synchronize the contents of aof_buf to the hard disk according to the configured synchronization policy
File rewrite: periodically rewrite the AOF file to compact it
① Command append
Redis first appends the write command to a buffer instead of writing it directly to the file. The main reason is to avoid a hard disk write for every single write command, which would make disk IO the bottleneck of Redis.
Commands are appended in the Redis request protocol format. It is a plain text format with good compatibility and readability, it is easy to process, and it avoids the overhead of a second conversion. In the AOF file, apart from the select command that specifies the database (for example, select 0 selects database 0), which Redis adds itself, all commands are write commands sent by clients.
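
As a rough illustration of the protocol format mentioned above, the sketch below encodes a command as a Redis protocol (RESP) array of bulk strings and appends it to an in-memory buffer. The function name encode_command and the variable aof_buf are used here for illustration only; this is a sketch of the text format, not how the Redis server is implemented.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings.

    "*<count>" introduces the array, and each argument is written as
    "$<byte length>" followed by the argument itself, all CRLF-terminated.
    """
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


# An in-memory stand-in for the append buffer described in the text.
aof_buf = bytearray()
aof_buf += encode_command("SELECT", 0)           # added by Redis itself
aof_buf += encode_command("SET", "mykey", "v1")  # write command from a client

if __name__ == "__main__":
    print(aof_buf.decode("utf-8"))
```

Because each argument carries its own length prefix, values containing spaces or newlines can be stored without escaping.
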
② File write and file sync
Redis provides several file synchronization policies for the AOF buffer. The policies involve the write and fsync functions of the operating system, described as follows:
To make file writes more efficient, modern operating systems usually hold the data in a memory buffer when a program calls write. The data in the buffer is actually written to the hard disk only when the buffer fills up or a time limit is exceeded. This improves efficiency but introduces a safety problem: if the computer shuts down, the data in the memory buffer is lost. The system therefore also provides synchronization functions such as fsync and fdatasync, which force the operating system to write the buffered data to the hard disk immediately and so keep the data safe.
The AOF buffer has three file synchronization policies (vim /etc/redis/6379.conf, line 729):
appendfsync always: after a command is written to aof_buf, fsync is called immediately to synchronize it to the AOF file, and the thread returns once fsync completes. Every write command must be synchronized to the AOF file, so hard disk IO becomes the performance bottleneck. Redis can then sustain only a few hundred write TPS, which seriously reduces its performance. Even with a solid-state drive (SSD), it can process only tens of thousands of commands per second, and the service life of the SSD is greatly reduced.
appendfsync no: after a command is written to aof_buf, write is called, but the AOF file is not fsynced. Synchronization is left to the operating system, usually on a 30-second cycle. The timing of file synchronization is then out of Redis's control and a lot of data accumulates in the buffer, so data safety cannot be guaranteed.
appendfsync everysec: after a command is written to aof_buf, write is called and the thread returns once the write completes. fsync is called once per second by a dedicated thread. everysec is a compromise between the other two policies that balances performance and data safety. It is therefore the default configuration of Redis and the recommended one.
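
The difference between write and fsync, and between the three policies, can be sketched with Python's os module. AofWriter is a hypothetical class written for this article. As a simplification, it checks the one-second interval inline on each append, whereas the text above describes Redis doing this in a dedicated thread.

```python
import os
import time


class AofWriter:
    """Hypothetical sketch of the three appendfsync policies; not Redis code."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: %r" % policy)
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def append(self, data):
        # write() only hands the data to the operating system's buffer.
        os.write(self.fd, data)
        if self.policy == "always":
            self._fsync()  # durable before returning, but slow
        elif self.policy == "everysec":
            if time.monotonic() - self.last_fsync >= 1.0:
                self._fsync()  # at most about one second of writes at risk
        # policy "no": the operating system decides when to flush

    def _fsync(self):
        os.fsync(self.fd)  # force the buffered data onto the disk
        self.last_fsync = time.monotonic()
        self.fsync_calls += 1

    def close(self):
        os.close(self.fd)
```

With "always", the number of fsync calls equals the number of writes, which is why that policy is the slowest and the safest of the three.
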
③ File rewrite
Over time, the Redis server executes more and more write commands and the AOF file keeps growing. An oversized AOF file not only affects the normal operation of the server but also makes data recovery take too long.
File rewriting means rewriting the AOF file regularly to reduce its size.
AOF rewriting converts the data in the Redis process into write commands and writes them to a new AOF file. It does not read or write the old AOF file.
For AOF persistence, file rewriting is strongly recommended but not required. Even without it, data can be persisted and loaded when Redis starts. Some deployments therefore turn off automatic file rewriting and run it at a fixed time of day through a scheduled task.

File rewriting can shrink the AOF file because:
Expired data is no longer written to the file
Commands that no longer have any effect are not written to the file: for example, data that was set repeatedly (set mykey v1, set mykey v2) or data that was deleted (sadd myset v1, del myset).
Multiple commands can be merged into one: for example, sadd myset v1, sadd myset v2, and sadd myset v3 can be merged into sadd myset v1 v2 v3.
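
The last two points above can be sketched as follows: replay the old command log into an in-memory state, then emit one command per surviving key. The function rewrite_commands is hypothetical and handles only set, sadd, and del. It assumes a key is never used with mixed types, and it does not model expiration.

```python
def rewrite_commands(commands):
    """Return a minimal command list that rebuilds the same final state."""
    state = {}
    for name, key, *values in commands:
        name = name.lower()
        if name == "set":
            state[key] = values[0]  # later values overwrite earlier ones
        elif name == "sadd":
            state.setdefault(key, set()).update(values)
        elif name == "del":
            state.pop(key, None)    # deleted keys leave no trace
        else:
            raise ValueError("unsupported command: %r" % name)

    rewritten = []
    for key, value in state.items():
        if isinstance(value, set):
            # Many sadd calls collapse into one multi-member sadd.
            rewritten.append(("sadd", key, *sorted(value)))
        else:
            rewritten.append(("set", key, value))
    return rewritten


if __name__ == "__main__":
    log = [
        ("set", "mykey", "v1"), ("set", "mykey", "v2"),
        ("sadd", "tmp", "v1"), ("del", "tmp"),
        ("sadd", "myset", "v1"), ("sadd", "myset", "v2"), ("sadd", "myset", "v3"),
    ]
    for command in rewrite_commands(log):
        print(" ".join(command))
```

Here seven logged commands shrink to two, which matches the examples given in the list above.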

File rewriting can be triggered manually or automatically:

Manual trigger: call the bgrewriteaof command directly. Its execution is similar to that of bgsave: both fork a child process to do the actual work, and both block only during the fork.

Automatic trigger: bgrewriteaof runs automatically according to the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage options. An AOF rewrite (that is, a bgrewriteaof operation) is triggered automatically only when both options are satisfied at the same time.

auto-aof-rewrite-percentage 100: bgrewriteaof is triggered when the current AOF file size (aof_current_size) is twice the AOF file size at the last rewrite (aof_base_size)
auto-aof-rewrite-min-size 64mb: the minimum AOF file size at which bgrewriteaof runs, which avoids frequent rewrites while the file is still small just after Redis starts



vim /etc/redis/6379.conf
#----Line 729----
auto-aof-rewrite-percentage 100
#bgrewriteaof is triggered when the current AOF file size (aof_current_size) is twice the AOF file size at the last rewrite (aof_base_size)
auto-aof-rewrite-min-size 64mb
#The minimum AOF file size at which bgrewriteaof runs, which avoids frequent rewrites while the file is still small just after Redis starts
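
The way the two options combine can be sketched as a small predicate. The function should_auto_rewrite is a hypothetical name, and the exact boundary comparisons (strictly above the minimum size, growth at or above the percentage) are assumptions of this sketch rather than something stated in this article. A percentage of 0 is treated as "automatic rewrite disabled", as described in the comments of the stock redis.conf.

```python
MB = 1024 * 1024


def should_auto_rewrite(aof_current_size, aof_base_size,
                        percentage=100, min_size=64 * MB):
    """Return True when both automatic rewrite conditions are met."""
    if percentage == 0:
        return False  # automatic rewriting is switched off
    if aof_current_size <= min_size:
        return False  # file is still too small to be worth rewriting
    base = aof_base_size or 1  # avoid dividing by zero on a fresh file
    growth = aof_current_size * 100 // base - 100  # growth in percent
    return growth >= percentage


if __name__ == "__main__":
    # 128 MB now versus 64 MB after the last rewrite: 100% growth.
    print(should_auto_rewrite(128 * MB, 64 * MB))
    # 10 MB now versus 1 MB: large growth, but below the 64 MB minimum.
    print(should_auto_rewrite(10 * MB, 1 * MB))
```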

Note:
	The rewrite is performed by a child process forked by the parent process
	Write commands that Redis executes during the rewrite must also be appended to the new AOF file. For this purpose, Redis introduces the aof_rewrite_buf buffer.

(3) The process of file rewriting is as follows

The Redis parent process first checks whether a child process is currently executing bgsave/bgrewriteaof. If a bgrewriteaof child process is running, the bgrewriteaof command returns immediately. If a bgsave child process is running, the rewrite is executed after bgsave completes.
The parent process forks a child process. The parent process is blocked during the fork.
Once the fork completes, the bgrewriteaof command returns "Background append only file rewriting started". The parent process is no longer blocked and can respond to other commands. All Redis write commands are still written to the AOF buffer and synchronized to the hard disk according to the appendfsync policy, which keeps the original AOF mechanism correct.
Because fork uses copy-on-write, the child process shares only the memory data as it was at the time of the fork. Since the parent process keeps responding to commands, Redis uses the AOF rewrite buffer (aof_rewrite_buf) to hold the writes that arrive in the meantime, so that they are not lost while the new AOF file is generated. That is, while bgrewriteaof runs, Redis write commands are appended to both buffers, aof_buf and aof_rewrite_buf.
The child process writes to the new AOF file according to the memory snapshot and the command merge rules.
After the child process finishes writing the new AOF file, it signals the parent process, and the parent process updates its statistics. The details can be viewed through info persistence.
The parent process writes the data in the AOF rewrite buffer to the new AOF file, which ensures that the database state saved in the new AOF file matches the current state of the server.
The new AOF file replaces the old one, which completes the AOF rewrite.
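
The role of the rewrite buffer in the steps above can be shown with a toy model. MiniServer and replay are hypothetical names. A dictionary copy stands in for the fork-time memory snapshot, and plain lists stand in for the AOF file and the rewrite buffer. The model supports only set commands and is not how Redis is implemented.

```python
class MiniServer:
    """Toy model of an AOF rewrite with a rewrite buffer; not Redis code."""

    def __init__(self):
        self.data = {}
        self.aof_file = []         # stands in for the current AOF file
        self.aof_rewrite_buf = []  # writes that arrive during a rewrite
        self.snapshot = None
        self.rewriting = False

    def set(self, key, value):
        self.data[key] = value
        command = ("set", key, value)
        self.aof_file.append(command)  # the normal AOF path keeps working
        if self.rewriting:
            self.aof_rewrite_buf.append(command)

    def start_rewrite(self):
        # Stands in for fork(): the child sees the data as of this moment.
        self.snapshot = dict(self.data)
        self.aof_rewrite_buf = []
        self.rewriting = True

    def finish_rewrite(self):
        # Child: one command per key in the snapshot.
        new_aof = [("set", k, v) for k, v in self.snapshot.items()]
        # Parent: append what was written while the child was working.
        new_aof.extend(self.aof_rewrite_buf)
        self.aof_file = new_aof  # the new file replaces the old one
        self.snapshot = None
        self.aof_rewrite_buf = []
        self.rewriting = False
        return new_aof


def replay(commands):
    """Rebuild the data set by executing the logged commands in order."""
    state = {}
    for _, key, value in commands:
        state[key] = value
    return state
```

Without the rewrite buffer, any key written between start_rewrite and finish_rewrite would be missing from the new file. With it, replaying the new file reproduces the server's current data.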

(4) Loading at startup
When AOF is enabled, Redis loads the AOF file first to recover data. The RDB file is loaded to recover data only when AOF is disabled.
When AOF is enabled but the AOF file does not exist, the RDB file is not loaded even if it exists.
When Redis loads the AOF file, it verifies the file. If the file is damaged, an error is printed in the log and Redis fails to start. However, if only the end of the AOF file is incomplete (for example, after sudden machine downtime) and the aof-load-truncated parameter is enabled, a warning is written to the log, Redis ignores the incomplete tail of the AOF file, and startup succeeds. The aof-load-truncated parameter is enabled by default.
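
The loading priority described above can be summarized as a small decision function. choose_startup_file is a hypothetical helper that follows the rules exactly as stated in this article. Actual behavior may differ between Redis versions, so treat it as a reading aid rather than a specification.

```python
import os


def choose_startup_file(appendonly, aof_path, rdb_path):
    """Return (kind, path) for the file to load, or (None, None) if none."""
    if appendonly:
        # AOF has priority. Per the rules above, a missing AOF file does
        # not cause a fallback to the RDB file.
        if os.path.exists(aof_path):
            return ("aof", aof_path)
        return (None, None)
    if os.path.exists(rdb_path):
        return ("rdb", rdb_path)
    return (None, None)
```

One practical consequence of the second rule: enabling appendonly on an instance that has only an RDB file, and then restarting it, would start with an empty data set under these rules.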

II. Advantages and disadvantages of RDB and AOF

1. Advantages and disadvantages of RDB persistence

Advantages: RDB files are compact and small, fast to transfer over the network, and well suited to full replication. Recovery is much faster than with AOF. One of the most important advantages of RDB over AOF is its relatively small impact on performance.
Disadvantages: the fatal disadvantage of RDB is that persisting by snapshot makes real-time persistence impossible. Now that data matters more and more, losing a large amount of it is often unacceptable, which is why AOF persistence has become the mainstream. In addition, RDB files must follow a specific format and have poor compatibility (for example, an old version of Redis cannot read RDB files written by a newer version).
With RDB persistence, the Redis main process blocks while bgsave performs the fork, and the child process also adds IO pressure when it writes data to the hard disk.

2. Advantages and disadvantages of AOF persistence

In contrast to RDB persistence, AOF offers second-level persistence and good compatibility, at the cost of larger files, slower recovery, and a greater impact on performance.
With AOF persistence, data is written to the hard disk far more often (every second under the everysec policy), so the IO pressure is higher and AOF appends may even block.
Rewriting the AOF file is similar to bgsave for RDB: there is blocking during the fork and IO pressure from the child process. Because AOF writes data to the hard disk more frequently, it has a greater impact on the performance of the Redis main process.

Redis performance management
9.1 View Redis memory usage

redis-cli -h 192.168.184.10 -p 6379
info memory
9.2 Memory fragmentation rate
The memory fragmentation rate is the memory allocated by the operating system (used_memory_rss) divided by the memory used by Redis (used_memory). Memory fragmentation is caused by the operating system allocating and reclaiming physical memory inefficiently (non-contiguous physical memory allocation).
Tracking the memory fragmentation rate is very important for understanding the resource performance of a Redis instance:
A memory fragmentation rate slightly greater than 1 is reasonable. It indicates that fragmentation is relatively low.
A memory fragmentation rate above 1.5 indicates that Redis consumes 150% of the physical memory it actually needs, of which 50% is fragmentation. In this case, run the shutdown save command in redis-cli and then restart the Redis server.
A memory fragmentation rate below 1 indicates that Redis has been allocated more memory than the physical memory available and the operating system is swapping. Either add physical memory or reduce the memory used by Redis.
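
The calculation and the three ranges above can be sketched as follows. The helper names parse_info, fragmentation_ratio, and classify are hypothetical. The sample text imitates the "field:value" layout of info memory output, and its numbers are made up for illustration.

```python
def parse_info(text):
    """Parse "field:value" lines, skipping blank lines and "# Section" headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        fields[name] = value
    return fields


def fragmentation_ratio(used_memory_rss, used_memory):
    """Memory allocated by the OS divided by memory used by Redis."""
    return used_memory_rss / used_memory


def classify(ratio):
    """Map a ratio to the three ranges described in the text."""
    if ratio < 1.0:
        return "swapping: add physical memory or reduce Redis memory usage"
    if ratio > 1.5:
        return "fragmented: run shutdown save, then restart the server"
    return "healthy"


if __name__ == "__main__":
    sample = "# Memory\nused_memory:1000000\nused_memory_rss:1600000\n"  # made-up values
    info = parse_info(sample)
    ratio = fragmentation_ratio(int(info["used_memory_rss"]), int(info["used_memory"]))
    print(ratio, classify(ratio))
```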

9.3 Memory usage
If the memory usage of a Redis instance exceeds the maximum available memory, the operating system starts swapping between memory and swap space.
Ways to avoid memory swapping:
(1) Size the Redis instance according to the amount of data to be cached
(2) Store data in hash structures wherever possible
(3) Set an expiration time on keys

9.4 Key eviction
Key eviction ensures that the limited memory resources of Redis are allocated sensibly.
When memory usage reaches the configured maximum threshold, a key eviction policy is applied. The default policy is noeviction, which does not delete any keys.

Modify the maxmemory-policy value in the configuration file:

vim /etc/redis/6379.conf

#----Line 598: uncomment----
maxmemory-policy noeviction

volatile-lru 	: Use the LRU algorithm to evict keys from the set of keys that have an expiration time
volatile-ttl 	: Evict the keys closest to expiring from the set of keys that have an expiration time
volatile-random 	: Evict random keys from the set of keys that have an expiration time
allkeys-lru 		: Use the LRU algorithm to evict keys from the whole data set
allkeys-random 	: Evict random keys from the whole data set
noeviction 		: Do not evict any data
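
To make the difference between noeviction and allkeys-lru concrete, here is a small sketch built on Python's OrderedDict. TinyCache is a hypothetical class. It limits the number of keys instead of bytes of memory and applies exact LRU, whereas Redis limits memory through maxmemory and, to my understanding, approximates LRU by sampling keys. The sketch shows the idea of the policies, not Redis's implementation.

```python
from collections import OrderedDict


class TinyCache:
    """Toy key store with two eviction policies; not Redis code."""

    def __init__(self, max_keys, policy="noeviction"):
        if policy not in ("noeviction", "allkeys-lru"):
            raise ValueError("unsupported policy: %r" % policy)
        self.max_keys = max_keys
        self.policy = policy
        self.items = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # reading a key marks it as recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items[key] = value
            self.items.move_to_end(key)
            return
        if len(self.items) >= self.max_keys:
            if self.policy == "noeviction":
                # New writes are refused instead of deleting existing data.
                raise MemoryError("limit reached and eviction is disabled")
            self.items.popitem(last=False)  # drop the least recently used key
        self.items[key] = value
```

With allkeys-lru the cache keeps accepting writes and silently drops cold keys, which suits a cache. With noeviction nothing is lost, but writes start failing once the limit is reached.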
