How does Redis master-slave replication work? Do you know how it synchronizes data while maintaining high performance?
Note that the following is based on Redis 5, the latest version. In this version the term "slave" and the related configuration items have been officially renamed to "replica". Both words refer to the same thing: the slave node.
Basic process of master-slave replication
# Master-Replica replication. Use replicaof to make a Redis instance a copy of
# another Redis server. A few things to understand ASAP about Redis replication.
#
#   +------------------+      +---------------+
#   |      Master      | ---> |    Replica    |
#   | (receive writes) |      |  (exact copy) |
#   +------------------+      +---------------+
#
# 1) Redis replication is asynchronous, but you can configure a master to
#    stop accepting writes if it appears to be not connected with at least
#    a given number of replicas.
# 2) Redis replicas are able to perform a partial resynchronization with the
#    master if the replication link is lost for a relatively small amount of
#    time. You may want to configure the replication backlog size (see the next
#    sections of this file) with a sensible value depending on your needs.
# 3) Replication is automatic and does not need user intervention. After a
#    network partition replicas automatically try to reconnect to masters
#    and resynchronize with them.
#
# replicaof
The basic replication process, as seen from a replica:
- While the connection between the master and the replica is stable, the master keeps performing incremental synchronization (partial resync) by sending incremental data to the replica. After receiving the data, the replica updates its own dataset and, once per second, reports how much it has processed to the master with REPLCONF ACK.
- If the replica is disconnected from the master and then reconnects, it sends a PSYNC command to the master. If the conditions are met (for example, the replication ID it refers to is a known history and the backlog still holds enough data), incremental synchronization (partial resync) simply continues. Otherwise, the master triggers a full synchronization (full resync) of the replica.
From the basic process above, we can see that a network problem may lead to a full synchronization (full resync), which seriously slows down the replica's progress in catching up with the master.
So how can this be solved?
It can be approached from two aspects: the master-slave response time strategy and the master-slave space accumulation strategy.
Master-slave response time strategy
- 1. The master pings its replicas every repl-ping-replica-period seconds (default 10), so that each side can notice when the other has stopped responding.
- 2. repl-timeout sets the replication timeout between the replica (slave) and the master. The default value is 60s. It applies in three cases:
- a) From the replica's perspective: no RDB data from the master arrives during the bulk transfer of a full synchronization.
- b) From the replica's perspective: no data or pings arrive from the master.
- c) From the master's perspective: no REPLCONF ACK pings (which carry the replication offset) arrive from the replica.
When Redis detects that repl-timeout has expired (60s by default), it closes the master-slave connection, and the replica then initiates a request to re-establish it.
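The timeout rules above can be modeled in a few lines of Python. This is only an illustrative sketch, not how Redis implements it; the function names check_timeout_config and link_timed_out are hypothetical. It relies on the note in redis.conf that repl-timeout should be greater than repl-ping-replica-period, otherwise a timeout is detected whenever traffic is low.

```python
def check_timeout_config(repl_timeout: int = 60, ping_period: int = 10) -> bool:
    """Return True when repl-timeout is larger than repl-ping-replica-period.

    Hypothetical helper: if the timeout were not larger than the ping period,
    a quiet link could be declared dead between two pings.
    """
    return repl_timeout > ping_period


def link_timed_out(last_io_time: float, now: float, repl_timeout: int = 60) -> bool:
    """Return True when nothing was received from the peer for more than
    repl_timeout seconds (simplified model of cases a, b and c)."""
    return (now - last_io_time) > repl_timeout
```

For example, link_timed_out(last_io_time=0, now=61) is true with the default 60s timeout, at which point the connection would be closed and the replica would reconnect.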
Master-slave space accumulation strategy
After the master accepts a write, it writes the data to the replication buffer (mainly used as the transmission buffer for master-slave replication) and also to the replication backlog.
When the replica disconnects and reconnects, it sends PSYNC (including the replication ID and the offset it has processed so far). If the matching history can still be found in the replication backlog, an incremental synchronization (partial resync) is triggered; otherwise the master performs a full synchronization (full resync) of the replica.
# Set the replication backlog size. The backlog is a buffer that accumulates
# replica data when replicas are disconnected for some time, so that when a replica
# wants to reconnect again, often a full resync is not needed, but a partial
# resync is enough, just passing the portion of data the replica missed while
# disconnected.
#
# The bigger the replication backlog, the longer the time the replica can be
# disconnected and later be able to perform a partial resynchronization.
#
# The backlog is only allocated once there is at least a replica connected.
#
# repl-backlog-size 1mb
Relevant parameters of the replication backlog:
# Incremental synchronization window
repl-backlog-size 1mb
repl-backlog-ttl 3600
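To make the role of the backlog concrete, here is a toy Python model of a fixed-size backlog and the partial-or-full decision. It is a sketch under simplifying assumptions (offsets start at 0 and the buffer is a plain bytearray); the class ReplicationBacklog and its methods are hypothetical and do not mirror Redis internals.

```python
class ReplicationBacklog:
    """Toy model of a fixed-size replication backlog (not Redis's real structure)."""

    def __init__(self, size: int) -> None:
        self.size = size          # capacity in bytes, in the spirit of repl-backlog-size
        self.buf = bytearray()    # most recent bytes of the replication stream
        self.master_offset = 0    # total bytes ever written to the stream

    def feed(self, data: bytes) -> None:
        """Append replicated bytes, discarding the oldest ones beyond capacity."""
        self.buf += data
        self.master_offset += len(data)
        if len(self.buf) > self.size:
            del self.buf[: len(self.buf) - self.size]

    def first_offset(self) -> int:
        """Offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf)

    def psync(self, replica_offset: int):
        """Return ('partial', missing_bytes) if the backlog still covers the
        replica's offset, otherwise ('full', None)."""
        if self.first_offset() <= replica_offset <= self.master_offset:
            start = replica_offset - self.first_offset()
            return "partial", bytes(self.buf[start:])
        return "full", None
```

In this model a replica that falls further behind than the backlog capacity always gets a full resync, so a rough sizing rule is that the backlog should hold at least the write volume produced during the longest disconnection you want to tolerate.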
Full synchronization (full resync) workflow
Workflow of full synchronization:
- Replica sends PSYNC.
(it is assumed that the conditions for full synchronization are met)
- The master handles the full synchronization with BGSAVE: it forks a child process that writes the snapshot dump.rdb. At the same time, the master starts buffering all new write commands received from clients.
- The master transmits the RDB data to the replica over the network.
- The replica saves the RDB data to disk and then loads it into memory (it deletes the old data and blocks while loading the new data).
(followed by incremental synchronization)
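The workflow above can be sketched as a small Python simulation. It is a deliberately simplified model: plain dicts stand in for the datasets, a dict copy stands in for the RDB snapshot, and the function name full_resync is hypothetical.

```python
def full_resync(master_data: dict, writes_during_transfer: list) -> dict:
    """Toy model of a full resync: snapshot, transfer, then replay buffered writes."""
    snapshot = dict(master_data)           # point-in-time copy, like the RDB from BGSAVE
    buffered = []                          # writes that arrive while the snapshot is in flight
    for key, value in writes_during_transfer:
        master_data[key] = value           # the master keeps serving writes
        buffered.append((key, value))      # and buffers them for the replica
    replica_data = {}                      # the replica discards its old data
    replica_data.update(snapshot)          # then loads the snapshot
    for key, value in buffered:            # finally the buffered writes are replayed
        replica_data[key] = value
    return replica_data
```

After the call, the returned replica dataset equals the master's dataset, including the writes that arrived during the transfer, which is the point at which incremental synchronization takes over.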
If the master's disk is slow and the network bandwidth is good, diskless mode can be used (note that this is experimental):
# Default is no; set to yes to enable diskless mode
repl-diskless-sync no
repl-diskless-sync-delay 5
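By default, a replica can keep serving requests (possibly with stale data) during full synchronization or while it is disconnected from the master; this is controlled by replica-serve-stale-data.
During the time window in which the replica loads the RDB data into memory, however, it blocks client connections.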
Allow writes only with N attached replicas
The master uses asynchronous replication by default, which means it does not wait for replicas before replying to a client's write. It can, however, be configured to accept a write only when at least N replicas are connected with a lag of at most M seconds; otherwise it returns an error to the client.
# Not enabled by default
min-replicas-to-write
min-replicas-max-lag
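The check described above can be sketched as follows. This is a simplified model with a hypothetical function name, can_accept_write; the lag values correspond to the lag field that INFO replication reports per replica. Per redis.conf, setting either option to 0 disables the feature.

```python
def can_accept_write(replica_lags, min_replicas_to_write, min_replicas_max_lag):
    """Return True if the master would accept a write in this toy model.

    replica_lags: seconds since the last acknowledgement from each connected replica.
    """
    if min_replicas_to_write == 0 or min_replicas_max_lag == 0:
        return True  # feature disabled
    good = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write
```

For instance, with lags of 1s and 30s, a requirement of 2 replicas within 10s is not met, so the write would be refused with an error.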
In addition, the client can use the WAIT command. It works like an ACK mechanism: it blocks until the specified number of replicas have acknowledged the preceding writes or the timeout expires, and returns the number of replicas that acknowledged them.
127.0.0.1:9001> set a x
OK
127.0.0.1:9001> wait 1 1000
1
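The semantics of WAIT can be approximated with a small polling loop. This is only a model of the behavior, not a client call: wait_for_acks and the get_acked_offsets callback are hypothetical, and unlike the real command a timeout of 0 here returns immediately instead of blocking forever.

```python
import time
from typing import Callable, List


def wait_for_acks(get_acked_offsets: Callable[[], List[int]], write_offset: int,
                  numreplicas: int, timeout_ms: int) -> int:
    """Poll until numreplicas replicas have acknowledged write_offset or the
    timeout expires; return how many replicas acknowledged it."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        acked = sum(1 for off in get_acked_offsets() if off >= write_offset)
        if acked >= numreplicas or time.monotonic() >= deadline:
            return acked
        time.sleep(0.001)  # avoid spinning while waiting for new acknowledgements
```

As with the real command, the return value can be smaller than the requested number, so the caller has to compare it with numreplicas to know whether the write was sufficiently replicated.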
The replication ID mainly identifies the history of the dataset that comes from the current master.
There are two replication IDs: master_replid and master_replid2.
127.0.0.1:9001> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=9011,state=online,offset=437,lag=1
master_replid:9ab608f7590f0e5898c4574299187a52ad0db7ec
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:437
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:437
When the master goes down and one of the replicas is promoted to master, the new master starts a new era and generates a new replication ID.
At the same time, the old replication ID is kept in master_replid2:
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=9021,state=online,offset=34874,lag=0
slave1:ip=127.0.0.1,port=9001,state=online,offset=34741,lag=0
master_replid:dfa343264a79179c1061f8fb81d49077db8e4e5f
master_replid2:9ab608f7590f0e5898c4574299187a52ad0db7ec
master_repl_offset:34874
second_repl_offset:6703
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:34874
In this way, when the other replicas connect to the new master, they do not need another full synchronization. They can continue incremental synchronization from the old history and then move on to the data of the new era.
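The decision a promoted master makes for a reconnecting replica can be sketched with the fields shown in the INFO output above. This is a simplified model, not the Redis source: the function psync_decision is hypothetical, and it assumes the old replication ID is only honored up to second_repl_offset.

```python
def psync_decision(master: dict, replid: str, offset: int) -> str:
    """Return 'partial' or 'full' for a PSYNC carrying (replid, offset)."""
    same_history = replid == master["master_replid"]
    # The previous ID is only valid up to the offset where the new era began.
    old_history = (replid == master["master_replid2"]
                   and offset <= master["second_repl_offset"])
    if not (same_history or old_history):
        return "full"
    first = master["repl_backlog_first_byte_offset"]
    if first <= offset <= first + master["repl_backlog_histlen"]:
        return "partial"
    return "full"


new_master = {
    "master_replid": "dfa343264a79179c1061f8fb81d49077db8e4e5f",
    "master_replid2": "9ab608f7590f0e5898c4574299187a52ad0db7ec",
    "second_repl_offset": 6703,
    "repl_backlog_first_byte_offset": 1,
    "repl_backlog_histlen": 34874,
}
```

With these values, a replica that still presents the old ID with an offset no greater than 6703 gets a partial resync, while a replica presenting an unknown ID falls back to a full resync.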
How does a replica handle expired keys?
- A replica does not actively delete expired keys. It deletes a key only when the master, having expired the key (on access or through active expiration) or evicted it under a memory eviction policy such as LRU, synthesizes a DEL command and sends it to the replica.
- Because of this there is a time lag. To cover it, the replica uses its logical clock internally: when a client attempts to read a key that has already expired, the replica reports that the key does not exist.
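The two rules above can be illustrated with a toy replica store. It is a sketch only: the class ReplicaStore and its methods are hypothetical, and the time is passed in explicitly to stand in for the replica's logical clock.

```python
class ReplicaStore:
    """Toy model of how a replica treats expired keys."""

    def __init__(self) -> None:
        self.data = {}  # key -> (value, expire_at or None)

    def apply_set(self, key, value, expire_at=None) -> None:
        """Apply a write replicated from the master."""
        self.data[key] = (value, expire_at)

    def apply_del(self, key) -> None:
        """Apply a DEL synthesized by the master; the only way a key is removed."""
        self.data.pop(key, None)

    def get(self, key, now):
        """Read a key, hiding it if it is logically expired at time `now`."""
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expire_at = entry
        if expire_at is not None and now >= expire_at:
            return None  # reported as missing, but not deleted locally
        return value
```

A read after the expiry time returns nothing even though the key is still stored; it only disappears from the store once the master's DEL arrives.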