Redis replication

Time: 2021-7-27

1、 Replication introduction

Master-slave replication is a mechanism for copying data from one Redis server to other servers. The former is called the master and the latter are called slaves.

Main functions of master-slave replication:

  1. Data redundancy: replication provides a hot standby of the data, that is, a backup across multiple machines.
  2. Fault recovery: when the master node has problems, a slave node can provide services, which is a kind of service redundancy.
  3. Load balancing: the master node handles writes while the slave nodes serve reads, so the read pressure can be distributed across multiple slave nodes.
  4. High availability cornerstone: master-slave replication is the basis of Sentinel and Cluster.

By default, each Redis server is a master node. Each master can have multiple slave nodes, but a slave node can only have one master node.

2、 Replication configuration

2.1 create replication

2.1.1 command

  1. Add slaveof {masterhost} {masterport} to the configuration file.
  2. Append --slaveof {masterhost} {masterport} to the redis-server startup command.
  3. Run the command directly from a client: slaveof {masterhost} {masterport}

2.1.2 demonstration

Prepare the nodes

The default port 6379 is used for the master node, and another node on port 6380 is started as the slave node.

Copy the configuration file to a new file named redis6380.conf.

Configure the new port number in it.

Start the two instances. Both Redis instances are now running.

Execute the replication command

With the two instances started, execute the slaveof command on the 6380 node.

View the effect

The master node writes and reads data.

The slave node reads the data.

At this point replication has been set up successfully: the data has been copied from the master to the slave and read back through the slave.
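The preparation step (copy the configuration file, change the port, point the node at its master) can be sketched as a small helper. This is only an illustration under assumptions: the function name make_replica_config, the 127.0.0.1 host and the sample base configuration are hypothetical, and only the port and slaveof directives are touched.

```python
def make_replica_config(master_conf: str, port: int,
                        master_host: str, master_port: int) -> str:
    """Derive a slave's configuration text from the master's configuration text."""
    lines = []
    for line in master_conf.splitlines():
        parts = line.split()
        # Drop existing port/slaveof directives; they are re-added below.
        if parts and parts[0].lower() in ("port", "slaveof"):
            continue
        lines.append(line)
    lines.append(f"port {port}")
    lines.append(f"slaveof {master_host} {master_port}")
    return "\n".join(lines) + "\n"


# Hypothetical base configuration of the 6379 master.
base_conf = "port 6379\ndaemonize yes\n"
replica_conf = make_replica_config(base_conf, 6380, "127.0.0.1", 6379)
print(replica_conf)
```

The resulting text is what would be saved as redis6380.conf before starting the second instance.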

2.2 break replication

2.2.1 direct disconnection

Use the slaveof no one command directly to break the replication relationship with the master.

After the replication relationship is broken:

  1. The data already replicated is retained.
  2. Data subsequently written to the master is no longer synchronized.

You can see that the original data is retained.

Write in the original master after disconnection

Data will no longer be synchronized to 6380

2.2.2 switching to another master

You can also break the relationship with the current master by switching to another master. Unlike slaveof no one, however, after switching to a new master, the data replicated from the original master is cleared.

3、 Topology

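Replication topologies can be one-to-one (one master, one slave), one-to-many (one master, several slaves), or a tree structure (a slave also acts as the master of other slaves).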

4、 Replication process

4.1 save master node information

After slaveof is executed, the slave node only saves the address information of the master node and returns directly. The replication process has not officially started.

4.2 establishing socket connection between master and slave

The slave node processes the relevant logic through the scheduled tasks running every second. When the scheduled task finds that there is a new master node, it will try to establish a network connection with the master node.

The slave node will create a socket to connect to the master node, and subsequent data synchronization is based on this socket.

If the slave node cannot establish a connection, the scheduled task will retry indefinitely until the connection is successful or the replication is cancelled.

4.3 send ping command

After the connection is established successfully, the slave node sends a ping request to the master node as the first communication. The main purposes are as follows:

  1. Check whether the previously established socket is usable.
  2. Check whether the master node can currently accept and process commands.

If the slave node does not receive a reply to the ping command or the reply times out (for example because of a network timeout, or because the master node is blocked and cannot respond), the slave node closes the replication connection and reconnects on the next run of the scheduled task.

4.4 authentication

If the master node sets the requirepass parameter, password authentication is required. The slave node must set its masterauth parameter to the same password as the master node in order to pass authentication. If authentication fails, the slave node closes the replication connection and reconnects on the next run of the scheduled task.

4.5 data synchronization

After the master-slave replication connection is established successfully, data synchronization starts. For a first replication this is data initialization: the master node sends all the data it holds to the slave node. It is implemented by the slave node sending the PSYNC command to the master node (the SYNC command before version 2.8). This is the most time-consuming step, and it is divided into full synchronization and partial synchronization.

4.6 continuous command propagation

After the master node has synchronized its current data to the slave node, the replication establishment process is complete. From then on, the master node keeps sending write commands to the slave node to keep the master and slave nodes consistent.
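The order of the checks in steps 4.2 to 4.5 can be summarized in a short sketch. It is not Redis code: FakeMaster, its fields and the state names returned by replication_cron are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeMaster:
    """Hypothetical in-memory stand-in for a master node."""
    reachable: bool = True            # can a socket be established?
    blocked: bool = False             # a blocked master does not answer ping
    password: Optional[str] = None    # plays the role of requirepass

    def ping(self) -> bool:
        return not self.blocked

    def auth(self, password: Optional[str]) -> bool:
        return self.password is None or self.password == password


def replication_cron(master: FakeMaster, masterauth: Optional[str]) -> str:
    """One run of the slave's scheduled task; returns the state it reached."""
    if not master.reachable:
        return "connect_failed"   # 4.2: retried on the next run
    if not master.ping():
        return "ping_failed"      # 4.3: disconnect, retry on the next run
    if not master.auth(masterauth):
        return "auth_failed"      # 4.4: disconnect, retry on the next run
    return "sync"                 # 4.5: data synchronization can start


print(replication_cron(FakeMaster(password="s3cret"), masterauth="s3cret"))
```

Every failure state leads back to the same place: the next run of the scheduled task starts again from the connection attempt.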

5、 Data synchronization principle

After the master-slave connection is successfully established, the slave node will send PSYNC command to the master node to complete data synchronization. The synchronization process is divided into full replication and partial replication.

  1. Full replication: generally used for the initial replication. The master node sends all of its data to the slave node at once, which is a relatively heavy operation.
  2. Partial replication: used to handle data loss caused by a brief network interruption during master-slave replication. When the master and slave reconnect, if the master node still holds the data written during the interruption, it resends the missing data to the slave node. The resent data is much smaller than the full data set, so partial replication avoids the high overhead of full replication.

5.1 components required by PSYNC command

The PSYNC command requires the support of the following components:

  1. Master-slave replication offset
  2. Master node replication backlog buffer
  3. Master node run ID

5.1.1 master-slave replication offset

After the master node processes a write command, it adds the byte length of the command to its replication offset.

After the slave node receives a command sent by the master node, it adds the same byte length to its own offset.

By comparing the master offset with the slave offset, we can see how far the slave's data is behind the master's.
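A minimal sketch of this bookkeeping, assuming a hypothetical Node class and plain inline command bytes (the real byte counts depend on how commands are encoded on the wire):

```python
class Node:
    """Hypothetical node that only tracks its replication offset."""

    def __init__(self) -> None:
        self.offset = 0

    def feed(self, command: bytes) -> None:
        # Both master and slave add the byte length of each command.
        self.offset += len(command)


commands = [b"SET k1 v1\r\n", b"SET k2 v2\r\n", b"INCR counter\r\n"]

master, slave = Node(), Node()
for command in commands:
    master.feed(command)
for command in commands[:2]:      # the slave has not received the last command yet
    slave.feed(command)

lag_bytes = master.offset - slave.offset
print(master.offset, slave.offset, lag_bytes)
```

The difference between the two offsets is exactly the number of bytes the slave still has to receive.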

5.1.2 master node replication backlog buffer

The replication backlog buffer is a fixed-length queue stored on the master node. The default size is 1MB. It is created when a slave connects. When the master node handles a write command, it not only sends the command to the slave node but also writes it to the replication backlog buffer.

The backlog buffer is a first-in, first-out queue. If the capacity is exceeded, the oldest data is overwritten. The size is configurable. The backlog buffer exists mainly to support partial replication. You can view it with info replication:

repl_backlog_active:1 // the replication backlog buffer is enabled
repl_backlog_size:1048576 // maximum buffer length
repl_backlog_first_byte_offset:4505 // start offset, used to calculate the available range of the current buffer
repl_backlog_histlen:5460 // effective length of the saved data
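The behaviour of the backlog can be imitated with a toy fixed-size FIFO. This is a sketch, not the Redis implementation: the class ReplBacklog is hypothetical, and it assumes that the first byte ever written has offset 1, so that the first offset still held equals the last written offset minus histlen plus 1. Under that assumption, the sample output above (first byte offset 4505, histlen 5460) would cover offsets 4505 to 9964.

```python
from typing import Optional


class ReplBacklog:
    """Toy fixed-size FIFO; property names mirror the info replication fields."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0   # offset of the last byte written so far

    def feed(self, data: bytes) -> None:
        self.buf += data
        self.master_offset += len(data)
        overflow = len(self.buf) - self.size
        if overflow > 0:
            del self.buf[:overflow]   # the oldest bytes fall out of the queue

    @property
    def histlen(self) -> int:
        return len(self.buf)

    @property
    def first_byte_offset(self) -> int:
        return self.master_offset - self.histlen + 1

    def read_from(self, offset: int) -> Optional[bytes]:
        """Bytes from `offset` onward, or None if that offset is outside the buffer."""
        if not self.first_byte_offset <= offset <= self.master_offset + 1:
            return None
        return bytes(self.buf[offset - self.first_byte_offset:])


backlog = ReplBacklog(size=8)
backlog.feed(b"abcdef")
backlog.feed(b"ghij")   # 10 bytes written in total, only the last 8 are kept
print(backlog.first_byte_offset, backlog.histlen, backlog.read_from(9))
```

A request for an offset that has already been overwritten returns None, which is the situation in which partial replication is no longer possible.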

5.1.3 master node run ID

After each Redis node (master or slave) starts, it generates a 40-character hexadecimal string as its run ID, which uniquely identifies the node. The slave node saves the run ID of the master node to identify which master node it is replicating. The run ID changes after Redis is restarted.

During the initial replication, the slave node saves the runid of the master node.

5.2 PSYNC command

The slave node sends the PSYNC command to the master node to request partial replication or full replication. The command format is:

psync {runid} {offset}
  1. runid: the run ID of the master node being replicated
  2. offset: the current replication offset of the slave node

During the first replication, the slave node has neither an offset nor the runid of the master node, so it sends psync ? -1.
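How a slave chooses between the two forms of the command can be sketched as follows. The helper build_psync is hypothetical, and the offset is passed through as described in this article, without modelling any further wire-level detail.

```python
from typing import Optional


def build_psync(runid: Optional[str], offset: Optional[int]) -> str:
    """Return the PSYNC command a slave would send, given what it has saved."""
    if runid is None or offset is None:
        # First replication: no master runid and no offset are known yet.
        return "PSYNC ? -1"
    return f"PSYNC {runid} {offset}"


print(build_psync(None, None))
print(build_psync("a" * 40, 4505))   # placeholder 40-character runid
```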

5.3 full replication

  1. For the first replication, the slave sends psync ? -1.
  2. The master determines from the request that full replication is needed and replies +FULLRESYNC {runid} {offset}.
  3. The slave saves the runid and offset from the master node's response.
  4. The master node executes bgsave and, at the same time, writes the commands received from that point on to a buffer (the replication buffer).
  5. After bgsave finishes, the master node sends the RDB file to the slave.
  6. After receiving the RDB file sent by the master node, the slave clears its own data.
  7. The slave loads the master node's RDB file. At this point, the slave's data is updated to the state the master node was in when it executed bgsave.
  8. The master node sends the commands in the replication buffer to the slave.
  9. The slave executes the commands received from the replication buffer. The slave's data is now updated to the latest state of the master node.
  10. If the slave has AOF enabled, it immediately runs bgrewriteaof so that the AOF persistence file is usable right after full replication.
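The data flow of the steps above (snapshot, buffered writes, clear, load, replay) can be imitated with plain dictionaries. This is a simplified sketch: full_resync is a hypothetical function, the "RDB" is just a dictionary copy, and every buffered command is reduced to a key/value write.

```python
from typing import Dict, List, Tuple


def full_resync(master_data: Dict[str, str],
                writes_during_bgsave: List[Tuple[str, str]],
                slave_data: Dict[str, str]) -> None:
    # bgsave: a point-in-time snapshot of the master's data.
    rdb_snapshot = dict(master_data)

    # Writes arriving while the snapshot is produced and transferred are
    # applied on the master and also kept in the replication buffer.
    replication_buffer = []
    for key, value in writes_during_bgsave:
        master_data[key] = value
        replication_buffer.append((key, value))

    # The slave clears its own data, then loads the snapshot.
    slave_data.clear()
    slave_data.update(rdb_snapshot)

    # The slave replays the buffered commands to catch up with the master.
    for key, value in replication_buffer:
        slave_data[key] = value


master = {"a": "1"}
slave = {"stale": "x"}
full_resync(master, [("b", "2"), ("a", "3")], slave)
print(slave == master)
```

Without the replay of the replication buffer, the slave would stop at the state of the snapshot and miss the writes that arrived during bgsave.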

5.4 partial replication

Partial replication is an optimization for the excessive overhead of full replication, implemented with the psync {runid} {offset} command.

During master-slave replication, if an exception such as a brief network interruption or command loss occurs, the slave node asks the master node to resend the missing data. If that data (that is, the data the slave node did not receive while the network was disconnected) is still in the master node's replication backlog buffer, it is sent directly to the slave node. This restores consistency with the master node and avoids a costly full replication.

  1. When the network between the master and slave nodes is interrupted for longer than repl-timeout, the master node considers the slave node to have failed and closes the replication connection.
  2. While the master and slave are disconnected, the master node keeps handling commands. The new commands cannot be synchronized to the slave node, so the master and slave become inconsistent. As explained above, the master node writes these commands to the replication backlog buffer, which is 1MB by default; older data is overwritten once that size is exceeded.
  3. When the network is restored, the slave node connects to the master node again and the connection is established successfully.
  4. The slave node has saved the master node's runid and its own replication offset, and sends them to the master node with the psync {runid} {offset} command.
  5. If the master node finds that the conditions for partial replication are met (these conditions are explained below), it returns +CONTINUE to the slave node.
  6. The master node sends the data in the replication backlog buffer to the slave node starting from the offset, after which master-slave replication returns to the normal state.

The master node considers the conditions for partial replication met when:

  1. The runid sent by the slave node matches the master node's own runid.
  2. The data after the offset sent by the slave node is still in the master node's replication backlog buffer. For example, if the slave node sends offset 10 (meaning the data after 10 has not been synchronized) but the buffer starts at offset 15 (meaning the data before 15 is no longer in the buffer), partial replication is not possible.

If the partial replication conditions are not met, the master node returns +FULLRESYNC to the slave node, and the slave node starts full replication.
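The master's decision can be written as a small function. It is a sketch under the convention used in this article's example: the slave sends the offset it has reached, so the next byte it needs is offset + 1. The function name handle_psync and its parameters are hypothetical.

```python
def handle_psync(master_runid: str, backlog_first_offset: int, master_offset: int,
                 req_runid: str, req_offset: int) -> str:
    """Decide between partial and full replication for one PSYNC request."""
    next_needed = req_offset + 1
    runid_matches = req_runid == master_runid
    # The next needed byte must still be in the backlog (or the slave is already up to date).
    in_backlog = backlog_first_offset <= next_needed <= master_offset + 1
    if runid_matches and in_backlog:
        return "+CONTINUE"
    return f"+FULLRESYNC {master_runid} {master_offset}"


# The example from the text: the slave is at offset 10, the backlog starts at 15.
print(handle_psync("m1", 15, 30, "m1", 10))
# The slave is at offset 20, which the backlog still covers.
print(handle_psync("m1", 15, 30, "m1", 20))
```

A first-time request (runid "?" and offset -1) fails the runid comparison and therefore also ends in full replication.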

This shows that the size of the replication backlog buffer matters: if it is too small, the needed data is overwritten and partial replication fails after the master-slave network recovers. The value should be calculated from the expected network interruption time, the write QPS of the master node, and the size of the commands, and then set accordingly.
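A rough sizing calculation from those three inputs might look like this. The function backlog_size_bytes, the safety factor of 2 and the sample numbers are assumptions made for illustration, not values taken from Redis.

```python
import math


def backlog_size_bytes(disconnect_seconds: float, write_qps: float,
                       avg_command_bytes: float, safety_factor: float = 2.0) -> int:
    """Estimate a backlog size that survives an interruption of the given length."""
    written_during_outage = disconnect_seconds * write_qps * avg_command_bytes
    return math.ceil(written_during_outage * safety_factor)


# Hypothetical workload: 60 s outage, 1000 writes per second, 100 bytes per command.
needed = backlog_size_bytes(60, 1000, 100)
default_size = 1048576   # the 1MB default mentioned above
print(needed, needed > default_size)
```

For this hypothetical workload the estimate is far above the 1MB default, so a reconnect after such an outage would fall back to full replication unless the buffer is enlarged.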

6、 Master slave heartbeat

6.1 process

Schematic diagram of master-slave heartbeat detection

  1. The master pings the slave periodically. The period is controlled by the repl-ping-replica-period parameter. The default is 10 seconds.

  2. The slave sends the replconf ack {offset} command to the master every 1 second. This serves to:

    • Monitor the network status between the master and slave nodes in real time.

    • Report the slave's own replication offset. If the master node finds that the slave node is missing data, it fetches the data from its replication backlog buffer and sends it to the slave node.

    • Support the guarantee on the number and lag of slave nodes, which is defined by min-replicas-to-write (the minimum number of available slave nodes) and min-replicas-max-lag (the maximum lag allowed, in seconds; the lag of a healthy slave is generally 0 or 1).

      • If the master enables these two parameters and the number of available slave nodes is less than min-replicas-to-write, or their lag is greater than min-replicas-max-lag, the master rejects writes. The schematic diagram is as follows.
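The write check based on these two parameters can be sketched as below. The function can_accept_writes and the sample lag values are hypothetical; the lag of a slave is taken to be the number of seconds since its last replconf ack, and the check is treated as disabled when either parameter is 0.

```python
from typing import List


def can_accept_writes(replica_lags: List[int],
                      min_replicas_to_write: int,
                      min_replicas_max_lag: int) -> bool:
    """Return True if the master may accept a write under the two settings."""
    if min_replicas_to_write == 0 or min_replicas_max_lag == 0:
        return True   # the check is disabled
    good_replicas = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return good_replicas >= min_replicas_to_write


print(can_accept_writes([0, 1, 30], min_replicas_to_write=2, min_replicas_max_lag=10))
print(can_accept_writes([0, 30, 40], min_replicas_to_write=2, min_replicas_max_lag=10))
```

In the second call only one slave is within the allowed lag, so the master would reject the write.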

6.2 repl-timeout parameter

redis.conf has a repl-timeout parameter. A timeout is detected in the following cases:

  1. From the slave's perspective, no RDB snapshot data is received during a transfer within repl-timeout.

  2. From the slave's perspective, no data or ping is received from the master within repl-timeout.

  3. From the master's perspective, no replconf ack confirmation is received from the slave within repl-timeout.

When Redis detects a repl-timeout timeout (the default value is 60s), it closes the connection between the master and the slave, and the slave then requests a new master-slave connection. This value must be greater than the repl-ping-replica-period parameter.
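Both rules, the timeout test and the constraint between the two parameters, fit in a few lines. The function names timed_out and settings_ok are hypothetical, and timestamps are plain seconds.

```python
def timed_out(last_interaction: float, now: float, repl_timeout: float = 60.0) -> bool:
    """True if nothing was received from the peer for longer than repl-timeout."""
    return now - last_interaction > repl_timeout


def settings_ok(repl_timeout: float, repl_ping_replica_period: float) -> bool:
    """repl-timeout must be greater than the ping period, otherwise a healthy
    but idle link could be declared dead between two pings."""
    return repl_timeout > repl_ping_replica_period


print(timed_out(last_interaction=0.0, now=61.0))
print(settings_ok(repl_timeout=60, repl_ping_replica_period=10))
```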

To reduce master-slave delay, it is generally recommended to deploy the master and slave nodes of Redis in the same data center.

7、 Full replication scenarios

Full replication is very heavy and should be avoided as much as possible. The following situations lead to full replication.

  1. Establishing replication for the first time. This cannot be avoided; it is recommended to do it during off-peak hours.
  2. Runid mismatch. The slave node saves the runid of the master node. If the master node restarts, its runid changes; when it no longer matches the runid saved by the slave node, a full replication is performed. Restarting the master should therefore be avoided: for example, use the debug reload command, or use failover, where a slave node is promoted to master when the master node fails, or use a Sentinel or Cluster solution.
  3. Insufficient replication backlog buffer (repl-backlog-size). The default size of this buffer is 1MB; once it is exceeded, older data is overwritten. When the master-slave connection is interrupted and re-established, if the slave node's offset can no longer be found in the replication backlog buffer, full replication results. The size of this buffer should be calculated and configured based on network conditions, command size, and QPS.

8、 Some configurations and commands

  1. slave-read-only yes: the slave node is read-only. If data is modified on the slave node, the master and slave data become inconsistent.
  2. repl-disable-tcp-nodelay: whether to turn off TCP_NODELAY. The default is no, and it is recommended to configure it as yes. This is a TCP-level feature: with TCP_NODELAY turned off, TCP's Nagle algorithm merges small packets before sending, which saves bandwidth but adds replication delay.
  3. debug reload does not change the runid, but reloads the in-memory data from RDB.
  4. 。。。