Redis backup, disaster recovery and high availability

Redis is widely used across Internet architectures. Its excellent performance, ease of operation, and large number of proven use cases have attracted a lot of attention. This article introduces a disaster recovery solution for Redis in application scenarios that do not use a large distributed cluster. Let's read it together.

1、 Brief introduction to Redis

Redis is a high-performance key-value non-relational database. Its performance, together with support for high availability, persistence, multiple data structures, and clustering, has made it stand out as a widely used non-relational database.
Redis also fits many application scenarios.

  1. Session cache
    Caching sessions in Redis works well because Redis provides persistence. It can therefore support scenarios that need to keep a session for a long time, such as shopping carts, and give users a good shopping experience.
  2. Full page caching
    In WordPress, Pantheon provides a good plug-in, wp-redis, which can load pages you have already visited at the fastest speed.
  3. Queue
    Redis provides list and set operations, which makes Redis a good message queuing platform.

    We often restrict purchases through the queue function of Redis. For example, during holidays or promotions, some campaigns limit users' purchase behavior so that they can buy an item only a few times a day, or only once within a period of time. Redis is well suited to this.

  4. Ranking
    Redis handles incrementing and decrementing numbers in memory very well, so we use it in many ranking scenarios. For example, a novel website can rank its novels and recommend the top-ranked ones to users.
  5. Publish / subscribe
    Redis provides publish and subscribe functions, which fit many scenarios. For example, publish/subscribe events can trigger scripts or serve as the basis of a chat system.
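Returning to the purchase limit mentioned under the queue scenario, the following is a minimal sketch of one common way to enforce such a limit. It uses a per-user counter with an expiry rather than a queue, so it is not necessarily the method used in the original system. The `InMemoryCounters` class, the key layout, and the function name are hypothetical; the class only imitates the `incr` and `expire` calls that a Redis client exposes, so that the example runs without a server.

```python
class InMemoryCounters:
    """Hypothetical stand-in for a Redis client; it only mimics incr/expire."""

    def __init__(self):
        self.counts = {}
        self.ttl = {}

    def incr(self, key):
        # Like Redis INCR: treat a missing key as 0, add 1, return the new value.
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]

    def expire(self, key, seconds):
        # Only records the requested lifetime; a real server deletes the key later.
        self.ttl[key] = seconds
        return True


def allow_purchase(store, user_id, item_id, limit=1, window_seconds=86400):
    """Return True while the user is still within the limit for this window."""
    key = f"purchase:{item_id}:{user_id}"  # hypothetical key layout
    count = store.incr(key)
    if count == 1:
        # First purchase in this window: start the countdown.
        store.expire(key, window_seconds)
    return count <= limit


store = InMemoryCounters()
results = [allow_purchase(store, "user-1", "promo-item", limit=2) for _ in range(3)]
print(results)  # [True, True, False]
```

Because the decision depends entirely on the counter store being reachable, a limiter like this stops working when its only Redis node is down, which is the failure described in the next section.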

In addition, there are many other scenarios in which Redis performs well.

2、 Single point of failure in Redis

It is precisely because of these features and rich application scenarios that Redis is present in so many companies. Problems and risks follow. Despite the variety of scenarios, some companies are still conservative and deploy Redis as a single node, which creates risks for later maintenance.

In 2015, we dealt with a business interruption caused by a single point of failure. At that time, Redis was deployed as a single instance rather than in a distributed setup, and disaster recovery had not been considered.

We used the Redis server to control users' purchase behavior. For unknown reasons, the server hosting the Redis node went down, and we could no longer control that behavior. As a result, users were able to purchase discounted products many times within a period of time.

This downtime caused irreparable losses to the company, and the risk it exposed was very serious. As the operator of the system at the time, I had to fix the problem and improve the architecture, so I set out to solve the Redis single point of failure in non-distributed applications.

3、 Backup and disaster recovery of Redis applications in a non-distributed environment

Redis master-slave replication is very common now. There are two common master-slave replication architectures.

Common Redis master-slave replication

  • Scheme 1
    This is the most common architecture: one master node and two slave nodes. The client writes to the master and reads from the two slaves. This scales out reads and reduces the read load on the master node.
  • Scheme 2
    This architecture also has one master and two slaves. The difference is that the master and slave1 use keepalived to move a virtual IP (VIP) between them. The client connects to the master through the VIP, which avoids the IP change required in scheme 1.

Advantages and disadvantages of Redis master-slave replication

  • Advantages
  1. Once the master fails, a slave node can be promoted to the new master and continue to provide service in place of the old master.
  2. Reads scale out. Master-slave replication is generally used to scale reads: the master mainly handles writes, and the slaves handle reads.
  • Disadvantages
    Architecture scheme 1
    When the master fails, clients are disconnected from it and can no longer write. The slaves can no longer replicate from the master either.

At this point, the following operations are required (assuming that slave1 is promoted to master):

1) Run the slaveof no one command on slave1 to promote it to the new master node.
2) Make sure slave1 accepts writes, because in most cases a slave is configured as read-only.
3) Tell the client side (that is, the program connecting to Redis) the connection address of the new master node.
4) Configure slave2 to replicate data from the new master.
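The manual steps above can be written down as an ordered list of redis-cli invocations. The sketch below only builds the command strings and does not run them. The host addresses and the function name are made up for illustration, and `INFO replication` is used here as one possible way to check the node's role in step 2.

```python
def manual_failover_plan(new_master, other_slaves):
    """Return the redis-cli commands for a manual failover, in order.

    Addresses are (host, port) tuples; the values used below are made up.
    """
    host, port = new_master
    plan = [
        # Step 1: stop replicating, which promotes the node to master.
        f"redis-cli -h {host} -p {port} SLAVEOF NO ONE",
        # Step 2: inspect the node's role before sending writes to it.
        f"redis-cli -h {host} -p {port} INFO replication",
    ]
    # Step 3 (pointing clients at the new address) happens outside Redis.
    for slave_host, slave_port in other_slaves:
        # Step 4: re-point each remaining slave at the new master.
        plan.append(
            f"redis-cli -h {slave_host} -p {slave_port} SLAVEOF {host} {port}"
        )
    return plan


for command in manual_failover_plan(("10.0.0.2", 6379), [("10.0.0.3", 6379)]):
    print(command)
```

Every line of this plan has to be typed or scripted by an operator while the business is already affected, which is the cost the rest of the article tries to remove.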

Architecture scheme 2
When the master fails, the client can connect to slave1 for data operations, but slave1 then becomes a single point of failure, which is exactly what should be avoided.
After that, the following operations are required:

1) Run the slaveof no one command on slave1 to promote it to the new master node.
2) Make sure slave1 accepts writes, because in most cases a slave is configured as read-only.
3) Configure slave2 to replicate data from the new master.

Whichever architecture is used, failover requires human intervention. Manual intervention increases the operation and maintenance workload and also has a large impact on the business. In this case Sentinel, the high availability solution of Redis, can be used.

4、 Redis Sentinel

Redis Sentinel provides high availability for Redis. In practice, Redis Sentinel lets you build a Redis environment that can survive certain failures without human intervention.
Redis Sentinel is designed as a distributed system in which multiple Sentinel processes run and cooperate. Failure detection is performed when multiple Sentinels agree that the same master can no longer provide service, which reduces the possibility of false positives.

5、 Redis Sentinel functions

The main functions of Redis Sentinel in a Redis high availability solution are as follows:

  • Monitoring
    Sentinel constantly checks whether the master and the slaves are running as expected.
  • Notification
    Through an API, Sentinel can notify the system administrator, or another program, that a monitored Redis instance has failed.
  • Automatic failover
    If the master is not running as expected, Sentinel can start the failover process. One of the slaves is promoted to master, and the other slaves are reconfigured to use the new master. Applications using the Redis service are also informed of the new address to use when connecting.
  • Configuration provider
    Sentinel acts as the authoritative source for client service discovery: the client connects to Sentinel to obtain the address of the Redis master currently responsible for a given service. Sentinel reports the new address after a failover.
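The configuration provider role can be illustrated with the `SENTINEL get-master-addr-by-name` command, which a client sends to a Sentinel to learn the current master address. The sketch below speaks the Redis wire protocol directly with the Python standard library so that it has no dependencies; client libraries normally wrap this exchange for you. The function names are hypothetical, no authentication or TLS is handled, and the final lines parse a canned reply instead of contacting a live Sentinel.

```python
import io
import socket


def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    chunks = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode()
        chunks.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(chunks)


def read_master_addr(stream):
    """Parse the reply: a two-element array (ip, port), or a null array."""
    header = stream.readline().strip()
    if header == b"*-1":  # this Sentinel does not know the master name
        return None
    if header != b"*2":
        raise ValueError(f"unexpected reply: {header!r}")
    values = []
    for _ in range(2):
        length = int(stream.readline().strip()[1:])  # line looks like b"$9"
        values.append(stream.read(length + 2)[:-2].decode())  # drop trailing CRLF
    return values[0], int(values[1])


def discover_master(sentinels, master_name, timeout=0.5):
    """Ask each Sentinel in turn and return the first (ip, port) answer."""
    request = encode_command("SENTINEL", "get-master-addr-by-name", master_name)
    for address in sentinels:
        try:
            with socket.create_connection(address, timeout=timeout) as sock:
                sock.sendall(request)
                with sock.makefile("rb") as stream:
                    addr = read_master_addr(stream)
        except (OSError, ValueError):
            continue  # unreachable or unexpected reply: try the next Sentinel
        if addr is not None:
            return addr
    return None


# Offline demonstration with a canned reply instead of a live Sentinel.
canned = io.BytesIO(b"*2\r\n$9\r\n127.0.0.1\r\n$4\r\n6379\r\n")
print(read_master_addr(canned))  # ('127.0.0.1', 6379)
```

In a real deployment the call would look like `discover_master([("10.0.0.5", 26379), ("10.0.0.6", 26379)], "mymaster")`, where the addresses and the master name are placeholders for your own Sentinel configuration. Asking several Sentinels in turn means the client does not depend on any single Sentinel being up.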

6、 Redis Sentinel architecture

[Figure: Redis Sentinel architecture]

7、 Implementation principle of Redis Sentinel

The Sentinel cluster monitors both its own members and the Redis master-slave replication group. When the master node fails, the following steps are taken:

  • 1) The Sentinels elect a leader among themselves, and the elected leader performs the failover.
  • 2) The Sentinel leader selects one of the slave nodes as the new master node. The slave is chosen by the following rules, applied in order:

    a) Disconnection time from the master
    If a slave has been disconnected from the master for more than ten times the timeout configured in Sentinel, plus the time from when Sentinel determined that the master was unavailable to when Sentinel started the failover, the slave is considered unsuitable for promotion to master.
    b) Slave priority
    Each slave has a priority, which is stored in the redis.conf configuration file. The slave with the lowest priority value is preferred. If the priorities are the same, continue to the next rule.
    c) Replication offset
    The replication offset records how far the slave has replicated data from the master. The larger the offset, the more data the slave has received from the master. If the offsets are the same, continue to the next rule.
    d) Run ID
    Choose the slave with the smallest run ID as the new master.
    [Figure: flow chart of the slave election]

  • 3) The Sentinel leader runs slaveof no one on the slave selected in the previous step to promote it to the master node.
  • 4) The Sentinel leader sends commands to the remaining slaves to make them slaves of the new master node.
  • 5) The Sentinel leader demotes the original master to a slave. When it returns to normal operation, the Sentinel leader sends it a command to replicate from the new master.
    All of the above operations are completed by Sentinel without human intervention.
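The slave election in step 2 can be sketched as a filter followed by a sort. This is an illustration of rules a) to d) and not Sentinel's actual source code. The `Slave` dataclass, its field names, and the sample values are hypothetical. The sketch also skips slaves whose priority is 0, which Redis treats as "never promote".

```python
from dataclasses import dataclass


@dataclass
class Slave:
    name: str
    run_id: str
    priority: int              # lower value is preferred; 0 means never promote
    repl_offset: int           # how much of the master's data has been received
    link_down_seconds: float   # time disconnected from the master


def select_new_master(slaves, down_after_seconds, master_down_seconds):
    """Pick the slave to promote, or None if no slave is suitable."""
    # Rule a): skip slaves that were disconnected from the master for too long.
    max_link_down = down_after_seconds * 10 + master_down_seconds
    candidates = [
        s for s in slaves
        if s.priority != 0 and s.link_down_seconds <= max_link_down
    ]
    if not candidates:
        return None
    # Rules b), c), d): lowest priority value, then largest replication
    # offset, then smallest run ID.
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.run_id))


slaves = [
    Slave("slave1", "bbb", priority=100, repl_offset=500, link_down_seconds=0),
    Slave("slave2", "ccc", priority=100, repl_offset=900, link_down_seconds=0),
    Slave("slave3", "aaa", priority=100, repl_offset=900, link_down_seconds=0),
    Slave("slave4", "ddd", priority=10, repl_offset=100, link_down_seconds=500),
    Slave("slave5", "eee", priority=0, repl_offset=900, link_down_seconds=0),
]
chosen = select_new_master(slaves, down_after_seconds=30, master_down_seconds=5)
print(chosen.name)  # slave3
```

Here slave4 is rejected by rule a) despite its better priority, slave5 is excluded by its priority of 0, slave1 loses on replication offset, and the tie between slave2 and slave3 is broken by run ID.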


Sentinel makes Redis highly available. When the master fails, failover happens without manual intervention, which avoids impact on the business and improves operation and maintenance efficiency.
When deploying Sentinel, it is recommended to use an odd number of Sentinel nodes, and at least three.
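The reasoning behind this recommendation is simple arithmetic: a failover has to be authorized by a majority of the Sentinel processes. The short sketch below, with hypothetical function names, shows how many Sentinel failures each cluster size can tolerate.

```python
def majority(sentinel_count):
    """Smallest number of Sentinels that forms a majority."""
    return sentinel_count // 2 + 1


def tolerated_failures(sentinel_count):
    """How many Sentinels can be lost while a majority is still reachable."""
    return sentinel_count - majority(sentinel_count)


for n in (2, 3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Two Sentinels tolerate no failure at all, and going from three to four adds a node without adding any tolerance, which is why three is the usual minimum and odd sizes are preferred.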

Final words

Sentinel involves many more details than can be covered here. This article is only an introduction to give you a basic understanding.

Original text:…
