Learn about persistence and sentinel architecture of redis

Time:2021-5-11

Redis persistence

What is persistence

Redis keeps all of its data in memory. Persistence means that updates to this data are asynchronously saved to files on disk.

Persistence methods

Snapshot

  • MySQL Dump
  • Redis RDB

Log

  • MySQL binlog
  • Redis AOF

RDB

What is RDB

RDB persistence writes a snapshot of the in-memory dataset to disk at specified time intervals.

This is the default method: the data in memory is written to a binary file as a snapshot, and the default file name is dump.rdb.

When redis is running, the RDB program saves the database snapshot in the current memory to the disk file. When redis is restarted, the RDB program can restore the state of the database by loading the RDB file.

How it works

When Redis needs to save the dump.rdb file, the server performs the following operations:

  1. Redis calls fork(), so there is now a parent process and a child process.
  2. The child process writes the dataset to a temporary RDB file.
  3. When the child process finishes writing the new RDB file, Redis replaces the original RDB file with the new one and deletes the old file.

This way of working lets Redis benefit from the copy-on-write mechanism.
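
The "write to a temporary file, then replace the old file" part of these steps can be sketched in Java. This is only an illustration of the pattern, not Redis code: the class SnapshotWriter, its methods, and the temp file name are assumptions made for this example, and the fork and copy-on-write part is not modelled at all.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; this is not how Redis itself is implemented.
final class SnapshotWriter {

    // Writes the snapshot to a temporary file, then renames it over the target.
    static void save(Path target, byte[] snapshot) {
        // Keep the temp file in the same directory so the rename stays on one filesystem.
        Path temp = target.resolveSibling("temp-" + target.getFileName());
        try {
            Files.write(temp, snapshot);
            // Request an atomic rename so a reader never sees a half-written file.
            // Whether an existing target is replaced is platform specific; on POSIX it is.
            Files.move(temp, target,
                    StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads the snapshot back, as a restart would.
    static byte[] load(Path target) {
        try {
            return Files.readAllBytes(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the process dies while the temporary file is being written, the previous snapshot is still intact, which is the point of writing to a separate file first.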

Three trigger mechanisms

Save command

The save command performs a synchronous operation that saves a snapshot of all data as an RDB file.

127.0.0.1:6379> save
OK

Note that save is a synchronous command. If there is a lot of data, it blocks the server until the snapshot is finished.

Also note that running save overwrites the previous RDB file.

Bgsave command

The bgsave command performs an asynchronous operation that saves a snapshot of all data as an RDB file.

127.0.0.1:6379> bgsave
Background saving started

Redis uses the Linux fork() system call to create a child process that saves the data to disk, while the main process continues to serve client requests.

After the operation succeeds, you can check the result with the LASTSAVE client command.

LASTSAVE returns the time of the last successful save to disk, as a UNIX timestamp.

127.0.0.1:6379> LASTSAVE
(integer) 1609294414

save vs. bgsave

command | save | bgsave
IO type | synchronous | asynchronous
Blocks? | yes | yes (only during fork(), which is usually very fast)
Complexity | O(n) | O(n)
Advantage | no extra memory is consumed | does not block client commands
Disadvantage | blocks client commands | needs to fork a child process, which consumes memory

Auto save

We can configure Redis through the configuration file to save the dataset automatically when the condition "at least M changes to the dataset in N seconds" is met.

Related configuration

#RDB automatic persistence rules
#When at least one key is changed in 900 seconds, the data set will be saved automatically
save 900 1
#When at least 10 keys are changed in 300 seconds, the data set is saved automatically
save 300 10
#When at least 10000 keys are changed within 60 seconds, the data set saving operation is automatically performed
save 60 10000

#RDB persistent file name
dbfilename dump-<port>.rdb

#Data persistence file storage directory
dir /var/lib/redis

#Whether to stop writing when bgsave error occurs. The default is yes
stop-writes-on-bgsave-error yes

#Does RDB file use compressed format
rdbcompression yes

#Check and verify the RDB file. The default value is yes
rdbchecksum yes
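
The "at least M changes in N seconds" rules in the configuration above can be sketched as a simple check. This is a minimal sketch, not Redis source code: the names SaveRules, Rule, and shouldSave are made up for this example, and the exact comparison operators are an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the "save <seconds> <changes>" rules.
final class SaveRules {

    // One "save <seconds> <changes>" line from the configuration file.
    static final class Rule {
        final long seconds;
        final long changes;

        Rule(long seconds, long changes) {
            this.seconds = seconds;
            this.changes = changes;
        }
    }

    // The three rules shown in the configuration above.
    static final List<Rule> DEFAULT_RULES = Arrays.asList(
            new Rule(900, 1), new Rule(300, 10), new Rule(60, 10000));

    // A save is due as soon as any one rule has both enough elapsed time and enough changes.
    static boolean shouldSave(List<Rule> rules, long secondsSinceLastSave, long changesSinceLastSave) {
        for (Rule rule : rules) {
            if (secondsSinceLastSave > rule.seconds && changesSinceLastSave >= rule.changes) {
                return true;
            }
        }
        return false;
    }
}
```

With these rules, a single changed key is saved after about 15 minutes, while a burst of 10000 changes is saved after about one minute.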

Advantages

  • Suitable for large-scale data recovery.
  • If the business does not require high data integrity and consistency, RDB is a good choice.

Disadvantages

  • Data integrity and consistency are weaker: if Redis goes down, every change made since the last snapshot is lost.
  • Backups take up extra memory, because Redis forks a separate child process that writes the data to a temporary file and finally replaces the previous backup file with it. In the worst case, we have to plan for about twice the memory of the dataset.
  • RDB is not suitable for real-time persistence; Redis provides AOF persistence to solve this problem.

AOF

AOF (append only file) persistence records each write command in a separate log and re-executes the commands in the AOF file on restart to recover the data. The main purpose of AOF is to make persistence close to real time, and it has become the mainstream way of persisting Redis data.

How to create AOF

AOF recovery principle

Three strategies

always

fsync is executed every time a new command is appended to the AOF file. Very slow, but very safe.

everysec

Fsync once per second: fast enough, and only one second of data is lost in the event of a failure.

The recommended (and default) strategy is to fsync once per second, which balances speed and safety.

no

Redis does not call fsync itself; the operating system decides when to flush the data to disk.

Comparison of the three strategies

strategy | always | everysec | no
Advantage | no data loss | fsync only once per second; at most one second of data is lost | nothing to manage
Disadvantage | high IO overhead | one second of data may be lost | uncontrollable
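
To make the difference between the three strategies concrete, here is a minimal Java sketch of an append-only log that calls fsync (FileChannel.force) according to a policy. The class AofAppender and its methods are assumptions made for this example. Unlike Redis, which runs the per-second fsync in a background thread, this sketch only checks the clock when a command is appended, and the caller passes the current time in so the behaviour is easy to follow.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only log; a simplification, not the Redis implementation.
final class AofAppender implements AutoCloseable {

    enum Policy { ALWAYS, EVERYSEC, NO }

    private final FileChannel channel;
    private final Policy policy;
    private long lastSyncMillis;
    private int syncCount;

    AofAppender(Path file, Policy policy, long nowMillis) {
        try {
            this.channel = FileChannel.open(file, StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.policy = policy;
        this.lastSyncMillis = nowMillis;
    }

    // Appends one command, then fsyncs if the policy says it is due.
    void append(String command, long nowMillis) {
        try {
            ByteBuffer buffer = ByteBuffer.wrap((command + "\n").getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            boolean due = policy == Policy.ALWAYS
                    || (policy == Policy.EVERYSEC && nowMillis - lastSyncMillis >= 1000);
            if (due) {
                channel.force(true); // flush data and metadata to the storage device
                lastSyncMillis = nowMillis;
                syncCount++;
            }
            // Policy.NO never calls force(); the operating system flushes when it chooses.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Number of fsync calls made so far.
    int syncCount() {
        return syncCount;
    }

    @Override
    public void close() {
        try {
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With ALWAYS every append pays for an fsync, with EVERYSEC the cost is paid at most once per second, and with NO it is never paid by the application.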

AOF rewriting

Because AOF works by continuously appending commands to the end of the file, the AOF file grows larger and larger as more write commands arrive.

Redis therefore rewrites the AOF file from time to time: expired, repeated, and superseded commands are dropped so that only the commands that still have an effect remain.

The AOF file can be rebuilt without interrupting service to clients. After the bgrewriteaof command is executed, Redis generates a new AOF file that contains the minimum set of commands needed to rebuild the current dataset.
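
The idea of "the minimum commands needed to rebuild the current dataset" can be illustrated with a toy example. The sketch below assumes a tiny command set (SET, DEL, INCR on string keys) and made-up names (AofRewriteSketch, replay, rewrite); it is not the Redis rewrite code. It builds the dataset by replaying a long log, then emits one command per surviving key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of an AOF rewrite over a three-command subset; names are hypothetical.
final class AofRewriteSketch {

    // Replays a command log and returns the resulting dataset.
    static Map<String, String> replay(List<String> log) {
        Map<String, String> data = new LinkedHashMap<>();
        for (String line : log) {
            String[] parts = line.split(" ");
            switch (parts[0]) {
                case "SET":
                    data.put(parts[1], parts[2]);
                    break;
                case "DEL":
                    data.remove(parts[1]);
                    break;
                case "INCR":
                    long current = Long.parseLong(data.getOrDefault(parts[1], "0"));
                    data.put(parts[1], String.valueOf(current + 1));
                    break;
                default:
                    throw new IllegalArgumentException("unsupported command: " + line);
            }
        }
        return data;
    }

    // Emits one SET per key: the smallest log that rebuilds the same dataset.
    static List<String> rewrite(Map<String, String> data) {
        List<String> compact = new ArrayList<>();
        for (Map.Entry<String, String> entry : data.entrySet()) {
            compact.add("SET " + entry.getKey() + " " + entry.getValue());
        }
        return compact;
    }
}
```

Replaying the compact log gives the same dataset as replaying the original one, which is why a rewrite both shrinks the file and speeds up recovery.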

The role of AOF rewriting

  • Reduce disk usage
  • Accelerate data recovery

Two ways to implement AOF rewriting

BGREWRITEAOF

Performs an AOF file rewrite. The rewrite creates a size-optimized version of the current AOF file.

Even if bgrewriteaof fails, no data is lost, because the old AOF file is not modified until bgrewriteaof succeeds.

A rewrite is only started when no other persistence work is running in the background:

  • If a Redis child process is saving a snapshot, the AOF rewrite is scheduled and runs after the save completes. In this case, bgrewriteaof still returns OK.
  • If another AOF rewrite is already in progress, bgrewriteaof returns an error, and the new bgrewriteaof request is not scheduled for later execution.

Since redis 2.4, AOF rewriting is triggered by redis itself, and bgrewriteaof is only used to trigger rewriting manually.

127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started

AOF rewrite configuration

Configuration name | Meaning
auto-aof-rewrite-min-size | Minimum AOF file size required for a rewrite
auto-aof-rewrite-percentage | AOF file growth rate

Statistic name | Meaning
aof_current_size | Current size of the AOF file (bytes)
aof_base_size | Size of the AOF file at the last startup or rewrite (bytes)

A rewrite is triggered automatically when both of the following conditions hold:

  • aof_current_size > auto-aof-rewrite-min-size
  • (aof_current_size - aof_base_size) * 100 / aof_base_size > auto-aof-rewrite-percentage
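
The two conditions above translate directly into code. This is a minimal sketch under the formulas given in this article; the class and parameter names are made up, and treating a base size of zero as one byte (to avoid dividing by zero) is an assumption of this example.

```java
// Hypothetical check for the automatic AOF rewrite trigger described above.
final class AofAutoRewrite {

    // All sizes are in bytes; percentage is the auto-aof-rewrite-percentage value.
    static boolean shouldRewrite(long currentSize, long baseSize, long minSize, long percentage) {
        if (currentSize <= minSize) {
            return false; // first condition: the file must be larger than the minimum size
        }
        long base = baseSize > 0 ? baseSize : 1; // assumption: avoid division by zero
        long growth = (currentSize - base) * 100 / base;
        return growth > percentage; // second condition: growth rate over the threshold
    }
}
```

For example, with a 64mb minimum and a percentage of 100, a file that was 64mb after the last rewrite is rewritten once it has grown past roughly 128mb.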

AOF rewriting process

AOF related configuration

#Enable AOF persistence mode
appendonly yes

#AOF persistence file name
appendfilename appendonly-<port>.aof

#fsync policy: flush the buffered data to disk once per second
appendfsync everysec

#Data persistence file storage directory
dir /var/lib/redis

#Whether to skip fsync on the AOF file while a rewrite is in progress
no-appendfsync-on-rewrite yes

#Minimum size to trigger rewriting of AOF file
auto-aof-rewrite-min-size 64mb

#Growth rate of triggering AOF file rewriting
auto-aof-rewrite-percentage 100

Advantages of AOF

  • AOF protects data better against loss. AOF usually runs fsync once per second in a background thread, so if the Redis process crashes, at most one second of data is lost.
  • AOF writes to the log file in append-only mode, so there is no disk seek overhead and write performance is very high.
  • The commands in the AOF log file are recorded in a very readable way, which makes it well suited to emergency recovery after a catastrophic accidental deletion.

Disadvantages of AOF

  • For the same dataset, the AOF file is larger than the RDB snapshot.
  • With AOF enabled, the supported write QPS is lower than with RDB, because AOF is usually configured to fsync once per second. Even so, performance with a per-second fsync is still very high.
  • Data recovery is relatively slow, so AOF is not suitable for cold standby.

RDB and AOF

item | RDB | AOF
Boot priority | low | high
Size | small | large
Recovery speed | fast | slow
Data safety | loses data | depends on the fsync strategy
Weight | heavy | light

How to choose

  • Don’t use RDB alone, because it can lose a lot of data.
  • Don’t use AOF alone either, for two reasons. First, recovering from AOF as a cold standby is slower than recovering from RDB. Second, RDB simply generates a complete data snapshot each time, which makes it more robust.
  • Use AOF and RDB together. AOF is the first choice for data recovery. Use RDB for cold backups of varying age, so that when the AOF files are lost, damaged, or otherwise unavailable, RDB can be used for fast data recovery.

Master slave replication

Before we learn about master-slave replication, let’s look at the problems of a stand-alone server.

  • Single machine failure, such as CPU failure, memory failure, or downtime.
  • Capacity bottleneck.
  • QPS bottleneck. Although the official Redis website says it can reach 100,000 QPS, a single machine obviously cannot reach 1,000,000 QPS.

What is master-slave replication

Master-slave replication establishes a database environment that is exactly like the master database; this copy is called the slave database. The master database is usually the near-real-time business database. MySQL, the most commonly used database, supports one-way, asynchronous replication. During replication, one server acts as the master and the other acts as the slave, and the master writes its updates to a dedicated binary log file.

An index of these files is maintained to track log rotation. The logs record the updates that are sent to the slave servers. When a slave connects to the master, it tells the master the position of the last update it read successfully from the log. The slave then receives every update made since that position, and afterwards blocks and waits for the master to notify it of new updates.

The role of master-slave replication

  • Data safety: the slave is a hot standby. After the master database server fails, we can switch to the slave database and keep working, which avoids data loss.
  • Better I/O performance: as business volume grows, I/O access becomes more and more frequent and a single machine can no longer keep up. Storing the data on multiple databases reduces the disk I/O load on each one and improves the I/O performance of a single device.
  • Read-write separation: the database can support more concurrency.

It also supports one master and multiple slaves.

Summary

  • A master can have multiple slaves
  • A slave can only have one master
  • Data flow is unidirectional, from master to slave

Two ways to realize master-slave replication

Slaveof command

127.0.0.1:6379> slaveof ip port

Cancel replication

Note that after master-slave replication is disconnected, the data already synchronized from the master is still retained on the slave.

127.0.0.1:6379> slaveof no one

Configuration file

#Configure IP and port of master node
slaveof ip port
#The slave node is read-only to avoid inconsistency between master and slave data
slave-read-only yes

Comparison of the two methods

method | command | configuration file
Advantage | no restart needed | unified configuration
Disadvantage | not easy to manage | restart needed

Practice

First, prepare two CentOS machines.

The master node here is: 192.168.3.155

The slave node is: 192.168.3.156

We use the configuration file method and add slaveof 192.168.3.155 6379 to redis.conf on the slave node.

Make sure the firewall allows port 6379.

After starting the master and slave nodes, we can check the status with the info replication command.

192.168.3.155:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.3.156,port=6379,state=online,offset=15,lag=1
master_repl_offset:15
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:14
192.168.3.156:6379> info replication
# Replication
role:slave
master_host:192.168.3.155
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:15
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

We can then test it by adding test data to the master node to see if the slave node can get it.

192.168.3.155:6379> set jack hello
OK
192.168.3.156:6379> get jack
"hello"

Full replication

Full replication is used for the first replication, or when partial replication is not possible. It sends all the data in the master node to the slave node, which is a very heavy operation. When the amount of data is large, it puts a lot of load on the master node, the slave node, and the network.

  1. The slave issues a synchronization command internally. It starts with the psync command; psync ? -1 asks the master to synchronize all of its data.
  2. The master replies with its runid (visible through redis-cli info server) and its offset. Because the slave has no matching offset, this is a full replication.
  3. The slave saves the basic information of the master (save masterinfo).
  4. After receiving the full replication command, the master node runs bgsave (asynchronously) to generate an RDB file (snapshot) in the background, and uses a buffer (the replication buffer) to record every command executed from that moment on.
  5. The master sends the RDB file to the slave.
  6. The master sends the buffered data.
  7. The slave flushes its old data: it must clear the old data before loading the data of the master node.
  8. The slave loads the RDB file, which brings its database to the state the master had when it ran bgsave, and then applies the buffered data.

Replication offset

The slave node reports its replication offset to the master node every second, so the master node also stores the replication offset of each slave node. On the slave, it appears as the slave_repl_offset metric:

192.168.3.156:6379> info replication
# Replication
role:slave
master_host:192.168.3.155
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:15
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

The master and slave nodes taking part in replication each maintain their own replication offset. After the master node processes a write command, it adds the byte length of the command to its offset. The value is shown in the master_repl_offset metric of info replication, and the slave0 line records the information about the slave node.

192.168.3.155:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.3.156,port=6379,state=online,offset=15,lag=1
master_repl_offset:15
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:14

After receiving a command from the master node, the slave node also adds the byte length to its own offset. The value is shown as slave_repl_offset in info replication.

Partial replication

  1. The network jitters and the connection is lost.
  2. The master keeps writing commands to the replication backlog buffer.
  3. The slave keeps trying to reconnect to the master.
  4. The slave sends the runid it saved and its current offset to the master with the psync command.
  5. If the master finds that the offset is within the range of the buffer, it replies with continue.
  6. The master sends only the data after the offset, so the offset is the basis of partial replication.

How to choose

After the slave node sends its offset to the master node, the master node decides whether partial replication is possible based on the offset and the replication backlog buffer.

If the data after the offset is still in the replication backlog buffer, partial replication is performed.

If the data after the offset is no longer in the replication backlog buffer (it has been pushed out), full replication is performed.

When the master and slave replicate for the first time, the master node sends its runid to the slave node, and the slave node saves it. After a disconnect and reconnect, the slave node sends this runid to the master node, and the master node uses it to judge whether partial replication is possible.

If the runid saved by the slave node is the same as the current runid of the master node, the two nodes have synchronized before, and the master node tries to use partial replication (whether it succeeds depends on the offset and the replication backlog buffer).

If the runid saved by the slave node is different from the current runid of the master node, the node that the slave synchronized with before the disconnect is not the current master, and only full replication is possible.
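
The decision described above can be summarised in a few lines of Java. This is a sketch of the rules as stated in this article, not the Redis source; the class, the enum, and the parameter names are made up, with backlogFirstOffset and masterOffset standing for the repl_backlog_first_byte_offset and master_repl_offset values shown by info replication.

```java
// Hypothetical decision function for partial vs. full replication.
final class ReplicationDecision {

    enum Mode { PARTIAL, FULL }

    static Mode decide(String masterRunId, String runIdKnownBySlave,
                       long slaveOffset, long backlogFirstOffset, long masterOffset) {
        // A different (or unknown, "?") runid means the slave never synced with this master.
        if (!masterRunId.equals(runIdKnownBySlave)) {
            return Mode.FULL;
        }
        // The next byte the slave needs is slaveOffset + 1; it must still be in the backlog.
        boolean stillInBacklog = slaveOffset + 1 >= backlogFirstOffset && slaveOffset <= masterOffset;
        return stillInBacklog ? Mode.PARTIAL : Mode.FULL;
    }
}
```

With the values from the output above (backlog starting at offset 2, master offset 15), a slave that reconnects at offset 15 gets a partial replication, while a slave that is still at offset 0 has to do a full one.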

The runid can be viewed with the info server command.

192.168.3.156:6379> info server
# Server
redis_version:3.0.7
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:3fdf3aafcf586962
redis_mode:standalone
os:Linux 3.10.0-1127.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:11306
run_id:116ef394d1999f8807f1d30d1bf0dc79aa8d865d
tcp_port:6379
uptime_in_seconds:3442
uptime_in_days:0
hz:10
lru_clock:14271160
config_file:/usr/local/redis-3.0.7/redis.conf

The problem of master-slave replication

  • With master-slave replication alone, recovering after the master goes down requires tedious manual work.
  • Write capacity and storage capacity are limited (master-slave replication is only a backup, so storage is still limited to a single node).

If the master breaks down, master-slave replication breaks too, and the service fails.

At this point, we have to pick one slave node and run slaveof no one on it, and then point the other slave nodes at the new master node.

Redis sentinel architecture

On top of master-slave replication, several Redis Sentinel nodes are added. They do not store data. When the Redis master fails, Sentinel fails over automatically and then notifies the clients.

One Redis Sentinel cluster can monitor several Redis master-slave groups, and each group is identified by a master name. The client does not connect to the Redis service directly; it connects to Redis Sentinel first.

Redis Sentinel knows which node is the master.

Failover process

  1. Multiple sentinels detect and confirm that the master has a problem.
  2. One sentinel is elected as the leader.
  3. A slave is selected as the new master.
  4. The remaining slaves are told to become slaves of the new master.
  5. The clients are notified of the master-slave change.
  6. When the old master comes back, it becomes a slave of the new master.

Installation and configuration

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1

After adding daemonize yes to the configuration on the master node, we start Sentinel:

redis-sentinel sentinel.conf

We execute the command to see if it has started. You can see that port 26379 has been started.

[[email protected] redis]# ps -ef | grep redis-sentinel
root     11056     1  0 18:38 ?        00:00:00 redis-sentinel *:26379 [sentinel]
root     11064  9916  0 18:41 pts/0    00:00:00 grep --color=auto redis-sentinel

Then we connect and we can use theinfoCommand to view information

[[email protected] redis]# redis-cli -p 26379
127.0.0.1:26379> info
# Server
redis_version:3.0.7
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:311215fe18f833b6
redis_mode:sentinel
os:Linux 3.10.0-1127.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:11056
run_id:855df973568ff3604a9a373a799c24601b15822a
tcp_port:26379
uptime_in_seconds:260
uptime_in_days:0
hz:17
lru_clock:14279821
config_file:/usr/local/redis-3.0.7/sentinel.conf

# Sentinel
sentinel_masters:1  # one master
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=2,sentinels=1  # two slave nodes

If we go to the sentinel configuration file, we can see that it has changed, and the slave node has been configured in it.

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel known-slave mymaster 192.168.3.156 6379
sentinel known-slave mymaster 192.168.3.157 6379

Then we configure the sentinel.conf file on the other slave nodes.

Add the following configuration:

daemonize yes
#Configure master information
sentinel monitor mymaster 192.168.3.155 6379 2

Then start.

Running info again shows that there are now three sentinel nodes.

master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=2,sentinels=3

Client connection

Now that the servers are highly available, why not connect to the master directly?

Server-side high availability covers service availability and automatic failover. But if the client is not aware of a failover, it cannot keep working normally afterwards.

Both the service and the client need to be highly available.

Basic principles of client implementation

  1. Get all the sentinel nodes and the master name, and traverse the sentinel collection to find an available sentinel node.
  2. Send sentinel get-master-addr-by-name to the available sentinel node to get the master node information.
  3. After getting the master node, the client runs role or info replication once to verify that it really is the master node.
  4. When the master node changes, sentinel is aware of it (sentinel does all the fault detection and failover).
    How does sentinel inform the client?
    Internally it uses the publish-subscribe pattern: the client subscribes to a sentinel channel that carries the information about who the master is. When sentinel publishes a message on that channel, the subscribed client receives it and reconnects using the new master information.
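
When the master changes, Sentinel publishes a message on its +switch-master channel whose payload has the form "<master name> <old ip> <old port> <new ip> <new port>". A client that subscribes to this channel only needs to pick out the new address. The parser below is a small sketch of that step; the class SwitchMasterMessage is made up for this example and does not come from Jedis.

```java
// Hypothetical parser for the payload of a +switch-master message.
final class SwitchMasterMessage {

    final String masterName;
    final String newHost;
    final int newPort;

    private SwitchMasterMessage(String masterName, String newHost, int newPort) {
        this.masterName = masterName;
        this.newHost = newHost;
        this.newPort = newPort;
    }

    // Expects "<master name> <old ip> <old port> <new ip> <new port>".
    static SwitchMasterMessage parse(String payload) {
        String[] parts = payload.trim().split(" ");
        if (parts.length != 5) {
            throw new IllegalArgumentException("unexpected +switch-master payload: " + payload);
        }
        return new SwitchMasterMessage(parts[0], parts[3], Integer.parseInt(parts[4]));
    }
}
```

A client would compare masterName with the master name it is configured for, and rebuild its connection pool against newHost and newPort only when they match.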

With Jedis, the client connection looks like this:

JedisSentinelPool jedisSentinelPool = new JedisSentinelPool(masterName, sentinelSet, poolConfig, timeout);
Jedis jedis = null;
try {
    // Borrow a connection to the current master from the pool
    jedis = jedisSentinelPool.getResource();
    // Run Jedis commands here
} catch (Exception e) {
    logger.error(e.getMessage(), e);
} finally {
    if (jedis != null) {
        // Return the connection to the pool
        jedis.close();
    }
}

  • JedisSentinelPool is not a pool of connections to the sentinel nodes.
  • In essence, it still connects to the master.
  • The name only distinguishes it from JedisPool.

Failover drill

The master node is still 192.168.3.155.

156 and 157 are the slave nodes, and three sentinels are running.

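import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.TimeUnit;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

/**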
 *@ author is bad and charming
 * official account: rookie programmer Java
 * @date 2020/12/31
 * @Description:
 */
public class RedisSentinelTest {

    private static Logger logger = LoggerFactory.getLogger(RedisSentinelTest.class);

    public static void main(String[] args) {
        String masterName = "mymaster";
        Set<String> sentinels = new HashSet<>();
        sentinels.add("192.168.3.155:26379");
        sentinels.add("192.168.3.156:26379");
        sentinels.add("192.168.3.157:26379");
        JedisSentinelPool jedisSentinelPool = new JedisSentinelPool(masterName, sentinels);
        int count = 0;
        while (true) {
            count++;
            Jedis jedis = null;
            try {
                jedis = jedisSentinelPool.getResource();
                int index = new Random().nextInt(10000);
                String key = "k-" + index;
                String value = "v-" + index;
                jedis.set(key, value);
                if (count % 100 == 0) {
                    logger.info("{} value is {}", key, jedis.get(key));
                }
                TimeUnit.MILLISECONDS.sleep(10);
            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            } finally {
                if (jedis != null) {
                    jedis.close();
                }
            }
        }
    }
}

Let’s start up normally.

11:06:55.050 [main] INFO RedisSentinelTest - k-6041 value is v-6041
11:06:56.252 [main] INFO RedisSentinelTest - k-3086 value is v-3086
11:06:57.467 [main] INFO RedisSentinelTest - k-3355 value is v-3355
11:06:58.677 [main] INFO RedisSentinelTest - k-6767 value is v-6767

Now we forcibly stop the master node, which is node 155.

192.168.3.155:6379> info server
# Server
redis_version:3.0.7
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:3fdf3aafcf586962
redis_mode:standalone
os:Linux 3.10.0-1127.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:4129
run_id:3212ce346ed95794f31dc30d87ed2a4020d3b252
tcp_port:6379
uptime_in_seconds:1478
uptime_in_days:0
hz:10
lru_clock:15496249
config_file:/usr/local/redis-3.0.7/redis.conf

This gives us process_id:4129, so we run kill -9 4129 to kill the process.

Then we check to see if the process still exists.

ps -ef | grep redis-server | grep 6379

Once the process is gone, we look at the Java console. The client reports connection errors at first; after a while the failover completes and the program runs normally again.

Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused (Connection refused)
    at redis.clients.jedis.Connection.connect(Connection.java:207)
    at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:93)
    at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:1767)
    at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:106)
    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:868)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
    at redis.clients.util.Pool.getResource(Pool.java:49)
    ... 2 common frames omitted
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at redis.clients.jedis.Connection.connect(Connection.java:184)
    ... 9 common frames omitted
December 31, 2020 11:14:13 AM redis.clients.jedis.JedisSentinelPool initPool
INFO: Created JedisPool to master at 192.168.3.156:6379
11:14:13.146 [main] INFO RedisSentinelTest - k-7996 value is v-7996
11:14:14.339 [main] INFO RedisSentinelTest - k-9125 value is v-9125
11:14:15.597 [main] INFO RedisSentinelTest - k-2589 value is v-2589

This automatically completes the failover.

Subjective offline and objective offline

  • Subjective offline: a single sentinel node’s own judgment that a Redis node has failed.

    • sentinel down-after-milliseconds masterName timeout
    • Each sentinel node pings the Redis node every second. If it gets no PONG for more than timeout milliseconds, it considers the Redis node offline.
  • Objective offline: enough sentinel nodes agree that the Redis node has failed.

    • sentinel monitor masterName ip port quorum
    • At least quorum sentinels subjectively consider the Redis node offline.
    • Sentinels use the sentinel is-master-down-by-addr command to ask each other whether they consider the Redis master offline.
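
The two checks above are small enough to write down directly. This is a minimal sketch, assuming made-up names (DownDetector and its parameters); it only restates the rules from this section and leaves out the network exchange between sentinels.

```java
// Hypothetical helper that restates the subjective and objective offline rules.
final class DownDetector {

    // Subjective offline: this sentinel has seen no PONG for more than downAfterMillis.
    static boolean isSubjectivelyDown(long nowMillis, long lastPongMillis, long downAfterMillis) {
        return nowMillis - lastPongMillis > downAfterMillis;
    }

    // Objective offline: at least quorum sentinels consider the node subjectively offline.
    static boolean isObjectivelyDown(int sentinelsReportingDown, int quorum) {
        return sentinelsReportingDown >= quorum;
    }
}
```

With the configuration used earlier (down-after-milliseconds 60000 and quorum 2), the master is marked objectively offline once two sentinels have each gone more than 60 seconds without a PONG from it.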

Leader election

  • Reason: only one sentinel node carries out the failover.
  • Election: each candidate uses the sentinel is-master-down-by-addr command to ask to become the leader.

    • Each sentinel node that has marked the master subjectively offline sends the command to the other sentinel nodes, asking them to set it as the leader.
    • If a sentinel node receiving the command has not yet agreed to a request from another node, it agrees; otherwise it refuses.
    • If a sentinel node finds that its votes exceed half of the sentinel set and also reach quorum, it becomes the leader.
    • If more than one sentinel becomes a leader in this process, the election is held again after a period of time.
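
The voting rules above can be modelled in memory. The sketch below is an illustration only: LeaderElectionSketch and its methods are made-up names, one object stands for a single election round, and the real exchange between sentinels over the network is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of one sentinel leader election round.
final class LeaderElectionSketch {

    // voter id -> candidate id that this voter agreed to
    private final Map<String, String> grantedVotes = new HashMap<>();

    // A voter agrees to the first candidate that asks and refuses every later one.
    boolean requestVote(String voter, String candidate) {
        String previous = grantedVotes.putIfAbsent(voter, candidate);
        return previous == null || previous.equals(candidate);
    }

    // Counts the votes a candidate has collected so far.
    int votesFor(String candidate) {
        int count = 0;
        for (String granted : grantedVotes.values()) {
            if (granted.equals(candidate)) {
                count++;
            }
        }
        return count;
    }

    // A candidate wins with more than half of all sentinels and at least quorum votes.
    static boolean wins(int votes, int totalSentinels, int quorum) {
        int majority = totalSentinels / 2 + 1;
        return votes >= majority && votes >= quorum;
    }
}
```

Because every voter agrees only once, two candidates cannot both collect more than half of the votes in the same round.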

Failover

When the master goes down, a suitable slave node is selected and promoted to master. Sentinel completes this automatically, with no manual work.

The specific steps are as follows:

  1. Select one of the slave nodes as the new master node. The selection works as follows:

    • Filter out unhealthy nodes: those that are subjectively offline or disconnected, those that have not replied to a sentinel ping within 5 seconds, and those that lost contact with the master for more than 10 times down-after-milliseconds.
    • Select the slave node with the highest slave-priority. If one exists, return it; otherwise continue.
    • Select the slave node with the largest replication offset (the most complete copy). If one exists, return it; otherwise continue.
    • Select the slave node with the smallest runid.
  2. The sentinel leader runs slaveof no one on the slave node selected in step 1 to make it the master node.
  3. The sentinel leader sends commands to the remaining slave nodes to make them slaves of the new master node. How many replicate at once is governed by the parallel-syncs parameter.
  4. The sentinel set marks the original master node as a slave and keeps watching it. When it recovers, it is ordered to replicate from the new master node.
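
The selection order in step 1 maps naturally onto a comparator. The sketch below uses made-up names (SlaveSelection, Slave, pick) and assumes the health checks have already been reduced to a single healthy flag. Note that in Redis a lower slave-priority number means a higher priority, and a priority of 0 means the slave is never promoted.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical model of how the sentinel leader picks the slave to promote.
final class SlaveSelection {

    static final class Slave {
        final String runId;
        final int priority;     // slave-priority; lower number wins, 0 means never promote
        final long replOffset;  // replication offset; larger means a more complete copy
        final boolean healthy;  // result of the health filtering, simplified to a flag

        Slave(String runId, int priority, long replOffset, boolean healthy) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
            this.healthy = healthy;
        }
    }

    // Filter unhealthy slaves, then order by priority, then offset, then runid.
    static Optional<Slave> pick(List<Slave> slaves) {
        Comparator<Slave> order = Comparator
                .comparingInt((Slave s) -> s.priority)
                .thenComparing(Comparator.comparingLong((Slave s) -> s.replOffset).reversed())
                .thenComparing((Slave s) -> s.runId);
        return slaves.stream()
                .filter(s -> s.healthy && s.priority != 0)
                .min(order);
    }
}
```

The runid comparison at the end is only a tie-breaker; it makes the choice deterministic when priority and offset are equal.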

High availability read-write separation

Let’s first look at the current roles of the slave nodes:

  • When the master node fails, a slave steps in as its backup. Redis Sentinel automates this and provides real high availability.
  • Slaves are well suited to extending the read capacity of the master node, especially in read-heavy, write-light scenarios.

But in the current model, the slave nodes are not highly available:

  • If the slave-1 node fails, client-1 first loses contact with it. Then the sentinel nodes only mark it subjectively offline, because Redis Sentinel failover applies to the master node only.
  • So most of the time, a slave node under Redis Sentinel is only a hot standby for the master and does not serve client reads; it exists just to keep the whole system highly available. This is somewhat wasteful, especially when there are many slave nodes or when read-write separation is really needed, so it is worth making the slave nodes highly available too.

Approach

While monitoring each node, Redis Sentinel publishes an event message whenever a relevant event occurs:

  • +switch-master: the master node is switched (the original slave node is promoted to master).
  • +convert-to-slave: a node is switched to slave (the original master node is demoted to slave).
  • +sdown: subjective offline, meaning a slave node may be unavailable (slave nodes have no objective offline state), so the client can apply its own strategy to implement something similar to subjective offline.
  • +reboot: a node has restarted. If its role is slave, a slave node has been added.

Therefore, to design highly available slave nodes under Redis Sentinel, it is enough to track the status of all the slave nodes in real time and treat them as a resource pool. Whenever a slave node comes online or goes offline, the client notices in time and adds it to or removes it from the pool, and the slave nodes become highly available.
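
The resource pool idea can be sketched as a set of slave addresses that is updated from the events listed above. Everything here is an assumption made for illustration: the class SlavePool, the simplified onEvent(event, address) signature, and the expectation that the caller has already parsed the Sentinel message and checked that the node is a slave. For +switch-master, the address passed in is the node that was promoted.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client-side pool of readable slave addresses ("host:port").
final class SlavePool {

    private final Set<String> available = ConcurrentHashMap.newKeySet();

    // Applies one already-parsed sentinel event to the pool.
    void onEvent(String event, String address) {
        switch (event) {
            case "+sdown":          // the slave may be unavailable
            case "+switch-master":  // this slave was promoted, so it is no longer a slave
                available.remove(address);
                break;
            case "+reboot":           // a slave restarted and is back
            case "+convert-to-slave": // the old master was demoted to slave
                available.add(address);
                break;
            default:
                break; // other events do not change the pool in this sketch
        }
    }

    // Sorted copy of the slaves that reads can currently be sent to.
    Set<String> snapshot() {
        return new TreeSet<>(available);
    }
}
```

A read client would pick an address from snapshot() for each request, so a slave that goes offline stops receiving reads as soon as its event arrives.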
