Redis cluster (1)


1、 Introduction to cluster

Redis Cluster is the distributed solution that Redis has officially provided since version 3.0.

Replication and Sentinel, introduced earlier, solve the high-availability problem. With replication, read operations can be spread across multiple nodes (load balancing for reads), but writes still go to a single node. Write operations therefore cannot be load balanced, and the system still faces the memory and concurrency limits of a single machine.

Cluster solves the problem of load balancing write operations. It has two core functions:

  1. Data sharding: this is the core function of the cluster. Sharding breaks through the single-machine memory limit of Redis: the data is distributed across multiple nodes, each node serves both reads and writes, and the overall response capacity improves as well.
  2. High availability: as with replication plus Sentinel, each shard consists of a master and its slaves, and the cluster performs automatic failover.

2、 Redis cluster data partition principle

2.1 hash partition scheme

There are two common hash partitioning schemes:

  1. Node modulo partition
  2. Consistent hash partition

2.1.1 node modulo partition

\[ \mathrm{hashCode} = \mathrm{hash}(\mathrm{key}) \bmod N \]

N is the number of nodes. The advantage of this scheme is its simplicity. The disadvantage is that when the number of nodes changes (scaling out or in), the mapping between data and nodes must be recalculated, which leads to data migration.

It is generally used where the number of nodes can be estimated in advance and is not expected to change, for example when sharding a database into multiple databases and tables. An order database might be split into 64 databases, and OrderID mod 64 gives the database an order should be written to.
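The cost of changing N can be made concrete with a small sketch. This is only an illustration under assumptions: `stable_hash`, `node_for`, and `moved_fraction` are hypothetical helper names, and MD5 stands in for whatever hash function a real system would use.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def node_for(key: str, n: int) -> int:
    """Node modulo partition: hash(key) mod N."""
    return stable_hash(key) % n


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose node changes when N goes from old_n to new_n."""
    moved = sum(1 for k in keys if node_for(k, old_n) != node_for(k, new_n))
    return moved / len(keys)


keys = [f"order:{i}" for i in range(10_000)]
print(f"3 -> 4 nodes: {moved_fraction(keys, 3, 4):.0%} of keys move")
```

With a uniform hash, a key keeps its node only when hash mod 3 equals hash mod 4, so roughly three quarters of the keys should move when going from 3 to 4 nodes.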

2.1.2 consistent hashing

The idea is to assign each node in the system a token, generally in the range 0 to 2^32, and these tokens form a hash ring.

When data is written, first compute the hash value x of the key, then move clockwise to the first node whose token is greater than or equal to x, and store the value on that node. For example:

  1. Key a is stored on node1
  2. Key b is stored on node2
  3. ...

With consistent hash partitioning, adding or removing a node affects only the neighboring node on the ring; the other nodes are unaffected. For example, adding a node just before node1 moves only part of the data originally stored on node1 (such as a) to the new node.
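The ring lookup and the "only the neighbor is affected" property can be sketched as follows. This is a minimal illustration, not a production implementation: the `HashRing` class and the `token` helper are hypothetical names, each node gets a single token derived from MD5 of its name, and token collisions between nodes are ignored.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring, 0 .. 2**32 - 1."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, nodes=()):
        self._tokens = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owner[t] = node

    def remove(self, node: str) -> None:
        t = token(node)
        self._tokens.remove(t)
        del self._owner[t]

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the next node token."""
        i = bisect.bisect_left(self._tokens, token(key))
        if i == len(self._tokens):   # past the last token: wrap around
            i = 0
        return self._owner[self._tokens[i]]


ring = HashRing(["node1", "node2", "node3"])
keys = [f"key:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("node4")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
# Every key that changed owner now lives on the newly added node.
print(all(after[k] == "node4" for k in moved))
```

Keys that did not fall into the arc taken over by the new node keep their original owner, which is the behavior the paragraph above describes.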

The biggest problem of consistent hashing is that with few nodes, adding or removing nodes leads to a seriously unbalanced data distribution. For example, in a ring of six evenly spaced nodes, if node1 and node2 are removed:

  1. All the data stored on node1 and node2 migrates to node3, so node3's share of the data grows from 1/6 to 1/2.
  2. Nodes 4, 5, and 6 together hold the other 1/2, which is seriously unbalanced.

Virtual slot partitioning is an improvement on consistent hashing that addresses this load-balancing problem.

2.2 redis cluster data partition scheme

Redis Cluster uses virtual slot partitioning. A slot is a virtual layer between the data and the actual nodes: each node is responsible for a range of slots, and each slot covers a range of hash values. With virtual slots, the mapping changes from hash -> node to hash -> slot -> node.

Redis Cluster has 16384 slots (0 to 16383). Every key is mapped by a hash function to an integer slot in the range 0 to 16383 (CRC16, then modulo). The formula is:

\[ \mathrm{slot} = \mathrm{CRC16}(\mathrm{key}) \bmod 16384 \]

With virtual slot partitioning, node changes have little impact on the system. For example, if node1 owns slots 0-3276, removing node1 only requires reassigning those slots to the remaining nodes.
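The key-to-slot mapping can be sketched in a few lines. Redis Cluster uses the XMODEM variant of CRC16; the function names `crc16_xmodem` and `key_slot` are chosen here for illustration, and the sketch deliberately ignores hash tags (the `{...}` syntax), so it is a simplification rather than a full reimplementation.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Slot for a key, ignoring hash tags ({...}) for simplicity."""
    return crc16_xmodem(key) % 16384


print(key_slot(b"foo"))  # 12182
```

On a running cluster, the result can be compared with the output of the CLUSTER KEYSLOT command for the same key.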

3、 Build a cluster

Build a cluster of three masters and three slaves on the same machine, distinguished by port number.

  1. Three masters: 7000, 7001, 7002
  2. Three slaves: 8000, 8001, 8002

3.1 preparing the nodes

The node on port 7000 is configured as follows:

#Port number
port 7000
#Start cluster mode
cluster-enabled yes
#Node timeout (MS)
cluster-node-timeout 15000
#Cluster internal configuration file
cluster-config-file "nodes-7000.conf"
logfile "log-7000.log"
protected-mode no
daemonize yes

Configure 7001, 7002, 8000, 8001, and 8002 in the same way, changing the port number in each setting.
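Since the six files differ only in the port number, they can be generated from one template. The following is a small sketch, assuming the file naming `redis-<port>.conf` used by the start commands in this section; `render_config` and `write_configs` are hypothetical helper names.

```python
from pathlib import Path

# Same settings as the 7000 example, with the port left as a placeholder.
CONFIG_TEMPLATE = """\
port {port}
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-{port}.conf"
logfile "log-{port}.log"
protected-mode no
daemonize yes
"""

PORTS = [7000, 7001, 7002, 8000, 8001, 8002]


def render_config(port: int) -> str:
    """Return the configuration text for the node on the given port."""
    return CONFIG_TEMPLATE.format(port=port)


def write_configs(directory: str) -> list:
    """Write redis-<port>.conf for every port and return the file paths."""
    paths = []
    for port in PORTS:
        path = Path(directory) / f"redis-{port}.conf"
        path.write_text(render_config(port))
        paths.append(path)
    return paths


print(render_config(7001))
```

Calling `write_configs(".")` in the Redis directory would produce the six files that the start commands expect.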

Start 6 nodes:

src/redis-server redis-7000.conf 
src/redis-server redis-7001.conf 
src/redis-server redis-7002.conf 
src/redis-server redis-8000.conf 
src/redis-server redis-8001.conf 
src/redis-server redis-8002.conf

Notes on the configuration

In the configuration above, cluster-enabled and cluster-config-file are cluster-related settings.

Setting cluster-enabled to yes turns on cluster mode. By default, Redis runs in stand-alone mode.

cluster-config-file names a configuration file that is specific to cluster mode. When Redis starts, it reads this file if it already exists; if the file is not found, Redis creates it automatically. The cluster configuration file is maintained by Redis itself and should not be edited manually.

The cluster configuration file generated after the first startup of node 7000 looks like this:

877e9d061f80cea70285e823cbc4246041752149 :[email protected] myself,master - 0 0 0 connected 5474 5798 11459 11958 12706 13735
vars currentEpoch 0 lastVoteEpoch 0

The file records the initial state of the cluster. The most important part is the 40-character hexadecimal string at the start, which is the node ID. The node ID is created only once, when the node is first initialized; after a restart, the cluster configuration file is loaded and the ID is reused. The node ID is different from the Redis run ID: the run ID changes on every restart.

3.2 creating clusters

Use the redis-cli command directly to create the cluster (Redis 5.0 and later).

Run the command:

redis-cli --cluster create --cluster-replicas 1

--cluster-replicas 1 means that each master node is assigned one slave node.
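The create command also takes the list of node addresses in host:port form. As a sketch only, the argument list can be assembled from the six ports used in this article. The host `127.0.0.1` is an assumption (the article only says that all nodes run on the same machine), and `cluster_create_command` is a hypothetical helper name.

```python
HOST = "127.0.0.1"   # assumption: all six nodes run on the local machine
PORTS = [7000, 7001, 7002, 8000, 8001, 8002]


def cluster_create_command(host, ports, replicas=1):
    """Build the argument list for `redis-cli --cluster create`."""
    nodes = [f"{host}:{port}" for port in ports]
    return ["redis-cli", "--cluster", "create", *nodes,
            "--cluster-replicas", str(replicas)]


print(" ".join(cluster_create_command(HOST, PORTS)))
```

The resulting list could be passed to `subprocess.run`, although the interactive confirmation prompt would still need to be answered.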

(The warning in the command output appears because all nodes are deployed on the same machine.)

Enter yes to continue.

The cluster is configured successfully, and all 16384 slots are allocated.

Overall structure

Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica to
Adding replica to
Adding replica to
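The three slot ranges above correspond to an even split of the 16384 slots across three masters. The following sketch reproduces them; it is one straightforward way to compute such a split, with `split_slots` as a hypothetical name, and is not presented as the exact code redis-cli runs.

```python
TOTAL_SLOTS = 16384


def split_slots(masters: int):
    """Split slots 0..16383 into `masters` contiguous, near-equal ranges."""
    per_node = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        last = round(cursor + per_node - 1)
        # The last master always absorbs whatever remains.
        if i == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


for i, (first, last) in enumerate(split_slots(3)):
    print(f"Master[{i}] -> Slots {first} - {last}")
```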

When a cluster is created with the command above, the master-slave pairing cannot be specified manually.

4、 Principle of node communication

4.1 gossip messages

Redis Cluster uses the gossip protocol (P2P). Under gossip, nodes continuously communicate and exchange information with one another, so that after a period of time every node knows the complete state of the cluster, much like the spread of a rumor.

Communication process:

  1. Each node in the cluster opens a separate TCP channel for communication between nodes. Its port number is the node's base port plus 10000; for example, a node on port 7000 uses 17000 as its gossip port.
  2. At a fixed interval, each node selects several nodes according to specific rules and sends them a Ping message.
  3. A node that receives a Ping message responds with a Pong message.
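The rumor-like spread can be illustrated with a toy simulation. This is a simplified model under stated assumptions, not how Redis implements its cluster bus: every node starts out knowing only its own state, each round every node pings `fanout` random peers, and a Ping/Pong exchange leaves both sides with the union of what they knew. `bus_port` and `gossip_rounds` are hypothetical names.

```python
import random


def bus_port(port: int) -> int:
    """Gossip port for a node: base port plus 10000."""
    return port + 10000


def gossip_rounds(num_nodes: int, fanout: int = 1, seed: int = 0) -> int:
    """Rounds until every node knows the state of every node (toy model)."""
    rng = random.Random(seed)
    known = [{i} for i in range(num_nodes)]   # node i knows only itself
    everything = set(range(num_nodes))
    rounds = 0
    while any(k != everything for k in known):
        rounds += 1
        for sender in range(num_nodes):
            peers = [n for n in range(num_nodes) if n != sender]
            for receiver in rng.sample(peers, fanout):
                # Ping carries the sender's view, Pong carries the receiver's.
                merged = known[sender] | known[receiver]
                known[sender] = set(merged)
                known[receiver] = set(merged)
    return rounds


print(bus_port(7000))
print(gossip_rounds(6))
```

Even with a fanout of one, the number of rounds stays small for a six-node cluster, which is why periodic Ping/Pong exchanges are enough to keep every node's view of the cluster up to date.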

Gossip message type:

Gossip message parsing process:


4.2 communication node selection

The Ping / Pong messages described above carry information (status and so on) about the sending node and about some other nodes, and this frequent exchange inevitably adds bandwidth and computational load. How many nodes to communicate with each time is therefore particularly important:

  1. Too much: high exchange cost
  2. Too few: the message exchange frequency is low, which affects the fault judgment and the speed of node discovery.

Specific selection:

Selecting the nodes to send to

Five nodes: once per second, randomly pick five nodes in the cluster and send a Ping to the one among them that has gone longest without communication.

10 times per second: every 100 ms, scan the local node list. If the time since a node's last Pong exceeds cluster-node-timeout / 2, send a Ping to that node immediately.

  • num(node.pong_received > cluster_node_timeout/2)
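The two selection rules can be sketched as follows. This is an illustrative model only: the `Peer` record and both function names are hypothetical, timestamps are plain seconds, and the real logic lives in the Redis server's periodic cluster task.

```python
import random
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    pong_received: float   # timestamp (seconds) of the last Pong from this peer


def pick_once_per_second(peers, rng, sample_size=5):
    """Sample up to five peers and return the one with the oldest Pong."""
    sample = rng.sample(peers, min(sample_size, len(peers)))
    return min(sample, key=lambda p: p.pong_received)


def pick_overdue(peers, now, node_timeout):
    """Peers whose last Pong is older than node_timeout / 2."""
    return [p for p in peers if now - p.pong_received > node_timeout / 2]


rng = random.Random(0)
now = 100.0
# Ten peers whose last Pong arrived 1, 2, ..., 10 seconds ago.
peers = [Peer(f"node{i}", now - i) for i in range(1, 11)]

print(pick_once_per_second(peers, rng).name)
# With a 15 s timeout the threshold is 7.5 s: node8, node9, node10 are overdue.
print([p.name for p in pick_overdue(peers, now, node_timeout=15.0)])
```

The random sample keeps the regular traffic bounded, while the overdue scan guarantees that no peer goes unchecked for longer than half the node timeout.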

Data carried by each Ping message

Information about the node itself + information about 1/10 of the other nodes

It can be seen that both cluster-node-timeout and the number of nodes in the whole cluster affect the volume of information exchanged between nodes.
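A back-of-the-envelope calculation shows how the two factors interact. Both functions below are rough estimates with hypothetical names, not Redis's exact behavior: the first applies the "self plus 1/10 of the other nodes" rule as stated in this section, and the second assumes that in steady state every peer must be pinged at least once per cluster-node-timeout / 2, with at least the one random Ping per second.

```python
def gossip_entries_per_ping(cluster_size: int) -> int:
    """Node entries in one Ping: the sender plus 1/10 of the other nodes."""
    others = cluster_size - 1
    return 1 + others // 10


def min_pings_per_second(cluster_size: int, node_timeout_s: float) -> float:
    """Rough lower bound on Pings one node sends per second."""
    others = cluster_size - 1
    return max(1.0, others / (node_timeout_s / 2))


for size in (6, 100, 1000):
    print(size,
          gossip_entries_per_ping(size),
          round(min_pings_per_second(size, 15.0), 1))
```

Under these assumptions, a larger cluster sends both more Pings and bigger Pings, and a shorter cluster-node-timeout raises the Ping rate further, so the timeout is a trade-off between failure-detection speed and bandwidth.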