- 1、 Introduction to cluster
- 2、 Redis cluster data partition principle
- 3、 Build a cluster
- 4、 Principle of node communication
1、 Introduction to cluster
Redis Cluster is the official distributed solution for Redis, available since Redis 3.0.
Replication and Sentinel, introduced earlier, solve the high availability problem. With replication, read operations can be spread across multiple nodes (load balancing for reads), but writes still go to a single node, so write load cannot be balanced and the system still faces the memory and concurrency bottleneck of a single machine.
Cluster solves the problem of load balancing write operations. It has two core functions:
- Data sharding: this is the core function of the cluster. Sharding breaks through the single-machine memory limit of Redis: the data is distributed across multiple nodes, each node serves both reads and writes, and overall response capacity improves as well.
- High availability: as with replication plus Sentinel, each shard consists of a master and its slaves, and failover is automatic.
2、 Redis cluster data partition principle
2.1 hash partition scheme
There are two common hash partition schemes:
- Node remainder (modulo) partition
- Consistent hash partition
2.1.1 node remainder partition
The node is chosen as hash(key) % N, where N is the number of nodes. The advantage of this scheme is its simplicity; the disadvantage is that when the number of nodes changes (scaling out or in), the key-to-node mapping must be recalculated, which causes data to be migrated again.
It is generally used where the number of nodes can be estimated in advance and stays fixed, such as splitting a database into sub-databases and tables. For example, an order database can be split into 64 sub-databases, and OrderID mod 64 gives the database that an order should be written to.
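To make the migration cost concrete, here is a minimal sketch of remainder partitioning. The function names and the choice of SHA-256 as the hash are assumptions made for illustration, not something the scheme prescribes.

```python
import hashlib

def node_for(key: str, node_count: int) -> int:
    """Pick a node index as hash(key) % N (SHA-256 is an arbitrary choice here)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % node_count

def database_for(order_id: int, database_count: int = 64) -> int:
    """The order example from the text: OrderID mod 64."""
    return order_id % database_count

# Count how many keys land on a different node when a 4-node setup grows to 5.
keys = [f"order:{i}" for i in range(10_000)]
moved = sum(node_for(k, 4) != node_for(k, 5) for k in keys)

if __name__ == "__main__":
    print("order 130 goes to database", database_for(130))
    print(f"{moved / len(keys):.0%} of keys change node when N goes from 4 to 5")
```

With a uniform hash, a key keeps its node only when its hash has the same remainder for both node counts, so most keys have to move. That is the re-migration problem described above.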
2.1.2 consistent hashing
The idea is to assign each node in the system a token, generally in the range 0 to 2^32, and these tokens form a hash ring.
When data is written, first compute the hash value x of the key, then go clockwise to find the first node whose token is greater than x, and store the value on that node. In the figure below:
- a goes to node1
With consistent hash partitioning, adding or removing a node affects only the adjacent nodes on the ring; the others are unaffected. For example, adding a node just before node1 only moves part of the data originally stored on node1 (such as a) to the new node.
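The clockwise lookup and the limited impact of adding a node can be sketched as follows. The `HashRing` class, the `token` helper, and deriving tokens from SHA-256 are all assumptions for illustration; real systems assign tokens in their own ways.

```python
import bisect
import hashlib

def token(value: str) -> int:
    """Hash a string onto the ring [0, 2^32) using the first 4 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")

class HashRing:
    def __init__(self, nodes):
        self._tokens = []   # sorted node tokens
        self._owner = {}    # token -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owner[t] = node

    def lookup(self, key: str) -> str:
        """Return the first node clockwise whose token is greater than hash(key)."""
        i = bisect.bisect_right(self._tokens, token(key))
        # Wrap around to the first token when the key hashes past the last one.
        return self._owner[self._tokens[i % len(self._tokens)]]

keys = [f"key:{i}" for i in range(10_000)]
ring = HashRing(["node1", "node2", "node3", "node4"])
before = {k: ring.lookup(k) for k in keys}
ring.add("node5")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

if __name__ == "__main__":
    print(len(moved), "of", len(keys), "keys moved")
    print("moved from:", {before[k] for k in moved}, "to:", {after[k] for k in moved})
```

Every key that moves goes to the new node, and all of them come from the single node that follows it on the ring.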
The biggest problem with consistent hashing is that when there are few nodes, adding or removing nodes can leave the data badly unbalanced. In the figure above, if node1 and node2 are removed:
- All the data stored on node1 and node2 migrates to node3, so node3's share of the data grows from 1/6 to 1/2.
- Nodes 4, 5 and 6 together hold the other 1/2, which is seriously unbalanced.
Virtual slot partitioning is an improvement on consistent hashing that addresses this load balancing problem.
2.2 redis cluster data partition scheme
Redis Cluster uses virtual slot partitioning. A slot is a virtual concept that sits between the data and the actual nodes: each node is responsible for a range of slots, and each slot covers a range of hash values. With virtual slots, the data mapping changes from hash -> node to hash -> slot -> node.
Redis Cluster has 16384 slots (0 to 16383). Every key is mapped by a hash function (CRC16, then modulo) to an integer slot in the range 0 to 16383. The formula is: slot = CRC16(key) & 16383
The schematic diagram is as follows:
With virtual slot partitioning, a change of nodes has little impact on the system. For example, in the figure above, deleting node1 only requires reassigning slots 0-3276.
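As a rough illustration of the key-to-slot step, the sketch below uses Python's `binascii.crc_hqx`, which with an initial value of 0 computes the CRC16 variant (polynomial 0x1021) that Redis Cluster specifies. The `key_slot` helper is a hypothetical name, and the sketch deliberately ignores hash tags (`{...}`), which a real client must handle.

```python
import binascii

SLOT_COUNT = 16384

def key_slot(key: str) -> int:
    """slot = CRC16(key) & 16383, which equals CRC16(key) mod 16384."""
    crc = binascii.crc_hqx(key.encode(), 0)  # CRC16, polynomial 0x1021, initial value 0
    return crc & (SLOT_COUNT - 1)

if __name__ == "__main__":
    for key in ("foo", "hello", "user:1000"):
        print(key, "->", key_slot(key))
```

On a running cluster the result can be compared with `CLUSTER KEYSLOT <key>`.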
3、 Build a cluster
Build a cluster of three masters and three slaves on one machine, distinguished by port number.
- Three masters: 7000, 7001, 7002
- Three slaves: 8000, 8001, 8002
3.1 preparing the nodes
Node 7000 is configured as follows:
# Port number
port 7000
# Enable cluster mode
cluster-enabled yes
# Node timeout (ms)
cluster-node-timeout 15000
# Cluster internal configuration file
cluster-config-file "nodes-7000.conf"
logfile "log-7000.log"
protected-mode no
daemonize yes
Configure 7001, 7002, 8000, 8001 and 8002 in the same way.
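Since the six configurations differ only in the port number, they can be generated instead of copied by hand. The following is a small sketch, assuming the file naming `redis-<port>.conf`; the helper names are hypothetical.

```python
import os

PORTS = [7000, 7001, 7002, 8000, 8001, 8002]

# Same directives as the node 7000 configuration, with the port substituted.
CONFIG_TEMPLATE = """\
port {port}
cluster-enabled yes
cluster-node-timeout 15000
cluster-config-file "nodes-{port}.conf"
logfile "log-{port}.log"
protected-mode no
daemonize yes
"""

def render_config(port: int) -> str:
    """Return the configuration text for one node."""
    return CONFIG_TEMPLATE.format(port=port)

def write_configs(directory: str = ".") -> list:
    """Write redis-<port>.conf for every port and return the file paths."""
    paths = []
    for port in PORTS:
        path = os.path.join(directory, f"redis-{port}.conf")
        with open(path, "w") as f:
            f.write(render_config(port))
        paths.append(path)
    return paths

if __name__ == "__main__":
    print(render_config(7001))
```

Calling `write_configs()` in the Redis directory would produce the six files used when starting the nodes.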
Start the 6 nodes:
src/redis-server redis-7000.conf
src/redis-server redis-7001.conf
src/redis-server redis-7002.conf
src/redis-server redis-8000.conf
src/redis-server redis-8001.conf
src/redis-server redis-8002.conf
Configuration notes
In the configuration above, cluster-enabled and cluster-config-file are the cluster-related settings.
Setting cluster-enabled to yes turns on cluster mode. By default, Redis runs in stand-alone mode.
cluster-config-file names the cluster's own configuration file. When Redis starts, if no cluster configuration file is found, one is created automatically.
If the cluster configuration file already exists, Redis reads it directly. The file is maintained automatically by Redis and should not be modified by hand.
The cluster configuration file generated after the first startup of node 7000 is as follows:
877e9d061f80cea70285e823cbc4246041752149 :7000@17000 myself,master - 0 0 0 connected 5474 5798 11459 11958 12706 13735
vars currentEpoch 0 lastVoteEpoch 0
It records the initial state of the cluster. The most important part is the leading 40-character hexadecimal string, which is the node ID. The node ID is created only once, when the cluster is initialized; after a restart, the cluster configuration file is loaded and the ID is reused. The cluster node ID is not the same as the Redis run ID: the run ID changes on every restart.
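For readers who want to inspect such a file programmatically, here is a minimal parsing sketch. It assumes the usual field order of a node line (ID, address, flags, master ID, ping sent, pong received, config epoch, link state, then slots); the `NodeEntry` type, the `parse_node_line` helper, and the sample line are made up for illustration.

```python
from typing import NamedTuple

class NodeEntry(NamedTuple):
    node_id: str
    address: str
    flags: list
    master_id: str
    link_state: str
    slots: list

def parse_node_line(line: str) -> NodeEntry:
    """Split one node line of the cluster configuration file into named fields."""
    parts = line.split()
    return NodeEntry(
        node_id=parts[0],
        address=parts[1],
        flags=parts[2].split(","),
        master_id=parts[3],      # "-" when the node is a master
        link_state=parts[7],
        slots=parts[8:],         # empty until slots are assigned
    )

# Hypothetical sample line for a freshly started node on port 7000.
sample = "877e9d061f80cea70285e823cbc4246041752149 :7000@17000 myself,master - 0 0 0 connected"
entry = parse_node_line(sample)

if __name__ == "__main__":
    print(entry.node_id, entry.flags, entry.link_state)
```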
3.2 creating clusters
Create the cluster with the redis-cli command (Redis 5.0 and later):
redis-cli --cluster create 192.168.118.129:7000 192.168.118.129:7001 192.168.118.129:7002 192.168.118.129:8000 192.168.118.129:8001 192.168.118.129:8002 --cluster-replicas 1
--cluster-replicas 1 indicates that each master node is assigned one slave node.
(The warning above appears because all nodes are deployed on the same machine.)
Enter yes to continue.
The cluster configuration is successful, and 16384 slots are allocated.
Master -> Slots 0 - 5460
Master -> Slots 5461 - 10922
Master -> Slots 10923 - 16383
Adding replica 192.168.118.129:8001 to 192.168.118.129:7000
Adding replica 192.168.118.129:8002 to 192.168.118.129:7001
Adding replica 192.168.118.129:8000 to 192.168.118.129:7002
When the cluster is created with the command above, the master-slave pairing is assigned automatically and cannot be specified manually.
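The slot ranges in the output above come from dividing 16384 slots as evenly as possible among the masters. The sketch below reproduces those ranges. It is one way to arrive at the same numbers, not necessarily how redis-cli computes them, and the function names are hypothetical.

```python
import math

SLOT_COUNT = 16384

def split_slots(master_count: int) -> list:
    """Divide slots 0..16383 into contiguous, nearly equal ranges."""
    per_master = SLOT_COUNT / master_count
    ranges = []
    first, cursor = 0, 0.0
    for i in range(master_count):
        # Round the ideal boundary to the nearest slot.
        last = math.floor(cursor + per_master - 1 + 0.5)
        if i == master_count - 1:
            last = SLOT_COUNT - 1   # the last master takes whatever remains
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges

def owner_of(slot: int, ranges: list) -> int:
    """Return the index of the master whose range contains the slot."""
    for index, (first, last) in enumerate(ranges):
        if first <= slot <= last:
            return index
    raise ValueError(f"slot {slot} is out of range")

if __name__ == "__main__":
    ranges = split_slots(3)
    print(ranges)
    print("slot 12182 belongs to master index", owner_of(12182, ranges))
```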
4、 Principle of node communication
4.1 gossip messages
Redis Cluster nodes communicate using the Gossip protocol (peer to peer). Under Gossip, nodes constantly talk to each other and exchange information; after a while every node knows the complete state of the cluster, much like the spread of a rumor, as shown in the figure below:
- Each node in the cluster opens a separate TCP channel for node-to-node communication. Its port number is the node's base port plus 10000. For example, if the base port is 7000, the corresponding gossip port is 17000.
- At a fixed interval, each node selects several nodes according to specific rules and sends them ping messages.
- A node that receives a ping message replies with a pong message.
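The rumor-spreading idea can be illustrated with a toy simulation. This is a generic push-style gossip model, not the actual Redis Cluster implementation; `bus_port`, `gossip_rounds`, and the `fanout` parameter are hypothetical names introduced for the sketch.

```python
import random

def bus_port(base_port: int) -> int:
    """Gossip port as described in the text: base port plus 10000."""
    return base_port + 10000

def gossip_rounds(node_count: int, fanout: int = 1, seed: int = 0) -> int:
    """Count rounds until a piece of information known to node 0 reaches all nodes.

    In each round, every informed node tells `fanout` randomly chosen other nodes.
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < node_count:
        rounds += 1
        for node in list(informed):
            others = [n for n in range(node_count) if n != node]
            informed.update(rng.sample(others, fanout))
    return rounds

if __name__ == "__main__":
    print("gossip port for 7000:", bus_port(7000))
    for size in (6, 100, 1000):
        print(size, "nodes informed after", gossip_rounds(size), "rounds")
```

Because the informed set can at most double each round, the number of rounds grows slowly with cluster size, which is why gossip scales without any central coordinator.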
Gossip message type:
Gossip message parsing process:
4.2 communication node selection
In the gossip messages above, ping/pong messages must carry the information of the current node and of some other nodes (status and so on), and this frequent exchange inevitably increases bandwidth and computation costs. So it matters how many nodes are chosen to communicate with in each round:
- Too many: high exchange cost
- Too few: the message exchange frequency is low, which slows down fault detection and node discovery.
Selecting the nodes to ping
Once per second: randomly pick five nodes in the cluster and send a ping to the one among them that has gone the longest without communication.
10 times per second: every 100 ms, scan the local node list; if the time since a node's last pong was received is greater than cluster-node-timeout / 2, send a ping to that node.
- Ping messages sent per second: roughly 1 + 10 * num(node.pong_received > cluster_node_timeout/2)
Data carried by each ping message
Information about the node itself + information about 1/10 of the other nodes
It can be seen that both cluster-node-timeout and the number of nodes in the whole cluster affect the information exchange between nodes in the cluster.
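The selection rules can be sketched in a few lines to show how the timeout changes the ping traffic. This is a simplified model under stated assumptions: `pong_age_ms` is a hypothetical map from node name to milliseconds since its last pong, and the real implementation applies further bounds and bookkeeping that are left out here.

```python
import random

def select_ping_targets(pong_age_ms: dict, node_timeout_ms: int,
                        rng: random.Random, once_per_second: bool) -> set:
    """Return the nodes that would be pinged in one scan of the node list."""
    targets = set()
    if once_per_second:
        # Pick up to five random nodes and ping the one with the oldest pong.
        sample = rng.sample(sorted(pong_age_ms), min(5, len(pong_age_ms)))
        targets.add(max(sample, key=pong_age_ms.get))
    # Every scan: ping any node whose last pong is older than timeout / 2.
    for node, age in pong_age_ms.items():
        if age > node_timeout_ms / 2:
            targets.add(node)
    return targets

def gossip_entries(other_node_count: int) -> int:
    """Other-node entries carried per ping, using the 1/10 rule from the text."""
    return other_node_count // 10

if __name__ == "__main__":
    ages = {"a": 100, "b": 9000, "c": 200}
    rng = random.Random(0)
    print(select_ping_targets(ages, 15000, rng, once_per_second=False))
    print(select_ping_targets(ages, 100, rng, once_per_second=False))
    print("entries per ping with 100 other nodes:", gossip_entries(100))
```

A smaller cluster-node-timeout makes more nodes count as stale in each scan, so more pings are sent; a larger cluster makes each ping carry more entries.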