This article was first published on the WeChat official account of vivo Internet technology.

Link: https://mp.weixin.qq.com/s/LGLqEOlGExKob8xEXXWckQ

Author: Qian Xingchuan

In a distributed environment, we often distribute data according to certain rules. The modulus algorithm and the consistent hash described in this article both generate a key by some rule and then compute on that key to decide where the data should go.

Software environment used in this article: Java 8

## 1、 Data distribution interface definition

**summary**

In a distributed environment, we often distribute data according to certain rules: for example, the data of user 1 is stored in database 1, and the data of user 2 is stored in database 2.

Generally speaking, there are several common approaches:

- The distributed environment has a unique central distribution node. Every time data is stored, the client asks the central node where the data should go, and the node answers with the exact destination.
- A key is generated by certain rules, and a computation on that key determines where the data should go. The modulus algorithm and the consistent hash described in this article both work this way.

**Interface definition**

```
/**
 * Data distribution hash algorithm interface definition
 * @author xingchuan.qxc
 */
public interface HashNodeService {

    /**
     * Add a data storage node to the cluster
     * @param node
     */
    public void addNode(Node node);

    /**
     * When storing data, find out which node should hold the data for this key
     * @param key
     * @return
     */
    public Node lookupNode(String key);

    /**
     * Hash algorithm
     * @param key
     * @return
     */
    public Long hash(String key);

    /**
     * Simulate an unexpected failure that removes a node,
     * used to test the cache hit rate
     * @param node
     */
    public void removeNodeUnexpected(Node node);
}
```
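The `Node` class used throughout the examples is not shown in the article; the minimal sketch below is an assumption that only covers the methods the later test code calls (`getIp`, `cacheString`, `getCacheValue`).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal Node: a named storage node with a local
// in-memory cache, matching the calls made by the test code.
public class Node {
    private final String name;
    private final String ip;
    private final Map<String, String> cache = new HashMap<>();

    public Node(String name, String ip) {
        this.name = name;
        this.ip = ip;
    }

    public String getIp() {
        return ip;
    }

    // Simulates writing a value into this node's local cache
    public void cacheString(String key, String value) {
        cache.put(key, value);
    }

    // Returns null on a cache miss
    public String getCacheValue(String key) {
        return cache.get(key);
    }
}
```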

## 2、 Implementation of data distribution algorithm — modulus algorithm

**summary**

The application scenario of the modulus algorithm can be described as follows:

We need to balance user data storage across the cluster. There are N storage nodes; how do we spread the data evenly over these N nodes?

The implementation is roughly three steps:

- Obtain a hash value from the user's key
- Take that hash value modulo the number of storage nodes N to get an index
- That index is the ID of the node where the data is stored

Note: in this article, CRC32 is used to generate the hash value.
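As a quick sanity check of the hashing primitive: `java.util.zip.CRC32` returns the unsigned 32-bit checksum as a `long`.

```java
import java.util.zip.CRC32;

// Demonstrates the CRC32 hash used throughout this article.
public class Crc32Demo {
    public static void main(String[] args) {
        CRC32 crc32 = new CRC32();
        crc32.update("123456789".getBytes());
        // CRC32 of "123456789" is the standard check value 0xCBF43926
        System.out.println(Long.toHexString(crc32.getValue())); // cbf43926
    }
}
```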

**Code implementation:**

```
/**
 * Modulus-based data distribution algorithm
 * @author xingchuan.qxc
 */
public class NormalHashNodeServiceImpl implements HashNodeService {

    /**
     * Storage node list
     */
    private List<Node> nodes = new ArrayList<>();

    @Override
    public void addNode(Node node) {
        this.nodes.add(node);
    }

    @Override
    public Node lookupNode(String key) {
        long k = hash(key);
        int index = (int) (k % nodes.size());
        return nodes.get(index);
    }

    @Override
    public Long hash(String key) {
        CRC32 crc32 = new CRC32();
        crc32.update(key.getBytes());
        return crc32.getValue();
    }

    @Override
    public void removeNodeUnexpected(Node node) {
        nodes.remove(node);
    }
}
```

From the example above, to look up a node we first compute the CRC32 value of the key, take it modulo the number of nodes in the cluster to get a remainder R, and return the node at index R.

The test code is as follows:

```
HashNodeService nodeService = new NormalHashNodeServiceImpl();
Node addNode1 = new Node("xingchuan.node1", "192.168.0.11");
Node addNode2 = new Node("xingchuan.node2", "192.168.0.12");
Node addNode3 = new Node("xingchuan.node3", "192.168.0.13");
Node addNode4 = new Node("xingchuan.node4", "192.168.0.14");
Node addNode5 = new Node("xingchuan.node5", "192.168.0.15");
Node addNode6 = new Node("xingchuan.node6", "192.168.0.16");
Node addNode7 = new Node("xingchuan.node7", "192.168.0.17");
Node addNode8 = new Node("xingchuan.node8", "192.168.0.18");
nodeService.addNode(addNode1);
nodeService.addNode(addNode2);
nodeService.addNode(addNode3);
nodeService.addNode(addNode4);
nodeService.addNode(addNode5);
nodeService.addNode(addNode6);
nodeService.addNode(addNode7);
nodeService.addNode(addNode8);
//Used to check the data distribution
Map<String, Integer> countmap = new HashMap<>();
Node node = null;
for (int i = 1; i <= 100000; i++) {
    String key = String.valueOf(i);
    node = nodeService.lookupNode(key);
    node.cacheString(key, "TEST_VALUE");
    String k = node.getIp();
    Integer count = countmap.get(k);
    if (count == null) {
        count = 1;
    } else {
        count++;
    }
    countmap.put(k, count);
}
System.out.println("Initialization data distribution: " + countmap);
```

The operation results are as follows:

`Initialization data distribution: {192.168.0.11=12499, 192.168.0.12=12498, 192.168.0.13=12500, 192.168.0.14=12503, 192.168.0.15=12500, 192.168.0.16=12502, 192.168.0.17=12499, 192.168.0.18=12499}`

As you can see, each node stores roughly the same number of keys.

**shortcoming**

We can clearly see that the modulus algorithm depends directly on the number of storage nodes, so when that number changes, it causes catastrophic cache invalidation.

For example:

Suppose the initial cluster has only four storage nodes (node0, node1, node2, node3) and I want to store users with IDs 1 to 10. The node for each ID can be computed as id % 4.

Now, what happens if a new storage node, node4, is added to the cluster?

We will find that a large number of keys no longer map to their original nodes. In a production environment, this means a large data migration.

Deleting a node has the same effect, for the same reason.
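To make the migration concrete, here is a small sketch that counts how many of the IDs 1 to 10 change node when the cluster grows from 4 to 5 nodes, using the plain id % n mapping from the example above:

```java
// Counts how many keys map to a different node when a fifth node
// is added to a four-node cluster under the simple id % n scheme.
public class ModuloMigrationDemo {
    public static void main(String[] args) {
        int moved = 0;
        for (int id = 1; id <= 10; id++) {
            if (id % 4 != id % 5) {
                moved++;
            }
        }
        System.out.println(moved + " of 10 keys must migrate"); // 7 of 10
    }
}
```

Only IDs 1 to 3 keep their node; the other seven keys all move.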

The following code simulates a distributed cache using the modulus approach and shows the problem caused by adding a node. The test code is as follows:

```
HashNodeService nodeService = new NormalHashNodeServiceImpl();
Node addNode1 = new Node("xingchuan.node1", "192.168.0.11");
Node addNode2 = new Node("xingchuan.node2", "192.168.0.12");
Node addNode3 = new Node("xingchuan.node3", "192.168.0.13");
Node addNode4 = new Node("xingchuan.node4", "192.168.0.14");
Node addNode5 = new Node("xingchuan.node5", "192.168.0.15");
Node addNode6 = new Node("xingchuan.node6", "192.168.0.16");
Node addNode7 = new Node("xingchuan.node7", "192.168.0.17");
Node addNode8 = new Node("xingchuan.node8", "192.168.0.18");
nodeService.addNode(addNode1);
nodeService.addNode(addNode2);
nodeService.addNode(addNode3);
nodeService.addNode(addNode4);
nodeService.addNode(addNode5);
nodeService.addNode(addNode6);
nodeService.addNode(addNode7);
nodeService.addNode(addNode8);
//Used to check the data distribution
Map<String, Integer> countmap = new HashMap<>();
Node node = null;
for (int i = 1; i <= 100000; i++) {
    String key = String.valueOf(i);
    node = nodeService.lookupNode(key);
    node.cacheString(key, "TEST_VALUE");
    String k = node.getIp();
    Integer count = countmap.get(k);
    if (count == null) {
        count = 1;
    } else {
        count++;
    }
    countmap.put(k, count);
}
System.out.println("Initialization data distribution: " + countmap);
//Hit rate when reading data under normal circumstances
int hitcount = 0;
for (int i = 1; i <= 100000; i++) {
    String key = String.valueOf(i);
    node = nodeService.lookupNode(key);
    if (node != null) {
        String value = node.getCacheValue(key);
        if (value != null) {
            hitcount++;
        }
    }
}
double h = hitcount / 100000.0;
System.out.println("Initialization cache hit rate: " + h);
//Add a node
Node addNode9 = new Node("xingchuan.node0", "192.168.0.19");
nodeService.addNode(addNode9);
hitcount = 0;
for (int i = 1; i <= 100000; i++) {
    String key = String.valueOf(i);
    node = nodeService.lookupNode(key);
    if (node != null) {
        String value = node.getCacheValue(key);
        if (value != null) {
            hitcount++;
        }
    }
}
h = hitcount / 100000.0;
System.out.println("Cache hit rate after adding a node: " + h);
```

The operation results are as follows:

```
Initialization data distribution: {192.168.0.11=12499, 192.168.0.12=12498, 192.168.0.13=12500, 192.168.0.14=12503, 192.168.0.15=12500, 192.168.0.16=12502, 192.168.0.17=12499, 192.168.0.18=12499}
Initialization cache hit rate: 1.0
Cache hit rate after adding a node: 0.11012
```

## 3、 Distributed data distribution algorithm — consistent hash

**summary**

The drawback of the modulus algorithm is obvious: adding or deleting nodes involves massive data migration. Consistent hashing was introduced to solve this problem.

The principle of the consistent hash algorithm is simple:

- Imagine a huge ring whose positions range, for example, from 0 to 2^32 - 1 (4294967295), the range of CRC32 values
- Suppose our four nodes are placed on this ring by hashing some key for each node
- When user data arrives, which node should store it? Hash the key to get a value R, then search clockwise for the nearest node hash at or after R; the data is stored on that node
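The clockwise search can be sketched with a sorted map; the node positions below are made-up values for illustration:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the clockwise lookup on a hash ring using a TreeMap.
// The node positions (1000, 2000, 3000) are hypothetical.
public class RingLookupDemo {
    static String lookup(TreeMap<Long, String> ring, long keyHash) {
        // ceilingEntry: first node at or after keyHash, i.e. clockwise
        Map.Entry<Long, String> e = ring.ceilingEntry(keyHash);
        // Past the last node: wrap around to the first node on the ring
        return e != null ? e.getValue() : ring.firstEntry().getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(1000L, "node1");
        ring.put(2000L, "node2");
        ring.put(3000L, "node3");
        System.out.println(lookup(ring, 2500L)); // node3
        System.out.println(lookup(ring, 3500L)); // node1 (wrapped around)
    }
}
```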

Then a problem arises: with only four nodes, the data distribution may be uneven. For example, if node3 and node4 happen to fall very close together on the ring, node1 will come under much more pressure than the others. How do we solve this? Virtual nodes.

**What is a virtual node?**

In short, many additional virtual positions are placed on the ring for each real node, so that positions are spread over the ring as evenly as possible. After hashing a key, we still find the nearest node clockwise, but now the hit is a virtual node, which maps back to a real node. With enough virtual nodes, data ends up distributed almost evenly across the cluster; the only extra work is maintaining the mapping between virtual nodes and real nodes.

**Implementation of consistent hash**

Next, we implement consistent hashing in two stages.

The first stage does not use virtual nodes; the second stage introduces them.

**The key codes are as follows:**

```
public class ConsistentHashNodeServiceImpl implements HashNodeService {

    /**
     * Real node list
     */
    private List<Node> nodeList = new ArrayList<>();

    /**
     * The hash ring: sorted map from node hash to node
     */
    private TreeMap<Long, Node> nodeMap = new TreeMap<>();

    @Override
    public void addNode(Node node) {
        nodeList.add(node);
        long crcKey = hash(node.getIp());
        nodeMap.put(crcKey, node);
    }

    @Override
    public Node lookupNode(String key) {
        long crcKey = hash(key);
        Node node = findValidNode(crcKey);
        if (node == null) {
            //Wrap around to the first node on the ring
            return findValidNode(0);
        }
        return node;
    }

    /**
     * Find the nearest node clockwise from crcKey
     * @param crcKey
     */
    private Node findValidNode(long crcKey) {
        Map.Entry<Long, Node> entry = nodeMap.ceilingEntry(crcKey);
        if (entry != null) {
            return entry.getValue();
        }
        return null;
    }

    @Override
    public Long hash(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes());
        return crc.getValue();
    }
}
```

Note that the key-hashing algorithm here is the same as in the modulus algorithm. That is not the point; the point is that in `addNode`, we hash the node's IP address and put it into a `TreeMap`, which keeps the long keys automatically sorted.

In `lookupNode`, we search clockwise for the nearest node. If none is found (the key hashes past the last node on the ring), the data goes to the first node on the ring.

**The test code is the same as before, except that one line changes the implementation:**

```
HashNodeService nodeService = new ConsistentHashNodeServiceImpl();
```

The operation results are as follows:

```
Initialization data distribution: {192.168.0.11=2495, 192.168.0.12=16732, 192.168.0.13=1849, 192.168.0.14=32116, 192.168.0.15=2729, 192.168.0.16=1965, 192.168.0.17=38413, 192.168.0.18=3701}
Initialization cache hit rate: 1.0
Cache hit rate after adding a node: 0.97022
```

Here we can see that the data distribution is uneven, but we also find that when a node changes, the impact on the cache hit rate is far smaller than in the modulus algorithm scenario.

**Stage 2 of the consistent hash implementation introduces virtual nodes:**

When a node is added, each real node corresponds to 128 virtual nodes on the ring.

When a node is deleted, its corresponding virtual nodes are deleted as well.
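A self-contained sketch of the virtual-node version is given below. It assumes 128 virtual nodes per real node, keyed by a derived string of the form `ip + "#" + i`; the exact virtual-key scheme used in the original code is an assumption, and real nodes are represented by their IP strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Sketch of consistent hashing with virtual nodes. Each real node
// contributes VIRTUAL_COUNT virtual positions to the ring; lookups
// resolve a virtual position back to its real node.
public class VirtualNodeRing {
    private static final int VIRTUAL_COUNT = 128;
    private final List<String> nodes = new ArrayList<>();       // real node ips
    private final TreeMap<Long, String> ring = new TreeMap<>(); // virtual key -> real ip

    public void addNode(String ip) {
        nodes.add(ip);
        for (int i = 0; i < VIRTUAL_COUNT; i++) {
            // 128 virtual nodes per real node, spread over the ring
            ring.put(hash(ip + "#" + i), ip);
        }
    }

    public void removeNode(String ip) {
        nodes.remove(ip);
        for (int i = 0; i < VIRTUAL_COUNT; i++) {
            // Remove the node's virtual positions as well
            ring.remove(hash(ip + "#" + i));
        }
    }

    public String lookupNode(String key) {
        // Nearest virtual node clockwise, wrapping to the first entry
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return e != null ? e.getValue() : ring.firstEntry().getValue();
    }

    private long hash(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes());
        return crc.getValue();
    }

    public static void main(String[] args) {
        VirtualNodeRing r = new VirtualNodeRing();
        r.addNode("192.168.0.11");
        r.addNode("192.168.0.12");
        System.out.println(r.lookupNode("user:1"));
    }
}
```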

**Test data distribution and cache hit rate again**

The test code remains unchanged, and the running results are as follows:

```
Initialization data distribution: {192.168.0.11=11610, 192.168.0.12=14600, 192.168.0.13=13472, 192.168.0.14=11345, 192.168.0.15=11166, 192.168.0.16=12462, 192.168.0.17=14477, 192.168.0.18=10868}
Initialization cache hit rate: 1.0
Cache hit rate after adding a node: 0.91204
```

Now the data distribution is much more even than it was without virtual nodes.

**summary**

As I understand it, consistent hashing mainly solves the data migration problem that comes with scaling distributed storage.

However, even with consistent hashing, if data is spread evenly and every node is a hot spot, adding or removing nodes still moves a large amount of data.

For more content, please follow the **vivo Internet technology** WeChat official account.

Note: to reprint this article, please contact WeChat ID: **labs2020**.