Theoretical basis of distributed systems: election, majority, and lease

Time:2020-9-29

Election is a common problem in distributed systems practice. By breaking the peer-to-peer symmetry between nodes, the elected leader (or master, coordinator) helps make transactions atomic and makes decision-making more efficient. The majority principle helps us reach consensus under a network partition and, in the context of leader election, helps us elect a unique leader. A lease grants a node specific rights for a limited period of time and can also be used to implement leader election.

Let’s take a look at elections, majorities, and leases in distributed systems theory.

Election

Consistency is the problem of how independent nodes reach a common decision, and electing a leader that every node recognizes is in essence a consistency problem too. Leader election must therefore also handle crash recovery and network partitions.

The Bully algorithm [1] is the most common election algorithm. It requires every node to have a sequence number, and the node with the highest number is the leader. When the leader goes down, the node with the next highest sequence number is elected as the new leader. The process is as follows:

(a) When node 4 finds that the leader is unreachable, it sends an election message carrying its own sequence number to the nodes whose sequence numbers are higher than its own

(b) (c) After receiving the election message, nodes 5 and 6 compare sequence numbers and find that their own are larger. They return OK messages to node 4 and in turn start elections with the nodes whose sequence numbers are higher than theirs

(d) Node 5 receives an OK message from node 6 and stands down. Node 6 receives no OK message from any higher-numbered node before the timeout, so it considers itself the leader

(e) Node 6 broadcasts to all nodes that it is the leader
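
Steps (a) to (e) can be sketched in a few lines of Python. This is a simplified, single-process model rather than a networked implementation: the function name `bully_election` and the `alive` parameter are made up for illustration, a missing OK reply is modeled as the node simply not being in `alive`, and the hand-off follows one responder at a time instead of running the elections in parallel.

```python
from typing import Iterable, Optional


def bully_election(initiator: int, alive: Iterable[int]) -> Optional[int]:
    """Return the leader chosen when `initiator` starts an election.

    `alive` holds the sequence numbers of the nodes that still respond.
    A higher-numbered node that answers with OK takes over the election;
    a node that hears no OK before the timeout declares itself leader.
    """
    alive_set = set(alive)
    if initiator not in alive_set:
        return None  # a crashed node cannot start an election
    candidate = initiator
    while True:
        # Higher-numbered nodes that reply OK to the election message.
        responders = [n for n in alive_set if n > candidate]
        if not responders:
            return candidate  # no OK before the timeout: candidate wins
        # Hand the election over to a responder (lowest first, for clarity).
        candidate = min(responders)


# Assumed scenario: nodes 1..7, the old leader 7 is down, node 4 notices.
new_leader = bully_election(4, alive=[1, 2, 3, 4, 5, 6])  # node 6
```

Whichever responder is followed, the result is the same: the highest-numbered node that is still reachable ends up as leader.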

Looking back at the theoretical basis of distributed systems (consistency, 2PC and 3PC), we can see a trace of 2PC in the Bully algorithm: it too has a step that makes a proposal and a step that collects feedback.

In the consensus algorithms Paxos, Zab [2] and Raft [3], a node acts as the leader in order to make decisions more efficiently. Zab and Raft both describe a concrete implementation of leader election. Similar to the Bully algorithm, Zab identifies each node by its zxid; the node with the largest zxid has seen the latest transaction and is elected leader.
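
As a rough illustration of the "largest zxid wins" rule described above (not ZooKeeper's actual election code), the sketch below picks a leader from a table of last-seen zxids. The function name, the node names, the zxid values, and the tie-break on node name are all assumptions made for the example.

```python
from typing import Dict


def pick_leader_by_zxid(last_zxid: Dict[str, int]) -> str:
    """Pick the node whose last-seen zxid is the largest.

    Ties are broken by node name here; that rule is an assumption
    of this sketch, chosen only to make the result deterministic.
    """
    return max(last_zxid, key=lambda node: (last_zxid[node], node))


# Hypothetical cluster: node "b" has seen the most recent transaction.
leader = pick_leader_by_zxid({"a": 103, "b": 107, "c": 105})  # "b"
```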

Majority

Under a network partition, the Bully algorithm described above runs into a problem: the nodes in each partition believe that one of them has the largest sequence number, so multiple leaders are produced. This is why a quorum [4] is needed. The majority idea is very common in distributed systems; it ensures that a decision is unique even under a network partition.

The majority principle is very simple. If the total number of nodes is 2f + 1, a decision passes when more than f nodes (that is, at least f + 1) agree. In leader election, only the partition that contains a majority of the nodes can elect a leader, which avoids producing multiple leaders.

The majority idea is also applied to replica management: the number of write replicas Vw and the number of read replicas Vr are tuned to the workload's actual read/write ratio to balance reliability and performance [5].
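
A minimal sketch of the majority rule applied to a partitioned cluster follows. The function names and the five-node cluster (f = 2) are assumptions for illustration, and choosing the highest-numbered member as leader simply reuses the rule from the election section.

```python
from typing import Iterable, Optional


def has_majority(votes: int, total_nodes: int) -> bool:
    """True when `votes` is more than half of `total_nodes`."""
    return votes > total_nodes // 2


def elect_in_partition(partition: Iterable[int], total_nodes: int) -> Optional[int]:
    """Elect the highest-numbered node, but only inside a majority partition."""
    members = list(partition)
    if not has_majority(len(members), total_nodes):
        return None  # minority side: no leader may be elected
    return max(members)


# Assumed scenario: 5 nodes split into {1, 2, 3} and {4, 5}.
majority_side = elect_in_partition([1, 2, 3], total_nodes=5)  # 3
minority_side = elect_in_partition([4, 5], total_nodes=5)     # None
```

Because two disjoint partitions cannot both hold more than half of the nodes, at most one side can ever produce a leader.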
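
The trade-off can be made concrete with the overlap conditions commonly stated for quorum-based replication: with V replicas in total, Vr + Vw > V makes every read quorum intersect every write quorum, and 2 * Vw > V makes any two write quorums intersect. The helper below is only a sketch of that check; the function name and the sample configurations are assumptions, and equal-weight votes are assumed throughout.

```python
def is_valid_quorum(v: int, vr: int, vw: int) -> bool:
    """Check the usual overlap conditions for read/write quorums.

    v  : total number of replicas (one vote each, by assumption)
    vr : replicas that must answer a read
    vw : replicas that must acknowledge a write
    """
    reads_see_latest_write = vr + vw > v
    writes_do_not_conflict = 2 * vw > v
    return reads_see_latest_write and writes_do_not_conflict


# Read-heavy workload: cheap reads, expensive writes (read one, write all).
read_heavy = is_valid_quorum(v=5, vr=1, vw=5)   # True
# Balanced workload: both operations contact a majority.
balanced = is_valid_quorum(v=5, vr=3, vw=3)     # True
# Too few replicas contacted: a read may miss the latest write.
too_small = is_valid_quorum(v=5, vr=2, vw=3)    # False
```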

Lease

One very important question in elections has not been mentioned so far: how do we judge that the leader is unavailable, and when should a new election be started? The first option is to check whether the leader is healthy through heartbeats. However, under network congestion or a transient interruption, relying on heartbeats alone can easily lead to two masters at once.

A lease is a common way to solve this problem. It was first proposed to solve distributed cache consistency [6] and was later applied in many areas, such as distributed locks [7].
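
To see how a heartbeat check alone can go wrong, consider this minimal sketch of a timeout-based detector. The function name, the 2-second timeout, and the timestamps are illustrative assumptions.

```python
def leader_suspected(last_heartbeat: float, now: float, timeout: float) -> bool:
    """Return True when no heartbeat has arrived within `timeout` seconds."""
    return now - last_heartbeat > timeout


# The leader is alive, but congestion delays its next heartbeat.
# A follower with a 2 s timeout already suspects it at t = 2.5 s.
follower_view = leader_suspected(last_heartbeat=0.0, now=2.5, timeout=2.0)
```

Here `follower_view` is `True`, so the follower may start an election while the old leader, which never crashed, still believes it is in charge.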

The principle of the lease is not complicated either. The central idea is that only one node holds the lease during each lease period, and the lease must be reissued after it expires. Suppose we have a lease-issuing node Z, and nodes 0, 1 and 2 compete to be leader. The lease process is as follows:

(a) Nodes 0, 1 and 2 register themselves with Z. Z issues a lease to one node according to some rule (such as first come, first served), and the lease has a validity period. Here, assume that node 0 obtains the lease and becomes the leader

(b) When the leader goes down, a new election starts only after the lease expires. Here, node 1 obtains the lease and becomes the leader

The lease mechanism ensures that there is at most one leader at any time, which avoids the dual-master problem caused by relying on heartbeats alone. In practice, ZooKeeper and etcd can be used to issue leases.
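
The lease process can be sketched with a small in-memory stand-in for node Z. This is not how ZooKeeper or etcd expose leases: the class `LeaseIssuer`, its methods, and the 10-second duration are hypothetical, and time is passed in explicitly so that the example is deterministic.

```python
from typing import Optional


class LeaseIssuer:
    """Hypothetical in-memory model of the lease-issuing node Z."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        self.holder: Optional[int] = None
        self.expires_at = 0.0

    def acquire(self, node: int, now: float) -> bool:
        """Grant the lease first come, first served, or renew it for its holder."""
        free = self.holder is None or now >= self.expires_at
        if free or self.holder == node:
            self.holder = node
            self.expires_at = now + self.duration
            return True
        return False  # someone else holds an unexpired lease

    def leader(self, now: float) -> Optional[int]:
        """Return the current lease holder, or None if the lease has expired."""
        if self.holder is not None and now < self.expires_at:
            return self.holder
        return None


z = LeaseIssuer(duration=10.0)
got_0 = z.acquire(0, now=0.0)     # True: node 0 becomes leader
got_1 = z.acquire(1, now=1.0)     # False: lease is held by node 0
# Node 0 crashes and never renews; node 1 must wait for expiry.
early = z.acquire(1, now=9.0)     # False: lease has not expired yet
late = z.acquire(1, now=10.0)     # True: node 1 becomes leader
```

The sketch assumes that every node agrees with Z on the current time. With real clocks that drift, a lease holder usually has to treat its lease as ending somewhat earlier than Z does.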

Summary

In both the theory and practice of distributed systems, leaders, quorums and leases are common. Not everything in a distributed system has to be negotiated democratically; having a leader helps decisions get made more efficiently.

This article uses leader election as an example to introduce quorums and leases. Of course, quorums and leases are general ideas whose applications are not limited to leader election.

Finally, an interesting question. Leader election is in essence a consistency problem, yet Paxos, Raft, Zab and other protocols that solve the consistency problem need or rely on a leader. How should we understand this seeming chicken-and-egg problem? [8]

[1] Elections in a Distributed Computing System, Hector Garcia-Molina, 1982

[2] ZooKeeper’s atomic broadcast protocol: Theory and practice, Andre Medeiros, 2012

[3] In Search of an Understandable Consensus Algorithm, Diego Ongaro and John Ousterhout, 2013

[4] A quorum-based commit protocol, Dale Skeen, 1982

[5] Weighted Voting for Replicated Data, David K. Gifford, 1979

[6] Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency, Cary G. Gray and David R. Cheriton, 1989

[7] The Chubby lock service for loosely-coupled distributed systems, Mike Burrows, 2006

[8] Why is Paxos leader election not done using Paxos?