Time：2021-10-26

# Distributed systems (1) – consistency, 2PC and 3PC

## 1. Introduction

In a narrow sense, a distributed system is a computer system whose nodes are connected by a network: each node independently undertakes computing or storage tasks, and the nodes cooperate over the network. In a broader sense, "distributed system" is a relative concept, as Leslie Lamport said [1]:

What is a distributed system? Distribution is in the eye of the beholder.
To the user sitting at the keyboard, his IBM personal computer is a nondistributed system.
To a flea crawling around on the circuit board, or to the engineer who designed it, it’s very much a distributed system.

Consistency is a fundamental problem in distributed systems theory. Over nearly half a century, researchers have proposed many theoretical models around it, and the industry has built many engineering practices on those models. Let's start with the consistency problem and two methods that solve it under specific conditions (2PC and 3PC) to get to know the most basic distributed systems theory.

## 2. Consistency (consensus)

What is consistency? In short, the consistency problem is how independent nodes reach agreement on a decision. In distributed systems, it comes up in database transaction commit, leader election, serial number generation, and so on. The problem is also very common in daily life. For example, how do mahjong players agree on when and where to play a few rounds?

Assuming a distributed system with N nodes, we say the system achieves consistency when it meets the following conditions:

1. Total agreement: all N nodes agree on one result
2. Legal value: the result must have been proposed by one of the N nodes
3. Termination: the decision process ends within a bounded period of time and does not go on endlessly
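The three conditions can be checked mechanically on the outcome of a single run. The sketch below is only an illustration: the function name `check_consensus` and the convention that `None` means "never decided" are assumptions made here, and a finished run can only show that every node decided, not how long it took.

```python
from typing import Dict, Hashable, Optional, Sequence


def check_consensus(proposals: Sequence[Hashable],
                    decisions: Sequence[Optional[Hashable]]) -> Dict[str, bool]:
    """Check one finished run against the three conditions.

    proposals[i] is the value node i proposed; decisions[i] is the value
    node i decided on, or None if it never decided (hypothetical convention).
    """
    decided = [d for d in decisions if d is not None]
    return {
        # Termination: every node reached a decision.
        "termination": len(decided) == len(decisions),
        # Total agreement: no two nodes decided on different values.
        "agreement": len(set(decided)) <= 1,
        # Legal value: every decided value was proposed by some node.
        "legal_value": all(d in proposals for d in decided),
    }


print(check_consensus(["7pm", "8pm", "7pm"], ["7pm", "7pm", "7pm"]))
```

A run in which one player never answers fails `termination`, and a run in which someone "decides" on a time nobody proposed fails `legal_value`.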

Some may say: four people agreeing on when and where to play mahjong, isn't that simple?

Yet something this simple in appearance is not easy to achieve in a distributed system, because it faces these problems:

• Asynchronous messaging: a real network is not a reliable channel; messages are delayed or lost, and message passing between nodes cannot be synchronous
• Fail-stop: a node goes down permanently and never recovers
• Fail-recover: a node goes down and recovers after a while; this is the most common failure in distributed systems
• Network partition: a network link fails and isolates the N nodes into several groups
• Byzantine failure [2]: a node goes down or its logic fails, and it may even stop playing by the rules and send out information that interferes with the decision
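To make the first item concrete, here is a toy model of an unreliable channel. It is a sketch under assumptions: the class name `UnreliableChannel` and its parameters `loss_rate` and `max_delay` are invented for illustration and do not correspond to any real networking API.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class UnreliableChannel:
    loss_rate: float    # probability that a message is dropped
    max_delay: float    # upper bound on the simulated delay, in seconds
    rng: random.Random  # injected so that runs can be made repeatable

    def send(self, msg: str) -> Optional[Tuple[float, str]]:
        """Return (delay, msg) if the message arrives, or None if it is lost."""
        if self.rng.random() < self.loss_rate:
            return None  # message loss
        # Message delay: the receiver cannot know in advance how long to wait.
        return (self.rng.uniform(0.0, self.max_delay), msg)


channel = UnreliableChannel(loss_rate=0.3, max_delay=8 * 3600.0, rng=random.Random())
print(channel.send("Old place at 7 tonight?"))
```

From the sender's side, a lost message and a very slow one look the same until the reply finally shows up, which is exactly the trouble in the first mahjong exchange.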

Suppose these problems show up in real life; let's see what happens:

```
Me: Lao Wang, the usual place at 7 tonight, a full 48 rounds, see you there!
……
(3 a.m. the next day) Lao Wang next door: No problem  // message delay
Me:
----------------------------------------------
Me: Xiao Zhang, the usual place at 7 tonight, a full 48 rounds, see you there!
Xiao Zhang: No
(two hours later...)
Xiao Zhang: No problem  // node recovers after downtime
Me:
-----------------------------------------------
Me: Lao Li, the usual place at 7 tonight, a full 48 rounds, nobody leaves early!
Lao Li: Sure, a massage, let's go  // Byzantine general (does he want mahjong? A massage? A massage while playing mahjong...)
```

Can we still play happily together?

We call the set of problems listed above the system model. When discussing distributed systems theory and engineering practice, we must first pin down the model. For example, here are two models:

1. An asynchronous environment with fail-stop nodes
2. An asynchronous environment with fail-recover nodes and network partitions

Compared with model 1, model 2 additionally has to consider node recovery and network partition, so the theoretical research and the engineering solutions for the two models are bound to differ. Talking about solutions before pinning down the problem to be solved is just fooling around.
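One way to keep the model explicit is to write it down as data. The sketch below encodes the two example models as sets of faults; the `Fault` enum and the helper `extra_faults` are names invented here for illustration.

```python
from enum import Flag, auto


class Fault(Flag):
    ASYNC_MESSAGING = auto()
    FAIL_STOP = auto()
    FAIL_RECOVER = auto()
    NETWORK_PARTITION = auto()
    BYZANTINE = auto()


# The two example models from the text.
MODEL_1 = Fault.ASYNC_MESSAGING | Fault.FAIL_STOP
MODEL_2 = Fault.ASYNC_MESSAGING | Fault.FAIL_RECOVER | Fault.NETWORK_PARTITION


def extra_faults(base: Fault, other: Fault) -> Fault:
    """Faults that `other` has to handle and `base` does not."""
    return other & ~base


print(extra_faults(MODEL_1, MODEL_2))
```

The difference between the two models comes out as node recovery plus network partition, which is the extra ground a solution for model 2 has to cover.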

Consistency also has two properties: one is safety (strong consistency), which requires all nodes to be in the same state and to advance and retreat together; the other is liveness (availability), which requires the distributed system to serve requests 24 * 7 without interruption. The FLP impossibility result [3] proved that even in a narrow model (an asynchronous environment where the only failure is node downtime), safety and liveness cannot both be satisfied.

The FLP theorem is a foundational result in distributed systems theory. Just as the law of conservation of energy in physics rules out the perpetual motion machine, the FLP theorem rules out a consistency protocol that satisfies both safety and liveness.

In engineering practice, depending on the business scenario, a system either guarantees strong consistency (safety) or guarantees availability (liveness) when nodes go down or the network partitions. 2PC and 3PC are relatively simple protocols for solving the consistency problem. Let's look at them in turn.

## 3. 2PC

2PC (two-phase commit) [5], as the name suggests, is divided into two phases. First, one party makes a proposal and collects the feedback of the other nodes; then it decides to commit or abort the transaction according to that feedback. We call the proposing node the coordinator, and the other nodes taking part in the decision participants or cohorts:

In phase 1, the coordinator initiates a proposal and asks each participant whether they accept it.

In phase 2, the coordinator commits or aborts the transaction according to the feedback of the participants. If all the participants agree, it commits. As long as one participant disagrees, it aborts.
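The two phases can be sketched in a few lines. This is a minimal in-process illustration with no network and no failures; the `Participant` class, its state names and `two_phase_commit` are assumptions made for the example, not part of any library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    name: str
    will_accept: bool    # hypothetical: how this participant answers a proposal
    state: str = "init"

    def vote(self, proposal: str) -> bool:
        self.state = "voted-yes" if self.will_accept else "voted-no"
        return self.will_accept

    def finish(self, decision: str) -> None:
        self.state = decision  # "commit" or "abort"


def two_phase_commit(proposal: str, participants: List[Participant]) -> str:
    # Phase 1: the coordinator proposes and collects every vote.
    votes = [p.vote(proposal) for p in participants]
    # Phase 2: commit only if all agree; a single "no" aborts.
    decision = "commit" if all(votes) else "abort"
    for p in participants:
        p.finish(decision)
    return decision


players = [Participant("Lao Wang", True), Participant("Xiao Zhang", False)]
print(two_phase_commit("mahjong at 7", players))  # abort
```

With every `will_accept` set to `True` the same call returns `commit`, and every participant ends in the same state either way.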

In a model with an asynchronous environment and no node downtime (no fail-stop), 2PC satisfies total agreement, legal value and termination, so it is a protocol that solves the consistency problem. But once fail-recover nodes are taken into account, can 2PC still solve it?

If the coordinator goes down after initiating the proposal, the participants block, waiting for the coordinator's response to complete the decision. Another role is then needed to get the system out of this never-ending state; we call it the coordinator watchdog. Once the coordinator has been down for a certain time, the watchdog takes over its work and, by querying the status of each participant, decides whether phase 2 should commit or abort. This also requires the coordinator and the participants to log their historical state, so that the watchdog can query the participants after the coordinator goes down and the coordinator can restore its state when it comes back.
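The watchdog's decision can be sketched as a function of the participants' logged states. The state names ("init", "voted-yes", "voted-no", "commit", "abort") and the exact rule below are assumptions chosen for illustration; the text only says that the watchdog decides by querying each participant.

```python
from typing import Dict


def watchdog_decide(logs: Dict[str, str]) -> str:
    """Decide phase 2 from each participant's last logged state.

    `logs` maps participant name -> last logged state (hypothetical format).
    """
    states = set(logs.values())
    if "commit" in states and "abort" in states:
        raise ValueError("inconsistent logs: both commit and abort recorded")
    if "commit" in states:
        # The old coordinator had already started committing: finish the job.
        return "commit"
    if states == {"voted-yes"}:
        # Everyone agreed and nothing was decided yet: safe to commit.
        return "commit"
    # Someone refused, never voted, or an abort was already recorded.
    return "abort"


print(watchdog_decide({"Lao Wang": "voted-yes", "Xiao Zhang": "init"}))  # abort
```

The function assumes the watchdog can reach every participant; a participant that is itself down leaves its entry missing, which is the case 3PC sets out to handle.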

From the moment the coordinator receives a transaction request and initiates a proposal until the transaction completes, the 2PC protocol adds two RTTs (propose + commit), so the extra latency is relatively small.

## 4. 3PC

3PC (three-phase commit) [6] commits in three phases. Since 2PC can already achieve consistency under the asynchronous network + fail-recover model, what more does 3PC do, and what exactly is 3PC?

In 2PC, the status of a participant is known only to itself and the coordinator. If the coordinator goes down after making its proposal, and a participant also goes down before the watchdog takes over, the other participants enter a blocked state in which they can neither roll back nor force a commit, until the crashed participant recovers. This raises two questions:

1. Can the blocking be removed, so that the system can roll back to its initial state before the commit / abort?
2. During a decision, can participants know each other's status, or can they avoid depending on each other's status at all?

Compared with 2PC, 3PC adds a prepare-to-commit phase to solve the problems above:

After receiving the participants' feedback (votes), the coordinator enters phase 2 and sends a prepare-to-commit instruction to each participant. On receiving it, a participant may lock resources, but everything it does must still be reversible by rollback. After receiving the ACKs, the coordinator enters phase 3 and performs the commit / abort. Phase 3 of 3PC is no different from phase 2 of 2PC. The coordinator watchdog and logging also apply to 3PC.
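The failure-free flow of the three phases can be sketched as follows. The `Cohort` class, its state names and the choice to abort immediately when a vote is negative are assumptions made for this illustration; timeouts, the watchdog and logging are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cohort:
    name: str
    will_accept: bool   # hypothetical: how this participant votes
    state: str = "init"

    def vote(self, proposal: str) -> bool:      # phase 1
        self.state = "voted-yes" if self.will_accept else "voted-no"
        return self.will_accept

    def prepare_to_commit(self) -> bool:        # phase 2
        # Resources may be locked here, but the work must still be undoable.
        self.state = "precommit"
        return True  # ACK

    def finish(self, decision: str) -> None:    # phase 3
        self.state = decision


def three_phase_commit(proposal: str, cohorts: List[Cohort]) -> str:
    # Phase 1: propose and collect every vote (the list forces all to vote).
    if not all([c.vote(proposal) for c in cohorts]):
        for c in cohorts:
            c.finish("abort")
        return "abort"
    # Phase 2: send prepare-to-commit and collect the ACKs.
    acks = [c.prepare_to_commit() for c in cohorts]
    # Phase 3: same as phase 2 of 2PC.
    decision = "commit" if all(acks) else "abort"
    for c in cohorts:
        c.finish(decision)
    return decision


print(three_phase_commit("mahjong at 7", [Cohort("Lao Wang", True), Cohort("Lao Li", True)]))  # commit
```

The extra "precommit" state is the point of the protocol: a participant that reaches it knows that every vote was a yes, without asking anyone else.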

If a participant goes down at different stages, let's see how 3PC responds:

• Phase 1:
  • The coordinator or watchdog does not receive the vote of the crashed participant and directly aborts the transaction;
  • After the crashed participant recovers, it reads its log, finds that it never issued an affirmative vote, and aborts the transaction on its own
• Phase 2:
  • The coordinator does not receive the precommit ACK from the crashed participant, but because it already received that participant's vote in favor (otherwise it would not have entered phase 2), the coordinator commits;
  • The watchdog can obtain this information by asking the other participants, and the process is the same;
  • After the crashed participant recovers, if it finds that it had received a precommit or issued a vote in favor, it commits the transaction on its own
• Phase 3:
  • Even if the coordinator or watchdog does not receive the commit ACK of the crashed participant, the transaction is ended;
  • After the crashed participant recovers, if it finds that it had received a commit or precommit, it commits the transaction on its own
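The participant-side recovery rules in the list above can be transcribed almost literally. The log format (a list of state strings) and the function name `recover` are assumptions for illustration, as is the choice to let a logged abort win.

```python
from typing import List


def recover(log: List[str]) -> str:
    """What a restarted participant does, based only on its own log.

    `log` holds the states the participant recorded before it went down,
    using the hypothetical names "voted-no", "voted-yes", "precommit",
    "commit" and "abort".
    """
    if "abort" in log:
        # Assumption: an abort that was already recorded is final.
        return "abort"
    if any(entry in ("voted-yes", "precommit", "commit") for entry in log):
        # Phases 2 and 3: a vote in favor, a precommit or a commit was logged.
        return "commit"
    # Phase 1: no affirmative vote was ever issued.
    return "abort"


print(recover(["voted-yes", "precommit"]))  # commit
print(recover([]))                          # abort
```

The participant never has to wait for anyone else here, which is what removes the blocking described for 2PC.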

Because of the prepare-to-commit phase, the transaction latency of 3PC rises from the two RTTs of 2PC to three RTTs (propose + precommit + commit). In exchange, it keeps the whole system from blocking after a participant goes down and improves the availability of the system, which is well worth it for some real business scenarios.
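As a back-of-the-envelope comparison, taking two round trips for 2PC (propose + commit) and three for 3PC, the added latency is simply the round-trip count times the RTT. The 10 ms figure below is an arbitrary example value, not a measurement.

```python
ROUND_TRIPS = {"2PC": 2, "3PC": 3}


def added_latency_ms(protocol: str, rtt_ms: float) -> float:
    """Extra latency a protocol adds to one transaction, ignoring processing time."""
    return ROUND_TRIPS[protocol] * rtt_ms


for protocol in ("2PC", "3PC"):
    print(protocol, added_latency_ms(protocol, rtt_ms=10.0), "ms")
```

With a 10 ms RTT this gives 20 ms for 2PC and 30 ms for 3PC, so the price of non-blocking behavior is one more round trip per transaction.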

## 5. Summary

This article introduced some basics of distributed systems theory, described the definition of consensus and the problems faced in achieving consistency, and finally discussed how 2PC and 3PC solve the consistency problem under the asynchronous network and fail-recover models.

Reading the earlier theoretical work on distributed systems, one finds rigorous reasoning and proofs, and a kind of mathematical beauty; distributed systems as built in reality are the result of compromise among many factors.

[1] Solved Problems, Unsolved Problems and Non-Problems in Concurrency, Leslie Lamport, 1983

[2] The Byzantine Generals Problem, Leslie Lamport, Robert Shostak and Marshall Pease, 1982

[3] Impossibility of Distributed Consensus with One Faulty Process, Fischer, Lynch and Paterson, 1985

[4] Proof of FLP impossibility, Daniel Wu, 2015

[5] Consensus Protocols: Two-Phase Commit, Henry Robinson, 2008

[6] Consensus Protocols: Three-phase Commit, Henry Robinson, 2008

[7] Three-phase commit protocol, Wikipedia