Large-scale distributed system architectures inevitably involve distributed transaction processing. Distributed transactions share the same basic ACID principles as ordinary transactions:
- A (Atomicity): a transaction either completes as a whole or rolls back as a whole
- C (Consistency): the integrity of the database is preserved before and after data is committed
- I (Isolation): different transactions are independent of each other and do not interfere with each other
- D (Durability): once committed, the data state is permanent
A second set of design principles, CAP, applies to distributed systems:
- C (Consistency): all replicas in the distributed system hold exactly the same data at any time
- A (Availability): the system responds to every request within a bounded time
- P (Partition tolerance): the system as a whole keeps operating when a replica or partition goes down or the network is partitioned
When designing a distributed system it is impossible to satisfy all three of C, A and P at once; at most two can be guaranteed. For example:
To guarantee strong consistency, the database must hold resources across tables and databases; when the system is complex, response time and availability become hard to guarantee.
Hence the BASE theory:
- Basically Available: when the distributed system fails, some functionality may be sacrificed so that core functions remain available
- Soft state: the system is allowed to pass through an intermediate state that does not affect overall availability
- Eventually consistent: after a period of time, all data replicas converge to a consistent state
In fact, BASE theory is an extension of the AP choice: a distributed transaction does not need strong consistency, it only needs to reach eventual consistency.
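The three BASE properties can be made concrete with a toy key-value store. This is an illustrative sketch, not a real library API: writes are acknowledged after reaching only the primary (basically available), replication lags behind (soft state), and all replicas converge once a sync runs (eventual consistency).

```python
class EventuallyConsistentStore:
    """Toy store with one primary and several replicas (illustrative only)."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # soft state: writes not yet replicated

    def write(self, key, value):
        # Basically available: acknowledge once the primary (replica 0)
        # has the write, without waiting for the other replicas.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].get(key)

    def sync(self):
        # "After a period of time": propagate pending writes everywhere,
        # after which every replica returns the same value.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()
```

Reading from a lagging replica before `sync()` returns stale (or no) data; afterwards all replicas agree, which is exactly the eventual-consistency guarantee.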
A distributed transaction coordinates the data operations of the individual nodes in a distributed system so that, even though the nodes are isolated from each other, data across multiple nodes is changed consistently.
Distributed transaction – 2PC
2PC is two-phase commit. Two concepts need to be understood first:
- Participant: a node that actually takes part in the transaction. Participants are isolated from each other and do not know whether the others have committed.
- Coordinator: collects and manages the participants' transaction information and is responsible for unified management of the nodes; it is the third party in a distributed transaction.
Prepare phase, also called the voting phase
In the prepare phase, the coordinator sends a prepare message to each participant. The participant executes the transaction logic locally; if it neither times out nor errors, it records the redo and undo logs and returns an ACK to the coordinator. Once the coordinator has received ACKs from all participants, it moves to the next phase; otherwise it sends a rollback message to every node, each node rolls back using its previously recorded undo log, and the transaction commit fails.
Commit phase
In the commit phase, having received responses from all nodes, the coordinator sends a commit message to each node, telling every participant to commit the transaction. After committing, each participant sends a completion message to the coordinator and releases its local resources. Once the coordinator has received all completion messages, the transaction is complete.
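The two phases above can be sketched in a few lines of Python. This is a single-process illustration of the protocol's control flow, not a real implementation; the class and method names are invented for the example, and real 2PC involves network messages, logging and timeouts.

```python
from enum import Enum

class Vote(Enum):
    YES = "yes"
    NO = "no"

class Participant:
    """A node that executes its share of the transaction locally."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: execute locally, record redo/undo logs, then vote.
        if self.can_commit:
            self.state = "prepared"
            return Vote.YES
        return Vote.NO

    def commit(self):
        self.state = "committed"    # apply the redo log, release locks

    def rollback(self):
        self.state = "rolled_back"  # replay the undo log

def two_phase_commit(participants):
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(v is Vote.YES for v in votes):
        # Phase 2 (commit): every node voted yes, tell all to commit.
        for p in participants:
            p.commit()
        return True
    # Any NO vote (or, in a real system, a timeout) aborts the whole
    # transaction: tell every node to roll back.
    for p in participants:
        p.rollback()
    return False
```

Note that a single NO vote rolls back every participant, including those that were perfectly able to commit; this all-or-nothing behavior is what gives 2PC its atomicity.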
The two-phase commit flow is shown in the figure below:
Problems with two-phase commit
Although two-phase commit solves most distributed transaction problems and the probability of data errors is small, the following issues remain:
- Single point of failure: if the coordinator goes down, participants block indefinitely. If the coordinator crashes before a participant receives the commit message, that participant stays in the uncommitted state, holding resources and forcing other transactions to wait.
- Synchronous blocking: participants process the transaction locally in a blocking way and hold their lock resources until the final commit. If the coordinator is slow to send the commit, the locks stay held and everyone else waits.
- Data inconsistency: if the coordinator fails partway through sending commit messages in the second phase, some participants commit the transaction and others do not, leaving the data inconsistent.
- An unsolvable case: if the coordinator and the last participant go down at the same time, the coordinator newly elected to take over transaction management has no way of knowing how far the transaction progressed.
To address these problems, 3PC was derived.
Distributed transaction – 3PC
3PC is the three-phase commit protocol.
Compared with 2PC, 3PC makes the following changes:
- A timeout mechanism is introduced: each participant and the coordinator has its own timeout policy.
- A new phase, CanCommit, is added, so the protocol is divided into three phases: CanCommit, PreCommit and DoCommit.
CanCommit is the inquiry phase. The coordinator sends a CanCommit message to each participant. On receiving it, the participant judges from its own resources whether it can perform the transaction; if it can and has not timed out, it replies Yes to the coordinator. Otherwise the coordinator aborts the transaction.
PreCommit is the preparation phase. The coordinator sends a PreCommit message to the participants; on receiving it, each participant performs the transaction operation, records the undo and redo logs, and sends an ACK to the coordinator when done. If the coordinator collects ACKs from all participants without timing out, it proceeds; otherwise it aborts the transaction and sends an abort message to every participant.
DoCommit is the commit phase. Once all participants have ACKed, the coordinator sends a DoCommit message to each participant to commit the transaction. After committing, every participant sends a committed message to the coordinator, which then completes the transaction. Otherwise the coordinator aborts the transaction, and any participant receiving the abort message rolls back using its previously recorded undo log.
Note, however, that in this phase, if a participant does not receive the DoCommit message from the coordinator in time, it commits the transaction on its own. Probabilistically, reaching the PreCommit phase indicates that all nodes were able to execute the commit normally without aborting, so in general a single node committing on its own does not break data consistency, except in extreme cases.
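The three phases, including the timeout-default-commit rule just described, can be sketched as follows. As before, this is an illustrative single-process model with invented names, not a real implementation; the `coordinator_crashes` flag stands in for a coordinator that dies after PreCommit, leaving participants to act on their timeouts.

```python
class ThreePCParticipant:
    """Participant in a toy three-phase commit (illustrative names only)."""

    def __init__(self, name, has_resources=True):
        self.name = name
        self.has_resources = has_resources
        self.state = "init"

    def can_commit(self):
        # Phase 1 (CanCommit): only check resources, nothing is executed.
        return self.has_resources

    def pre_commit(self):
        # Phase 2 (PreCommit): run the transaction, record undo/redo logs.
        self.state = "prepared"

    def do_commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

    def on_timeout(self):
        # 3PC's key rule: a prepared participant that never hears
        # DoCommit commits by default instead of blocking forever.
        if self.state == "prepared":
            self.state = "committed"

def three_phase_commit(participants, coordinator_crashes=False):
    # Phase 1: CanCommit - abort early if any node lacks resources.
    if not all(p.can_commit() for p in participants):
        for p in participants:
            p.abort()
        return False
    # Phase 2: PreCommit - every node executes and logs.
    for p in participants:
        p.pre_commit()
    # Phase 3: DoCommit - or, if the coordinator crashes here,
    # each participant times out and commits on its own.
    for p in participants:
        p.on_timeout() if coordinator_crashes else p.do_commit()
    return True
```

Compared with the 2PC sketch, the extra CanCommit round rejects doomed transactions before any node does work, and `on_timeout` is what removes the indefinite blocking on a dead coordinator.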
Differences between 2PC and 3PC
Compared with 2PC, 3PC mainly mitigates the single point of failure and reduces blocking: once a participant fails to hear from the coordinator in time, it commits by default instead of holding transaction resources and blocking indefinitely. However, this mechanism can itself cause data inconsistency: if, due to the network, an abort sent by the coordinator does not reach a participant in time, that participant commits after its timeout, which is inconsistent with the participants that received the abort and rolled back.
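This inconsistency scenario is small enough to demonstrate directly. The snippet below is a self-contained illustration (invented names, not a real API) of two prepared participants where the coordinator's abort reaches one but is lost on the way to the other:

```python
class PreparedParticipant:
    """A 3PC participant that has already finished PreCommit."""

    def __init__(self):
        self.state = "prepared"

    def receive_abort(self):
        # The coordinator's abort arrived: roll back via the undo log.
        self.state = "rolled_back"

    def on_timeout(self):
        # The abort was lost in the network; after the timeout the
        # participant commits by default, per the 3PC rule.
        if self.state == "prepared":
            self.state = "committed"

# Coordinator sends abort to both, but the message to p2 is lost.
p1, p2 = PreparedParticipant(), PreparedParticipant()
p1.receive_abort()  # abort delivered
p2.on_timeout()     # abort lost; p2 times out and commits
```

After this run, `p1` has rolled back while `p2` has committed, which is exactly the inconsistency described above.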
As mentioned above, a practical question is who plays the coordinator. In general the coordinator is asynchronous, communicates bidirectionally with the participants, and has a reliable message-delivery mechanism that guarantees stable delivery. Therefore the coordinator is usually a high-performance message middleware such as RocketMQ, Kafka or RabbitMQ.
The figure below shows 2PC in practice: