Talking about TiDB’s New Features: Pessimistic Transactions


Author: Huang Dongxu

Friends who follow TiDB will probably have noticed that TiDB 3.0 introduced an experimental new feature: the pessimistic transaction model. It is also one of our most frequently requested features.

As you know, TiDB is widely used not only in the Internet industry but also in some traditional financial industries, and pessimistic transactions are an indispensable feature in most financial scenarios. Moreover, transactions are a core feature of any relational database: any change to the transaction model affects countless applications, and implementing a pessimistic transaction model elegantly in a distributed system is a very challenging task. So today let's talk about this "hard nut to crack".

ACID and Distributed Transactions

Before we talk about distributed transactions, let's briefly review ACID transactions. Here's the definition of ACID from Wikipedia:

  • Atomicity: All operations in a transaction either complete entirely or not at all; a transaction never ends partway through. If a transaction fails during execution, it is rolled back to the state before it started, as if it had never been executed.
  • Consistency: The integrity of the database is not compromised before or after the transaction.
  • Isolation: The ability of the database to let multiple concurrent transactions read, write, and modify data at the same time. Isolation prevents data inconsistencies caused by the interleaved execution of concurrent transactions.
  • Durability: Once a transaction completes, its changes to the data are permanent and will not be lost even if the system fails.

An intuitive example is a bank transfer, which either succeeds or fails. Under no circumstances should money be deducted from one account without being added to the other.

A distributed transaction is simply the realization of ACID transaction semantics in a distributed database, just as in a traditional single-machine database.

What is optimistic? What is pessimistic? A small example

Many database textbooks introduce optimistic and pessimistic transactions in a rather academic way. In fact, the concepts are not complicated and are actually quite easy to understand. Let me introduce them with a small example from daily life.

Imagine you're about to head to a restaurant, but you're not sure whether it will be full when you arrive, and you don't want to wait in line. You have two choices. If you're an optimist, your inner monologue might be "Just go; if there's no seat, come back." If you're a pessimist, you might call ahead first to make a reservation, confirm there is definitely a seat, and pay a deposit so the restaurant holds the table, letting you walk straight in.

The example above maps intuitively onto the behavior of the two transaction models. An optimistic transaction commits directly and rolls back on conflict. A pessimistic transaction tries to lock the resources it needs to modify before actually committing, and only starts the commit once it is sure the transaction can succeed.

With this example in mind, the pros and cons of optimistic and pessimistic transactions are easy to see. The optimistic model suits scenarios with a low conflict rate: committing directly ("going straight to the restaurant") is likely to succeed ("the restaurant has seats"), and a conflict ("no seats") is a low-probability event; but once a conflict does occur, the cost of rolling back ("going home") is high. The advantage of the pessimistic model is that in high-conflict scenarios, the cost of locking up front ("calling ahead and paying a deposit") is lower than the cost of rolling back afterwards, and it also solves, at relatively low cost, the scenario where multiple concurrent transactions conflict with each other so that none of them can succeed.

TiDB transaction model – Percolator

TiDB's distributed transactions have always been implemented using the Percolator model. Before we discuss our pessimistic transaction implementation, let's briefly introduce Percolator.

Percolator is a model for building distributed transactions on top of a distributed KV store, proposed by Google in an OSDI 2010 paper. It is essentially a standard 2PC (two-phase commit), a classic distributed transaction algorithm. There are plenty of articles about the two phases online, so I won't expand on them here. The biggest problem with vanilla 2PC is the transaction manager. In a distributed setting, a participant may lose its connection to the coordinator after the first phase, at which point the participant does not know whether the transaction was ultimately committed or rolled back. In theory, once the coordinator has confirmed in the first phase that all participants have persisted their data, it can mark the transaction as successful and enter the second phase; but if a participant misses the COMMIT message in the second phase, then after that participant recovers it needs somewhere to ask whether its local transaction was actually committed, and that requires the intervention of the transaction manager.

Sharp-eyed readers may already see the problem: this transaction manager is a single point in the whole system. Even if participants and coordinators can scale out, the transaction manager must atomically maintain the commit and rollback state of every transaction.

Percolator's model essentially fixes this problem. Here's a brief walkthrough of the write transaction process in Percolator:

In fact, it is not quite accurate to say Percolator has no single point: there is a single TSO (Timestamp Oracle) that allocates monotonically increasing timestamps. In TiDB's implementation, however, the TSO is part of the PD leader, and since PD is naturally highly available, the TSO is too.

Whenever a transaction begins, the coordinator (inside TiDB, the tikv-client plays this role) gets a timestamp from the PD leader and uses this TS as the transaction's unique ID. The standard Percolator model is optimistic: before committing, it collects all the rows (key-value pairs) the transaction modifies and randomly selects one of them as the Primary row; the rest automatically become Secondary rows. Note that the Primary is chosen at random, and which specific row it is doesn't matter at all. The Primary's sole purpose is to record the completion status of this transaction.

After the Primary row is chosen, the normal two-phase commit begins. The first phase is locking plus writing the new version. "Locking a key" actually means writing a lock record for it. For example, suppose a transaction operates on three rows: A, B, and C. The original layout in the database is as follows:

[Figure: original database layout for rows A, B, and C]

Suppose this transaction needs to Update (A, B, C) at Version 4. In the first phase we select A as the Primary row; after the first phase completes, the layout of the database becomes:

[Figure: database layout after the first phase, with A_Lock, B_Lock, and C_Lock written]

The above is just an illustration; in TiKV we've made some optimizations, but the principle is the same. The entries shown in red in the figure are the data newly written in the first phase. Notice A_Lock, B_Lock, and C_Lock: these are the so-called locks. As you can see, the lock contents of B and C actually point back to the Primary lock of this transaction. In the second phase of 2PC, the key to a successful commit is the handling of the Primary lock: if the commit of the Primary row completes (writing a commit record for the new version + clearing the Primary lock), the transaction has succeeded; otherwise it has failed. Cleaning up the Secondary rows is not a concern and can be done asynchronously (why this doesn't matter is left as an exercise for the reader).
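The two phases above can be sketched as a toy in-memory model. This is purely illustrative: the `Store` class, `prewrite`/`commit` names, and the column layout are my own simplification, not TiKV's actual API.

```python
class Store:
    """A toy Percolator-style store: versioned data plus a lock column."""

    def __init__(self):
        self.data = {}   # (key, ts) -> value or commit record
        self.lock = {}   # key -> lock info

    def prewrite(self, rows, start_ts, primary):
        # Phase 1: lock every row and write the new version at start_ts.
        for k in rows:
            if k in self.lock:
                raise RuntimeError(f"write conflict on {k}")
        for k, v in rows.items():
            # Secondary locks point back at the Primary row, so the fate of
            # the whole transaction is decided by the Primary lock alone.
            self.lock[k] = {"primary": primary, "start_ts": start_ts}
            self.data[(k, start_ts)] = v

    def commit(self, rows, primary, start_ts, commit_ts):
        # Phase 2: committing the Primary row decides the transaction.
        del self.lock[primary]
        self.data[(primary, commit_ts)] = ("commit", start_ts)
        # Secondary locks can be cleaned up lazily / asynchronously.
        for k in rows:
            if k != primary and k in self.lock:
                del self.lock[k]
                self.data[(k, commit_ts)] = ("commit", start_ts)

store = Store()
store.prewrite({"A": 1, "B": 2, "C": 3}, start_ts=4, primary="A")
store.commit({"A": 1, "B": 2, "C": 3}, primary="A", start_ts=4, commit_ts=5)
```

After `commit`, all locks are gone and each row carries a commit record pointing at the version written in phase 1.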

Once you understand Percolator's model, you'll see that it essentially adopts a decentralized idea, scattering the centralized transaction status information across the data rows themselves (the Primary row of each transaction). For an undecided transaction, you only need to follow the lock information to its Primary to determine the transaction's status.

Limitations of optimistic transactions, and why we need pessimistic ones

In many common Internet scenarios, even though concurrency and data volume are huge, the conflict rate is not high. Take e-commerce or social networks as a simple example, and set aside extreme cases such as flash sales ("seckill") or celebrity ("Big V") accounts: the access pattern can be considered fairly random. Moreover, Internet companies mostly do not handle those extremely high-conflict scenarios directly at the database layer; they are usually absorbed by asynchronous queues or caches, so I won't expand on them here.

But in some traditional financial scenarios, for various reasons, there are workloads that have a high conflict rate and still require strict transactional guarantees. A simple example is payroll: for an employer, paying salaries means transferring money from a corporate account to the personal accounts of many employees, usually as a batch operation. A single large transfer transaction may involve thousands of updates. Now imagine that while this big transaction is executing, one of those personal accounts changes (say, the employee spends money). If the big transaction uses the optimistic model, it must roll back at commit time, and with tens of thousands of personal accounts involved, a conflict is a high-probability event. If nothing is done, in the worst case the big transaction never completes, retrying and rolling back forever (starvation).

Another, more important reason is that in some business scenarios, the pessimistic transaction model is simply easier to program against. Why is that?

Because TiDB supports the MySQL protocol, it supports MySQL-style interactive transactions. For example, a program might be written like this (pseudocode):

txn = mysql.Begin();
affected_rows = txn.Execute("UPDATE t SET v = v + 1 WHERE k = 100");
if affected_rows > 0 {
    // path A: a row matched and was updated
} else {
    // path B: no row matched
}
txn.Commit();
Notice that the if statement decides whether to execute path A or path B directly from the affected_rows returned by the UPDATE above. But sharp-eyed readers will see the problem: in a database with an optimistic transaction model, before COMMIT executes, the final outcome is actually unknown, so what should affected_rows be? The value is meaningless here, and the program may take the wrong path. This problem has essentially no clean answer in a database backed by optimistic transactions; the application has to work around it itself.

The essence of the problem is that the MySQL protocol supports interactive transactions, but MySQL itself has no native optimistic transaction support (InnoDB locks can be considered pessimistic locks). So native MySQL locks the rows first when executing the UPDATE above, confirming the update can complete before continuing, which is why the affected_rows it returns is correct. For TiDB, a distributed system, achieving pessimistic locking behavior similar to single-machine MySQL (as we did in 3.0) is more challenging: for example, we need to introduce new mechanisms to manage distributed locks. So we chose to implement the optimistic transaction model from the paper first, and did not set out to implement pessimistic transactions until 3.0. Let's take a look at the "magic" behind it.

Pessimistic Transaction Implementation in TiDB 3.0

Before discussing implementation, let’s talk about some important design goals:

  1. Compatibility: be as compatible as possible with MySQL's pessimistic transaction behavior, minimizing the cost of migrating user applications.
  2. Flexibility: support switching between pessimistic and optimistic behavior at the session level and even the transaction level, so the coexistence of optimistic and pessimistic transactions must be considered.
  3. High performance: deadlock detection and lock maintenance must not be too expensive.
  4. High availability + scalability: no single point of failure in the system, and the design must scale.

TiDB implements pessimistic transactions in a rather clever and elegant way. After carefully studying the Percolator model, we found that we only need to slightly modify the behavior of two-phase commit: instead of taking the first-phase locks when the client calls COMMIT, we move the locking and lock-waiting forward into the execution of each DML statement within the transaction, like this:

[Figure: optimistic transaction flow vs. pessimistic transaction flow, with locking moved forward into DML execution]

That is essentially all there is to TiDB's pessimistic lock implementation. While executing DML (UPDATE/DELETE) in a transaction, TiDB not only caches the rows to be modified locally but also takes pessimistic locks on those rows directly. The format of these pessimistic locks is almost identical to the locks in optimistic transactions, but their content is empty, just a placeholder. At COMMIT time, these pessimistic locks are rewritten in place into standard Percolator-model locks, and the rest of the flow is the same as before. The only behavioral changes are:

For read requests, when such a pessimistic lock is encountered, there is no need to wait for it to be released as with an optimistic transaction's lock; the latest data can be returned directly (as for why, the reader can think it through).

For write requests, encountering a pessimistic lock simply means waiting for the lock as usual.

This scheme is largely compatible with the original transaction implementation and preserves scalability, high availability, and flexibility (it basically reuses the original Percolator machinery, so those properties carry over unchanged).
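The "placeholder lock rewritten at commit" idea can be sketched as follows. All names here (`pessimistic_lock`, `prewrite`, the lock-record fields) are illustrative assumptions, not TiKV's real API.

```python
locks = {}  # key -> lock record

def pessimistic_lock(key, txn_start_ts):
    # Taken while executing UPDATE/DELETE: same shape as a Percolator lock,
    # but carrying no value -- just a placeholder that blocks other writers.
    if key in locks:
        raise RuntimeError("must wait: key already locked")
    locks[key] = {"kind": "pessimistic", "start_ts": txn_start_ts}

def prewrite(key, value, txn_start_ts, primary):
    # At COMMIT, the placeholder is rewritten in place into a standard
    # Percolator prewrite lock carrying the new value; the rest of 2PC
    # proceeds exactly as in the optimistic model.
    assert locks[key]["start_ts"] == txn_start_ts
    locks[key] = {"kind": "prewrite", "primary": primary,
                  "start_ts": txn_start_ts, "value": value}

pessimistic_lock("k100", txn_start_ts=7)
prewrite("k100", value=42, txn_start_ts=7, primary="k100")
```

After the rewrite, the lock on `k100` is indistinguishable from one taken by an optimistic transaction's first commit phase.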

But introducing pessimistic locks into interactive transactions may introduce another problem: deadlock. This problem does not really exist in the optimistic transaction model, because all the rows to be locked are known at commit time and can be locked in a fixed order, which naturally avoids deadlock. (In TiKV's actual implementation, optimistic locks are taken concurrently rather than in order, but the lock timeout is very short, so a deadlock can also be retried quickly.) In pessimistic transactions, however, the locking order is undetermined because the transactions are interactive. For example:

  • Transaction 1's operation order: UPDATE A, UPDATE B
  • Transaction 2's operation order: UPDATE B, UPDATE A

If these two transactions are executed concurrently, deadlock may occur.

So to avoid deadlock, TiDB needs to introduce a deadlock detection mechanism, and its performance must be good. The detection algorithm itself is fairly simple: it suffices to ensure that the wait-for dependencies between ongoing pessimistic transactions never form a cycle.

For example, when transaction 1 holds the lock on A and another transaction 2 waits for A, a dependency arises: transaction 2 depends on transaction 1. If transaction 1 then tries to wait for B (assuming transaction 2 already holds the lock on B), the deadlock detection module will discover the circular dependency and abort (or retry) one of the transactions. Because that transaction has not actually prewritten and committed anything yet, the cost of aborting it is relatively small.

[Figure: deadlock detection for TiDB pessimistic locks]
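The cycle check on the wait-for graph can be sketched in a few lines. This is a minimal illustration of the idea, not TiKV's detector; the graph representation and function name are my own.

```python
def has_cycle(wait_for, start):
    """Return True if `start` transitively waits for itself in the
    wait-for graph, i.e. adding its pending wait would create a cycle."""
    seen = set()
    stack = [start]
    while stack:
        txn = stack.pop()
        for waits_on in wait_for.get(txn, ()):
            if waits_on == start:
                return True          # found a path back to `start`: deadlock
            if waits_on not in seen:
                seen.add(waits_on)
                stack.append(waits_on)
    return False

# txn2 waits for txn1 (lock on A); txn1 then tries to wait for txn2 (lock on B)
wait_for = {"txn2": ["txn1"], "txn1": ["txn2"]}
print(has_cycle(wait_for, "txn1"))  # True: abort or retry one transaction
```

Because the whole graph lives in memory on one node, each check is a cheap traversal, which is why the detector can sustain a high QPS.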

In the implementation, TiKV dynamically elects one TiKV node to run deadlock detection (in practice we simply use the TiKV node hosting Region 1). It keeps an in-memory record of the wait-for dependencies among executing transactions. When a pessimistic transaction is about to wait for a lock, it first goes through the deadlock detection module, so this step may introduce one extra RPC. In practice, deadlock detection is asynchronous and does not add latency (recall the deposit call to the restaurant :P). Because it is purely in-memory, performance is good. We ran a simple benchmark of the deadlock detection module; the results are as follows:

[Figure: deadlock detection benchmark results]

It can reach roughly 300K+ QPS of throughput, which is enough for most concurrent transaction scenarios. There are further optimizations as well: for example, the first lock wait of a pessimistic transaction cannot cause a deadlock, so no request is sent to the deadlock detector for it. In actual tests, the overhead introduced by the pessimistic transaction model is not high. In addition, because TiKV itself provides Region-level high availability, Region 1 is guaranteed to exist, which indirectly solves the high-availability problem of the deadlock detection service.

As for pessimistic locks, we also need to handle timeouts for long transactions. That part is relatively simple, so I won't expand on it here.

How to use it?

There is a new section in the TiDB 3.0 configuration file:

[Figure: the pessimistic-txn section of the TiDB configuration file]

Set enable to true and you're done. It is off by default.
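For reference, the relevant fragment of the TiDB 3.0 configuration file looks roughly like this; treat the exact key names as a sketch and check the configuration reference for your version:

```toml
# Pessimistic transaction support (experimental in 3.0, off by default).
[pessimistic-txn]
enable = true
```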

Second, we introduced two syntaxes for use in practice:

  • BEGIN /*!90000 PESSIMISTIC */
  • BEGIN /*!90000 OPTIMISTIC */

A transaction started with the first form enters pessimistic transaction mode; the second keeps the optimistic model. It's that simple.
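A minimal usage sketch (the table and values are illustrative). Because the hint uses MySQL version-comment syntax, a plain MySQL server simply ignores it, so the same SQL stays portable:

```sql
BEGIN /*!90000 PESSIMISTIC */;
UPDATE t SET v = v + 1 WHERE k = 100;
COMMIT;
```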

The pessimistic transaction model is a very important feature for financial scenarios, and for TiDB, whose goal is compatibility with MySQL semantics, it is also an important step forward in compatibility. I hope you like it. Enjoy it!
