Multi version concurrency control (mvcc) of MySQL


1、 Snapshot read and current read

Snapshot read is a consistent and unlocked read, which is one of the core reasons for the high concurrency of InnoDB.

Under the read committed transaction isolation level, consistent unlocked reading means that the latest snapshot data of the locked row is always read. Therefore, if other transactions modify the row data, the transaction can also read it, which also fits the problem of non repeatable reading under the RC isolation level;

At the repeatable read transaction isolation level, consistent and unlocked reads refer to the data read by the transaction, either existing before the start of the transaction, or inserted or modified by the transaction itself. (this isolation level will be described below);

Simple unlocked selections are snapshot reads, for example:


Corresponding to the snapshot read is the current read. The current read is to read the latest data, not the data of the historical version. The locked select belongs to the current read, for example:


Select… For update adds an X lock to the read row record. Other transactions cannot add any lock to the locked row.

Select… Lock in share mode adds an s lock to the read row record. Other transactions can add an s lock to the locked row, but if an X lock is added, it will be blocked.

2、 Multi version concurrency control based on snapshot read

The full English name of multi version concurrency control technology is: multi version concurrency control, referred to as mvcc. It realizes database concurrency control by saving the historical version of data and managing multiple versions of data rows. In this way, we can determine whether the data is displayed by comparing the version number. When reading the data, we don’t need to lock it to ensure the isolation effect of transactions (which can be understood as optimistic lock).

Multi version concurrency control (mvcc) works only under the two isolation levels of repeatable read and read committed. The other two isolation levels are incompatible with mvcc because read uncommitted always reads the latest data row instead of the data row conforming to the current transaction version; Serializable locks all rows read.

Most of MySQL’s transactional storage engines do not implement simple row level locks. Based on the consideration of improving concurrency performance, they generally implement multi version concurrency control (mvcc) at the same time. Not only mysql, but also other database systems such as Oracle and PostgreSQL have implemented mvcc, but their implementation mechanisms are different, because mvcc does not have a unified implementation standard, and typical are optimistic concurrency control and pessimistic concurrency control.

The process of mvcc is very similar to that of version control systems such as SVN. In other words, version control systems such as SVN use the idea of mvcc.

3、 What problems does multi version concurrency control solve?

1. Blocking between reading and writing

Through mvcc, read and write can not block each other, that is, read does not block write, write does not block read, which can improve the concurrent transaction processing ability.

Evolution ideas for improving Concurrency:

  • Ordinary lock, which can only be executed serially;
  • Read write lock, which can realize read concurrency;
  • Data multi version concurrency control can realize read-write concurrency.

2. The probability of deadlock is reduced

Because InnoDB mvcc adopts optimistic locking, it does not need to lock when reading data, and only lock the necessary rows for write operations.

3. Solve the problem of consistent reading

Consistent non locked read is also called snapshot read, which is also the default reading method of InnoDB storage engine. When we query the snapshot of the database at a certain point in time, we can only see the update results submitted by transactions before this point in time, but not after this point in time.

4、 How does InnoDB’s mvcc work?

1. How does InnoDB store multiple versions of records?

Transaction version number:Every time we start a transaction, we will get a transaction ID (i.e. transaction version number) from the database. This transaction ID is self increasing. Through the ID size, we can judge the time sequence of transactions.

Hidden columns for row records:The leaf segment of InnoDB stores data pages in which row records are saved, and there are some important hidden fields in the row records:

  • DB_ROW_ID: 6-byte, hidden row ID, used to generate the default cluster index. If the cluster index is not specified when we create the data table, InnoDB will use this hidden ID to create the cluster index. Clustering index can improve the efficiency of data search.
  • DB_TRX_ID: 6-byte, the transaction ID of the data, that is, the last transaction ID to insert or update the data. (InnoDB’s insert, update and delete will update the transaction ID, and delete will mark a special bit as deleted)
  • DB_ROLL_PTR: 7-byte, rollback pointer, that is, undo log information pointing to this record.

Undo Log:InnoDB saves row record snapshots in undo log. We can find them in the rollback segment, as shown in the following figure. The rollback pointer connects all snapshot records of data rows through the structure of linked list, and the records of each snapshot are saved in dB at that time_ trx_ ID, which is also the transaction ID for operating this data at that time point. In this way, if we want to find the historical snapshot, we can find it by traversing the rollback pointer.

For the delete operation, the InnoDB storage engine only sets the delete flag of the record to 1. The record has not been physically deleted, that is, the record still exists in the B + tree. This design is because the InnoDB storage engine supports mvcc, so the record cannot be processed immediately when the transaction is committed. At this time, other transactions may reference this line, When will this record be deleted? This is handled by the purge thread. The purge thread will judge whether the record is not referenced by any other transaction, so the real delete operation can be performed.

Reference link:Multi version concurrency control (mvcc) of MySQL