Understand MySQL mvcc



Concurrency control is multi version control. Mvcc is a method of concurrency control. Generally, it realizes the concurrent access to the database in the database management system.

Why mvcc? Databases often use locks for isolation. The most primitive lock, after locking a resource, will prohibit any other thread from accessing the same resource. However, one of the characteristics of many applications is the scenario of reading more and writing less. The reading times of many data are much greater than the modification times, and it is not necessary for the read data to be mutually exclusive. Therefore, a method of read-write lock is used. Read lock and read lock are not mutually exclusive, but write lock, write lock and read lock are mutually exclusive. This greatly improves the concurrency of the system. Later, people found that concurrent reading was not enough, and proposed a method to ensure that there was no conflict between reading and writing, that is, when reading data, they saved the data in a way similar to snapshot, so that the read lock did not conflict with the write lock, and different transaction sessions would see their own specific version of data. Of course, snapshot is a conceptual model, and different databases may implement this function in different ways.

InnoDB and mvcc

Mvcc only works under the two isolation levels of read committed and repeatable read. The other two isolation levels are incompatible with mvcc, because read uncommitted always reads the latest data row, not the data row that conforms to the current transaction version. Serializable locks all rows read.

Redo log, bin log, Undo log

InnoDB realizes multi version of data through undo log, while concurrency control is realized through lock.

Undo log is not only used to implement mvcc, but also used for transaction rollback. There are many logs in MySQL InnoDB. In addition to error logs and query logs, there are also many logs related to data persistence and consistency.

Binlog is a log generated by the MySQL service layer, which is commonly used for data recovery and database replication. The common MySQL master-slave architecture is realized by using the binlog of the slave synchronization master. In addition, by parsing binlog, the data replication from Mysql to other data sources (such as elasticsearch) can be realized.

Redo log records the modification of data operations at the physical level. MySQL uses a large amount of cache. The cache exists in memory. During the modification operation, the memory will be modified directly rather than the disk immediately. When the data in memory and disk are inconsistent, the data in memory is called dirty page. In order to ensure the security of data, redo logs will be generated continuously during the transaction. A flush operation will be performed when the transaction is committed and saved to the disk. Redo logs are written in order, and the speed of sequential reading and writing on the disk is much faster than that of random reading and writing. When the database or host fails and restarts, the data will be recovered according to the redo log. If there is a transaction commit in the redo log, the transaction commit will be performed to modify the data. In this way, the atomicity, consistency and persistence of transactions are realized.

Undo log: in addition to recording redo log, undo log is also recorded when data is modified. Undo log is used for data withdrawal. It records the reverse operations of modification, such as inserting corresponding deletion, modifying corresponding modification to the original data. Through undo log, business rollback can be realized, and mvcc can be realized by backtracking to a specific version of data according to undo log.

Version chain and undo log

InnoDB uses the B + tree as the data structure of the index, and the index of the primary key is clusterindex (cluster index), and the corresponding data content is saved in the leaf node in clusterindex. A table can only have one primary key, so it can only have one cluster index. If the table has no primary key defined, select the first non null unique index as the cluster index. If not, generate a hidden ID column as the cluster index.

Indexes other than cluster index are secondary indexes. The leaf node in the secondary index stores the value of the leaf node of the cluster index.

In addition to the ROWID mentioned just now, there is also TRX in the InnoDB line record_ ID and DB_ roll_ ptr, trx_ ID indicates the ID of the recently modified transaction, DB_ roll_ PTR points to undo log in undo segment.

When adding a transaction, the transaction ID will increase, TRX_ ID can indicate the order in which the transaction starts.

Undo log is divided into insert and update. Delete can be regarded as a special update, that is, modify the deletion mark on the record.

Update undo log records the data information before the data. Through this information, you can restore to the state of the previous version.

When inserting, the generated insert undo log can be deleted after the transaction is committed, because other transactions do not need this undo log.

When deleting or modifying, the corresponding undo log will be generated and the DB in the current data record will be deleted_ roll_ PTR points to the new undo log.

Understand MySQL mvcc


Having finished undo log, let’s take a look at readview. The difference between committed reads and repeatable reads is that they have different strategies for generating readviews.

Readview mainly has a list to store the currently active read-write transactions in our system, that is, begin the uncommitted transactions. Use this list to determine whether a version of the record is visible to the current transaction. The main visibility related attributes are as follows:

up_limit_id: currently submitted transaction number + 1, transaction number < up_ limit_ ID, which is visible to the current read view. It is understood that when creating the read view, the previously committed transaction must be visible to the transaction.

low_limit_id: current maximum transaction number + 1, transaction number > = low_ limit_ ID, which is invisible to the current read view. It is understood that the transaction created after creating the read view must be invisible to the transaction.

trx_ids: is the list of active transaction IDS, that is, the list of uncommitted transactions at the time of read view initialization. So when reading RR, TRX_ The transaction in IDS is invisible to this transaction (except for its own transaction, the modification of its own transaction to the table is certainly visible to itself). It is understood that when creating an RV, the current active transaction ID will be recorded, and even if they commit later, it will not be visible to this transaction.

Use a picture to better understand:

Understand MySQL mvcc

Finally, let’s give an example to better understand the above content.

For example, we have the following table:

Understand MySQL mvcc

Now there is a transaction ID of 60. Execute the following statement and commit:

Update user set name = 'Qiang Ge 1' where id = 1;

At this time, the undo log has a version chain as follows:

Understand MySQL mvcc

After submitting the record with transaction ID of 60, there is a transaction with transaction ID of 100. Modify name = qiangge 2, but the transaction has not been submitted. The version chain at this time is:

Understand MySQL mvcc

At this time, another transaction initiates a select statement to query the record with id = 1 because of TRX_ IDS currently only has a transaction ID of 100, so this record is not visible. Continue to query the next one and find TRX_ Transaction number with id = 60 is less than up_ limit_ ID, it can be seen and the result Qiang Ge 1 is returned directly.

At this time, we commit the transaction with transaction ID of 100, create a new record with transaction ID of 110 and modify ID of 1, name = Qiang Ge 3, and do not commit the transaction. At this time, the version chain is:

Understand MySQL mvcc

At this time, the select transaction executed another query to query the record with ID 1.

If you are a committed read isolation level read_ Committed. At this time, you will re create a readview, and the value in your active transaction list will change to [110]. According to the above statement, you go to the version chain through TRX_ The appropriate result found by ID comparison is strong brother 2.

If you are a repeatable read isolation level repeatable_ Read. At this time, your readview is still the readview generated during the first selection, that is, the value of the list is still [100]. So the result of select is strong brother 1. Therefore, the result of the second select is the same as that of the first, so it is called repeatable reading!

In other words, transactions under the committed read isolation level will generate an independent readview at the beginning of each query, while the repeatable read isolation level will generate a readview at the first reading, and subsequent reads will reuse the previous readview.

This is the mvcc of MySQL. Through the version chain, multiple versions can be realized, which can be read-write and write read concurrently. Different isolation levels are achieved through different readview generation policies.

reference resources:https://zhuanlan.zhihu.com/p/66791480