Enjoy learning MySQL series: deep understanding of transactions in MySQL

Time:2021-5-12

1 Introduction

It must be known to all the partners who have participated in the background development interview that MySQL transaction is the knowledge that the major interviewers are never tired of asking, but what level do you know about the transactionACIDLast month, this article will accompany you to go deep into the transactions in MySQL.

2. Characteristics of transaction

The acid mentioned in the introduction is exactly the four characteristics of transactionAtomicity(Atomicity)Consistency(Consistency)、Isolation(Isolation)、persistence(Durability)

  • Atomicity: the transaction is executed as a whole, and all the operations on the database contained in it are executed or not executed.
  • Consistency: transactions should ensure that the state of the database changes from one consistent state to another. Consistent state means that the data in the database should meet the integrity constraints.
  • Isolation: when multiple transactions are executed simultaneously, the execution of one transaction should not affect the execution of other transactions.
  • Durability: changes to the database by committed transactions should be stored permanently in the database.

Consistency is not easy to understand. Consistency means that whether a transaction is committed or rolled back, it will not destroy the integrity of the data. For example, if a transfers 100 yuan to B, a’s account will be deducted 100 yuan, and B’s account will be increased 100 yuan; If it fails, the account balance of a and B will not change.The change of account amount in a and B must be a complete process (it is impossible for a to deduct 50 and B to increase 50). The whole process must be consistent.

2.1 atomicity

The atomicity of a transaction refers to:Multiple operations in a transaction are inseparable, and can only be executed successfully or failed.
The atomicity of MySQL transactions is achieved throughundo logTo achieve.undo logIt is unique to InnoDB storage engine. The specific implementation mechanism is to write all the changes (add, delete, change) to the log(undo log)。

undo logIt is a logic log, which can be understood as: record the SQL statement opposite to the transaction operation, the transaction executes the insert statement, and the undo log records the delete statement. It records the log in the form of append write, and does not cover the previous log. In addition, undo log is also used to implement mvcc.

If some operations in a transaction are successful, but the other operations cannot be successfully executed due to power failure / system crash / other software and hardware errors, the successful operations can be cancelled by tracing back the log, so as to achieve the purpose of all operations failing.

2.2 persistence

Transaction persistence refers to:All changes of a transaction to the data will be permanently saved in the database.
The persistence of MySQL transactions is achieved throughredo logTo achieve.redo logIt is also unique to InnoDB storage engine. The specific implementation mechanism is: when the data is modified (add, delete, modify), InnoDB engine will first write the record toredo logAnd update the memory. At this time, the update is complete. At the same time, InnoDB engine will brush records to disk at the right time.

redo logIt is a physical log that records what changes have been made in a data page, not in the form of SQL statements. It has a fixed size. It records logs in a circular way. When the space is used up, it will cover the previous logs.

undo logandredo logInstead of writing directly to disk, write firstlog buffer. And then wait for the right time to synchronizeOS bufferThen the operating system decides when to swipe to the disk
Enjoy learning MySQL series: deep understanding of transactions in MySQLsinceundo logandredo logAll fromlog bufferreachOS buffer, and then to disk. Therefore, it is possible to lose the log due to power failure / hardware failure. For this reason, MySQL provides three persistence methods: here is a parameterinnodb_flush_log_at_trx_commitThis parameter mainly controlsInnoDBtakelog bufferData writing inOS buffer, and the time point to the disk. The values are 0, 1 and 2 respectively. The default value is 1. The meaning of these three values is shown in the figure below:
Enjoy learning MySQL series: deep understanding of transactions in MySQL
First of all, check MySQL default setting mode 1, that is, write directly after each submissionOS bufferAnd call system functionsfsync()Write the log to disk. In terms of data consistency, this method is undoubtedly the safest. But we all know that safety means inefficiency most of the time. Write directly to each commitOS bufferAnd writing to disk will undoubtedly lead to too many IO times per unit time and low efficiency. In addition, there are mode 0 and mode 2. Basically, it is written to disk once per second, so the efficiency is higher than that of mode 1. However, mode 0 writes the data to log buffer first and then to log bufferOS bufferThen write to disk, and mode 2 is direct writeOS buffer, and then write it to disk, which reduces the process of data copy (fromlog bufferreachOS buffer)So mode 2 is more efficient than mode 0.

I seeundo logandredo logThen how do these two logs make the database recover from the abnormal state to the normal state?

After the database system crashes, restart, at this time the database is in an inconsistent state, you must first execute acrash recoveryProcess: read firstredo logThe data that has been submitted successfully but has not been written to the disk is written to the disk again to ensure the persistence. Rereadundo logThe transaction that has not been successfully committed is rolled back to ensure atomicity.crash recoveryAfter completion, the database is restored to a consistent state and can continue to be used.

2.3 isolation

The isolation of database transactions means that when multiple transactions are executed simultaneously, the execution of one transaction should not affect the execution of other transactions. Under normal circumstances, multiple transactions must operate the same database at the same time, so the isolation between transactions is essential.
If there is no isolation, the following problems will occur:

2.3.1 type I missing updates

When a transaction is revoked, it covers the update data submitted by another transaction.
Suppose that there are two transactions a and B operating the amount of the same account at the same time, as shown in the figure below:
Enjoy learning MySQL series: deep understanding of transactions in MySQL
Obviously, when transaction B cancels the transaction, it covers the update data submitted by transaction a in T4. A has taken 200 yuan at T3, and the balance should be 800 yuan. However, since the balance is 1000 yuan at the beginning of transaction B, the balance will become 1000 yuan after rollback. In this way, the user withdraws the money, but the balance remains unchanged, and the bank loses money to grandma’s house.

2.3.2 dirty reading

One transaction read the uncommitted update data of another transaction.
This is illustrated by the following figure
Enjoy learning MySQL series: deep understanding of transactions in MySQL
Transaction a took 200 yuan at T3, but did not submit it. Transaction B queries the balance at T4 to see the uncommitted updates of transaction a.

2.3.3 unreal reading

Unreal reading (unreal reading) means that one transaction reads the new data submitted by another transaction
It is still illustrated in the figure:
Enjoy learning MySQL series: deep understanding of transactions in MySQL
When transaction B performs two statistical operations in the same transaction, another transaction inserts a record, resulting in different results, which seems to be an illusion. Another case is that transaction B updates a field of all the records in the table, and then transaction a inserts another record. Transaction B queries again and finds that one record has not been updated, which is also phantom reading.

2.3.4 non repeatable reading

Non repeatable read: one transaction has read the update data submitted by another transaction
Non repeatable reading, as the name suggests, means that repeated reading of data in the same transaction will lead to inconsistency, as shown in the following figure:
Enjoy learning MySQL series: deep understanding of transactions in MySQL
Transaction B queries the balance in T2 and T5, but the result is different each time. This is not allowed in development. If the same transaction queries several times and returns different results each time, people will doubt the reliability of the database.

2.3.5 type II missing updates

When a transaction is committed, it covers the update data submitted by another transactionEnjoy learning MySQL series: deep understanding of transactions in MySQL
As can be seen from the above figure, after transaction a is submitted, the account balance has changed, and then transaction B still deducts the withdrawal amount based on the original amount (i.e. 1000). Transaction B submits transaction a to completely cover it. This is the second type of missing update.

Note that it is different from the first type of lost updates. The first type of lost updates focuses on transaction B’s final revocation of the transaction, and the second type is the final submission of the transaction.

To solve these five problems, MySQL provides four isolation levels:

  • Serializable (serializable): transactions are executed in a serial way, which has high security and low efficiency
  • Repeatable read: YesMySQL default isolation level, the same query in the same transaction will see the same data row, with high security and efficiency
  • Read committed: one transaction can read the data submitted by another transaction, which has low security and high efficiency
  • Read uncommitted: one transaction can read the uncommitted data of another transaction, which has low security and high efficiency
Isolation level Is the first type of missing update present Is there dirty reading Is there false reading Is there non repeatable reading Is there a second type of missing update
Serializable no no no no no
Repeatable Read no no yes no no
Read Commited no no yes yes yes
Read Uncommited no yes yes yes yes

3 Repeatable Read

Repeatable readMySQL is the default isolation level of MySQL, and it is also the most used isolation level, so it is necessary to take it out for in-depth understanding.Repeatable ReadIt can’t solve the problem of unreal reading. Let’s take a look at an example.
First, create a table and insert a record

CREATE TABLE `student` (
  `id` bigint(20) unsigned NOT NULL AUTO_ Input comment 'primary key',
  `stu_ ID ` bigint (20) not null default '0' comment 'Student ID',
  `stu_ Name ` varchar (100) default null comment 'student name',
  `created_ Date ` datetime not null comment 'creation time',
  `modified_ date` datetime NOT NULL ON UPDATE CURRENT_ Timestamp comment 'modification time',
  `ldelete_ Flag ` tinyint (1) not null default '0' comment 'logical deletion flag, 0: not deleted, 2: deleted',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_ Bin comment ='student information table ';


INSERT INTO `student` VALUES (1, 230160340, 'Carson', '2016-08-20 16:37:00', '2016-08-31 16:37:05', 0);

Similarly, start two transactions, as shown in the following table:

time Transaction a Transaction B
T1 SELECT * FROM student
T2 INSERT INTO student VALUES (2, 230160310, ‘Kata’, ‘2016-08-20 16:37:00’, ‘2016-08-31 16:37:05’, 0)
T3 commit
T4 SELECT * FROM student

According to the above theory, there will be illusory reading phenomenon. In other words, transaction a’s query select in T4 will see the new data submitted by transaction B.
But it’s going to disappoint you.

The results are as follows
Enjoy learning MySQL series: deep understanding of transactions in MySQL
It is not consistent with the expected results, and there is no illusory reading phenomenon.
In fact, MySQLRepeatable ReadIn isolation level, mvcc is used(Multiversion Concurrency Control, multi version concurrency control) solves the unreal reading phenomenon of select ordinary query.
The specific implementation is that at the beginning of the transaction, the first select statement generates a snapshot of the query result set(snapshot)And before the end of this transaction, the same select statement returns the result of this snapshot instead of the latest query result, which is the result of MySQL inRepeatable ReadThe effect of isolation level on the use of ordinary select statementsSnapshot readsnapshot read)。

What is the relationship between snapshot read and mvcc?

Mvcc is multi version concurrency control, and snapshot is one of them. So it can be said thatMvcc implements snapshot readingThe specific implementation involves the hidden columns of MySQL. MySQL automatically creates three hidden columns for each table:

  • DB_TRX_ID: transaction ID, which records the transaction ID of the data transaction for the operation (add, delete, change)
  • DB_ROLL_PTR: rollback pointer to record the position of the data of the previous version in undo log
  • DB_ROW_ID: Hidden ID. when there is no suitable index for creating a table as a clustered index, the hidden ID will be used to create a clustered index

becauseundo logThe data of each version is recorded in theDB_ROLL_PTREach historical version can be found, and theDB_TRX_IDDecide which version (snapshot) to use. So it’s equivalent toundo logMvcc is implemented, and mvcc implements snapshot reading.

In this way, MySQL’sRepeatable ReadIsolation level utilizationSnapshot readThe problem of unreal reading has been solved.
But this is not the case. Let’s take another example

time Transaction a Transaction B
T1 SELECT * FROM student
T2 INSERT INTO student VALUES (3, 230160312, ‘Luffy’, ‘2016-08-20 16:37:00’, ‘2016-08-31 16:37:05’, 0)
T3 commit
T4 UPDATE student SET stu_name = ‘Katakuri’ WHERE stu_name = ‘Luffy’;
T4 SELECT * FROM student

Transaction a generates a snapshot at T1, and transaction B inserts a piece of data at T2Luffy, and then submit. Put it in T4LuffyUpdate toKatakuriAccording to the experience of the previous example, transaction a cannot be seen at this timeLuffyThis data is invalid, so the update will not succeed, and the query at T5 is the same as that at T1CarsonandKataTwo pieces of data.

But I’m going to let you down again

The results are as follows
Enjoy learning MySQL series: deep understanding of transactions in MySQL

However, the execution result is not as expected, and transaction a not only sees itLuffyAnd successfully changed it toKatakuri. Even after transaction a successfully commits, the query again is still like this.

This is actually MySQL’s rightinsertupdateanddeleteStatementCurrent reading(current read)。 Because it involves the modification of data, MySQL must get the latest data to modify, so the modification of data must not be usedSnapshot read(snapshot read)。 Since transaction a reads the new data submitted by transaction B, the unreal read mentioned above is generated.

So inRepeatable ReadHow does the isolation level solve the problem of unreal reading?

It’s throughClearance lock(gap lock). We all know thatInnoDB supports row lock, and row lock locks index. The gap lock is used to lock the gap of index record to ensure that the gap of index record remains unchanged. Gap lock is used for transaction isolation level ofRepeatable ReadOr above,Clearance lockTogether with the row lockNext-Key Lock. When InnoDB scans an index record, it will first add an uplink lock to the index record, and then add an uplink lock to the gap on both sides of the index recordClearance lock(Gap Lock)。 addClearance lockAfter that, < font color = “red” > other transactions cannot insert records in this gap. This effectively prevents the occurrence of unreal reading < / font >.

By default, InnoDB works inRepeatable ReadUnder the isolation level of, and withNext-Key LockLock the index row in the same way. When the index of the query is unique (primary key, unique index), the InnoDB storage engine willNext-Key LockOptimize to reduce it to row lock, locking only the index itself, not the range (unless locking values that don’t exist). If it is a normal index, it will be usedNext-Key LockLock the record with the gap</ font>

Query statement using snapshot read

SELECT * FROM ...

Use the currently read statement

SELECT * FROM ... lock in share mode
SELECT * FROM ... for update
INSERT INTO table ...
UPDATE table SET ...
DELETE table WHERE ...

4 Summary

This article mainly explains the MySQL transaction managementACIDFour characteristics,undo logandredo logThree ways of atomicity, persistence and log persistence are realized respectively. Five kinds of problems under database concurrency, four isolation levels and select unreal reading under RR isolation level are solved by mvcc mechanismselect ... lock in share mode/select ... for update/insert/update/deleteThe magic of reading throughClearance lockTo solve the problem.
This article involves more in-depth, master the knowledge of this article, let you not only stay in the level of acid, isolation level, in the interview can turn passive into active, harvest big factory offer.

A little attention, not lost