It must be known to all the partners who have participated in the background development interview that MySQL transaction is the knowledge that the major interviewers are never tired of asking, but what level do you know about the transaction
ACIDLast month, this article will accompany you to go deep into the transactions in MySQL.
2. Characteristics of transaction
The acid mentioned in the introduction is exactly the four characteristics of transaction
- Atomicity: the transaction is executed as a whole, and all the operations on the database contained in it are executed or not executed.
- Consistency: transactions should ensure that the state of the database changes from one consistent state to another. Consistent state means that the data in the database should meet the integrity constraints.
- Isolation: when multiple transactions are executed simultaneously, the execution of one transaction should not affect the execution of other transactions.
- Durability: changes to the database by committed transactions should be stored permanently in the database.
Consistency is not easy to understand. Consistency means that whether a transaction is committed or rolled back, it will not destroy the integrity of the data. For example, if a transfers 100 yuan to B, a’s account will be deducted 100 yuan, and B’s account will be increased 100 yuan; If it fails, the account balance of a and B will not change.The change of account amount in a and B must be a complete process (it is impossible for a to deduct 50 and B to increase 50). The whole process must be consistent.
The atomicity of a transaction refers to:Multiple operations in a transaction are inseparable, and can only be executed successfully or failed.
The atomicity of MySQL transactions is achieved through
undo logTo achieve.
undo logIt is unique to InnoDB storage engine. The specific implementation mechanism is to write all the changes (add, delete, change) to the log（
undo logIt is a logic log, which can be understood as: record the SQL statement opposite to the transaction operation, the transaction executes the insert statement, and the undo log records the delete statement. It records the log in the form of append write, and does not cover the previous log. In addition, undo log is also used to implement mvcc.
If some operations in a transaction are successful, but the other operations cannot be successfully executed due to power failure / system crash / other software and hardware errors, the successful operations can be cancelled by tracing back the log, so as to achieve the purpose of all operations failing.
Transaction persistence refers to:All changes of a transaction to the data will be permanently saved in the database.
The persistence of MySQL transactions is achieved through
redo logTo achieve.
redo logIt is also unique to InnoDB storage engine. The specific implementation mechanism is: when the data is modified (add, delete, modify), InnoDB engine will first write the record to
redo logAnd update the memory. At this time, the update is complete. At the same time, InnoDB engine will brush records to disk at the right time.
redo logIt is a physical log that records what changes have been made in a data page, not in the form of SQL statements. It has a fixed size. It records logs in a circular way. When the space is used up, it will cover the previous logs.
redo logInstead of writing directly to disk, write first
log buffer. And then wait for the right time to synchronize
OS bufferThen the operating system decides when to swipe to the disk
redo logAll from
OS buffer, and then to disk. Therefore, it is possible to lose the log due to power failure / hardware failure. For this reason, MySQL provides three persistence methods: here is a parameter
innodb_flush_log_at_trx_commitThis parameter mainly controls
log bufferData writing in
OS buffer, and the time point to the disk. The values are 0, 1 and 2 respectively. The default value is 1. The meaning of these three values is shown in the figure below:
First of all, check MySQL default setting mode 1, that is, write directly after each submission
OS bufferAnd call system functions
fsync()Write the log to disk. In terms of data consistency, this method is undoubtedly the safest. But we all know that safety means inefficiency most of the time. Write directly to each commit
OS bufferAnd writing to disk will undoubtedly lead to too many IO times per unit time and low efficiency. In addition, there are mode 0 and mode 2. Basically, it is written to disk once per second, so the efficiency is higher than that of mode 1. However, mode 0 writes the data to log buffer first and then to log buffer
OS bufferThen write to disk, and mode 2 is direct write
OS buffer, and then write it to disk, which reduces the process of data copy (from
OS buffer）So mode 2 is more efficient than mode 0.
redo logThen how do these two logs make the database recover from the abnormal state to the normal state?
After the database system crashes, restart, at this time the database is in an inconsistent state, you must first execute a
crash recoveryProcess: read first
redo logThe data that has been submitted successfully but has not been written to the disk is written to the disk again to ensure the persistence. Reread
undo logThe transaction that has not been successfully committed is rolled back to ensure atomicity.
crash recoveryAfter completion, the database is restored to a consistent state and can continue to be used.
The isolation of database transactions means that when multiple transactions are executed simultaneously, the execution of one transaction should not affect the execution of other transactions. Under normal circumstances, multiple transactions must operate the same database at the same time, so the isolation between transactions is essential.
If there is no isolation, the following problems will occur:
2.3.1 type I missing updates
When a transaction is revoked, it covers the update data submitted by another transaction.
Suppose that there are two transactions a and B operating the amount of the same account at the same time, as shown in the figure below:
Obviously, when transaction B cancels the transaction, it covers the update data submitted by transaction a in T4. A has taken 200 yuan at T3, and the balance should be 800 yuan. However, since the balance is 1000 yuan at the beginning of transaction B, the balance will become 1000 yuan after rollback. In this way, the user withdraws the money, but the balance remains unchanged, and the bank loses money to grandma’s house.
2.3.2 dirty reading
One transaction read the uncommitted update data of another transaction.
This is illustrated by the following figure
Transaction a took 200 yuan at T3, but did not submit it. Transaction B queries the balance at T4 to see the uncommitted updates of transaction a.
2.3.3 unreal reading
Unreal reading (unreal reading) means that one transaction reads the new data submitted by another transaction。
It is still illustrated in the figure:
When transaction B performs two statistical operations in the same transaction, another transaction inserts a record, resulting in different results, which seems to be an illusion. Another case is that transaction B updates a field of all the records in the table, and then transaction a inserts another record. Transaction B queries again and finds that one record has not been updated, which is also phantom reading.
2.3.4 non repeatable reading
Non repeatable read: one transaction has read the update data submitted by another transaction。
Non repeatable reading, as the name suggests, means that repeated reading of data in the same transaction will lead to inconsistency, as shown in the following figure:
Transaction B queries the balance in T2 and T5, but the result is different each time. This is not allowed in development. If the same transaction queries several times and returns different results each time, people will doubt the reliability of the database.
2.3.5 type II missing updates
When a transaction is committed, it covers the update data submitted by another transaction。
As can be seen from the above figure, after transaction a is submitted, the account balance has changed, and then transaction B still deducts the withdrawal amount based on the original amount (i.e. 1000). Transaction B submits transaction a to completely cover it. This is the second type of missing update.
Note that it is different from the first type of lost updates. The first type of lost updates focuses on transaction B’s final revocation of the transaction, and the second type is the final submission of the transaction.
To solve these five problems, MySQL provides four isolation levels:
- Serializable (serializable): transactions are executed in a serial way, which has high security and low efficiency
- Repeatable read: Yes
MySQL default isolation level, the same query in the same transaction will see the same data row, with high security and efficiency
- Read committed: one transaction can read the data submitted by another transaction, which has low security and high efficiency
- Read uncommitted: one transaction can read the uncommitted data of another transaction, which has low security and high efficiency
|Isolation level||Is the first type of missing update present||Is there dirty reading||Is there false reading||Is there non repeatable reading||Is there a second type of missing update|
3 Repeatable Read
Repeatable readMySQL is the default isolation level of MySQL, and it is also the most used isolation level, so it is necessary to take it out for in-depth understanding.
Repeatable ReadIt can’t solve the problem of unreal reading. Let’s take a look at an example.
First, create a table and insert a record
CREATE TABLE `student` ( `id` bigint(20) unsigned NOT NULL AUTO_ Input comment 'primary key', `stu_ ID ` bigint (20) not null default '0' comment 'Student ID', `stu_ Name ` varchar (100) default null comment 'student name', `created_ Date ` datetime not null comment 'creation time', `modified_ date` datetime NOT NULL ON UPDATE CURRENT_ Timestamp comment 'modification time', `ldelete_ Flag ` tinyint (1) not null default '0' comment 'logical deletion flag, 0: not deleted, 2: deleted', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_ Bin comment ='student information table '; INSERT INTO `student` VALUES (1, 230160340, 'Carson', '2016-08-20 16:37:00', '2016-08-31 16:37:05', 0);
Similarly, start two transactions, as shown in the following table:
|time||Transaction a||Transaction B|
|T1||SELECT * FROM student||–|
|T2||–||INSERT INTO student VALUES (2, 230160310, ‘Kata’, ‘2016-08-20 16:37:00’, ‘2016-08-31 16:37:05’, 0)|
|T4||SELECT * FROM student||–|
According to the above theory, there will be illusory reading phenomenon. In other words, transaction a’s query select in T4 will see the new data submitted by transaction B.
But it’s going to disappoint you.
The results are as follows
It is not consistent with the expected results, and there is no illusory reading phenomenon.
In fact, MySQL
Repeatable ReadIn isolation level, mvcc is used（
Multiversion Concurrency Control, multi version concurrency control) solves the unreal reading phenomenon of select ordinary query.
The specific implementation is that at the beginning of the transaction, the first select statement generates a snapshot of the query result set（
snapshot）And before the end of this transaction, the same select statement returns the result of this snapshot instead of the latest query result, which is the result of MySQL in
Repeatable ReadThe effect of isolation level on the use of ordinary select statementsSnapshot read（
What is the relationship between snapshot read and mvcc?
Mvcc is multi version concurrency control, and snapshot is one of them. So it can be said thatMvcc implements snapshot readingThe specific implementation involves the hidden columns of MySQL. MySQL automatically creates three hidden columns for each table:
DB_TRX_ID: transaction ID, which records the transaction ID of the data transaction for the operation (add, delete, change)
DB_ROLL_PTR: rollback pointer to record the position of the data of the previous version in undo log
DB_ROW_ID: Hidden ID. when there is no suitable index for creating a table as a clustered index, the hidden ID will be used to create a clustered index
undo logThe data of each version is recorded in the
DB_ROLL_PTREach historical version can be found, and the
DB_TRX_IDDecide which version (snapshot) to use. So it’s equivalent to
undo logMvcc is implemented, and mvcc implements snapshot reading.
In this way, MySQL’s
Repeatable ReadIsolation level utilizationSnapshot readThe problem of unreal reading has been solved.
But this is not the case. Let’s take another example
|time||Transaction a||Transaction B|
|T1||SELECT * FROM student||–|
|T2||–||INSERT INTO student VALUES (3, 230160312, ‘Luffy’, ‘2016-08-20 16:37:00’, ‘2016-08-31 16:37:05’, 0)|
|T4||UPDATE student SET stu_name = ‘Katakuri’ WHERE stu_name = ‘Luffy’;||–|
|T4||SELECT * FROM student||–|
Transaction a generates a snapshot at T1, and transaction B inserts a piece of data at T2Luffy, and then submit. Put it in T4LuffyUpdate toKatakuriAccording to the experience of the previous example, transaction a cannot be seen at this timeLuffyThis data is invalid, so the update will not succeed, and the query at T5 is the same as that at T1CarsonandKataTwo pieces of data.
But I’m going to let you down again
The results are as follows
However, the execution result is not as expected, and transaction a not only sees itLuffyAnd successfully changed it toKatakuri. Even after transaction a successfully commits, the query again is still like this.
This is actually MySQL’s right
deleteStatementCurrent reading（current read）。 Because it involves the modification of data, MySQL must get the latest data to modify, so the modification of data must not be usedSnapshot read（snapshot read）。 Since transaction a reads the new data submitted by transaction B, the unreal read mentioned above is generated.
Repeatable ReadHow does the isolation level solve the problem of unreal reading?
Clearance lock(gap lock). We all know thatInnoDB supports row lock, and row lock locks index. The gap lock is used to lock the gap of index record to ensure that the gap of index record remains unchanged. Gap lock is used for transaction isolation level of
Repeatable ReadOr above,Clearance lockTogether with the row lock
Next-Key Lock. When InnoDB scans an index record, it will first add an uplink lock to the index record, and then add an uplink lock to the gap on both sides of the index recordClearance lock（Gap Lock）。 addClearance lockAfter that, < font color = “red” > other transactions cannot insert records in this gap. This effectively prevents the occurrence of unreal reading < / font >.
By default, InnoDB works in
Repeatable ReadUnder the isolation level of, and with
Next-Key LockLock the index row in the same way. When the index of the query is unique (primary key, unique index), the InnoDB storage engine will
Next-Key LockOptimize to reduce it to row lock, locking only the index itself, not the range (unless locking values that don’t exist). If it is a normal index, it will be used
Next-Key LockLock the record with the gap</ font>
Query statement using snapshot read
SELECT * FROM ...
Use the currently read statement
SELECT * FROM ... lock in share mode SELECT * FROM ... for update INSERT INTO table ... UPDATE table SET ... DELETE table WHERE ...
This article mainly explains the MySQL transaction management
redo logThree ways of atomicity, persistence and log persistence are realized respectively. Five kinds of problems under database concurrency, four isolation levels and select unreal reading under RR isolation level are solved by mvcc mechanism
select ... lock in share mode/
select ... for update/
deleteThe magic of reading throughClearance lockTo solve the problem.
This article involves more in-depth, master the knowledge of this article, let you not only stay in the level of acid, isolation level, in the interview can turn passive into active, harvest big factory offer.