MySQL is one of the most frequently contacted databases in our daily production and learning. Today, let’s talk about the isolation level in MySQL (or other similar databases) and the multi version concurrency control (mvcc) used to improve efficiency.
First, we need to mention a concept: transaction. What is a transaction? A transaction is a collection of a series of operation statements that complete a basic operation. For example, if I want to transfer 200 yuan from account a to account B, I may perform the following operations:
a. Verify whether the balance in account a is greater than 200 yuan.
b. Reduce the balance in account a by 200 yuan.
c. Add 200 yuan to the balance in account B.
Let’s turn the above three ABC operations into a transaction.
At this time, we will notice that a transaction we are talking about may be composed of multiple statements, and the transaction has atomicity, that is, the execution process of the transaction can not be interrupted, which brings a problem. If another sentence is inserted during the three-step execution process, will it affect the result, Because the atomicity of the transaction is destroyed. This kind of insertion is very common in the concurrent environment. Therefore, we (or database engine) need to “protect” a transaction during its execution, that is, to ensure that the statements of other external transactions cannot be inserted into the executing transaction statements at will, so as to ensure the normal execution of transactions. At this time, we can easily think of the method of “locking”. This is actually a very general statement, because locking can ensure the normal execution of transactions, but it will bring large additional overhead. Therefore, choosing the appropriate locking method at the right time will have a great impact on the search efficiency. Whether the “lock” is strict or not, we can distinguish different isolation levels in addition to concentration.
Under such an isolated domain, data reading will not be affected. That is, you can even read a data being modified by other transactions. If you want to read it, you can change it if you want to. Of course, the cost is small, but it will bring many problems, such as “dirty reading”. That is, the data being modified but not submitted is read, which will cause data reading errors. In terms of performance, read uncommitted is not much better than other levels, but it brings a lot of troublesome problems. Therefore, it is rarely used in practice.
This level adds some provisions on the basis of read uncommitted, which is the default isolation level of some databases. The difference between read uncommitted and read uncommitted is that it stipulates that the data read during reading can only be the submitted data. For example, the value of data a after the last commit is 1. At this time, a thread comes in to modify a to 2, but the transaction (COMMIT) is not committed at this time. In this case, the value of a read under the read uncommitted level is the current 2, but the value read under the read committed level is still the value after the last commit, that is, a is 1, The modified thread must change the value of a to 2. The value of a read after the transaction is committed is 2. The problem with this level is that it is not repeatable. That is, the value of a read at the last time is 1, but with the submission of the transaction by the modification thread, the value of a becomes 2. At this time, the read value is 2, that is, the values obtained by performing the same read operation twice are different.
The difference between non repeatable read and dirty read is that a dirty read is that a transaction reads the data in the execution of another unfinished transaction, while a non repeatable read is that another transaction commits and modifies the data being read by the current transaction in the execution of a transaction.
Repeated read adds some restrictive rules on the basis of read committed, which is also the default isolation level of MySQL database. To put it simply, other transactions are prohibited from modifying the corresponding data during the execution of a transaction, which completely makes the data queried during the execution of a transaction consistent, that is, it solves the problems of dirty reading and non repeatable reading, but it brings a new problem, namely “phantom reading”.
“Unreal reading” means that during the execution of a transaction, although the modification of the corresponding data is prohibited, other transactions can still insert data. At this time, the first transaction will find that there will be “inexplicably” more data, like an illusion. Both unreal reading and non repeatable reading read another committed transaction (which is different from dirty reading). The difference is that the non repeatable reading query is the same data item, while unreal reading is for a batch of data as a whole (such as the number of data).
This is the most stringent isolation level. It avoids the problem of unreal reading by forcing the serial execution of transactions. However, this isolation level is very expensive and is generally not often used.
The relationship between various isolation levels and possible problems is as follows:
|Isolation level||Dirty reading||Non repeatable reading||Unreal reading||Lock|
Just imagine that if each SQL operation needs to add a row level lock to ensure data consistency and accuracy, it is very reliable, but the system overhead and search efficiency are also very obvious. Therefore, mvcc is produced to solve this contradiction.
First, mvcc will save two hidden columns after each row record in the table, one is the creation time of the saved row, and the other is the expiration (deletion) time of the saved row. This time value is not a real time, but a system version number. The system version number at the beginning of the transaction is used as the transaction version number to compare with the version number of each row of records queried.
- Insert: save the current system version number as the line version number for each newly inserted line.
- Delete: saves the current system version number for each deleted line, and deletes the version number for the most deleted line.
- Update: update should be understood as the process of inserting a new piece of data and deleting the original data, that is, save the current system version number as the line version number for the newly inserted data, and save the current system version number as the deletion version number for the deleted data.
- Select: query only the rows that meet the following conditions:
a. The line version number is less than or equal to the transaction version number
b. The deletion version number is undefined or greater than the transaction version number
After saving these two version numbers, most operations can be performed correctly without locking, ensuring performance and efficiency.
It is worth noting that mvcc only works under the two isolation levels of read committed and repeatable read.
The above is a detailed explanation of MySQL database isolation level and mvcc. For more information about MySQL database isolation level and mvcc, please pay attention to other relevant articles of developeppaer!