MySQL operation principle [transaction]

Time:2021-9-23

The article is “inside MySQL Technology: InnoDB storage engine (Second Edition)”Nuggets brochure “how MySQL works: understanding MySQL from the root”My notes

Transaction overview

In real life, we all have the following experiences (which is also one of the classic scenes of business):

  1. I have no money recently. I need to go to the bank to withdraw some money, and then go to the ATM to withdraw money
  2. Withdraw 100 yuan at the ATM
  3. The bank program will reduce the bank account by 100 yuan accordingly
  4. Finally get the money

The above steps are carried out under ideal conditions. If the ATM suddenly loses power / breaks down in the second step, and the money is not received, but the account is 100 yuan less, is it particularly wronged? Or, if the bank server suddenly goes down in the third step, the user gets the money, but the account capital is not reduced accordingly, The bank will lose 100 yuan accordingly.

Under such circumstances, the funds of both sides can not be guaranteed, which is unfavorable to both sides.

The purpose of the transaction is to ensure the normal operation of the above scenarios. The transaction will convert one consistent state of the database to another consistent state. When the database submits work, you can ensure that either all modifications are saved or all modifications will not be saved. In the above scenario, if the ATM machine suddenly loses power or the bank server goes down, No matter what exception occurs, the transaction will be rolled back directly and the changes will not be committed

Acid properties

In fact, we just want the database operation to fully comply with the state transition of our normal logic, and the transactions in the InnoDB storage engine also fully comply with the characteristics of acid.

Atomicity

As we all know, atom is the smallest physical unit in reality and can no longer be divided. In the database, atomicity means that the whole database transaction is an indivisible unit. Only when the transaction is executed successfully can the whole transaction be executed successfully. If any SQL statement fails to execute, the whole transaction fails and the successfully executed SQL will be withdrawn, The database state should be returned to the state before the transaction started.

Consistency

There are many constraints in real life. For example, our ID card numbers cannot be repeated, only men and women (normally speaking), traffic lights have only three colors, etc. there are also constraints in the database. For example, if the index of a column in a table is a unique index, then this column cannot have duplicate row data, Of course, more consistency requirements still need to rely on the people who write business code to ensure the consistency. The final consistency ensures the visibility of the data. The data in the intermediate state is invisible to the outside, and only the data in the initial state and the final state are visible to the outside.

Isolation

Isolation is also called “concurrency control, serialization, lock, etc.” in the real world, the two state transitions should not affect each other. For example, if you withdraw money from two ATMs at the same time, only the money withdrawn from one ATM is deducted, which is unreasonable, right, Therefore, the transaction isolation requires that the objects of each read-write transaction can be separated from the objects of other transaction operations independently, that is, they are invisible to other transactions before the transaction is committed, which is usually realized by locks.

Permanence

Once the transaction is committed, you can’t go back on it, because once the transaction is committed, the result is permanent. Even if the database goes down, the data can be recovered.

Transaction classification

  • Flat transaction: the simplest and most frequently used transactionbeginStart,commit workperhapsrollbackAt the end of the period, all operations are atomic operations, either executed or rolled back
  • Flat transaction with savepoint
    • In addition to the operations supported by flat transactions, rollback to an earlier state in the same transaction is allowed during transaction execution
    • The savepoint is used to inform the system that it should remember the current state of the transaction. Once an error occurs during the transaction, the transaction can return to the current state of the savepoint
  • Chain transaction
    • It can be regarded as a variant of the savepoint mode. For flat transactions with savepoints, when a system crash occurs, all savepoints will disappear because savepoints are volatile rather than persistent
    • This means that when recovering, the transaction needs to be re executed from the beginning, not from the nearest savepoint
  • Nested transaction: a top-level transaction controls transactions at all levels. The transactions nested under the top-level transaction are called sub transactions, which controls each local transformation
  • Distributed transaction: it is usually a flat transaction running in a distributed environment. Therefore, it is necessary to access different nodes in the network according to the location of the data

Transaction implementation

Atomicity, consistency and persistence pass through the databaseredo logandundo logTo finish,redo logIt is called redo log, which is used to ensure the atomicity and persistence of transactions,undo logUsed to ensure the consistency of transactions. Some people might thinkundo logyesredo logThe inverse process of, in fact, is not,redoandundoCan be regarded as a recovery operation,redoRestore the page operations that commit transaction modifications, and undo rollback records to a specific version, so the contents of the two records are different,redoIt is usually a physical log, which records the physical modification of the page,undoIt is a logical log, which is recorded according to each line of records.

redo log

It’s also mentioned aboveredo logFor the sake of transaction persistence (d), the modified content will be refreshed to the disk after the transaction is committed. Even if the database goes down, it will be refreshed after restartredo logThe recorded modifications are refreshed to disk.

useredo logBenefits:

  • redo logSmall footprint
  • redo logSequential write to disk: during transaction execution, several SQL statements will be generated without executing one SQL statementredo log, these logs are written to the disk in the generation order, that is, sequential IO is used

log block
In InnoDB storage engine, redo logs are stored in 512 bytes. Redo log cache and redo log files are saved in blocks, which are called redo log blocks, and the size of each block is 512 bytes

Redo log format
Since the storage management of InnoDB storage engine is page based, its redo log format is also page based. Its header format consists of three parts:

  1. redo_ log_ Type: type of redo log
  2. Space: ID of the tablespace
  3. page_ No: offset of the page

undo log

Redo is stored in the redo log file. Different from redo, Undo is stored in a special segment inside the database. This segment is called undo segment. The redo log records the behavior of transactions and can be used to “redo” pages. However, transactions sometimes need to be rolled back. In addition to rollback, Another function of Undo is mvcc (multi version concurrency control), that is, the implementation of mvcc of InnoDB storage engine is completed by Undo.

Undo storage management

The InnoDB storage engine has rollback segments. 1024 undo log segments are recorded for each rollback segment. The undo page is applied for each undo log segment.

Undo log format
  • Insert undo log: the undo log generated by the insert operation is deleted after the transaction is committed
  • Update undo log: the undo log generated by delete and update operations. The undo log may need to provide mvcc mechanism, so it cannot be deleted when the transaction is committed

purge

  • Purge is used to finally complete update and delete operations to support mvcc
  • Whether it can be completely deleted is determined by purge. If the row record is not referenced by the transaction, the real delete operation will be performed

group commit

If the transaction is a non read-only transaction, you need to perform a fsync operation every time the transaction is committed to ensure that the redo logs have been written to the disk. In order to improve the efficiency of disk fsync, the current database provides the function of group commit, that is, one fsync refresh can ensure that multiple transaction logs are written to files.

For InnoDB, there are two stages to commit a transaction:

  • Modify the information corresponding to the transaction in memory and write the log to the redo log buffer
  • Calling fsync ensures that all logs are written to disk from the redo log buffer

Transaction control statement

  • Start transaction, begin: start displaying transactions
  • Commit: commit a transaction
  • Rollback: rollback transaction
  • Savepoint identifier: allows you to create a savepoint in a transaction. A transaction can be saved by multiple savepoints
  • Release savepoint identifier: delete a transaction savepoint. When there is no savepoint, an exception will be thrown when the statement is executed
  • Rollback to [savepoint] identifier: this statement is used with the savepoint command to roll back the transaction to the marker point without rolling back any work before the marker point
  • Set transition: sets the isolation level of transactions

Bad habits when using transactions

  • Commit in loop
  • Use auto submit
  • Use automatic rollback

Long affairs

  • Long running transactions
  • For long transactions, sometimes they can be converted to small batch transactions. When an error occurs in a transaction, only part of the data needs to be rolled back, and then the last completed transaction continues to be executed

This work adoptsCC agreement, reprint must indicate the author and the link to this article