1、 Transaction classification
Transaction is a program execution unit that accesses and updates various data items in the database. Transaction will convert the database from one consistent state to another consistent state. This is the purpose of transaction and one of the important characteristics that distinguish the transaction model from the file system.
Transactions in the InnoDB storage engine (read repeatable isolation level) fully comply with the characteristics of acid. Acid is the abbreviation of the following four words:
- Atomicity: a database transaction is an inseparable unit of work. The database operations in the transaction are either successful or unsuccessful.
- Consistency: change the database from one state to the next. Before and after the transaction, the integrity constraints of the database are not destroyed.
- Isolation: the objects of each read-write transaction can be separated from the operation objects of other transactions, that is, the transaction is invisible to other transactions before it is committed. This is usually achieved by locking.
- Persistence: once a transaction is committed, the result is permanent.
From the perspective of transaction theory, transactions can be divided into the following types:
- Flat transactions
- Flat transactions with savepoints
- Chained transactions
- Nested transactions
- Distributed transactions
1. Flat transactions
The simplest and most frequently used transactions, in which operations are atomic, are either executed or rolled back, that is, the transaction concept we understand in the general sense.
2. Flat transactions with savepoints
It is allowed to roll back to an earlier state of the same transaction during transaction execution. The savepoint is used to inform the system that it should remember the current state of the transaction, so that when an error occurs later, the transaction can return to the current state of the savepoint.
3. Chained transactions
When a transaction is submitted, the unnecessary data objects are released and the necessary processing context is implicitly passed to the next transaction to start, which means that the next transaction will see the result of the previous transaction, as if it were carried out in a transaction.
4. Nested transactions
A top-level transaction controls all levels of transactions. The transactions nested under the top-level transaction are called sub transactions, which control each local transformation. Moss describes nested transactions as follows:
1) A nested transaction is a tree composed of several transactions. The subtree can be either a nested transaction or a flat transaction.
2) The transaction at the leaf node is a flat transaction, but the distance from the root to the leaf node can be different for each sub transaction.
3) Transactions at the root node are called top-level transactions, other transactions are called child transactions, the precursor of a transaction is called a parent transaction, and the next layer of a transaction is called a child transaction.
4) A child transaction can be committed or rolled back, but its commit operation will not take effect immediately unless its parent transaction has been committed.
5) The rollback of any transaction in the tree will cause all its sub transactions to be rolled back together, so the sub transactions only retain the characteristics of a, C and I, and do not have the characteristics of D.
5. Distributed transactions
Distributed transaction refers to allowing multiple independent transactional resources to participate in a global transaction. Global transaction requires that all participating transactions either commit or roll back.
In addition, when using distributed transactions, the transaction isolation level of the InnoDB storage engine must be set to serial.
2、 Transaction control statement
Under the default setting of MySQL command line, transactions are automatically committed, that is, the commit operation will be executed immediately after the SQL statement is executed. Therefore, to display a transaction, you need to use the transaction control statement.
- Start transaction | begin: explicitly start a transaction;
- Commit | commit work: commit the transaction and make all modifications to the database permanent;
- Rollback | rollback work: rollback the user’s transactions and undo all ongoing uncommitted transactions;
- Savepoint [identifier]: create a savepoint. There can be multiple savepoints in a transaction;
- Rollback to [savepoint]: roll back the transaction to the marked point without rolling back any work before the marked point;
- Set transaction: used to set the isolation level of transactions;
Commit and commit work statements are basically the same. They are used to commit transactions. The difference is that commit work is used to control whether the behavior after the transaction is chain or release. This behavior is controlled by
SHOW VARIABLES LIKE '%completion_type%';
The following SQL statements will generate an implicit commit operation, that is, after executing these statements, there will be an implicit commit operation, that is, these SQL statements cannot be rolled back.
- DDL statement;
- Management statements: analyze table, cache index, check table, load index into cache, optimize table, repair table;
- Operations for implicitly modifying MySQL schema: create user, drop user, grant, rename user, revoke, set password;
3、 Transaction isolation level
ISO and anis SQL standards establish four transaction isolation levels:
- Read uncommitted
- Read committed
- Repeatable read repeatable read
- Serializable serialization
For details on transaction isolation levels, seeLocks in InnoDB storage engineThere are explanations for reference.
In the InnoDB storage engine, you can set the current session or global transaction isolation level using the following command:
SET [GLOBAL | SESSION] TRANSACTION ISOLATION LEVEL [READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE]
It is understood that most users question the performance problems caused by the serializable isolation level, but according to Jim Gray’s book transaction processing, the overhead of the two is almost the same, and even serializable may be better!!! Therefore, selecting the transaction isolation level of repeatable read in the InnoDB storage engine will not cause any performance loss. Similarly, even if the isolation level of read committed is used, users will not get a significant improvement in performance.
Under the transaction isolation level of serializable, the InnoDB storage engine will automatically add lock in share mode after each select statement, that is, add a shared lock for each read operation.
4、 Distributed transaction
Xa (Extended Architecture) refers to the specification of distributed transaction processing proposed by X / open organization. Xa is a distributed transaction protocol proposed by tuxedo, so distributed transactions are also called XA transactions.
Xa protocol mainly defines the interface between transaction manager TM (transaction manager, coordinator) and resource manager RM (resource manager, participant). Among them, the resource manager is often implemented by databases, such as Oracle, DB2 and mysql. These commercial databases all implement the XA interface, and the transaction manager, as the global scheduler, is responsible for the submission and rollback of each local resource.
Xa transactions consist of one or more resource managers (RMS), a transaction manager (TM), and an application program.
Xa transaction is implemented based on two-phase commit (2pc) protocol, which can ensure the strong consistency of data. Many distributed relational data management systems use this protocol to complete distributed transactions. Phase 1 is the preparation phase, that is, all participants are ready to execute transactions and lock the required resources. When participants are ready, report to TM that they are ready. Phase II is the submission phase. When TM confirms that all participants are ready, it sends a commit command to all participants.
The disadvantages of XA transactions are poor performance (too frequent interaction between coordinators and participants) and unable to meet high concurrency scenarios. The performance of XA transactions between a database and multiple databases will differ greatly. Therefore, XA transactions should be avoided as far as possible. For example, data can be written locally, data can be distributed with high-performance message system, or database replication and other technologies can be used. XA can only be used when other methods cannot meet business requirements and performance is not a bottleneck.
Xa transaction is rarely used in practical engineering, and its theoretical value is greater than its practical significance.
SHOW VARIABLES LIKE 'innodb_support_xa';
In addition to Xa, which can solve distributed transaction problems, common problems include TCC, local consumption table, final consistency of reliable messages, best effort notification, etc. you can refer to this article of Xiaomi community:https://xiaomi-info.github.io/2020/01/02/distributed-transaction/
The method of distributed transaction based on mvcc is to assign an incremental transaction number to each transaction, which also represents the version number of data. When the transaction is executed on each node, each node only needs to record the update operation and transaction number. When the transaction is completed at each node, record the number of this transaction in the global meta information. When reading data, first read the maximum successful transaction number in the meta information, then read data on each node, only read the operations whose update operation number is less than or equal to the last maximum successfully committed transaction number, and apply these operations to the basic data to form the reading result.
Redo log is called redo log, which restores the page operation of submitting transaction modifications to ensure the atomicity and persistence of transactions; Undo log is called rollback log, which helps to record rollback lines to a specific version and mvcc to ensure transaction consistency;
innodb_flush_log_at_trx_commitThe policy used to control the refresh of redo logs to disk. The default value of this parameter is 1, which means that fsync operation must be called to refresh to disk when the transaction is committed; 0 means that the redo log is not refreshed to disk when the transaction is committed. This operation is only completed in the master thread (MySQL goes down, and all data not refreshed to disk will be lost); 2 means that when the transaction is committed, only the redo log is written to the cache of the file system without fsync operation (as long as the operating system is not down, the data will not be lost);
The performance of disk fsync is limited. In order to improve the efficiency of disk fsync, the current databases provide the function of group commit, that is, one fsync refresh can ensure that multiple transaction logs are written to files;
Users often misunderstand undo as follows: Undo is used to physically restore the database to its original appearance before executing statements or transactions – but this is not the case. Undo is a logical log, so it only logically restores the database to its original appearance;
The most common XA transaction in MySQL exists between binlog and InnoDB storage engine. The binlog log and redo log of InnoDB storage engine must be written at the same time, which is guaranteed by XA transaction;
Instead of committing a transaction in a loop, treat the loop as a transaction;
Long lived transactions refer to transactions that take a long time to execute. The problem of long transactions can sometimes be processed by converting them into mini batch transactions.