Disadvantages of a hash index:
Hash storage requires loading the entire index into memory, which consumes more memory.
If every query were an equality lookup, a hash would indeed be fast; but in real workloads many queries are not equality lookups, so a hash index is often unsuitable.
Fast range lookups are not supported; a range query can only be answered by scanning the full table.
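A toy Python sketch (not MySQL internals; all names invented for illustration) of why this is so: a dict answers equality lookups in O(1) but keeps no order, so a range query must touch every key, while a sorted structure, standing in for a B+ tree's ordered leaves, can binary-search to the start of the range and walk forward.

```python
import bisect

class HashIndex:
    def __init__(self, keys):
        self.table = {k: True for k in keys}   # whole index lives in memory

    def lookup(self, key):                     # O(1) equality lookup
        return key in self.table

    def range(self, lo, hi):                   # no order: must scan every key
        return sorted(k for k in self.table if lo <= k <= hi)

class OrderedIndex:
    def __init__(self, keys):
        self.keys = sorted(keys)

    def range(self, lo, hi):                   # O(log n) to find lo, then walk
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        return self.keys[i:j]
```

Both return the same answer; the difference is that `HashIndex.range` inspects every entry, which is exactly the full scan described above.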
Disadvantages of binary tree indexes:
Whether a plain binary tree or a red-black tree, the depth of the tree causes more disk I/O, which hurts read efficiency.
A balanced binary tree does not support fast range queries; a range query must traverse down from the root several times, so query efficiency is low.
Disadvantages of a B-tree index:
If non-leaf nodes store full data rows and each row is large (say 1.9 KB with a 4 KB disk page), each node holds only about two entries and the tree degenerates toward a binary tree; the deeper tree means more I/O and slower reads.
A B-tree does not support fast range queries. To find the data between 10 and 35, after locating 15 we must go back to the root and search again; traversing from the root many times keeps query efficiency low.
If the nodes store row records, row size grows with the number of columns, fewer entries fit in a page, the tree grows taller, and disk I/O increases.
A composite (joint) index must obey the leftmost-prefix principle. If you skip the leftmost column and search directly on the second column, the second column is not globally ordered, so a full scan is required.
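A minimal sketch of the leftmost-prefix rule (toy data, names invented): a composite index on `(a, b)` is stored sorted by `a` first, then `b`. Searching by `a` can binary-search; searching by `b` alone cannot, because `b` values are only ordered within each group of equal `a`.

```python
import bisect

# index on (a, b): entries kept in (a, b) sort order, like B+ tree leaves
rows = sorted([(1, 9), (1, 2), (2, 7), (2, 3), (3, 5)])

def find_by_a(rows, a):
    # leftmost column given: binary search narrows the range
    i = bisect.bisect_left(rows, (a,))
    j = bisect.bisect_left(rows, (a + 1,))
    return rows[i:j]

def find_by_b(rows, b):
    # leftmost column skipped: b is not globally ordered,
    # so the whole index must be scanned
    return [r for r in rows if r[1] == b]
```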
InnoDB and MyISAM
A MySQL page is 16 KB.
1. In MyISAM, data files and index files are stored separately. When MyISAM builds its index with a B+ tree, the key stored in a leaf node is the value of the indexed column, and the value stored with it is the disk address of the corresponding row.
In InnoDB, the leaf nodes store the row data itself; in MyISAM they store a disk address.
2. In MyISAM, a secondary index has exactly the same structure as the primary key index: leaf nodes store the disk address of the row record. The only difference is that primary key values must be unique, while secondary index key values may repeat.
3. Because secondary index keys are not unique, several records may share the same value; so even an equality query must be executed as a range-style scan within the secondary index tree.
InnoDB stores everything together: data and index live in the same file.
Primary key index (clustered index)
Every InnoDB table has a clustered index, built as a B+ tree whose leaf nodes store the entire row record. Usually the clustered index is the primary key index; when a table has no primary key, InnoDB creates one automatically. The rules are:
- Define the primary key on the table, and InnoDB uses the primary key index as the cluster index.
- If the table does not have a primary key defined, InnoDB will select the first unique index column that is not null as the cluster index.
- If there are neither of the above, InnoDB will use a 6-byte long implicit field ROWID field to build the cluster index. The ROWID field is automatically incremented when a new row is inserted.
In InnoDB, the leaf nodes of a secondary index store the primary key value of the row. During retrieval, InnoDB first finds the primary key in the secondary index, then uses it to look up the full row record in the clustered index (a "back to the table" lookup).
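A toy sketch of that two-step lookup (dicts stand in for the two B+ trees; table contents are invented): the secondary index maps an indexed column value to primary keys, and the clustered index maps a primary key to the whole row.

```python
clustered = {                       # pk -> whole row (clustered index leaves)
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob",   "age": 25},
    3: {"id": 3, "name": "carol", "age": 30},
}
secondary_on_age = {                # age -> list of pks (keys may repeat)
    25: [2],
    30: [1, 3],
}

def find_by_age(age):
    pks = secondary_on_age.get(age, [])      # step 1: search the secondary index
    return [clustered[pk] for pk in pks]     # step 2: back to the clustered index
```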
According to the scope of locking, the locks in MySQL can be roughly divided into three types: Global locks, table level locks and row locks
Global lock is to lock the entire database instance.
While a global lock is held, the following operations from other threads are blocked:
- data updates
- table definition changes (creating tables and altering their structure)
- committing update transactions
The typical use case is making a full logical backup of the database, i.e. selecting every table in the instance and writing the results out as text.
If the backup runs on the primary, no updates can execute for the entire backup, and the business grinds to a halt.
If the backup runs on a replica, the replica cannot apply the binlog synchronized from the primary during the backup, causing replication lag.
There are two types of table level locks in MySQL: one is table lock and the other is metadata lock (MDL)
A table lock must be taken explicitly and applies to the acquiring thread. If thread A executes
`lock tables t1 read, t2 write;`, then other threads' writes to t1 and any reads or writes of t2 are blocked. At the same time, until thread A executes `unlock tables`, it may only read t1 and read/write t2; it is not even allowed to write t1.
MDL does not need to be taken explicitly; it is acquired automatically when a table is accessed. Its role is to guarantee the correctness of concurrent reads and writes: DML statements (insert, delete, update, select) take an MDL read lock on the table, while DDL statements that change the table structure take an MDL write lock.
MySQL row locks are implemented by each storage engine at the engine layer, but not every engine supports them; MyISAM, for example, does not.
Two stage lock protocol
Within a transaction, locking is split into two phases: row locks are acquired as they are needed, but they are released only when the transaction ends (commits or rolls back).
Deadlock and deadlock detection
There are two strategies to solve deadlock:
One strategy is simply to wait until the lock wait times out. The timeout is set via the parameter `innodb_lock_wait_timeout`; InnoDB's default is 50 s.
The other strategy is active deadlock detection: when a deadlock is found, one transaction in the deadlock chain is rolled back so the others can proceed. Setting the parameter `innodb_deadlock_detect` to `on` enables this logic.
When many transactions all need to update the same row, every newly blocked thread must check whether its arrival creates a deadlock, an O(n) operation per thread.
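A toy sketch of that check (function and graph names invented): blocked transactions form a "waits-for" graph, and adding a new edge causes a deadlock exactly when it closes a cycle, which the new thread tests by walking the graph, O(n) in the number of blocked transactions.

```python
def creates_cycle(waits_for, waiter, holder):
    # would adding the edge waiter -> holder close a cycle?
    # follow existing edges from holder and see if we reach waiter
    seen, stack = set(), [holder]
    while stack:
        t = stack.pop()
        if t == waiter:
            return True                      # holder (transitively) waits on waiter
        if t in seen:
            continue
        seen.add(t)
        stack.extend(waits_for.get(t, []))
    return False
```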
How do we solve the performance problems caused by such hot-row updates?
If you can guarantee the workload will never deadlock, you can temporarily disable deadlock detection.
Split the single hot row into several logical rows to reduce lock conflicts. Take a cinema's account as an example: store the balance across, say, 10 records whose values sum to the account total. Each time money is added, a random one of the 10 records is updated, so the probability of any two updates conflicting drops to 1/10 of the original, reducing both lock waits and the CPU cost of deadlock detection.
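A minimal sketch of the split-the-hot-row idea (class and method names invented): each credit lands on a random sub-record, and the true balance is the sum of all sub-records.

```python
import random

class ShardedAccount:
    def __init__(self, shards=10):
        self.records = [0] * shards          # 10 rows instead of 1 hot row

    def add(self, amount):
        i = random.randrange(len(self.records))  # conflict odds drop to ~1/10
        self.records[i] += amount

    def balance(self):
        return sum(self.records)             # total = sum of the sub-records
```

Note the trade-off the text implies: debits become harder, since one sub-record may not hold enough on its own.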
MySQL memory and log
Buffer pool model
There are three linked lists in buffer pool:
Free list: manages free pages. It actually stores small control blocks, each holding the address pointer of the corresponding page.
Flush list: manages dirty pages. Its structure is the same as the free list; only what it tracks differs.
LRU list: evicts pages, replacing the least recently used ones. The LRU list is split into a hot (young) area and a cold (old) area, with a default ratio of 5:3.
The reason for the two areas is that without the hot/cold split, a sequential full-table scan would evict most of the hot pages, yet the pages loaded in their place are rarely accessed again, which is not worth it.
A page moves from the cold area to the hot area only when the interval between two accesses to it exceeds 1 s. The interval must be greater than (not less than) 1 second because a full table scan reads rows one by one and re-touches the same page almost immediately, at intervals far below 1 second.
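A simplified model of that promotion rule (a sketch only; real InnoDB tracks this per page with much more machinery, and all names here are invented): a newly read page enters the cold area, and a later access promotes it to the hot area only if more than `window` seconds have passed since it entered.

```python
class MidpointLRU:
    def __init__(self, window=1.0):
        self.window = window
        self.hot, self.cold = [], []        # most recently used first
        self.first_seen = {}

    def access(self, page, now):
        if page in self.hot:
            self.hot.remove(page)           # already hot: move to front
            self.hot.insert(0, page)
        elif page in self.cold:
            if now - self.first_seen[page] > self.window:
                self.cold.remove(page)      # survived the window: promote
                self.hot.insert(0, page)
            # re-access within the window (e.g. a table scan): stays cold
        else:
            self.first_seen[page] = now     # first read: enters the cold area
            self.cold.insert(0, page)
```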
Redo log (physical log)
If MySQL crashes before dirty pages are flushed to disk, the redo log enables recovery: read the original data from disk, replay the operations recorded in the redo log, and the previous changes are re-applied to the original data.
During recovery the redo log only needs to be read in order. This is sequential I/O, which is fast.
By default there are 2 redo log files on disk, written in a circle. When file 0 fills, writing continues in file 1; when file 1 also fills, a checkpoint is triggered: the dirty pages in the buffer pool covered by file 0's records are flushed to disk, and file 0's records can then be overwritten.
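A toy model of that circular write (class name and sizes invented; a real log tracks write and checkpoint positions, not whole files): when the next file to write into is still full, a checkpoint "flushes" the dirty pages its records cover so it can be overwritten.

```python
class RedoRing:
    def __init__(self, file_size=2):
        self.files = [[], []]             # two fixed-size log files
        self.size = file_size
        self.cur = 0                      # file currently being written
        self.checkpoints = 0

    def append(self, rec):
        if len(self.files[self.cur]) == self.size:
            self.cur = 1 - self.cur       # current file full: switch files
            if self.files[self.cur]:      # other file is full too:
                self.checkpoints += 1     # checkpoint flushes its dirty pages
                self.files[self.cur] = [] # then its records can be overwritten
        self.files[self.cur].append(rec)
```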
That is, flushing writes the contents of the log buffer to the redo log files on disk. The timing is controlled by `innodb_flush_log_at_trx_commit`:

| Value | Behavior at transaction commit |
| --- | --- |
| 0 | The redo log is not persisted immediately; a background thread flushes it periodically. |
| 1 | The redo log is flushed to disk immediately (the default). |
| 2 | The redo log is written to the operating system buffer immediately but not flushed to disk; if the database crashes while the OS stays up, transaction durability is still preserved. |
Binlog (implemented in the MySQL server layer)
Binlog is a logical log: it records the original logic of each statement.
Binlog has no size limit; it is append-only and never overwrites earlier logs.
Difference between binlog and redo log
Binlog records SQL statements one by one, while redo log records what changed at a particular spot on a particular page. During recovery, binlog must re-execute the SQL (including any reads it implies), whereas redo log applies the page modifications directly.
Rolling back with binlog is equivalent to applying the reverse operation of the original SQL.
Flow chart of data updating and writing
Two-phase commit
A MySQL page is 16 KB while an operating system page is 4 KB, so writing one database page to disk takes 4 OS-level writes. If power is lost mid-write, the page is left partially written and inconsistent.
1. Write the page to the doublewrite buffer first. Only if this succeeds does the in-place write proceed; the database remains consistent throughout.
2. Write the page to the table space. Once this completes, the data is durable. If the crash happened while writing the doublewrite buffer, the table space is untouched and unaffected; if the doublewrite copy was persisted, it can be used directly to repair a torn page.
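A toy model of why the doublewrite copy saves a torn page (function names invented; real InnoDB detects tearing via page checksums, which a length check stands in for here):

```python
def write_page(disk, page, crash_after_chunks=4):
    # a 16 "KB" page goes to disk as four 4 "KB" chunks
    disk["doublewrite"] = list(page)            # step 1: full sequential copy
    written = []
    for i in range(0, 16, 4):                   # step 2: in-place write
        if i // 4 >= crash_after_chunks:
            break                               # simulated power failure
        written.extend(page[i:i + 4])
    disk["table"] = written                     # torn if the crash hit mid-write

def recover(disk):
    if len(disk["table"]) < 16:                 # torn page detected
        disk["table"] = list(disk["doublewrite"])  # repair from the full copy
```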
1. Update the page data in the buffer pool.
2. Generate a redo log record.
3. On commit, persist the redo log.
The executor in the server layer calls the storage engine layer's API to query the data.
The InnoDB storage engine layer performs the query:
- check whether the data is already cached in the buffer pool;
- if the buffer pool contains the data, return it directly to the server-layer executor;
- if not, read the data from disk, return it to the server layer, and at the same time load it into the buffer pool.
After receiving the data, the server-layer executor performs the update.
The server layer calls the InnoDB storage engine layer's API to update the data.
The InnoDB storage engine layer applies the update to the change buffer.
The InnoDB storage engine layer writes the redo log record and sets its state to prepare.
The InnoDB storage engine layer notifies the server-layer executor that the change buffer has been updated and the redo log is in the prepare state, ready to commit; the binlog can now be recorded.
The server-layer executor writes the binlog into the binlog buffer. (The parameter `sync_binlog` determines when the buffered binlog is flushed to disk.)
After recording the binlog, the server-layer executor notifies the InnoDB storage engine layer that the binlog has been written.
On receiving this notification, the InnoDB storage engine layer updates the redo log record in the redo log buffer to the commit state. (When the redo log buffer is flushed is determined by `innodb_flush_log_at_trx_commit`.)
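The prepare/binlog/commit ordering above can be sketched as a tiny state machine (names and crash-simulation scheme invented): crash recovery treats a transaction as committed either when its redo record reached the commit state, or when it was in prepare *and* its binlog was written, which is exactly what keeps the two logs consistent.

```python
def two_phase_commit(crash_before=None):
    # run the three steps, optionally "crashing" before one of them,
    # and return whether recovery would consider the transaction committed
    state = {"redo": None, "binlog": False}
    steps = [
        ("prepare", lambda: state.update(redo="prepare")),
        ("binlog",  lambda: state.update(binlog=True)),
        ("commit",  lambda: state.update(redo="commit")),
    ]
    for name, action in steps:
        if name == crash_before:
            break                     # simulated crash before this step
        action()
    # recovery rule: commit a prepared transaction only if its binlog exists
    return state["redo"] == "commit" or (
        state["redo"] == "prepare" and state["binlog"])
```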
Transaction and isolation levels
Implementation of the ACID properties:
Atomicity: undo log
Durability: redo log
Read uncommitted: dirty reads are possible. Session A can read data that other sessions have modified but not yet committed.
Read committed: non-repeatable reads are possible.
Session B starts a transaction and modifies data without committing. Session B sees its own new data, while Session A still sees the data as it was before the modification; only after a transaction commits can another session see the modified value.
Problem: reads are not repeatable. Querying the same table twice inside one transaction can return different results:
- 1. Session A starts a transaction and Session B starts a transaction. Both query the value and see the same result, say 22.
- 2. Session B changes the value to 99 and commits. Session B now reads the modified value, 99.
- 3. Session A reads the value again and also gets 99, even though its transaction has not ended and it should still see the pre-modification value 22. This is a non-repeatable read: the same row read twice in one transaction yields different results.
Repeatable read: phantom reads may occur.
Until a transaction commits or rolls back, other sessions cannot see its changes.
Phantom read: the second query returns more records than the first.
- Session A and Session B both open transactions.
- Session B inserts a new row into the table and commits. Session A then queries the table for the first time; because Session A's own transaction has neither committed nor rolled back, Session B's insert does not affect it, and Session A still sees the old data.
- Session A commits or rolls back its transaction and queries a second time. The new row now appears in the result: a phantom read.
Serializable: queries and modifications lock the table, guaranteeing serial execution.
Summary: dirty reads, non-repeatable reads, phantom reads.
MVCC solves dirty reads and non-repeatable reads.
Under RC (read committed), a new ReadView is generated for every SELECT in the transaction, one per statement.
Under RR (repeatable read), the ReadView is generated once per transaction and reused for all its reads.
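A toy model of that ReadView timing difference (class and function names invented; real InnoDB uses transaction-id version chains, which commit timestamps stand in for here): RC takes a fresh snapshot per read, RR reuses the first read's snapshot, which is why RR is repeatable.

```python
class Versioned:
    def __init__(self, value, commit_ts=0):
        self.versions = [(commit_ts, value)]   # (commit timestamp, value)

    def write(self, value, commit_ts):
        self.versions.append((commit_ts, value))

    def read(self, snapshot_ts):
        # newest version committed at or before the snapshot
        return max(v for v in self.versions if v[0] <= snapshot_ts)[1]

def read_twice(row, clock, isolation):
    first_snap = next(clock)                   # snapshot at the first SELECT
    a = row.read(first_snap)
    # RC: new snapshot per SELECT; RR: reuse the transaction's snapshot
    snap = next(clock) if isolation == "RC" else first_snap
    b = row.read(snap)
    return a, b
```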
Solution to phantom reads
Phantom reads are prevented with gap locks.
Locking mechanism and principle of InnoDB
InnoDB row lock: lock the index entries on the index
Intention shared lock (IS)
Intention exclusive lock (IX)
Auto-increment lock (AUTO-INC)
- Next-key lock: used for range conditions (>, <, BETWEEN); it locks a left-open, right-closed interval.
- Gap lock
If a next-key lock's search condition matches no record, it degenerates into a gap lock.
- Record lock
As the name suggests, a record lock locks a single row by locking that row's index record. The index must be the primary key or a unique index column.
Dirty page refresh
Flush periodically in any case.
When the free list in the buffer pool runs low, pages are forcibly evicted from the tail of the LRU list; any dirty pages among them must first be flushed via checkpoint.
Timed flushing in the master thread:
1) Master thread before InnoDB 1.0.x:
flushes once per second;
triggers a "dirty too much" checkpoint when the proportion of dirty pages is too high.