MySQL has been talked to death, but let's review it again today

Time:2021-2-15

MySQL transaction:

MySQL transactions are mainly used for operations that touch a lot of data and involve complex logic. For example, in a personnel management system, deleting a person means deleting not only that person's basic information but also everything related to that person, such as mailboxes and articles. Together, these database statements make up a transaction.

Among MySQL's commonly used storage engines, only databases or tables that use InnoDB support transactions. Transaction processing maintains the integrity of the database by ensuring that a batch of SQL statements is either executed in full or not executed at all. Transactions are used to manage insert, update and delete statements. Generally speaking, a transaction must satisfy four conditions (ACID): atomicity, consistency, isolation and durability.

Atomicity: all operations in a transaction are either completed or not completed; a transaction never stops halfway. If an error occurs during execution, the transaction is rolled back to the state before it started, as if it had never been executed.

Consistency: the integrity of the database is not compromised before or after the transaction. The written data must fully comply with all preset rules, including the accuracy and linkage of the data, so that the database can carry on with its scheduled work afterwards.

Isolation: the database allows multiple concurrent transactions to read and modify its data at the same time. Isolation prevents the data inconsistency that interleaved execution of concurrent transactions would otherwise cause. Transaction isolation has four levels: read uncommitted, read committed, repeatable read and serializable.

Durability: once a transaction has ended, its changes to the data are permanent and are not lost even if the system fails.
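The person-deletion example can be sketched as a single transaction. This is only a minimal sketch: the tables person, mailbox and article, their columns, and the id 42 are hypothetical names chosen for illustration, and all three tables are assumed to use InnoDB.

```sql
-- Hypothetical schema: person(id), mailbox(person_id), article(person_id), all InnoDB.
START TRANSACTION;

-- Remove the dependent rows first, then the person itself.
DELETE FROM mailbox WHERE person_id = 42;
DELETE FROM article WHERE person_id = 42;
DELETE FROM person  WHERE id = 42;

-- Make all three deletes permanent together.
COMMIT;

-- If any statement above had failed, the application would issue
-- ROLLBACK instead of COMMIT, and none of the deletes would take effect.
```

Grouping the three deletes this way is what gives the all-or-nothing behavior described under atomicity.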

MySQL index:

Building indexes is very important for running MySQL efficiently, because an index can greatly improve retrieval speed. If a well-designed, well-indexed MySQL is a Lamborghini, then a MySQL without indexes is a human-powered tricycle. Think of the index pages of a Chinese dictionary: we can quickly find the character we need through an index sorted by pinyin, strokes or radicals.

Indexes are divided into single-column indexes and composite indexes. A single-column index contains only one column. A table can have several single-column indexes, but that does not make them a composite index. A composite index is one index that contains multiple columns. When creating an index, make sure it matches the conditions of your SQL queries (usually the conditions in the WHERE clause). In fact, an index is also a table: it holds the primary key and the indexed fields, and points to the records of the actual table.

Everything above is about the benefits of indexes, but excessive use turns into abuse, and indexes have their drawbacks. Although an index greatly improves query speed, it slows down updates to the table, such as insert, update and delete, because MySQL has to save not only the data but also the index file. Building an index also creates an index file that takes up disk space.
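The difference between single-column and composite indexes can be sketched as follows. The employee table, its columns and the index names are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE employee (
    id      INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    name    VARCHAR(50)      NOT NULL,
    age     TINYINT UNSIGNED NOT NULL,
    dept_id INT UNSIGNED     NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Two single-column indexes: each one covers exactly one column.
CREATE INDEX idx_name ON employee (name);
CREATE INDEX idx_dept ON employee (dept_id);

-- One composite index: a single index over two columns.
CREATE INDEX idx_dept_age ON employee (dept_id, age);

-- List the indexes that now have to be maintained on every write.
SHOW INDEX FROM employee;
```

Every index listed by SHOW INDEX has to be updated on each insert, update and delete, which is the write cost mentioned above.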

Data structure of index

The above describes the basic principle of indexes, the complexity of databases, and some operating system background. The purpose is to show that no data structure is created out of thin air; each has its own background and usage scenarios. So what do we need from such a data structure? It is actually very simple: every time we look up data, keep the number of disk I/Os to a very small order of magnitude, preferably a constant. So we wonder whether a highly controllable multi-way search tree could meet this need. The B+ tree came into being in this context.

[Figure: example B+ tree]

B+ tree in detail

The figure shows a B+ tree. Readers can look up the formal definition of a B+ tree themselves; here we only cover the key points. Each light blue block in the figure is called a disk block. Each disk block contains several data items (dark blue) and pointers (yellow). For example, disk block 1 contains data items 17 and 35 and pointers P1, P2 and P3. P1 points to the disk block with data less than 17, P2 points to the disk block with data between 17 and 35, and P3 points to the disk block with data greater than 35. Real data exists only in the leaf nodes, namely 3, 5, 9, 10, 13, 15, 28, 29, 36, 60, 75, 79, 90, 99. Non-leaf nodes do not store real data; they only store data items that guide the search direction. For example, 17 and 35 do not actually exist in the data table.

Search process of a B+ tree

Take the same B+ tree as above. Suppose we want to find data item 29. First, disk block 1 is loaded from disk into memory, which is the first I/O. In memory, a binary search determines that 29 lies between 17 and 35, so the P2 pointer of disk block 1 is selected. The in-memory computation is very short compared with disk I/O and can be ignored. The disk address in pointer P2 of disk block 1 leads to disk block 3, which is loaded into memory; this is the second I/O. Since 29 lies between 26 and 30, the P2 pointer of disk block 3 is selected, and disk block 8 is loaded into memory through that pointer; this is the third I/O. A binary search in memory then finds 29, and the query ends. The whole process takes three I/Os.

In real use, a three-level B+ tree can represent millions of rows. If a lookup among millions of rows needs only three I/Os, the performance gain is huge. Without an index, each data item could cost one I/O, so a lookup among millions of rows could need millions of I/Os, which is far too expensive.
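The "three levels for millions of rows" claim can be checked with a little arithmetic. The sketch below assumes each disk block holds m data items and therefore m + 1 child pointers, so a tree of height h can reach roughly (m + 1)^h entries. The values m = 100 and m = 1000 are illustrative assumptions, not figures from this article.

```sql
-- Rough capacity and height estimates under the assumptions above.
SELECT
    POW(100 + 1, 3)                 AS entries_reachable_3_levels_m100,  -- 101^3 = 1030301
    CEIL(LOG(100 + 1, 1000000))     AS levels_for_1e6_rows_m100,         -- 3
    CEIL(LOG(1000 + 1, 1000000))    AS levels_for_1e6_rows_m1000;        -- 2
```

With about a hundred items per block, three levels already cover a little over a million entries, and a larger m lowers the tree further.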

Properties of B+ trees

From the example above, we can derive some properties of the B+ tree:

1. The number of I/Os depends on the height h of the B+ tree. Assuming the table holds N rows and each disk block holds m data items, then h = log_(m+1) N. For a fixed N, the larger m is, the smaller h is. Since m = disk block size / data item size, and the size of a disk block (that is, of a data page) is fixed, the smaller each data item is, the more items fit in a block and the lower the tree. That is why each data item, that is, the indexed field, should be as small as possible. For example, int takes 4 bytes, half the size of bigint's 8 bytes. This is also why a B+ tree keeps real data in leaf nodes instead of inner nodes: if real data were stored in inner nodes, the number of data items per disk block would drop sharply and the tree would grow taller. When each block holds only one data item, the B+ tree degenerates into a linear list.

2. When the data item of a B+ tree is a composite structure, such as (name, age, gender), the B+ tree builds the search tree by comparing from left to right. For example, when searching for (Xiao Zhang, 22, female), the B+ tree first compares name to decide the next search direction; if the names are equal, it compares age and then gender in turn, and finally reaches the data being searched for. However, when the search key is (22, female) with no name, the B+ tree does not know which node to visit next, because name is the first comparison factor used when the tree was built, and it must search by name to know where to go. When searching for (Xiao Zhang, male), the B+ tree can use name to choose the search direction, but the next field, age, is missing, so it can only find all the data whose name is "Xiao Zhang" and then filter for the gender "male". This very important property is the leftmost matching property of indexes.
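The leftmost matching property can be observed with EXPLAIN. The sketch below assumes a hypothetical person_info table with a composite index on (name, age, gender); the table, column and index names are illustrative, and the plan MySQL actually chooses depends on version and data.

```sql
-- Hypothetical table with a composite index on (name, age, gender).
CREATE TABLE person_info (
    id     INT UNSIGNED     NOT NULL AUTO_INCREMENT,
    name   VARCHAR(50)      NOT NULL,
    age    TINYINT UNSIGNED NOT NULL,
    gender CHAR(1)          NOT NULL,
    email  VARCHAR(100)     NOT NULL,
    PRIMARY KEY (id),
    KEY idx_name_age_gender (name, age, gender)
) ENGINE = InnoDB;

-- All three columns given: the index can narrow the search on every column.
EXPLAIN SELECT * FROM person_info
WHERE name = 'Xiao Zhang' AND age = 22 AND gender = 'F';

-- name and gender only: the index can narrow by name,
-- then gender has to be filtered among the matching entries.
EXPLAIN SELECT * FROM person_info
WHERE name = 'Xiao Zhang' AND gender = 'M';

-- No name: the leftmost column is missing, so the index
-- cannot be used to seek directly to the matching rows.
EXPLAIN SELECT * FROM person_info
WHERE age = 22 AND gender = 'F';
```

Comparing the key and key_len columns of the three plans shows how much of the composite index each query is able to use.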

Types of index

In MySQL, indexes fall into two broad categories: clustered indexes and non-clustered indexes. A clustered index stores rows in the physical order of the index key, while a non-clustered index does not. Generally, a clustered index speeds up multi-row retrieval, while a non-clustered index speeds up single-row retrieval.

Under the index types of these two categories, indexes can also be divided into four sub categories:

1. Ordinary index: the most basic index, without any restrictions, is the index we use in most cases.

2. Unique index: different from ordinary index type, the column value of unique index must be unique, but null value is allowed.

3. Full-text index: before MySQL 5.6, full-text indexes could only be used on MyISAM tables; since MySQL 5.6 they are also supported by InnoDB. They can be applied to columns of the char, varchar and text data types.

4. Composite index: an index built over several columns together; it follows the leftmost matching principle.
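The four kinds of index can be declared side by side as in the sketch below. The article table and all column and index names are hypothetical, and the FULLTEXT index on an InnoDB table assumes MySQL 5.6 or later.

```sql
-- Hypothetical table showing one index of each kind.
CREATE TABLE article (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    author_id  INT UNSIGNED NOT NULL,
    slug       VARCHAR(100) NULL,
    title      VARCHAR(200) NOT NULL,
    body       TEXT         NOT NULL,
    created_at DATETIME     NOT NULL,
    PRIMARY KEY (id),
    KEY idx_author (author_id),                      -- ordinary index
    UNIQUE KEY uk_slug (slug),                       -- unique index; NULL is allowed
    FULLTEXT KEY ft_title_body (title, body),        -- full-text index
    KEY idx_author_created (author_id, created_at)   -- composite index
) ENGINE = InnoDB;

-- A full-text index is queried with MATCH ... AGAINST rather than LIKE.
SELECT id, title
FROM article
WHERE MATCH(title, body) AGAINST('mysql');
```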

Principles of indexing

Let's go back to the slow query mentioned at the beginning. Now that we understand how indexes work, we should have some ideas about optimizing slow queries. Here are some principles for building indexes:

1. Leftmost prefix matching. This is a very, very important principle. MySQL keeps matching columns to the right until it meets a range condition (>, <, between, like) and then stops. For example, with a = 1 and b = 2 and c > 3 and d = 4, if you build an index in (a, b, c, d) order, d cannot use the index. If you build the index as (a, b, d, c), all four columns can use it, and the order of a, b and d can be adjusted freely.

2. Equality (=) and in conditions can appear in any order. For example, for a = 1 and b = 2 and c = 3, an (a, b, c) index can be built in any column order, and MySQL's query optimizer will rearrange the conditions into a form the index can recognize.

3. Try to index columns with high discrimination. The formula is count(distinct col) / count(*), which is the proportion of distinct values. The larger the ratio, the fewer records we need to scan. The discrimination of a unique key is 1, while status or gender fields may approach 0 on large data sets. Is there an empirical threshold for this ratio? It is hard to fix a value across different scenarios. Generally, we require fields used in joins to be above 0.1, which means scanning 10 records on average per lookup.

4. Index columns must not take part in calculations; keep the column "clean". For example, the condition from_unixtime(create_time) = '2016-06-06' cannot use the index. The reason is simple: the B+ tree stores the field values from the data table, so to search it MySQL would have to apply the function to every element before comparing, which is obviously too expensive. The condition should be written as create_time = unix_timestamp('2016-06-06').

5. Extend an existing index where possible instead of creating a new one. For example, if the table already has an index on a and you now want an index on (a, b), you only need to modify the original index.

6. One multi-column composite index and several single-column indexes behave differently in queries, because MySQL generally uses only one index per table when executing a statement (index merge aside), and it picks the most restrictive one among the single-column indexes.
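Principles 1, 3 and 4 above can be sketched on a small example. The table t, its columns, and the index name are hypothetical; create_time is assumed to hold a Unix timestamp in seconds.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE t (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    a           INT NOT NULL,
    b           INT NOT NULL,
    c           INT NOT NULL,
    d           INT NOT NULL,
    status      TINYINT NOT NULL,
    create_time INT UNSIGNED NOT NULL   -- assumed: Unix timestamp in seconds
) ENGINE = InnoDB;

-- Principle 1: equality columns first, the range column (c) last.
CREATE INDEX idx_a_b_d_c ON t (a, b, d, c);
EXPLAIN SELECT * FROM t
WHERE a = 1 AND b = 2 AND c > 3 AND d = 4;

-- Principle 3: measure discrimination before choosing a column to index.
SELECT COUNT(DISTINCT status)      / COUNT(*) AS status_discrimination,
       COUNT(DISTINCT create_time) / COUNT(*) AS create_time_discrimination
FROM t;

-- Principle 4: wrapping the column in a function prevents index use ...
CREATE INDEX idx_create_time ON t (create_time);
EXPLAIN SELECT * FROM t
WHERE FROM_UNIXTIME(create_time) = '2016-06-06';

-- ... so move the function to the constant side of the comparison.
EXPLAIN SELECT * FROM t
WHERE create_time = UNIX_TIMESTAMP('2016-06-06');
```

The two EXPLAIN plans for principle 4 are worth comparing directly: only the second form leaves the indexed column untouched.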

According to the above principles, let’s modify the slow query at the beginning

SELECT count(*) AS count
FROM trade_base AS a
WHERE a.trade_status = 7
  AND a.create_time BETWEEN '2015-09-01' AND '2016-01-14'
  AND a.booking_source = '2'

For this SQL, the index to build is a composite index on trade_status, booking_source and create_time. The order of trade_status and booking_source can be swapped, while the range condition on create_time goes last. This applies the leftmost matching principle of indexes.
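A sketch of that index follows. The table and column names come from the query above; the index name idx_status_source_ctime is a hypothetical choice.

```sql
-- Equality columns first, the range column (create_time) last.
ALTER TABLE trade_base
    ADD INDEX idx_status_source_ctime (trade_status, booking_source, create_time);

-- Check that the optimizer actually picks the new index.
EXPLAIN
SELECT COUNT(*) AS count
FROM trade_base AS a
WHERE a.trade_status = 7
  AND a.create_time BETWEEN '2015-09-01' AND '2016-01-14'
  AND a.booking_source = '2';
```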

Optimization steps of slow query

1. Run the query first to see whether it is really slow; remember to set SQL_NO_CACHE.

2. Query each table on its own with its where conditions to pin down the table that returns the fewest records. That is, apply the where conditions of the statement to the table with the smallest number of returned records first, then query each field of that single table to see which field has the highest discrimination.

3. Use explain to check whether the execution plan matches the expectation from step 1 (the query should start from the table with fewer matching records).

4. For SQL statements of the form order by ... limit, let the sorted table be queried first.

5. Understand the business scenario.

6. When adding indexes, follow the principles of index building above.

7. If the observed result does not meet expectations, restart the analysis from step 1.
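The first three steps can be sketched against the trade_base query used earlier. This is only an illustration: SQL_NO_CACHE matters on versions that still have the query cache (in MySQL 8.0 the query cache is gone and the hint has no effect), and which column turns out to be most selective depends entirely on the data.

```sql
-- Step 1: time the query without help from the query cache.
SELECT SQL_NO_CACHE COUNT(*)
FROM trade_base
WHERE trade_status = 7
  AND booking_source = '2'
  AND create_time BETWEEN '2015-09-01' AND '2016-01-14';

-- Step 2: compare the discrimination of each candidate column.
SELECT COUNT(DISTINCT trade_status)   / COUNT(*) AS trade_status_discrimination,
       COUNT(DISTINCT booking_source) / COUNT(*) AS booking_source_discrimination,
       COUNT(DISTINCT create_time)    / COUNT(*) AS create_time_discrimination
FROM trade_base;

-- Step 3: inspect the execution plan and compare it with the expectation.
EXPLAIN
SELECT COUNT(*)
FROM trade_base
WHERE trade_status = 7
  AND booking_source = '2'
  AND create_time BETWEEN '2015-09-01' AND '2016-01-14';
```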

Optimization method of index

1. When to use clustered index or non clustered index:

2. Avoid null in indexed columns: MySQL does index null values, but nullable columns make indexes, index statistics and comparisons more complicated, and a nullable column in a composite index makes that index harder to use effectively. Therefore, when designing the database, do not let the default value of an indexed field be null.

3. Use short indexes: if a char(255) column is mostly unique within its first 10 or 20 characters, do not index the whole column; index only that prefix. A short index not only improves query speed but also saves disk space and I/O.

4. Sorting on index columns: MySQL generally uses only one index per table in a query, so if the where clause already uses an index, the columns in order by will not use one. Therefore, do not sort explicitly when the database's default order already meets the requirement, and try to avoid sorting on multiple columns; if you must, it is best to create a composite index for those columns.

5. like conditions: generally speaking, like is not recommended; if it must be used, how it is used matters. like '%aaa%' cannot use an index, but like 'aaa%' can.

6. Do not operate on index columns: as already mentioned in the principles of building indexes, calculations on an index column prevent index use, so this is not repeated here.
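The short index and like points can be sketched together. The customer table, its columns and the index name are hypothetical, and the prefix length of 10 is only an example that should be checked against real data first.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE customer (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    nickname CHAR(255)    NOT NULL,
    city     VARCHAR(50)  NOT NULL DEFAULT ''   -- NOT NULL with a default instead of NULL
) ENGINE = InnoDB;

-- Check how selective the first 10 characters are before committing to a prefix.
SELECT COUNT(DISTINCT LEFT(nickname, 10)) / COUNT(*) AS prefix10_discrimination
FROM customer;

-- Short (prefix) index: only the first 10 characters are indexed.
CREATE INDEX idx_nickname_10 ON customer (nickname(10));

-- A fixed prefix lets the index narrow the search ...
EXPLAIN SELECT id FROM customer WHERE nickname LIKE 'aaa%';

-- ... while a leading wildcard gives the index nothing to seek on.
EXPLAIN SELECT id FROM customer WHERE nickname LIKE '%aaa%';
```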

Finally, to sum up: no optimization at the database level is as effective as optimizing the application system itself. The same MySQL is good enough to support applications at the scale of Facebook and Google, so make good use of it!
