(2) MySQL advanced SQL optimization (show status, explain analysis)

Time:2022-5-8

1. General

In the process of application system development, due to the small amount of data in the initial stage, developers pay more attention to the realization of functions when writing SQL statements. However, after the application system is officially launched, with the rapid growth of production data, many SQL statements begin to show performance problems gradually and have a greater impact on the production environment. At this time, these problematic SQL statements become the bottleneck of the performance of the whole system, Therefore, we must optimize them. This chapter will introduce in detail the methods of optimizing SQL statements in MySQL.

2. Understand the execution frequency of various SQL through the show status command

After the MySQL client is successfully connected, you can provide server status information through the show [session|global] status command, or you can use the mysqladmin extended status command on the operating system to obtain these messages. Show [session|global] status can add the parameter “session” or “global” as needed to display the statistical results of session level (current connection) and global level (since the last startup of the database). If not, the default parameter is “session”.
The following command displays the values of all statistical parameters in the current session:

--View all session statistics
SHOW STATUS LIKE 'Com_%';
Or
SHOW SESSION STATUS LIKE 'Com_%';

The following command displays the values of all statistical parameters in the current global:
–View the values of all global statistics

SHOW GLOBAL STATUS LIKE 'Com_%';


Com_ XXX indicates the number of times each XXX statement is executed. We usually care about the following statistical parameters:
●Com_select:The number of times a select operation is performed. A query only accumulates 1.
●Com_insert:The number of times the insert operation is executed. For the insert operation of batch insertion, it is only accumulated once.
●Com_update:The number of times the update operation was performed.
●Com_delete:The number of times the delete operation was performed.
The above parameters will be accumulated for table operations of all storage engines. The following parameters are only for the InnoDB storage engine, and the accumulation algorithm is slightly different.
●Innodb_rows_read:The number of rows returned by the select query.
●Innodb_rows_inserted:The number of rows inserted by the insert operation.
●Innodb_rows_updated:The number of rows updated by the update operation.
●Innodb_rows_deleted:The number of rows deleted by the delete operation.
Through the above parameters, you can easily understand whether the current database application system is mainly based on insert update or query operation, and the approximate execution proportion of various types of SQL. The count of update operations is the count of execution times, which will be accumulated regardless of commit or rollback.
For transactional applications, com_ Commit and com_ Rollback can understand the transaction submission and rollback. For databases with frequent rollback operations, it may mean that there are problems in application writing. In addition, the following parameters are convenient for users to understand the basic situation of the database. 
●Connections:The number of attempts to connect to the MySQL server.
●Uptime:Server working hours.
●Slow_queries:Number of slow queries.

3. Locate SQL statements with low execution efficiency

You can locate SQL statements with low execution efficiency in the following two ways.
Locate the SQL statements with low execution efficiency through the slow query log. When starting with the — log slow queries [= file_name] option, mysqld writes a long containing all the execution time_ query_ Log file of SQL statement with time seconds.
The query can be performed on the MySQL table only when the thread is slow to execute the query. Therefore, the query can reflect the efficiency of the MySQL table when the thread is slow to execute the query, including whether the query can be used to query the log at the same time. When the query is slow to execute the command, the query can be used to check the log of the current application.

4. Analyze the execution plan of inefficient SQL through explain

After locating the SQL statement with low execution efficiency, you can obtain the information of how MySQL executes the select statement through explain or desc command, including how to connect the tables and the order of connection during the execution of the select statement. For example, if you want to count the number of all inventory steps, you need to connect goods_ Stock table and goods_ stock_ Price table, and for goods_ stock_ price. Sum the qty field, and the corresponding SQL execution plan is as follows:

EXPLAIN SELECT SUM(sp.Qty)
FROM goods_stock AS s LEFT JOIN goods_stock_price AS sp
ON s.ID=sp.GoodsStockID; 

As shown in the figure above, the simple explanation of each column is as follows:

●select_type:Indicates the type of select. Common values are:
◎ simple (simple table, i.e. no table connection or sub query is used).
◎ primary (primary query, i.e. outer query), Union (the second or subsequent query statement in Union), ◎ subquery (the first select in sub query), etc.
●table:The table that outputs the result set.
●type:Indicates the connection type of the table. The connection type with good performance to poor performance is:
◎ system (there is only one row in the table, i.e. constant table).
◎ const (there is at most one matching row in a single table, such as primary key or unique index).
  ◎eq_ Ref (for each previous row, only one record is queried in this table. In short, primary key or unique index is used in multi table connection).
◎ ref (similar to eq_ref, the difference is that instead of using primary key or unique index, ordinary index is used).
  ◎ref_ or_ Null (similar to ref, except that the condition contains a query for null).
  ◎index_ Merge (index merge optimization).
  ◎unique_ Subquery (in is followed by a subquery that queries the primary key field).
  ◎index_ Subquery (similar to unique_subquery, except that in is followed by a subquery that queries non unique index fields).
◎ range (range query in single table).
◎ index (for each previous row, the data is obtained by querying the index).
◎ all (for each row in front, the data is obtained through full table scanning).
●possible_keys:Represents the index that may be used when querying.
●key:Represents the index actually used.
●key_len:The length of the index field.
●rows:Number of scan lines.
●filtered:The percentage of the row that returns the result to the row that needs to be read (the value of the rows column).
●Extra:Description and description of implementation.
◎ using index (this value indicates that MySQL will overwrite the index to avoid accessing the table).
◎ using where (MySQL) will be filtered after the storage engine retrieves the row. Many where conditions involve the column in the index. When (and if) it reads the index, it can be verified by the storage engine. Therefore, not all queries with where clause will display “using where”. Sometimes the appearance of “using where” is a hint: queries can benefit from different indexes).
◎ using temporary (MySQL will use temporary tables when sorting query results).
◎ using filesort (MySQL will use an external index to sort the results instead of reading rows from the table according to the index order. MySQL has two file sorting algorithms, both of which can be completed in memory or on disk. Explain will not tell you which file sorting MySQL will use, nor will it tell you whether the sorting will be completed in memory or on disk).
◎ range checked for each record (index map: n).

5. Identify problems and take corresponding optimization measures

After the above positioning steps, we can basically analyze the causes of the problems. At this time, we can take corresponding improvement measures according to the situation to optimize and improve the execution efficiency of statements.
In the above example, it can be confirmed that it is goods_ Stock takes the primary key index, but for goods_ stock_ If the full table scanning of the price sub table leads to unsatisfactory efficiency, the goods sub table should be scanned_ stock_ Create an index in the goodsstock ID field of the price table. The specific commands are as follows:

--Create index idx_ stock_ price_ 1 ON goods_ stock_ price (GoodsStockID);--  Append, delete and query index statement alter table goods_ stock_ price DROP INDEX idx_ stock_ price_ 1; SHOW INDEX FROM goods_ stock_ price;

After creating the index, let’s take a look at the execution plan of this statement, as follows:

EXPLAIN SELECT SUM(sp.Qty)
FROM goods_stock AS s LEFT JOIN goods_stock_price AS sp
ON s.ID=sp.GoodsStockID;


It can be found that after the index is established, there is no change to the goods_ stock_ The number of rows to be scanned in the price sub table is significantly reduced (from 3 rows to 1 row). It can be seen that the use of index can greatly improve the access speed of the database, especially when the table is very large.

reference:
MySQL in simple terms