SQL optimization experience summary

Time:2021-3-5

1、 Steps to optimize SQL
1. Understand the execution frequency of each type of SQL through show status and the application's characteristics
Server status information can be obtained with show status, or with the mysqladmin extended-status command. show status can display session-level or global-level statistics as needed.

For example, for the current session: show status like 'Com_%'; at the global level: show global status like 'Com_%';
The following parameters count both MyISAM and InnoDB storage engines:

1). Com_select: the number of times a SELECT operation is performed; a query is counted only once;

2). Com_insert: the number of times an INSERT operation is performed; a batch insert is counted only once;

3). Com_update: the number of times an UPDATE operation is performed;

4). Com_delete: the number of times a DELETE operation is performed;

The following parameters count only the InnoDB storage engine, and the way they accumulate differs slightly:

1). Innodb_rows_read: the number of rows returned by SELECT queries;

2). Innodb_rows_inserted: the number of rows inserted by INSERT operations;

3). Innodb_rows_updated: the number of rows updated by UPDATE operations;

4). Innodb_rows_deleted: the number of rows deleted by DELETE operations;

These parameters make it easy to see whether the current database workload is dominated by inserts/updates or by queries, and to estimate the rough execution ratio of each type of SQL. The update counters count executions, and are accumulated whether the statement is committed or rolled back.
For transactional applications, Com_commit and Com_rollback show how often transactions are committed and rolled back. A database with very frequent rollbacks may indicate a problem in how the application is written. In addition, the following parameters help us understand the basic state of the database:

1). Connections: the number of attempts to connect to the MySQL server
2). Uptime: the number of seconds the server has been up
3). Slow_queries: the number of slow queries
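
As a quick illustration of how these counters might be inspected (a minimal sketch; the LIKE patterns simply pick out the status variables listed above):

    -- Per-statement-type execution counts (counted for all storage engines)
    SHOW GLOBAL STATUS LIKE 'Com_select';
    SHOW GLOBAL STATUS LIKE 'Com_insert';
    SHOW GLOBAL STATUS LIKE 'Com_update';
    SHOW GLOBAL STATUS LIKE 'Com_delete';

    -- InnoDB row-level counters and general server health
    SHOW GLOBAL STATUS LIKE 'Innodb_rows_%';
    SHOW GLOBAL STATUS LIKE 'Connections';
    SHOW GLOBAL STATUS LIKE 'Uptime';
    SHOW GLOBAL STATUS LIKE 'Slow_queries';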

2. Locate SQL statements with low execution efficiency
There are two ways to locate SQL statements with low execution efficiency:
1) The slow query log can be used to locate inefficient SQL statements. When mysqld is started with the --log-slow-queries[=file_name] option, it writes a log file containing all SQL statements whose execution time exceeds long_query_time seconds. See the related chapter on administration and maintenance, and the configuration sketch after item 3) below.
2) Use show processlist to view the current MySQL threads. The slow query log is written only after a query finishes, so when an application is reporting an efficiency problem right now, querying the slow query log cannot locate it. In that case, show processlist shows the currently running MySQL threads, including each thread's state and whether it is locking tables, so SQL execution can be observed in real time and lock-heavy operations can be optimized.
3) analyze the execution plan of inefficient SQL through explain
After identifying inefficient SQL through the steps above, we can use explain or desc to obtain information about how MySQL executes the select statement, including the order in which tables are read and joined during execution.
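
A minimal sketch of the three approaches above (the 2-second threshold and log path are only illustrative; on older MySQL versions the slow log is enabled with the --log-slow-queries server option rather than at runtime):

    -- Item 1): turn on the slow query log at runtime (illustrative values)
    SET GLOBAL slow_query_log = 'ON';
    SET GLOBAL long_query_time = 2;
    SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';

    -- Item 2): watch currently running statements in real time
    SHOW FULL PROCESSLIST;

    -- Item 3): examine how MySQL would execute a suspect statement
    EXPLAIN SELECT * FROM tbl_name WHERE key_part1 > 1 AND key_part1 < 90;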

2、 MySQL index
1. How to use index in MySQL  
Indexes are used to quickly find rows with a specific value in a column. Creating indexes on the relevant columns is the best way to improve the performance of select operations.
The most important condition for a query to use an index is that the indexed columns appear in the query conditions. For a multi-column index, the index can be used only when the query conditions use the leftmost prefix of its columns (the leftmost-prefix rule); otherwise the index will not be used.
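
A minimal sketch of the leftmost-prefix rule, using a hypothetical table t with a composite index on (last_name, first_name):

    -- Hypothetical table with a two-column index
    CREATE TABLE t (
      id INT PRIMARY KEY,
      last_name  VARCHAR(50),
      first_name VARCHAR(50),
      INDEX idx_name (last_name, first_name)
    );

    -- Can use idx_name: the condition starts with the leftmost column
    SELECT id FROM t WHERE last_name = 'Smith';
    SELECT id FROM t WHERE last_name = 'Smith' AND first_name = 'John';

    -- Cannot use idx_name: the leftmost column is missing from the condition
    SELECT id FROM t WHERE first_name = 'John';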

MySQL will not use the existing index in the following cases:
1). If MySQL estimates that using the index would be slower than a full table scan, it does not use the index. For example, if key_part1 is evenly distributed between 1 and 100, using the index does not help the following query:
        SELECT * FROM table_name where key_part1 > 1 and key_part1 < 90
2). For HEAP (MEMORY) tables with the default hash index, only = comparisons on the indexed column can use the index; >, <, >=, <= comparisons do not (MyISAM and InnoDB tables can use the index for these);

3). For conditions joined with or: if a column in the condition before the or has an index but a column after the or does not, none of the indexes involved will be used.
4). For a composite index, if the column used in the condition is not the first column of the index (i.e. not a leftmost prefix), the index will not be used;
5). If a like pattern starts with %;
6). If a string column is compared without quotes: when the value after where is a bare number, MySQL converts it automatically, but the index is not used.
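
A minimal sketch of cases 5) and 6), assuming a hypothetical table t with an indexed VARCHAR column code:

    -- Assume: CREATE INDEX idx_code ON t (code);   -- code is VARCHAR
    SELECT id FROM t WHERE code LIKE 'abc%';   -- can use idx_code (prefix match)
    SELECT id FROM t WHERE code LIKE '%abc';   -- cannot use idx_code (leading %)
    SELECT id FROM t WHERE code = '123';       -- can use idx_code
    SELECT id FROM t WHERE code = 123;         -- implicit conversion; index not used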

2. View index usage
If indexes are working, the value of Handler_read_key will be high; it is the number of times a row was read via an index value. A very low value indicates that adding more indexes will not help much, because the existing indexes are rarely used.
A high value of Handler_read_rnd_next means queries are running inefficiently and indexing should be considered as a remedy. This value is the number of requests to read the next row in the data file; it climbs quickly when many table scans are performed, which usually indicates that tables are not indexed correctly or that the queries do not take advantage of the indexes.
Syntax:
        mysql> show status like 'Handler_read%';

3、 Optimizing specific query statements
1. To optimize the query, full table scanning should be avoided as far as possible
To optimize queries, try to avoid full table scans; first consider creating indexes on the columns used in where and order by.

Then try the following techniques to keep the optimizer from mistakenly choosing a table scan:

·Use ANALYZE TABLE tbl_name to update the key distribution of the scanned table.

·Use FORCE INDEX on the scanned table to tell MySQL that, compared with using the given index, a table scan would be very expensive:

            SELECT * FROM t1, t2 FORCE INDEX (index_for_column)   WHERE t1.col_name=t2.col_name;

·Start mysqld with the --max-seeks-for-key=1000 option, or use SET max_seeks_for_key=1000, to tell the optimizer to assume that no key scan will require more than 1000 key seeks.
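
A minimal sketch of the first and last techniques (1000 is just the example value mentioned above):

    -- Refresh the key distribution statistics for the scanned table
    ANALYZE TABLE tbl_name;

    -- Tell the optimizer to assume a key lookup never needs more than 1000 seeks
    SET max_seeks_for_key = 1000;   -- or SET GLOBAL ... to affect new sessions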

1) Try to avoid testing columns for null in the where clause,

Otherwise, the engine will give up using the index and scan the whole table:

        select id from t where num is null

Null requires special handling in most databases, and MySQL is no exception: it needs extra code, extra checks, and special index logic. Some developers are simply unaware that null is the default when a table is created, but in most cases columns should be declared not null, or a special value such as 0 or -1 should be used as the default.

Null does not index well: a column containing null values may be left out of index-based access, and even in a multi-column index a null in one of the columns can have the same effect. In other words, if a column contains null values, indexing it may not improve performance, and using is null or is not null in a where clause may prevent the optimizer from using the index.

In this example, you can give the num column a default value of 0, making sure it contains no null values, and then query like this:

         select id    from t where num=0
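
A minimal sketch of making the column non-nullable with a default of 0, assuming the hypothetical table t from the example (num is taken to be an INT column):

    -- Replace existing NULLs with the sentinel value, then forbid NULLs going forward
    UPDATE t SET num = 0 WHERE num IS NULL;
    ALTER TABLE t MODIFY num INT NOT NULL DEFAULT 0;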

2) Avoid using the != or <> operators in the where clause,

Otherwise, the engine will give up using the index and scan the whole table.
MySQL uses indexes only for the following operators: <, <=, =, >, >=, between, in, and sometimes like.

A like can use an index when the pattern does not start with a wildcard (% or _). For example:
select id from t where col like 'Mich%';   -- this query will use an index
select id from t where col like '%ike';    -- this query does not use an index

3) Avoid using or to join conditions in the where clause,

Otherwise, the engine will give up using the index and scan the whole table:

        select id from t where num=10 or num=20

You can use union all to merge the queries instead: select id from t where num = 10 union all select id from t where num = 20

         

In some cases, an or condition can avoid a full table scan:

     1. If the where clause contains an or condition, a MyISAM table can still use the index, but an InnoDB table cannot.

2. All of the or conditions must be on independently indexed columns.

When both conditions hold, a MySQL or condition can use indexes instead of a full table scan.
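
A minimal sketch of checking this with explain, assuming a hypothetical table t in which num and name each have their own index:

    -- Assume both INDEX idx_num (num) and INDEX idx_name (name) exist on t
    EXPLAIN SELECT id FROM t WHERE num = 10 OR name = 'abc';
    -- With both columns indexed, the plan may show type = index_merge
    -- (Extra: Using union(idx_num,idx_name)); if either column is not indexed,
    -- the plan falls back to type = ALL, i.e. a full table scan.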

4) Use in and not in with caution, otherwise a full table scan will result,

For example:

        select id from t where num in(1,2,3)

For contiguous ranges of values, use between instead of in:

        Select id from t where num between 1 and 3

5) The following queries will also result in a full table scan:

select id from t where name like '%abc%'

select id from t where name like '%abc'

To improve efficiency, consider full-text retrieval instead (see the sketch below). A prefix pattern, on the other hand, can still use the index:

select id from t where name like 'abc%'
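
A minimal sketch of the full-text alternative, assuming the hypothetical table t uses MyISAM or InnoDB on MySQL 5.6+ so that FULLTEXT indexes are supported:

    -- Add a full-text index on the name column
    ALTER TABLE t ADD FULLTEXT INDEX ft_name (name);

    -- Word-based search that can use the full-text index instead of LIKE '%abc%'
    SELECT id FROM t WHERE MATCH(name) AGAINST('abc');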

6) Using parameters (local variables) in the where clause can also result in a full table scan.

SQL resolves local variables only at run time, but the optimizer cannot postpone the choice of access plan until run time; it must choose the plan at compile time. When the plan is built at compile time, the variable's value is still unknown and cannot be used as an input to index selection. For example, the following statement will scan the whole table:

       select id from t where num = @num

You can force the query to use an index instead: select id from t with (index(index_name)) where num = @num
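
The with (index(...)) hint above is SQL Server syntax; a minimal sketch of the MySQL equivalent, assuming a hypothetical index named idx_num on t:

    SELECT id FROM t FORCE INDEX (idx_num) WHERE num = @num;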

7) Try to avoid performing expression operations on columns in the where clause,

This will cause the engine to abandon the index and do a full table scan. For example:

       select id from t where num/2=100

It should be changed to: select id from t where num = 100 * 2

8) Try to avoid performing function operations on columns in the where clause,

This will cause the engine to abandon the index and do a full table scan. For example:

      select id from t where substring(name,1,3) = 'abc'                 -- names starting with 'abc'

      select id from t where datediff(day,createdate,'2005-11-30') = 0   -- createdate equal to '2005-11-30'

They should be changed to:

      select id from t where name like 'abc%'

      select id from t where createdate >= '2005-11-30' and createdate < '2005-12-1'

9) Do not perform functions, arithmetic operations, or other expression operations to the left of the "=" in the where clause,

Otherwise, the system may not be able to use the index correctly.

10) When the condition column is not the leading column of a composite index,

For example, when an indexed field is used as a condition and the index is a composite index, the first column of the index must appear in the condition for the system to use the index; otherwise the index will not be used. The column order in the condition should also be kept consistent with the index order as far as possible.

2. Other points to note in optimization
11) Don’t write meaningless queries,

If you need to generate an empty table structure:

        select col1,col2 into #t from t where 1=0

This kind of code does not return any result set, but it still consumes system resources. It should be changed to: create table #t (…)

12) In many cases, it is a good choice to replace in with exists

       select num from a where num in(select num from b)

Replace with the following statement:

       select num from a where exists(select 1 from b where num=a.num)

13) Not all indexes are effective for every query,

SQL optimizes queries based on the data in the table. When an indexed column contains a large proportion of duplicate values, the query may not use the index at all. For example, if a table's sex field is roughly half 'male' and half 'female', building an index on sex does nothing for query efficiency.

14) More indexes are not always better,

Indexes can improve the efficiency of the corresponding selects, but they also reduce the efficiency of inserts and updates, because the indexes may have to be maintained or rebuilt on insert or update. So how to build indexes needs careful consideration, depending on the specific situation. It is best to keep the number of indexes on a table to no more than about 6; beyond that, consider whether indexes on rarely used columns are really necessary.

15) . avoid updating clustered index data columns as much as possible,

Because the order of the clustered index columns is the physical storage order of the table's records, any change to those column values forces the whole table's records to be reordered, which consumes considerable resources. If the application needs to update clustered index columns frequently, reconsider whether the index should be built as a clustered index at all.

16) . try to use numeric fields,

If a field contains only numeric information, try not to design it as a character type; doing so reduces query and join performance and increases storage overhead. This is because the engine compares each character of a string one by one when processing queries and joins, whereas a numeric value needs only a single comparison.

17) Use varchar / nvarchar instead of char / nchar as much as possible,

First, variable-length fields take less storage space, which saves space; second, for queries, searching within a smaller field is clearly more efficient.

18) Try not to use "*" to return everything, as in select * from t,

Replace "*" with a specific list of fields, and do not return any fields that are not actually used.

3. Temporary table issues:
19). Try to use table variables instead of temporary tables.

Note that if a table variable contains a large amount of data, its indexes are very limited (only the primary key index).

20) Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

21). Temporary tables are not unusable,

Using them properly can make some routines more efficient, for example when you need to repeatedly reference a large table or a commonly used dataset. For one-off operations, however, an export table is a better choice.

22) When creating a new temporary table, if a large amount of data is inserted at once, use select into instead of create table to avoid generating a large amount of log and to improve speed;

If the amount of data is small, then to ease the load on system table resources, create the table first and insert afterwards.

23). If temporary tables are used, all of them must be explicitly deleted at the end of the stored procedure: first truncate table, then drop table. This avoids long locks on system tables.

4. Cursor issues:
24). Try to avoid using cursors,

because cursors are inefficient; if a cursor operates on more than 10,000 rows of data, a rewrite should be considered.

25). Before resorting to a cursor-based method or a temporary-table method,

look for a set-based solution to the problem first; set-based approaches are usually more effective.

26). Like temporary tables, cursors are not unusable.

Using FAST_FORWARD cursors on small datasets is often better than other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that compute "totals" in the result set are usually faster than doing the same with a cursor. If development time permits, try both the cursor-based and the set-based approach and see which works better.

27). Set NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end.

There is no need to send a DONE_IN_PROC message to the client after every statement of a stored procedure or trigger.
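
A minimal SQL Server-style sketch of this pattern, using a hypothetical procedure name:

    -- Hypothetical stored procedure: suppress the per-statement row-count messages
    CREATE PROCEDURE usp_example
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT id FROM t WHERE num = 0;
        SET NOCOUNT OFF;
    END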

5. Transaction issues:
28) Avoid large transactions as far as possible, to improve the system's concurrency.

6. Data volume
29). Try to avoid returning large amounts of data to the client. If the volume of data is very large, consider whether the corresponding requirement is reasonable.
7. Count optimization:
30) count(*) is better than count(1) and count(primary_key)

Many people use count(1) or count(primary_key) instead of count(*) to count records, believing it performs better. That is actually a mistake: in some scenarios the performance may even be worse, because the database makes special optimizations specifically for the count(*) operation.
31) count(column) and count(*) are different

This misconception exists even among many senior engineers and DBAs, and many people take it for granted. In fact, count(column) and count(*) are completely different operations with completely different meanings.
count(column) counts how many rows in the result set have a non-null value in that column;
count(*) counts how many rows there are in the whole result set.
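
A minimal sketch of the difference, assuming a hypothetical table t whose email column allows NULL:

    -- Suppose t has 100 rows and 10 of them have email = NULL
    SELECT COUNT(*)     FROM t;   -- 100: counts every row
    SELECT COUNT(email) FROM t;   -- 90: counts only rows where email IS NOT NULL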

8. Optimize the order by statement
   Index-based sorting
One of MySQL's weaknesses is sorting. Although MySQL can query roughly 15,000 records per second, it can use at most one index per table in a query. Therefore, if the where condition has already consumed that index, the index cannot also be used for sorting, which greatly reduces query speed. Consider the following SQL statement:
     SELECT * FROM SALES WHERE NAME = 'name' ORDER BY SALE_DATE DESC;
The where clause above already uses the index on the NAME field, so the index is no longer used when sorting by SALE_DATE. To solve this problem, we can create a composite index on the SALES table:
     ALTER TABLE SALES DROP INDEX NAME, ADD INDEX (NAME,SALE_DATE)
With this index, the select statement above runs much faster. When using this method, however, make sure the where clause does not query on the sort field alone: in the example above, SALE_DATE must not be used on its own to query, because although sorting is fast, there is no longer a separate index on the SALE_DATE field, so such a query becomes slow again.

In some cases, MySQL can use an index to satisfy an order by clause without doing an extra sort. This requires that the where condition and the order by use the same index, that the order of the order by columns matches the index order, and that the order by columns are all ascending or all descending. For example, the following SQL statements can use the index:
     SELECT * FROM t1 ORDER BY key_part1,key_part2,… ;
     SELECT * FROM t1 WHERE key_part1=1 ORDER BY key_part1 DESC, key_part2 DESC;
     SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 DESC;
However, indexes are not used in the following cases:
     SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;    -- the order by mixes ASC and DESC
     SELECT * FROM t1 WHERE key2=constant ORDER BY key1;         -- the key used to look up rows differs from the one used in the order by
     SELECT * FROM t1 ORDER BY key1, key2;                       -- order by uses columns from two different keys

9. Optimize group by
By default, MySQL sorts all "group by col1, col2, ..." queries as if you had also specified "order by col1, col2, ...". If you explicitly include an order by clause listing the same columns, MySQL optimizes it away without any slowdown, although the result is still sorted. If a query includes group by but you want to avoid the cost of sorting the result, you can specify order by null to suppress the sorting.
For example:
INSERT INTO foo  SELECT a, COUNT(*) FROM bar GROUP BY a ORDER BY NULL;

10. Optimize or
For a detailed explanation, see the point above: a MySQL or condition can use indexes and avoid a full table scan.

4、 Explain
Explain shows how MySQL uses indexes to process select statements and join tables. It can help you choose better indexes and write better-optimized query statements.
To use it, simply add explain before the select statement.
For example:

explain select surname,first_name from a,b where a.id=b.id
The results are as follows
table |  type | possible_keys | key | key_len  | ref | rows | Extra
Explain column:

1 table:
Show which table the data in this row is about
2 type:
This is an important column: it shows which join type is used. From best to worst, the join types are: system, const, eq_ref, ref, ref_or_null, range, index, ALL.
System: the table has only one row (= system table). This is a special case of the const join type
        Const: (primary key or unique)
The table has at most one matching row, which will be read at the beginning of the query. Because there is only one row, the column values in this row can be considered constant by the rest of the optimizer.
Const tables are fast because they are read only once!
Const is used to compare all parts of a primary key or unique index with constant values.
In the following query, tbl_name can be used as a const table:
SELECT * from tbl_name WHERE primary_key=1;  
        eq_ref: (unique or primary key)
For each combination of rows from the previous tables, one row is read from this table. This is probably the best join type apart from the const types.
It is used when all parts of an index are used by the join and the index is UNIQUE or a PRIMARY KEY.
eq_ref can be used for indexed columns that are compared with the = operator. The comparison value can be a constant or an expression that uses columns of tables read before this table.
In the following examples, MySQL can use an eq_ref join to process ref_table:
SELECT * FROM ref_table,other_table WHERE ref_table.key_column=other_table.column; 
   SELECT * FROM ref_table,other_table WHERE ref_table.key_column_part1=other_table.column 
                                                 AND ref_table.key_column_part2=1; 

        ref: (non-unique index or leftmost prefix)
For each combination of rows from the previous tables, all rows with matching index values are read from this table. ref is used if the join uses only a leftmost prefix of the key, or if the key is not UNIQUE or a PRIMARY KEY (in other words, if the join cannot select a single row based on the key value).

If the key used matches only a few rows, the join type is good.
Ref can be used for indexed columns that use the = or < = > operators.
In the following examples, MySQL can use a ref join to process ref_table:

SELECT * FROM ref_table WHERE key_column=expr; 
 SELECT * FROM ref_table,other_table WHERE ref_table.key_column=other_table.column; 
 SELECT * FROM ref_table,other_table WHERE ref_table.key_column_part1=other_table.column 
                          AND ref_table.key_column_part2=1;  
      ref_or_null: (like ref, plus rows that are null)
This join type is like ref, but MySQL additionally searches for rows that contain null values. This join-type optimization is used most often when resolving subqueries.
In the following example, MySQL can use a ref_or_null join to process ref_table:
SELECT * FROM ref_table WHERE key_column=expr OR key_column IS NULL;      
    range: (=, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, or IN)
Only rows in a given range are retrieved, using an index to select them. The key column shows which index is used.
key_len contains the longest key part of the index used. The ref column is NULL for this type.
range can be used when a key column is compared to a constant with the =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, or IN operators:
SELECT * FROM tbl_name WHERE key_column = 10; 
SELECT * FROM tbl_name WHERE key_column BETWEEN 10 and 20; 
SELECT * FROM tbl_name WHERE key_column IN (10,20,30); 
SELECT * FROM tbl_name WHERE key_part1= 10 AND key_part2 IN (10,20,30);  
      index:
This join type is the same as ALL, except that only the index tree is scanned. It is usually faster than ALL because the index file is usually smaller than the data file.
MySQL can use this join type when the query uses only columns that are part of a single index.
      ALL:
For each combination of rows from the previous tables, a full table scan is performed. This is normally bad if the table is the first table not marked const, and usually very bad in all other cases. You can normally avoid ALL by adding indexes that allow rows to be retrieved based on constant values or on column values from earlier tables.
3 possible_keys :
Shows the indexes MySQL could potentially use for this table. If this column is empty, there is no relevant index; in that case, look at the columns referenced in the where clause and choose an appropriate index to create.
4 key :
The index actually used. If it is NULL, no index was used. In rare cases MySQL chooses a poorly optimized index; you can then use USE INDEX (indexname) in the select statement to force the use of an index, or IGNORE INDEX (indexname) to force MySQL to ignore it.
5 key_len:
The length of the index used. The shorter the length the better, as long as accuracy is not lost.

6 ref
Shows which columns or constants are used together with the index named in key to select rows from the table.

7 rows
The number of rows MySQL believes it must examine to return the requested data (the estimated number of rows scanned).

8 Extra 
This column contains additional information about how MySQL resolves the query. The worst values to see here are Using temporary and Using filesort, which mean MySQL cannot make effective use of an index and retrieval will be very slow.

Meanings of the values returned in the Extra column

Distinct:
MySQL is looking for distinct values, so once it finds the first matching row for the current row combination, it stops searching for more.
Not exists :
MySQL was able to apply the LEFT JOIN optimization: once it finds one row matching the LEFT JOIN criteria, it stops searching for further rows for that row combination.
Below is an example of a query type that can be optimized this way:
SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;
Assume t2.id is defined as NOT NULL. In this case, MySQL scans t1 and looks up rows in t2 using the values of t1.id.
When MySQL finds a matching row in t2, it knows t2.id can never be NULL, so it does not scan the remaining rows in t2 with the same id value.
In other words, for each row of t1, MySQL only needs to do a single lookup in t2, regardless of how many rows actually match in t2.
Range checked for each record (index map: #)
MySQL found no good index to use, but found that some indexes might be usable once the column values from the preceding tables are known. For each row combination from the preceding tables, MySQL checks whether a range or index_merge access method can be used to retrieve rows; see section 7.2.5, "Range Optimization", and section 7.2.6, "Index Merge Optimization", of the MySQL manual for the applicability criteria. This is one of the slowest ways to use an index. The difference from a normal range access is that all column values of the preceding table are already known and treated as constants. This is not fast, but it is much faster than performing the join with no index at all.
Using filesort
When you see this, the query needs optimization. MySQL must make an extra pass to work out how to return the rows in sorted order: it walks all rows according to the join type, storing the sort key and a row pointer for every row that matches the where condition, then sorts the keys and retrieves the rows in order.
Using index
The column data is retrieved from the index alone, without reading the actual row. This happens when all the columns requested from a table are part of the same index.
Using temporary
When you see this, the query needs optimization. MySQL needs to create a temporary table to hold the intermediate result; this typically happens when the query contains group by and order by clauses that list the columns differently.
Using where
A where clause is used to restrict which rows are matched against the next table or sent to the client. Unless you specifically intend to fetch or examine all rows of the table, something may be wrong with the query if the Extra value is not Using where and the join type is ALL or index.
Impossible WHERE noticed after reading const table…
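
A minimal sketch showing how some of these Extra values can be reproduced on a hypothetical table t (the exact output depends on the MySQL version and on the table's indexes and data):

    EXPLAIN SELECT * FROM t WHERE 1 = 0;                      -- Extra: Impossible WHERE
    EXPLAIN SELECT a, COUNT(*) FROM t GROUP BY a ORDER BY b;  -- may show Using temporary; Using filesort
    EXPLAIN SELECT num FROM t WHERE num = 10;                 -- may show Using index if num is indexed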

5、 SQL core statements (some very practical skills)

1) Insert data

Batch insert:


INSERT INTO mytable (first_column, second_column, third_column) 
VALUES ('some data', 'some more data', 'yet more data'), 
       ('some data', 'some more data', 'yet more data'), 
       ('some data', 'some more data', 'yet more data');


2) Clearing a data table


TRUNCATE TABLE `mytable` 

Note: to delete all records in a table, use the truncate table statement. The reason for using truncate table rather than delete is that truncate table does not log the deletion of each individual record; in other words, truncate table is much faster than delete.

3) Creating records and tables with select

The insert statement differs from the delete and update statements in that it normally operates on only one record at a time. However, there is a way to make insert add multiple records in a single statement: combine the insert statement with a select statement, like this:


INSERT mytable(first_column,second_column) 
SELECT another_first,another_second FROM anothertable WHERE another_first='Copy Me!';   

This statement copies records from anothertable into mytable. Only the records in anothertable whose another_first field has the value 'Copy Me!' are copied.

This form of insert statement is very useful when creating a backup of records in a table. Before deleting records from one table, you can copy them to another table in this way.

If you need to copy the entire table, you can use the select into statement. For example, the following statement creates a new table called newtable, which contains all the data of the table mytable:


SELECT * INTO newtable FROM mytable; 

You can also create the new table with only specific fields: simply list the fields you want to copy in the field list. In addition, you can use a where clause to restrict which records are copied into the new table. The following example copies only the first_column field of the records whose second_column value is 'Copy Me!':


SELECT first_column INTO newtable 
FROM mytable 
WHERE second_column='Copy Me!';
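
Note that select ... into newtable is SQL Server syntax; the usual MySQL equivalent of the two examples above is create table ... select, as in this minimal sketch:

    -- Copy the whole table
    CREATE TABLE newtable  SELECT * FROM mytable;

    -- Copy only one field, restricted by a where clause
    CREATE TABLE newtable2 SELECT first_column FROM mytable WHERE second_column = 'Copy Me!';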

It is difficult to modify an established table with the SQL described so far. For example, once you have added a field to a table there is no easy way to remove it, and if you accidentally give a field the wrong data type, you cannot simply change it. Using the SQL statements described in this section, however, you can work around both problems.

For example, suppose you want to delete a field from a table. Using the select into statement, you can create a copy of the table that does not contain the field to be deleted. This lets you drop the field while keeping the data you want to preserve.

If you want to change the data type of a field, you can create a new table containing a field of the correct data type. After creating the table, you can use the insert statement together with a select statement to copy all the data from the original table into the new table. In this way you not only modify the table's structure but also preserve the original data.
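
A minimal sketch of the data-type change described above, assuming a hypothetical table mytable whose amount column was mistakenly created as VARCHAR and should be DECIMAL:

    -- New table with the corrected column type
    CREATE TABLE mytable_new (
      id     INT PRIMARY KEY,
      amount DECIMAL(10,2)
    );

    -- Copy the existing data, converting it to the new type
    INSERT INTO mytable_new (id, amount)
    SELECT id, CAST(amount AS DECIMAL(10,2)) FROM mytable;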

The above is a summary of SQL optimization experience; I hope it helps with your study.