Some suggestions on improving query efficiency by SQL

Time:2020-9-22

1. If you want to use subqueries, use exists instead of in and not exists to replace not in. Because the subquery introduced by exists only tests whether there are rows that meet the specified conditions in the subquery, which is more efficient. In either case, not in is the most inefficient. Because it does a full table traversal of the table in the subquery.

2. To optimize the query, we should avoid scanning the whole table as much as possible. First of all, we should consider building indexes on the columns involved in where and order by.

3. We should try our best to avoid judging the null value of the field in the where clause, otherwise the engine will give up using the index and scan the whole table, for example:select id from t where num is null You can set the default value of 0 on num to ensure that there is no null value in num column in the table, and then query as follows:select id from t where num=0

4. Try to avoid using the! = or < > operator in the where clause, otherwise the engine will give up using the index and scan the whole table.

5. Try to avoid using or in the where clause to join conditions, otherwise the engine will abandon the use of index and scan the whole tableselect id from t where num=10 or num=20You can query as follows:select id from t where num=10 union all select id from t where num=20

6. In and not in should also be used with caution, otherwise full table scanning will be caused, such as:select id from t where num in(1,2,3)For continuous values, use between instead of inselect id from t where num between 1 and 3

7. The following query will also cause a full table scan:select id from t where name like '%abc%'To improve efficiency, full-text retrieval can be considered.

8. If the parameter is used in the where clause, it will also cause a full table scan. Because SQL only resolves local variables at run time, the optimizer cannot delay the selection of an access plan to run time; it must do so at compile time. However, if an access plan is established at compile time, the value of the variable is still unknown and cannot be used as an input for index selection. The following statement will scan the whole table:select id from t where [email protected]You can force the query to use the index instead:select id from t with(IndexName) where [email protected]

9. We should try our best to avoid the expression operation on the field in the where clause, which will cause the engine to abandon the use of index and scan the whole table. For example:select id from t where num/2=100It should be replaced by:select id from t where num=100*2

10. We should try our best to avoid function operation on the field in the where clause, which will cause the engine to abandon the use of index and scan the whole table. For example:select id from t where substring(name,1,3)='abc'–Name ID starting with ABCselect id from t where datediff(day,createdate,'2005-11-30')=0--‘2005-11-30’The generated ID should be changed to:select id from t where name like 'abc%' select id from t where createdate>='2005-11-30' and createdate<'2005-12-1'

11. Do not perform function, arithmetic or other expression operations on the left of “=” in the where clause, otherwise the system may not be able to use the index correctly.

If the index is used as the first index, then the order of the index must be consistent with that of the index system.

13. Do not write meaningless queries, such as generating an empty table structure:select col1,col2 into #t from t where 1=0 This kind of code will not return any result set, but will consume system resources, so it should be changed as follows:create table #t(...)

14. In many cases, it is a good choice to use exists instead of inselect num from a where num in(select num from b)Replace with the following statement:select num from a where exists(select 1 from b where num=a.num)

15. Not all indexes are effective for queries. SQL optimizes queries based on the data in the table. When there is a large number of duplicate data in the index column, SQL query may not make use of the index. For example, if the fields sex, male and female are almost half of each other in a table, even if the index is built on the sex, the query efficiency will not be affected.

16. The index is not the more the better. Although the index can improve the efficiency of the corresponding select, it also reduces the efficiency of the insert and update. Because the index may be rebuilt during the insert or update, careful consideration should be given to how to build the index, depending on the specific situation. It is better not to have more than 6 indexes in a table. If there are too many indexes, we should consider whether it is necessary to build indexes on some infrequently used columns.

17. We should avoid updating the clustered index data column as much as possible, because the order of the clustered index data column is the physical storage order of the table records. Once the column value changes, the order of the whole table record will be adjusted, which will consume considerable resources. If the application system needs to update the clustered index data column frequently, it needs to consider whether the clustered index should be built.

18. Try to use numeric fields. If the fields only contain numerical information should not be designed as character type, it will reduce the performance of query and connection, and increase the storage cost. This is because the engine compares each character in the string one by one when processing queries and joins, and only one comparison is needed for numeric types.

19. Use varchar / nvarchar instead of char / nchar as much as possible, because the storage space of variable length field is small, which can save storage space. Secondly, for query, the search efficiency in a relatively small field is obviously higher.

20. Don’t use it anywhereselect * from t, replace “*” with a specific field list, and do not return any fields that cannot be used.

21. Try to use table variables instead of temporary tables. If the table variable contains a large amount of data, note that the index is very limited (only the primary key index).

22. Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

23. Temporary tables are not unusable, and their proper use can make some routines more efficient, for example, when you need to repeatedly refer to a dataset in a large table or a common table. However, for one-time events, it is best to use the export table.

24. When creating a new temporary table, if a large amount of data is inserted at one time, it can be usedselect intoreplacecreate tableTo avoid creating a large number of logs to improve the speed; if the amount of data is small, in order to ease the resources of the system table, you should first create a table and then insert it.

25. If temporary tables are used, all temporary tables must be explicitly deleted at the end of the stored procedure. Truncate table, then drop table, can avoid long-term locking of system tables.

26. Try to avoid using cursors, because cursors are inefficient. If the data operated by cursors exceeds 10000 rows, you should consider rewriting them.

27. Before using cursor based method or temporary table method, we should find a set based solution to solve the problem. The set based method is usually more effective.

28. Like temporary tables, cursors are not unusable. Using fast for small datasets_ Forward cursors are generally superior to other row by row processing methods, especially when several tables must be referenced to obtain the required data. Routines that include “totals” in the result set are usually faster than using cursors. If development time allows, both cursor based and set based methods can be tried to see which method works better.

29. Set NOCOUNT on at the beginning of all stored procedures and triggers, and set NOCOUNT off at the end. There is no need to send done to the client after each statement of the stored procedure and trigger is executed_ IN_ Proc message.

30. Try to avoid large transaction operation and improve the concurrent ability of the system.

31. Try to avoid returning large amount of data to the client. If the amount of data is too large, we should consider whether the corresponding demand is reasonable

Recommended Today

On the theoretical basis of SRE

What is SRE? When I first got into contact with SRE, many people thought that it was a post with full stack capability in Google and could solve many problems independently. After in-depth exploration, it is found that SRE can solve many problems, but there are too many problems. It is difficult for a post […]