How to optimize SQL statements

Time:2022-6-21

(1) Select the most efficient table name order (valid only in the rule-based optimizer):
The Oracle parser processes the table names in the from clause from right to left. The last table written in the from clause (basic table driving table) will be processed first. If the from clause contains multiple tables, you must select the table with the least number of records as the basic table. If there are more than three table join queries, you need to select the intersection table as the base table. The intersection table refers to the table referenced by other tables.
(2) Join order in where clause:
Oracle parses the where clause from bottom to top. According to this principle, the connections between tables must be written before other where conditions, and the conditions that can filter out the maximum number of records must be written at the end of the where clause.
(3) Avoid using ‘*’ in the select clause:
In the process of parsing, Oracle will convert ‘*’ into all column names in turn. This is done by querying the data dictionary, which means it will take more time.
(4) Reduce the number of database accesses:
Oracle performs a lot of work internally: parsing SQL statements, estimating index utilization, binding variables, reading data blocks, etc.
(5) Reset the arraysize parameter in sql*plus, sql*forms and pro*c to increase the amount of retrieved data for each database access. The recommended value is 200.
(6) Use the decode function to reduce processing time:
Use the decode function to avoid repeatedly scanning the same records or repeatedly joining the same tables.
(7) Simple consolidation with no associated database access:
If you have several simple database query statements, you can integrate them into one query (even if there is no relationship between them).
(8) Delete duplicate records:
The most efficient method to delete duplicate records (because ROWID is used) example:

Copy codeThe codes are as follows:
DELETE FROM EMP E WHERE E.ROWID > (SELECT MIN(X.ROWID)
FROM EMP X WHERE X.EMP_NO = E.EMP_NO);

(9) Replace delete with truncate:
When deleting records in a table, rollback segments are usually used to store information that can be recovered If you do not have a commit transaction, Oracle will restore the data to the state before deletion (to be exact, the state before the deletion command is executed). When truncate is used, the rollback segment will no longer store any recoverable information. When the command is run, the data cannot be recovered Therefore, few resources are called and the execution time is very short. (truncate is only applicable to deleting all tables. Truncate is DDL, not DML).
(10) Use commit as much as possible:
Whenever possible, use commit as much as possible in the program, so that the performance of the program can be improved and the demand will be reduced due to the resources released by commit. The resources released by commit:
a. Information on the rollback segment used to recover data.
b. A lock acquired by a program statement.
c. The space in the redo log buffer.
d. Oracle manages the internal costs of the above three resources.
(11) Replace the having clause with the where clause:
Avoid using the having clause, which filters the result set only after all records have been retrieved. This process requires sorting, totaling, and other operations If you can limit the number of records through the where clause, you can reduce this overhead. (non Oracle) of the three clauses where conditions can be added to on, where and having, on is the first to execute, where is the second, and having is the last. Because on filters the unqualified records before making statistics, it can reduce the data to be processed by intermediate operations. It should be the fastest in theory, and where should be faster than having, because it filters the data before making sum, and only uses on when two tables are joined, Therefore, in a table, it is left to compare where with having. In the case of query statistics in this single table, if the conditions to be filtered do not involve the fields to be calculated, their results are the same, except that Rushmore technology can be used in where, but not in having. The latter is slower in speed. If a calculated field is involved, it means that the value of this field is uncertain before calculation. According to the workflow written in the previous chapter, the action time of where is completed before calculation, And having works only after calculation, so in this case, the results of the two will be different. In multi table join queries, on works earlier than where. The system first combines multiple tables into a temporary table according to the connection conditions between the tables, then filters them by where, then calculates them, and then filters them by having. It can be seen that if you want a filter condition to work correctly, you must first understand when it should work, and then decide where to put it.
(12) Reduce queries on tables:
In SQL statements with subqueries, special attention should be paid to reducing queries on tables. example:

Copy codeThe codes are as follows:
SELECT TAB_NAME FROM TABLES WHERE (TAB_NAME,DB_VER) = ( SELECT
TAB_NAME,DB_VER FROM TAB_COLUMNS WHERE VERSION = 604)

(13) Improve SQL efficiency through internal functions:
Complex SQL often sacrifices execution efficiency. It is very meaningful to master the above methods of using functions to solve problems in practical work.
(14) Use alias for table:
When connecting multiple tables in an SQL statement, use the alias of the table and prefix the alias on each column. This will reduce parsing time and syntax errors caused by column ambiguity.
(15) Replace in with exists and not in with not exists:
In many queries based on basic tables, in order to meet one condition, it is often necessary to join another table. In this case, using exists (or not exists) will usually improve the efficiency of the query. In a subquery, the not in Clause performs an internal sort and merge. In either case, not in is the least efficient (because it performs a full table traversal of the tables in the subquery). To avoid using not in, we can rewrite it as outer joins or not exists.
example:
(efficient)

Copy codeThe codes are as follows:
Select * from EMP (basic table)
WHERE EMPNO > 0 AND EXISTS (SELECT ‘X’ FROM DEPT
WHERE DEPT.DEPTNO = EMP.DEPTNO AND LOC = ‘MELB’)
(inefficient) select * from EMP (basic table) where empno > 0
AND DEPTNO IN(SELECT DEPTNO FROM DEPT WHERE LOC = ‘MELB’)

(16) Identify ‘inefficient execution’ SQL statements:
Although various graphical tools for SQL optimization are emerging in endlessly, it is always the best way to write your own SQL tools to solve problems:

Copy codeThe codes are as follows:
SELECT EXECUTIONS , DISK_READS, BUFFER_GETS,
ROUND((BUFFER_GETS-DISK_READS)/BUFFER_GETS,2) Hit_radio,
ROUND(DISK_READS/EXECUTIONS,2) Reads_per_run,
SQL_TEXT
FROM V$SQLAREA
WHERE EXECUTIONS>0
AND BUFFER_GETS > 0
AND (BUFFER_GETS-DISK_READS)/BUFFER_GETS < 0.8
ORDER BY 4 DESC;

(17) Increase efficiency with indexes:
An index is a conceptual part of a table. It is used to improve the efficiency of retrieving data. Oracle uses a complex self balancing B-tree structure. In general, querying data through an index is faster than a full table scan. The Oracle optimizer uses indexes when Oracle finds the best path to execute queries and update statements. Similarly, using indexes when joining multiple tables can also improve efficiency. Another advantage of using an index is that it provides primary key uniqueness verification. For those long or long raw data types, you can index almost all columns. In general, using indexes in large tables is particularly effective Of course, you will also find that when scanning small tables, using indexes can also improve efficiency. Although using index can improve query efficiency, we must also pay attention to its cost. The index needs space for storage and regular maintenance. Whenever a record is added or deleted in the table or the index column is modified, the index itself will be modified. This means that the insert, delete, and update of each record will pay 4 or 5 additional disk i/o for this. Because indexes require additional storage space and processing, those unnecessary indexes will slow down the query response time. It is necessary to refactor the index regularly:

Copy codeThe codes are as follows:
ALTER INDEX <INDEXNAME> REBUILD <TABLESPACENAME>

(18) Replace distinct with exists:
Avoid using distinct in the select clause when submitting a query that contains one to many table information, such as department tables and employee tables. In general, you can consider using exist to replace it. Exists makes the query faster, because the RDBMS core module will return the results immediately after the sub query conditions are met. example:
(inefficient):

Copy codeThe codes are as follows:
SELECT DISTINCT DEPT_NO,DEPT_NAME FROM DEPT D , EMP E
WHERE D.DEPT_NO = E.DEPT_NO

(efficient):

Copy codeThe codes are as follows:
SELECT DEPT_NO,DEPT_NAME FROM DEPT D WHERE EXISTS ( SELECT ‘X’
FROM EMP E WHERE E.DEPT_NO = D.DEPT_NO);

(19) SQL statements are capitalized; Because Oracle always parses SQL statements first, converts lowercase letters to uppercase letters, and then executes them.
(20) Use the connector “+” to connect strings as little as possible in Java code.
(21) avoid using not on index columns. Generally, we should avoid using not on index columns. Not will have the same impact as using functions on index columns. When oracle “encounters” not, it will stop using the index and instead perform a full table scan.
(22) avoid using calculations on index columns. In the where clause, if the index column is part of a function. The optimizer will use a full table scan instead of an index.
give an example:
Inefficiency:
SELECT … FROM DEPT WHERE SAL * 12 > 25000;
High efficiency:
SELECT … FROM DEPT WHERE SAL > 25000/12;
(23) substitute > = for >:
High efficiency:
SELECT * FROM EMP WHERE DEPTNO >=4
Inefficiency:
SELECT * FROM EMP WHERE DEPTNO >3
The difference between the two is that the former DBMS will directly jump to the first record with dept equal to 4, while the latter will first locate the record with deptno=3 and scan forward to the first record with dept greater than 3.
(24) replace or with Union (for index columns):
In general, replacing or in the where clause with union will have a better effect. Using or on an indexed column will result in a full table scan. Note that the above rule is only valid for multiple index columns If a column is not indexed, the query efficiency may be reduced because you do not select or. In the following example, loc_ Both ID and region have indexes.
High efficiency:

Copy codeThe codes are as follows:
SELECT LOC_ID , LOC_DESC , REGION
FROM LOCATION
WHERE LOC_ID = 10
UNION
SELECT LOC_ID , LOC_DESC , REGION
FROM LOCATION
WHERE REGION = “MELBOURNE”
Inefficiency:
SELECT LOC_ID , LOC_DESC , REGION
FROM LOCATION
WHERE LOC_ID = 10 OR REGION = “MELBOURNE”

(25) replace or with in:
This is a simple rule to remember, but the actual implementation effect must be tested. Under Oracle8i, the implementation path of the two seems to be the same:
Inefficiency:
SELECT…. FROM LOCATION WHERE LOC_ID = 10 OR LOC_ID = 20 OR LOC_ID = 30
High efficiency:
SELECT… FROM LOCATION WHERE LOC_IN IN (10,20,30);
(26) avoid using is null and is not null on index columns:
Avoid using any nullable column in the index, and Oracle will not be able to use the index. For a single column index, if the column contains a null value, the record will not exist in the index. For a composite index, if each column is empty, the record also does not exist in the index If at least one column is not empty, the record exists in the index. For example: if the unique index is established on columns a and B of the table, and there is a record in the table whose a and b values are (123, null), Oracle will not accept the next record (insert) with the same a and b values (123, null). However, if all index columns are empty, Oracle will assume that the entire key value is empty and that null does not equal null. So you can insert 1000 records with the same key value. Of course, they are all empty! Because the null value does not exist in the index column, a null value comparison of the index column in the where clause will cause Oracle to deactivate the index.
Low efficiency: (index failure)
SELECT … FROM DEPARTMENT WHERE DEPT_CODE IS NOT NULL;
Efficient: (index is valid)
SELECT … FROM DEPARTMENT WHERE DEPT_CODE >=0;
(27) always use the first column of the index:
If the index is built on multiple columns, the optimizer will choose to use the index only when its first leading column is referenced by the where clause. This is also a simple and important rule. When only the second column of the index is referenced, the optimizer uses a full table scan and ignores the index.
(28) replace union with union-all (if possible):
When the SQL statement requires two query result sets of union, the two result sets will be merged in the form of union-all, and then sorted before the final result is output. If you use union all instead of union, sorting is not necessary. Efficiency will be improved. It should be noted that union all will repeatedly output the same records in the two result sets. Therefore, you should analyze the feasibility of using union all from the business requirements Union will sort the result set. This operation will use sort_ AREA_ Size memory. The optimization of this memory is also very important. The following SQL can be used to query the sorting consumption:
Inefficiency:

Copy codeThe codes are as follows:
SELECT ACCT_NUM, BALANCE_AMT
FROM DEBIT_TRANSACTIONS
WHERE TRAN_DATE = ’31-DEC-95′
UNION
SELECT ACCT_NUM, BALANCE_AMT
FROM DEBIT_TRANSACTIONS
WHERE TRAN_DATE = ’31-DEC-95′

High efficiency:

Copy codeThe codes are as follows:
SELECT ACCT_NUM, BALANCE_AMT
FROM DEBIT_TRANSACTIONS
WHERE TRAN_DATE = ’31-DEC-95′
UNION ALL
SELECT ACCT_NUM, BALANCE_AMT
FROM DEBIT_TRANSACTIONS
WHERE TRAN_DATE = ’31-DEC-95′

(29) replace order by with where:
The order by clause uses indexes only under two strict conditions.
All columns in order by must be included in the same index and maintain the sort order in the index.
All columns in order by must be defined as non empty.
The index used in the where clause and the index used in the order by clause cannot be juxtaposed.
For example, the table dept contains the following columns:

Copy codeThe codes are as follows:
DEPT_CODE PK NOT NULL
DEPT_DESC NOT NULL
DEPT_TYPE NULL

Inefficient: (index not used)
SELECT DEPT_CODE FROM DEPT ORDER BY DEPT_TYPE
Efficient: (use index)
SELECT DEPT_CODE FROM DEPT WHERE DEPT_TYPE > 0
(30) avoid changing the type of index column:
When comparing data of different data types, Oracle automatically performs simple type conversion on columns. Suppose empno is an index column of numeric type: Select… From EMP where empno = ‘123’. In fact, after Oracle type conversion, the statement is converted to: Select… From EMP where empno = to_ NUMBER(‘123′) 。
Fortunately, the type conversion did not occur on the index column, and the purpose of the index was not changed. Now, suppose EMP_ Type is an index column of character type: Select… From EMP where EMP_ TYPE = 123 。
This statement is converted by Oracle to: Select… From EMP where_ NUMBER(EMP_TYPE)=123。 Because of the internal type conversion, this index will not be used! In order to avoid the implicit type conversion of Oracle to your SQL, it is best to express the type conversion explicitly. Note that when comparing characters with numeric values, Oracle will preferentially convert numeric types to character types.
(31) where clause needing attention:
The where clause in some select statements does not use an index. Here are some examples:
(1)‘!=’ Indexes will not be used. Remember, an index can only tell you what exists in a table, not what does not exist in a table.
(2) ‘|’ is a character concatenation function. Like other functions, indexes are disabled.
(3) ‘+’ is a mathematical function. Like other mathematical functions, indexes are disabled.
(4) The same index columns cannot be compared with each other, which will enable full table scanning.
(32) A. if the amount of retrieved data exceeds 30% of the number of records in the table Using indexes will not result in significant efficiency gains. b. In certain cases, using an index may be slower than a full table scan, but this is a difference of the same order of magnitude. In general, using an index is several times or even thousands of times larger than a full table scan!
(33) avoid resource consuming operations:
SQL statements with distinct, union, minus, intersect, and order by will start the SQL Engine to perform the resource consuming sort function. Distinct requires one sort operation, while others require at least two sorts Generally, SQL statements with union, minus, and intersect can be rewritten in other ways If your database sort_ AREA_ Size is well adjusted. It is also possible to use union, minus and intersect. After all, they are highly readable.
(34) optimize group by:
To improve the efficiency of group by statements, you can filter out unwanted records before group by. The next two queries return the same results, but the second one is significantly faster.
Inefficiency:

Copy codeThe codes are as follows:
SELECT JOB, AVG(SAL)
FROM EMP
GROUP JOB
HAVING JOB = ‘PRESIDENT’
OR JOB = ‘MANAGER’

High efficiency:

Copy codeThe codes are as follows:
SELECT JOB, AVG(SAL)
FROM EMP
WHERE JOB = ‘PRESIDENT’
OR JOB = ‘MANAGER’
GROUP JOB