Explain the execution process of MySQL query statements in detail

Time:2021-11-25

First, simply describe what MySQL has done behind a query statement:

  • The client sends a query to the server.
  • The server first checks the query cache. If the cache is hit, it immediately returns the results stored in the cache. Otherwise, proceed to the next stage.
  • The server side performs SQL parsing and preprocessing, and then the optimizer generates the corresponding execution plan.
  • MySQL calls the API of the storage engine to execute the query according to the execution plan generated by the optimizer.
  • Return the results to the client.

Then we will expand these steps in this process in detail.

1. Communication mode between client and server

The communication between client and server is a kind of half duplex communication, that is, at the same time, only one party can send data to the other party. Therefore, after the client sends the query request, all it can do is wait for the server to return the query result, and it needs to wait until all the returned data are received before it can carry out the next operation, instead of interrupting the transmission or disconnecting the connection during the sending process of the server.

2. Query cache

Before parsing a query statement, if the query cache is open, MySQL will first check whether the query hits the data in the query cache. This check is implemented through a case sensitive hash table. When the query hits the cache, the result is directly obtained from the cache and returned to the client. MySQL will not perform the following operations, that is, query statements will not be parsed, execution plans will not be generated, and will not be executed.

3. Query optimization processing

This link may be the most complex link in the whole query execution process. It can be divided into three steps: parsing SQL, preprocessing and optimizing SQL execution plan.

(1) Syntax parser and preprocessing
This process is to check the syntax of the incoming SQL statement and verify the permission of the query. Binghui generates a “parse tree”.

(2) Query optimizer
When we get to this step, we prove that there is no problem at the syntactic level. A query can have many kinds of execution plans, which can return correct results. This link is to select the best execution plan.
The optimal execution plan of MySQL is based on cost. MySQL will set a cost for each operation (such as performing a where comparison), and select the one with the least “cost” from all execution plans.
We can use the following statement to view the cost of the previous query operation:


mysql> SHOW STATUS LIKE 'last_query_cost';

MySQL will return an execution cost data:


+-----------------+----------+
| Variable_name  | Value  |
+-----------------+----------+
| Last_query_cost | 0.549000 |
+-----------------+----------+

However, it is worth noting that the minimum “cost” here does not mean the fastest query speed. That is, it is sometimes unreliable to judge the quality of query statements by “cost”.

The optimization strategies of the optimizer can be roughly divided into two types: static optimization and dynamic optimization.

Static optimization is to directly analyze the previously generated analytic tree. For example, the where condition can be transformed into another equivalent form through some algebraic transformations. Static optimization takes effect after the first completion. Even if the query is executed repeatedly with different parameters, it will not change. It can be considered as a “compilation (preprocessing) time optimization”.

Dynamic optimization is context sensitive and needs to be re evaluated at each query. It can be considered as a “runtime optimization”.

Here are some optimization types that MySQL can handle:

  • Redefine the order of associated tables

Sometimes the order of the associated tables of the query statements we give may not be the best for the query efficiency. At this time, MySQL can automatically help us adjust the order of the associated tables to improve the efficiency.

  • Convert external connection to internal connection

Not all out join statements must be executed in an out join manner. MySQL can recognize this and rewrite the query so that it can adjust the association order.

  • Use equivalence transformation rules

Use some equivalent statements to reduce the number of comparisons and remove some constant and non constant conditions. For example, (5 = 5 and a > 5) will be rewritten as a > 5; If yes (A5 and B = C and a = 5).

  • Optimize count(), min() and max()

Whether the index and column are empty can help optimize this kind of expression. For example, when finding the minimum value, you can directly find the leftmost record with the help of the index, so you don’t need to query the whole table, but replace it with a constant.

  • Overwrite index scan

When the columns in the index contain the columns that need to be used in all queries, MySQL will use the index to return the required data without querying the corresponding data rows.

  • Terminate query in advance

When it is found that the query can meet the requirements, MySQL can always terminate the query immediately. A typical example is when the limit clause is used.

So far, the MySQL server layer has given an optimal execution plan according to the given query statements. But what we need to know is that some column operations we have carried out so far are carried out in the server layer, which is not a place for data storage. Therefore, next we need to take our optimal execution plan to the actual storage engine. Therefore, our next step is to obtain the corresponding statistics from the storage engine.

4. Query execution engine

Compared with the query optimization phase, the query execution phase is not so complex. MySQL simply executes step by step according to the instructions given in the execution plan.

5. Return the result to the client

The last stage of query execution is to return the results to the client. Even if the query does not need to return the result set to the client, MySQL will still return some information of the query, such as the number of rows affected by the query.
If the query can be cached, MySQL will put the query results into the query cache at this stage.
The process of returning results is a step-by-step incremental process. That is, when the first result is obtained, it starts to return to the client. The advantage of this is that it will not return all data at one time, resulting in excessive memory consumption, and the client can get the results at the first time. Each row in the result set will be sent with a packet that meets the MySQL client / server communication protocol, and then transmitted through the TCP protocol. In the process of TCP transmission, the packets may be cached and then transmitted in batches.

The above is a detailed explanation of the execution process of MySQL query statements. For more information about MySQL query statements, please pay attention to other related articles of developeppaer!