Optimization analysis of limit query in MySQL optimization skills

Time:2021-4-3

preface

In the actual business, paging is a common business requirement. Then limit query will be used. When we use limit query, the efficiency is very high when the data is small or only the previous part of the data is queried. However, when the amount of data is large, or the number of query offsets is relatively large, such as limit 100000,20, the efficiency is often unsatisfactory. A common way is to combine limit with order by. If order by has an index for users, the efficiency is usually quite good.

In this case, the simplest query is to use the overlay index to query some required columns. This effect is very good

Like this one


mysql> SELECT * FROM student LIMIT 1000000,1;
+---------+------------+------------+------------+-------+---------------------+
| id  | first_name | last_name | created_at | score | updated_at   |
+---------+------------+------------+------------+-------+---------------------+
| 1000001 | kF9DxBgnUi | yLXnPSHJpH | 2019-07-11 | 97 | 2019-07-11 14:29:59 | |
+---------+------------+------------+------------+-------+---------------------+
1 rows in set (0.31 sec)

You can see the time


mysql> EXPLAIN SELECT score,first_name FROM student ORDER BY created_at LIMIT 1000000,20 \G
*************************** 1. row ***************************
   id: 1
 select_type: SIMPLE
  table: student
 partitions: NULL
   type: index
possible_keys: NULL
   key: time_sorce_name
  key_len: 69
   ref: NULL
   rows: 1000001
  filtered: 100.00
  Extra: Using index
1 row in set, 1 warning (0.00 sec)

mysql>

In this way, the columns of the query use the overlay index, and the number of scanning rows will be reduced a lot, but the effect is not very satisfactory, but if there are other queries, the query will become very slow.

For example, we add last_ Name column.

as follows


mysql> SELECT score,first_name,last_name FROM student ORDER BY created_at LIMIT 1000000,1;
+-------+------------+------------+
| score | first_name | last_name |
+-------+------------+------------+
| 86 | knKsV2g2fY | WB5qJeLZuk |
+-------+------------+------------+
1 row in set (4.81 sec)

mysql>

This query takes more than four seconds to execute. Through analysis, we can see that there is no way to use index for this query


mysql> explain SELECT score,first_name,last_name FROM student ORDER BY created_at LIMIT 1000000,1\G
*************************** 1. row ***************************
   id: 1
 select_type: SIMPLE
  table: student
 partitions: NULL
   type: ALL
possible_keys: NULL
   key: NULL
  key_len: NULL
   ref: NULL
   rows: 6489221
  filtered: 100.00
  Extra: Using filesort
1 row in set, 1 warning (0.00 sec)

mysql>

Now let’s modify the query as follows


mysql> SELECT student.score,student.first_name FROM student INNER JOIN (SELECT id FROM student ORDER BY created_at LIMIT 1000000,1 ) AS temp USING(id);
+-------+------------+
| score | first_name |
+-------+------------+
| 15 | 2QWZ  |
+-------+------------+
1 row in set (0.18 sec)

mysql> EXPLAIN SELECT student.score,student.first_name,last_name FROM student INNER JOIN (SELECT id FROM student ORDER BY created_at LIMIT 1000000,1 ) AS temp USING(id);
+----+-------------+------------+------------+--------+---------------+-----------------+---------+---------+---------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key    | key_len | ref  | rows | filtered | Extra  |
+----+-------------+------------+------------+--------+---------------+-----------------+---------+---------+---------+----------+-------------+
| 1 | PRIMARY  | <derived2> | NULL  | ALL | NULL   | NULL   | NULL | NULL | 1000001 | 100.00 | NULL  |
| 1 | PRIMARY  | student | NULL  | eq_ref | PRIMARY  | PRIMARY   | 4  | temp.id |  1 | 100.00 | NULL  |
| 2 | DERIVED  | student | NULL  | index | NULL   | time_sorce_name | 69  | NULL | 1000001 | 100.00 | Using index |
+----+-------------+------------+------------+--------+---------------+-----------------+---------+---------+---------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)

According to the analysis results, we can see that only 10000001 data records were queried at this time. Why is there such a change. This kind of association is called delay Association. First, it returns the required primary key by using the overlay index query, and then obtains the required data by associating the original table according to the primary key, so as to reduce the number of rows that need to be scanned as much as possible.

In some specific occasions, there is actually another optimization scheme. For example, to get the latest insert records. In the last query, we can record the primary key ID (last) of the last record_ id)。
Then the query can be changed to


SELECT score,first_name,last_name,id FROM student WHERE id>=last_id ORDER BY id ASC LIMIT 1

Like last_ If id = 1000000, the query will start from 1000000. In this scenario, no matter how large the offset is, the performance will be very good.

summary

The above is the whole content of this article, I hope the content of this article has a certain reference learning value for your study or work, thank you for your support to developer.

Recommended Today

. net Maui preview 5 function Preview

Although Microsoft build has just passed, we will continue to share our continuous progress in. Net multi platform application UI (. Net Maui). In this release, we have enabled animation and view transformation, completed the migration of multiple UI components, and improved a single project template.We also released the first batch of preview documents covering […]