MySQL advanced part (03): rational use of index structure and query

Time:2020-11-25

Source code for this article: see the GitHub and Gitee addresses in section 5.

1、 High performance index

1. Query performance issues

When using MySQL, a "performance problem" in most scenarios means a query performance problem. Slow queries are most often caused by a growing volume of data, and the most common remedy is to design an index structure that fits the queries of the business scenario.

2. Principles of index use

More indexes are not automatically better. Instead, indexes are refined and optimized continuously for the query scenarios of the business. Take the user order scenario of an e-commerce system, and assume the following table structure:

CREATE TABLE `ds_user` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key ID',
  `user_name` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='user table';

CREATE TABLE `ds_order` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key ID',
  `user_id` int(11) NOT NULL COMMENT 'user ID',
  `order_no` varchar(60) NOT NULL COMMENT 'order number',
  `product_name` varchar(50) DEFAULT NULL COMMENT 'product name',
  `number` int(11) DEFAULT '1' COMMENT 'product quantity',
  `unit_price` decimal(10,2) DEFAULT '0.00' COMMENT 'unit price',
  `total_price` decimal(10,2) DEFAULT '0.00' COMMENT 'total price',
  `order_state` int(2) DEFAULT '1' COMMENT '1 awaiting payment, 2 paid, 3 shipped, 4 received',
  `order_remark` varchar(50) DEFAULT NULL COMMENT 'order remark',
  `create_time` datetime DEFAULT NULL COMMENT 'creation time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='order table';

User and order tables are very common in e-commerce business. Analyzing the business reveals the commonly needed index structure:

User side:

  • Most user-facing queries filter on the user ID (user_id);
  • Logistics information is looked up by order number (order_no);

Operations side:

  • Filtering or sorting by creation time period (create_time);
  • Filtering and statistics by order status (order_state);
  • Statistics by product name (product_name);

After this kind of analysis, it is possible to determine early in development which columns queries will depend on. Prepare the index structure in advance, rather than starting to think about indexes only once the data volume is already large enough to hurt performance.
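The access patterns listed above map roughly onto queries like the following. This is only an illustrative sketch: the literal values (user ID 1001, the dates, the order number) are made up and do not come from the original data.

```sql
-- User side: a user's own orders, and a lookup by order number.
SELECT id, order_no, order_state FROM ds_order WHERE user_id = 1001;
SELECT id, order_state FROM ds_order
WHERE order_no = '20200801131415872362873287162572367';

-- Operations side: filter by time period and sort, newest first.
SELECT id, order_no, create_time FROM ds_order
WHERE create_time >= '2020-08-01 00:00:00'
  AND create_time <  '2020-09-01 00:00:00'
ORDER BY create_time DESC;

-- Operations side: statistics by order status.
SELECT order_state, COUNT(*) AS order_count
FROM ds_order
GROUP BY order_state;
```

Each WHERE, ORDER BY, or GROUP BY column here is a candidate for an index; whether it deserves one depends on how often the query runs and how much data it touches.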

Sometimes we decide not to support certain query conditions directly. For example, statistics by product name can be computed by a scheduled task instead, relieving query pressure on the table. There are many ways to handle such cases.
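One possible shape for such a scheduled task is sketched below. The summary table ds_order_product_stat and its columns are hypothetical (they are not part of the original schema), and the statement is assumed to be run once a day by whatever scheduler the project already uses.

```sql
-- Hypothetical summary table, one row per day and product.
CREATE TABLE `ds_order_product_stat` (
  `stat_date` date NOT NULL COMMENT 'statistics date',
  `product_name` varchar(50) NOT NULL COMMENT 'product name',
  `order_count` int(11) NOT NULL DEFAULT '0' COMMENT 'number of orders',
  `total_amount` decimal(12,2) NOT NULL DEFAULT '0.00' COMMENT 'sum of total_price',
  PRIMARY KEY (`stat_date`, `product_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='daily product statistics';

-- Daily job: aggregate yesterday's orders into the summary table.
-- Re-running the job for the same day overwrites the previous figures.
INSERT INTO ds_order_product_stat (stat_date, product_name, order_count, total_amount)
SELECT DATE(create_time), product_name, COUNT(*), IFNULL(SUM(total_price), 0)
FROM ds_order
WHERE create_time >= CURDATE() - INTERVAL 1 DAY
  AND create_time <  CURDATE()
  AND product_name IS NOT NULL
GROUP BY DATE(create_time), product_name
ON DUPLICATE KEY UPDATE
  order_count  = VALUES(order_count),
  total_amount = VALUES(total_amount);
```

Reports by product name then read from the small summary table, so ds_order does not need an index on product_name for this purpose.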

Good index design rests on understanding the business data: consider how that data is queried, and improve query efficiency accordingly.

2、 Index creation

1. Single column index

A single-column index is built on one field of a table. A table can have several single-column indexes, and they are relatively simple to use:

CREATE INDEX user_id_index ON ds_order(user_id) USING BTREE;

Both the primary key index and the user_id_index above are single-column indexes.

Business scenarios: users query their own orders, and the management system runs queries that join orders with users, so user_id in the order table needs an index.
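The two scenarios just described might look like the queries below. The user ID 1001 is an illustrative value; the actual plan that EXPLAIN reports depends on the data and the MySQL version.

```sql
-- A user's own orders: a direct lookup on the indexed user_id column.
EXPLAIN SELECT id, order_no, total_price
FROM ds_order
WHERE user_id = 1001;

-- Management system: orders joined with the user table.
-- The join condition on o.user_id is where the index helps.
EXPLAIN SELECT u.user_name, o.order_no, o.order_state
FROM ds_user u
JOIN ds_order o ON o.user_id = u.id
WHERE u.id = 1001;
```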

2. Composite index

A composite index contains two or more columns. Compared with a single-column index it is considerably more complex: how to build it depends heavily on the business, and the order of the columns relative to the query conditions must be considered.

CREATE INDEX state_create_time_index ON `ds_order`(`create_time`,`order_state`);

The above is a composite index. It effectively provides two indexes: (create_time) and (create_time, order_state). Queries against it follow the leftmost-prefix principle and must use the columns in order, as described in detail below.

Business scenarios: a composite index is used here for two reasons. Statistics on order status are common in business development and are usually analyzed from the aggregated results. In addition, the operations system applies a filter on the creation time period by default, to avoid scanning all data in real time. Many other common queries follow the same pattern of a condition plus a time period.
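A status statistic restricted to a time period, as described above, could be written as follows. The date range is an illustrative assumption.

```sql
-- Count orders per status within one month.
-- The query references only create_time and order_state, both of which
-- are stored in the composite index on (create_time, order_state).
EXPLAIN SELECT order_state, COUNT(*) AS order_count
FROM ds_order
WHERE create_time >= '2020-08-01 00:00:00'
  AND create_time <  '2020-09-01 00:00:00'
GROUP BY order_state;
```

Because every referenced column is in the composite index, the query can in principle be answered from the index alone, without reading the table rows. Whether the optimizer actually chooses that plan depends on the data distribution and the MySQL version, so it is worth confirming with EXPLAIN.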

3. Prefix index

If the column to be indexed holds long strings, the index becomes bulky and may not help much. In that case, only the leading part of the column can be indexed to save space. The trade-off is that the selectivity of the index may drop; in other words, one prefix value may match many similar rows.

ALTER TABLE ds_order ADD KEY (order_no(30)) ;

Because the order number is long, the first 30 characters are used as the prefix index for order number queries. This touches on a classic business topic: the order number scheme.

Business scenarios: a typical use of prefix indexes is handling order numbers. A seemingly long order number actually encodes a lot of information:

  • Time point: the time the order was generated (year, month, day, hour, minute, second);
  • Identification bits: a unique UID that guarantees the uniqueness of the order number;
  • Embedded marker 1: in many businesses, the product category is recorded in the order number;
  • Embedded marker 2: usually identifies product attributes, such as color or flavor;
  • Obfuscation segment: a random segment inserted to prevent the order number from being analyzed;

As this analysis shows, a real order number is very long, so a prefix index is needed. Ideally, the prefix length is chosen so that the prefix is about as selective as the whole column. For the order number above:

  • Most services filter by time point, so the index length is 14 characters;
  • For highly concurrent services where many orders share the same time point, the index length is the time point plus the identification bits;

Note: where the business permits, the indexed prefix should generally be unique, as with the time point plus identification bits above.
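A common way to choose the prefix length is to compare the selectivity of several candidate prefixes against the full column. The sketch below assumes candidate lengths of 14, 20, and 30 characters; 14 and 30 appear in the text above, while 20 is an arbitrary intermediate value.

```sql
-- Selectivity = distinct values / total rows; closer to 1 is better.
SELECT
  COUNT(DISTINCT order_no) / COUNT(*)           AS full_selectivity,
  COUNT(DISTINCT LEFT(order_no, 14)) / COUNT(*) AS prefix14_selectivity,
  COUNT(DISTINCT LEFT(order_no, 20)) / COUNT(*) AS prefix20_selectivity,
  COUNT(DISTINCT LEFT(order_no, 30)) / COUNT(*) AS prefix30_selectivity
FROM ds_order;
```

The shortest prefix whose selectivity is close to full_selectivity is usually a reasonable choice. The query scans the whole table, so it is better run against a copy or during a quiet period.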

4. Other indexes

Other types, such as full-text indexes, are not used in many scenarios. When the data is huge and needs to be searched, a dedicated search middleware is usually chosen instead. Explicit unique indexes can also be avoided by checking uniqueness in the program, which avoids unfriendly exceptions being thrown.

3、 Index query

Creating the optimal index is not easy, and making sure queries actually use the index is just as hard. From experience, most of the time a performance problem is exposed first; we then review the SQL of the query and optimize it for that specific problem.

1. Single column query

Here the primary key index is queried directly. A MySQL primary key is usually auto-incremented, and lookups on it are very fast.

EXPLAIN SELECT * FROM ds_order WHERE id=2;
EXPLAIN SELECT * FROM ds_order WHERE id=1+1;
EXPLAIN SELECT * FROM ds_order WHERE id+1=1;

In the first two statements, id = 2 and id = 1 + 1, MySQL evaluates the constant expression automatically. In the third, id + 1 performs an operation on the indexed column, which directly prevents the primary key index from being used. The basic strategy: if an operation on an indexed column is unavoidable, put that logic in the program. At the MySQL level, the cleaner the SQL statement, the better.
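The same rule can be illustrated by rewriting a condition so that the indexed column stands alone on one side of the comparison. The second pair of statements applies the idea to create_time; the dates are illustrative values.

```sql
-- Operation on the indexed column: the index cannot be used for the lookup.
EXPLAIN SELECT * FROM ds_order WHERE id + 1 = 3;
-- Same condition with the arithmetic moved to the constant side.
EXPLAIN SELECT * FROM ds_order WHERE id = 3 - 1;

-- Function applied to the indexed column: same problem.
EXPLAIN SELECT * FROM ds_order WHERE DATE(create_time) = '2020-08-01';
-- Equivalent range condition that leaves create_time untouched.
EXPLAIN SELECT * FROM ds_order
WHERE create_time >= '2020-08-01 00:00:00'
  AND create_time <  '2020-08-02 00:00:00';
```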

2. Prefix index query

With a prefix index, you can filter on a prefix of a specific length using LIKE, or query by the full order number.

EXPLAIN SELECT * FROM ds_order WHERE order_no LIKE '202008011314158723628732871625%';
EXPLAIN SELECT * FROM ds_order WHERE order_no='20200801131415872362873287162572367';
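Two further variations are sketched below, using illustrative literal values. A LIKE pattern can only use the index when it starts with fixed characters; a pattern that begins with a wildcard gives the B-tree nothing to search on.

```sql
-- Filter by the date portion at the start of the order number.
EXPLAIN SELECT * FROM ds_order WHERE order_no LIKE '20200801%';

-- Leading wildcard: the index cannot be used to narrow the search.
EXPLAIN SELECT * FROM ds_order WHERE order_no LIKE '%2572367';
```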

3. Composite index query

The most troublesome queries are those on composite indexes, that is, queries that combine several conditions and need to use an index:

EXPLAIN SELECT * FROM ds_order 
WHERE create_time>'2020-08-01 00:00:00' AND order_state='1';

Because the conditions follow the column order of the composite index, the composite index state_create_time_index is used.

EXPLAIN SELECT * FROM ds_order WHERE create_time>'2020-08-01 00:00:00';

The query above uses only the create_time column, and it also uses the index.

EXPLAIN SELECT * FROM ds_order WHERE order_state='1';

If only the order_state condition is used, the result shows a full table scan.

EXPLAIN SELECT * FROM ds_order 
WHERE create_time>'2020-08-01 00:00:00' AND order_no LIKE '20200801%';

The query above relies on the create_time column of the composite index and the single-column index on order_no, so that the query conditions are covered by indexes.

From the query cases above, the precautions for using a composite index are:

  • A query must start from the leftmost column of the composite index;
  • Columns of the composite index cannot be skipped in a query; otherwise the index cannot be used;
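If queries on order_state alone turn out to matter, one option is a second composite index with the equality column first. This is a sketch, not something the original article creates: the index name order_state_create_time_index is hypothetical.

```sql
-- Hypothetical additional index: equality column first, range column second.
CREATE INDEX order_state_create_time_index
  ON ds_order (order_state, create_time);

-- order_state is now the leftmost column, so this query can use the index.
EXPLAIN SELECT * FROM ds_order WHERE order_state = 1;

-- Equality on order_state plus a range on create_time can use both columns.
EXPLAIN SELECT * FROM ds_order
WHERE order_state = 1
  AND create_time > '2020-08-01 00:00:00';
```

The trade-off is one more index to maintain on every write. Since order_state has only a few distinct values, the optimizer may still prefer a full scan for the first query when most rows share the same status.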

4、 Other notes on indexes

1. Advantages of indexing

  • A primary key or unique index guarantees the uniqueness of data in a table;
  • An index improves query efficiency by reducing the number of rows scanned;

2. Disadvantages of index

  • Creating and maintaining indexes consumes space and time;
  • Every insert, delete, and update (everything other than a query) requires the indexes to be maintained dynamically;
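The space cost can be inspected directly. The statements below list the indexes on the order table and compare the approximate size of its data with the size of its indexes; the sizes reported by information_schema are estimates in bytes.

```sql
-- List every index on the order table, with its columns and cardinality.
SHOW INDEX FROM ds_order;

-- Approximate data size versus index size for the table.
SELECT table_name, data_length, index_length
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'ds_order';
```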

3. Summary of index usage

The indexing mechanism in MySQL is genuinely complex. Developers, who are not professional DBAs, should be familiar with the common index structures. In my two years at a so-called big company, the SQL touching core tables in every release was reviewed and signed off by professional DBAs; for complex queries we submitted the requirements and the DBAs wrote the SQL directly. In most companies, of course, there is no DBA, and developers have to handle this themselves during development. That requires a certain sensitivity to business data and execution monitoring of the core interfaces, so that a query can be optimized as soon as it starts to get slow. Building up this experience is a tedious, gradual process.
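On the database side, one simple form of execution monitoring is the MySQL slow query log. The sketch below assumes an account with the privilege to set global variables, and the one-second threshold is an arbitrary example value.

```sql
-- Check the current slow query log settings.
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable the log and record statements that take longer than 1 second.
-- The new threshold applies to connections opened after the change.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
```

Statements that show up in the log are natural candidates for EXPLAIN and for the index review described in the sections above.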

5、 Source code address

GitHub · address
https://github.com/cicadasmile/mysql-data-base
Gitee · address
https://gitee.com/cicadasmile/mysql-data-base

Recommended reading: MySQL database Series

Serial number Article title
01 MySQL Foundation: classic practical query cases, summary and collation
02 MySQL Foundation: from five dimensions to examine the table structure design
03 MySQL Foundation: summary of system and user defined function, detailed explanation of trigger use
04 MySQL Basics: stored procedures and views, usage and features
05 MySQL Foundation: logic architecture diagram and detailed explanation of InnoDB storage engine
06 MySQL Foundation: transaction management, lock mechanism
07 MySQL Foundation: user and privilege management, log system introduction
01 MySQL advanced: analyze server performance based on multiple dimensions
02 MySQL advanced: index system division, B-tree structure description
