MySQL advanced optimization


Index (ALI protocol)

MySQL advanced optimization

Index data structure

B + tree


MySQL advanced optimization

Take col2 as an example. When inserting data, a hash operation will be performed on the inserted data, and the hash hash value of the operation will be taken as the address value.

However, hash can not handle the range search well, but the B + tree maintains all data at the bottom, and each layer of the tree is orderly.

B tree

  • The leaf node has the same depth, and the pointer of the leaf node is null
  • All index elements are not duplicate
  • The data indexes in the node are arranged incrementally from left to right

B + tree

Multi fork balanced tree

  • Non leaf nodes do not store data, only store indexes (redundant), and can put more indexes
  • Yedina contains all index fields
  • Leaf nodes are connected with pointers to improve the performance of interval access

Check that the size of a node in the B + tree set by the bottom layer of MySQL is 16kb

SHOW GLOBAL STATUS LIKE 'Innodb_page_size';

MySQL advanced optimization

Take the following figure as an example. For the three-tier index, bigint takes up 8 bytes, and then a node will have many indexes. An index will have a pointer corresponding to 6 bytes, so the total size of an index (8 + 6) is 14 bytes, and a node can put 1170 indexes.

The second layer can also put 1170 indexes, but each index in the third layer corresponds to not only bigint, but also data data. The data data may be the address of the disk file of the row where the index is located, or the fields of other columns of the row where the index is located. There can be no 1170 indexes. If one index 1KB is calculated, each node in the third layer will put 16 indexes.

Therefore, the B + tree with tree height of 3 will put a total of 1170 x 1170 x 16 = 21902400 indexes

And the root node of the underlying B + tree of MySQL index is ram resident.

MySQL advanced optimization

MySQL’s storage engine is table level.

Create two tables with different database engines

CREATE TABLE `test_innodb_lock`(
    `a` int(11) NOT NULL,
    `b` varchar(255) DEFAULT NULL,
    KEY `idx_a` (`a`),
    KEY `idx_b` (`b`)

CREATE TABLE     `test_myisam`(
    `id` int(11) NOT NULL auto_increment,
    `name` varchar(20) DEFAULT NULL,
    PRIMARY KEY(`id`)
)ENGINE=myisam auto_increment=2 default charset=utf8

You can see that it is actually stored in a specific directory under the MySQL data directory in the disk file.

MySQL advanced optimization

SDI is the abbreviation of serialized dictionary information. It is a new product introduced after MySQL 8.0 redesigned the data dictionary. For non InnoDB engines, MySQL provides another readable file format to describe the metadata information of the table, which is stored in the database directory named $tbname.sdi on the disk.

  • MyISAM index files and data files are separate (non clustered)

The search process is to determine whether there is an index. If so, find the index according to the MYI file, and then find the file address of the line according to MyD

MySQL advanced optimization

  • InnoDB index implementation (aggregation)
    • The table data file itself is an index structure file organized by B + tree
    • The clustered index leaf node contains complete data records
    • Why must InnoDB tables have primary keys, and it is recommended to use integer self incrementing primary keys?
    • Why does the leaf node of a non empty primary key index structure store primary key values? (consistency and storage savings)

If the InnoDB table does not have a primary key, it will automatically find a column of data according to the. Each data in this column will be unique, and that column of data will be taken out to maintain the table.
If there is no unique data column, InnoDB will add a hidden column at the end to help you maintain the table. They are all B + trees. MySQL resources are very tight. It’s best to establish a primary key for each table, and it’s best to use integer self increment. String comparison such as UUID is not recommended because the efficiency will be slow. Why self increment? If not, B + should be self balanced, which will lead to efficiency such as executing insert statements.

MySQL advanced optimization

MySQL advanced optimization

There is a two-way pointer between the leaf nodes of the B + tree index, which places the disk file address of the adjacent node. The position of the adjacent node can be quickly located at any node through the pointer.

To find data with an index greater than 20, quickly locate the node with index 20 from the root node, and then directly take out all the data after index 20.

There are no two-way pointers between leaf nodes of B tree. The B tree has data for each node. The B + tree places all the data on the leaves. The non leaf nodes have redundant indexes, and the leaves have complete indexes.

Because each node on the B tree will place data, there will be fewer nodes in the layer. If the same amount of data is stored, the number of layers of the B tree will be higher.

Joint index

Create sample table

CREATE TABLE `employees` (
    `id` int(11) not null auto_increment,
    `Name ` varchar (24) not null default '' comment 'name',
    `Age ` int (11) not null default '0' comment 'age',
    `Position ` varchar (20) not null default '' comment 'position',
    `hire_ time` timestamp not null default current_ Timestamp comment 'induction time',
    primary key(`id`),
    key `idx_name_age_position` (`name`,`age`,`position`) using btree
)ENGINE = innodb auto_ Increment = 4 default charset = utf8 comment = 'employee record table';

INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei',22,'manager',NOW()),('HanMeimei',23,'dev',NOW()),('Lucy',23,'dev',NOW());

MySQL advanced optimization

It is compared in the order of indexing.

key `idx_name_age_position` (`name`,`age`,`position`) using btree;

EXPLAIN SELECT * FROM employees WHERE NAME = 'Bill' AND age = 31;
EXPLAIN SELECT * FROM employees WHERE age = 30 AND position = 'dev';
EXPLAIN SELECT * FROM employees WHERE position = 'manager';

The leftmost prefix principle, so only the first of the above statements will be executed.
If the first field of the index is ignored and the following fields are directly searched in the whole table, the fields are not arranged in order and cannot be searched through the index. Full tree comparison is also required.

This work adoptsCC agreement, reprint must indicate the author and the link to this article

Recommended Today

Parsing c + + STL container list is different from Python list

class template std::list preface 📄Content of this article:C++ STL list 📇 Column:C/C++ | Comprehensive understanding of C + + STL Standard Template Library 👤 Author URI :Bauhinia fish 📆 Creation time:2022-1-3 📟 Little tip: the article is very long and detailed. It is recommended to collect it first 🔙Return to directory (recommended Collection):Comprehensive understanding of […]