Case analysis of page splitting in the MySQL clustered index

Time:2021-4-16

This article explains page splitting in the MySQL clustered index. The details are as follows:

In MySQL, the MyISAM storage engine uses non-clustered indexes, while the InnoDB storage engine uses a clustered index.

The characteristics of the clustered structure are as follows:

  • When querying by primary key, no second lookup back to the table is needed, because the row data is stored in the leaf node of the primary key index
  • Inserting rows with irregular (non-sequential) primary keys causes frequent page splits

Why page splitting?

This is because the clustered index is a B+tree, and its leaf pages store the full row for each primary key. If the primary keys of the inserted rows increase monotonically, each new row is simply appended to the last leaf page and the other pages are left untouched. If the primary keys arrive in irregular order, a new row often has to go into the middle of an existing page; when that page is already full, InnoDB must split it and move part of its rows to a new page. This is a page split.
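The mechanism described above can be illustrated with a small simulation. This is only a sketch of the idea, not InnoDB's actual algorithm: it models the leaf level as a list of sorted pages, assumes a hypothetical capacity of 5 rows per page, and assumes that appending past the last full page opens a fresh page without moving any rows. The names `simulate_inserts` and `PAGE_CAPACITY` are invented for this example.

```python
import bisect
import random

PAGE_CAPACITY = 5  # assumed rows per leaf page (illustrative only)


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into a toy leaf level and count page splits."""
    pages = []       # leaf pages in key order; each page is a sorted list
    firsts = []      # firsts[i] == pages[i][0], used to locate the target page
    splits = 0       # splits that had to move existing rows
    rows_moved = 0   # rows relocated by those splits
    for key in keys:
        if not pages:
            pages.append([key])
            firsts.append(key)
            continue
        # Find the page whose key range covers this key.
        i = max(bisect.bisect_right(firsts, key) - 1, 0)
        page = pages[i]
        if len(page) < capacity:
            bisect.insort(page, key)
            firsts[i] = page[0]
            continue
        if i == len(pages) - 1 and key > page[-1]:
            # Append pattern: start a new page, nothing is moved.
            pages.append([key])
            firsts.append(key)
            continue
        # The page is full and the key belongs inside it: split in the middle.
        mid = capacity // 2
        right = page[mid:]
        del page[mid:]
        pages.insert(i + 1, right)
        firsts.insert(i + 1, right[0])
        splits += 1
        rows_moved += len(right)
        target = right if key > right[0] else page
        bisect.insort(target, key)
        firsts[i] = page[0]
    return {"pages": len(pages), "splits": splits, "rows_moved": rows_moved}


if __name__ == "__main__":
    sequential = list(range(1, 10001))
    shuffled = sequential[:]
    random.Random(42).shuffle(shuffled)
    print("sequential:", simulate_inserts(sequential))
    print("shuffled:  ", simulate_inserts(shuffled))
```

In this toy model, sequential keys never trigger a split and fill every page completely, while shuffled keys split pages repeatedly and leave them partly empty, so the same rows occupy more pages.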

Test:

Create 2 tables


create table t8(
id int primary key,
c1 varchar(500),
c2 varchar(500),
c3 varchar(500),
c4 varchar(500),
c5 varchar(500),
c6 varchar(500)
) engine innodb charset utf8;
create table t9(
id int primary key,
c1 varchar(500),
c2 varchar(500),
c3 varchar(500),
c4 varchar(500),
c5 varchar(500),
c6 varchar(500)
) engine innodb charset utf8;

Write a PHP script that inserts 10,000 rows with sequential primary keys into t8 and 10,000 rows with shuffled primary keys into t9, then compare the timings.

<?php
// Uses the mysqli extension; the old mysql_* functions were removed in PHP 7.
set_time_limit(0);
$conn = mysqli_connect('localhost', 'root', '1234', 'test');
if (!$conn) {
    die('Connection failed: ' . mysqli_connect_error());
}
// Each of the six varchar(500) columns is filled completely.
$str = str_repeat('a', 500);
// Sequential (auto-increment style) primary keys
$startTime = microtime(true);
for ($i = 1; $i <= 10000; $i++) {
    mysqli_query($conn, "insert into t8 values($i,'$str','$str','$str','$str','$str','$str')");
}
$endTime = microtime(true);
echo ($endTime - $startTime) . '<br/>';
// Shuffled primary keys
$arr = range(1, 10000);
shuffle($arr);
$startTime = microtime(true);
foreach ($arr as $i) {
    mysqli_query($conn, "insert into t9 values($i,'$str','$str','$str','$str','$str','$str')");
}
$endTime = microtime(true);
echo ($endTime - $startTime) . '<br/>';

Test results

10,000 rows with sequential primary keys (t8): 998 seconds (about 16.6 minutes)
10,000 rows with shuffled primary keys (t9): 1939 seconds (about 32.3 minutes)
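To put the two timings side by side, the short calculation below converts them into insert throughput and a slowdown factor. It uses only the figures reported above; the constant and function names are made up for this example.

```python
ROWS = 10_000
SEQUENTIAL_SECONDS = 998   # reported time for sequential primary keys
SHUFFLED_SECONDS = 1939    # reported time for shuffled primary keys


def rows_per_second(rows, seconds):
    """Average insert throughput over the whole run."""
    return rows / seconds


slowdown = SHUFFLED_SECONDS / SEQUENTIAL_SECONDS

print(f"sequential: {rows_per_second(ROWS, SEQUENTIAL_SECONDS):.2f} rows/s")  # 10.02
print(f"shuffled:   {rows_per_second(ROWS, SHUFFLED_SECONDS):.2f} rows/s")    # 5.16
print(f"slowdown:   {slowdown:.2f}x")                                         # 1.94x
```

In this single run the shuffled inserts took roughly twice as long as the sequential ones.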

Conclusion

The primary key of a clustered index should be a continuously increasing value rather than a random one (do not use a random string or UUID); otherwise inserts cause a large number of page splits and row moves. With InnoDB, the primary key is best defined as:

id int unsigned primary key auto_increment
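The difference between the two kinds of key can be checked without a database. The sketch below measures how many keys in a sequence are smaller than the largest key seen so far, that is, how many inserts would land in the middle of the index instead of at its end. It assumes UUIDs are stored as their string form and compared as strings; `out_of_order_fraction` is a hypothetical helper written for this example.

```python
import itertools
import uuid


def out_of_order_fraction(keys):
    """Fraction of keys smaller than the largest key seen before them."""
    keys = list(keys)
    if not keys:
        return 0.0
    largest = keys[0]
    out_of_order = 0
    for key in keys[1:]:
        if key < largest:
            out_of_order += 1   # would be inserted before existing rows
        else:
            largest = key       # would be appended at the end of the index
    return out_of_order / len(keys)


# Stand-in for an auto_increment column: 1, 2, 3, ...
counter = itertools.count(1)
sequential_keys = [next(counter) for _ in range(10_000)]

# Random version-4 UUIDs in string form.
uuid_keys = [str(uuid.uuid4()) for _ in range(10_000)]

print("auto_increment:", out_of_order_fraction(sequential_keys))
print("uuid4:         ", out_of_order_fraction(uuid_keys))
```

The auto-increment sequence is never out of order, while almost every random UUID falls somewhere before the current maximum, which is the insert pattern that leads to page splits.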


I hope this article will be helpful to your MySQL database design.
