MySQL operation principle [table]

Time:2021-9-24

According to MySQL technology insider: InnoDB storage engine (Second Edition)Nuggets brochure “how MySQL works: understanding MySQL from the root”A summary of the two books

Screenshots of the article are taken from inside MySQL Technology: InnoDB storage engine (Second Edition)

This article will introduce the logical storage and implementation of MySQL tables, that is, if data is organized and stored in tables.

Table overview

A table is a collection of data about a specific entity, which is also the core of the relational database model.

Index organization table

In InnoDB storage engine, tables are organized and stored according to the primary key order. Tables in this storage mode are called index organization tables.

Each table has a primary key. If not, a primary key will be selected or created as follows:

  • First, judge whether there is a unique not null index in the table. If so, the column is the primary key
  • If the above conditions are not met, the InnoDB storage engine automatically creates a 6-byte pointer
  • When there are multiple non empty unique indexes in the table, InnoDB storage engine will select the first defined non empty unique index as the primary key when creating the table

InnoDB storage logical structure

All data are logically stored in a space, which is what we call the table space. The table space can be seen as the highest level of the logical structure of InnoDB storage engine. The structure diagram is as follows (the screenshot of the table is from inside MySQL Technology: InnoDB storage engine (Second Edition))

MySQL operation principle [table]

From the structure diagram, you can see that the table space is composed of four parts: segment, area, page and row

  • Segment: the table space is composed of various segments. Common segments include data segment, index segment, rollback segment, etc
  • Area: the section consists of each area. The area consists of consecutive pages. In any case, the size of the area is 1MB
    • In order to ensure the continuity of pages in a zone, the InnoDB storage engine requests 4 ~ 5 zones from the disk at a time. By default, the size of InnoDB storage engine pages is 16kb, that is, there are 64 consecutive pages in a zone
  • Page: the smallest unit of disk management of InnoDB storage engine [a separate article will be written to introduce data pages in detail]. Common page types include: data page, undo page, system page, transaction data page, insert buffer bitmap page, insert buffer free page, uncompressed binary large object page and compressed binary large object page
  • Row: InnoDB storage engine is column oriented, that is, data is stored by row

InnoDB row record format

We normally insert data into the table in behavioral units. The format of these data stored on disk is calledLine formatperhapsRecord format, there are many line record formats, such as:CompactRedundantDynamicCompressed, starting from MySQL 5.1, the default line format isCompact
You can modify the line record format with the following command

Create table table name (column information) row_ Format = line format name

Alter table table name row_ Format = line format name

CompactThe structure diagram of row record format (screenshot from MySQL technology insider: InnoDB storage engine (Second Edition)) is as follows:

MySQL operation principle [table]
  • Variable length field length list: the header of compact row record format is a non null variable length field length list, which is placed in reverse order
    • Store the byte length occupied by the real data of all variable length fields in the variable length field length list
    • If the column length is less than 255 bytes, it is represented by 1 byte; if it is greater than 255 bytes, it is represented by 2 bytes
  • Null flag bit: indicates whether there is a null value in the row data. If so, 1 indicates
  • Record header information: it is fixed to occupy 5 bytes, that is, 40 binary bits. Different bits represent different meanings, as shown in figure ():

MySQL operation principle [table]
  • Redundant line record format: line record format before MySQL version 5.1
  • Row overflow data [described in detail below]
  • Compressed and dynamic row record formats
    • The dynamic row record format is similar to the compact row record format, except that it is different when dealing with row overflow data
    • The difference between the compressed line record format and the dynamic line record format is that the compressed line record format uses the compression algorithm to compress the page to save space

row-overflow data

The InnoDB storage engine can store some data in a record outside the real data page. It is generally believed that only blob and lob, which are large object data types, will store the data outside the page, but this understanding is a little biased,
Blobs can also not store data outside the page, and even varchar types may still be stored as row overflow data.

Overflow caused by too much data in the record

Not only too much big object data such as blob and lob will overflow, but also varchar type will overflow.

Critical point of row overflow

How many bytes of data does a row overflow occur when a column stores data?
MySQL stipulates that a data page should store at least two data records. We can first analyze how to use the space in the next page

  • Each page needs to store other information besides our data records, such asFile HeaderPage HeaderWait, what do you need132A byte space, and other spaces can be used to store data records
  • Each record requires additional27Bytes of information,27Byte information includes
    • The length of 2 bytes used to store real data
    • 1 byte is used to store whether the column is null
    • 5-byte header information
    • 6-byte row_ ID column
    • 6-byte transaction_ ID column
    • 7-byte roll_ Pointer column

Assuming that the number of data bytes stored in a column is n, the uncle of MySQL design stipulates that if the column does not overflow, the following formula needs to be satisfied:

132 + 2×(27 + n) < 16384

It is obtained that n is 8099, that is, this column can store 8099 bytes of information. If it is greater than 8099 bytes, it is overflow. In the case of only one column, if there are multiple columns, this value can not be used as a reference, so the final conclusion is: you don’t need to pay attention to the critical point, As long as we know that if the number of bytes of data stored in a column of a record is very large, the column may become an overflow column.

This work adoptsCC agreement, reprint must indicate the author and the link to this article

Recommended Today

Supervisor

Supervisor [note] Supervisor – H view supervisor command help Supervisorctl – H view supervisorctl command help Supervisorctl help view the action command of supervisorctl Supervisorctl help any action to view the use of this action 1. Introduction Supervisor is a process control system. Generally speaking, it can monitor your process. If the process exits abnormally, […]