How is MySQL data stored on disk, and how is the stored data found?


This article is contributed by zyz1992

I believe many of you are familiar with MySQL. It is one of the most widely used relational databases today, and there is a great deal to learn about it.

But have you ever thought about where MySQL’s data actually lives, and how it is stored?

This article analyzes these questions in depth. It is long, so I suggest bookmarking it and reading it carefully when you have some quiet time. It will be worth it!


InnoDB storage format

We know that the data stored in MySQL, a relational database, is eventually persisted to files on disk. The physical unit of storage in those files is the data page (if you don’t know what a data page is, don’t worry for now; it will be covered separately in a later article). A data page holds records row by row, but how is each row laid out inside the page?

Let’s focus on InnoDB, the most commonly used storage engine in MySQL, and look at its storage format.

MySQL works with three record formats:

Server layer format: independent of the storage engine. Row-based binlog records, for example, use this format (as described in an earlier article, the binlog is the file that MySQL master-slave replication depends on).

Index tuple format: the intermediate form a record takes while InnoDB is working on it, that is, the format InnoDB keeps in memory (in other words, our insert, delete and update operations are performed in memory, and this form is only temporary).

Physical storage format: the format of a record inside a physical page, for example the compact format. It corresponds one to one with the index tuple format (this is the real format of the data stored on disk).

The InnoDB storage engine, like most databases, stores data in the form of rows. We can check the row format of a table with SHOW TABLE STATUS.

The InnoDB storage engine supports four row formats: compact, redundant, dynamic, and compressed. Up to MySQL 5.6 the default is compact; from MySQL 5.7 on the default is dynamic (set by innodb_default_row_format), which stores ordinary rows the same way as compact and differs mainly in how very long columns are moved off-page. This article uses compact.


We can ignore the other columns here and look only at the Row_format column. In this example the row format is compact, and a row stored in compact format looks roughly like this:


In each stored row, the first thing we find is not the column values but some descriptive information. When a row is stored, a few extra fields are written to describe it, much like the descriptor data kept for each cache page in the buffer pool.

The picture above can be simplified as follows. Ignore the transaction ID and the rollback pointer for now, so they don’t get in the way:


1. How the variable-length field varchar is stored

In practice, the length of the data we store is usually not known in advance; most of it is variable-length data. Take varchar as an example. Suppose a table has three fields of types varchar(10), char(1) and char(1). As we all know, char holds fixed-length data, while varchar does not. Suppose the table holds the following two rows:

The first row: mysql a a; the second row: dog B C. Here is a picture of how we imagine the data:


On disk, however, it does not look like this. As mentioned earlier, the tablespace and the row are logical concepts, while the data page is a physical one, so what is actually on disk looks different.

On disk the two records look roughly like this: mysql a a dog B C. They are stored right next to each other.

Does that make a single record look hard to find? It would be, and that is exactly why MySQL’s row format puts the variable-length field length list, the NULL flag bits and the record header in front of the data. These describe one row, and MySQL uses them to locate the individual values inside it.

Take the first row as an example. Keep in mind that MySQL knows the type of every field; everything below builds on that. The value mysql has 5 characters, which is 0x05 in hexadecimal, so the row is stored like this:


The second row is stored in the same way:


You can probably guess how MySQL reads the data now. It first looks at the variable-length field length list to learn how long each variable-length field is. For the first row the length is 5, so it takes 5 characters from mysql a a dog B C, which gives mysql. Two char(1) fields follow, so it then takes a and a in turn.
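The read process just described can be sketched in a few lines of Python. This is only a simplified model of the example, not InnoDB’s real byte layout: it assumes a single length byte in front of each row, leaves out the NULL flag bits and the record header, and the function names encode_row and decode_rows are made up for illustration.

```python
# Simplified model: one varchar(10) column followed by two char(1) columns.
# The NULL flag bits and the record header are deliberately left out.

def encode_row(varchar_value, c1, c2):
    data = varchar_value.encode("ascii")
    # One length byte for the single variable-length column, then the data.
    return bytes([len(data)]) + data + c1.encode("ascii") + c2.encode("ascii")


def decode_rows(buf):
    rows = []
    pos = 0
    while pos < len(buf):
        length = buf[pos]              # length of the varchar value, e.g. 0x05
        pos += 1
        value = buf[pos:pos + length].decode("ascii")
        pos += length
        c1 = chr(buf[pos])             # char(1) is fixed width: no length needed
        c2 = chr(buf[pos + 1])
        pos += 2
        rows.append((value, c1, c2))
    return rows


# The two rows are stored right next to each other.
disk = encode_row("mysql", "a", "a") + encode_row("dog", "B", "C")
print(disk.hex(" "))
print(decode_rows(disk))
```

Without the length byte, nothing in the run of characters would tell the reader where mysql ends and the first char(1) value begins.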

Now for the next level. We are going from simple to complex, so let’s take our time. What we just covered is a very simple case, meant to give you the basic idea before digging deeper, one step at a time.

What if there are several varchar fields? For example: varchar(3), varchar(10), varchar(4), char(1), with a record AAA, BB, CCCC, D. Can you infer from the above what the row looks like on disk?

Did you think it must be: 0x03 0x02 0x04, NULL flag bits, record header, AAA BB CCCC D? If so, go and stand facing the wall :) That is not how it is stored.

When there are several variable-length fields, the compact row format stores the lengths of all of them at the beginning of the record as a list (the variable-length field length list mentioned above), in the reverse order of the columns. Here that gives 0x04 0x02 0x03, roughly as follows:


Why is the list stored in reverse order? Each record has a next_record pointer, which points to the position between the record header and the real data of the next record. That position is convenient: reading to the left gives the row’s descriptive information, and reading to the right gives the real data. Because the list is reversed, the lengths met while reading leftward come in the same order as the columns met while reading rightward. Here is a picture to help:
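To make the two reading directions concrete, here is a minimal Python sketch of the AAA, BB, CCCC, D record. It is a toy model under stated assumptions: one byte per length, one byte of NULL flags (all zero), a zero-filled 5-byte record header, and ASCII data. The names build_record and read_varchars are hypothetical.

```python
HEADER_SIZE = 5      # 40-bit record header, zero-filled in this toy model
NULL_FLAG_SIZE = 1   # one byte of NULL flags, all zero (no NULL values)


def build_record(varchars, fixed):
    # Lengths of the variable-length columns, in REVERSE column order.
    lengths = bytes(reversed([len(v) for v in varchars]))
    meta = lengths + bytes(NULL_FLAG_SIZE) + bytes(HEADER_SIZE)
    data = "".join(varchars).encode("ascii") + fixed.encode("ascii")
    # The second value is the offset between the header and the real data,
    # which is the position a next_record pointer would refer to.
    return meta + data, len(meta)


def read_varchars(record, origin, n_varchars):
    left = origin - HEADER_SIZE - NULL_FLAG_SIZE - 1  # nearest length byte
    right = origin                                    # start of the real data
    values = []
    for _ in range(n_varchars):
        length = record[left]     # walk LEFT through the length list
        left -= 1
        values.append(record[right:right + length].decode("ascii"))
        right += length           # walk RIGHT through the data
    return values, record[right:].decode("ascii")


record, origin = build_record(["AAA", "BB", "CCCC"], "D")
print(record[:3].hex(" "))               # 04 02 03
print(read_varchars(record, origin, 3))  # (['AAA', 'BB', 'CCCC'], 'D')
```

Starting from the same position, the first length found on the left belongs to the first column found on the right, which is the point of the reversed list.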


At this point, let’s make a brief summary.

Summary of data storage on disk in MySQL

Records are stored next to each other on disk

Data is stored in data pages, a structure designed by MySQL, and each data page holds records row by row

The row format discussed here is compact

Each row carries its own description, made up of the [variable-length field length list], the [NULL flag bits] and the [record header]

Each row has a next_record pointer, which points to the position between the next row’s record header and its real data, for easy addressing

The lengths of the varchar columns are listed in reverse order (opposite to the column order); the figure above shows why


2. How NULL fields are stored

The cases above are the ordinary ones: none of the fields was NULL. Whether variable-length or char, every field had a value. If a field is allowed to be NULL and its value really is NULL, how does MySQL handle it? Does it store NULL directly?

If MySQL stored NULL directly, it would be storing the string “NULL”. That clearly won’t do, because a string takes up space (the string NULL is four characters). A field with no value should not use that much space, so MySQL does not store it this way. Instead, MySQL records NULL values as binary bits, in reverse order.

How does MySQL store NULL values as bits?

The [NULL flag bits] (also called the NULL list) in the compact format above record which values are NULL. If a field’s value is NULL, its bit is set to 1; if the bit is 0, the field’s value is not NULL.

Still confused? Don’t worry, I’ll draw a picture and go through it in detail. Assume we have a students table:









It has a record like this (name is roles, address is NULL, gender is f, class is NULL, hobbies is hobby_xx):


Let’s first look at the variable-length field length list (remember that it is stored in reverse order):

name is roles, length 5, recorded as 0x05; address is NULL, so it is not in the list; gender is of type char, so it is not in the list; class is NULL, so it is not in the list; hobbies is hobby_xx, length 8, recorded as 0x08. In reverse order, the variable-length field length list is therefore 0x08 0x05.

Next come the NULL flag bits, which are also recorded from right to left. name is declared NOT NULL, so it has no bit at all (the NULL flag bits only cover fields that are allowed to be NULL). address is NULL, so its bit is 1; gender is not NULL, so its bit is 0; class is NULL, so its bit is 1; hobbies is not NULL, so its bit is 0. In field order that gives 1010, but the bits are stored in reverse order, so the stored value is 0101. The flags occupy whole bytes and the unused high-order bits are filled with 0, which gives 00000101.
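The bit arithmetic can be checked with a short Python sketch. It assumes, as in the example, that address, gender, class and hobbies are the nullable columns and that name has no bit; the function name null_flag_bits is made up for illustration.

```python
def null_flag_bits(nullable_values):
    """Build the NULL flags from the nullable columns, given in column order."""
    bits = 0
    for i, value in enumerate(nullable_values):
        if value is None:
            bits |= 1 << i   # first nullable column -> lowest (rightmost) bit
    return bits


# address, gender, class, hobbies (name is NOT NULL and has no bit)
flags = null_flag_bits([None, "f", None, "hobby_xx"])
print(format(flags, "08b"))  # 00000101
```

Reading the printed byte from right to left gives address = 1, gender = 0, class = 1, hobbies = 0, with the four high-order bits left at 0.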


Let’s simulate reading this record. MySQL knows the type of every field (fixed when the table was created), including that a NOT NULL field such as name has no bit in the NULL flags. The detailed steps are:

The name field is the primary key, so it has no NULL flag bit. It is a varchar field, so MySQL looks in the variable-length field length list, finds 0x05, and reads 5 characters from the data, which gives roles. The first field is read;

Next is the address field. MySQL knows its type, and its NULL flag bit says the value is NULL, so there is nothing to read. The second field is done;

Next is the gender field, which is a char type. Its bit is 0, so MySQL simply reads f;

Next is the class field. It is NULL, so MySQL does not look in the variable-length field length list at all;

Last is the hobbies field. It is not NULL and it is the second variable-length field with a value, so MySQL takes the next entry from the list, 0x08, and reads 8 characters, which gives hobby_xx.
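The reading steps above can be sketched as a small Python decoder. Several things here are assumptions made for illustration: the column types in SCHEMA (address and class are taken to be varchar, gender to be char(1)), a zero-filled 5-byte record header, one byte per length, and ASCII data. It models the walkthrough, not InnoDB’s actual code.

```python
HEADER_SIZE = 5  # 40-bit record header, zero-filled in this sketch

# (column name, kind, nullable): types are assumed from the walkthrough
SCHEMA = [
    ("name", "varchar", False),
    ("address", "varchar", True),
    ("gender", "char1", True),
    ("class", "varchar", True),
    ("hobbies", "varchar", True),
]


def decode(record, origin):
    n_nullable = sum(1 for _, _, nullable in SCHEMA if nullable)
    null_bytes = (n_nullable + 7) // 8           # flags occupy whole bytes
    flags_end = origin - HEADER_SIZE
    null_flags = int.from_bytes(record[flags_end - null_bytes:flags_end], "big")
    left = flags_end - null_bytes - 1            # next length byte (read leftward)
    right = origin                               # next data byte (read rightward)
    bit = 0
    row = {}
    for name, kind, nullable in SCHEMA:
        if nullable:
            is_null = (null_flags >> bit) & 1
            bit += 1
            if is_null:
                row[name] = None                 # NULL: no length entry, no data
                continue
        if kind == "varchar":
            size = record[left]
            left -= 1
        else:
            size = 1                             # char(1)
        row[name] = record[right:right + size].decode("ascii")
        right += size
    return row


record = bytes([0x08, 0x05, 0b00000101]) + bytes(HEADER_SIZE) + b"rolesfhobby_xx"
print(decode(record, origin=8))
```

Note that NULL columns consume a flag bit but neither a length entry nor any data bytes, which is why the length list holds only two entries for three varchar-like nullable columns plus name.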

That covers how the variable-length field length list and the NULL flag bits are used to read the field values of a row. Hopefully you can now picture a row’s description information when you look at a record. For now it is enough to understand how varchar and NULL are stored, because these two cases are special and the most common. Other field types are not discussed in this article.

We have not yet discussed the record header, so let’s look at it now.

Record header information

The record header consists of 40 bits (5 bytes). The division and meaning of the bits are as follows:
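As an illustration, the 40 bits can be unpacked with plain bit shifts. The field layout used below is an assumption on my part, based on the commonly documented compact record header rather than on this article’s figure: 2 reserved bits, delete_mask (1), min_rec_mask (1), n_owned (4), heap_no (13), record_type (3), next_record (16). The sketch also treats next_record as a plain unsigned 16-bit number, and parse_record_header is a made-up name.

```python
def parse_record_header(header):
    """Split a 5-byte (40-bit) record header into its fields (assumed layout)."""
    if len(header) != 5:
        raise ValueError("expected 5 bytes (40 bits)")
    value = int.from_bytes(header, "big")
    return {
        # bits 39 and 38 are reserved and ignored here
        "delete_mask": (value >> 37) & 0x1,
        "min_rec_mask": (value >> 36) & 0x1,
        "n_owned": (value >> 32) & 0xF,
        "heap_no": (value >> 19) & 0x1FFF,
        "record_type": (value >> 16) & 0x7,
        "next_record": value & 0xFFFF,
    }


# A made-up header: heap_no 2, ordinary record (type 0), next_record 44.
example = bytes([0x00, 0x00, 0x10, 0x00, 0x2C])
print(parse_record_header(example))
```

The widths add up to 2 + 1 + 1 + 4 + 13 + 3 + 16 = 40 bits, which matches the size stated above.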


The figure explains the role of each bit in the record header. Some of the concepts cannot be explained yet, because they only make sense once we get to indexes. What we need here is the meaning of each flag bit.

That level of understanding is enough for the record header. We will rarely touch anything deeper, so treat this section as background knowledge.


3. How data is stored on disk

A picture was drawn earlier:


We said the data is stored on disk like this: 0x03, NULL flag bits, record header, dog B C. In reality, the column data does not look like what we see. It is stored on disk encoded in the character set specified for the database.

It might really look like this: 0x03, NULL flag bits, record header, 123323223. In other words, the data on disk is not something we can read directly. The 123323223 is made up and has no meaning; it only illustrates that the values are stored in a specific character encoding.
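A quick way to see what encoded in a character set means is to encode the same text with different codecs in Python. The pairing of MySQL character set names with Python codecs below is an approximation for illustration only (MySQL’s latin1, for instance, is not byte-for-byte identical to Python’s latin-1 for every character), and stored_bytes is a made-up helper.

```python
def stored_bytes(value, codec):
    """Return the bytes a text value turns into under the given codec."""
    return value.encode(codec)


# (MySQL character set name, roughly corresponding Python codec)
CHARSETS = (("latin1", "latin-1"), ("utf8mb4", "utf-8"), ("utf16", "utf-16-be"))

for text in ("dog", "café"):
    for mysql_name, codec in CHARSETS:
        raw = stored_bytes(text, codec)
        print(text, mysql_name, len(raw), raw.hex(" "))
```

For plain ASCII text such as dog, the single-byte and UTF-8 encodings give the same three bytes, but once a character like é appears, the number of bytes depends on the character set, so the number of bytes and the number of characters are no longer the same thing.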

In addition, each row has hidden fields when it is stored. You have probably heard of the row ID (DB_ROW_ID) even if you have never used it: it is a unique identifier the database generates for each row. If we define neither a primary key nor a NOT NULL unique key on the table, the database maintains an auto-incrementing DB_ROW_ID field for us and uses it as the primary key.

Another hidden field is the transaction ID, drawn in the upper part of the second figure above. As the name suggests, it is related to transactions. The field is named DB_TRX_ID and will be explained in detail when we get to transactions. The last one, also drawn in the second figure, is the rollback pointer DB_ROLL_PTR. Rollback is another transaction concept and will be covered together with transactions as well.

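Now let’s review what the record from the students example looks like on disk: the variable-length field length list, the NULL flag bits, the 40-bit record header, and then the encoded column data (the trailing digits are made up):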

0x08 0x05 00000101 000001010000000000000000000000000000001021134 44 232343

While we are on storage, let’s look at one more storage-related concept: row overflow.

Row overflow

You may have wondered about one thing. We keep saying that MySQL stores data in data pages and that the records in a page are stored row by row. Under normal circumstances that works without any problem.

But what if a row is very large? The default data page size is 16 KB. Suppose a table has text and blob fields, and one row is far larger than a 16 KB page. This is called row overflow.

How does MySQL handle row overflow? It is quite simple: if one data page is not enough, it uses several, connected as a linked list. This works because data pages contain space for storing pointers. Understanding the concept is enough here; we are studying with a specific goal and don’t need to get to the bottom of everything.
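The linked-list idea can be modelled in a few lines of Python. This is a conceptual sketch of the simplified description above, not InnoDB’s real overflow-page format: it ignores page headers, assumes a full 16 KB of payload per page, and the Page class and the store and load functions are invented for the example.

```python
PAGE_SIZE = 16 * 1024  # default data page size, all used as payload here


class Page:
    def __init__(self, payload):
        self.payload = payload
        self.next = None       # pointer to the next page in the chain


def store(value):
    """Split a large value across as many pages as needed, linked together."""
    head = tail = None
    for start in range(0, len(value), PAGE_SIZE):
        page = Page(value[start:start + PAGE_SIZE])
        if head is None:
            head = page
        else:
            tail.next = page
        tail = page
    return head


def load(head):
    """Follow the chain of pages and reassemble the original value."""
    parts = []
    while head is not None:
        parts.append(head.payload)
        head = head.next
    return b"".join(parts)


blob = bytes(40_000)           # a 40,000-byte value needs three 16 KB pages
first_page = store(blob)
print(load(first_page) == blob)
```

Reading the value back only requires the first page: each page’s pointer leads to the next one until the chain ends.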


Concluding remarks

This article described in detail the format in which MySQL stores data, how the data is laid out on disk, and how stored data is found. Many of these things are established rules, in the sense that they were settled by design decisions made long ago.

So first learn and follow the rules of the tool you are using, and then, on that basis, think about why its designers made those choices. That will help you truly master a topic.
