Solution for MySQL Database disk IO reaching 100%


I recently ran into a problem. After the equipment supplier finished deploying a power-monitoring server, the monitoring alarms turned out to be slow and untimely, so I reported the issue to the vendor. Presumably to save themselves some trouble, they simply shortened the cycle for reading the MySQL database (InnoDB engine) from 15 seconds to 6 seconds. The result: the monitoring server became even more sluggish!

I usually prefer MariaDB, a derivative of MySQL. When I run into problems, I like to dig into them myself, so I analyzed this one in combination with my earlier piece, "The Road to Promotion of the MySQL King".

After communicating with the vendor, the following problems were identified:

Problem 1: The instantaneous volume of written data is too large.

The software design is unreasonable: the program on the monitoring server reads all the data and writes it to the database in every cycle, so the instantaneous data volume is very large. The figure below shows that disk IO spikes whenever data is written to the database; once the read cycle was reduced to 6 seconds, disk IO stayed at 100%, causing the monitoring server to freeze.

Solution:

On the software side, it is suggested that the program write to the database by functional module (grouped by collector, by device, or by level). With the data spread across multiple tables, the writes can be staggered to reduce disk IO.

The interval at which terminals read the database should be lengthened appropriately, so that reads do not compete with writes for resources.

On the hardware side, replacing the disk with a solid-state drive can be considered to improve the read/write rate.

(Figure: runtime disk IO status)

After FLUSH TABLES WITH READ LOCK, disk IO returns to normal; after UNLOCK TABLES, disk IO spikes again. This shows that the rise in disk IO is caused by database writes, so it is the write operations that need optimizing.
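The lock test above can be reproduced with two standard MySQL statements. A sketch of the diagnostic session (run in a mysql client with sufficient privileges, while watching disk IO in a separate tool such as iostat or the Windows resource monitor):

```sql
-- Close all open tables and block further writes with a global read lock;
-- while this lock is held, disk IO on the server should drop back to normal.
FLUSH TABLES WITH READ LOCK;

-- Observe the disk IO counters for a while, then release the lock:
UNLOCK TABLES;

-- Once the lock is released, the queued writes resume and disk IO spikes,
-- confirming that writes (not reads) are driving the IO load.
```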

(Figure: disk IO is normal while the read lock is held; after writes are re-enabled, disk IO rises in pulses)

Problem 2: The binary log (binlog) files are too large

About 17 GB of binary log files are generated every day, and the disk is almost full.

Solution: for monitoring data, there is no need to keep the statements that modify the database. Either set expire_logs_days to purge log files after a given number of days, or simply disable binary logging by commenting out the log-bin line in my.ini, which also reduces disk IO.

; binary logging – not required for slaves, but recommended
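To keep the binlog in check without a restart, the retention can be shortened at runtime. A sketch with illustrative values (the 3-day retention is an example, not a value from the original setup):

```sql
-- Keep only the last 3 days of binary logs (MySQL 5.x variable; MySQL 8.0
-- uses binlog_expire_logs_seconds instead).
SET GLOBAL expire_logs_days = 3;

-- Immediately purge logs older than 3 days to reclaim disk space:
PURGE BINARY LOGS BEFORE NOW() - INTERVAL 3 DAY;

-- Disabling binary logging entirely cannot be done at runtime: comment out
-- the log-bin line in my.ini and restart the server.
```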


Problem 3: The database has not been tuned

The MySQL database had not been tuned at all. Although I am not a DBA, I will share my own analysis and ideas for these specific problems, drawing on previous knowledge (see the database collection) and material from the web.

The biggest factor affecting database performance is disk IO. To improve the database's IOPS, SSD or PCIe-SSD high-speed disks can be used. Memory also matters: it caches hot data and storage-engine files, avoiding excessive physical IO, so increasing physical memory can improve the database's concurrency and read/write performance.

The next section focuses on how to tune MySQL database parameters when disk IO is too high.

1. innodb_buffer_pool_size. Caches the data of InnoDB tables; it can be set to 50-80% of physical memory.

2. innodb_flush_log_at_trx_commit. This is the redo-log flush parameter, which defaults to 1: on every transaction commit, InnoDB immediately writes the redo log from the log buffer to the log file and calls the operating system's fsync to flush the IO cache. It can be set to 2: on every commit, InnoDB still writes the redo log from the buffer to the log file, but does not call fsync immediately; the disk IO cache is flushed only about once per second. When set to 0, InnoDB does not trigger a write of the log buffer to disk on commit at all; instead, once per second it writes the buffer back to disk and calls fsync. Setting it to 2 or 0 saves IO resources, but up to the last second of data can be lost on a crash or power failure, which does not matter for monitoring data.

A value of 1 is the safest but also the slowest (relative to the other two settings, though not unacceptably so). If the requirements for data consistency and integrity are not strict, it can be set to 2. If only performance matters, for example on log servers with highly concurrent writes, it can be set to 0 for the best performance.

3. sync_binlog. The binlog flush parameter, 1 by default. Together with innodb_flush_log_at_trx_commit it forms the database's "double 1" setting, which guarantees data consistency in a master-slave architecture. In this monitoring system, the binlog is disabled anyway.

4. innodb_max_dirty_pages_pct. When dirty pages exceed this percentage of the innodb_buffer_pool, flushing them to disk is triggered. A value of 25-50% is recommended.

5. innodb_io_capacity. The upper bound on IO capacity available to InnoDB background tasks, which affects the rate of dirty-page flushing and insert-buffer merging. The default is 200. On fast disks this parameter can be raised: SSDs can be configured at 5000-20000, and PCIe-SSD high-speed disks even higher (e.g. 50000).

6. innodb_data_file_path = ibdata1:1G:autoextend. The InnoDB system tablespace should not be left at the default 10 MB; it can be set to 1 GB initially so that frequent auto-extension does not affect the database.

7. general_log. It is recommended to keep the general (full) query log disabled, which is the default; otherwise the log file grows ever larger and hurts performance.
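Most of the parameters above are dynamic, so they can be tried at runtime before being persisted in my.ini. The concrete values below are illustrative sketches based on the recommendations above, not measured settings:

```sql
-- Redo-log flushing: trade crash durability for IO (0 or 2; see item 2).
SET GLOBAL innodb_flush_log_at_trx_commit = 2;

-- Binlog flushing (moot here since the binlog is disabled; see item 3):
SET GLOBAL sync_binlog = 0;

-- Flush dirty pages earlier to smooth out IO bursts (item 4):
SET GLOBAL innodb_max_dirty_pages_pct = 50;

-- Raise the background IO capacity for an SSD (item 5):
SET GLOBAL innodb_io_capacity = 5000;

-- Make sure the general query log stays off (item 7):
SET GLOBAL general_log = OFF;

-- innodb_buffer_pool_size (item 1) is dynamic only from MySQL 5.7 on, and
-- innodb_data_file_path (item 6) is never dynamic: on older servers both
-- must be set in my.ini, followed by a server restart.
```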