Technology sharing | slave_ relay_ log_ Some extensions of info table cognition

Time:2020-8-11

Author: Hu Chengqing

slave_ relay_ log_ The info table looks like this:

mysql> select * from mysql.slave_relay_log_info\G

 *************************** 1. row ***************************
  Number_of_lines: 7
   Relay_log_name: ./mysql-relay.000015
    Relay_log_pos: 621
  Master_log_name: mysql-bin.000001
   Master_log_pos: 2407
        Sql_delay: 0
Number_of_workers: 16
               Id: 1
     Channel_name:
slave_ relay_ log_ The info table stores the working location of the slave SQL thread.

When starting from the library, the slave is read_ relay_ log_ Info table, and pass the value to the relay in “show slave status”_ Log_ File、Relay_ Log_ POS, the next “start slave” starts from this location to continue playback of relay log.
slave_ relay_ log_ The info table stores the persistent state and show slave status outputs the state in memory

  • The positions of the two outputs may be different
  • Stop slave or shut down mysqld normally will persist the state in memory to the disk (slave)_ relay_ log_ Info table)
  • When mysqld is started, the disk state will be read and initialized to the memory state
  • The memory state takes effect when start slave

Slave IO thread according to master_ Log_ File、Read_ Master_ Log_ The POS location reads the binlog of the main database and writes it to the local relay log (note that the information of these two sites is saved in the slave_ master_ In the info table); slave SQL thread is based on relay_ Log_ Name、Relay_ Log_ The POS position plays back the real log.

However, the position of the same transaction in the slave database relay log and the master database binlog is not equal_ relay_ log_ Info table through master_ log_ name、Master_ log_ The POS fields record the position of the transaction in the relay log in the binlog of the main database.
We have to know that if the slave IO thread repeats or omits to read the master database binlog into the relay log, the SQL thread will replay these relay logs repeatedly or omitted.That is to say, whether the data from the slave database is correct and whether the position of IO thread is correct is also very important.
Before MySQL 5.6, replication site information could only be stored in the master.info In the file, it is updated to the file after the transaction is played back (by default, 10000 transaction updates are played back each time, which is subject to the parameter sync_ relay_ log_ Info control). Even if each transaction updates the file, there is no way to ensure consistency in the event of an unexpected outage.
Starting with MySQL 5.6, you can set — relay log info repository = table to store the working location of slave SQL thread in mysql.slave_ relay_ log_ In the info table, if this table is a transaction supporting engine such as InnoDB, it will be updated in the transaction every time a transaction is played back from the database mysql.slave_ relay_ log_ Info table to keep the position of SQL thread consistent with the data. In fact, in 5.6.0-5.6.5, slave_ relay_ log_ The MyISAM engine is used by default for the info table, and the later version is changed to InnoDB. However, considering that MySQL 5.6.10 is GA, there should not be many people stepping on this pit.

Renewal mechanism

Reference Manual:

sync_relay_log_info = 0
  If relay_log_info_repository is set to FILE, the MySQL server performs no synchronization of the relay-  log.info file to disk; instead, the server relies on the operating system to flush its contents periodically as with any other file.
  If relay_log_info_repository is set to TABLE, and the storage engine for that table is transactional, the table is updated after each transaction. (Thesync_relay_log_info setting is effectively ignored in this case.)
  If relay_log_info_repository is set to TABLE, and the storage engine for that table is not transactional, the table is never updated.

sync_relay_log_info = N > 0
  If relay_log_info_repository is set to FILE, the slave synchronizes its relay-log.info file to disk (using fdatasync()) after every N transactions.
  If relay_log_info_repository is set to TABLE, and the storage engine for that table is transactional, the table is updated after each transaction. (Thesync_relay_log_info setting is effectively ignored in this case.)
  If relay_log_info_repository is set to TABLE, and the storage engine for that table is not transactional, the table is updated after every N events.

General operation and maintenance specifications require relay_ log_ info_ Repository = table, the default value is sync_ relay_ log_ Info = 10000 will be invalid at this time, and every playback transaction will be updated in this transaction at the same time mysql.slave_ relay_ log_ Info table, to ensure the persistence, and ultimately ensure the consistency of the copied data. Of course, InnoDB is required for the persistence of inoodb_ flush_ log_ At_ trx_ Commit = 1.

There is a sentence “that is to say, whether the data from the database is correct, and whether the position of IO thread is correct is also very important.”. In short, the IO thread location is saved in slave_ master_ In the info table, set and relay_ log_ info_ Repository is similar in that its persistence guarantee usually conflicts with performance

  • Master must be set_ info_ Repository = table and sync_ master_ Info = 1, the unit of disk flushing is binlog event rather than transaction. Write amplification is very serious and performance loss is high

So usually sync_ master_ If info uses the default value of 10000, the location of IO thread cannot guarantee persistence, so it cannot be guaranteed to be correct. MySQL has another parameter, relay_ log_ Recovery provides a mechanism to ensure the accuracy of IO thread position after mysqld crash, which will be introduced later.

master_auto_position

master_ auto_ The function of position is based on the executed of the slave library_ Gtid_ Set automatically finds the corresponding binlog location on the main database, which is a function after the appearance of gtid.

Here’s a question to consider: turn on the master_ auto_ After position, the slave IO thread can be directly executed according to the executed of the slave library_ Gtid_ Does set locate the binlog location on the master database? We also need slave_ relay_ log_ info、slave_ master_ Is the locus information recorded in the info table?

In fact, slave_ relay_ log_ info、slave_ master_ The info table still works:

  • When you execute the start slave for the first time or after the reset slave, IO thread will be executed from the executed of the library_ Gtid_ The set is sent to the master database to obtain the corresponding file and position, and then updated to the slave database_ relay_ log_ info、slave_ master_ In the info table
  • When slave_ relay_ log_ info、slave_ master_ After the location information exists in the info table, whether it is to restart replication or restart mysqld, the file and position are directly obtained from these two places, and the binlog and relay log are read and played back from here

Note: executing “reset slave” will delete the relay log on the slave library and reset the slave_ relay_ log_ Info table, which resets the replication location. If master_ auto_ Position = 0. The next time replication is started, the binlog of the main database will be retrieved and played back, resulting in an error.

relay_log_recovery

When relay is enabled_ log_ Recovery: when mysqld is started, the recovery process will generate a new relay log and initialize the location of the slave SQL thread

  • slave_ relay_ log_ Relay of info table_ Log_ The name value is updated to the latest log name, relay_ Log_ The POS value is updated to a fixed value of 4 (the fixed information of the head should account for 4 offsets)
  • The memory state is the relay in the show slave status output_ Log_ File、Relay_ Log_ POS, also updated to the same as above

In addition, the position of slave IO thread will also be initialized as follows:

  • slave_ master_ Master in info table_ log_ name、Master_ log_ POS will not change
  • The memory status is the master in the show slave status output_ Log_ File、Read_ Master_ Log_ POS will be updated to slave_ relay_ log_ Master in info table_ Log_ Name、Master_ Log_ Pos

The initialization idea of IO thread is: since the previously recorded position is not sure whether it is accurate, it is not required. I will retrieve the binlog of the main database from where the SQL thread is played back. This is certainly true. The position of a transaction in the relay log corresponds to the position of the binlog in the main database

  • slave_ relay_ log_ Relay in info table_ log_ Name and master_ log_ name,Relay_ Log_ POS and master_ Log_ POS always correspond one to one, representing the location of the same transaction.

So even if sync_ master_ The persistence of info table cannot be guaranteed_ log_ Recovery will also reset the IO thread to the position where it has been played back.
relay_ log_ Another function of recovery is to prevent the relay log from being damaged, because the default relay log is not guaranteed to be persistent (and setting sync is not recommended_ relay_ Log = 1). After the operating system or mysqld crash, SQL thread may cause errors due to damage or loss of relay log.

Some conclusions

  • When gtid and master are enabled_ auto_ Position and set relay_ log_ Recovery = 1, even if relay_ log_ info_ After the repository is set to file, the operating system or mysqld crash, the next time mysqld restarts and starts replication can ensure that the data is consistent with the master database. Even if slave_ relay_ log_ The position of the record in the info table is not up to date. SQL thread may replay some transactions repeatedly, but gtids of these transactions already exist in the slave database, and these repeated transactions will be skipped.
  • When gtid and master are not enabled_ auto_ Position, relay must be set_ log_ info_ repository=table、relay_ log_ Recovery = 1, after the operating system or mysqld crash, the next time mysqld restarts and starts replication to ensure that the data is consistent with the master database.