MySQL master-slave replication summary

Time:2022-5-25

MySQL replication process

  1. On standby database B, set the IP, port, user name and password of primary database a through the change master command, and where to start requesting binlog, which contains the file name and log offset
  2. In this case, the start command of slave is executed on the two slave libraries in the figure_ Thread and SQL_ thread。 Where IO_ Thread is responsible for establishing a connection with the main database
  3. After verifying the user name and password, main database a starts to read binlog locally and send it to B according to the location transmitted by standby database B
  4. After the standby database B obtains the binlog, it writes it to the local file, which is called the relay log
  5. sql_ Thread reads the transit log, parses the commands in the log, and executes them

Three formats of binlog

statement

In the statement format, the original statement is recorded in binlog

Binlog is set in the format of statement, and there is limit in the statement, so this command may be unsafe. This may happen: when the main database executes this SQL statement, index a is used; When the standby database executes this SQL statement, the index t is used_ modified。 Therefore, MySQL believes that writing like this is risky

In the statement mode, because it is a recorded execution statement, in order for these statements to execute correctly on the slave side, it must also record some relevant information of each statement during execution, that is, context information, to ensure that all statements can get the same result when executed on the slave side as when executed on the master side. This needs to be parsed by mysqlbinlog tool, and then send the whole parsing result to MySQL for execution, instead of using only the statements parsed from the statement, which will lack context

row

When binlog_ Format when the row format is used, the binlog records the primary key ID of the real affected row

When binlog_ When the format uses the row format, the binlog records the primary key ID of the real deleted row. In this way, when the binlog is transferred to the standby database, the row with id = 4 will be deleted. There will be no problem that the active and standby delete different rows

The disadvantage of row format is that it takes up a lot of space. For example, if you delete 100000 rows of data with a delete statement, an SQL statement is recorded in binlog, occupying dozens of bytes of space. However, if the binlog in row format is used, the 100000 records must be written to the binlog. This will not only occupy more space, but also consume IO resources and affect the execution speed when writing binlog

But now more and more scenarios require that MySQL’s binlog format be set to row. Because data can be recovered

In the row mode, binlog does not record the context related information of the executed SQL statement. It only needs to record which record has been modified and what has been modified. Therefore, the log content of row will clearly record the details of each row of data modification, which is very easy to understand

mixed

Mixed format means that MySQL will judge whether this SQL statement may cause inconsistency between active and standby. If possible, use row format, otherwise use statement format

The mixed format can take advantage of the statement format while avoiding the risk of data inconsistency

If the binlog format set by your online MySQL is statement, it can basically be considered as an unreasonable setting. You should at least set the format of binlog to mixed

Dual m architecture

Nodes a and B are always primary and standby to each other. In this way, there is no need to modify the master-slave relationship during switching

How to solve the problem of circular replication

  • It is specified that the server IDs of the two databases must be different. If they are the same, the master-slave relationship cannot be set between them
  • A standby database receives the binlog and generates a new binlog with the same server ID as the original binlog during playback
  • After each database receives the log sent from its own primary database, it first judges the server ID. if it is the same as its own, it means that the log is generated by itself, and it will be discarded directly

Active standby delay

  • Execute the show slave status command on the standby database, and the returned results will show seconds_ behind_ Master, used to indicate how many seconds the current standby database is delayed

seconds_ behind_ The calculation method of master is as follows:

  1. There is a time field in the binlog of each transaction, which is used to record the time written on the main database;
  2. The standby database takes out the value of the time field of the currently executing transaction, calculates the difference between it and the current system time, and obtains seconds_ behind_ Master, in seconds

When the standby database is connected to the primary database, it will execute select UNIX_ The timestamp() function to obtain the system time of the current master database. If you find that the system time of the primary database is inconsistent with your own time, the standby database is executing seconds_ behind_ This difference will be deducted automatically during master calculation

When the network is normal, the time required to transfer logs from the primary database to the standby database is very short. Under normal network conditions, the main source of the delay between the primary database and the standby database is the time difference between receiving binlog and executing this transaction. The most direct manifestation of the active / standby delay is that the standby database consumes the relay log slower than the primary database produces binlog.

Reasons for active / standby delay

  • The performance of the machine where the standby library is located is worse than that of the machine where the main library is located
  • The pressure of standby warehouse is high. The general idea is that since the main library provides writing ability, the standby library can provide some reading ability. Or some analysis statements required by the operation background can not affect the normal business, so they can only run on the standby database. The pressure control of standby warehouse is ignored. As a result, the query on the standby database consumes a lot of CPU resources, affects the synchronization speed and causes the delay of the primary and standby databases

Solution:

  • One master and many slaves. In addition to the standby database, you can connect several more slave databases to share the reading pressure.
  • Output binlog to external systems, such as Hadoop, to enable external systems to provide statistical query capabilities.
  • Large transaction: the primary database cannot write to binlog until the transaction is completed, and then transfer it to the secondary database. Therefore, if the statement on a master database is executed for 10 minutes, the transaction is likely to delay the slave database for 10 minutes
  • Large table DDL

Active / standby switching mode

Reliability priority strategy

  1. Judge the current seconds of standby database B_ behind_ Master, if it is less than a certain value (e.g. 5 seconds), continue to the next step, otherwise continue to retry this step
  2. Change main database a to read-only status, that is, set readonly to true
  3. Judge the seconds of standby database B_ behind_ The value of master until it becomes 0
  4. Change standby database B to read-write state, that is, set readonly to false
  5. Switch the service request to standby database B

There is unavailable time in this switching process. After step 2, both primary database a and standby database B are in readonly state, that is, the system is in a non writable state and cannot be restored until step 5 is completed

In this unavailable state, step 3 is more time-consuming, which may take several seconds. This is why you need to make a judgment in step 1 to ensure that seconds_ behind_ The value of master is small enough

Availability priority policy

Before synchronizing the active and standby data, directly switch the connection to standby database B and allow standby database B to read and write, then the system will have almost no unavailable time

Availability first policy and binlog_ format=mixed

  1. After the main database a executes the insert statement and inserts a row of data (4,4), it starts the active / standby switching.
  2. Because there is a 5-second delay between active and standby databases, standby database B starts to receive the command of “insert C = 5” from the client before it has time to apply the transit log of “insert C = 4”.
  3. Standby database B inserts a row of data (4,5) and sends the binlog to primary database a
  4. Standby database B executes the “insert C = 4” transit log and inserts a row of data (5,4).
  5. When the “insert C = 5” statement executed directly in the standby database B is transferred to the main database a, a new row of data (5,5) is inserted.
  6. The final result is that there are two rows of inconsistent data on primary database a and standby database B. It can be seen that this data inconsistency is caused by the availability priority process.

Availability priority policy, but set binlog_ format=row

When binlog is recorded in row format, all field values of the newly inserted row will be recorded, so only one row will be inconsistent in the end. In addition, the application threads of active and standby synchronization on both sides will report an error duplicate key error and stop. In other words, in this case, the two lines of data (5,4) of standby database B and (5,5) of primary database a will not be executed by the other party

Summary:

  1. When using binlog in row format, the problem of data inconsistency is easier to be found.
  2. When using binlog in mixed or statement format, the data is likely to be inconsistent quietly. If you find the problem of data inconsistency after a long time, it is likely that the data inconsistency is no longer available, or more data logic inconsistencies are caused.
  3. The availability priority policy of active and standby switching will lead to data inconsistency. Therefore, in most cases, it is recommended to use the reliability first strategy. After all, for data services, the reliability of data is generally better than availability.

Parallel replication

During replication, the execution phase can be executed in parallel. During binlog flush, it is carried out in sequence; Therefore, the binlogs of the master node and the slave node are consistent

mysql5.6

Parallel by library; Binlogs of the same library are processed by the same worker, and binlogs of different libraries are processed by different workers

mysql5.7

  • Database: default value, parallel replication based on library
  • LOGICAL_ Clock: a parallel replication method based on group submission.

    • The transactions submitted by a group can be played back in parallel. Because these transactions have entered the Prepare phase of transactions, there is no conflict between transactions (otherwise it is impossible to commit)
    • Transactions submitted by a group can be played back in parallel (with binary log group commit); Last in the relay log of the slave machine_ The same committed transactions (different sequence_num) can be executed concurrently

      • sequence_ Number is the self incrementing transaction ID, last_ Committed represents the last committed transaction ID.
      • If the last of two transactions_ The same committed indicates that the two transactions are committed in the same group

    mysql5.7.22

    Even if the transactions committed in serial by the primary database do not conflict with each other, they can be played back in parallel in the standby database.

A parameter binlog transaction dependency tracking is added to control whether the new policy is enabled. There are three optional values for this parameter

  • COMMIT_ Order refers to the group submission policy described above according to 5.7
  • Writeset refers to calculating the hash value of each row involved in the update of the transaction to form a set writeset. If two transactions do not operate on the same row, that is, their writesets do not intersect, they can be parallel

    • Hash value is calculated by “database name + table name + index name + value + (unique index and value)”
    • Whether the binset can be written to the binlog Library in parallel after the binset is parsed
    • Writeset is a hash array whose size is determined by the parameter binlog_ transaction_ dependency_ history_ Size decision
  • WRITESET_ Session, based on the writeset method, ensures that transactions in the same session cannot be parallelized

Master slave switching

In traditional replication, in case of failure, master-slave switching is required. Binlog and location information need to be found. After data recovery, the master node points to the new master node. MySQL 5.6 provides a new idea of data recovery. You only need to know the IP, port, account and password of the master node. Because replication is automatic, MySQL will automatically find some synchronization through the internal mechanism gtid

GTID (global transaction identifier)

Global transaction ID, a transaction corresponds to a gtid, which ensures that each transaction submitted on the master database has a unique ID in the cluster

  • GTID = source_id:transaction_id

source_ If the ID is normal, it is the server_ UUID, generated at the first startup (function generate_server_uuid), and persisted to dataDir / auto CNF file.
transaction_ ID is the sequential number, which starts from 1 and grows from 1 on each MySQL server. It is the unique identification of the transaction.

  • The generation of gtid is controlled by the parameter gtid_ Value control of next.
    On the master, gtid_ Next is automatic by default, that is, gtid is automatically generated every time a transaction is committed. It finds an unused minimum value greater than 0 from the currently executed gtid set (i.e. gtid_executed) as the gtid of the next transaction. Write the gtid to binlog before the actual update transaction record. You can also set the gtid_ Next is a specified value. Through set gtid_ next=’current_ Gtid ‘is specified as current_ Gtid, then there are two possibilities:

    • a. If current_ The gtid already exists in the gtid set of the instance, and the next transaction will be directly ignored by the system;
    • b. If current_ If the gtid does not exist in the gtid collection of the instance, the current_ The gtid is assigned to the transaction to be executed next.
  • Note that a current_ Gtid can only be used for one transaction. After the transaction is committed, if you want to execute the next transaction, you need to execute the set command and set the gtid_ Next is set to another gtid or automatic
    Each MySQL instance maintains a gtid set that corresponds to “all transactions executed by this instance”
  • Gtid is convenient to realize the failover between master and slave. There is no need to locate binlog log files and find binlog location information step by step
  • The startup of gtid mode is also very simple. We only need to add the parameter gtid when starting a MySQL instance_ Mode = on and enforce_ gtid_ Consistency = on

On slave, the gtid of the main database (i.e. set gtid_next record) is read from binlog, and then the transaction executed adopts the gtid.

  • In the original log based replication, the slave database needs to tell the master database from which offset to perform incremental synchronization. If it is specified incorrectly, it will cause data omission and data inconsistency.
  • In gtid based replication, the slave database will inform the master database of the gtid value of the transaction that has been executed, and then the master database will return the list of gtids of all unexecuted transactions to the slave database, which can ensure that the same transaction is executed only once in the specified slave database. The way to determine the transaction to be executed from the slave database through the global transaction ID replaces the way to determine the transaction to be executed from the slave database by binlog and location

step

  • When the master updates the data, a gtid will be generated before the transaction and recorded in the binlog log together
  • The I / O thread on the slave side writes the changed binlog into the local relay log, and the read value is based on GITD_ The next variable tells us which gtid the slave will execute next
  • The SQL thread obtains the gtid from the relay log, and then compares whether the binlog on the slave side has records. If there is a record, it indicates that the transaction of the gtid has been executed, and the slave will ignore it
  • If there is no record, the slave will execute the transaction of the gtid from the relay log and record it to the binlog

Steps to switch from active to standby

A is the former primary database and a ‘is the backup database of A. they are the primary and backup databases of each other; B is from the library. At first, it synchronizes data from A. now it needs to synchronize data from a ‘

  1. Record the gtid set of instance a ‘as set at this time_ a
  2. The gtid set of instance B is marked as set_ b
  3. Execute the start slave command on instance B. instance B specifies primary database a ‘and establishes a connection based on the primary and standby protocols
  4. Instance B sets_ B send to main warehouse a ‘.
  5. Example a ‘calculates set_ A and set_ The difference set of B, that is, all existing in set_ a. But it doesn’t exist in set_ Set of gtids of B, and judge whether a ‘local contains all binlog transactions required by this difference set

    1. If not, it means that a ‘has deleted the binlog required by instance B, and an error is returned directly;
    2. If it is confirmed that all are included, a ‘finds out from its own binlog file that the first one is not in set_ B’s affairs, sent to B; Then start with this transaction, read the file later, take binlog in sequence and send it to B for execution

reference material

Geek time, MySQL actual combat 45 lectures