Technology sharing | the fastest logical backup tool in MySQL history

Time:2021-9-1

Author: Hong bin
The person in charge of aikesheng south district and technical service director, MySQL ace, is good at database architecture planning, fault diagnosis, performance optimization analysis, has rich practical experience, helps customers in various industries solve MySQL technical problems, and provides overall MySQL solutions for customers in finance, operators, Internet and other industries.
Source: reproduced from official account – fun MySQL
*It is produced by aikesheng open source community. The original content cannot be used without authorization. For reprint, please contact Xiaobian and indicate the source.


MySQL shell 8.0.21 adds a new logical backup and recovery method, which has faster backup and recovery efficiency, supports zstd real-time compression, block parallel export, load data parallel import, and backup to OCI object storage.

  • Util. Dumpinstance() is used to back up the entire instance
  • Util. Dumpschemas() is used to back up the specified schema
  • Util. Loaddump() is used to restore backups

A comparative test is made. Under zero load, the MySQL configuration parameters remain unchanged, and the same schema is backed up / restored, in which large tables and small tables are mixed. Let’s see the actual effect of these methods.

Comparison results

Technology sharing | the fastest logical backup tool in MySQL history

conclusion

  • mysql shell

    Using the default parameter zstd compression + 32m chunk parallel export, a single table can load data in parallel during recovery, and its backup and recovery speed is better than non compression + non blocking.
    It is found in the test that if compression is disabled, blocking will also be disabled.

  • mysqldump

    Backup and recovery are performed in a single thread. Uncompressed backup is more efficient. The real-time backup speed of zstd is faster than gzip, and the recovery speed is the slowest.

  • mysqlpump

    Backup supports parallelism and is fast, but single thread recovery is hard.

  • mydumper

    Gzip protocol is used by default. The backup speed is basically the same as mysqldump. It seems that the bottleneck is compression. In uncompressed, non chunked backups are faster. The recovery speed is medium, and a single table cannot be parallel.

Based on the above test results, the new backup and recovery method of MySQL shell is the fastest. Thanks to the use of zstd real-time compression algorithm, backup and recovery can be parallel, and can also be parallel for a single large table.

Here are some test procedures for reference:

MySQLShell

utli.dumpSchemas/utli.loadDump


  • backups
Acquiring global read lock
All transactions have been started
Locking instance for backup
Global read lock has been released
Writing global DDL files
Preparing data dump for table `test`.`customer1`
Writing DDL for schema `test`
Writing DDL for table `test`.`sbtest1`
Writing DDL for table `test`.`customer1`
Writing DDL for table `test`.`sbtest10`
Data dump for table `test`.`customer1` will be chunked using column `c_w_id`
Preparing data dump for table `test`.`sbtest1`
Data dump for table `test`.`sbtest1` will be chunked using column `id`
Preparing data dump for table `test`.`sbtest10`
Data dump for table `test`.`sbtest10` will be chunked using column `id`
Preparing data dump for table `test`.`sbtest2`
Data dump for table `test`.`sbtest2` will be chunked using column `id`
Preparing data dump for table `test`.`sbtest4`
Data dump for table `test`.`sbtest4` will be chunked using column `id`
Preparing data dump for table `test`.`sbtest6`
Data dump for table `test`.`sbtest6` will be chunked using column `id`
Preparing data dump for table `test`.`sbtest8`
Data dump for table `test`.`sbtest8` will be chunked using column `id`
Running data dump using 4 threads.
NOTE: Progress information uses estimated values and may not be accurate.
Writing DDL for table `test`.`sbtest2`
Writing DDL for table `test`.`sbtest4`
Writing DDL for table `test`.`sbtest6`
Writing DDL for table `test`.`sbtest8`
Data dump for table `test`.`customer1` will be written to 3 files
Data dump for table `test`.`sbtest10` will be written to 1 file
Data dump for table `test`.`sbtest2` will be written to 1 file
Data dump for table `test`.`sbtest4` will be written to 1 file
Data dump for table `test`.`sbtest6` will be written to 1 file
Data dump for table `test`.`sbtest8` will be written to 1 file
Data dump for table `test`.`sbtest1` will be written to 160 files
1 thds dumping - 98% (10.46M rows / ~10.62M rows), 589.52K rows/s, 115.55 MB/s uncompressed, 51.66 MB/s compressed
Duration: 00:00:18s
Schemas dumped: 1
Tables dumped: 7
Uncompressed data size: 2.06 GB
Compressed data size: 922.35 MB
Compression ratio: 2.2
Rows written: 10464999
Bytes written: 922.35 MB
Average uncompressed throughput: 109.46 MB/s
Average compressed throughput: 48.97 MB/s
  • recovery
util.loadDump("test1")
Loading DDL and Data from 'instance' using 4 threads.
Target is MySQL 8.0.21. Dump was produced from MySQL 8.0.21
Checking for pre-existing objects...
Executing common preamble SQL
Executing DDL script for schema `test`
Executing DDL script for `test`.`sbtest1`
Executing DDL script for `test`.`sbtest4`
Executing DDL script for `test`.`sbtest2`
Executing DDL script for `test`.`sbtest8`
Executing DDL script for `test`.`sbtest10`
Executing DDL script for `test`.`sbtest6`
Executing DDL script for `test`.`customer1`
...
[Worker000] [email protected]@158.tsv.zst: Records: 65736  Deleted: 0  Skipped: 0  Warnings: 0
Executing common postamble SQL
​
168 chunks (10.46M rows, 2.06 GB) for 7 tables in 1 schemas were loaded in 1 min 26 sec (avg throughput 23.97 MB/s)

mysqldump


  • backups
/usr/bin/time mysqldump -umsandbox -pmsandbox -h127.0.0.1 -P8021 test | gzip > db.sql.gz
mysqldump: [Warning] Using a password on the command line interface can be insecure.
      169.40 real        24.65 user         1.34 sys
  • recovery
 /usr/bin/time gzip -d < db.sql.gz | ./use test
      257.11 real         9.74 user         0.55 sys

mysqlpump


  • backups
/usr/bin/time mysqlpump --default-parallelism=4 -umsandbox -pmsandbox -h127.0.0.1  -P8021 test | gzip > db2.sql.gz
Dump progress: 6/7 tables, 10421749/10406264 rows
Dump completed in 185352
      185.50 real        31.18 user         6.34 sys
  • recovery
/usr/bin/time gzip -d < db2.sql.gz | ./use test
      121.17 real         9.66 user         0.76 sys

mydumper/myloader


  • backups
/usr/bin/time mydumper -u msandbox -p msandbox -h 127.0.0.1 -P 8021 -B test -t 4 -v 3 -c -o dumper
** Message: 21:44:55.958: Connected to a MySQL server
** Message: 21:44:56.319: Started dump at: 2020-07-24 21:44:56
​
** Message: 21:44:56.341: Written master status
** Message: 21:44:56.420: Thread 1 connected using MySQL connection ID 22
** Message: 21:44:56.537: Thread 2 connected using MySQL connection ID 23
** Message: 21:44:56.651: Thread 3 connected using MySQL connection ID 24
** Message: 21:44:56.769: Thread 4 connected using MySQL connection ID 25
** Message: 21:44:56.878: Non-InnoDB dump complete, unlocking tables
** Message: 21:44:56.878: Thread 4 dumping data for `test`.`sbtest10`
** Message: 21:44:56.878: Thread 1 dumping data for `test`.`customer1`
** Message: 21:44:56.878: Thread 3 dumping data for `test`.`sbtest1`
** Message: 21:44:56.878: Thread 2 dumping data for `test`.`sbtest2`
** Message: 21:44:57.139: Thread 2 dumping data for `test`.`sbtest4`
** Message: 21:44:57.143: Thread 4 dumping data for `test`.`sbtest6`
** Message: 21:44:57.396: Thread 2 dumping data for `test`.`sbtest8`
** Message: 21:44:57.398: Thread 4 dumping schema for `test`.`customer1`
** Message: 21:44:57.409: Thread 4 dumping schema for `test`.`sbtest1`
** Message: 21:44:57.419: Thread 4 dumping schema for `test`.`sbtest10`
** Message: 21:44:57.430: Thread 4 dumping schema for `test`.`sbtest2`
** Message: 21:44:57.441: Thread 4 dumping schema for `test`.`sbtest4`
** Message: 21:44:57.453: Thread 4 dumping schema for `test`.`sbtest6`
** Message: 21:44:57.464: Thread 4 dumping schema for `test`.`sbtest8`
** Message: 21:44:57.475: Thread 4 shutting down
** Message: 21:44:57.636: Thread 2 shutting down
** Message: 21:45:03.706: Thread 1 shutting down
** Message: 21:47:40.297: Thread 3 shutting down
** Message: 21:47:40.307: Finished dump at: 2020-07-24 21:47:40
      164.54 real       167.58 user         2.28 sys

Note: the number of parallel backup threads used is the same as dumpschema.

  • recovery
/usr/bin/time myloader -u msandbox -p msandbox -h 127.0.0.1 -P 8021 -B test -t 4 -v 3  -d dumper/
** Message: 23:54:39.961: 4 threads created
** Message: 23:54:39.973: Creating table `test`.`sbtest4`
** Message: 23:54:40.055: Creating table `test`.`sbtest10`
** Message: 23:54:40.127: Creating table `test`.`customer1`
** Message: 23:54:40.201: Creating table `test`.`sbtest8`
** Message: 23:54:40.273: Creating table `test`.`sbtest2`
** Message: 23:54:40.346: Creating table `test`.`sbtest6`
** Message: 23:54:40.423: Creating table `test`.`sbtest1`
** Message: 23:54:40.488: Thread 2 restoring `test`.`sbtest2` part 0
** Message: 23:54:40.488: Thread 3 restoring `test`.`sbtest6` part 0
** Message: 23:54:40.488: Thread 4 restoring `test`.`sbtest8` part 0
** Message: 23:54:40.488: Thread 1 restoring `test`.`sbtest4` part 0
** Message: 23:54:40.833: Thread 2 restoring `test`.`sbtest1` part 0
** Message: 23:54:40.833: Thread 4 restoring `test`.`sbtest10` part 0
** Message: 23:54:40.834: Thread 3 restoring `test`.`customer1` part 0
** Message: 23:54:40.834: Thread 1 shutting down
** Message: 23:54:41.070: Thread 4 shutting down
** Message: 23:54:50.407: Thread 3 shutting down
** Message: 23:57:46.425: Thread 2 shutting down