Method for cleaning data in MySQL online database


01 scenario analysis

This afternoon, the development colleagues asked to delete some data records online. After a brief look at the data distribution, it is about to delete more than 27 million records in the data table, and the total records in the data table are more than 28 million, that is, the records to be deleted account for most of the total records, more than 27 million data records, If deleted, it will take quite a long time to use, which will certainly affect the online business. Here, the actual application case is simply reconstructed into the following methods:

mysql> select date,count(*) from test.tbl_a  group by date;
|   date | count(*) |
| |  63103 |
| 20190118 | |
| |  125916 |
| 20190120 |   |
| |  129198 |
| 20190122 |  |
| | 5191247 |
 rows in set (13.21 sec)

The above is the structure of the reconstructed table. We can see the table TBL in the test database_ B after grouping by date, the amount of data in each group is not small, and our demand is to delete the records with dates 20190118 and 20190123. We can see that there are more than 20 million records in total, accounting for most of the data in the table. If you delete them directly, the online business will be blocked.

The following are the operation methods. It should be noted that the premise of applying the following operation methods is:

The data to be deleted accounts for the vast majority of the data table.

02 operation method

Through analysis, we know that the remaining table data is a small part of the whole table. This operation is divided into four steps:

1. We first save the remaining data into another database test1:

create table test1.tbl_b_new as 
select * from test.tbl_b 
where date in(,,,,);

The operation time is only about 3S;

2. We create a table TBL with the same name in test1 database_ b. Its structure is similar to the TBL in the test database_ B. consistent data structure:

create table test1.tbl_b like test.tbl_b;

3. Next, we use the rename operation in the database to rename the table TBL in the table test_ TABLE tbl in B and test1_ B, which is equivalent to clearing all data of the table in the test database.

RENAME TABLE test.tbl_b  TO test1.tbl_b_bak,
         test1.tbl_b  TO test.tbl_b,
         test1.tbl_b_bak TO test1.tbl_b;

4. Again, fill the remaining data saved in the first step into the new table, as follows:

insert into test.tbl_b select * from  test1.tbl_b_new;

03 result analysis

It seems that we have complicated the problem. We have taken several steps to delete directly, but this is not the case. This set of operations can save us several minutes and have less impact on online business. The reasons are as follows:

  1. We use the create table as method to create the remaining data tables, which enables us to operate only a small part of the data in the data table. It avoids US scanning too much data.
  2. For the rename operation of large tables in mysql, the rename command will directly modify the underlying. Frm file, so its speed is quite fast.

The second feature provides us with an idea. In a very anxious business, when we want to use a table, we often don’t give us enough time to back up the table. If we want to delete the data in a large table and need to make relevant backups, we can quickly process it through the rename operation, and then find a way to back up the table after the rename.

In the above example, create table as and create table like are used to clone tables. There are several points to note:

The create table like method will completely clone the table structure, but will not insert data. You need to use the insert into or load data method to load data separately
create table as   This method will partially clone the table structure and retain the data completely

If you are interested, you can do experiments to verify.

The above is the details of how to clean up the data of MySQL online database. For more information about cleaning up the data of MySQL online database, please pay attention to other relevant articles of developeppaer!

Recommended Today

Go reflection class instance

Internal mechanism of variables Type information: static meta information, predefined Value information: it is dynamically changed during program operation Use of reflection Get type information: reflect.typeof, which is static Get value information: reflect.valueof, which is dynamic Get interface value information by reflection package main import ( “fmt” “reflect” ) //Get interface value information by reflection […]