Testing join queries with go-randgen

Date: 2021-05-04

In database systems, join is one of the most commonly used query operations. Because join algorithm implementations are complex, they are prone to bugs. We analyzed past join-related problems in TiDB and classified the problem-prone scenarios into the following categories:

  1. The same join query with join keys of different data types
  2. Joins on partitioned tables
  3. The same join query executed with different join algorithms
  4. Special query conditions

Starting from these scenarios, we have tested TiDB with the go-randgen framework over the past few months. Below we describe the go-randgen testing framework and our related testing work, divided into four parts:

  • A brief introduction to the go-randgen testing framework
  • An example showing how to use go-randgen
  • The practice and results of go-randgen in TiDB testing
  • Future work that can build on go-randgen, and an introduction to other related work

Introduction to go-randgen

go-randgen is a fully configurable testing framework. It can create random data sets, run randomly generated queries against them, and then verify the correctness of the query results through A/B testing.

A go-randgen example

Taking the join test as an example, and referring to the syntax documentation of go-randgen (https://github.com/pingcap/go-randgen), there are three steps:

  1. Define a ZZ file that specifies the table generation rules, such as data types, table types, and row counts
  2. Define a YY file that specifies the generation rules for random SQL
  3. Run an A/B test using the generated table schemas and SQL

Each step is described in detail through examples below:

1. Define join.zz.lua. The ZZ file in this example generates six tables, and each table has 17 fields corresponding to the types defined in fields.types. The six tables are as follows:

  • table_400_undef_undef_1 (400 rows of data)
  • table_400_undef_4_1 (400 rows of data with 4 partitions)
  • table_300_undef_undef_1 (300 rows of data)
  • table_300_undef_4_1 (300 rows of data with 4 partitions)
  • table_290_undef_undef_1 (290 rows of data)
  • table_290_undef_4_1 (290 rows of data with 4 partitions)
tables = {
    -- one table is generated per combination of rows and partitions
    rows = {400, 300, 290},
    -- 'undef' means no partitioning; 4 means four partitions
    partitions = {'undef', 4},
}

fields = {
    -- every table gets one column for each type listed here
    types = {'int', 'tinyint', 'smallint', 'bigint', 'decimal(40, 20)', 'float', 'double', 'char(20)', 'varchar(20)', 'enum', 'set', 'datetime', 'bool', 'bit(64)', 'timestamp', 'year', 'date'},
    keys = {'key'},
}

data = {
    -- value generators used to populate columns of each category
    numbers = {'null', 'tinyint', 'smallint', 'decimal'},
    smallint = {'null', 'smallint'},
    mediumint = {'null', 'mediumint'},
    tinyint = {'null', 'tinyint'},
    bool = {1, 0, 'null'},
    year = {'null', 'year'},
    datetime = {'null', 'datetime'},
    timestamp = {'null', 'datetime'},
    date = {'null', 'date'},
    strings = {'null', 'letter', 'english'},
}
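The six table names above follow directly from these rules: one table per combination of rows and partitions. A minimal sketch of that cross product (the naming pattern table_&lt;rows&gt;_&lt;charset&gt;_&lt;partitions&gt;_&lt;suffix&gt; is inferred from the names listed here, not taken from go-randgen's source):

```python
# Derive the six table names from the zz rules above.
# Naming pattern is an inference from this article's table list,
# not go-randgen's actual implementation.
from itertools import product

rows = [400, 300, 290]
partitions = ["undef", 4]

tables = [
    f"table_{r}_undef_{p}_1"  # table_<rows>_<charset>_<partitions>_<suffix>
    for r, p in product(rows, partitions)
]
print(tables)
```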

2. Define the join.yy file. The YY file in this example uses hints to generate query statements for the INL_MERGE_JOIN and INL_HASH_JOIN algorithms. In the generated SQL, besides the specified fields, the tables and fields in the query conditions are combined at random.

Generated SQL example:

SELECT /*+ inl_hash_join(t1) */ t1.pk, t2.pk from table_290_undef_undef_1 t1, table_400_undef_undef_1 t2 where t1.`col_enum_key_signed` = t2.`col_int_key_signed` and t1.`col_smallint_key_signed` < -5418830167423061551 order by t1.pk, t2.pk;

The contents of join.yy:

query:
    select

select:
    SELECT hint_begin inl_merge_join(t1, t2) */ col_list FROM _table  t1, _table t2 where condition and condition1 order by t1.pk, t2.pk;
    SELECT hint_begin inl_hash_join(t1) */ col_list from _table  t1, _table t2 where condition and condition1 order by t1.pk, t2.pk;
    
col_list:
    t1.pk, t2.pk

condition:
    t1. _field = t2. _field

condition1:
    t1. _field_int < _int

hint_begin:
    /*+
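Conceptually, go-randgen expands this grammar starting from the query rule, replacing each nonterminal with a randomly chosen production until only terminals remain; placeholders such as _table, _field, and _int are then filled in with random tables, columns, and values. A simplified sketch of that expansion, not go-randgen's actual implementation:

```python
# Toy expansion of the join.yy grammar above. Placeholders like
# _table and _field are left as-is here; the real engine substitutes
# them with random tables, columns, and literals.
import random

grammar = {
    "query": ["select"],
    "select": [
        "SELECT hint_begin inl_merge_join(t1, t2) */ col_list FROM _table t1, _table t2 "
        "where condition and condition1 order by t1.pk, t2.pk",
        "SELECT hint_begin inl_hash_join(t1) */ col_list FROM _table t1, _table t2 "
        "where condition and condition1 order by t1.pk, t2.pk",
    ],
    "col_list": ["t1.pk, t2.pk"],
    "condition": ["t1._field = t2._field"],
    "condition1": ["t1._field_int < _int"],
    "hint_begin": ["/*+"],
}

def expand(symbol: str) -> str:
    """Recursively replace nonterminals with random productions."""
    if symbol not in grammar:
        return symbol  # terminal token, emitted verbatim
    chosen = random.choice(grammar[symbol])
    return " ".join(expand(tok) for tok in chosen.split())

print(expand("query"))
```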

3. Run the A/B test with the join.zz.lua and join.yy files. In this example, TiDB query results are compared with MySQL query results. SQL statements whose results differ are recorded in the dump subdirectory of the current directory.

./go-randgen exec -Z join.zz.lua -Y join.yy --dsn1 "root:@tcp(127.0.0.1:3306)/test" --dsn2 "root:@tcp(127.0.0.1:4000)/test" -Q 2000
2020/12/25 16:37:18 Open DB ok, starting generate data in two db by ddls
2020/12/25 16:37:18 load zz from join.zz.lua
2020/12/25 16:37:20 generating data ok
2020/12/25 16:37:20 starting execute sqls generated by yy
2020/12/25 16:37:20 load yy from join.yy
2020/12/25 16:37:32 dump ok
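The A/B check behind this command conceptually runs each generated statement against both DSNs and compares the result sets, dumping any mismatch. A toy sketch of the comparison step with stubbed result sets (go-randgen's real comparison also handles ordered results, errors, and type coercion):

```python
# Stubbed A/B result-set comparison: the two "databases" are plain
# lists of row tuples instead of live MySQL/TiDB connections.
def rows_match(rows_a, rows_b):
    """Compare two result sets, ignoring row order for illustration.

    Queries with ORDER BY would instead be compared position by position.
    """
    return sorted(map(tuple, rows_a)) == sorted(map(tuple, rows_b))

mysql_rows = [(1, "a"), (2, "b")]
tidb_rows = [(2, "b"), (1, "a")]
print(rows_match(mysql_rows, tidb_rows))
```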

The practice of go-randgen in TiDB testing

By testing TiDB's join algorithms with go-randgen, we have found 10 correctness-related problems, for example:

  • Covering comparisons between columns of different types, e.g. select * from _table where _field > _field. This found a comparison error between time and year columns, recorded in tidb/issues/20121.
  • Testing distinct statements, e.g. select count(distinct(t1._field)), count(distinct t1._field, t1._field) from table_400_utf8_undef t1, table_290_utf8_undef t2 where t1._field = t2._field and t1._field = t2._field and t1._field_int != _int. This found a calculation error in distinct, recorded in tidb/issues/20237.
  • Beyond randomizing types, expanding the coverage of single statements and randomly combining statements so that the SQL statements share context, e.g. alter table _table add index {print(string.format("t%d", math.random(10,2000000)))} (_field); SELECT t1.pk, t2.pk from t t1 left join t t2 on t1._field = t2._field where t1._field != _int order by t1.pk, t2.pk. This found that the query returns wrong results after the index is added, recorded in tidb/issues/20698.

The discovery of these issues reminds us to stay humble about TiDB's quality, and also confirms that it is feasible to analyze previously found problems, summarize their scenarios, and then expand the scope of test points. Subsequent join tests will continue to cover more data types and try more statement and scenario combinations, such as inserting and deleting data within a transaction and combining that randomly with join queries.

Future work

We can improve join test coverage by continually refining the ZZ and YY files. However, the SQL generated by go-randgen has a very fixed structure; if we do not know the test points in advance, we cannot construct the query statements needed for effective coverage. Is there a way to randomly generate join queries for testing? At the moment, we are implementing random generation of join queries in Horoscope (an optimizer testing tool). In addition, following the ideas in Manuel Rigger's paper "Testing Database Engines via Pivoted Query Synthesis", Horoscope randomly selects a row from some tables as a pivot row and constructs a query whose returned results must contain the selected rows.
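The pivot-row idea can be sketched as follows: pick a random row, build a WHERE clause that the row satisfies by construction, and check that the row appears in the query result. Table and column names below are illustrative, not Horoscope's actual API:

```python
# Sketch of Pivoted Query Synthesis: the chosen pivot row must appear
# in the result of a query built from predicates it satisfies.
import random

table = [
    {"pk": 1, "c1": 10, "c2": "x"},
    {"pk": 2, "c1": 20, "c2": "y"},
    {"pk": 3, "c1": 20, "c2": "z"},
]

pivot = random.choice(table)

# Predicates are constructed from the pivot's own values, so the
# pivot row satisfies them by definition.
predicates = [f"c1 = {pivot['c1']}", f"c2 = '{pivot['c2']}'"]
query = f"SELECT pk FROM t WHERE {' AND '.join(predicates)}"

# Oracle check: evaluate the same predicates in Python; if the real
# database omitted the pivot's pk from the result, that is a bug.
matching = [r["pk"] for r in table
            if r["c1"] == pivot["c1"] and r["c2"] == pivot["c2"]]
print(query)
```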

Because of the complexity of join queries, join testing will be a long-term but valuable task. Optimizer testing, region testing, TiDB cluster chaos testing, and transaction testing are also very important and valuable work. If you are interested, you are welcome to test TiDB with go-randgen or other tools. For any problems found, please create an issue on GitHub to give us feedback. If you have better testing methods, testing ideas, or testing tools, welcome to communicate with us on TUG.