In recent months, especially since the release of TiDB RC1, more and more users have started testing it, and quite a few have deployed it in production. We have received a lot of feedback from tests and real-world usage — many thanks to our friends and early users for their support. Having seen so many scenarios, I have summarized some TiDB usage practices (in fact, most of Spanner’s best practices apply to TiDB, as do MySQL’s). Riding the momentum of Google Cloud Spanner’s release, I also went through some of its official best-practice documents and wrote this article on the correct way to use TiDB, and distributed relational databases in general. Of course, times change and TiDB keeps evolving, so this article mainly reflects some recent observations.
First, let’s talk about some good schema design practices. Since TiDB is a distributed database, table structure design may require different considerations from traditional single-machine databases. Developers can produce better designs by starting from the premise that “the data of this table will be distributed across different machines.”
Like Spanner, the rows of a table in TiDB are sorted by the byte order of the primary key (for integer primary keys, a specific encoding is used so that byte order matches numeric order). Even if no primary key is explicitly declared in the CREATE TABLE statement, TiDB allocates an implicit one.
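To see how byte order can be made to match numeric order, here is a minimal Python sketch of a memcomparable integer encoding (flip the sign bit, then emit big-endian bytes). This is only an illustration of the idea, not TiDB’s actual codec:

```python
import struct


def encode_int_key(n: int) -> bytes:
    """Encode a signed 64-bit integer so that comparing the encoded
    bytes lexicographically gives the same order as comparing the
    integers numerically. Illustrative sketch, not TiDB's real codec."""
    # Shifting by 2**63 maps the signed range onto [0, 2**64),
    # which is what makes big-endian bytes sort correctly.
    return struct.pack(">Q", n + (1 << 63))


# Byte order of encoded keys agrees with numeric order:
values = [-5, -1, 0, 1, 42]
encoded = [encode_int_key(v) for v in values]
assert encoded == sorted(encoded)
```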
There are four things to remember:
Scanning by byte order is relatively efficient;
Consecutive rows are, with high probability, stored near each other on the same machine, so batch reads and writes are efficient;
Indexes are ordered (the primary key is also an index). Each indexed column of a row occupies one KV pair. For example, if a table has three indexes besides the primary key, then inserting a row into this table writes four KV pairs: the data row plus three index rows.
A row of data is stored in a single KV pair and is never split across pairs. This is very different from Bigtable’s column-oriented storage.
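The fan-out described in the third point above can be sketched as follows; the key layouts here are simplified stand-ins for illustration, not TiKV’s real key format:

```python
def kv_writes_for_insert(row_id, row, indexed_cols):
    """Sketch of how one row insert fans out into KV writes:
    one KV pair for the row itself, plus one per secondary index.
    Key tuples are simplified stand-ins, not TiKV's real format."""
    writes = {("row", row_id): row}  # the whole row lives in one KV pair
    for col in indexed_cols:
        # each index entry maps (indexed value, row id) -> empty value
        writes[("idx", col, row[col], row_id)] = b""
    return writes


row = {"name": "alice", "email": "a@x.com", "age": 30}
w = kv_writes_for_insert(1, row, ["name", "email", "age"])
assert len(w) == 4  # data row + three index rows
```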
The data of a table is partitioned by TiDB’s underlying storage into many 64 MB Regions (corresponding to Spanner’s concept of splits). Each Region stores a range of consecutive rows, and the Region is the unit in which TiDB schedules data. As a Region’s data grows and time passes, it will split, merge, or move to different physical machines in the cluster, so that the whole cluster can scale horizontally.
Write in batches as much as possible, but the total size of one write should not exceed the Region split threshold (64 MB). In addition, TiDB also limits the size of a single transaction.
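A minimal sketch of client-side batching under a size budget. The 64 MB figure is the split threshold mentioned above; sizes are estimated with `len()` on pre-encoded rows, and a real client would also respect the transaction size limit:

```python
def batch_rows(rows, max_bytes=64 << 20):
    """Group encoded rows into batches whose total size stays under
    max_bytes. Sketch only: a real client would use its driver's own
    size accounting rather than len() on raw bytes."""
    batch, size = [], 0
    for r in rows:
        r_size = len(r)
        if batch and size + r_size > max_bytes:
            yield batch          # flush before exceeding the budget
            batch, size = [], 0
        batch.append(r)
        size += r_size
    if batch:
        yield batch              # flush the final partial batch


rows = [b"x" * 30, b"y" * 30, b"z" * 30]
batches = list(batch_rows(rows, max_bytes=64))
# each batch stays within the 64-byte budget used for this demo
assert all(sum(len(r) for r in b) <= 64 for b in batches)
```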
Super-wide tables are a poor fit, especially rows with many columns that are not particularly sparse. A rule of thumb is that the total size of a single row should not exceed 64 KB — the smaller the better. Large fields are best split out into separate tables.
For hot, frequently accessed data, try to touch only one Region per request. This is easy to understand: a fuzzy query or an unindexed table scan may hit multiple physical nodes, which adds network overhead; and the more Regions you touch, the higher the probability of hitting a stale Region and retrying (TiDB moves Regions frequently, and a client’s routing information may not be updated in time), which can hurt P99 latency. Conversely, the write latency of small transactions (within the scope of a single Region) is lower, and TiDB specifically optimizes cross-row transactions within the same Region. In addition, TiDB handles exact point queries by primary key (a single-row result set) most efficiently.
In addition to primary key queries, TiDB lets users create secondary indexes to speed up access. As mentioned above, at the TiKV level, both the row data and the index data of a TiDB table are stored as KV pairs, so many principles that apply to table data also apply to indexes. Unlike Spanner, TiDB only supports global indexes — the default, non-interleaved kind of index in Spanner. The advantage of a global index is that it places no restrictions on users and can scale to any size, but it also means the index entries are not always in the same Region as the actual data.
For needle-in-a-haystack queries (precisely locating one or a few rows in massive data), you must go through an index.
Of course, do not create indexes blindly: too many indexes will hurt write performance.
Antipatterns (better not!)
In fact, Spanner’s white paper already explains this very clearly; let me repeat it here.
First, over-reliance on monotonically increasing primary keys (auto-increment IDs)
In traditional relational databases, developers often rely on auto-increment IDs as primary keys. However, in most scenarios what you actually want is just a non-duplicate ID — whether it increases monotonically does not matter. For a distributed database this pattern is not recommended: as insert pressure grows, a hotspot forms on the Region at the tail of the table, and there is no way to spread that hotspot across multiple machines. In the GA version, TiDB optimizes non-auto-increment primary keys to spread the insert workload as widely as possible.
If the business does not need a monotonically increasing ID as the primary key, do not use one. Use genuinely meaningful columns as the primary key (e.g. email address, user name, etc.)
Use a random UUID, or bit-reverse a monotonically increasing ID
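Both suggestions can be sketched in a few lines of Python; `bit_reverse_64` is a hypothetical helper name used here for illustration:

```python
import uuid


def bit_reverse_64(n: int) -> int:
    """Reverse the bits of a 64-bit ID so that consecutive counter
    values land far apart in key space, spreading writes across
    Regions instead of piling up at the tail."""
    return int(format(n, "064b")[::-1], 2)


# Consecutive counter values map to widely separated keys:
assert bit_reverse_64(1) == 1 << 63
assert bit_reverse_64(2) == 1 << 62

# Or simply use a random UUID as the primary key:
pk = uuid.uuid4().hex
```

Note that bit reversal is its own inverse, so the original counter value can always be recovered if needed.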
Second, monotonically increasing index values (such as timestamps)
Many log-style workloads naturally create an index on the timestamp, since they often query along the time dimension. However, this kind of index has essentially the same problem as a monotonically increasing primary key: internally, a TiDB index is also a range of contiguous KV pairs, and continually inserting monotonically increasing timestamps creates a hotspot on the Region at the tail of the index, hurting write throughput.
This is unavoidable: many users do store logs in TiDB — after all, TiDB’s elastic scalability and MySQL-compatible queries are a good fit for this kind of workload. If you find that the write pressure is too much but you still want to use TiDB for this type of data, you can shard at the application level, as Spanner suggests. Taking log storage as an example: instead of creating a single log table, create multiple tables, such as log_1, log_2 … log_N. When inserting, the service layer hashes on the timestamp and writes to one of the N sharded tables.
Accordingly, queries must be scattered to every shard, with the results merged in the business layer.
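The sharding and scatter-gather scheme above can be sketched as follows; `N_SHARDS`, `shard_for`, and `query_all_shards` are assumed names for illustration, not part of any TiDB API:

```python
import hashlib

N_SHARDS = 8  # number of log_1 .. log_N tables (assumed for this demo)


def shard_for(ts: float) -> str:
    """Pick one of the log_1..log_N tables by hashing the timestamp,
    so concurrent inserts spread across N independent tables."""
    h = int(hashlib.md5(str(ts).encode()).hexdigest(), 16)
    return f"log_{h % N_SHARDS + 1}"


def query_all_shards(run_query):
    """Scatter a query to every shard and gather the results in the
    application layer; run_query is a callable taking a table name."""
    results = []
    for i in range(1, N_SHARDS + 1):
        results.extend(run_query(f"log_{i}"))
    return results
```

In practice `run_query` would execute the same SQL against each table and the gathered results would still need a final sort or aggregation in the business layer.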
TiDB’s query optimization is divided into rule-based optimization and cost-based optimization. In essence, TiDB’s SQL engine is more like a distributed computing framework: for large tables, TiDB spreads the data across multiple storage nodes and pushes query logic down to them, which greatly improves query efficiency.
TiDB’s rule-based optimizations include:
Predicate pushdown. Pushes where / on / having conditions as close to the table as possible. For example:
select * from t join s on t.id = s.id where t.c1 < 10
can be automatically rewritten by TiDB as:
select * from (select * from t where t.c1 < 10) as t join s on t.id = s.id
Correlated subquery elimination
A correlated subquery may be rewritten as a join by TiDB. For example:
select * from t where t.id in (select id from s where s.c1 < 10 and s.name = t.name)
It can be rewritten as:
select * from t semi join s on t.id = s.id and s.name = t.name and s.c1 < 10
Aggregate pushdown
An aggregate function can be pushed down through a join, making equi-joins more efficient. For example:
select count(s.id) from t join s on t.id = s.t_id
It can be rewritten as:
select sum(agg0) from t join (select count(id) as agg0, t_id from s group by t_id) as s on t.id = s.t_id
Rule-based optimizations can sometimes combine to produce unexpectedly good results. For example:
select s.c2 from s where 0 = (select count(id) from t where t.s_id = s.id)
In TiDB, this statement is first optimized by correlated subquery elimination into:
select s.c2 from s left outer join t on t.s_id = s.id group by s.id where 0 = count(t.id)
Then aggregate pushdown optimizes it into:
select s.c2 from s left outer join (select count(t.id) as agg0 from t group by t.s_id) t on t.s_id = s.id group by s.id where 0 = sum(agg0)
Finally, after checking that the aggregation can be eliminated, the statement is optimized into:
select s.c2 from s left outer join (select count(t.id) as agg0 from t group by t.s_id) t on t.s_id = s.id where 0 = agg0
Cost-based optimizations include:
When reading a table, if multiple indexes are available, statistics are used to select the best one. For example:
select * from t where age = 30 and name in ("Xiaoming", "Xiaoqiang")
For join operations, we can distinguish large tables from small tables; TiDB has a special optimization for joining a large table with a small table. For example:
select * from t join s on s.id = t.id
The optimizer estimates table sizes to select the join algorithm: it chooses to load the smaller table into memory.
When multiple plans are possible, a dynamic programming algorithm selects the best one. For example:
(select * from t where c1 < 10) union all (select * from s where c2 < 10) order by c3 limit 10
For t and s, the optimizer can decide whether to use the index on c3 or on c2 according to the data distribution of each index.
In short, the correct way to use TiDB — or, the typical TiDB application scenarios — are:
Data volume is large and complex MySQL queries are very slow;
Data volume is large and growing fast, approaching the limit of single-machine processing, and you do not want manual sharding or database middleware to intrude on the business, with the architecture in turn constraining the business;
Data volume is large and you need highly concurrent real-time writes, real-time queries, and real-time statistical analysis;
You need distributed transactions, 100% strong consistency across multiple data centers, and high availability with automatic failover.
If you only remember one sentence from this whole article, let it be this: TiDB is usually not for scenarios with fewer than 50 million rows; it is designed for large-scale data. If you want to remember one more sentence: TiDB is for the scenarios that a single MySQL instance cannot satisfy.