Yixin’s 105 database rules

Time:2021-1-10

Abstract: This paper introduces the 105 database military regulations of Yixin, which helps the R & D team to evaluate the quality of database development and achieve the goal of finding and solving problems as soon as possible.

As a financial technology enterprise, a large number of businesses of Yixin rely on databases. How to improve the overall database application level of the company is a big challenge to DBA, which is also of great practical significance. In Yixin’s years of work, the author and the team summarized and sorted out the rules for the use of traditional relational database, and with the help of the self-developed database audit platform, helped the R & D team to evaluate the quality of database development, so as to find and solve problems as soon as possible. The figure below is a simple schematic diagram of the system.

Yixin's 105 database rules

As shown in the figure above, the rule part can be subdivided into the following categories (parts). The brief description is as follows, which will be explained in detail later.

Yixin's 105 database rules

1、 Oracle rules (objects)

1.1 table and partition

[rule 1]

Rule Description: tables that exceed the specified size and have no partitions.

Rule threshold: 2GB (physical size exceeds the specified threshold).

Rule Description: if the scale of the table is too large, it will affect the access efficiency of the table and increase the maintenance cost. A common solution is to use partitioned tables to convert large tables into partitioned tables.

[rule 2]

Rule Description: the number of single table or single partition records is too large.

Rule threshold: 1000000 (the number of records in a single table or partition exceeds the specified threshold).

Rule Description: control the data size of a single table or partition to improve the access efficiency of a single object. If there are too many records, we should consider the strategies of database, table and partition.

[rule 3]

Rule Description: too many large tables.

Rule threshold: Custom (too many tables over 2G).

Rule Description: large tables account for more than 20% (OLTP) or 95% (OLAP) of all tables.

[Rule 4]

Rule Description: there are too many partitions in a single table.

Rule threshold: 500 (the number of single table partitions exceeds the specified threshold).

Rule Description: if there are too many partitions in the partition table, the overall maintenance cost will be too high, and the partition granularity can be adjusted.

[Rule 5]

Rule Description: too many partition tables.

Rule threshold: 2000 (the number of partition tables exceeds the specified threshold).

Rule Description: there are too many partition tables. The common reason is that there are too many large tables. According to the demand, vertical splitting is considered to reduce the scale of single library.

[rule 6]

Rule Description: too many composite partitions.

Rule threshold: 5 (the number of composite partitions exceeds the specified threshold).

Rule Description: similar to the above reason for too many partition tables (including the number of composite partition tables).

[Rule 7]

Rule Description: there is a table with parallel property enabled.

Rule threshold: 1 (table degree property is not 1).

Rule Description: in general, it is not recommended to set parallel properties on tables.

1.2 index

[rule 8]

Rule Description: foreign key has no index table.

Rule Description: if foreign key has no index, the efficiency of association is very low.

[rule 9]

Rule Description: too many composite indexes or no indexes.

Rule Description: too many composite indexes will lead to large space consumption and high index maintenance cost. We should consider building a strategic index structure instead of creating an index for every requirement.

[Rule 10]

Rule Description: too many indexes in a single table.

Rule threshold: 3 (the number of single table indexes exceeds the specified threshold).

Rule Description: index can improve access speed, but too much will lead to excessive space consumption, high index maintenance cost, and affect DML efficiency. The number of indexes should be controlled.

[rule 11]

Rule Description: there are indexes not used in 7 days.

Rule Description: in the database for a period of time, the index is not used by any SQL statements. Please evaluate the validity of this index.

[rule 12]

Rule Description: field duplicate index.
Rule Description: a field is referenced by multiple indexes. Please consider the construction strategy to delete unnecessary indexes.

[Rule 13]

Rule Description: there is a global partition index.

Rule Description: global partition index has high maintenance cost. When the partition changes, it is necessary to maintain the validity of the global index.

[rule 14]

Rule Description: invalid index.
Rule Description: the index status is invalid and unusable.

[rule 15]

Rule Description: the index height exceeds the specified height.

Rule threshold: 4.

Rule Description: high index height increases IO cost.

[rule 16]

Rule Description: bitmap index exists.

Rule Description: bitmap index is not recommended in OLTP environment. If table objects are checked for DML operation, it will block related operations to a certain extent.

[rule 17]

Rule Description: there is a function index.

[rule 18]

Rule Description: there is an index with parallel property enabled.

Rule threshold: 1 (index degree property is not 1).

Rule Description: in general, it is not recommended to set parallel attributes on indexes.

[rule 19]

Rule Description: there is an index with too large clustering factor.

Rule threshold: custom.

Rule Description: the index with too large clustering factor should be optimized.

1.3 constraints

[rule 20]

Rule category: constraint.

Rule Description: a table without a primary key.

Rule Description: the primary key is the only basis for determining a record in a relational database. There is no reason not to define the primary key.

[rule 21]

Rule category: constraint.

Rule Description: table using foreign keys.

Rule Description: it is not recommended to use foreign key constraints, and data consistency is solved by application side.

1.4 fields

[rule 22]

Rule Description: too many table fields.

Rule threshold: 100 (the number of fields exceeds the specified threshold).

Rule Description: too many fields will lead to too large record length. The number of records saved by a single data storage unit is too small, which affects the access efficiency.

[rule 23]

Rule Description: a table containing large field types.

Rule Description: large object fields should be avoided in relational databases. If necessary, external storage can be considered.

[rule 24]

Rule Description: record length definition is too long.

Rule Description: the difference between the length of record definition and the actual storage length is too large. Please consider whether the field type definition is reasonable, and whether individual fields are too long and can be stored in separate tables.

[rule 25]
Rule Description: a table that does not contain a timestamp field.

Rule Description: time stamp is a method to obtain incremental data. It is suggested to add time stamp fields of creation time and update time in the table. The naming method is create_ TIME、UPDATE_ TIME。

[rule 26]

Rule Description: table field type does not match.

Rule Description: this rule will sample part of the data and analyze whether the definition type is consistent with the storage type. Common problems, such as using numbers, text to save the date, etc.

1.5 other objects

[rule 27]

Rule Description: cache too small a sequence.

Rule threshold: 100 (sequence cache value is less than the specified threshold).

Rule Description: the system will cache 20 by default. If it is too small, it will cause frequent query of data dictionary and affect concurrency.

[Rule 28]

Rule Description: there are stored procedures and function degrees.

Rule threshold: 20 (the number of stored procedures and functions exceeds the specified threshold).

Rule Description: stored procedures and functions, will affect the heterogeneous migration ability of database, and there are poor code maintenance and other reasons.

[Rule 29]

Rule Description: trigger exists.

Rule threshold: 20 (the number of triggers exceeds the specified threshold).

Rule Description: trigger, which will affect the heterogeneous migration ability of database. If there are data consistency maintenance requirements, please consider them from the application side.

[rule 30]

Rule Description: dblink exists.

Rule Description: it is not recommended to access other databases in one database, please consider solving it in the application side.

2、 Oracle rules (execution plan)

2.1 binding variables

[rule 31]

Rule Description: bound variable is not used.

Rule threshold: Custom (execution times).

Rule Description: for statements executed more than a certain threshold, there is a constant value on the right side of the predicate.

[rule 32]

Rule Description: too many bound variables.

Rule threshold: Custom (number of bound variables).

Rule Description: too many bound variables will increase the variable replacement time and SQL execution time to a certain extent.

2.2 inter table correlation

[rule 33]

Rule Description: Cartesian product.

Rule Description: due to the lack of join conditions, the Cartesian product join method is used in the association between tables, and the word “Cartesian” is included in the execution plan.

[Rule 34]

Rule Description: nested loop level is too deep.

Rule threshold: Custom (number of layers).

Rule Description: the nested loop level is too deep and exceeds the specified threshold. The words “nested loop” or “filter” are nested in the execution plan.

[rule 35]
Rule Description: nested loop inner table access mode is full table scan.

Rule Description: the inner table access mode of nested loop is full table scanning, which is inefficient.

[rule 36]
Rule Description: large result set sorting exists in sorting merge connection.

Rule Description: in sorting and merging, both result sets should be sorted and adjusted to other connection methods.

[rule 37]

Rule Description: multi table Association.

Rule threshold: Custom (number of tables).

Rule Description: too many table associations affect performance.

2.3 access path

[rule 38]

Rule Description: large table full table scan.

Rule threshold: Custom (table size, in GB).

Rule Description: a full table scan operation is performed on a large table. The execution plan contains the words “table access full”.

[rule 39]

Rule Description: large index full scan.

Rule threshold: Custom (index size, in GB).

Rule Description: the index full scan operation is performed on the large index, and the execution plan contains the words “index full scan”.

[rule 40]

Rule Description: large index fast full scan.

Rule threshold: Custom (index size, in GB).

Rule Description: the index fast full scan operation is performed on the large index, and the execution plan contains the words “index fast full scan”.

[rule 41]

Rule Description: index skip scan.

Rule Description: jump scan the index. The execution plan contains the words “index skip scan”.

[Rule 42]

Rule Description: partition full scan.

Rule Description: the partition table is scanned, and the execution plan contains the words “partition range all”.

[rule 43]

Rule Description: discontinuous partition scanning.

Rule Description: discontinuous partition scanning. The execution plan contains the words “partition range inlist” or “partition range or”.

[rule 44]

Rule Description: cross partition scanning.

Rule Description: continuous partition scanning. The execution plan contains the words “partition range iterator”.

2.4 type conversion

[rule 45]

Rule Description: there is an implicit conversion.

Rule Description: implicit data type conversion is used in condition judgment.

2.5 other implementation plans

[rule 46]

Rule Description: there is a large result set sorting operation.

Rule Description: we can avoid sorting by introducing index and other operations.

[rule 47]

Rule Description: there is a parallel access feature.

Rule Description: parallelism affects performance, which needs to be avoided in general.

[rule 48]

Rule Description: View access exists.
Rule Description: generally, view operations can be merged, de nested, and so on. If not, view definitions should be checked.

3、 Oracle rules (execution characteristics)

3.1 performance characteristics

[rule 49]

Rule Description: the ratio between the number of scanned blocks and the number of returned records is too low.

Rule threshold: Custom (percentage).

Rule Description: scan a large amount of data, but return a small number of records, need to logically adjust the SQL statement.

[rule 50]

Rule Description: too many child cursors.

Rule threshold: Custom (number of child cursors).

Rule Description: there are too many child cursors, and the execution plan may be unstable.

[rule 51]

Rule Description: elapsed_ time。

Rule threshold: custom.

[rule 52]

Rule Description: CPU_ time

Rule threshold: Custom

[rule 53]

Rule Description: buffer_ gets

Rule threshold: Custom

[rule 54]

Rule Description: disk_ reads

Rule threshold: Custom

[rule 55]

Rule Description: direct_ writes

Rule threshold: Custom

[rule 56]

Rule Description: Execution

Rule threshold: Custom

4、 MySQL rules (objects)

4.1 table and partition

[rule 57]

Rule Description: tables that exceed the specified size and have no partitions.

Rule threshold: Custom (table size, GB).

Rule Description: if the scale of the table is too large, it will affect the access efficiency of the table and increase the maintenance cost. A common solution is to use partitioned tables to convert large tables into partitioned tables.

[rule 58]

Rule Description: there are too many data tables in a single database.

Rule threshold: Custom (number of tables).

Rule Description: too many tables in a single database will affect the overall performance. If necessary, the business logic is split vertically.

[rule 59]

Rule Description: the amount of data in a single table (partition) is too large.

Rule threshold: Custom (data size, number of records).

Rule Description: too many single table (partition) data tables will affect the overall performance. When necessary, the data should be divided into different databases and tables or cleaned up and filed regularly.

4.2 index

[rule 60]

Rule Description: too many indexes in a single table.

Rule threshold: Custom (number of indexes).

Rule Description: there are too many indexes in a single table, which not only costs high maintenance, but also takes up more space.

[rule 61]

Rule Description: duplicate index exists.

Rule Description: the index can be completely replaced by another index containing the prefix, which is redundant. Redundant index will waste storage space and affect data update performance.

[rule 62]

Rule Description: index selection rate is not high.

Rule threshold: Custom (selection rate, percentage).

Rule Description: low index selection rate will lead to low index efficiency. Please adjust the index field.

4.3 constraints

[rule 63]

Rule Description: foreign key exists in the table.

Rule Description: foreign key resources will consume the computing power of the database. It is recommended to ensure data constraints through the application layer.

[rule 64]

Rule Description: table has no primary key defined.

Rule Description: if there is no primary key defined, MySQL will automatically create it. This is not a good design method.

4.4 fields

[rule 65]

Rule Description: there are large object fields.

Rule Description: large object fields will affect access performance and consume more space. It is recommended to store them outside the database.

[rule 66]

Rule Description: there are too many fields in a single table.

Rule threshold: Custom (number of fields).

Rule Description: if the number of table fields is too many, the records will be too long and the number of single page storage records will be reduced. The table can be disassembled.

[rule 67]

Rule Description: the length of single table field definition is too long.

Rule threshold: Custom (field length, unit byte).

Rule Description: the length of single table definition should be controlled to avoid too long records.

[rule 68]

Rule Description: the length of single table primary key field definition is too long.

Rule threshold: Custom (field length, unit byte).

Rule Description: the length of primary key field should be controlled. Too long primary key field will consume too much index space.

[rule 69]

Rule Description: there is no timestamp field defined in the table.

Rule Description: the timestamp field is the best way to obtain incremental data. Please define the timestamp field for the table.

[rule 70]

Rule Description: field data type definition error.

Rule threshold: Custom (number of records).

Rule Description: according to the saved content of the field, the definition of the field type is abnormal. It is recommended to select the appropriate data type.

4.5 other objects

[rule 71]

Rule Description: there are functions, stored procedures and triggers in a single table.

Rule Description: stored procedures, functions, triggers and so on will consume the computing power of the database. It is recommended to ensure data constraints through the application layer.

5、 MySQL rules (execution plan)

5.1 access path

[rule 72]

Rule Description: large table full table scan.

Rule threshold: Custom (table size, in GB).

Rule Description: a full table scan operation is performed on a large table.

5.2 SELECT_TYPE

[rule 73]

Specification Union

[rule 74]

Rule Description: subquery

[rule 75]

Rule Description: detailed subquery

[rule 76]

Rule Description: material

[rule 77]

Rule Description: unable subquery

[rule 78]

Rule Description: unable Union

5.3 ACCESS_TYPE

[rule 79]

Rule Description: Fulltext

[rule 80]

Rule Description: index_ merge

[rule 81]

Rule Description: unique_ subquery

[rule 82]

Rule Description: all

[rule 83]

Rule Description: index range

5.4 other implementation plans

[rule 84]

Rule Description: use temporary tables.

Rule Description: temporary tables are used in the execution process, and “using temporary” is included in the execution plan.

[rule 85]
Rule Description: use disk sorting.

Rule Description: disk sorting is used in the execution plan, and the word “using file sort” is included in the execution plan.

6、 MySQL rules (execution features)

6.1 performance characteristics

[rule 86]

Rule Description: index_ ratio

[rule 87]

Rule Description: lock_ time_ Sum

7、 Oracle + MySQL (statement level)

7.1 query class

[rule 88]

Rule Description: Select*

Rule Description: the use of select * is prohibited, and the required column must be explicitly selected.

[rule 89]

Rule Description: duplicate query clause.

Rule Description: it is forbidden to use duplicate query clauses, and should use with as replacement clause (Oracle only) to improve the efficiency of SQL execution.

[rule 90]

Rule Description: query field reference function.

Rule Description: it is forbidden to refer to functions in query fields (type conversion function and function index can be ignored).

[rule 91]

Rule Description: nested select clause.

Rule Description: nested subqueries with select clauses are prohibited to avoid performance problems.

[rule 92]

Rule Description: Union appears.

Rule Description: prevent unnecessary sorting actions.

[rule 93]

Rule Description: multiple filter conditions are connected through or.

Rule Description: prevents the optimizer from having selection exceptions.

[rule 94]

Rule Description: predicate condition uses like% xxx ‘

Rule Description: cannot use index.

[rule 95]
Rule Description: there is a negative operator in the predicate.

Rule Description:! =, < >,! <,! >, not exists, not.

[rule 96]

Rule Description: there are sub queries.

Rule Description: this should distinguish the location (select, from, where, having, etc.).

[rule 97]

Rule Description: there are more than three table associations.

[rule 98]

Rule Description: full connection or external connection exists.

Rule Description: cross join or outer join.

7.2 change type

[rule 99]

Rule Description: order by clause appears in update.

Rule Description: to prevent unnecessary sorting during update.

[rule 100]

Rule Description: where clause must appear in update.

Rule Description: prevent all unexpected update actions.

[Rule 101]
Rule Description: update primary key.

Rule Description: it is forbidden to update the primary key.

[rule 102]

Rule Description: order by clause appears in delete.

Rule Description: to prevent unnecessary sorting during deletion.

[rule 103]

Rule Description: where clause must appear in delete.

Rule Description: to prevent the occurrence of unexpected all delete action.

[rule 104]

Rule Description: new SQL text too long rule.

[rule 105]

Rule Description: too many new in list elements are added.

Author: Han Feng

First appeared in the author’s personal public number “Hanfeng channel”.

Source: Yixin Institute of Technology