Design considerations and MySQL database

Time:2020-12-2

The following is my personal opinion. I have written it wrong, or I understand it wrong. Please understand
If there is a problem with self built MySQL using the information in this article, please don’t come to me
If you have a better solution, or suggestions, you are welcome to submit the version
TimAutumnWind (please indicate the source of the reprint https://learnku.com/articles/50270 )

1. Please use InnoDB storage engine

InnoDB has better CPU and IO performance, better backup and lock table mechanism, and improves the efficiency of statistics and debugging.
In addition, as a system, InnoDB supports a variety of key functions, the most important of which are transaction log and row level lock. Transaction log records real database transactions, but more importantly, data crash recovery and rollback.
IO based on inoodb mode can give more secure data protection and better performance. In addition, in most cases, row level locking can provide higher concurrency performance, because users only lock the data they are writing, and the read data is never blocked

2. The data table and data field must be annotated in Chinese

It is convenient for future newcomers to understand and be familiar with faster, and has better readability
At the same time, the status field is marked with 0 for deletion and 1 for normal enumeration value.

3. Utf8mb4 character set must be used

Utf8 is a universal character set. Mb4 is extended on utf8 to support Emoji and other new characters.

4. It is forbidden to use stored procedure, view, trigger, event, join, etc

For the Internet business with high concurrency and big data, the architecture design idea is to “liberate the database CPU and transfer the calculation to the service layer”. The database is good at storing and indexing, and the CPU computing is more reasonable in the business layer.
If it is low concurrency, small traffic, when I did not say

5. It is forbidden to store large files or photos

When there are many pictures, the paging query speed is significantly slower. The response time is within 1 second before, and it takes about 4-5 seconds to respond after adding the photo field. Large files and photos are stored in the file system, and it is better to store URIs in the database

6. The table must have a primary key, such as auto increment primary key

  1. Increasing primary key and writing data rows can improve insertion performance, avoid page splitting, reduce table fragmentation, and improve the use of space and memory
  2. Using digital type primary key, shorter data type can effectively reduce the disk space of index and improve the efficiency of index cache
  3. From the schema, the primary key will be deleted from the primary and secondary database
  4. More use of business primary keys will make it more convenient to use sub databases and tables.

7. Prohibit the use of foreign keys. If there are foreign key integrity constraints, application control is required

Foreign keys will cause coupling between tables. Update and delete operations will involve associated tables, which will greatly affect the performance of SQL and even cause deadlock.

8. Define the field as not null and provide a default value

  1. Null columns make index / index statistics / value comparisons more complex and more difficult to optimize for MySQL.
  2. This type of MySQL needs special processing to increase the complexity of database processing records; under the same conditions, when there are many empty fields in the table, the processing performance of the database will be greatly reduced.
  3. Null values require more storage space, and the null columns in each row of the table or index need additional space to identify.
  4. When processing null, you can only use is null or is not null, instead of =, in, <, < >,! =, not in

9. Do not use text and blob types

It will waste more disk and memory space, and unnecessary large field queries will eliminate hot data, resulting in a sharp decrease in memory hit rate and affect database performance.

10. It is forbidden to use decimal to store currency

All 0202, use integer bar, decimal easy to lead to money on the wrong or precision problems

11. Use varchar (20) to store mobile phone number

  1. When it comes to area code or country code, + – ()
  2. Will mobile phone numbers do math?
  3. Varchar can support fuzzy queries, such as “138%”

12. Enum is prohibited and tinyint can be used instead

  1. Adding a new enum value requires DDL operation
  2. The actual internal storage of enum is an integer. Do you think you define a string?

13. About index design

  1. It is suggested that the number of single table indexes should be controlled within 5

  2. The more indexes, the better! Indexing can improve efficiency, but it can also reduce efficiency.

  3. Index can increase query efficiency, but it will also reduce the efficiency of insertion and update, and even reduce the efficiency of query in some cases.

  4. When the MySQL optimizer chooses how to optimize the query, it will evaluate each available index according to the unified information to generate the best execution plan. If there are many indexes available for query at the same time, it will increase the time for the MySQL optimizer to generate the execution plan, which will also reduce the query performance.

  5. It is forbidden to build indexes on attributes that are updated frequently and with low discrimination, such as age and gender

  6. Updating will change the B + tree, and indexing frequently updated fields will greatly reduce database performance

  7. For the attribute of gender, which is not distinguished, it is meaningless to establish an index. It can not effectively filter data, and its performance is similar to that of full table scanning

  8. To build a composite index, we must put the field with high discrimination in front, which can filter data more effectively

14. About SQL usage specification

  1. Do not use insert into t_ XXX values (xxx), must display the column attribute of the specified insertion, which is easy to appear program bug after adding or deleting fields

  2. Do not use functions or expressions on the properties of where conditions

  3. for example

  4. This will cause a full table scan

  5. SELECT uid FROM t_user WHERE from_unixtime(day)>=’2019-10-09’

  6. Optimized writing

  7. SELECT uid FROM t_user WHERE day>=unix_timestamp(‘2019-10-09 00:00:00’)

  8. Do not negative queries, and fuzzy queries beginning with%

  9. Negative query conditions: not,! =, < >,! <,! >, not in, not like, will cause full table scanning

  10. %The first fuzzy query will result in a full table scan

  11. If you need to search, you can use full-text indexing

  12. Do not use join queries in large tables and prohibit sub queries in large tables, which will produce temporary tables, consume more memory and CPU, and greatly affect database performance

  13. Try not to use the or condition, you must change to in query. Old version of MySQL or query can not hit the index, even if it can, it is not necessary

  14. The application must catch SQL exceptions and handle them accordingly

15. Use moreEXPLAIN optimization

To optimize mysql, we should make good use of explain to view SQL execution plan

The following is a comment on the optimization

Key column = > the index name used. If no index is selected, the value is null. Index can be forced

Type column = > connection type. A good SQL statement should reach the range level. Eliminate all level

Rows column = > the number of rows to scan. This value is an estimated value. It would be good if you knew it in mind
Extra column = > for details, note that the common unfriendly values are: using filesort, using temporary
key_ Len column = > index length

16. In should not contain too many values in SQL statement

When using in, MySQL optimizes in accordingly, that is, all constants in in in are stored in an array, and the array is ordered. However, if the value is large, the consumption is also relatively large. Another example: select id from table_ Name where num in (1,2,3) for consecutive values, use between instead of in, or use the connection to replace it.

17. SELECT Statement must specify the field name

Select * increases a lot of unnecessary consumption (CPU, IO, memory, network bandwidth); increases the possibility of using overlay index. When the table structure changes, the front break also needs to be updated. Therefore, the field name should be directly followed by the select.

18. When only one piece of data is needed, it is best to use limit 1

This is to make the type column in explain a const type

19. If the sort field does not use an index, sort as little as possible

20. If other fields in the constraint have no index, use or as little as possible

If one of the fields on both sides of or is not an index field, and other conditions are not index fields, the query will not be indexed. Many times, using union all or Union (when necessary) instead of “or” will get better results

21. Try to use union all instead of union

The main difference between union and union all is that the former needs to merge the result set before performing unique filtering operation, which will involve sorting, increasing a large number of CPU operations, increasing resource consumption and latency. Of course, the premise of union all is that there is no duplicate data in the two result sets.

22. Do not use order by Rand ()

select id from table_name order by rand() limit 10000;
The above SQL statement can be optimized to
select id from table_name t1 join (select rand() * (select max(id) from table_name) as nid) t2 ont1.id > t2.nid limit 1000;

23. Use reasonable paging methods to improve paging efficiency

select id,name from table_name limit 866613, 20
When using the above SQL statements for pagination, some people may find that with the increase of the table data, using limit paging query directly will become slower and slower.
The optimization method is as follows: you can take the ID of the maximum number of rows in the previous page, and then limit the starting point of the next page according to the maximum ID. In this column, the largest ID on the previous page is 866612. SQL can be written as follows:
select id,name from table_name where id> 866612 limit 20

24. Avoid where Clause to the field null Value judgment

The judgment of null will cause the engine to abandon the use of index and scan the whole table. As mentioned above, null is not used

25. Avoid where Clause to operate on the field expression

such as
select user_id,user_project from table_name where age*2=36;
In the database, arithmetic operations are performed on the fields, which will cause the engine to abandon the use of indexes. It is suggested that the following should be changed:
select user_id,user_project from table_name where age=36/2;

26. For federated indexes, the leftmost prefix rule should be followed

For example, the index contains the fields ID, name and school. You can use the ID field directly or the order of ID and name, but name; school can’t use this index. Therefore, when creating a federated index, we must pay attention to the order of index fields, and the commonly used query fields are put at the top

27. If necessary, you can use force index to force the query to follow an index

Sometimes the MySQL optimizer uses the index it thinks is appropriate to retrieve SQL statements, but maybe the index it uses is not what we want. In this case, we can use force index to force the optimizer to use the index we have made. At the same time, it can also be used to force index search for statements without index under certain conditions

28. Pay attention to the range query statement

For a federated index, if there are range queries, such as between, >, < and other conditions, the following index fields will be invalid.

29. Soft optimization

  1. Pay attention to query sentence optimization
  2. Optimize subquery
  3. Use index
  4. Breakdown table, cold field can also be separated from hot field
  5. Intermediate table
  6. Use as much as possible inner join**, avoid**left join
  7. Use the small table to drive the large table
  8. Analysis table, check list, optimization table
  9. Increase**Redundant fields, reduce queries, do not split everything very scattered**
  10. Sub database and sub table One master and many subordinates Or directly Multi master and multi subordinate
  11. Cache cluster

30. Hard optimization – money and money

a. Configuration of multi – core and high – frequency CPU, multi – core can execute multiple threads
b. If you configure large memory and increase memory, you can increase the cache capacity, so that you can reduce the disk I / O time and improve the response speed

c. Configure high speed disk or distribute disk reasonably: high speed disk can improve I / O, distributed disk can improve the ability of parallel operation

d. To make a analogy, alicloud servers do self built databases. The server hard disk should be at the level of ESSD. Don’t put me on an efficient cloud disk

31. Optimize database parameters

  1. The configuration parameters of MySQL service are all in the my.cnf Or my.ini The following is a list of parameters that have a significant impact on performance
  2. key_ buffer_ Size = > index buffer size
  3. table_ Cache = > the number of tables that can be opened at the same time
  4. query_ cache_ Size = > query buffer size
  5. query_ cache_ Type = > the switch of the previous parameter, 0 means no buffer is used, 1 means buffer is used
  6. But you can use SQL in queries_ NO_ Cache means not to use buffers
  7. Indicates that the buffer is used only when it is explicitly pointed out in the query, that is, SQL_ CACHE.
  8. sort_ buffer_ Size = > sort buffer
  9. More parameters = >www.mysql.com/cn/why-mysql/perform…

This work adoptsCC agreementThe author and the link to this article must be indicated in the reprint