SQL basic note 3 normal form key
What is paradigm? Ha, I can design and use it, but I can’t say it. Then the concept of less clear as a whole
What is & classification
Normal form (NF) is a kind of specification. When designing a database model, it has different levels of specification requirements for the rationalization of the relationship between the various attributes within the relationship.
- 1NF, 2NF, 3NF, BCNF (bascoid paradigm), 4NF, 5NF (perfect paradigm)
Low order paradigm is the basis of high-order paradigm. The higher the level of paradigm, the lower the redundancy, and the easier it is to join
- Hyperkey: a set of attributes that uniquely represent tuples (tuples are a row of data or a record). A superkey may contain redundant attributes. For example, ID number + student id+ name.
- Candidate key: a superkey that does not contain redundant attributes. Candidate keys may not be unique, and ID number or student ID can be used as candidate keys.
- Primary key
- Foreign key
- Primary attribute: the attribute of any candidate key is the primary attribute
- Non primary attribute
- Dependency: the value of one or a group of attributes determines the value of other attributes
- Fully dependent: a non primary property is dependent on all primary properties
- Partial dependence
- Transitive dependency: if there is a “a > > b > > C” dependency, then C transitive depends on a
1NF: any attribute in the table is atomic (not splittable), and all RDBMS meet the requirements.
2NF: any non primary attribute must be associated with
Primary attribute / candidate keyComplete dependence, eliminating the partial dependence of non primary attributes on candidate codes.
If 2NF is not met, it will cause
Exception in addition, deletion and modification
Example of not satisfying 2NF
Player ID determines player‘s name and other personal information
The competition ID determines the competition information such as the competition time and the competition field
If you put all the above attributes in one table, it will result in
Data redundancy: if there are n players participating in an abnormal game, the corresponding game information will be n-1 times more when n players are recorded
Exception in addition, deletion and modification: only when operating according to player ID or game ID,
Add: do not know another main property, resulting in unable to insert
Delete: another main attribute and its corresponding information will be deleted
Modify: when modifying information, all records with modified information should be modified
3NF: on the basis of the second normal form, any non primary attribute does not rely on the candidate key
Examples of satisfying 2NF but not 3NF
Table 1: player ID, player name, team name, head coach name
Player name, team name and head coach name can be determined by player ID
But the manager’s name can also be determined by the team’s name, which leads to the transmission dependence
If we split table 1 into
Table 2: player ID, player name, team name
Table 3: name of team and head coach
Then 3NF is satisfied
On the basis of the first normal form, all the attributes are divided according to the “non primary completely dependent primary attribute”, and each table is added with ID to meet the second normal form, and then the attributes are subdivided to eliminate the transitive dependency
FOREIGN KEYThe third paradigm is satisfied
BCNF, 4NF, 5NF query efficiency is very low, rarely used, development efficiency and query efficiency are very touching.
1NF + eliminates the partial functional dependence of the non primary attribute on the key = 2NF. That is to say, in 2NF, the non primary attribute completely depends on the primary keyword;
2NF + eliminate the transfer function dependence of the non primary attribute on the key = 3NF. In 3NF, attributes do not depend on other non primary attributes. Transfer function dependence means that if there is a decisive relationship of “a → B → C”, then C transfer function depends on a;
3NF + eliminates the transfer function dependence of the primary property on the key = BCNF. BCNF is an improved form of 3NF, that is, on the basis of 3NF, if there is no transfer function dependence of any field on any candidate key field in the database table, then it conforms to BCNF.
In general, the business interface will display the user name instead of the user ID, so it is often necessary to connect the user table with other tables for query. Therefore, when the amount of data is large, user ID and use name can be “redundant” in other common tables, so that only a single table can be scanned. It doesn’t need to connect queries in the case of large amount of data.
Anti paradigms are commonly used in OLAP scenarios. See here for details and here for details
Anti paradigm design and data warehouse
The differences between data warehouse and database are as follows
- Database is used to capture data and data warehouse is used to analyze data
- The database has strong real-time requirements for addition, deletion and modification, and needs to store online user data. Data warehouse is usually historical data
- The database should avoid redundancy as much as possible, and the design of data warehouse is more anti normal form, because there are often a lot of historical data. Connection queries can slow down significantly.