SQL basic note 3 normal form key


SQL basic note 3 normal form key

What is paradigm? Ha, I can design and use it, but I can’t say it. Then the concept of less clear as a whole


What is & classification

Normal form (NF) is a kind of specification. When designing a database model, it has different levels of specification requirements for the rationalization of the relationship between the various attributes within the relationship.


  • 1NF, 2NF, 3NF, BCNF (bascoid paradigm), 4NF, 5NF (perfect paradigm)

Low order paradigm is the basis of high-order paradigm. The higher the level of paradigm, the lower the redundancy, and the easier it is to join



  • Hyperkey: a set of attributes that uniquely represent tuples (tuples are a row of data or a record). A superkey may contain redundant attributes. For example, ID number + student id+ name.
  • Candidate key: a superkey that does not contain redundant attributes. Candidate keys may not be unique, and ID number or student ID can be used as candidate keys.
  • Primary key
  • Foreign key


  • Primary attribute: the attribute of any candidate key is the primary attribute
  • Non primary attribute

rely on

  • Dependency: the value of one or a group of attributes determines the value of other attributes
  • Fully dependent: a non primary property is dependent on all primary properties
  • Partial dependence
  • Transitive dependency: if there is a “a > > b > > C” dependency, then C transitive depends on a


Each NF

  • 1NF: any attribute in the table is atomic (not splittable), and all RDBMS meet the requirements.

  • 2NF: any non primary attribute must be associated withPrimary attribute / candidate keyComplete dependence, eliminating the partial dependence of non primary attributes on candidate codes.

    If 2NF is not met, it will cause

    • data redundancy

    • Exception in addition, deletion and modification

      Example of not satisfying 2NF

      Player ID determines player‘s name and other personal information

      The competition ID determines the competition information such as the competition time and the competition field

      If you put all the above attributes in one table, it will result in

      • Data redundancy: if there are n players participating in an abnormal game, the corresponding game information will be n-1 times more when n players are recorded

      • Exception in addition, deletion and modification: only when operating according to player ID or game ID,

        Add: do not know another main property, resulting in unable to insert

        Delete: another main attribute and its corresponding information will be deleted

        Modify: when modifying information, all records with modified information should be modified

  • 3NF: on the basis of the second normal form, any non primary attribute does not rely on the candidate key

    Examples of satisfying 2NF but not 3NF

    Table 1: player ID, player name, team name, head coach name

    Player name, team name and head coach name can be determined by player ID

    But the manager’s name can also be determined by the team’s name, which leads to the transmission dependence

    If we split table 1 into

    Table 2: player ID, player name, team name

    Table 3: name of team and head coach

    Then 3NF is satisfied


Fatal conclusion

On the basis of the first normal form, all the attributes are divided according to the “non primary completely dependent primary attribute”, and each table is added with ID to meet the second normal form, and then the attributes are subdivided to eliminate the transitive dependencyFOREIGN KEYThe third paradigm is satisfied

BCNF, 4NF, 5NF query efficiency is very low, rarely used, development efficiency and query efficiency are very touching.

1NF + eliminates the partial functional dependence of the non primary attribute on the key = 2NF. That is to say, in 2NF, the non primary attribute completely depends on the primary keyword;

2NF + eliminate the transfer function dependence of the non primary attribute on the key = 3NF. In 3NF, attributes do not depend on other non primary attributes. Transfer function dependence means that if there is a decisive relationship of “a → B → C”, then C transfer function depends on a;

3NF + eliminates the transfer function dependence of the primary property on the key = BCNF. BCNF is an improved form of 3NF, that is, on the basis of 3NF, if there is no transfer function dependence of any field on any candidate key field in the database table, then it conforms to BCNF.


Anti paradigm

In general, the business interface will display the user name instead of the user ID, so it is often necessary to connect the user table with other tables for query. Therefore, when the amount of data is large, user ID and use name can be “redundant” in other common tables, so that only a single table can be scanned. It doesn’t need to connect queries in the case of large amount of data.

Anti paradigms are commonly used in OLAP scenarios. See here for details and here for details



Anti paradigm design and data warehouse

The differences between data warehouse and database are as follows

  • Database is used to capture data and data warehouse is used to analyze data
  • The database has strong real-time requirements for addition, deletion and modification, and needs to store online user data. Data warehouse is usually historical data
  • The database should avoid redundancy as much as possible, and the design of data warehouse is more anti normal form, because there are often a lot of historical data. Connection queries can slow down significantly.


Recommended Today

NLP in financial market — emotional analysis

By Yuki TakahashiCompile VKSource: towards Data Science Since the launch of alexnet on Imagenet, the deep learning of computer vision has been successfully applied to various applications. On the contrary, NLP has been lagging behind in the application of deep neural network. Many applications that claim to use artificial intelligence usually use some rule-based algorithm […]