Notes on database system and principle

  • Three level model structure

Mode, external mode (sub mode), internal mode

  • Two level mapping

External schema / schema mapping → guarantee logical independence of data

Internal mode / mode influence → guarantee the physical independence of data

Introduction to relational database

E.f.codd of IBM is the most systematic and strict method to put forward relational model

In 1970, the relational data model was proposed, and then the concepts of relational algebra and relational calculus were proposed. In 1972, the first, second and third normal forms of relations were proposed, and in 1974, the BC paradigm of relations was proposed.

  • Single data structure relationship

  • Logical structure of data — two dimensional table

Relation operation

  • query

    • Selection, projection, connection, division, union, intersection, difference
  • Data update

    • Insert, delete, modify

The expression ability of query is the most important part

Characteristics of relational data language

  • Relational data language is a highly non procedural language

    • The choice of storage path is completed by the optimization mechanism of DBMS
    • Users can complete data operation without using loop structure
  • Can be embedded in high-level language

  • Relational algebra, tuple relational calculus and domain relational calculus are completely equivalent in expressive ability.

Three kinds of integrity constraints of relation

  • Entity integrity

    • It is usually automatically supported by relational systems
  • Referential integrity

    • It is not supported by early systems, but can be automatically supported by large-scale systems
  • User defined integrity

    • It reflects the constraint conditions to be followed in the application field, and embodies the implied constraints in the specific field
    • User defined and supported by the system

Relational data structure

  • The relational model is based on set algebra


D1×D2×…..×DnA subset of is called in field D1,D2,…..,DnThe relationship between the two is expressed as follows:

​ R(D1,D2,……,Dn

R: relationship name

N: degree of relationship


  • Each element in a relationship is a tuple in the relationship, usually represented by T.

Unit relation and binary relation

  • When n = 1, the relation is called unary relation
  • When n = 2, the relation is called binary relation

Representation of relations

  • Relationship is also a two-dimensional table, each row of the table corresponds to a tuple, and each column of the table corresponds to a field.


  • Different columns in a relationship can correspond to the same domain. In order to distinguish them, each column must be given a name called attribute.
  • N-item relation must have n attributes.


  • A domain is a collection of values with the same data type. (integers, real numbers, etc.)

Cartesian product

  • Given a set of fields D1,D2………..Dn, these fields can have the same. D1,D2………..DnThe Cartesian product of is:

​ D1×D2………..×Dn = {(d1,d2……dn)|di∈Di,i = 1,2……..n}

  • A combination of all the values of all fields

  • It can’t be repeated

    • Tuple

      • Every element in Cartesian product (D1,d2,………,dn)It’s called an N – tuple, or tuple for short. This is the recorded value.
    • Component

      • Every element in Cartesian product (D1,d2,………,dn)Each value DiIt’s called a component.
    • Cardinal number

      • If Di(I = 1,2,…, n) is a finite set with the cardinality Mi(I = 1, 2,…, n), then D1×D2×……×DnThe cardinal number m is:

  • The Cartesian product is expressed as follows

    • Cartesian product can be expressed as a two-dimensional table. Each row in the table corresponds to a tuple, and each column in the table corresponds to a field.

Candidate key

  • If the value of an attribute group in a relationship can uniquely identify a tuple, it is called a candidate code.

    If there is only one candidate code, it is called the main code

All – key

  • In the most extreme case, all attribute groups of a relational schema are candidate codes for the relational schema, called all key

Primary key

  • If a relationship has multiple candidate codes, one of them is selected as the master code.
  • The attributes of the main code are called prime attributes
  • Attributes that are not included in any candidate code are called non – key attributes