SQL execution order (based on MySQL)

Time:2020-12-11

Keywords used in SQL query: select, distinct, from, join, on, where, group by, having, sum, count, Max, AVG, order by, limit

Writing order:

​ select->distinct->from->join->on->where->group by->having->order by->limit

Required fields: select, from

Optional fields: distinct, join, on, where, group by, having, sum, count, Max, AVG, order by, limit

Execution order

From – > on – > join – > where – > group by (start using the alias in select, you can use the alias in the following statements) – > sum, count, Max, AVG – > having – > Select – > distinct – > order by – > limit

Meaning of key words

  • from: which data table do you want to retrieve data from
  • join: associate the tables that need to be associated queries

    When associating queries, the database selects a driver table, and then uses the records of this table to associate other tables
    The left join generally takes the left table as the driving table (right join is generally the right table), while the inner join generally takes the table with less result set as the driving table. In some cases, the left join will be optimized to inner join by the query optimizer

    • Driving table selection principle: on the premise of no influence on the final result set, the table with the least result set is preferred as the driving table
    • stayUse index associationThere areIndex Nested-Loop joinandBatched Key Access joinTwo algorithms
    • stayIndex association not usedThere areSimple Nested-Loop joinandBlock Nested-Loop joinTwo algorithms
    • Simple Nested-Loop join(snlj, simple nested loop join) algorithm: according to on condition, take a piece of data from the drive table, then scan the whole table to be driven table, and put the qualified records into the final result set. In this way, each record of the driven table is accompanied by a full table scan of the driven table

      • Matching times: outer table rows * inner table rows
    • Index Nested-Loop Join(inlj, index nested loop join) algorithm: the index nested loop join algorithm is based on the index. The index is based on the inner table. Through the outer table matching conditions, the index is directly matched with the index of the inner table, avoiding the comparison with each record of the inner table, so as to reduce the matching times of the inner table by using the index query

      • Matching times: row number of outer table * index height of inner table
    • Block Nested-Loop Join(bnlj, cache block nested loop join) algorithm: the cache block nested loop join caches multiple data at a time, caches the columns participating in the query into the join buffer, and then matches the data in the join buffer with the data in the inner table in batches, so as to reduce the number of inner loops (once traversing the inner table, you can batch match one join The outer table data in buffer).
      When not in useIndex Nested-Loop JoinBy defaultBlock Nested-Loop Join
    • Batched Key Access join(bkaj) algorithm: it is similar to snlj algorithm, but it is used for indexes on the joined table to be used. Before the rows are submitted to the joined table, the rows are sorted according to the index field, so the random IO is reduced. This is the biggest difference between the two. But what if the joined table does not use an index? Then use bnlj
    • What is?Join Buffer?

      • Join BufferAll columns that participate in the query are cached instead of the only join columns.
      • It can be adjusted byjoin_buffer_sizeCache size
      • join_buffer_sizeThe default value for is 256K,join_buffer_sizeThe maximum value of is in theMySQL 5.1.22Before version4G, and the later version can only be applied for under 64 bit operating system4GOfJoin BufferSpace.
      • useBlock Nested-Loop JoinThe algorithm needs to turn on the optimizer management configurationoptimizer_switchSettings forblock_nested_loopbyonOn by default.
    • When selecting join algorithm, priority will be given to determine whether inlj and bnlj can be used
      Index Nested-LoopJoin > Block Nested-Loop Join > Simple Nested-Loop Join
    • Note: you can use explain to find the driver table. The first table of the result is the driver table, but the execution plan may change when it is actually executed
  • on: association condition
  • where: conditions for filtering data in a table

    • Execution order: bottom up, right to left
    • Note: if it is valid for database records, it cannot be effective for aggregate results. The conditions that can filter out the maximum number of records must be written at the end of the where clause, and aggregate functions (sum, count, Max, AVG) cannot be used
  • group by: how to group the data filtered out above

    • Execution order: from left to right
    • Note: try to use where filter before group by and avoid using having filter later
  • avg: Average
  • having: the conditions for filtering the grouped data above

    • Note: the aggregate result is filtered, so it is very resource consuming, so aggregate function can be used
    • Example: screening areas with a population greater than 100W

      Select region, sum (population), sum (area) from BBC group by region having sum (population) > 1000000. You cannot use where to filter regions over 100W, because there is no such record

  • select: view which column in the result set or the calculated results of the column
  • distinct: duplicate the result set
  • order by: in what order to view the returned data

    • Execution order: from left to right
    • Note: very resource intensive
  • limit: intercept the target page data