When writing SQL, because some statement aliases cannot be called, baidu explained the reason. It turned out that it was caused by different alias mechanisms. In order to avoid making the same mistake again, I summarized the information found on the Internet today. The execution sequence of SQL and MySQL is the same. It is found that the internal mechanism is the same. The biggest difference is in the reference of alias.
1、 SQL execution order
(10) order by
From this sequence, it is not difficult to find that all query statements are executed from from. During the execution process, each step will generate a virtual table for the next step, which will be used as the input of the next execution step.
Step 1: first perform a Cartesian product on the first two tables in the from clause, and then generate the virtual table VT1 (select a relatively small table as the basic table)
Step 2: the next step is to apply the on filter. The logical expression in on will be applied to each row in VT1, filter out the rows that meet the on logical expression, and generate the virtual table vt2
Step 3: if it is an outer join, the outer rows will be added in this step. The left outer join will add the rows filtered in the second step in the left table. If it is a right outer join, the rows filtered in the second step in the right table will be added to generate a virtual table VT3
Step 4: if there are more than two tables in the from clause, connect VT3 with the third table to calculate the Cartesian product and generate a virtual table. This process is a repeat of steps 1-3, and finally get a new virtual table VT3.
Step 5: apply the where filter and reference the where filter to the virtual table produced in the previous step to generate the virtual table vt4. There is an important detail here. I have to say that for the query containing the outer join clause, there is a puzzling question. Is it the on filter or the where filter to specify the logical expression? The biggest difference between on and where is that if the logical expression is applied in on, the removed rows can be added back again in the third step of outer join, and the removal of where is the final result. For a simple example, there is a student table (class, name) and a grade table (name, grade). Now I need to return the grades of all the students in class X, but several students in this class are absent from the exam, that is, there is no record in the grade table. In order to get the expected results, we need to specify the relationship between the students and the grade sheet (student. Name = grade. Name) in the on clause. Do we find that when executing the second step, the records of students who did not take the exam will not appear in vt2, because they are filtered out by the logical expression of on, but we can find the students who did not take the exam in the left table (students) by using left outer join, Because we want to return all the students in class X, if we apply students in on If class = ‘x’, all records of students in class X will not be found in the left outer join, so students can only be applied in the where filter Class = ‘x’ because its filtering is final.
Step 6: the group by clause combines the unique values in into a group to get the virtual table vT5. If group by is applied, all the following steps can only get the vT5 columns or aggregate functions (count, sum, AVG, etc.). The reason is that the final result set contains only one row for each group. Please keep this in mind.
Step 7: generate v6.rolt5 for super group
Step 8: apply the having filter to generate vt7. Having filter is the first and only filter applied to grouped data.
Step 9: process the select clause. Filter out the columns in vt7 that appear in select. Generate vt8
Step 10: apply the distinct clause, remove the same row from vt8, and generate vt9. In fact, if the group by clause is applied, distinct is redundant. The same reason is that when grouping, the unique values in the column are divided into a group, and only one row of records is returned for each group, so all records will be different.
Step 11: apply the order by clause. According to order_ by_ Condition sorts vt9, and a cursor is returned instead of a virtual table. SQL is based on the theory of set. The set will not sort its rows in advance. It is only a logical set of members, and the order of members is irrelevant. A query that sorts a table can return an object that contains the logical organization of a specific physical order. This object is called a cursor. Because the return value is a cursor, the query using the order by clause cannot be applied to table expressions. Sorting is very costly. Unless you have to sort, you’d better not specify order by. Finally, this step is the first and only one that can use the alias in the select list.
Step 12: apply the top option. At this time, the result is returned to the requester, that is, the user.
2、 MySQL execution order
Select statement definition
A completed select statement contains several optional clauses. The definition of select statement is as follows:
Select clause is required, and other clauses such as where clause and group by clause are optional.
In a select statement, the order of clauses is fixed. For example, the group by clause will not precede the where clause.
Select statement execution order
The execution order of the sub sentence of the select statement is different from the input order of the sub sentence of the select statement, so it is not executed from the select clause, but in the following order:
Start – > from clause – > where clause – > group by Clause – > having clause – > order by Clause – > select clause – > limit clause – > final result
After each clause is executed, an intermediate result will be generated for the following clauses. If there is no clause, it will be skipped
By comparison, the execution order of MySQL and SQL is basically the same. The SQL statements in the standard order are:
Where candidate name is not null
Group by examinee name
Having > 600
Order by Max total score
In the above example, the execution order of SQL statements is as follows:
(1). First execute the from clause from TB_ The grade table assembles data from the data source
(2). Execute where clause to filter TB_ All data in the grade table is not null data
(3). Execute the group by clause and put TB_ The grade table is grouped by student name column
(4). Calculate the max() aggregation function and find the largest values in the total score according to the “total score”
(5). Execute the having clause to screen those whose total score of the course is greater than 600
(7). Execute the order by clause to sort the final results by “Max score”
The following is an example to illustrate the root cause why aliases cannot be used:
avg(s.score) as rscore,
when avg(s.score) < 60 then
when avg(s.score) >= 60 and avg(s.score) <= 80 then
when avg(s.score) > 80 and avg(s.score) <= 100 then
from score s, student m
where s.mid = m.mid
group by m.mname
order by rscore desc;
If AVG (s.score) in the case is changed to the alias rscore, an error will be reported when running under SQL
If you replace the count (*) in having with the alias C, this sentence will become wrong in SQL. It is correct in MySQL. I can’t figure it out. I gave my friend some advice and finally understood it. Here are the instructions:
Execution order of groupby and having, where, orderby statements:
Finally, I want to explain the execution order of the group by, having, where and orderby statements. An SQL statement often produces multiple temporary views, so the execution order of these keywords is very important, because you must know whether this keyword operates on the fields before the corresponding view is formed or on the temporary view. This problem is particularly important in the view with alias. The keywords listed above are executed in the following order: where, groupby, having, orderby. First, delete the records that do not meet the conditions in the most original records, and then group the filtered views through the grouping conditions specified after the groupby keyword. Then, the system filters out the records that do not meet the conditions after the grouped views according to the filtering conditions specified after the having keyword, and then sorts the views according to the orderby statement, so that the final result is generated. Among the four keywords, the column name of the final view can only be used in the orderby statement, such as:
Here, IDE can be used only in the order by statement. If column names need to be referenced in other conditional statements, only ID can be used, not ide.