Syntax and Use of Window Functions in PostgreSQL Database

Time:2019-6-29

What is a window function?

A window function performs a calculation on a series of table rows that are somewhat related to the current row. This is comparable to the calculation performed by an aggregate function. However, window functions do not aggregate multiple rows into a single output line, which is different from the usual non-window aggregation function. Instead, rows retain their separate identities. Behind these phenomena, window functions can access more than the current row of query results.

  • You can access multiple rows of records related to the current record.
  • It does not aggregate multiple rows into one row, which is different from aggregation function.

Window function grammar

The window function follows an OVER clause, which determines which rows in the query are separated and processed by the window function.

You can include PARTITION BY and ORDER BY instructions, both of which are optional.

window_func() OVER([PARTITION BY field] [ORDER BY field])

If no PARTITION BY and ORDER BY instructions are specified, then they are equivalent to aggregation functions, which compute all data.

The PARTITION BY clause groups the query rows into partitions, and the window function handles them independently. PARTITION BY works like a query-level GROUP BY clause, but its expression is always an expression and cannot be the name or number of the output column. Without PARTITION BY, all rows generated by the query are treated as a single partition.

The ORDER BY clause determines the order of rows in a partition processed by a window function. It works like an ORDER BY clause at the query level, but it also cannot use the name or number of the output column. Without ORDER BY, rows will be processed in an unspecified order.

Aggregation functions in PostgreSQL can also be used as window functions.

In addition to these built-in window functions, any built-in or user-defined general or statistical aggregation (i.e., ordered or hypothetical aggregation) can be used as window functions. Aggregation functions act as window functions only if the call follows the OVER clause; otherwise they act as non-window aggregation and return a single row for the remaining collection.

Examples of window functions

Emp_salary table is structured as follows:


SELECT emp_no, dep_name, salary
FROM public.emp_salary
order by dep_name, emp_no;
emp_id dep_name salary
7 develop 4200
8 develop 6000
9 develop 4500
10 develop 5200
11 develop 5200
2 personnel 3900
5 personnel 3500
1 sales 5000
3 sales 4800
4 sales 4800

If we want to compare the average salary of each employee with that of his department, we need such results:

emp_id dep_name salary avg
7 develop 4200 5020
8 develop 6000 5020
9 develop 4500 5020
10 develop 5200 5020
11 develop 5200 5020
2 personnel 3900 3700
5 personnel 3500 3700
1 sales 5000 4866.66666666667
3 sales 4800 4866.66666666667
4 sales 4800 4866.66666666667

If you do not use window functions to query, it is more complex, of course, it can also be done, the statement is as follows:


SELECT e0.emp_no, e0.dep_name, e0.salary, e2.avg_salary
FROM public.emp_salary e0
join (
 select e1.dep_name, avg(e1.salary) as avg_salary
 from public.emp_salary e1
 group by e1.dep_name
) e2 on e2.dep_name = e0.dep_name
order by e0.dep_name, e0.emp_no;

If you use window function to query, it’s easy to do. The SQL statement is as follows:


SELECT emp_no, dep_name, salary,
  avg(salary) over(partition by dep_name)
FROM public.emp_salary
order by dep_name, emp_no;

But if you want to query the change of the average salary of each department with the increase of employees, as shown in the following table, it will be difficult to do without window function query.

emp_id dep_name salary avg
7 develop 4200 4200
8 develop 6000 5100
9 develop 4500 4900
10 develop 5200 4975
11 develop 5200 5020
2 personnel 3900 3900
5 personnel 3500 3700
1 sales 5000 5000
3 sales 4800 4900
4 sales 4800 4866.66666666667

If the window function is used, it can still be easily completed, as follows:


SELECT emp_no, dep_name, salary,
  avg(salary) over(partition by dep_name order by emp_no)
FROM public.emp_salary
order by dep_name, emp_no;

It can be seen that window functions have great advantages in computing the related rows in the query results.

summary

Above is the whole content of this article. I hope the content of this article has some reference value for your study or work. Thank you for your support to developpaer.