Several methods of realizing PivotTable in Oracle are explained in detail

Time:2022-4-30

Pivot table is a very practical analysis function in Excel. It can be used to realize complex data classification, summary and comparative analysis. It is one of the necessary skills for data analysts and operators. Today, let’s talk about how to implement PivotTable in Oracle database.

The sample data used in this article can be downloaded here.

Use case expression to realize PivotTable report

The essence of PivotTable report is to group data according to different combinations of rows and columns, and then summarize the results; Therefore, it is very similar to the functions of group by and aggregate functions (count, sum, AVG, etc.) in the database.

First, we use the following group by clause to classify and summarize the sales data:

Select coalesce (product, '[all products]') "product",
    '[Channel]' [all channels',
    Coalesce (to_char (saledate, 'yyyymm'), '[all months]') "month",
    Sum (amount) "sales volume"
from sales_data
group by rollup (product,channel,to_char(saledate, 'YYYYMM'));

The above statements are summarized according to products, channels and months; Rollup option is used to generate subtotals, totals and totals of different levels; The coalesce function is used to display the null value in the summary row as the corresponding information. The results returned by the query are as follows:

Product | channel | month | sales volume|
———|———|———–|——-|
Orange | Jingdong | 201901 | 41289|
Orange | Jingdong | 201902 | 43913|
Orange | Jingdong | 201903 | 49803|
Orange | Jingdong | 201904 | 49256|
Orange | Jingdong | 201905 | 64889|
Orange | Jingdong | 201906 | 62649|
Orange | Jingdong | [all months] | 311799|
Orange storefront 201901 41306|
Orange | storefront | 201902 | 37906|
Orange | storefront | 201903 | 48866|
Orange | storefront | 201904 | 48673|
Orange storefront 201905 58998|
Orange storefront 201906 58931|
Oranges | storefront | [all months] | 294680|
Orange | Taobao | 201901 | 43488|
Orange | Taobao | 201902 | 37598|
Orange | Taobao | 201903 | 48621|
Orange | Taobao | 201904 | 49919|
Orange | Taobao | 201905 | 58530|
Orange | Taobao | 201906 | 64626|
Orange | Taobao | [all months] | 302782|
All channels|

Banana | [all channels] | [all months] | 925369|
[all products] | [all channels] | [all months] | 2771682|

In fact, we have obtained the summary results of the PivotTable report, but we need to display the data in different columns according to different months; That is, you need to convert rows into columns. This function can be realized by case expression:

Select coalesce (product, 'all products') "product", coalesce (channel,' all channels') "channel", 
    Sum (case to_char (saledate, 'yyyymm') when '201901' then amount else 0 end) "January",
    Sum (case to_char (saledate, 'yyyymm') when '201902' then amount else 0 end) "February",
    Sum (case to_char (saledate, 'yyyymm') when '201903' then amount else 0 end) "March",
    Sum (case to_char (saledate, 'yyyymm') when '201904' then amount else 0 end) "April",
    Sum (case to_char (saledate, 'yyyymm') when '201905' then amount else 0 end) "may",
    Sum (case to_char (saledate, 'yyyymm') when '201906' then amount else 0 end) "June",
    Total (sum)
from sales_data
group by rollup (product, channel);

The case expression in the first sum function only summarizes the sales volume in January 201901, and the sales volume in other months is set to 0; The sum function in the following is followed by analogy to obtain the sales summary of each month and the total of all months.

Total of products | channels | January | February | March | April | may | June ||
———-|———-|——|——|——|——|——|——|——-|
Orange | Jingdong | 41289 | 43913 | 49803 | 49256 | 64889 | 62649 | 311799|
Orange storefront 41306 37906 48866 48673 58998 58931 294680|
Orange | Taobao | 43488 | 37598 | 48621 | 49919 | 58530 | 64626 | 302782|
Orange | [all channels] | 126083 | 119417 | 147290 | 147848 | 182417 | 186206 | 909261|
Apple | JD | 38269 | 40593 | 56552 | 56662 | 64493 | 62045 | 318614|
Apple store | 43845 | 40539 | 44909 | 55646 | 56771 | 64933 | 306643|
Apple | Taobao | 42969 | 43289 | 48769 | 58052 | 58872 | 59844 | 311795|
Apple | all channels | 125083 124421 | 150230 | 170360 | 180136 | 186822 | 937052|
Banana | Jingdong | 36879 | 36981 | 51748 | 54801 | 64936 | 60688 | 306033|
Banana storefront | 41210 | 39420 | 50884 | 52085 | 60249 | 67597 | 311445|
Banana | Taobao | 42468 | 41955 | 52780 | 54971 | 56504 | 59213 | 307891|
Banana | [all channels] | 120557 | 118356 | 155412 | 161857 | 181689 | 187498 | 925369|
[all products] | [all channels] | 371723 | 362194 | 452932 | 480065 | 544242 | 560526 | 2771682|

The decode function in Oracle can also implement functions similar to case expressions.

The above method of realizing PivotTable report has certain limitations. If the sales volume from July to December needs to be counted, we need to modify the query statement to increase the processing of this part. Therefore, Oracle 11g introduces a new clause to realize automatic row column conversion: pivot.

Using pivot clause to implement PivotTable report

The pivot clause in Oracle is used to convert rows into columns. The basic syntax is as follows:


SELECT col1, col2, ...
FROM tbl
PIVOT (
 pivot_clause,
 pivot_for_clause,
 pivot_in_clause
);

The pivot clause consists of three parts:

  • pivot_ Clause, which defines the data to be summarized, that is, the aggregate function. For example, sum (amount) is used to summarize sales volume;
  • pivot_ for_ Clause, which specifies the fields that need to be converted from rows to columns. For example, use for saledate to display the data of each month as a column;
  • pivot_ in_ Clause, specifying that the pivot_ for_ Which data values in the claim field are converted to columns. For example, in (‘201901 ‘,’ 201902 ‘) means that only the data of 201901 and 201902 months are converted into columns.

For the example above, we can use the following pivot clause:


with d(saledate, product, channel, amount) as (
 select to_char(saledate, 'YYYYMM'),
     product,
     channel,
     amount
 from sales_data
)
select *
from d
pivot (
 sum(amount)
 for saledate
 in ('201901', '201902', '201903', '201904', '201905', '201906')
)
order by product, channel;

Among them, the pivot clause summarizes the sales volume by month and converts the month into column display. The returned results are as follows:

PRODUCT  |CHANNEL |’201901’|’201902’|’201903’|’201904’|’201905’|’201906’|
———|——–|——–|——–|——–|——–|——–|——–|
Orange | Jingdong | 41289 | 43913 | 49803 | 49256 | 64889 | 62649|
Orange storefront 41306 37906 48866 48673 58998 58931|
Orange | Taobao | 43488 | 37598 | 48621 | 49919 | 58530 | 64626|
Apple | JD | 38269 | 40593 | 56552 | 56662 | 64493 | 62045|
Apple store | 43845 | 40539 | 44909 | 55646 | 56771 | 64933|
Apple | Taobao | 42969 | 43289 | 48769 | 58052 | 58872 | 59844|
Banana | Jingdong | 36879 | 36981 | 51748 | 54801 | 64936 | 60688|
Banana storefront | 41210 | 39420 | 50884 | 52085 | 60249 | 67597|
Banana | Taobao | 42468 | 41955 | 52780 | 54971 | 56504 | 59213|

Next, we need to add a total row and total column. To do this, we can add sales first_ Data data is grouped for statistics, and then transformed using pivot clause:

with d(saledate, product, channel, amount) as (
 select to_char(saledate, 'YYYYMM'),
     product,
     channel,
     sum(amount)
 from sales_data
 group by rollup (to_char(saledate, 'YYYYMM'), product, channel)
), pt as (
 select *
 from d
 pivot (
  sum(amount)
  for saledate
  in ('201901' s01, '201902' s02, '201903' s03, '201904' s04, '201905' s05, '201906' s06)
 )
)
Select coalesce (product, '[all products]') "product",
    '[Channel]' [all channels',
    S01 "January", S02 "February", S03 "March", S04 "April", S05 "may", S06 "June",
    S01 + S02 + S03 + S04 + S05 + S06 "total"
from pt
order by product, channel;

We added a select query to the result returned by the pivot clause, and modified the name of the returned field to make the result closer to the excel PivotTable:

Total of products | channels | January | February | March | April | may | June ||
———-|———-|——|——|——|——|——|——|——-|
Orange | Jingdong | 41289 | 43913 | 49803 | 49256 | 64889 | 62649 | 311799|
Orange storefront 41306 37906 48866 48673 58998 58931 294680|
Orange | Taobao | 43488 | 37598 | 48621 | 49919 | 58530 | 64626 | 302782|
Orange | [all channels] | 126083 | 119417 | 147290 | 147848 | 182417 | 186206 | 909261|
Apple | JD | 38269 | 40593 | 56552 | 56662 | 64493 | 62045 | 318614|
Apple store | 43845 | 40539 | 44909 | 55646 | 56771 | 64933 | 306643|
Apple | Taobao | 42969 | 43289 | 48769 | 58052 | 58872 | 59844 | 311795|
Apple | all channels | 125083 124421 | 150230 | 170360 | 180136 | 186822 | 937052|
Banana | Jingdong | 36879 | 36981 | 51748 | 54801 | 64936 | 60688 | 306033|
Banana storefront | 41210 | 39420 | 50884 | 52085 | 60249 | 67597 | 311445|
Banana | Taobao | 42468 | 41955 | 52780 | 54971 | 56504 | 59213 | 307891|
Banana | [all channels] | 120557 | 118356 | 155412 | 161857 | 181689 | 187498 | 925369|
[all products] | [all channels] | 371723 | 362194 | 452932 | 480065 | 544242 | 560526 | 2771682|

Pivot clauses can also perform multiple aggregation operations at one time or group by multiple fields. For example:

with d(saledate, product, channel, amount) as (
 select to_char(saledate, 'YYYYMM'), product, channel, amount
 from sales_data
 where to_char(saledate, 'YYYYMM') in ('201901', '201902', '201903')
)
select *
from d
pivot (
 sum(amount)
 for (channel, saledate)
 In ('taobao ',' 201901 '), ('storefront', '201901'), ('jd ',' 201901 '),
   ('taobao ',' 201902 '), ('storefront', '201902'), ('jd ',' 201902 '),
   ('taobao ',' 201903 '), ('storefront', '201903'), ('jd ',' 201903 '))
);

Product | Taobao | ' 201901 '| storefront' ' 201901 | 'JD |' 201901 '| Taobao' ' 201902 '| storefront' ' 201902 | 'JD |' 201902 | 'Taobao |' 201903 '| storefront' ' 201903 | 'Jingdong |' 201903'|
-------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|
Banana | 42468 | 41210 | 36879 | 41955 | 39420 | 36981 | 52780 | 50884 | 51748|
Orange | 43488 | 41306 | 41289 | 37598 | 37906 | 43913 | 48621 | 48866 | 49803|
Apple | 42969 | 43845 | 38269 | 43289 | 40539 | 40593 | 48769 | 44909 | 56552|

The above query returns the summary results grouped by channel and month, and converts them into columns for display.

The opposite of pivot is unpivot, which converts columns into rows. We use the following example to convert the data after the row and column back:


with d(saledate, product, channel, amount) as (
 select to_char(saledate, 'YYYYMM'),
     product,
     channel,
     amount
 from sales_data
),
pt as (
 select *
 from d
 pivot (
  sum(amount)
  for saledate
  in ('201901' "201901", '201902' "201902", '201903' "201903", '201904' "201904", '201905' "201905", '201906' "201906")
 )
)
select * from pt
unpivot (
 amount
 for saledate
 IN ("201901", "201902", "201903", "201904", "201905", "201906")
);

Among them, the unpivot clause also has three options to convert the column represented by each month into a row in the saledate field, and convert the corresponding data into a row in the amount field. The results returned by the above query are as follows:

PRODUCT |CHANNEL |SALEDATE|AMOUNT|
——–|——–|——–|——|
Orange | Jingdong | 201901 | 41289|
Orange | Jingdong | 201902 | 43913|
Orange | Jingdong | 201903 | 49803|
Orange | Jingdong | 201904 | 49256|
Orange | Jingdong | 201905 | 64889|
Orange | Jingdong | 201906 | 62649|
Banana store 201901 41210|
Banana store | 201902 | 39420|
Banana store | 201903 | 50884|
Banana store | 201904 | 52085|
Banana store 201905 60249|
Banana store | 201906 | 67597|

If you want to unlock more pivot and unpivot gestures, you can refer to the definitions and examples in the official documents.

Using model clause to implement PivotTable report

In addition to the pivot clause, Oracle also provides a more powerful function: the model clause. In short, the model clause can realize the cell reference based on position and symbol and complex formula calculation in Excel and other spreadsheets.

The complete model clause is complex. Let’s take a direct look at an example:

with d(saledate, product, channel, amount) as (
 select to_char(saledate, 'YYYYMM'), product, channel, sum(amount)
 from sales_data
 group by rollup (to_char(saledate, 'YYYYMM'), product, channel)
)
Select coalesce (product, '[all products]') "product",
    '[Channel]' [all channels',
    S201901 "January", s201902 "February", s201903 "March", s201904 "April", s201905 "may", s201906 "June",
    Total "total"
from d
model 
 return updated rows
 partition by (product, channel)
 dimension by (saledate)
 measures (amount, 0 s201901, 0 s201902, 0 s201903, 0 s201904, 0 s201905, 0 s201906, 0 stotal)
 unique dimension
 rules upsert all
 (s201901[0] = amount['201901'],
  s201902[0] = amount['201902'],
  s201903[0] = amount['201903'],
  s201904[0] = amount['201904'],
  s201905[0] = amount['201905'],
  s201906[0] = amount['201906'],
  stotal[0] = sum(amount)[saledate between '201901' and '201906'])
order by product, channel;

First, get the basic data through the with clause. Then use the model clause to realize the row column; Return updated rows means that only the updated and inserted data of the calculation model is returned. Partition by is used to define partitions (products and channels). Each partition is calculated independently; Dimension by specifies the dimension of the measurement (month); Measures defines the measurement. The amount comes from the source table. 0 s201901 means to create a measurement s201901 and initialize it to 0; Unique dimension means that each cell in the model can be uniquely determined by adding the dimension by field; Rules is used to define the expression assigned to each measure. Upsert all means to update the existing cell. If it does not exist, the cell will be created; S201901 [0] is a reference to a cell by position (dimension is 1). Amount [‘201901’] represents the amount field value corresponding to month 201901, and total [0] is the sum of all months.

The results returned by the above statements are as follows:

Total of products | channels | January | February | March | April | may | June ||
———-|———-|——|——|——|——|——|——|——-|
Orange | Jingdong | 41289 | 43913 | 49803 | 49256 | 64889 | 62649 | 311799|
Orange storefront 41306 37906 48866 48673 58998 58931 294680|
Orange | Taobao | 43488 | 37598 | 48621 | 49919 | 58530 | 64626 | 302782|
Orange | [all channels] | 126083 | 119417 | 147290 | 147848 | 182417 | 186206 | 909261|
Apple | JD | 38269 | 40593 | 56552 | 56662 | 64493 | 62045 | 318614|
Apple store | 43845 | 40539 | 44909 | 55646 | 56771 | 64933 | 306643|
Apple | Taobao | 42969 | 43289 | 48769 | 58052 | 58872 | 59844 | 311795|
Apple | all channels | 125083 124421 | 150230 | 170360 | 180136 | 186822 | 937052|
Banana | Jingdong | 36879 | 36981 | 51748 | 54801 | 64936 | 60688 | 306033|
Banana storefront | 41210 | 39420 | 50884 | 52085 | 60249 | 67597 | 311445|
Banana | Taobao | 42468 | 41955 | 52780 | 54971 | 56504 | 59213 | 307891|
Banana | [all channels] | 120557 | 118356 | 155412 | 161857 | 181689 | 187498 | 925369|
[all products] | [all channels] | 371723 | 362194 | 452932 | 480065 | 544242 | 560526 | 2771682|

The dimension (by) and the dimension (by) in the array are allowed to be updated through the operation of dimension (by) and dimension (DIM) in the array. Among them, the rule supports wildcard and cyclic iteration, and the measurement can use aggregate function and window function.

Please refer to the official document for the complete usage posture of the model clause.

This is the end of this article about explaining several methods of implementing PivotTable in Oracle. For more information about Oracle PivotTable, please search the previous articles of developeppaer or continue to browse the relevant articles below. I hope you will support developeppaer in the future!