提要:在本教程中,您将学习如何使用PostgreSQL窗口函数对与当前行相关的一组行进行计算。

简介

我们将创建以下表来作演示:

CREATE TABLE product_groups (
 group_id serial PRIMARY KEY,
 group_name VARCHAR (255) NOT NULL
);

CREATE TABLE products (
 product_id serial PRIMARY KEY,
 product_name VARCHAR (255) NOT NULL,
 price DECIMAL (11, 2),
 group_id INT NOT NULL,
 FOREIGN KEY (group_id) REFERENCES product_groups (group_id)
);

INSERT INTO product_groups (group_name)
VALUES
 ('Smartphone'),
 ('Laptop'),
 ('Tablet');

INSERT INTO products (product_name, group_id,price)
VALUES
 ('Microsoft Lumia', 1, 200),
 ('HTC One', 1, 400),
 ('Nexus', 1, 500),
 ('iPhone', 1, 900),
 ('HP Elite', 2, 1200),
 ('Lenovo Thinkpad', 2, 700),
 ('Sony VAIO', 2, 700),
 ('Dell Vostro', 2, 800),
 ('iPad', 3, 700),
 ('Kindle Fire', 3, 150),
 ('Samsung Galaxy Tab', 3, 200);

理解窗口函数的最简单的方法是先来看看聚合函数。聚合函数将来自一组行的数据聚合到单行中。

比如,下例使用 AVG() 函数计算产品的平均价格:

SELECT
 AVG (price)
FROM
 products;
         avg
----------------------
 636.3636363636363636
(1 row)

要将聚合函数应用于行的子集,可以使用 GROUP BY 子句。以下语句返回每个产品组的平均价格。

SELECT
 group_name,
 AVG (price)
FROM
 products
INNER JOIN product_groups USING (group_id)
GROUP BY
 group_name;
 group_name |         avg
------------+----------------------
 Tablet     | 600
 Smartphone | 500
 Laptop     | 800
(3 rows)

如您所见,在两种情况下,AVG() 聚合函数都会减少查询返回的行数。

与集合函数一样,窗口函数对一组行进行操作,但不会减少查询返回的行数。

术语 窗口(window) 描述了窗口函数在其上运行的一组行。 窗口函数从窗口中的行中返回一个值。

例如,以下查询将返回产品名称,价格,产品组名称以及每个产品组的平均价格。

SELECT
 product_name,
 price,
 group_name,
 AVG (price) OVER (PARTITION BY group_name)
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    |  price  | group_name |         avg
--------------------+---------+------------+----------------------
 HP Elite           |  900.00 | Laptop     | 800
 Lenovo Thinkpad    | 1100.00 | Laptop     | 800
 Sony VAIO          |  600.00 | Laptop     | 800
 Dell Vostro        |  600.00 | Laptop     | 800
 Microsoft Lumia    |  300.00 | Smartphone | 500
 HTC One            |  400.00 | Smartphone | 500
 Nexus              |  500.00 | Smartphone | 500
 iPhone             |  800.00 | Smartphone | 500
 iPad               |  700.00 | Tablet     | 600
 Kindle Fire        |  300.00 | Tablet     | 600
 Samsung Galaxy Tab |  800.00 | Tablet     | 600
(11 rows)

在这个查询中,AVG() 函数作为一个窗口函数,对 OVER 子句指定的一组行进行操作。每组行被称为一个窗口。

这个查询的新语法是 AVG (price) OVER (PARTITION BY group_name) 子句。

在幕后,PostgreSQL 按照 group_name 列中的值对行进行排序,PARTITION BY 将行分组,AVG() 函数计算每个产品组的平均价格。

JOINWHEREGROUP BYHAVING 子句完成后,但在 ORDER BY 之前,窗口函数对结果集执行计算。

窗口函数语法

PostgreSQL 为窗口函数调用提供了一个复杂的语法:

window_function(arg1, arg2,..) OVER (PARTITION BY expression ORDER BY expression)
  • window_function(arg1,arg2,...) 是一个窗口函数。
  • 要将行分组或分区,可以使用 PARTITION BY 子句
  • 要对分区内的行进行排序,请使用 ORDER BY子句。

ROW_NUMBER, RANK, 和 DENSE_RANK 函数

ROW_NUMBER(),RANK()DENSE_RANK() 函数根据它们的顺序将整数值分配给行。

ROW_NUMBER() 函数为每个分区中的行分配一个正在运行的序列号。 请参阅以下查询:

SELECT
 product_name,
 group_name,
 price,
 ROW_NUMBER () OVER (
 PARTITION BY group_name
 ORDER BY
 price
 )
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    | group_name |  price  | row_number
--------------------+------------+---------+------------
 Dell Vostro        | Laptop     |  600.00 |          1
 Sony VAIO          | Laptop     |  600.00 |          2
 HP Elite           | Laptop     |  900.00 |          3
 Lenovo Thinkpad    | Laptop     | 1100.00 |          4
 Microsoft Lumia    | Smartphone |  300.00 |          1
 HTC One            | Smartphone |  400.00 |          2
 Nexus              | Smartphone |  500.00 |          3
 iPhone             | Smartphone |  800.00 |          4
 Kindle Fire        | Tablet     |  300.00 |          1
 iPad               | Tablet     |  700.00 |          2
 Samsung Galaxy Tab | Tablet     |  800.00 |          3
(11 rows)

RANK() 函数在有序分区内分配序号。如果两行的值相同,则RANK()函数指定相同的序号,下一个序号将被跳过。

SELECT
 product_name,
 group_name,
  price,
 RANK () OVER (
 PARTITION BY group_name
 ORDER BY
 price
 )
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    | group_name |  price  | rank
--------------------+------------+---------+------
 Dell Vostro        | Laptop     |  600.00 |    1
 Sony VAIO          | Laptop     |  600.00 |    1
 HP Elite           | Laptop     |  900.00 |    3
 Lenovo Thinkpad    | Laptop     | 1100.00 |    4
 Microsoft Lumia    | Smartphone |  300.00 |    1
 HTC One            | Smartphone |  400.00 |    2
 Nexus              | Smartphone |  500.00 |    3
 iPhone             | Smartphone |  800.00 |    4
 Kindle Fire        | Tablet     |  300.00 |    1
 iPad               | Tablet     |  700.00 |    2
 Samsung Galaxy Tab | Tablet     |  800.00 |    3
(11 rows)

RANK() 函数类似,DENSE_RANK() 函数在有序分区内分配序号,但序号是连续的。

SELECT
 product_name,
 group_name,
 price,
 DENSE_RANK () OVER (
 PARTITION BY group_name
 ORDER BY
 price
 )
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    | group_name |  price  | dense_rank
--------------------+------------+---------+------------
 Dell Vostro        | Laptop     |  600.00 |          1
 Sony VAIO          | Laptop     |  600.00 |          1
 HP Elite           | Laptop     |  900.00 |          2
 Lenovo Thinkpad    | Laptop     | 1100.00 |          3
 Microsoft Lumia    | Smartphone |  300.00 |          1
 HTC One            | Smartphone |  400.00 |          2
 Nexus              | Smartphone |  500.00 |          3
 iPhone             | Smartphone |  800.00 |          4
 Kindle Fire        | Tablet     |  300.00 |          1
 iPad               | Tablet     |  700.00 |          2
 Samsung Galaxy Tab | Tablet     |  800.00 |          3
(11 rows)

FIRST_VALUE 和 LAST_VALUE 函数

FIRST_VALUE() 函数从有序集合的第一行返回第一个值,而 LAST_VALUE() 函数返回结果集最后一行的最后一个值。

FIRST_VALUE() 函数举例:

SELECT
 product_name,
 group_name,
 price,
 FIRST_VALUE (price) OVER (
 PARTITION BY group_name
 ORDER BY
 price
 ) AS lowest_price_per_group
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    | group_name |  price  | lowest_price_per_group
--------------------+------------+---------+------------------------
 Dell Vostro        | Laptop     |  600.00 |                 600.00
 Sony VAIO          | Laptop     |  600.00 |                 600.00
 HP Elite           | Laptop     |  900.00 |                 600.00
 Lenovo Thinkpad    | Laptop     | 1100.00 |                 600.00
 Microsoft Lumia    | Smartphone |  300.00 |                 300.00
 HTC One            | Smartphone |  400.00 |                 300.00
 Nexus              | Smartphone |  500.00 |                 300.00
 iPhone             | Smartphone |  800.00 |                 300.00
 Kindle Fire        | Tablet     |  300.00 |                 300.00
 iPad               | Tablet     |  700.00 |                 300.00
 Samsung Galaxy Tab | Tablet     |  800.00 |                 300.00
(11 rows)

LAST_VALUE() 函数举例:

SELECT
 product_name,
 group_name,
 price,
 LAST_VALUE (price) OVER (
 PARTITION BY group_name
 ORDER BY
 price RANGE BETWEEN UNBOUNDED PRECEDING
 AND UNBOUNDED FOLLOWING
 ) AS highest_price_per_group
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    | group_name |  price  | highest_price_per_group
--------------------+------------+---------+-------------------------
 Dell Vostro        | Laptop     |  600.00 |                 1100.00
 Sony VAIO          | Laptop     |  600.00 |                 1100.00
 HP Elite           | Laptop     |  900.00 |                 1100.00
 Lenovo Thinkpad    | Laptop     | 1100.00 |                 1100.00
 Microsoft Lumia    | Smartphone |  300.00 |                  800.00
 HTC One            | Smartphone |  400.00 |                  800.00
 Nexus              | Smartphone |  500.00 |                  800.00
 iPhone             | Smartphone |  800.00 |                  800.00
 Kindle Fire        | Tablet     |  300.00 |                  800.00
 iPad               | Tablet     |  700.00 |                  800.00
 Samsung Galaxy Tab | Tablet     |  800.00 |                  800.00
(11 rows)

LAG 和 LEAD 函数

LAG() 函数能够访问前一行的数据,而 LEAD() 函数可以访问下一行的数据。它们的语法是一样的:

LAG  (expression [,offset] [,default])
LEAD (expression [,offset] [,default])
  • expression:一个用来计算返回值的字段或表达式。
  • offset:跳过的行数。默认是1
  • default:如果 offset 超出窗口的范围,返回的默认值。默认为 NULL

LAG() 示例:

SELECT
 product_name,
 group_name,
 price,
 LAG (price, 1) OVER (
 PARTITION BY group_name
 ORDER BY
 price
 ) AS prev_price,
 price - LAG (price, 1) OVER (
 PARTITION BY group_name
 ORDER BY
 price
 ) AS cur_prev_diff
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    | group_name |  price  | prev_price | price_diff
--------------------+------------+---------+------------+------------
 Dell Vostro        | Laptop     |  600.00 |            |
 Sony VAIO          | Laptop     |  600.00 |     600.00 |       0.00
 HP Elite           | Laptop     |  900.00 |     600.00 |     300.00
 Lenovo Thinkpad    | Laptop     | 1100.00 |     900.00 |     200.00
 Microsoft Lumia    | Smartphone |  300.00 |            |
 HTC One            | Smartphone |  400.00 |     300.00 |     100.00
 Nexus              | Smartphone |  500.00 |     400.00 |     100.00
 iPhone             | Smartphone |  800.00 |     500.00 |     300.00
 Kindle Fire        | Tablet     |  300.00 |            |
 iPad               | Tablet     |  700.00 |     300.00 |     400.00
 Samsung Galaxy Tab | Tablet     |  800.00 |     700.00 |     100.00
(11 rows)

LEAD() 示例:

SELECT
 product_name,
 group_name,
 price,
 LEAD (price, 1) OVER (
 PARTITION BY group_name
 ORDER BY
 price
 ) AS next_price,
 price - LEAD (price, 1) OVER (
 PARTITION BY group_name
 ORDER BY
 price
 ) AS cur_next_diff
FROM
 products
INNER JOIN product_groups USING (group_id);
    product_name    | group_name |  price  | next_price | cur_next_diff
--------------------+------------+---------+------------+---------------
 Dell Vostro        | Laptop     |  600.00 |     600.00 |          0.00
 Sony VAIO          | Laptop     |  600.00 |     900.00 |       -300.00
 HP Elite           | Laptop     |  900.00 |    1100.00 |       -200.00
 Lenovo Thinkpad    | Laptop     | 1100.00 |            |
 Microsoft Lumia    | Smartphone |  300.00 |     400.00 |       -100.00
 HTC One            | Smartphone |  400.00 |     500.00 |       -100.00
 Nexus              | Smartphone |  500.00 |     800.00 |       -300.00
 iPhone             | Smartphone |  800.00 |            |
 Kindle Fire        | Tablet     |  300.00 |     700.00 |       -400.00
 iPad               | Tablet     |  700.00 |     800.00 |       -100.00
 Samsung Galaxy Tab | Tablet     |  800.00 |            |
(11 rows)