Few words about Postgresql's window functions
Ranking
Understanding the principle of window functions is easy as soon as you realize it has to do with rankings. Let’s take a practical example:
CREATE TABLE
production (
company_name VARCHAR PRIMARY KEY,
country VARCHAR NOT NULL,
production INTEGER NOT NULL
);
This table will store some company’s production statistics. Let’s fill it with data
INSERT INTO production (company_name ,country, production) VALUES
('futur corp', 'états-unis', 12),
('eep techno', 'états-unis', 14),
('keystop', 'états-unis', 8),
('vehiculo longo', 'brésil', 16),
('Kastarok', 'russie', 13),
('Moldonov', 'russie', 7),
('Glaß gmbh', 'allemagne', 8),
('Gien zu', 'chine', 11),
('chang intl', 'chine', 15),
('medong', 'chine', 12)
;
Of course it is easy to display the company list sorted by production but why not adding the ranking information ?
SELECT company_name, country, production, rank() OVER (ORDER BY production DESC) FROM production ;company_name | country | production | rank ----------------+------------+------------+------ vehiculo longo | brésil | 16 | 1 chang intl | chine | 15 | 2 eep techno | états-unis | 14 | 3 Kastarok | russie | 13 | 4 futur corp | états-unis | 12 | 5 medong | chine | 12 | 5 Gien zu | chine | 11 | 7 Glaß gmbh | allemagne | 8 | 8 keystop | états-unis | 8 | 8 Moldonov | russie | 7 | 10
You can see here the obvious advantages of using window functions, two companies have by example the same production, they have rown 8 and there is no 9th rank ! (This behavior can be changed by using the _dense_rank()_ function.). The rank() window function has here been invoqued over a unique partition of all results order by production DESC. This means we can add a rank information per country by example:
SELECT company_name, country, production, rank() OVER (PARTITION BY country ORDER BY production DESC) as country_rank FROM production ;company_name | country | production | country_rank ----------------+------------+------------+-------------- Glaß gmbh | allemagne | 8 | 1 vehiculo longo | brésil | 16 | 1 chang intl | chine | 15 | 1 medong | chine | 12 | 2 Gien zu | chine | 11 | 3 eep techno | états-unis | 14 | 1 futur corp | états-unis | 12 | 2 keystop | états-unis | 8 | 3 Kastarok | russie | 13 | 1 Moldonov | russie | 7 | 2
Here you see the two first results are marked with a rank of 1 because these countries own one factory. Note Postgresql has naturally sorted the results by country then ranking ASC. Of course you can filter on a rank but you have to use a sub select:
SELECT
company_name,
country,
production,
country_rank
FROM (
SELECT
company_name,
country,
production,
rank() OVER (PARTITION BY country ORDER BY production DESC) as country_rank
FROM
production
) sub_select
WHERE
sub_select.country_rank = 1
;
Window functions are not just about ranking. What if we want to indicate the maximum production for each country next to the _country_rank_ ?
SELECT company_name, country, production, first_value(production) OVER (PARTITION BY country ORDER BY production DESC) as max_country_production, rank() OVER (PARTITION BY country ORDER BY production DESC) as country_rank FROM production ORDER BY country, country_rank ASC ;company_name | country | production | max_country_production | country_rank ----------------+------------+------------+------------------------+-------------- Glaß gmbh | allemagne | 8 | 8 | 1 vehiculo longo | brésil | 16 | 16 | 1 chang intl | chine | 15 | 15 | 1 medong | chine | 12 | 15 | 2 Gien zu | chine | 11 | 15 | 3 eep techno | états-unis | 14 | 14 | 1 futur corp | états-unis | 12 | 14 | 2 keystop | états-unis | 8 | 14 | 3 Kastarok | russie | 13 | 13 | 1 Moldonov | russie | 7 | 13 | 2
Or maybe you want to display the total production for each country:
SELECT
company_name,
country,
production,
first_value(production) OVER (PARTITION BY country ORDER BY production DESC) as max_country_production,
sum(production) OVER (PARTITION BY country) AS country_production,
rank() OVER (PARTITION BY country ORDER BY production DESC) as country_rank
FROM
production
ORDER BY country_production DESC, country_rank ASC
;
company_name | country | production | max_country_production | country_production | country_rank
----------------+------------+------------+------------------------+--------------------+--------------
chang intl | chine | 15 | 15 | 38 | 1
medong | chine | 12 | 15 | 38 | 2
Gien zu | chine | 11 | 15 | 38 | 3
eep techno | états-unis | 14 | 14 | 34 | 1
futur corp | états-unis | 12 | 14 | 34 | 2
keystop | états-unis | 8 | 14 | 34 | 3
Kastarok | russie | 13 | 13 | 20 | 1
Moldonov | russie | 7 | 13 | 20 | 2
vehiculo longo | brésil | 16 | 16 | 16 | 1
Glaß gmbh | allemagne | 8 | 8 | 8 | 1
You may notice we have several partitions with the same definition above. We can regroup these definitions into one window statement:
SELECT
company_name,
country,
production,
first_value(production) OVER country_prod as max_country_production,
sum(production) OVER (PARTITION BY country) AS country_production,
rank() OVER country_prod as country_rank
FROM
production
WINDOW
country_prod AS (PARTITION BY country ORDER BY production DESC)
ORDER BY
country_production DESC,
country_rank ASC
;
Playing with partitions
As the window function knows its row it is possible to create cumulative statistics:
SELECT
company_name,
country,
production,
first_value(production) OVER country_prod as max_country_production,
rank() OVER country_prod as country_rank,
sum(production) OVER (PARTITION BY country ORDER BY production DESC RANGE UNBOUNDED PRECEDING) as cumul_prod
FROM
production
WHERE
country = 'chine'
WINDOW
country_prod AS (PARTITION BY country ORDER BY production DESC)
ORDER BY
country DESC,
country_rank ASC
;
Or cumulative production percentage:
SELECT
company_name,
country,
production,
rank() OVER (PARTITION BY country ORDER BY production DESC) AS country_rank,
sum(production) OVER (PARTITION BY country ORDER BY production DESC RANGE UNBOUNDED PRECEDING) / sum(production) OVER (PARTITION BY country) as cumul_prod
FROM
production
WHERE
country = 'chine'
ORDER BY
country DESC,
country_rank ASC
This is it, hope you have found this article useful. Take care !