How to Pivot in PostgreSQL

Question 1

Having the following data in a table:

ID Category Value
1234 Cat01 V001
1234 Cat02 V002
1234 Cat03 V003
1234 Cat03 V004
1234 Cat03 V005

I want to have the following output:

ID Cat01 Cat02 Cat03
1234 V001 V002 V003
1234 V001 V002 V004
1234 V001 V002 V005

The output I want is a kind of pivot table: the values are stored vertically, and I want them laid out horizontally, with each category as a column. Some categories have multiple values. In that case, I need to repeat the values of all other categories and create one row per repeated value.

How can it be done in PostgreSQL?

Answer 1

This is a tricky one. crosstab() expects one (or no) value per category for each row_name.
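To see the restriction on the sample data, here is a minimal sketch of a plain two-parameter crosstab() call. It assumes the tablefunc extension is installed in the current database, and it inlines the question's rows as a VALUES list so no table is needed. The call returns a single row for id 1234, so only one of the three Cat03 values can appear in it.

```sql
-- Assumes: CREATE EXTENSION tablefunc; has been run in this database.
-- The sample rows are inlined; t(id, category, value) is an illustrative alias.
SELECT *
FROM   crosstab(
   $$SELECT id, category, value
     FROM  (VALUES (1234, 'Cat01'::text, 'V001'::text)
                 , (1234, 'Cat02', 'V002')
                 , (1234, 'Cat03', 'V003')
                 , (1234, 'Cat03', 'V004')
                 , (1234, 'Cat03', 'V005')) t(id, category, value)
     ORDER  BY 1, 2, 3$$
 , $$VALUES ('Cat01'::text), ('Cat02'), ('Cat03')$$
   ) AS ct (id int, cat01 text, cat02 text, cat03 text);
```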

We can work around this restriction like this:

```
SELECT id
     , COALESCE(cat01, max(cat01) OVER w) AS cat01  -- fill gaps with the per-id maximum
     , COALESCE(cat02, max(cat02) OVER w) AS cat02
     , COALESCE(cat03, max(cat03) OVER w) AS cat03
FROM   crosstab(
   'SELECT id::text || row_number() OVER (PARTITION BY id, category ORDER BY value) * -1 AS ext_id
         , id, category, value
    FROM   tbl
    ORDER  BY ext_id, category, value'
 , $$VALUES ('Cat01'::text), ('Cat02'), ('Cat03')$$
   ) AS ct (xid text, id int, cat01 text, cat02 text, cat03 text)
WINDOW w AS (PARTITION BY id);
```

Returns your desired result.

How?

Add an extended id, ext_id, built from the existing id and a row number for each value of the category within the same id. This way we get as many rows per id as there are values in the category with the most values. We get a derived table like this to build our crosstab() on:
```
 ext_id   | id   | category | value
----------+------+----------+--------
 '1234-1' | 1234 | 'Cat01'  | 'V001'
 '1234-1' | 1234 | 'Cat02'  | 'V002'
 '1234-1' | 1234 | 'Cat03'  | 'V003'
 '1234-2' | 1234 | 'Cat03'  | 'V004'
 '1234-3' | 1234 | 'Cat03'  | 'V005'
```
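The derived table can be reproduced on its own, without crosstab(). The sketch below runs the inner query from the answer against the question's sample rows, supplied here through a CTE named tbl (an assumption for illustration, so no real table is required).

```sql
-- Sample rows from the question, inlined as a CTE that stands in for the real table.
WITH tbl (id, category, value) AS (
   VALUES (1234, 'Cat01'::text, 'V001'::text)
        , (1234, 'Cat02', 'V002')
        , (1234, 'Cat03', 'V003')
        , (1234, 'Cat03', 'V004')
        , (1234, 'Cat03', 'V005')
   )
-- row_number() * -1 yields -1, -2, ...; appended to id::text this gives '1234-1', '1234-2', ...
SELECT id::text || row_number() OVER (PARTITION BY id, category ORDER BY value) * -1 AS ext_id
     , id, category, value
FROM   tbl
ORDER  BY ext_id, category, value;
```

Because `*` binds more tightly than `||`, the row number is negated first and only then concatenated to the id as text.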
Now we can feed it to crosstab() using the safe 2-parameter form for missing attributes. Read the basics first if you are not familiar with this:

PostgreSQL Crosstab Query

The original id is carried over as "extra column". See:
Pivot on Multiple Columns using Tablefunc

Your question leaves room for interpretation. My solution pairs the lowest values per category first and keeps filling the following rows until there are no values left. (Multiple values per category could be combined in other ways; the question does not define this.) If a category is short of values for a given id, the rest is filled in with NULL values.
In the final step I replace those NULL values with the maximum value of each category per id:
```
 COALESCE(cat01, max(cat01) OVER (PARTITION BY id))
```
For a category with a single value per id, this is effectively the same as:
```
 max(cat01) OVER (PARTITION BY id)
```
For a category with several values, COALESCE keeps each row's own value and falls back to the window function only where the value is NULL. I am hoping this also makes it slightly faster.
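For comparison, the same pairing logic can be sketched without tablefunc, using conditional aggregation in place of crosstab(). This is a swapped-in technique, not the method of the answer above. The CTE names (tbl, numbered, pivoted) and the column rn are illustrative assumptions, and the aggregate FILTER clause needs PostgreSQL 9.4 or later.

```sql
WITH tbl (id, category, value) AS (          -- sample rows from the question
   VALUES (1234, 'Cat01'::text, 'V001'::text)
        , (1234, 'Cat02', 'V002')
        , (1234, 'Cat03', 'V003')
        , (1234, 'Cat03', 'V004')
        , (1234, 'Cat03', 'V005')
   )
, numbered AS (                              -- number the values within each (id, category)
   SELECT id, category, value
        , row_number() OVER (PARTITION BY id, category ORDER BY value) AS rn
   FROM   tbl
   )
, pivoted AS (                               -- one output row per (id, rn), one column per category
   SELECT id, rn
        , max(value) FILTER (WHERE category = 'Cat01') AS cat01
        , max(value) FILTER (WHERE category = 'Cat02') AS cat02
        , max(value) FILTER (WHERE category = 'Cat03') AS cat03
   FROM   numbered
   GROUP  BY id, rn
   )
SELECT id                                    -- fill NULL gaps with the per-id maximum
     , COALESCE(cat01, max(cat01) OVER w) AS cat01
     , COALESCE(cat02, max(cat02) OVER w) AS cat02
     , COALESCE(cat03, max(cat03) OVER w) AS cat03
FROM   pivoted
WINDOW w AS (PARTITION BY id)
ORDER  BY id, rn;
```

The category list is hard-coded here just as it is in the crosstab() column definition list, so neither variant adapts to new categories on its own.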

Answer 2

Take a look here for an example of how to use the CROSSTAB function. Also, take a good look at Erwin Brandstetter's post in the same thread and the links within it (especially the "Basics for crosstab():" link).

Be careful with NULLs (see the discussion in the link).

If you're not using a PostgreSQL version compiled from source, then all you have to do to access the CROSSTAB function is to run

```
CREATE EXTENSION tablefunc;
```

as an SQL statement, in psql or any other SQL client, while connected to the database where you need it (see EB link).
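A minimal sketch of installing the extension and confirming that it is registered. It assumes the tablefunc files are available on the server and that your role has sufficient privileges to create extensions.

```sql
-- Install tablefunc in the current database unless it is already there.
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Confirm that the extension is registered and check its version.
SELECT extname, extversion
FROM   pg_extension
WHERE  extname = 'tablefunc';
```

Extensions are installed per database, so the statement has to be repeated in every database that should use crosstab().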

I'm not sure that I've totally grasped your supplementary information, but maybe the CTE approach I used to cross join status and slots (your equivalent of category and value) might help. If not, please expand on your comment. EB's code might also be of help.

Answer 3

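What you are looking for is a crosstab. Assuming your table is called "Fact_Table", write the query below. crosstab() returns a set of records, so the call needs a column definition list, and the source query should be ordered by row name. Note that this one-parameter form returns a single row per id, so it keeps only one of the Cat03 values:

```
select * from crosstab('select id, category, value from Fact_Table order by 1, 2')
  as ct(id int, cat01 text, cat02 text, cat03 text);
```

Also see http://www.postgresql.org/docs/9.5/static/tablefunc.html if you are looking for other variants.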

Comment

'tablefunc' is an extension

score 6 · Answer 1 · 2016-04-18 01:48:40Z

Vérace · Answer 2 · 2016-04-12 21:51:51Z

Razvan Popovici · Answer 3 · 2016-04-12 17:04:31Z

Sahap Asci · Comment · 2016-04-12 18:22:10Z
