In short: I would like to use this input:
+---+---+---+
| x | y | z |
+---+---+---+
| 1 | 1 | a |
| 1 | 2 | b |
| 1 | 3 | c |
| 2 | 1 | d |
| 2 | 2 | e |
| 2 | 3 | f |
| 3 | 1 | g |
| 3 | 2 | h |
| 3 | 3 | i |
| . | . | . |
| n | . | . |
+---+---+---+
to generate this output:
+---+---------+---------+---------+---------+
| y | z (x=1) | z (x=2) | z (x=3) | z (x=n) |
+---+---------+---------+---------+---------+
| 1 | a | d | g | . |
| 2 | b | e | h | . |
| 3 | c | f | i | . |
+---+---------+---------+---------+---------+
Table sample:
CREATE TABLE "public"."data" (
"x" text NOT NULL,
"y" text NOT NULL,
"z" text NOT NULL
);
- The goal is to generate the output, in the most efficient way possible.
- max(x) will increase over time (->n)
- max(y) should remain constant but may increase by ~10%
- dynamic creation of z(x) columns & names
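For context, crosstab() is not built in; it comes from the tablefunc extension, which has to be installed once per database (assuming it isn't already):

```sql
-- crosstab() lives in the tablefunc extension
CREATE EXTENSION IF NOT EXISTS tablefunc;
```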
So far I have the following:
select * from crosstab('select y, x, z from data order by 1,2')
as ct (y varchar, x1z varchar, x2z varchar, x3z varchar,
x4z varchar, x5z varchar, x6z varchar)
;
which seems to work well (so far):
+----+-----+-----+-----+-----+-----+-----+
| y | x1z | x2z | x3z | x4z | x5z | x6z |
+----+-----+-----+-----+-----+-----+-----+
| 10 | fo | ob | ar | fo | ob | ar |
| 20 | ob | ar | fo | ob | ar | fo |
| 30 | ar | fo | ob | ar | fo | ob |
+----+-----+-----+-----+-----+-----+-----+
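One caveat with the single-argument crosstab() form used above: if a (y, x) combination is missing in the data, the remaining values shift left into the wrong columns. The two-argument form from tablefunc takes a second query listing the categories and keeps columns aligned; a sketch against the same data table:

```sql
-- the 2-argument form pads missing (y, x) combinations with NULL
select *
from crosstab(
    'select y, x, z from data order by 1, 2',
    'select distinct x from data order by 1'  -- one row per output column
) as ct (y varchar, x1z varchar, x2z varchar, x3z varchar,
         x4z varchar, x5z varchar, x6z varchar);
```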
In the previous SQL snippet, I manually defined the static column names.
These should be based on the x values, hence the 'dynamic' requirement, matching:
select array (select distinct x from data order by x)
| x_campaigns |
| ------------------------- |
| ["1","2","3","4","5","6"] |
Another example to add clarity:
- using the same crosstab SQL snippet, with arbitrarily defined column names
- these column names should be dynamically defined; in this example they would be 'worldcup' + 'year'
- in the previous case only 'x' is required, as is
CREATE TABLE world_cup(
year varchar(5),
game varchar(5),
score varchar(5))
;
-- insert values ...
select * from crosstab('select game, year, score from world_cup order by 1,2')
as ct (game varchar, WorldCup17 varchar, WorldCup18 varchar,
WorldCup19 varchar, WorldCup20 varchar, WorldCup21 varchar, WorldCup22 varchar)
+-------+------------+------------+------------+------------+------------+------------+
| match | worldcup17 | worldcup18 | worldcup19 | worldcup20 | worldcup21 | worldcup22 |
+-------+------------+------------+------------+------------+------------+------------+
| DE_FR | 2-2 | 1-1 | 0-0 | 3-2 | 0-2 | 1-2 |
| EN_DE | 2-0 | 0-2 | 2-1 | 0-0 | 3-0 | 0-0 |
| ES_FR | 0-1 | 0-0 | 1-5 | 0-5 | 1-1 | 3-1 |
+-------+------------+------------+------------+------------+------------+------------+
Thoughts?
Version: PostgreSQL 13.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
-
"these should be based on x values & hence dynamic" - not possible. One fundamental restriction of the SQL language is that the number, names and data types of a query's columns must be known to the database engine while the query is parsed/analyzed. The columns can't be "defined" while the data is retrieved. SQL wasn't designed for this. Crosstab reports are much better done in the application displaying those results. – user1822, Nov 22, 2022 at 7:03
-
This is PIVOT (in PostgreSQL terms - crosstab). If your values list and, hence, output structure is dynamic, then you'd use dynamic SQL: build and execute the proper crosstab query. – Akina, Nov 22, 2022 at 7:25
-
@a_horse_with_no_name, only the column names; the structure remains constant, e.g. the next set is 100-103 – NorthernMonkey, Nov 22, 2022 at 7:45
-
@Akina, an example? – NorthernMonkey, Nov 22, 2022 at 7:50
-
If the column names change then this by definition changes the structure of the query. – user1822, Nov 22, 2022 at 7:50
1 Answer
As mentioned in the comments, it's impossible to create a query that returns a different number of columns each time you run it. In general I recommend doing this kind of pivot/crosstab in the frontend (UI).
Possible alternatives are to aggregate into a JSON value:
select y,
       jsonb_object_agg(concat('x', x, 'z'), z) as xz
from data
group by y
order by y;
This returns something like this:
 y  | xz
----+--------------------------------------------------------------------------------
 10 | {"x1z": "fo", "x2z": "ob", "x3z": "ar", "x4z": "fo", "x5z": "ob", "x6z": "ar"}
 20 | {"x1z": "ob", "x2z": "ar", "x3z": "fo", "x4z": "ob", "x5z": "ar", "x6z": "fo"}
 30 | {"x1z": "ar", "x2z": "fo", "x3z": "ob", "x4z": "ar", "x5z": "fo", "x6z": "ob"}
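The application can then pick values out of the JSON document directly, or you can do it in SQL with the ->> operator (the key names here match what the aggregation produces):

```sql
-- extract individual keys from the aggregated JSONB document
select y,
       xz ->> 'x1z' as x1z,
       xz ->> 'x2z' as x2z
from (
    select y,
           jsonb_object_agg(concat('x', x, 'z'), z) as xz
    from data
    group by y
) t
order by y;
```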
Another alternative is to write a procedure that dynamically creates a view that does the pivot/crosstab. I am not a fan of the crosstab() function and prefer filtered aggregation:
select y,
max(z) filter (where x = '1') as x1z,
max(z) filter (where x = '2') as x2z,
max(z) filter (where x = '3') as x3z,
max(z) filter (where x = '4') as x4z,
max(z) filter (where x = '5') as x5z,
max(z) filter (where x = '6') as x6z
from data
group by y
order by y;
This statement follows a pattern that can be automated to dynamically create a view based on the filtered aggregation.
create or replace procedure create_crosstab_view()
as
$$
declare
  l_sql text;
begin
  -- build a CREATE VIEW statement with one filtered aggregate per distinct x;
  -- %L quotes the literal, %I quotes the generated column name
  select 'create view crosstab_view as select y, '||
         string_agg(format('max(z) filter (where x = %L) as %I', x, concat('x', x, 'z')), ', ' order by x)||
         ' from data group by y'
  into l_sql
  from (
    select distinct x from data
  ) t;

  execute 'drop view if exists crosstab_view cascade';
  execute l_sql;
end;
$$
language plpgsql;
After calling the procedure with call create_crosstab_view(); you can simply run
select *
from crosstab_view;
If the source data changes, you re-create the view by running the procedure again. You could put that into a trigger if you want to.
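A minimal sketch of such a trigger (statement-level, so the view is rebuilt once per statement rather than once per row; the function and trigger names are my own choices):

```sql
-- trigger function that rebuilds the view via the procedure above
create or replace function refresh_crosstab_view()
returns trigger
language plpgsql
as $$
begin
    call create_crosstab_view();  -- re-generate crosstab_view from current data
    return null;                  -- return value of an AFTER trigger is ignored
end;
$$;

create trigger data_changed
after insert or update or delete on data
for each statement
execute function refresh_crosstab_view();
```

Note that rebuilding a view on every write is only sensible for small, rarely changing tables; otherwise run the procedure on demand.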
-
+1 I cannot upvote, I am too n00b & don't have enough points yet... – NorthernMonkey, Nov 23, 2022 at 9:15
-
ah but I can accept... game on – NorthernMonkey, Nov 23, 2022 at 9:15