
I have a table deflator that is defined as:

 Table "deflator"
 Column | Type | Modifiers
-------------+-------------------+-----------
country_code | smallint | not null
country_name | character varying | not null
year | smallint | not null
deflator | numeric |
source | character varying |

Sample output from this table looks like:

country_code | country_name | year | deflator | source
-------------+---------------+------+----------+----------
 1 | country_1 | 2016 | 12 | source_1
 1 | country_1 | 2015 | 11 | source_2
 1 | country_1 | 2014 | 10 | source_2
 2 | country_2 | 2016 | 15 | source_1
 2 | country_2 | 2015 | 14 | source_1
 2 | country_2 | 2014 | 13 | source_2
 3 | country_3 | 2016 | 18 | source_1
 3 | country_3 | 2015 | 17 | source_2
 3 | country_3 | 2014 | 16 | source_3
(9 rows)

I use the following query to pivot the table if I exclude the column source:

SELECT
 *
FROM CROSSTAB (
 'SELECT
 country_code
 , country_name
 , year
 , deflator
 FROM dimension.master_oecd_deflator
 ORDER BY 1;'
 , $$ VALUES ('2014'::TEXT), ('2015'::TEXT), ('2016'::TEXT) $$
) AS "ct" (
 "country_code" SMALLINT
 , "country_name" TEXT
 , "2014" NUMERIC
 , "2015" NUMERIC
 , "2016" NUMERIC
);

The above query gives me:

country_code | country_name | 2016 | 2015 | 2014
-------------+--------------+------+------+------
           1 | country_1    |   12 |   11 |   10
           2 | country_2    |   15 |   14 |   13
           3 | country_3    |   18 |   17 |   16

But because the source of the deflator varies from year to year for each country, I want to include the source column in the pivot, so that my desired output looks like:

country_code | country_name | 2016 | 2016_source | 2015 | 2015_source | 2014 | 2014_source
-------------+-------------------+------+-------------+------+-------------+------+------------
 1 | country_1 | 12 | source_1 | 11 | source_2 | 10 | source_2
 2 | country_2 | 15 | source_1 | 14 | source_1 | 13 | source_2
 3 | country_3 | 18 | source_1 | 17 | source_2 | 16 | source_3

How do I modify this query to give me the desired output? (With the source for each year listed next to the year). Is this even possible?

asked Dec 14, 2016 at 19:51

2 Answers


Yes, it is possible. Here is the solution:

WITH cte AS 
 ( SELECT * 
 FROM CROSSTAB 
 ( 'SELECT country_code, country_name, year, 
 deflator || '',''|| source
 FROM deflator 
 ORDER BY 1;', 
 $$ VALUES ('2014'::TEXT), ('2015'::TEXT), ('2016'::TEXT) $$
 ) AS "ct" ( "country_code" SMALLINT, 
 "country_name" TEXT , 
 "2014" text, "2015" text, "2016" text
 )
 )
SELECT country_code, country_name, 
 split_part("2014",',',1) AS "2014", 
 split_part("2014",',',2) AS "2014_source", 
 split_part("2015",',',1) AS "2015", 
 split_part("2015",',',2) AS "2015_source", 
 split_part("2016",',',1) AS "2016", 
 split_part("2016",',',2) AS "2016_source" 
FROM cte ;
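
One caveat that may be worth checking: both deflator and source are nullable in the table definition, and in PostgreSQL the || operator returns NULL as soon as one operand is NULL. A row with a missing source would therefore lose its deflator in the pivot as well. A minimal illustration, with literal values standing in for the table columns:

```sql
-- '||' yields NULL if either side is NULL
SELECT 12::numeric || ',' || NULL::text AS combined;      -- NULL

-- concat() treats NULL as an empty string, so the deflator survives
SELECT concat(12::numeric, ',', NULL::text) AS combined;  -- '12,'
```

If that matters for your data, concat(deflator, '','', source) could replace the || expression inside the quoted crosstab query (note the doubled single quotes).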
dw8547
answered Dec 15, 2016 at 9:49

Saddam has a smart solution, but it has some weaknesses. Imagine a source named 'Fresno, CA' (with a comma in the string): split_part() would be fooled by the separator character in the string ...
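
A quick way to see the problem, using a made-up combined value of the form produced by deflator || ',' || source (the literal is only an illustration, not data from the question):

```sql
-- The second field stops at the next comma, so the source gets truncated
SELECT split_part('12,Fresno, CA', ',', 1) AS deflator   -- '12'
     , split_part('12,Fresno, CA', ',', 2) AS source;    -- 'Fresno' (' CA' is lost)
```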

To avoid such corner case problems and preserve original data types, use a (well-defined!) row type instead. You can create a composite type permanently with CREATE TYPE or register a temporary one with CREATE TEMP TABLE:

CREATE TEMP TABLE defso (def numeric, so varchar); -- once per session!
SELECT country_code
 , country_name
 , (d14).def AS deflator_2014 -- note the parentheses!
 , (d14).so AS source_2014
 , (d15).def AS deflator_2015
 , (d15).so AS source_2015
 , (d16).def AS deflator_2016
 , (d16).so AS source_2016
FROM crosstab (
 'SELECT country_code, country_name, year, (deflator, source)::defso
 FROM deflator
 ORDER BY 1'
 , 'SELECT generate_series(2014, 2016)::int2'
 ) AS ct (country_code int2
 , country_name text
 , d14 defso
 , d15 defso
 , d16 defso
 );

I also removed the unnecessary CTE and simplified a bit.
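
For the permanent variant mentioned above, a minimal sketch could look like this. It assumes you have the privileges to install extensions and create types in the current database, and that it is used instead of the CREATE TEMP TABLE step, not in addition to it:

```sql
-- crosstab() is provided by the additional module tablefunc; install once per database
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- permanent composite type with the same field names as the temp table above
CREATE TYPE defso AS (def numeric, so varchar);
```

With the type registered permanently, the crosstab query works unchanged in every session.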


While dealing with only a handful of years, you can do without crosstab() and use self-joins:

SELECT country_code, country_name
 , d14.deflator AS deflator_2014
 , d14.source AS source_2014
 , d15.deflator AS deflator_2015
 , d15.source AS source_2015
 , d16.deflator AS deflator_2016
 , d16.source AS source_2016
FROM (SELECT * FROM deflator WHERE year = int2 '2014') d14
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2015') d15 USING (country_code, country_name)
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2016') d16 USING (country_code, country_name)
ORDER BY country_code;

I use FULL [OUTER] JOIN since we can't assume a row for every combination of (country_code, year). This way we get the same result as with the crosstab query above.

Including country_name in the join condition seems redundant, but if we don't, we have to use COALESCE(d14.country_name, d15.country_name, d16.country_name) AS country_name to defend against missing rows. This functionally dependent value shouldn't be in the table to begin with; it belongs in a country table in a properly normalized schema.
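
For comparison, here is a sketch of the alternative just described: the self-join query from above, joining on country_code only and picking the name with COALESCE. It assumes country_name is the same in every row of a given country_code:

```sql
SELECT country_code
     , COALESCE(d14.country_name, d15.country_name, d16.country_name) AS country_name
     , d14.deflator AS deflator_2014
     , d14.source   AS source_2014
     , d15.deflator AS deflator_2015
     , d15.source   AS source_2015
     , d16.deflator AS deflator_2016
     , d16.source   AS source_2016
FROM      (SELECT * FROM deflator WHERE year = int2 '2014') d14
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2015') d15 USING (country_code)
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2016') d16 USING (country_code)
ORDER  BY country_code;
```

The USING clause merges country_code into a single output column, so only country_name needs the explicit COALESCE.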

answered Dec 15, 2016 at 13:43
