Select column name that has the maximum value (only list the column names once in the SQL query)

Question 1

Table name: employment_by_industry

For each row, I want to select the column name that has the maximum value:

What is the most succinct way to do that in PostgreSQL, with the column names only being listed once in the SQL query?

For example, this is how it would be done in Oracle:

select objectid, max(col_name) keep (dense_rank first order by col_val desc) max_col_name
from employment_by_industry
unpivot (
    col_val
    for col_name in (
        agr_forest_fish, mining_quarry, mfg, electric, water_sew --cols only listed once
    )
)
group by objectid

https://dbfiddle.uk/DeaRoikf

Question 2

Is the number of objectids small and known in advance or can it change? And what do you do in the case of ties?

Question 3

And why do you have AGR_FOREST_FISH twice?

Question 4

@Vérace The number of OBJECTIDs is relatively small, yes. Thousands, not millions. The values are static in a table; they won't change. AGR_FOREST_FISH is listed twice in the desired output table because it is the column with the highest count in both row 1 and row 3.

Question 5

There is no need to explode the values out and regroup them. You can do it as in dbfiddle.uk/ZdJhvNZC (though that ignores the "only list once" requirement).

Question 6

So do you have your answer?

Question 7

Basic query

You can use a VALUES expression to unpivot, like Martin already suggested. Plus, if any value column can be null, use DESC NULLS LAST (in descending order, nulls sort first by default), and add a tiebreaker to ORDER BY to make the result deterministic:

SELECT e.objectid
     , (SELECT t.industry
        FROM  (
           VALUES
              (e.agr_forest_fish, 'agr_forest_fish')
            , (e.mining_quarry  , 'mining_quarry')
            , (e.mfg            , 'mfg')
            -- ...
           ) t (val, industry)
        ORDER  BY t.val DESC NULLS LAST, t.industry  -- industry as tiebreaker
        LIMIT  1)
FROM   employment_by_industry e;
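
To try the basic query end to end, here is a minimal sketch with made-up data. The table definition and all values are assumptions (the post does not show the real data); they are chosen only so that row 1 contains a null and row 3 contains a tie, which exercises NULLS LAST and the tiebreaker. The temporary table shadows any permanent table of the same name for the current session only.

```sql
-- Hypothetical sample table; column list taken from the question, values invented.
CREATE TEMP TABLE employment_by_industry (
   objectid        int PRIMARY KEY
 , agr_forest_fish int
 , mining_quarry   int
 , mfg             int
 , electric        int
 , water_sew       int
);

INSERT INTO employment_by_industry VALUES
   (1, 50, 10, 20, 5, NULL)  -- null must not win the descending sort
 , (2, 10, 10, 40, 5, 5)     -- clear maximum in mfg
 , (3, 30,  5, 30, 1, 1);    -- tie between agr_forest_fish and mfg

SELECT e.objectid
     , (SELECT t.industry
        FROM  (
           VALUES
              (e.agr_forest_fish, 'agr_forest_fish')
            , (e.mining_quarry  , 'mining_quarry')
            , (e.mfg            , 'mfg')
            , (e.electric       , 'electric')
            , (e.water_sew      , 'water_sew')
           ) t (val, industry)
        ORDER  BY t.val DESC NULLS LAST, t.industry
        LIMIT  1) AS industry
FROM   employment_by_industry e
ORDER  BY e.objectid;

-- With the sample rows above:
--  1 | agr_forest_fish
--  2 | mfg
--  3 | agr_forest_fish   (tie resolved alphabetically)
```

Without NULLS LAST, row 1 would report water_sew, because a null would sort ahead of 50 in descending order.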

Don't spell out column names

You ask to "only list the column names once in the SQL query". If you look up column names in the system catalogs dynamically, you don't have to list them at all.
Well, you have to mention objectid once to exclude it (or even avoid that and look up the PK column instead).
Base query:

SELECT attname
FROM pg_catalog.pg_attribute
WHERE attrelid = 'employment_by_industry'::regclass
AND attnum > 0
AND NOT attisdropped
AND attname <> 'objectid' -- exclude more?
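
The PK lookup mentioned in passing could look roughly like this. It is a sketch that assumes the table has a primary key and that every non-key column is a value column; it swaps the hard-coded name for a check against pg_index.

```sql
-- List all live user columns that are NOT part of the primary key.
SELECT a.attname
FROM   pg_catalog.pg_attribute a
WHERE  a.attrelid = 'public.employment_by_industry'::regclass
AND    a.attnum > 0                 -- skip system columns
AND    NOT a.attisdropped           -- skip dropped columns
AND    NOT EXISTS (
   SELECT 1
   FROM   pg_catalog.pg_index i
   WHERE  i.indrelid = a.attrelid
   AND    i.indisprimary            -- only the primary key index
   AND    a.attnum = ANY (i.indkey) -- column is one of the key columns
   );
```

If the table has no primary key, nothing is excluded and objectid would show up in the list, so the explicit name filter is the safer choice in that case.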

Concatenate that into the query from above - pretty format completely optional - and execute the generated query.
Demonstrating \gexec in psql:

test=> SELECT 'SELECT e.objectid
test'> , (SELECT t.industry
test'> FROM (
test'> VALUES
test'> '
test-> || string_agg(format('(e.%1$I , %1$L)', attname), E'\n , ')
test-> || '
test'> ) t (val, industry)
test'> ORDER BY t.val DESC NULLS LAST, t.industry
test'> LIMIT 1
test'> )
test'> FROM public.employment_by_industry e'
test-> FROM pg_attribute
test-> WHERE attrelid = 'public.employment_by_industry'::regclass
test-> AND attnum > 0
test-> AND NOT attisdropped
test-> AND attname <> 'objectid'\gexec
 objectid | industry 
----------+-----------------
 1 | agr_forest_fish
 2 | mfg
 3 | agr_forest_fish
(3 rows)

Or write a (temporary) function and execute it:

CREATE OR REPLACE FUNCTION pg_temp.f_maxcol()
 RETURNS TABLE (objectid int, industry text)
 LANGUAGE plpgsql PARALLEL SAFE AS
$func$
BEGIN 
 RETURN QUERY EXECUTE
 -- RAISE NOTICE '%', -- print to debug
 (
 SELECT 'SELECT e.objectid
 , (SELECT t.industry
 FROM (
 VALUES
 '
 || string_agg(format('(e.%1$I , %1$L)', attname), E'\n , ')
 || '
 ) t (val, industry)
 ORDER BY t.val DESC NULLS LAST, t.industry
 LIMIT 1)
FROM public.employment_by_industry e'
 FROM pg_catalog.pg_attribute
 WHERE attrelid = 'public.employment_by_industry'::regclass
 AND attnum > 0
 AND NOT attisdropped
 AND attname <> 'objectid'
 );
END 
$func$;

Call:

SELECT * FROM pg_temp.f_maxcol();

fiddle

Or just use two steps: first generate, then execute.
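
A rough sketch of the two-step route: step 1 only returns the statement as text, step 2 is running whatever it returned. The ORDER BY attnum inside string_agg and the generated_sql alias are additions for illustration, not part of the queries above; the ordering just keeps the column order stable between runs.

```sql
-- Step 1: build the statement as text. Nothing is executed yet.
SELECT format(
          'SELECT e.objectid, (SELECT t.industry FROM (VALUES %s) t (val, industry) ORDER BY t.val DESC NULLS LAST, t.industry LIMIT 1) AS industry FROM public.employment_by_industry e'
        , string_agg(format('(e.%1$I, %1$L)', attname), ', ' ORDER BY attnum)
       ) AS generated_sql
FROM   pg_catalog.pg_attribute
WHERE  attrelid = 'public.employment_by_industry'::regclass
AND    attnum > 0
AND    NOT attisdropped
AND    attname <> 'objectid';

-- Step 2: copy the returned text and run it as an ordinary query.
```

Inspecting the generated text before running it is also a convenient way to debug the string concatenation.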

Function to return dynamic set of columns for given table
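
A different technique, not part of the answer above, avoids both dynamic SQL and listing columns: convert each row to jsonb and unpivot the key/value pairs. This is a sketch that assumes PostgreSQL 9.5 or later and that every column other than objectid is numeric.

```sql
SELECT e.objectid
     , (SELECT j.key
        FROM   jsonb_each_text(to_jsonb(e) - 'objectid'::text) AS j(key, value)
        ORDER  BY j.value::numeric DESC NULLS LAST, j.key  -- key as tiebreaker
        LIMIT  1) AS industry
FROM   employment_by_industry e;
```

The trade-off is a jsonb conversion and a text-to-numeric cast per value, which is likely slower than the generated VALUES query but should be acceptable for a table with thousands of rows.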

score 4 · Answer 1 · 2024-08-18 23:33:10Z
