Why can I select all fields when grouping by primary key but not when grouping by another column?

Question 1

How is this a valid statement (where id is the primary key of the table):

select * from table group by id ;

and this is not:

select * from table group by name ;

ERROR: column "pgluser.id" must appear in the GROUP BY clause or be used in an aggregate function

The question is: why is the first query legal, i.e. why is grouping by the primary key valid?

Question 2

Please give a description of the table and the error message.

Question 3

@mcNets it doesn't (in Postgres) if (id) is the primary key.

Question 4

@ypercubeTM thanks, I was just trying it in rextester. I suppose it's possible because, in fact, there is nothing to group when grouping by the PK.

Question 5

@a_horse_with_no_name, while logically right, this behaviour does not apply to UNIQUE NOT NULL columns (see my answer).

Question 8

Is it clear, or is additional explanation needed?

Question 12

I think the reason would be:

id is the primary key here (and therefore unique), so grouping by the primary key is like grouping by every column. Conceptually (this is pseudo-SQL, not valid syntax) it is similar to

select * from table group by *

which should be fine.
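
Since "group by *" is not something you can actually run, the same idea can be sketched with an explicit column list. This is a minimal illustration for PostgreSQL; the table t and its columns are assumptions made up for the example, not the asker's table.

```sql
-- Hypothetical table, for illustration only.
create table t (id int primary key, c1 int, c2 int);
insert into t (id, c1, c2) values (1, 2, 3), (4, 5, 6);

-- Grouping by the primary key ...
select * from t group by id;

-- ... returns the same rows as grouping by every column explicitly,
-- because each id identifies exactly one row.
select * from t group by id, c1, c2;
```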

score 12 · Accepted Answer · 2016-12-13 10:48:49Z

id is a primary key.
As far as I remember, this is actually a legal query according to ANSI/ISO SQL.
Grouping by the primary key puts exactly one row in each group, which is logically the same as not grouping at all (or grouping by all columns). Every other column therefore has a single, well-defined value per group, so we can select all of them.

create table t (id int primary key,c1 int,c2 int);
insert into t (id,c1,c2) values (1,2,3),(4,5,6);
select * from t group by id;

+----+----+----+
| id | c1 | c2 |
+----+----+----+
|  1 |  2 |  3 |
+----+----+----+
|  4 |  5 |  6 |
+----+----+----+

Reference given by @a_horse_with_no_name

https://www.postgresql.org/docs/current/static/sql-select.html#SQL-GROUPBY

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
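
To make the quoted rule concrete, here is a small PostgreSQL sketch of the case where it is most useful in practice: a join with an aggregate. The tables customers and orders and all of their columns are hypothetical names invented for this example.

```sql
-- Hypothetical tables, for illustration only.
create table customers (id int primary key, name text, city text);
create table orders (
    id          int primary key,
    customer_id int references customers (id),
    amount      numeric
);

-- Accepted: c.name and c.city are functionally dependent on c.id,
-- the grouped primary key of customers.
select c.id, c.name, c.city, count(o.id) as order_count
from customers c
left join orders o on o.customer_id = c.id
group by c.id;

-- Rejected if uncommented: o.amount comes from orders, whose primary key
-- is not in GROUP BY, so it must be aggregated (e.g. sum(o.amount))
-- or added to GROUP BY.
-- select c.id, c.name, o.amount
-- from customers c
-- left join orders o on o.customer_id = c.id
-- group by c.id;
```

Only columns of the table whose primary key is grouped get this treatment; columns from other joined tables still follow the usual rule.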

Although we might logically expect a UNIQUE NOT NULL column to behave the same way, PostgreSQL applies this rule only to primary keys (as described in the documentation).

create table t (id int unique not null,c1 int,c2 int);
insert into t (id,c1,c2) values (1,2,3),(4,5,6);
select * from t group by id;

[Code: 0, SQL State: 42803] ERROR: column "t.c1" must appear in the GROUP BY clause or be used in an aggregate function
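
If the column really is unique and not null, the query can still be made to run. The following is a sketch of a few possible workarounds against the table t defined just above, not a recommendation of one over the others; which fits depends on the schema.

```sql
-- Option 1: list every selected column in GROUP BY.
select id, c1, c2 from t group by id, c1, c2;

-- Option 2: wrap the ungrouped columns in aggregates; with one row per id
-- the aggregate simply returns that row's value.
select id, min(c1) as c1, min(c2) as c2 from t group by id;

-- Option 3: declare the column as the primary key, after which
-- the original query is accepted.
alter table t add primary key (id);
select * from t group by id;
```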

Thanks, but the question is why this is a legal query, i.e. why is grouping by the primary key valid?
Is it just the primary key that the SQL spec states this for, or all unique keys? Could you find it in the SQL spec? This is interesting, I never knew of this.
@EvanCarroll, tested, it does not work for UNIQUE (or UNIQUE NOT NULL).
I asked a question about this very behavior on PostgreSQL's developer mailing list a few years ago, and got a very informative answer about why PG recognizes functional dependencies from primary keys but not from unique not null constraints.
