I have the following setup:

  1. First, I create a helper table q10c_debug_sql to avoid clutter:
create table q10c_debug_sql as
SELECT
  movie_id,
  company_id,
  company_type_id
FROM
  "postgres"."imdb_int"."movie_companies"
WHERE
  (company_id) IN (
    SELECT
      company_id
    FROM
      "postgres"."imdb"."q10c_company_name"
  )
  AND (company_type_id) NOT IN (
    SELECT
      company_type_id
    FROM
      "postgres"."imdb_int"."company_type"
  );

The resulting table is empty:

postgres=# select * from q10c_debug_sql;
 movie_id | company_id | company_type_id 
----------+------------+-----------------
(0 rows)
  2. Now I issue the following two queries:
postgres=# select count(*) from (select * from imdb_int.movie_companies except select * from q10c_debug_sql) as foo;
 count 
---------
 2549109
(1 row)
postgres=# select count(*) from (select * from imdb_int.movie_companies as a left join q10c_debug_sql as b on a.movie_id = b.movie_id and a.company_id = b.company_id and a.company_type_id = b.company_type_id) as foo;
 count 
---------
 2609129
(1 row)

As one can see, they return different counts. On paper, I expected the two queries to be equivalent and both to return 2609129, the size of the movie_companies table:

postgres=# select count(*) from imdb_int.movie_companies;
 count 
---------
 2609129
(1 row)

I don't know why this happens. I would like to use EXCEPT for clarity, but the query gives an unexpected result. Any pointers are appreciated.

My psql and server versions:

psql (15.3 (Ubuntu 15.3-1.pgdg20.04+1), server 13.11 (Ubuntu 13.11-1.pgdg20.04+1))
asked May 27, 2023 at 0:24

1 Answer

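It turns out that EXCEPT returns only distinct rows: it returns the distinct rows of the left query that do not also appear in the right query. So EXCEPT is not semantically the same as a left join; the former has set semantics, while the latter has bag (multiset) semantics. Here the right-hand table is empty, so EXCEPT simply deduplicates movie_companies, and the gap between the two counts (2609129 - 2549109 = 60020) is the number of rows that duplicate another row.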

Thanks to this page for the pointer. Counting the distinct rows confirms it:

postgres=# select count(distinct(movie_id, company_id, company_type_id)) from imdb_int.movie_companies;
 count 
---------
 2549109
(1 row)
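
The same effect can be reproduced on a tiny scale. The following is a minimal sketch, not the original setup: it uses Python's built-in sqlite3 module instead of PostgreSQL, and the table contents and the helper name `count` are made up for illustration. SQLite's EXCEPT also removes duplicates, so it shows the same gap between EXCEPT and a left join against an empty table.

```python
import sqlite3

# In-memory database with a toy movie_companies table that contains
# one duplicated row, plus an empty table standing in for q10c_debug_sql.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie_companies (movie_id INT, company_id INT, company_type_id INT);
    CREATE TABLE debug_sql (movie_id INT, company_id INT, company_type_id INT);
    INSERT INTO movie_companies VALUES (1, 10, 1), (1, 10, 1), (2, 20, 2);
""")


def count(sql):
    """Hypothetical helper: number of rows returned by the given query."""
    return conn.execute(f"SELECT count(*) FROM ({sql}) AS foo").fetchone()[0]


# EXCEPT has set semantics: duplicates on the left side are collapsed.
except_rows = count("SELECT * FROM movie_companies EXCEPT SELECT * FROM debug_sql")

# LEFT JOIN has bag semantics: every left-hand row is kept.
join_rows = count("""
    SELECT a.* FROM movie_companies AS a
    LEFT JOIN debug_sql AS b
      ON a.movie_id = b.movie_id
     AND a.company_id = b.company_id
     AND a.company_type_id = b.company_type_id
""")

# The EXCEPT count matches the number of distinct rows.
distinct_rows = count("SELECT DISTINCT * FROM movie_companies")

print(except_rows, join_rows, distinct_rows)
```

With three rows, one of them a duplicate, the EXCEPT count equals the distinct count while the left join keeps all three rows.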
answered May 27, 2023 at 0:37
  • EXCEPT is a synonym for EXCEPT DISTINCT. There is also EXCEPT ALL. The other set operators likewise come in two variants: UNION and UNION ALL, INTERSECT and INTERSECT ALL. Commented May 27, 2023 at 6:06
  • @Lennart-SlavaUkraini Thanks for the informative comment. Commented May 27, 2023 at 22:14
  • FWIW, there are some surprising effects when using <SETOP> DISTINCT on bags. One example is that A UNION A may have lower cardinality than A. Then throw NULL into the mix and we are in for a treat ;-) Commented May 28, 2023 at 5:11
  • About EXCEPT ALL: dba.stackexchange.com/a/120680/3684, stackoverflow.com/a/19364694/939860 Commented May 30, 2023 at 4:26
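
To make the DISTINCT versus ALL distinction from the comments concrete, here is a rough model in plain Python rather than SQL. It is only a sketch: the function names `except_distinct`, `except_all`, and `union_distinct` are hypothetical, rows are modelled as tuples, and NULL handling is not covered.

```python
from collections import Counter


def except_distinct(left, right):
    """Model of EXCEPT (DISTINCT): set difference, duplicates collapsed."""
    return sorted(set(left) - set(right))


def except_all(left, right):
    """Model of EXCEPT ALL: each right-hand row cancels at most one left-hand copy."""
    remaining = Counter(left) - Counter(right)  # keeps only positive counts
    return sorted(remaining.elements())


def union_distinct(left, right):
    """Model of UNION (DISTINCT): duplicates collapsed after combining."""
    return sorted(set(left) | set(right))


# Toy data with one duplicated row (made up for illustration).
rows = [(1, 10, 1), (1, 10, 1), (2, 20, 2)]

# Against an empty right side, only the ALL variant preserves the row count.
assert len(except_distinct(rows, [])) == 2
assert len(except_all(rows, [])) == 3

# One matching right-hand row removes a single copy under EXCEPT ALL,
# but every copy under EXCEPT DISTINCT.
assert except_all(rows, [(1, 10, 1)]) == [(1, 10, 1), (2, 20, 2)]
assert except_distinct(rows, [(1, 10, 1)]) == [(2, 20, 2)]

# A UNION A can have fewer rows than A when A contains duplicates.
assert len(union_distinct(rows, rows)) < len(rows)
```

Under this model, the question's EXCEPT query would keep the full row count if written with EXCEPT ALL, since the right-hand table is empty.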
