
Problem

I have three tables that are structured like so:

t1

 id | count_1
----+---------
  1 |       9
  2 |       4
  3 |       3

t2

 id | count_2
----+---------
  1 |       2
  3 |       3

t3

 id | count_3
----+---------
  1 |       1
  4 |       8

id is unique in each table. Note that not all ids occur in each table. Here is the SQL to create those tables if you'd like to test.
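A minimal setup script might look like the sketch below. It assumes PostgreSQL, and the column types and primary keys are guesses based on the sample data and the statement that id is unique in each table:

```sql
-- Assumed schema: integer columns, id as primary key in each table.
CREATE TABLE t1 (id integer PRIMARY KEY, count_1 integer NOT NULL);
CREATE TABLE t2 (id integer PRIMARY KEY, count_2 integer NOT NULL);
CREATE TABLE t3 (id integer PRIMARY KEY, count_3 integer NOT NULL);

-- Sample rows copied from the tables shown above.
INSERT INTO t1 (id, count_1) VALUES (1, 9), (2, 4), (3, 3);
INSERT INTO t2 (id, count_2) VALUES (1, 2), (3, 3);
INSERT INTO t3 (id, count_3) VALUES (1, 1), (4, 8);
```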

I'm trying to merge all those tables with a column for each count, defaulting to zero if there is no count for that particular id. Like this:

 id | count_1 | count_2 | count_3
----+---------+---------+---------
  1 |       9 |       2 |       1
  2 |       4 |       0 |       0
  3 |       3 |       3 |       0
  4 |       0 |       0 |       8

Attempt

I thought this was a natural use case for a full outer join, like this:

SELECT
 COALESCE(t1.id, t2.id, t3.id) as id,
 COALESCE(t1.count_1, 0) as count_1,
 COALESCE(t2.count_2, 0) as count_2,
 COALESCE(t3.count_3, 0) as count_3
FROM
 t1
FULL OUTER JOIN t2
 ON t1.id = t2.id
FULL OUTER JOIN t3
 ON t1.id = t3.id
ORDER BY id ASC;

But this returns a result with non-unique ids, where each row is just a row from one of the original tables, with zeroes filling in the remaining columns:

 id | count_1 | count_2 | count_3
----+---------+---------+---------
  1 |       9 |       0 |       0   # <- should
  1 |       0 |       2 |       0   # <- be
  1 |       0 |       0 |       1   # <- one row
  2 |       4 |       0 |       0
  3 |       3 |       0 |       0   # <- should also be
  3 |       0 |       3 |       0   # <- one row
  4 |       0 |       0 |       8

Evidently I don't understand outer joins as well as I thought I did. Can anyone show me the correct way to do this?

asked May 16, 2017 at 20:44
  • The specific data and query would give 4 rows in the result, not 7. Commented May 16, 2017 at 21:02
  • FYI: I tried your code in a SQLFiddle. The only change I made was adding the commas after the fields in your SELECT. I actually got exactly the results you wanted: one row per id, with the appropriate counts in the appropriate columns. However, this one, with counts for id = 5 in tables 2 and 3 only, shows the problem. @ypercubeTM's solution (COALESCE the t1 and t2 ids) does resolve it. Commented May 16, 2017 at 21:03
  • That's interesting. That outer join query in my attempt does seem to be working with this simpler example. Interesting. I'll need to more carefully compare this tiny example and the actual production query where I saw the undesired behaviour. Commented May 16, 2017 at 22:59
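
The id = 5 case mentioned in the comments can be reproduced without creating any tables. This is a minimal sketch, assuming PostgreSQL; the counts 6 and 7 for id 5 are made-up values used only for illustration:

```sql
-- Hypothetical data: id 5 exists in t2 and t3 but not in t1.
WITH t1 (id, count_1) AS (VALUES (1, 9)),
     t2 (id, count_2) AS (VALUES (1, 2), (5, 6)),
     t3 (id, count_3) AS (VALUES (1, 1), (5, 7))
SELECT
  COALESCE(t1.id, t2.id, t3.id) AS id,
  COALESCE(t1.count_1, 0) AS count_1,
  COALESCE(t2.count_2, 0) AS count_2,
  COALESCE(t3.count_3, 0) AS count_3
FROM t1
FULL OUTER JOIN t2 ON t1.id = t2.id
-- t1.id is NULL on the t2-only row, so this condition cannot match t3's id 5.
FULL OUTER JOIN t3 ON t1.id = t3.id
ORDER BY id;
```

With this data the query yields one row for id 1 and two separate rows for id 5. Changing the last condition to COALESCE(t1.id, t2.id) = t3.id merges the two id 5 rows into one.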

2 Answers


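You could use FULL JOIN, but the code gets a bit messy, at least for my taste. The problem is the second join condition: for a row that exists only in t2, t1.id is NULL, so t1.id = t3.id can never match the corresponding row in t3. With 3 tables the fix is not so bad; you'd only need to change: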

FULL OUTER JOIN t3
 ON t1.id = t3.id

to:

FULL OUTER JOIN t3
 ON COALESCE(t1.id, t2.id) = t3.id

but with more tables it gets rather ugly, since each additional join has to COALESCE the ids of all the tables joined before it. The other option is to gather all distinct id values and then LEFT JOIN all the tables:

SELECT
 d.id,
 COALESCE(t1.count_1, 0) AS count_1,
 COALESCE(t2.count_2, 0) AS count_2,
 COALESCE(t3.count_3, 0) AS count_3
FROM
 ( SELECT id FROM t1
 UNION
 SELECT id FROM t2
 UNION
 SELECT id FROM t3
 ) AS d
 LEFT JOIN t1 ON t1.id = d.id
 LEFT JOIN t2 ON t2.id = d.id
 LEFT JOIN t3 ON t3.id = d.id
ORDER BY id;
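
Another way to write the FULL JOIN variant is with USING (id) instead of ON. This is a sketch, assuming PostgreSQL or another engine that supports FULL JOIN with USING. Each such join produces a single merged id column, so the COALESCE over the ids becomes implicit:

```sql
SELECT
  id,                                  -- merged join column, never NULL here
  COALESCE(t1.count_1, 0) AS count_1,
  COALESCE(t2.count_2, 0) AS count_2,
  COALESCE(t3.count_3, 0) AS count_3
FROM t1
FULL JOIN t2 USING (id)
FULL JOIN t3 USING (id)                -- matches against the merged id of t1 and t2
ORDER BY id;
```

This stays the same length per table as more tables are added, at the cost of relying on every table naming the key column identically.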
answered May 16, 2017 at 20:50
  • This works perfectly. Exactly the desired result. Commented May 16, 2017 at 22:19

In the code you've provided, you have

SELECT
 COALESCE(t1.id, t2.id, t3.id) as id,
 COALESCE(t1.count_1, 0) as count_1,
 COALESCE(t2.count_2, 0) as count_2,
 COALESCE(t3.count_3, 0) as count_3
FROM
 t1
FULL OUTER JOIN t2
 ON t1.id = t2.id
FULL OUTER JOIN t3
 ON t1.id = t3.id
ORDER BY id ASC;

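This is mostly fine. However, a join returns one row for every matching pair of rows, so if any table has two rows with the same id, the result contains the Cartesian product of the matching rows for that id. For instance, this is fine, returning one row: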

SELECT *
FROM ( VALUES (1,2) ) AS t(id,count_1)
INNER JOIN ( VALUES (1,3) ) AS g(id,count_2)
 USING (id);

But adding (1,7) to g causes two rows to be returned:

SELECT *
FROM ( VALUES (1,2) ) AS t(id,count_1)
INNER JOIN ( VALUES (1,3),(1,7) ) AS g(id,count_2)
 USING (id);

If this is what you're seeing then you have two rows with the same id in one of those three tables you're joining.

You now have to ask,

  1. In which table do I have duplicate ids? You can find this out with a simple GROUP BY (id) HAVING count(*) > 1.
  2. Which row's count_x do I want in my final output?
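
The duplicate check from point 1 could be written out as follows. This is a sketch against t1, to be repeated for t2 and t3; the occurrences alias is only an illustrative name:

```sql
-- Lists every id that appears more than once in t1.
-- An empty result means id is unique in that table.
SELECT id, count(*) AS occurrences
FROM t1
GROUP BY id
HAVING count(*) > 1;
```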
answered May 16, 2017 at 22:18
  • From the question: "id is unique in each table." Commented May 17, 2017 at 12:26
