Problem
I have three tables that are structured like so:
t1
id | count_1
--------------
1 | 9
2 | 4
3 | 3
t2
id | count_2
--------------
1 | 2
3 | 3
t3
id | count_3
--------------
1 | 1
4 | 8
id
is unique in each table. Note that not all ids occur in each table. Here is the SQL to create those tables if you'd like to test.
I'm trying to merge all those tables with a column for each count, defaulting to zero if there is no count for that particular id
. Like this:
id | count_1 | count_2 | count_3
----------------------------------
1 | 9 | 2 | 1
2 | 4 | 0 | 0
3 | 3 | 3 | 0
4 | 0 | 0 | 8
Attempt
I thought this was a natural use case for a full outer join, like this:
SELECT
COALESCE(t1.id, t2.id, t3.id) as id,
COALESCE(t1.count_1, 0) as count_1,
COALESCE(t2.count_2, 0) as count_2,
COALESCE(t3.count_3, 0) as count_3
FROM
t1
FULL OUTER JOIN t2
ON t1.id = t2.id
FULL OUTER JOIN t3
ON t1.id = t3.id
ORDER BY id ASC;
But this returns a result with non unique ids, where each row is just a row from one of the original tables with zeroes filling in the remaining columns:
id | count_1 | count_2 | count_3
----------------------------------
1 | 9 | 0 | 0 # <- should
1 | 0 | 2 | 0 # <- be
1 | 0 | 0 | 1 # <- one row
2 | 4 | 0 | 0
3 | 3 | 0 | 0 # <- should also be
3 | 0 | 3 | 0 # <- one row
4 | 0 | 0 | 8
Evidently I don't understand outer joins as well as I thought I did. Can anyone show me the correct way to do this?
2 Answers 2
You could use FULL JOIN
but the code gets a bit messy - at least for my taste. With 3 tables it's not so bad, you'd only need to change:
FULL OUTER JOIN t3
ON t1.id = t3.id
to:
FULL OUTER JOIN t3
ON COALESCE(t1.id, t2.id) = t3.id
but with more tables, it gets rather ugly. The other option is to gather all distinct id values and then LEFT JOIN
all the tables:
SELECT
d.id,
COALESCE(t1.count_1, 0) AS count_1,
COALESCE(t2.count_2, 0) AS count_2,
COALESCE(t3.count_3, 0) AS count_3
FROM
( SELECT id FROM t1
UNION
SELECT id FROM t2
UNION
SELECT id FROM t3
) AS d
LEFT JOIN t1 ON t1.id = d.id
LEFT JOIN t2 ON t2.id = d.id
LEFT JOIN t3 ON t3.id = d.id
ORDER BY id ;
-
This works perfectly. Exactly the desired result.kdbanman– kdbanman2017年05月16日 22:19:57 +00:00Commented May 16, 2017 at 22:19
In the code you've provided, you have
SELECT
COALESCE(t1.id, t2.id, t3.id) as id,
COALESCE(t1.count_1, 0) as count_1,
COALESCE(t2.count_2, 0) as count_2,
COALESCE(t3.count_3, 0) as count_3
FROM
t1
FULL OUTER JOIN t2
ON t1.id = t2.id
FULL OUTER JOIN t3
ON t1.id = t3.id
ORDER BY id ASC;
This is somewhat fine. However, this will produce a Cartesian product if you have two rows with the same id
. For instance, this is fine, returning one row.
SELECT *
FROM ( VALUES (1,2) ) AS t(id,count_1)
INNER JOIN ( VALUES (1,3) ) AS g(id,count_2)
USING (id);
But the addition of (1,7)
to g
causes two rows to be rendered here,
SELECT *
FROM ( VALUES (1,2) ) AS t(id,count_1)
INNER JOIN ( VALUES (1,3),(1,7) ) AS g(id,count_2)
USING (id);
If this is what you're seeing then you have two rows with the same id in one of those three tables you're joining.
You now have to ask,
- In which table do I have duplicate ids? You can find this out with a simple
GROUP BY (id) HAVING count(*) > 1
. - Which row's
count_x
do I want in my final output?
-
From the question: "id is unique in each table."ypercubeᵀᴹ– ypercubeᵀᴹ2017年05月17日 12:26:47 +00:00Commented May 17, 2017 at 12:26
id
= 5 in tables 2 and 3 only, shows the problem. @ypercubeTM 's solution (COALESCE
the t1 and t2id
s) does resolve it.