When using join on many to many relationship the result is split on multiple rows. What I'd like to do is convert the right side of a join into an array so the result is one row.
Example with 3 tables:
CREATE TABLE items (
id serial primary key,
title text
);
CREATE TABLE tags (
id serial primary key,
title text
);
CREATE TABLE items_tags (
item_id int references items(id),
tag_id int references tags(id),
primary key (item_id, tag_id)
);
When selecting items with their tags I can do it this way:
SELECT i.id, i.title, i.title
FROM items i
INNER JOIN items_tags it
ON it.item_id = i.id
INNER JOIN tags t
ON t.id = it.tag_id;
And the result will come up as:
(1, "item n1", "sport")
(1, "item n1", "soccer")
(2, "item n2", "adventure")
(2, "item n2", "mountain climbing")
(2, "item n2", "sport")
(2, "item n2", "nature")
What I'd like to have is this:
(1, "item n1", ["sport", "soccer"])
(2, "item n2", ["adventure", "mountain climbing", "sport" , "nature"])
2 Answers 2
To aggregate most rows
While querying all or most items, it is typically substantially faster to aggregate rows from the "many"-table first and join later:
SELECT id, i.title AS item_title, t.tag_array
FROM items i
JOIN ( -- or LEFT JOIN ?
SELECT it.item_id AS id, array_agg(t.title) AS tag_array
FROM items_tags it
JOIN tags t ON t.id = it.tag_id
GROUP BY it.item_id
) t USING (id);
Use LEFT [OUTER] JOIN
in the outer query if there can be items without tags - which would be excluded with [INNER] JOIN
.
Since that does not multiply rows in the join, we need no GROUP BY
in the outer SELECT
.
Joining before aggregation also gets out of hands with more than one 1:n table in the FROM
list (not in this simple case). See:
To aggregate few rows
For a small percentage of rows, use a [`LATERAL`][2] join with an [ARRAY constructor][3]:
SELECT id, title AS item_title, t.tag_array
FROM items i, LATERAL ( -- this is an implicit CROSS JOIN
SELECT ARRAY (
SELECT t.title
FROM items_tags it
JOIN tags t ON t.id = it.tag_id
WHERE it.item_id = i.id
) AS tag_array
) t;
Since an ARRAY constructor always produces a row (with empty array if the subquery is empty - subtle difference in the result!), LEFT JOIN LATERAL (...) ON true
is not needed here. See:
- What is the difference between LATERAL and a subquery in PostgreSQL?
- Why is array_agg() slower than the non-aggregate ARRAY() constructor?
- Understanding multiple table join with aggregation
Aside
You had a typo in your query. 3rd column would be t.title
. I added aliases to your original (un-aggregated) query to clarify:
SELECT i.id, i.title AS item_title, t.title AS tag_title
FROM items i
JOIN items_tags it ON it.item_id = i.id
JOIN tags t ON t.id = it.tag_id;
"id" or "title" are poor identifiers. See:
-
i found it helpful to see these working: op sqlfiddle: sqlfiddle.com/#!15/710a3/3/0 and aggregate join: sqlfiddle.com/#!15/710a3/4/0bristweb– bristweb2023年01月09日 19:05:01 +00:00Commented Jan 9, 2023 at 19:05
-
also something similar to postgrest many:many join table maps (postgrest.org/en/stable/api.html#many-to-many-relationships). this keeps the items list even if there are no tags (left outer join) and uses empty array instead of null: sqlfiddle.com/#!15/f94e3/4/0.bristweb– bristweb2023年01月09日 19:18:22 +00:00Commented Jan 9, 2023 at 19:18
You need to add the group by
clause and use array_agg
.
SELECT i.id, i.title, array_agg(i.title)
FROM items i
INNER JOIN items_tags it
ON it.item_id = i.id
INNER JOIN tags t
ON t.id = it.tag_id
GROUP BY i.id, i.title,
Explore related questions
See similar questions with these tags.