I am working on a Postgres 9.4 project with the intarray extension enabled. We have a table that looks like this:
items
-------------------------------------
id name tag_ids
--------------------------------------
1 a car {1,4}
2 a room to rent {1}
3 a boat {1,2,4,11}
4 a wine {2}
5 emily {3}
I'd like to group the tag ids if possible. Like get a count of all the elements that have a tag_id
of '{1,2,4,11}'
tag_id count
1 3
2 2
4 2
11 1
Is this possible? I would think an intersection like this:
select * from items where tag_ids && '{1,2,4,11}'
But I need to group by the array elements inside the intersection result. If I group by tag_ids, it is just the unique value.
How would I do it?
1 Answer 1
Keep the basic query you already have to cheaply identify rows with any overlapping array elements using an index.
Then unnest only the intersection (tag_ids & '{1,2,4,11}'
) in a LATERAL
join. Finally, aggregate:
SELECT tag_id, count(*) AS ct
FROM items i
, unnest(tag_ids & '{1,2,4,11}'::int[]) tag_id
WHERE tag_ids && '{1,2,4,11}'::int[]
GROUP BY tag_id
ORDER BY count(*) DESC;
Once again, the intersection operator &
of the intarray module is instrumental.
Previous related answer:
Without intarray
If you don't have intarray installed, or for any other array type, we need another join instead:
SELECT tag_id, count(*) AS ct
FROM (
SELECT *
FROM items
WHERE tag_ids && '{1,2,4,11}'::int[]
) i, unnest(tag_ids) t(tag_id)
JOIN unnest('{1,2,4,11}'::int[]) x(tag_id) USING (tag_id)
GROUP BY tag_id
ORDER BY count(*) DESC;
Subtle difference: the intersection operator folds duplicates to produce distinct elements, while this query does not. Only matters if there can be duplicate array elements ...
The fiddle demonstrates both queries operating off a table with an additional row that mixes matching and not-matching elements to show the need for the intersection or an additional join to eliminate unwanted elements: