Using Postgres 9.4, I'm interested in having an array of integers like user_ids_who_like
and provide an array of users (like user_ids_i_am_following
) to sort that intersection.
Something like:
select *
from items
where [there is an intersection between
user_ids_who_like with user_ids_i_am_following]
order by intersection(user_ids_who_like).count
Is grouping and ordering by an array intersection possible?
Example data:
items
name | user_ids_who_like
'birds' | '{1,3,5,8}'
'planes' | '{2,3,4,11}'
'spaceships' | '{3,4,6}'
For a given user_ids_who_i_follow = [3,4,11]
, can I do something like:
select * from items
where <user_ids_who_like intersects with user_ids_who_i_follow>
order by <count of that intersection>
Desired result:
name | user_ids_who_like | count
'planes' | '{2,3,4,11}' | 3
'spaceships' | '{3,4,6}' | 2
'birds' | '{1,3,5,8}' | 1
One possibility seems to be something like this:
select id, user_ids_who_like, (user_ids_who_like & '{514, 515}'::int[]) as jt
from queryables
where user_ids_who_like && '{514, 515}'
order by icount(user_ids_who_like & '{514, 515}'::int[]) desc;
But I can't tell if this style (using the intarray extension rather than native array functions and operators) is outdated; any feedback from more sophisticated users here? It's not clear to me how to do the intersection of two arrays using the methods and operators.
1 Answer 1
With tools of the basic Postgres installation only, you might unnest()
and count in a LATERAL
subquery:
SELECT i.name, i.user_ids_who_like, x.ct
FROM items i
, LATERAL (
SELECT count(*) AS ct
FROM unnest(i.user_ids_who_like) uid
WHERE uid = ANY('{3,4,11}'::int[])
) x
ORDER BY x.ct DESC; -- add PK as tiebreaker for stable sort order
We don't need a LEFT JOIN
to preserve rows without match because count()
always returns a row - 0 for "no match".
intarray
Assuming integer
arrays without NULL values or duplicates, the intersection operator &
of the intarray
module would be much simpler:
SELECT name, user_ids_who_like
, array_length(user_ids_who_like & '{3,4,11}', 1) AS ct
FROM items
ORDER BY 3 DESC NULLS LAST;
I added NULLS LAST
to sort empty arrays last - after the reminder from your later question:
Install intarray
once per database for this.
Use the overlap opertaor &&
in the WHERE
clause to rule out rows without any overlap:
SELECT ...
FROM ...
WHERE user_ids_who_like && '{3,4,11}'
ORDER BY ...
Why? Per documentation:
intarray
provides index support for the&&
,@>
,<@
, and@@
operators, as well as regular array equality.
Applies to standard array operators in a similar fashion. Details:
Alternatively and more radically, a normalized schema with a separate table instead of the array column user_ids_who_like
would occupy more disk space, but offer simple solutions with plain btree indexes for these problems.
intarray
still has its use cases, this might be one of them. The extension has a few funtions, too, which are very handy.