6

Using Postgres 9.4, I'm interested in having an array of integers like user_ids_who_like and provide an array of users (like user_ids_i_am_following) to sort that intersection.

Something like:

select * 
from items 
where [there is an intersection between 
 user_ids_who_like with user_ids_i_am_following] 
order by intersection(user_ids_who_like).count

Is grouping and ordering by an array intersection possible?

Example data:

items
name | user_ids_who_like
'birds' | '{1,3,5,8}'
'planes' | '{2,3,4,11}'
'spaceships' | '{3,4,6}'

For a given user_ids_who_i_follow = [3,4,11], can I do something like:

select * from items
where <user_ids_who_like intersects with user_ids_who_i_follow>
order by <count of that intersection>

Desired result:

name | user_ids_who_like | count
'planes' | '{2,3,4,11}' | 3
'spaceships' | '{3,4,6}' | 2
'birds' | '{1,3,5,8}' | 1

One possibility seems to be something like this:

select id, user_ids_who_like, (user_ids_who_like & '{514, 515}'::int[]) as jt 
from queryables 
where user_ids_who_like && '{514, 515}' 
order by icount(user_ids_who_like & '{514, 515}'::int[]) desc;

But I can't tell if this style (using the intarray extension rather than native array functions and operators) is outdated; any feedback from more sophisticated users here? It's not clear to me how to do the intersection of two arrays using the methods and operators.

Paul White
95.3k30 gold badges439 silver badges689 bronze badges
asked Nov 26, 2015 at 13:23
1
  • 1
    intarray still has its use cases, this might be one of them. The extension has a few funtions, too, which are very handy. Commented Nov 27, 2015 at 14:28

1 Answer 1

8

With tools of the basic Postgres installation only, you might unnest() and count in a LATERAL subquery:

SELECT i.name, i.user_ids_who_like, x.ct
FROM items i
 , LATERAL (
 SELECT count(*) AS ct
 FROM unnest(i.user_ids_who_like) uid
 WHERE uid = ANY('{3,4,11}'::int[])
 ) x
ORDER BY x.ct DESC; -- add PK as tiebreaker for stable sort order

We don't need a LEFT JOIN to preserve rows without match because count() always returns a row - 0 for "no match".

intarray

Assuming integer arrays without NULL values or duplicates, the intersection operator & of the intarray module would be much simpler:

SELECT name, user_ids_who_like
 , array_length(user_ids_who_like & '{3,4,11}', 1) AS ct
FROM items
ORDER BY 3 DESC NULLS LAST;

I added NULLS LAST to sort empty arrays last - after the reminder from your later question:

Install intarray once per database for this.

Use the overlap opertaor && in the WHERE clause to rule out rows without any overlap:

SELECT ...
FROM ...
WHERE user_ids_who_like && '{3,4,11}'
ORDER BY ...

Why? Per documentation:

intarray provides index support for the &&, @>, <@, and @@ operators, as well as regular array equality.

Applies to standard array operators in a similar fashion. Details:


Alternatively and more radically, a normalized schema with a separate table instead of the array column user_ids_who_like would occupy more disk space, but offer simple solutions with plain btree indexes for these problems.

answered Dec 1, 2015 at 1:35
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.