How to do an = ANY (SELECT ...) query in PostgreSQL?

Question 1

I'm trying to select all the latitudes and longitudes for a group of users based on their id being in an array stored in another table. Here's my attempt:

SELECT latitude, longitude 
FROM userloc WHERE id = ANY( SELECT interested FROM donedeals WHERE deals_id=67);

But it gives me the following error:

ERROR: operator does not exist: integer = integer[]
LINE 1: SELECT latitude, longitude FROM userloc WHERE id = ANY( SELE...
 ^
HINT: No operator matches the given name and argument type(s).
 You might need to add explicit type casts.

donedeals has an int column for deals_id and an int array column for interested, which contains id's corresponding to the id column of userloc, which stores latitude and longitude:

 deals_id |  interested
----------+---------------
       67 | {377,387,376}
       64 | {381,384}
       66 | {377,387}

  latitude  |  longitude  | id
------------+-------------+-----
 40.6439417 | -73.964927  | 384
 40.7554919 | -73.925891  | 380
 40.6434067 | -73.9657654 | 385
 40.746452  | -73.90732   | 378
 40.643459  | -73.964586  | 381
 40.6430341 | -73.9656954 | 382

This is all in Postgres 9.3.5.
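
To reproduce the problem, the two tables can be sketched roughly as follows. Only the int and int[] column types of donedeals are stated above; the numeric type for the coordinates, the NOT NULL constraints and both primary keys are assumptions made for illustration.

```sql
-- Assumed definitions; only deals_id (int) and interested (int[]) are given in the question.
CREATE TABLE userloc (
    id        integer PRIMARY KEY,   -- assumption: id is unique
    latitude  numeric NOT NULL,      -- assumption: exact type not stated
    longitude numeric NOT NULL
);

CREATE TABLE donedeals (
    deals_id   integer PRIMARY KEY,  -- assumption: deals_id is unique
    interested integer[] NOT NULL
);

-- Sample rows copied from the output shown above.
INSERT INTO userloc (latitude, longitude, id) VALUES
    (40.6439417, -73.964927,  384),
    (40.7554919, -73.925891,  380),
    (40.6434067, -73.9657654, 385),
    (40.746452,  -73.90732,   378),
    (40.643459,  -73.964586,  381),
    (40.6430341, -73.9656954, 382);

INSERT INTO donedeals (deals_id, interested) VALUES
    (67, ARRAY[377,387,376]),
    (64, ARRAY[381,384]),
    (66, ARRAY[377,387]);
```

Note that the userloc excerpt contains none of the ids listed for deals_id 67, so with only these sample rows a correct query for 67 returns nothing, while deals_id 64 matches ids 381 and 384.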

I'd like to select all latitudes and longitudes for id's corresponding to the interested array for a given deals_id. This seems like it should be doable in a single call, but I can't seem to figure out the syntax. Any recommendations would be greatly appreciated.

user1822 · Answer 1 · 2015-01-28 20:05:04Z

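The problem is that = ANY comes in two forms: ANY (array expression) and ANY (subquery). With a sub-select inside the parentheses, Postgres uses the subquery form and compares id against each row the sub-select returns. Each of those rows holds an integer[] value, hence the "integer = integer[]" error.

One fix is to "normalize" your de-normalized model on the fly, using unnest():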

SELECT latitude, longitude 
FROM userloc 
WHERE id IN (SELECT unnest(interested) 
 FROM donedeals 
 WHERE deals_id = 64);

If deals_id is unique in the donedeals table, another option is to "convert" the id on the left side to an array and then use the "is contained by" operator: <@:

SELECT latitude, longitude 
FROM userloc 
WHERE array[id] <@ (SELECT interested 
 FROM donedeals 
 WHERE deals_id=64 );

Not sure which one would be faster. You will need to check the execution plan.
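
A third variant, sketched under the same assumption that deals_id is unique: casting the scalar sub-select to int[] turns it into an array expression, so ANY no longer treats it as a row-returning subquery. Prefixing any of the variants with EXPLAIN ANALYZE shows the plan Postgres actually chooses for your data.

```sql
-- The cast makes the sub-select an array expression instead of a set of rows.
-- Like the <@ variant, this fails if the sub-select returns more than one row.
SELECT latitude, longitude
FROM   userloc
WHERE  id = ANY ((SELECT interested
                  FROM   donedeals
                  WHERE  deals_id = 64)::int[]);

-- Compare plans: EXPLAIN ANALYZE runs the query and reports actual timings.
EXPLAIN ANALYZE
SELECT latitude, longitude
FROM   userloc
WHERE  id IN (SELECT unnest(interested)
              FROM   donedeals
              WHERE  deals_id = 64);
```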

The 2nd query breaks if more than one row is returned from the subselect. That cannot happen as long as deals_id is unique, which seems likely but hasn't been specified.

@ErwinBrandstetter: you are absolutely right, thanks. I added that.

score 4 · Answer 2 · 2015-01-28 21:42:33Z

Typically, this would better be rewritten as a JOIN:

SELECT u.latitude, u.longitude 
FROM userloc u
JOIN donedeals d ON u.id = ANY (d.interested)
WHERE d.deals_id = 67;

I also considered the "is contained by" operator: <@, that @a_horse already mentioned. It can use a GIN index on interested. But on a second look, that's irrelevant here. This query needs indexes on userloc.id and donedeals.deals_id.
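
As a rough sketch, the indexes mentioned here could be created like this. The index names are made up for illustration, and if id and deals_id are already primary keys or have unique constraints, the first two indexes exist implicitly and are not needed.

```sql
-- B-tree indexes for the equality lookups in the JOIN query (hypothetical names).
CREATE INDEX userloc_id_idx         ON userloc (id);
CREATE INDEX donedeals_deals_id_idx ON donedeals (deals_id);

-- A GIN index supports array operators on the indexed column, for example
--   SELECT deals_id FROM donedeals WHERE interested @> ARRAY[377];
-- It does not help when looking up a row by deals_id, as the queries here do.
CREATE INDEX donedeals_interested_gin_idx ON donedeals USING gin (interested);
```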

Or with unnest() in a LATERAL join (Postgres 9.3+):

SELECT u.latitude, u.longitude 
FROM donedeals d
 , unnest(d.interested) i(id) -- implicit JOIN LATERAL
JOIN userloc u ON u.id = i.id
WHERE d.deals_id = 67;

The latter should be faster since it can use indexes on both userloc.id and donedeals.deals_id.

There is one possible difference: In your original, distinct rows are returned from userloc (which might still hold duplicate values for (latitude, longitude)). If (and only if) that is relevant:

SELECT DISTINCT ON (u.id) -- id unique
 u.latitude, u.longitude 
FROM ...

Or use GROUP BY u.id, which works if id is the primary key.
You could also:

SELECT DISTINCT u.latitude, u.longitude 
FROM ...

That would additionally fold duplicates on (latitude, longitude). It all depends on exact table definitions and requirements.
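
Spelled out in full for the unnest() variant, the two de-duplicating forms might look like this. It assumes userloc.id is the primary key, which is what allows latitude and longitude in the SELECT list of the GROUP BY query; the LATERAL join is written explicitly here.

```sql
-- One row per user, even if an id appears more than once in the array.
SELECT DISTINCT ON (u.id)
       u.latitude, u.longitude
FROM   donedeals d
CROSS  JOIN LATERAL unnest(d.interested) AS i(id)
JOIN   userloc u ON u.id = i.id
WHERE  d.deals_id = 67;

-- Equivalent with GROUP BY; requires id to be the primary key of userloc
-- so that latitude and longitude are functionally dependent on it.
SELECT u.latitude, u.longitude
FROM   donedeals d
CROSS  JOIN LATERAL unnest(d.interested) AS i(id)
JOIN   userloc u ON u.id = i.id
WHERE  d.deals_id = 67
GROUP  BY u.id;
```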

Alternative: normalize

Another option would be to normalize your schema, which would simplify the query to plain joins. Looks like a typical many-to-many relationship. Reference implementation:

How to implement a many-to-many relationship in PostgreSQL?
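
A minimal sketch of what such a normalized layout could look like here. The table names deal and deal_interest and the column user_id are hypothetical, and the foreign key to userloc assumes that userloc.id is a primary key or otherwise unique.

```sql
-- Hypothetical names; userloc stays as in the question.
CREATE TABLE deal (
    deals_id integer PRIMARY KEY
);

-- One row per (deal, interested user) pair replaces the int[] column.
CREATE TABLE deal_interest (
    deals_id integer NOT NULL REFERENCES deal (deals_id),
    user_id  integer NOT NULL REFERENCES userloc (id),
    PRIMARY KEY (deals_id, user_id)
);

-- The lookup becomes a plain join.
SELECT u.latitude, u.longitude
FROM   deal_interest di
JOIN   userloc u ON u.id = di.user_id
WHERE  di.deals_id = 67;
```

The composite primary key also rules out listing the same user twice for one deal, so no DISTINCT is needed.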
