I'm trying to select all the latitudes and longitudes for a group of users based on their id being in an array stored in another table. Here's my attempt:
SELECT latitude, longitude
FROM userloc WHERE id = ANY( SELECT interested FROM donedeals WHERE deals_id=67);
But it gives me the following error:
ERROR: operator does not exist: integer = integer[]
LINE 1: SELECT latitude, longitude FROM userloc WHERE id = ANY( SELE...
^
HINT: No operator matches the given name and argument type(s).
You might need to add explicit type casts.
donedeals has an int column for deals_id and an int array column for interested, which contains id's corresponding to the id column of userloc, which stores latitude and longitude:
 deals_id |  interested
----------+---------------
       67 | {377,387,376}
       64 | {381,384}
       66 | {377,387}

  latitude  |  longitude  | id
------------+-------------+-----
 40.6439417 | -73.964927  | 384
 40.7554919 | -73.925891  | 380
 40.6434067 | -73.9657654 | 385
 40.746452  | -73.90732   | 378
 40.643459  | -73.964586  | 381
 40.6430341 | -73.9656954 | 382
This is all in Postgres 9.3.5.
I'd like to select all latitudes and longitudes for id's corresponding to the interested array for a given deals_id. This seems like it should be doable in a single call, but I can't seem to figure out the syntax. Any recommendations would be greatly appreciated.
2 Answers
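Unfortunately, = ANY (...) with a sub-select compares the left-hand value against each row the sub-select returns. Here each row is a whole integer[] value, hence the "integer = integer[]" error. The array form, = ANY (array), needs an array expression on the right-hand side, not a sub-select.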
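For comparison, here is a small sketch of the array form of ANY, using values from the sample data. The right-hand side is a single array value rather than a set of rows:

```sql
-- Array form: the right-hand side is one array value.
SELECT 377 = ANY ('{377,387,376}'::int[]);  -- true
SELECT 381 = ANY ('{377,387}'::int[]);      -- false
```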
You need to "normalize" your de-normalized model, using unnest():
SELECT latitude, longitude
FROM userloc
WHERE id IN (SELECT unnest(interested)
FROM donedeals
WHERE deals_id = 64);
If deals_id is unique in the donedeals table, another option is to "convert" the id on the left side to an array and then use the "is contained by" operator <@:
SELECT latitude, longitude
FROM userloc
WHERE array[id] <@ (SELECT interested
FROM donedeals
WHERE deals_id=64 );
Not sure which one would be faster. You will need to check the execution plan.
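One way to compare them is to prefix each query with EXPLAIN (ANALYZE, BUFFERS) and read the reported plan and timings. A sketch using the first query; what you see will depend on your data and indexes:

```sql
-- EXPLAIN ANALYZE actually executes the query and reports real timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT latitude, longitude
FROM   userloc
WHERE  id IN (SELECT unnest(interested)
              FROM   donedeals
              WHERE  deals_id = 64);
```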
The 2nd query breaks if more than 1 row is returned from the subselect. Cannot happen as long as deals_id is unique, which seems likely but hasn't been specified. – Erwin Brandstetter, Jan 29, 2015 at 1:52
@ErwinBrandstetter: you are absolutely right, thanks. I added that. – user1822, Jan 29, 2015 at 7:01
Typically, this would better be rewritten as a JOIN:
SELECT u.latitude, u.longitude
FROM userloc u
JOIN donedeals d ON u.id = ANY (d.interested)
WHERE d.deals_id = 67;
I also considered the "is contained by" operator <@ that @a_horse already mentioned. It can use a GIN index on interested. But on a second look, that's irrelevant here. This query needs indexes on userloc.id and donedeals.deals_id.
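As a sketch, those indexes could be created as follows. The index names are made up for illustration, and if id or deals_id is already a primary key or has a unique constraint, the corresponding index exists already:

```sql
-- Hypothetical index names; skip any that a PK / UNIQUE constraint already covers.
CREATE INDEX userloc_id_idx         ON userloc   (id);
CREATE INDEX donedeals_deals_id_idx ON donedeals (deals_id);

-- Only relevant for array operators such as @> or <@ on the interested column:
CREATE INDEX donedeals_interested_gin_idx ON donedeals USING gin (interested);
```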
Or with unnest() in a LATERAL join (Postgres 9.3+):
SELECT u.latitude, u.longitude
FROM donedeals d
, unnest(d.interested) i(id) -- implicit JOIN LATERAL
JOIN userloc u ON u.id = i.id
WHERE d.deals_id = 67;
The latter should be faster since it can use indexes on both userloc.id and donedeals.deals_id.
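There is one possible difference: In your original, distinct rows are returned from userloc (which might still hold duplicate values for (latitude, longitude)). The join variants can return the same userloc row more than once, for example if deals_id is not unique or, with unnest(), if an id appears twice in interested. If (and only if) that is relevant: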
SELECT DISTINCT ON (u.id) -- id unique
u.latitude, u.longitude
FROM ...
Or GROUP BY u.id with id being the PK.
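Spelled out against the first JOIN query, the GROUP BY variant might look like this. It assumes userloc.id is declared as the primary key; otherwise Postgres rejects the ungrouped columns in the SELECT list:

```sql
SELECT u.latitude, u.longitude
FROM   userloc u
JOIN   donedeals d ON u.id = ANY (d.interested)
WHERE  d.deals_id = 67
GROUP  BY u.id;   -- assumes id is the PRIMARY KEY of userloc
```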
You could also:
SELECT DISTINCT u.latitude, u.longitude
FROM ...
That would additionally fold duplicates on (latitude, longitude). It all depends on exact table definitions and requirements.
Alternative: normalize
Another option would be to normalize your schema, which would simplify the query to plain joins. Looks like a typical many-to-many relationship. Reference implementation:
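A minimal sketch of what such a normalized layout could look like. The table name deal_interest and its column names are hypothetical, and the foreign keys assume that userloc.id and donedeals.deals_id are primary keys (or at least unique):

```sql
-- Hypothetical junction table: one row per (deal, interested user) pair.
CREATE TABLE deal_interest (
    deals_id   int REFERENCES donedeals (deals_id),
    userloc_id int REFERENCES userloc (id),
    PRIMARY KEY (deals_id, userloc_id)
);

-- The lookup then becomes a plain join:
SELECT u.latitude, u.longitude
FROM   deal_interest di
JOIN   userloc u ON u.id = di.userloc_id
WHERE  di.deals_id = 67;
```

With this layout the interested array column on donedeals is no longer needed.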