I have the following table data in a table named queryables
in Postgres 9.4:
queryable_id | liked_count | users_who_like_ids
--------------+-------------+--------------------
2376 | 3 | {1,80,78,101, 188}
18771 | 1 | {78,101, 123,125}
1790 | 1 | {78}
2257 | 1 | {78}
and I want to select for users who like 1,78,80
The user_who_like_ids
column was defined as a text array:
users_who_like_ids | text[] | default '{}'::text[]
but I has been updated to an integer.
users_who_like_ids | integer[] | default '{}'::integer[]
I perform my query as this:
SELECT * FROM queryables WHERE users_who_like_ids @> ARRAY['1','78','80'];
but this requires users_who_like_ids
to match ALL of them. I'd like it to be ANY of them, and then group by the length.
How would I do this?
2 Answers 2
Use the "overlaps" operator:
select *
from queryables
where array[users_who_like_ids] && ARRAY['1','78','80'];
Your design is really quesionable: first why are you storing numbers as strings? And secondly why don't you store this as a properly normalized one-to-many relationship?
-
agreed on first point and just updated it. On the second, we do but we crunch the results do to this. It takes too long when we're joining so want to see what performance is with this. Hmmm so, looks like rails can only handle a creation pattern. Will adjust for this if query works.timpone– timpone2015年11月05日 18:38:14 +00:00Commented Nov 5, 2015 at 18:38
Query on array with GROUP BY
To add on to the response by @a_horse_with_no_name, if you want to also GROUP BY
against the length of the you need to provide an aggregate function to be used on the grouped values.
SELECT aggregate_function(aggregated_field)
FROM queryables
WHERE array[users_who_like_ids] && ARRAY['1','78','80']
GROUP BY array_length(users_who_like_ids,1);
For example, I could GROUP BY
against the length, and place all the relevant user ids in an array:
SELECT array_agg(queryable_id)
FROM queryables
WHERE array[users_who_like_ids] && ARRAY['1','78','80']
GROUP BY array_length(users_who_like_ids,1);
I could also GROUP BY
against the length, get the average liked_count
by array length:
SELECT avg(liked_count) AS avg_liked,
array_length(users_who_like_ids,1) AS liked_id_count
FROM queryables
WHERE array[users_who_like_ids] && ARRAY['1','78','80']
GROUP BY array_length(users_who_like_ids,1);
Index that array field!
Also, if this table is sufficiently large, lookups will be very slow.
In order to counter this, you will want to index the data in the users_who_like_ids
column, but since it is an array field, you cannot simply build a B-Tree index (the default index type), and you should use the GIN index instead:
CREATE INDEX idx_gin_users_who_like_ids
ON queryables
USING gin
(users_who_like_ids);
Then, when applying the array operators &&
, @>
, or <@
, you will find performance to be much better.
-
thx Chris, this looks like it's getting closer. I updted it to an integer array field so no quotes around the integers. Two questions on the ouput. It looks like the liked_id_count is the total number of elements in that array not the number of this in the array. Would it be possible to say like 3 and 1 rather than 5 and 4? Also, even though grouping, how would I output discreates values from the row? Like say each row has a name which I'd like to output? Also, I think the first 'array' needs to be upcased - at least I had to to not get an error.timpone– timpone2015年11月05日 19:26:39 +00:00Commented Nov 5, 2015 at 19:26
-
duh, sorry - need to figure out how to get this workingtimpone– timpone2015年11月05日 19:32:09 +00:00Commented Nov 5, 2015 at 19:32
-
I'm not totally sure I understand your question, sorry. Can you try rephrasing it so maybe I can provide advice?Chris– Chris2015年11月05日 20:10:06 +00:00Commented Nov 5, 2015 at 20:10