We have a table with the following columns:

id|school_id|parent_ids

where parent_ids is an array of ids.

If we don't have the school_id and only have a parent_id to search for, the query has to check the parent_ids array of every row in the table. There might be thousands of rows, and the parent_id might actually appear in just a few of them.

Could using IN in the query against the array column be a performance bottleneck in this case?

EDIT

Here is the dump of table structure:

-- ----------------------------
-- Table structure for schools_messages
-- ----------------------------
DROP TABLE IF EXISTS "public"."schools_messages";
CREATE TABLE "public"."schools_messages" (
 "id" int4 NOT NULL DEFAULT nextval('schools_messages_id_seq'::regclass),
 "message" jsonb NOT NULL DEFAULT '[]'::jsonb,
 "details" jsonb NOT NULL DEFAULT '[]'::jsonb,
 "school_id" int4 NOT NULL,
 "created_at" timestamp(0),
 "updated_at" timestamp(0),
 "parents_ids" int4[] DEFAULT ARRAY[]::integer[]
)
;
ALTER TABLE "public"."schools_messages" OWNER TO "prod_schools";
-- ----------------------------
-- Primary Key structure for table schools_messages
-- ----------------------------
ALTER TABLE "public"."schools_messages" ADD CONSTRAINT "schools_messages_pkey" PRIMARY KEY ("id");
-- ----------------------------
-- Foreign Keys structure for table schools_messages
-- ----------------------------
ALTER TABLE "public"."schools_messages" ADD CONSTRAINT "schools_messages_school_id_foreign" FOREIGN KEY ("school_id") REFERENCES "public"."trk_schools" ("id") ON DELETE CASCADE ON UPDATE NO ACTION;
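
For illustration, here is a minimal sketch of the parent-only lookup against the table above, assuming a parent id of 42 (a made-up value). Note that IN itself expects a list of scalars, so a search inside an array column is normally written with ANY or with the containment operator @>. As far as I know, only the second form can be served by a GIN or GiST index on the array; the first checks the array of every row.

```sql
-- Form 1: scalar comparison against each array element.
-- An index on parents_ids is not used for this form, so every row is scanned.
SELECT id, school_id
FROM public.schools_messages
WHERE 42 = ANY (parents_ids);

-- Form 2: array containment.
-- This form can use a GIN or GiST index on parents_ids if one exists.
SELECT id, school_id
FROM public.schools_messages
WHERE parents_ids @> ARRAY[42];
```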
asked Jan 24, 2018 at 11:19
1 Answer

I agree with Jack that your schema needs help, but you can still do this. Here it is done with a single index lookup, using two extensions that ship with PostgreSQL, intarray and btree_gist:

CREATE EXTENSION intarray;
CREATE EXTENSION btree_gist;
CREATE INDEX ON public.schools_messages
 USING gist(school_id, parents_ids gist__int_ops);
VACUUM ANALYZE public.schools_messages;
SELECT *
FROM public.schools_messages
WHERE school_id = 42
 OR parents_ids @> ARRAY[42];
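
If the search is by parent only, as in the question, a single-column index on the array may be enough. The following is a sketch under assumptions, not a measured recommendation: the index name is made up, and gin__int_ops is the GIN operator class provided by intarray, which is needed here because intarray is installed above and supplies its own @> operator for integer arrays. EXPLAIN shows whether the planner actually picks the index on your data.

```sql
-- Assumption: intarray is already installed (see CREATE EXTENSION above).
-- The index name is illustrative only.
CREATE INDEX schools_messages_parents_ids_gin
    ON public.schools_messages
    USING gin (parents_ids gin__int_ops);

-- Check the plan: look for a bitmap index scan on the new index
-- rather than a sequential scan over the whole table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM public.schools_messages
WHERE parents_ids @> ARRAY[42];
```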
answered Jan 24, 2018 at 16:50
  • Would this be efficient with thousands of rows, where each row's array holds thousands of integers? It seems like searching in a 2D array. Commented Jan 25, 2018 at 8:52
  • Depends on what you mean by "efficient". It's better than anything else except changing the schema. Commented Jan 25, 2018 at 9:02
  • Absolutely, it's better than anything else, but changing the schema is an option too, and I need a recommendation. I thought about adding an array to the parent table that holds foreign ids pointing to the schools_messages table. That way, given the parent_id, I can get all the messages the parent is involved in with one query. The downside is that when adding a new message, I will have to add its id to every parent it was sent to. What do you think? Commented Jan 25, 2018 at 9:08
  • @simo I think you need to ask another question and tag it with database-design =) Commented Jan 25, 2018 at 9:09
  • here: dba.stackexchange.com/questions/196207/… Commented Jan 25, 2018 at 10:09
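
For reference, the schema change discussed in the comments is often done with a junction table instead of an array on either side. This is only a rough sketch: the table name schools_messages_parents is hypothetical, and since the name of the parents table is not given in this thread, parent_id is left without a foreign key.

```sql
-- Hypothetical junction table: one row per (message, parent) pair.
CREATE TABLE public.schools_messages_parents (
    message_id int4 NOT NULL
        REFERENCES public.schools_messages (id) ON DELETE CASCADE,
    parent_id  int4 NOT NULL,  -- would reference the parents table (name not given here)
    PRIMARY KEY (message_id, parent_id)
);

-- Plain b-tree index so that a lookup by parent alone is an index scan.
CREATE INDEX ON public.schools_messages_parents (parent_id);

-- All messages sent to parent 42 (made-up id).
SELECT m.*
FROM public.schools_messages AS m
JOIN public.schools_messages_parents AS mp ON mp.message_id = m.id
WHERE mp.parent_id = 42;
```

With this layout, adding a message means inserting one row per recipient, and no existing row has to be updated.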
