I have a jsonb
field registration
that contains an array of objects. I need to create two separates indexes to look for id
, by example:
d61802ff-3eec-4a72-97ca-832f51b96bf0 = [ .tour_id.id ]
f492b26d-d260-4bcb-8e96-6532fdf38665 = [ .customer_id.id ]
Example data (I have more keys and arrays, but it's a resume)
[
{
"tour_id":
{
"id":"d61802ff-3eec-4a72-97ca-832f51b96bf0",
"name":"Paris 2018"
},
"customer_id":
{
"id":"f492b26d-d260-4bcb-8e96-6532fdf38665",
"name":"OBAMA John"
}
},
{
"tour_id":
{
"id":"a52d38d4-9bfb-4ffa-1122-e536f04b0c60",
"name":"London 2018"
},
"customer_id":
{
"id":"f492b26d-d260-4bcb-8e96-6532fdf38665",
"name":"OBAMA John"
}
}
]
How to index by [array].customer_id.id
?
Or is there a method to index by subkey.subkey? Or deeper nested subkey?
I'm using Postgres 9.6.
1 Answer 1
First off, I would consider a normalized DB design instead of the convoluted jsonb
column, which makes indexing and complex queries harder.
How to index by
[array].customer_id.id
?
See my answer to your previous question:
You could have two GIN indexes like instructed there, one on tour_id
, another one on customer_id
, or a single multicolumn index, and then query:
SELECT * FROM tbl
WHERE f_extract_tour_ids(registration) @> '{d61802ff-3eec-4a72-97ca-832f51b96bf0}'::uuid[]
AND f_extract_customer_ids(registration) @> '{f492b26d-d260-4bcb-8e96-6532fdf38665}'::uuid[];
Note, this returns rows with tour_id
and customer_id
matching in any top-level object of the array, not necessarily in the same. If you want to find rows where tour_id
and customer_id
match in the same object, you have to do more ...
jsonb_path_ops
index
However, with a growing number of different queries (requiring multiple special indexes), a single generic (much bigger) jsonb_path_ops
index on the whole column becomes more competitive. Or even one with the default operator class operator class jsonb_ops
- which is bigger, yet.
CREATE INDEX foo ON tbl USING GIN (registration jsonb_path_ops);
Queries to go with it:
For rows with tour_id
and customer_id
matching in the same top-level object:
SELECT * FROM tbl
WHERE registration @> '[{"tour_id": {"id": "d61802ff-3eec-4a72-97ca-832f51b96bf0"}
, "customer_id": {"id": "f492b26d-d260-4bcb-8e96-6532fdf38665"}}]';
For rows with tour_id
and customer_id
matching in any top-level object, not necessarily the same:
SELECT * FROM tbl
WHERE registration @> '[{"tour_id": {"id": "d61802ff-3eec-4a72-97ca-832f51b96bf0"}}
, {"customer_id":{"id":"f492b26d-d260-4bcb-8e96-6532fdf38665"}}]';
This solution is slower because of the much bigger index carrying all the other noise in your jsonb
column. But it is much more versatile and simpler.
If the first solutions make your head ache, use this one.
-
'first solutions makes your head ache, use this one.', this is better CREATE INDEX foo ON tbl USING GIN (registration jsonb_path_ops) ? yes sometimes i will need to look for just record widh a tour_id; sometimes with customer_id, really i don't want to come back to traditional normalize database after using it long years, JSONB is modern and it's a fresh air that match very well with the front react-app redux-form and array fields. My idea is sync postgresql with couchdb, and then with pouchdb on the browser.stackdave– stackdave2018年03月10日 06:57:24 +00:00Commented Mar 10, 2018 at 6:57
-
@stackdave: There is nothing wrong with JSON in the DB per se. If you can work with JSON documents in your app directly, that's a major benefit. There are just many use cases that perform much better with a normalized design. The right tool for the job ...Erwin Brandstetter– Erwin Brandstetter2018年03月10日 13:48:16 +00:00Commented Mar 10, 2018 at 13:48