I have a very simple JSON table which I populate with some sample data:
CREATE TABLE jsonthings(d JSONB NOT NULL);
INSERT INTO jsonthings VALUES ('{"name":"First","tags":["foo"]}');
INSERT INTO jsonthings VALUES ('{"name":"Second","tags":["foo","bar"]}');
INSERT INTO jsonthings VALUES ('{"name":"Third","tags":["bar","baz"]}');
INSERT INTO jsonthings VALUES ('{"name":"Fourth","tags":["baz"]}');
CREATE INDEX ON jsonthings USING GIN(d);
I am attempting to use the index when running a SELECT. A simple SELECT to obtain the rows where the value is a single item works just fine:
SELECT d FROM jsonthings WHERE d @> '{"name":"First"}';
But when attempting to run a query which matches more than one value of name, I can't find out how to use the index. I've tried:
SELECT d FROM jsonthings WHERE d->>'name' = ANY(ARRAY['First', 'Second']);
SELECT d FROM jsonthings WHERE d->'name' ?| ARRAY['First', 'Second'];
SELECT d FROM jsonthings WHERE d#>'{name}' ?| ARRAY['First','Second'];
and all of them show a sequential scan of the table (I'm using enable_seqscan=false to force index use if possible). Is there some way I can rewrite the query so that it uses an index? I'm aware that I could do:
SELECT * FROM jsonthings WHERE d @> '{"name":"First"}' OR d @> '{"name":"Second"}';
but then I have a variable-length query, and since I'm going through JDBC I would lose the benefits of the query being a PreparedStatement.
I'm also interested in seeing a similar query against any of a number of items in the tags key, e.g.:
SELECT d FROM jsonthings WHERE d @> '{"tags":["foo"]}' OR d @> '{"tags":["bar"]}';
but using an ARRAY rather than multiple conditions and using an index.
This is on PostgreSQL 9.4.
-
You don't have high selectivity. The query needs to match only around 2-5% of the record set for indexes to be used. Add more records and then maybe the query planner will choose the index over a sequential scan. (Mladen Uzelac, Jan 22, 2015 at 22:25)
-
Thanks for the comment. I set enable_seqscan to false to force index use, so the lack of data isn't the issue. Although I did add another ten million rows during testing to make sure... (jgm, Jan 22, 2015 at 22:37)
-
Please post your explain plan on explain.depesz.com (Mladen Uzelac, Jan 22, 2015 at 23:06)
2 Answers
From the docs (http://www.postgresql.org/docs/9.4/static/datatype-json.html), try using an expression index:
CREATE INDEX idx_jsonthings_names ON jsonthings USING gin ((d -> 'name'));
SELECT d FROM jsonthings WHERE d @> '{"name": ["First", "Second"]}';
-
Yeah, looks like I do have to use a separate index, which seems odd given that the index obviously exists already and is used in the single-item query. And the query I need to use is
SELECT d FROM jsonthings WHERE d->'name' ?| ARRAY['First', 'Second'];
otherwise the index is not used. Thanks. (jgm, Jan 23, 2015 at 12:03)
-
@jgm Did you find a more generic way to achieve this? Having a separate index per field is inconvenient (and impossible, if dynamic fields are needed). (Tuukka Mustonen, Sep 14, 2017 at 10:20)
This is a response to the answer provided by Mladen. I don't have enough reputation to leave a comment, but I wanted to respond because the query looks incorrect. It confused me and may confuse other people in the future.
You mention using:
SELECT d FROM jsonthings WHERE d @> '{"name": ["First", "Second"]}';
to retrieve any entries that have either First or Second as the name. However, this doesn't seem to work for me on PostgreSQL 9.4.4:
SELECT d FROM jsonthings WHERE d @> '{"name": ["First", "Second"]}';
d
---
(0 rows)
It seems the above query is attempting to retrieve entries where the name attribute contains the array ["First", "Second"].
If I create such an entry:
INSERT INTO jsonthings VALUES ('{"name":["First", "Second"],"tags":["baz"]}');
And then try the query again, it returns a result:
SELECT d FROM jsonthings WHERE d @> '{"name": ["First", "Second"]}';
d
------------------------------------------------
{"name": ["First", "Second"], "tags": ["baz"]}
(1 row)
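The same containment behaviour can be checked without touching the table, by comparing jsonb literals directly. This is only a small sketch to illustrate how @> treats an array on the right-hand side; the literals are made up for the illustration and are not rows from the question.

```sql
-- A scalar name does not contain an array of candidate names,
-- so this does not behave like "First OR Second".
SELECT '{"name": "First"}'::jsonb @> '{"name": ["First", "Second"]}'::jsonb;   -- false

-- An array-valued name contains the array when every listed element is present.
SELECT '{"name": ["First", "Second"]}'::jsonb @> '{"name": ["First", "Second"]}'::jsonb;   -- true

-- Extra elements on the left-hand side do not prevent a match.
SELECT '{"name": ["First", "Second", "Third"]}'::jsonb @> '{"name": ["Second"]}'::jsonb;   -- true
```

In other words, an array inside the right-hand operand of @> means "all of these elements must be present in an array", not "any of these values".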
However, this is different from the question asked by the original poster, which was how to use an index when querying entries where the name attribute was either First or Second:
SELECT * FROM jsonthings WHERE d @> '{"name":"First"}' OR d @> '{"name":"Second"}';
I wanted to provide this here so other people don't think it's possible to perform an OR query with JSON by providing "name": ["First", "Second"], since it's misleading.
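For completeness, here is a hedged sketch of two query shapes that do express the OR while keeping a fixed statement text, which matters for the PreparedStatement concern in the question. The first passes an array of jsonb documents and is logically the same as the OR chain. The second relies on an expression index on the tags key; the index name idx_jsonthings_tags is made up for this example. Whether the planner actually picks the GIN index for either form is an assumption worth confirming with EXPLAIN on your own version and data.

```sql
-- Equivalent to: d @> '{"name":"First"}' OR d @> '{"name":"Second"}'
-- The array can be bound as a single parameter, so the statement text stays fixed.
SELECT d FROM jsonthings
WHERE d @> ANY (ARRAY['{"name":"First"}', '{"name":"Second"}']::jsonb[]);

-- The same shape for "any of these tags".
SELECT d FROM jsonthings
WHERE d @> ANY (ARRAY['{"tags":["foo"]}', '{"tags":["bar"]}']::jsonb[]);

-- Alternative using an expression index on the tags array
-- (idx_jsonthings_tags is a hypothetical name).
CREATE INDEX idx_jsonthings_tags ON jsonthings USING gin ((d -> 'tags'));

-- ?| is true when any of the listed strings is an element of the tags array.
SELECT d FROM jsonthings
WHERE d -> 'tags' ?| ARRAY['foo', 'bar'];
```

The first two queries only need the existing index on d, while the ?| form needs one expression index per key that is queried this way.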