I'm trying to query for a key present in an array of objects. This structure:
column jdata
{"name": "Somedata",
"array": [ {"name":"bla1", "attr": "somevalue"},
{"name":"bla2", "otherdata": "somevalue2"},
{"name":"bla3", "otherdata": "somevalue"}
],
"otherstuff": "stuff"
}
Now I do btree's on jdata->'name'
or (jdata->'datetime')::cast
and that works great.
I also do json_path_ops where jdata->'array' @> '[{"name":"bla3"}]'
works a real charm.
My problem is the attr
key can be in any of the objects in the array and I care about the record if the key is present, however the value can be nearly anything. Is there a way to query for this? Is there a way it can be indexed? I want to do jdata->'array' @> '[{"attr": ?}]'
Or maybe the ? 'attr'
can be used inside an array somehow?
Currently I am thinking a trigger that scans for the key and then moves it to a header with a true or false or whatever, and then a normal btree will work. Is there a better way? I need to edit around 500k records at the average site to add this values.
Please point me in a direction.
2 Answers 2
Postgres 12 or later: with SQL/JSON path expression
SELECT *
FROM tbl
WHERE jdata->'array' @? '$ ? (exists (@."attr"))';
You can look for keys or values, only in the outer nesting level or recurse, abstract arrays away (unnest them) in lax
mode or not (strict
mode). I added query variations in the fiddle to demonstrate:
db<>fiddle here
@?
is the jsonpath
existence operator.
jsonpath
value '$ ? (exists (@."attr"))'
explained:
$
... look at each value found in left operand (in default "lax" mode)
?
... run the following test
(exists (@."attr"))
... does a key with name 'attr' exist?
Can use an index, either with default jsonb_ops
:
CREATE INDEX tbl_jdata_array_idx ON tbl USING GIN ((jdata->'array'));
Or with jsonb_path_ops
:
CREATE INDEX tbl_jdata_array_path_ops_idx ON tbl USING GIN ((jdata->'array') jsonb_path_ops);
See:
- https://www.postgresql.org/docs/current/gin-builtin-opclasses.html
- https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING
Related:
Any Postgres version with jsonb
support
That particular use case is not covered by plain indexes for built-in operators.
Simple query without index support
SELECT *
FROM tbl
WHERE EXISTS (
SELECT FROM jsonb_array_elements(jdata->'array') elem
WHERE elem ? 'attr'
);
EXISTS
because we want each qualifying row once, even if multiple array elements can contain the key. And it's faster.
But this query cannot use an index.
Expression index
You can generate a text array of unique keys in the given jsonb
array of records - and wrap the expression into a simple IMMUTABLE
function:
CREATE OR REPLACE FUNCTION jsonb_arr_record_keys(jsonb)
RETURNS text[] LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT ARRAY (
SELECT DISTINCT k
FROM jsonb_array_elements(1ドル) elem, jsonb_object_keys(elem) k
)';
COMMENT ON FUNCTION jsonb_arr_record_keys(jsonb) IS '
Generates text array of unique keys in jsonb array of records.
Fails if any array element is not a record!';
Then create a GIN expression index based on this function:
CREATE INDEX tbl_special_idx ON tbl USING gin (jsonb_arr_record_keys(jdata->'array'));
Query like this, using the generic array contains operator @>
:
SELECT *
FROM tbl
WHERE jsonb_arr_record_keys(jdata->'array') @> '{attr}';
Now the index can be used efficiently.
Provide the key name nested in an array ('{attr}'
). (You can conveniently check for multiple keys this way ('{attr1, attr2}'
) or similar ... )
dbfiddle here
Related:
Is there a way it can be indexed?
Yes, however PostgreSQL can't JSON arrays like that so you'll have to write a function to simplify the structure. Then you can index on that function.
Explore related questions
See similar questions with these tags.