I have a table tickets which can contain custom fields. It's implemented using a jsonb column tickets.custom_fields, which contains data in the format { "<field_id>": <field_value> }. The <field_id> comes from a fields table.
e.g. with the following fields table
id | name | type |
---|---|---|
1 | Name | Text |
2 | Score | Number |
consider a ticket with custom_fields set to { "1": "Field Value #1", "2": 100 }.
As new fields can be added to and deleted from the fields table, the keys in the custom_fields jsonb column will keep changing. These custom fields can be filtered on in the UI, so they need to be queryable.
e.g.
SELECT COUNT(*)
FROM "tickets"
WHERE
CAST("tickets".custom_fields ->> '2' AS DOUBLE PRECISION) > 75
I have tried a GIN index on custom_fields:
CREATE INDEX idx_gin_tickets_custom_fields ON tickets USING gin(custom_fields);
Finalize Aggregate (cost=22092.83..22092.84 rows=1 width=8) (actual time=34.548..36.019 rows=1 loops=1)
-> Gather (cost=22092.62..22092.83 rows=2 width=8) (actual time=34.538..36.013 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=21092.62..21092.63 rows=1 width=8) (actual time=32.509..32.510 rows=1 loops=3)
-> Parallel Seq Scan on tickets (cost=0.00..20948.08 rows=57818 width=0) (actual time=9.096..32.504 rows=3 loops=3)
Filter: (((custom_fields ->> '2'::text))::double precision > '75'::double precision)
Rows Removed by Filter: 138760
Planning Time: 0.077 ms
Execution Time: 36.043 ms
Questions:
- How can I index the column so that the filtering queries use an index scan?
- Is there a better way to design the solution?
2 Answers
Based on your EXPLAIN ANALYZE output, you're using the ->> operator, as this line shows:
Filter: (((custom_fields ->> '2'::text))::double precision > '75'::double precision)
Currently the ->> operator is not supported by GIN indexes on jsonb. They only support the following operators (depending on the operator class the index was defined with):
?, ?|, ?&, @>, @?, @@
The PostgreSQL documentation on jsonb indexing has the full list of what is supported and explains the different operator classes (such as jsonb_path_ops, which does not support the key-exists operators).
So your current query cannot use the index. However, for exact-match filters you can use the @> containment operator, which can use that index:
select *
from foo f
where f.jsonb @> '{"key_to_search":"value_to_match"}'
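Applied to the tickets table from the question, a sketch of the containment form might look like the following. It only covers an exact-value match (field 2 equal to 100, a value chosen here for illustration), not the > 75 range filter, and whether the planner actually picks the GIN index depends on how selective the match is.

```sql
-- Exact-match filter on custom field 2, written as containment so that
-- the GIN index on custom_fields is usable for this predicate.
SELECT COUNT(*)
FROM tickets
WHERE custom_fields @> '{"2": 100}';
```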
As for the "is there a better design" question: unfortunately, I cannot comment on it, as I would need to know more about the business requirements and the design of your system.
GIN indexes do not support inequality operators, so you can't do what you want very effectively with a GIN index. They do support partial matches, so perhaps you could support inequality (in only one direction) by mapping that to a partial match, if you were willing to write your own implementations to do that.
You could use functional indexes, but then you would need to add a new index every time a new key (which you want to search on for inequality) was introduced. This is probably what I would do.
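As a minimal sketch of that approach for field 2 from the question (the index name is made up for illustration), an expression index on the cast value could look like this. It assumes every non-null value stored under key "2" is numeric; otherwise the cast fails when the index is built or when a row is written.

```sql
-- B-tree expression index on the numeric value of custom field 2.
-- The index name is illustrative, not from the question.
CREATE INDEX idx_tickets_custom_field_2
    ON tickets (((custom_fields ->> '2')::double precision));

-- The filter must use the same expression for the index to be considered.
SELECT COUNT(*)
FROM tickets
WHERE (custom_fields ->> '2')::double precision > 75;
```

The CAST(... AS DOUBLE PRECISION) spelling used in the question is equivalent to the :: cast here, so the original query should match this index as well.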
Assuming you are unwilling to do that, a better design would be to normalize the data into (at least) 2 tables, using an EAV design. Note that EAV is itself usually considered a poor design, but not all poor designs are equally poor. A query like WHERE field_num = 2 AND field_value > 75 would use a multicolumn index on (field_num, field_value).
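A rough sketch of the numeric half of such an EAV layout follows. The table name, the ticket_id column, and the column types are assumptions for illustration; text-valued fields would need their own value column or a second table, and foreign keys to tickets and fields are omitted because their key definitions are not shown in the question.

```sql
-- Hypothetical EAV table holding one row per (ticket, numeric field).
CREATE TABLE ticket_field_values (
    ticket_id   bigint           NOT NULL,  -- assumed key of tickets
    field_num   integer          NOT NULL,  -- corresponds to fields.id
    field_value double precision,
    PRIMARY KEY (ticket_id, field_num)
);

-- Multicolumn index: equality on field_num, then range on field_value.
CREATE INDEX idx_ticket_field_values_num_value
    ON ticket_field_values (field_num, field_value);

-- Count tickets whose field 2 is greater than 75.
SELECT COUNT(*)
FROM ticket_field_values
WHERE field_num = 2
  AND field_value > 75;
```

With this layout, adding a new field does not require a new index, since field_num is just another value in the leading index column.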