I have a table with thousands of records with a JSONB column that has a location (geometry) as a point. I am able to select points within a bounding box using this query.
SELECT
url,
ST_AsText(ST_GeomFromGeoJSON(keywords ->> 'lat_lng'::text))
FROM
photos
WHERE
ST_GeomFromGeoJSON(keywords ->> 'lat_lng'::text) && ST_MakeEnvelope(-74.494259, 39.486874, -74.774259 , 39.786874, 4326);
Example of keywords column on my table:
{"lat_lng": {"type": "Point", "coordinates": ["-74.218001", "40.107786"]}, "name": "test_name"}
I know that I can add a spatial index to speed up my queries like so, but the problem is that i dont have a lat_lng column, I have a JSONB column with geometry points. (this is just an example on how to add spatial index)
CREATE INDEX places_lat_lng_idx ON places USING gist(lat_lng);
So I did added this as an index instead:
CREATE INDEX photos_lat_lng_idx ON photos USING GIST(ST_GeomFromGeoJSON(keywords ->> 'lat_lng'::text));
I was able to create that without error and run explain on my query which doesn't do a sequential scan anymore and noticably faster. On 10k rows I am getting this without the index on the JSONB column :
And with the index on the JSONB, I get this: enter image description here
My question is, Am I right to assume that applying the index on the JSONB column worked?
Table definition:
CREATE TABLE "public"."photos" (
"id" int4 NOT NULL DEFAULT nextval('photos_id_seq'::regclass),
"url" varchar,
"created_at" int4,
"deleted_at" int4,
"keywords" jsonb,
PRIMARY KEY ("id")
);
Postgres version: 11.7
Postgis version: 2.5 USE_GEOS=1 USE_PROJ=1 USE_STATS=1
1 Answer 1
You are correctly creating a functional (spatial) index right there, and it shows: one order of magnitude less execution time.
There may be subtle ways to coerce the planner to go for a more direct index lookup, but, assuming your data structure, none will have an improvement as significant to execution time as what you get from the current setup.
EXPLAIN ANALYZE
again. You may get off Heap or Bitmap Scans, but this depends on the table.lat_long
, which is the opposite. If you data is in antartica, it is a naming issue. If you data is in the USA, it is a naming + coordinates swap