I've 2 sets of polygons in 2 tables and the sets overlap each other. For each polygon in set A, I get the ID of the polygon in set B that it overlaps the most.
This is the query:
SELECT DISTINCT ON (a.id)
a.id as a_id,
b.id as b_id,
ST_Area(ST_Intersection(a.geom, b.geom)) as intersect_area
FROM a, b
ORDER BY a.id, ST_Area(ST_Intersection(a.geom, b.geom)) DESC
It works properly so, what's the problem? It's very slow (it takes several minutes). Table "a" has more than 10,000 rows and table "b" 700 rows. So, how can I improve the perfomance?
I've tried with spatial index but I haven't had better results.
1 Answer 1
An implicit CROSS JOIN
(a, b
) creates an unconditional (no WHERE
) cartesian product of both tables, that is, for every row in a
, join every row of table b
- massing up 10.000 * 700
rows in the result set! The DISTINCT ON
filter adds overhead in parsing the complete result set for un-indexed duplicates, and the ORDER BY
is equally not optimized.
It is good practice to express relationships with explicit JOIN
statements, and use conditional expressions that can actually utilize (spatial) indexes.
Since a full traversal of a
is the basic idea behind your task, to actually get a conditionally limited (ordered) result set per row you want to use the LATERAL JOIN
mechanics:
SELECT a.id,
i.id,
i.intersect_area
FROM a
LEFT JOIN LATERAL (
SELECT b.id,
ST_Area(ST_Intersection(a.geom, b.geom)) AS intersect_area
FROM b
WHERE ST_Intersects(a.geom, b.geom)
ORDER BY
2 DESC
LIMIT 1
) AS i ON TRUE
;
While traversing a
, the LATERAL JOIN
passes the current row of a
into the sub-query, where ST_Intersects
filters (index driven) for intersecting polygons in b
, and returns only the one (LIMIT 1
) with the largest intersection area (ORDER BY [ST_Area(ST_Intersection(a.geom, b.geom))]
). Using a LEFT JOIN
ensures that rows in a
with no overlapping geometries in b
will be kept in the result; if that is undesireable, use CROSS JOIN LATERAL (<sub_query>) AS i
without ON TRUE
.
Notes:
- an index on
b.geom
is mandatory - the result of
ST_Area
will be in units of CRS - which, if applied to a geographic reference system with degree as unit, is useless - either project (ST_Transform
) to a suitable projection, or use a cast toGEOGRAPHY
-
For other proposites for example performance can be equal cross to a left join en this case?Cristián Vargas Acevedo– Cristián Vargas Acevedo2024年03月04日 20:43:20 +00:00Commented Mar 4, 2024 at 20:43
Explore related questions
See similar questions with these tags.
EXPLAIN
orEXPLAIN ANALYZE
run in SQL query questions.