How to properly create a Spatial index and write a SARGable query that leverages that Spatial index?
I can't seem to get SQL Server to leverage a basic Spatial index on a GEOGRAPHY
object for even the simplest of queries.
Here is a dbfiddle.uk repro which demonstrates creating a simple table with a GEOGRAPHY
column called Coordinates
, adding 1 row with a Polygon to that table, and then creating a Spatial index on the Coordinates
column.
The query appears like it should be SARGable, but I still get a clustered index scan, instead of a nonclustered index seek on the Spatial index:
If I try to force the index hint, I get the classic error:
Msg 8635, Level 16, State 4, Line 15
The query processor could not produce a query plan for a query with a spatial index hint. Reason: Spatial indexes do not support the comparator supplied in the predicate. Try removing the index hints or removing SET FORCEPLAN.
So it's not that SQL Server thinks the clustered index scan is more performant, rather it just can't even come up with a query plan for the Spatial index.
According to Microsoft's docs on Spatial Indexes, the STDistance()
function should be applicable in a predicate for Spatial index sargability:
Spatial indexes support the following predicate forms:
- geography1.STDistance(geography2) <= number
Am I being dumb?
1 Answer 1
The documentation you link says
Spatial indexes support the following predicate forms:
geography1.STIntersects(geography2)= 1
geography1.STEquals(geography2)= 1
geography1.STDistance(geography2) < number
geography1.STDistance(geography2) <= number
So as long as you are looking for cases where the result of the distance function is less than (or less than or equal to) some number you shouldn't be seeing a message about an invalid "comparator supplied in the predicate"
(The conditions must also be expressed as above = e.g. 1 > Coordinates.STDistance(@Point)
fails when explicitly tying to force the named spatial index with "Reason: Could not find required binary spatial method in a condition")
The spatial index can be forced with either the FORCESEEK
or an explicit INDEX
hint (Fiddle). The explicit index hint seems to give more useful information in the event of a failure.
If you do force the spatial index in your case the execution plan calls the
[GeodeticTessellation]
table valued function passing parameters @Point, 768, 9, 1, @Distance
to get the values to use in the spatial index seek 768
is the default SPATIAL_WINDOW_MAX_CELLS
and can be overridden with the corresponding hint. I'm not sure what the 9,1
correspond to.
This emits 8 rows that are then used to seek into the spatial index, an aggregate collapses the result of this join down to produce one row per PK that is then used in a clustered index lookup against the base table and finally there is a filter with a residual predicate. The optimiser rule responsible for this is SpatialIntersectFilterOverGridIndex
.
In the case of your example data this is costed as 45 times more expensive than just scanning the table and evaluating the predicate against each row (subtree costs of 0.0042838
vs 0.193235
)
Your original fiddle had an expression WHERE Coordinates.STDistance(@Point) IS NOT NULL
. This is not one of the two supported comparator predicates for STDistance
that can use a spatial index (except if the query also meets all the conditions for a "nearest neighbour" ordering).
- A spatial index must be present on one of the spatial columns and the
STDistance()
method must use that column in theWHERE
andORDER BY
clauses. - The TOP clause cannot contain a
PERCENT
statement. - The WHERE clause must contain a
STDistance()
method. - If there are multiple predicates in the
WHERE
clause then the predicate containingSTDistance()
method must be connected by an AND conjunction to the other predicates. TheSTDistance()
method cannot be in an optional part of the WHERE clause. - The first expression in the ORDER BY clause must use the
STDistance()
method. - Sort order for the first
STDistance()
expression in theORDER BY
clause must beASC
. - All the rows for which
STDistance
returnsNULL
must be filtered out.
This essentially ends up adding an implicit predicate.
The execution plan for
SELECT TOP(@N) Coordinates.STDistance(@Point), * FROM dbo.GeoTest WITH (INDEX =IX_TestSpatial)
WHERE Coordinates.STDistance(@Point) IS NOT NULL
ORDER BY Coordinates.STDistance(@Point)
Works similarly to this pseudo code
SELECT TOP (@N) *
FROM (VALUES
(0, 86.153535551610716e0),
(1, 172.30707110322143e0),
(2, 344.61414220644286e0),
(3, 689.22828441288573e0),
(4, 1378.4565688257715e0),
(5, 2756.9131376515429e0),
(6, 5513.8262753030858e0),
(7, 11027.652550606172e0),
(8, 22055.305101212343e0),
(9, 44110.610202424687e0),
(10, 88221.220404849373e0),
(11, 176442.44080969875e0),
(12, 352884.88161939749e0),
(13, 705769.76323879499e0),
(14,1411539.5264775900e0),
(15,2823079.0529551799e0),
(255,1.7976931348623157e+308 /*MAX_FLOAT*/)
) V(Level1003, MaxDistance1004)
CROSS APPLY
(
SELECT TOP (@N) *,
Coordinates.STDistance(@Point) AS Dist,
COUNT(*) OVER (PARTITION BY Level1003) AS CountMatchedThisLevel
FROM dbo.GeoTest WITH(FORCESEEK)
WHERE Coordinates.STDistance(@Point) <= MaxDistance1004
ORDER BY Dist
) CA
WHERE CountMatchedThisLevel >= @N OR Level1003 = 255
ORDER BY Level1003, Dist;
where it tries exponentially increasing distances in order and can stop on the iteration finding at least @N
results and return the TOP @N
results for that level (or the results for the final level if none of the distances tried found @N
results)
-
1I was being a bit of a knob and misread some of the documentation for the
STContains
function which is what my real use case was using. I'm all good now. Appreciate the answer!J.D.– J.D.2024年10月18日 21:12:53 +00:00Commented Oct 18, 2024 at 21:12
Explore related questions
See similar questions with these tags.