I have an sjoin call from geopandas that behaves erratically: it works on some versions of the "points" GeoDataFrame but not on others.

merged = sjoin(points, polygons, how='left', op='within')

The error I get is always:

rtree.core.RTreeError: Coordinates must be in the form (minx, miny, maxx, maxy) or (x, y) for 2D indexes

The "polygons" GeoDataFrame never changes. The size of the "points" GeoDataFrame depends on how much data I choose to include (controlled by a parameter). Generally, the join fails when I include more data (e.g. 100,000 rows) and succeeds on smaller datasets (e.g. 2,000 rows). I assume this is because some rows contain invalid data. However, on visual inspection I cannot find anything wrong with any row.

Is there a way to quickly find out which rows are blocking the join, or to automatically ignore them?

I can't easily share the full code and data.

Andre Silva
asked Feb 11, 2016 at 3:31
  • Use the classic version without GeoPandas (More Efficient Spatial join in Python without QGIS, ArcGIS, PostGIS, etc.) and compare it with the GeoPandas version (gis.stackexchange.com/a/165413/2581). Commented Feb 11, 2016 at 16:41
  • Thanks, but do you have any idea why this wouldn't work, what this error means or how to find problematic data? I quite like geopandas so I don't want to discard its future use for a problem I don't even understand. Commented Feb 12, 2016 at 1:19
  • Try to find the problematic data with the solution without Pandas Commented Feb 14, 2016 at 10:04

1 Answer

There are various reasons why this error can occur. Here are the ones I have experienced, along with their solutions:

  1. Your input data sets do not have clean sequential indices (i.e. there are gaps in the sequence due to prior exclusion of rows).

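I'm not sure exactly why this causes the error, but it can be resolved by calling

df = df.reset_index(drop=True)

on each input GeoDataFrame (df here) before applying sjoin. Note that reset_index is a method of the data frame itself, not a function of the pandas module.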

  2. There are invalid geometry objects in your polygons data frame.

If your polygons were drawn by hand (i.e. manually in a GIS), they may have overlaps or self-intersections that don't translate well further along in the process. Your polygons could also be empty, which can happen in PostGIS with complex function sequences.

The solution is to ensure that all your polygons are of the correct type and are valid. In PostGIS you can use the functions ST_IsValid and ST_IsEmpty to check for this, then remove or repair any problem geometries. You should also check that you have Polygons or MultiPolygons, not GeometryCollections.
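If the data is already loaded in GeoPandas, the same checks can probably be done there without a round trip through PostGIS. The following is only a minimal sketch: the helper name find_problem_rows and the sample polygons are made up for illustration, and it relies on the is_valid, is_empty and geom_type attributes of a GeoSeries. It flags rows that are missing, empty, invalid or of an unexpected geometry type, drops them, and resets the index as described in point 1.

```python
import geopandas as gpd
from shapely.geometry import Polygon


def find_problem_rows(gdf, allowed_types=("Polygon", "MultiPolygon")):
    """Return a boolean Series that is True for rows likely to break a join.

    Hypothetical helper: flags missing, empty, invalid or wrongly typed geometries.
    """
    geom = gdf.geometry
    missing = geom.isna()
    empty = geom.is_empty
    invalid = ~geom.is_valid
    wrong_type = ~geom.geom_type.isin(allowed_types)
    return missing | empty | invalid | wrong_type


# Made-up sample data: two valid squares, a self-intersecting "bowtie"
# and an empty polygon.
polygons = gpd.GeoDataFrame(
    {"name": ["square", "bowtie", "empty", "square2"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(0, 0), (1, 1), (1, 0), (0, 1)]),
        Polygon(),
        Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),
    ],
)

bad = find_problem_rows(polygons)

# Inspect the offending rows, then drop them and rebuild a sequential index.
problem_rows = polygons[bad]
clean = polygons[~bad].reset_index(drop=True)
```

For the points frame the same helper could be called with allowed_types=("Point",). The cleaned frames would then be passed to sjoin as in the question.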

Andre Silva
answered Jan 3, 2018 at 23:59
