I'm writing a script with GeoPandas. I want to use a CSV of blocks in a spatial join, so I convert it to a GeoDataFrame. But when I try to set the geometry column, it fails with: Input geometry column must contain valid geometry objects.
Here is my code to import the CSV file:
import pandas
import geopandas as gpd
csv_df = pandas.read_csv(csv_file)
csv_gdf = gpd.GeoDataFrame(csv_df)
csv_gdf = csv_gdf.set_geometry('geometry')
Here is csv_gdf.head() before I try to set the geometry column:
        id      name shortName  accountId  isMonitored  varietalId  ranchId  \
0  14633.0    HC4bas      HC4b      346.0        False         4.0    855.0
1  14634.0   HC3haut      HC3h      346.0        False         4.0    855.0
2  14637.0      HC12      HC12      346.0        False         2.0    855.0
3  14638.0  HC11haut      HC11      346.0        False        72.0    855.0
4  14641.0    HC9bas      HC9b      346.0        False         4.0    855.0
inRowDistance betweenRowDistance \
0 1.2 1.5
1 1.2 1.5
2 1.2 1.5
3 0.9 1.5
4 1.2 1.5
geometry ... \
0 POLYGON ((-0.1642995066034836 44.9397295596186... ...
1 POLYGON ((-0.1634854066129132 44.9405302549332... ...
2 POLYGON ((-0.1624824342183362 44.9398350833047... ...
3 POLYGON ((-0.1592356652491378 44.9399712591478... ...
4 POLYGON ((-0.1610166332996532 44.9391145465108... ...
   slopeInclination  slopeOrientation  soilType  rowOrientation  dashboardId  \
0               NaN               NaN       NaN             NaN          NaN
1               NaN               NaN       NaN             NaN          NaN
2               NaN               NaN       NaN             NaN          NaN
3               NaN               NaN       NaN             NaN          NaN
4               NaN               NaN       NaN             NaN          NaN
canopySystemId canopyWidth topWireHeight clusterWireHeight \
0 NaN 0.45 NaN NaN
1 NaN 0.45 NaN NaN
2 NaN 0.45 NaN NaN
3 NaN 0.45 NaN NaN
4 NaN 0.45 NaN NaN
pruningSystemId
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
[5 rows x 27 columns]
1 Answer
You probably have invalid geometries in your dataset. To find them, you can either load your CSV into QGIS and run Vector -> Geometry Tools -> Check Validity,
or loop through your dataframe to find the invalid geometries:
for index, row in csv_gdf.iterrows():
    geom = row['geometry']
    if len(geom.coords) <= 2:
        print("This row has an invalid polygon geometry")
        # this is just one example of an invalid geometry; there are also overlapping vertices, ...
I would recommend the first check, even though QGIS is not tagged in your question.
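As a complement to the manual loop, here is a minimal sketch of a validity check that relies on shapely's own `is_valid` property and `explain_validity` function. It only makes sense once the geometries are shapely objects rather than strings. The two sample polygons and the `find_invalid` helper are hypothetical, made up for illustration.

```python
from shapely.geometry import Polygon
from shapely.validation import explain_validity

# Hypothetical sample geometries: a plain square and a self-intersecting "bowtie".
geometries = {
    "square": Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    "bowtie": Polygon([(0, 0), (1, 1), (1, 0), (0, 1)]),
}


def find_invalid(geoms):
    """Return {name: reason} for every geometry that fails shapely's validity test."""
    return {
        name: explain_validity(geom)
        for name, geom in geoms.items()
        if not geom.is_valid
    }


invalid = find_invalid(geometries)
for name, reason in invalid.items():
    print(name, "->", reason)
```

On a GeoDataFrame whose geometry column is already set, `csv_gdf.is_valid` should give the same True/False information per row.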
EDIT: generating the geometry as a shapely.geometry object
from shapely.wkt import loads
# either all at once:
csv_gdf['geometry'] = csv_gdf['geometry'].apply(loads)
# or one by one to detect possible geometry errors
for index, row in csv_gdf.iterrows():
    # it will throw an error where the geometry WKT isn't valid
    # csv_gdf.set_value(index, 'geometry', loads(row['geometry'])) --> deprecated
    csv_gdf.loc[index, 'geometry'] = loads(row['geometry'])
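Putting the pieces together, the whole CSV-to-GeoDataFrame conversion could look roughly like the sketch below. The in-memory two-row CSV is a hypothetical stand-in for the real file, and the EPSG:4326 CRS is an assumption based on the coordinates in the question looking like longitude/latitude degrees.

```python
import io

import geopandas as gpd
import pandas as pd
from shapely import wkt

# Hypothetical two-row CSV standing in for the real file;
# the geometry column holds WKT text, as in the question.
csv_text = io.StringIO(
    'id,name,geometry\n'
    '1,A,"POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))"\n'
    '2,B,"POLYGON ((2 2, 3 2, 3 3, 2 3, 2 2))"\n'
)

csv_df = pd.read_csv(csv_text)

# Parse each WKT string into a shapely geometry object.
csv_df["geometry"] = csv_df["geometry"].apply(wkt.loads)

# Build the GeoDataFrame and name the geometry column explicitly.
# The CRS is an assumption; use whatever your data is actually in.
csv_gdf = gpd.GeoDataFrame(csv_df, geometry="geometry", crs="EPSG:4326")

print(csv_gdf.geom_type.tolist())
```

Converting the column before building the GeoDataFrame means `set_geometry` never sees plain strings, which is what triggers the error in the question.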
It returns an error too:
if len(geom.coords) <= 2: AttributeError: 'str' object has no attribute 'coords'
– Tim C., Jan 12, 2018 at 15:31
That means the geometry column contains strings, whereas it should contain shapely.geometry objects. I'll edit my answer to generate the geometry properly.
– Hicham Zouarhi, Jan 12, 2018 at 15:48
When you use apply with a function and pass the cell value through unchanged, you can write .apply(loads) directly; there is no need for a lambda function.
– ImanolUr, Aug 14, 2019 at 8:40
@ImanolUr I don't get it; the cell here is modified, since the result goes into the same series as the input.
– Hicham Zouarhi, Aug 14, 2019 at 9:35
Sorry, I did not make myself clear. What I mean is that if you do e.g. `lambda x: function(x + 15)`, so you perform an extra action on x, then you need a lambda. But when you just take the value x and apply a function to it, you don't need a lambda.
– ImanolUr, Aug 14, 2019 at 12:30
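To illustrate the point about lambdas, here is a small sketch using a plain pandas Series. `math.sqrt` is only a stand-in for `loads`, and the values are made up.

```python
import math

import pandas as pd

values = pd.Series([1, 10, 21])

# Passing the function itself: apply calls it on each cell.
direct = values.apply(math.sqrt)

# Wrapping it in a lambda that adds nothing gives the same result.
wrapped = values.apply(lambda x: math.sqrt(x))

# A lambda is needed only when something extra happens to x first.
shifted = values.apply(lambda x: math.sqrt(x + 15))

print(direct.equals(wrapped))
print(shifted.tolist())
```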
csv_gdf.geom_type
it returns None