I am trying to take a polygons shapefile, loop through all the polygons/rows, and rasterize each one, where each raster file output rasterizes its associated polygon as if it were isolated (the only polygon in the layer). In the attribute table of this shapefile, there is a "Values" column. I want to burn in each polygon's associated "Values" value into its rasterized polygon, where all other area in the raster is 0. Lastly, I want each raster file output to be numbered (e.g. "Raster_output_35" etc.) I am trying to use python, geopandas, and rasterio to accomplish this. I am trying the following code in Jupyter Notebook:
from osgeo import gdal
from osgeo import ogr
import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
import os
from shapely.geometry import MultiPolygon
from rasterio.mask import raster_geometry_mask
# read in polygon file
gdf = gpd.read_file('polygons.shp')
# get extent of all polygons
xmin,ymin,xmax,ymax = MultiPolygon((gdf['geometry'].tolist())).bounds
# set raster resolution and use that to get width and height
res = 1 # just an arbitrary resolution i chose for my EPSG:4326 polygons
width = int((xmax-xmin)/res)
height = int((ymax-ymin)/res)
# get the affine transformation for our empty raster
transform = rasterio.transform.from_origin(xmin,ymax,res,res)
# create rasterio dataset with empty raster
with rasterio.open('new.tif','w',driver='GTiff',height=height,width=width,
count=1,dtype='uint8',crs='EPSG:4326',
transform=transform) as empty:
# loop through polygon geodataframe creating rasters from polygons
for ind,row in gdf.iterrows():
mask,mask_tform,window = raster_geometry_mask(empty,[row['geometry']],invert=True)
mask = mask.astype('uint8')*gdf['test_value'] # "Values" value, 0 elsewhere
# write out mask as a raster, use metadata from empty dataset for parameters
outpath = f'raster{ind}.tif' # raster0.tif, raster1.tif, etc
with rasterio.open(outpath,'w',**empty.meta) as dst:
dst.write(mask,1)
This code when run in Jupyter Notebook produces the following error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_6924/1739464051.py in <module>
3
4 # get extent of all polygons
----> 5 xmin,ymin,xmax,ymax = MultiPolygon((gdf['geometry'].tolist())).bounds
6
7 # set raster resolution and use that to get width and height
~\AppData\Roaming\Python\Python39\site-packages\shapely\geometry\multipolygon.py in __init__(self, polygons, context_type)
58 return
59 elif context_type == 'polygons':
---> 60 geom, n = geos_multipolygon_from_polygons(polygons)
61 elif context_type == 'geojson':
62 geom, n = geos_multipolygon_from_py(polygons)
~\AppData\Roaming\Python\Python39\site-packages\shapely\geometry\multipolygon.py in geos_multipolygon_from_polygons(arg)
183 # no implicit flattening.
184 if isinstance(obs[0], MultiPolygon):
--> 185 raise ValueError("Sequences of multi-polygons are not valid arguments")
186
187 exemplar = obs[0]
ValueError: Sequences of multi-polygons are not valid arguments
It appears that the error is being caused by this line specifically:
xmin,ymin,xmax,ymax = MultiPolygon((gdf['geometry'].tolist())).bounds
How can I fix this code so that I can rasterize each polygon in my shapefile individually and burn in its corresponding "Values" value? I am not sure if there might be something wrong with my shapefile itself, because the prior line:
gdf = gpd.read_file('polygons.shp')
runs fine.
1 Answer 1
This is a shapely problem in the creation of MultiPolygons. See Why is a MultiPolygon of a MultiPolygon smaller than its input? for example:
The constructor of MultiPolygon calls the function geos_multipolygon_from_polygons which performs following check:
# This function does not accept sequences of MultiPolygons: there is # no implicit flattening. if isinstance(obs[0], MultiPolygon): raise ValueError("Sequences of multi-polygons are not valid arguments")
import geopandas as gpd
gdf = gpd.read_file('polygons.shp')
gdf.geometry
0 POLYGON ((251828.453 139390.562, 251854.391 13...
1 MULTIPOLYGON (((257510.859 138175.375, 257428....
2 POLYGON ((257397.594 138708.062, 257397.031 13...
3 MULTIPOLYGON (((251627.188 144678.219, 251608....
Name: geometry, dtype: geometry
Reproduction of the error:
a = gdf.iloc[0:3]
a.geometry
0 POLYGON ((251828.453 139390.562, 251854.391 13...
1 MULTIPOLYGON (((257510.859 138175.375, 257428....
2 POLYGON ((257397.594 138708.062, 257397.031 13...
Name: geometry, dtype: geometry
MultiPolygon((a.geometry.tolist())).bounds
(247965.21875, 138174.9375, 258014.09375, 141038.390625)
# but
b = gdf.iloc[1:4]
b.geometry
1 MULTIPOLYGON (((257510.859 138175.375, 257428....
2 POLYGON ((257397.594 138708.062, 257397.031 13...
3 MULTIPOLYGON (((251627.188 144678.219, 251608....
Name: geometry, dtype: geometry
MultiPolygon((b.geometry.tolist())).bounds
Traceback (most recent call last):
......
raise ValueError("Sequences of multi-polygons are not valid arguments")
ValueError: Sequences of multi-polygons are not valid arguments
To get the extent of the shapefile/GeoDataFrame (all polygons), use total_bounds
xmin,ymin,xmax,ymax = a.total_bounds
print(xmin,ymin,xmax,ymax)
247965.21875 138174.9375 258014.09375 141038.390625
# and
xmin,ymin,xmax,ymax = b.total_bounds
print(xmin,ymin,xmax,ymax)
242520.0625 138175.0 258014.09375 146067.203125
xmin,ymin,xmax,ymax = gdf.total_bounds
print(xmin,ymin,xmax,ymax)
242520.0625, 138174.9375, 258014.09375, 146067.203125
Or unary_union
a.geometry.unary_union.bounds
(247965.21875, 138174.9375, 258014.09375, 141038.390625)
b.geometry.unary_union.bounds
(242520.0625, 138175.0, 258014.09375, 146067.203125)
gdf.geometry.unary_union.bounds
(242520.0625, 138174.9375, 258014.09375, 146067.203125)
New
The problem is the creation of a Multipolygon with your data, so don't use it (I don't use it in my solutions)
MultiPolygon((b['geometry'].tolist())).total_bounds
Traceback (most recent call last):
.........
ValueError: Sequences of multi-polygons are not valid arguments
b['geometry'].tolist().total_bounds
Traceback (most recent call last):
........
AttributeError: 'list' object has no attribute 'total_bounds'
b['geometry'].total_bounds
array([242520.0625, 138175.0., 258014.0937,146067.203125])
b.total_bounds
array([242520.0625, 138175.0, 258014.09375 , 146067.203125])
-
This is very informative, thank you. Though I tried both
xmin,ymin,xmax,ymax = MultiPolygon((gdf['geometry'].tolist())).total_bounds
andxmin,ymin,xmax,ymax = MultiPolygon((gdf['geometry'].tolist())).unary_union
and got the error as before:ValueError: Sequences of multi-polygons are not valid arguments
Did I misunderstand the syntax you were referring to?LostinSpatialAnalysis– LostinSpatialAnalysis2021年12月06日 01:35:57 +00:00Commented Dec 6, 2021 at 1:35 -
Explore related questions
See similar questions with these tags.