I have a folder of raster files, each containing a single rasterized polygon, where a value of 1 is burned into the polygon shapes, and where all else is no data (or could be simply changed to 0).
I would like to loop through this folder to run Euclidean distance on each of the individual raster files. To do this I would like to use gdal_proximity
, though I would like to use this as a Python function rather than as a command line command. I am referring to this GDAL gdal_proximity documentation here: https://gdal.org/programs/gdal_proximity.html
Referencing this resource, the command line syntax looks like:
gdal_proximity.py <srcfile> <dstfile> [-srcband n] [-dstband n]
[-of format] [-co name=value]*
[-ot Byte/UInt16/UInt32/Float32/etc]
[-values n,n,n] [-distunits PIXEL/GEO]
[-maxdist n] [-nodata n] [-use_input_nodata YES/NO]
[-fixed-buf-val n]
How might I restructure this command syntax so that it works as a Python function that I could simply run in Jupyter Notebook? I have referred to this "GDAL/OGR Python API" resource here: https://gdal.org/python/
However, I cannot find a Python Jupyter Notebook-ready version of gdal_proximity
here. How can I format this command to accept an example input raster of raster.tif
?
I am hoping I am using the terminology API
correctly in this context, meaning a Python "version" of the original command. I see the gdal command is entered as gdal_proximity.py
, but I am not sure if this would be the same thing as something like gdal.Proximity()
.
1 Answer 1
Probably the best docs are the C API docs for GDALComputeProximity
CPLErr GDALComputeProximity(GDALRasterBandH hSrcBand, GDALRasterBandH hProximityBand, char **papszOptions, GDALProgressFunc pfnProgress, void *pProgressArg)
The progress function args may be NULL or a valid progress reporting function such as GDALTermProgress/NULL.
Options:
VALUES=n[,n]*
A list of target pixel values to measure the distance from. If this option is not provided proximity will be computed from non-zero pixel values. Currently pixel values are internally processed as integers.
DISTUNITS=[PIXEL]/GEO
Indicates whether distances will be computed in pixel units or in georeferenced units. The default is pixel units. This also determines the interpretation of MAXDIST.
MAXDIST=n
The maximum distance to search. Proximity distances greater than this value will not be computed. Instead output pixels will be set to a nodata value.
NODATA=n
The NODATA value to use on the output band for pixels that are beyond MAXDIST. If not provided, the hProximityBand will be queried for a nodata value. If one is not found, 65535 will be used.
USE_INPUT_NODATA=YES/NO
If this option is set, the input data set no-data value will be respected. Leaving no data pixels in the input as no data pixels in the proximity output.
FIXED_BUF_VAL=n
If this option is set, all pixels within the MAXDIST threshold are set to this fixed value instead of to a proximity distance.
The Python docs for osgeo.gdal.ComputeProximity
are just:
osgeo.gdal.ComputeProximity(Band srcBand, Band proximityBand, char ** options=None, GDALProgressFunc callback=0, void * callback_data=None)→ int
As mentioned in the comments, the best way to see how this is used is scripts/compute_proximity.py
:
import gdal
# open rasterized file and get information
fn = '../data/raster/COL_rails.tif'
ds = gdal.Open(fn, 0)
band = ds.GetRasterBand(1)
gt = ds.GetGeoTransform()
sr = ds.GetProjection()
cols = ds.RasterXSize
rows = ds.RasterYSize
# create empty proximity raster
out_fn = '../data/raster/COL_rails_proximity.tif'
driver = gdal.GetDriverByName('GTiff')
out_ds = driver.Create(out_fn, cols, rows, 1, gdal.GDT_Float32)
out_ds.SetGeoTransform(gt)
out_ds.SetProjection(sr)
out_band = out_ds.GetRasterBand(1)
# compute proximity
gdal.ComputeProximity(band, out_band, ['VALUES=1', 'DISTUNITS=PIXEL'])
# delete input and output rasters
del ds, out_ds
gdal.ComputeProximity()
at github.com/marcelovilla/gdal-py-snippets/blob/master/scripts/…. Seems a bit simpler.