3

I just started working with QGIS, and for the life of me I cannot figure out how to solve this problem:

I have several points of interest, these are stored in a shapefile (just simple points). I have (for now) one raster file with values. What I want to do basically is select those cells in my raster that are near the points in the shapefile (I want several window-sizes, like first only one cell, than 9 (3x3), than 25 (5x5) etc). I want to average these cells and get the median and stdev.

I want to do this in a Python script, since I will probably have to do this again (and again) at some point. I am new to QGIS, but in ArcGIS I would have tried converting my points to a raster, and 'grow' this raster while calculating zonal statistics.

However since in QGIS it should be easier to get data-level access, I was hoping to avoid all that and just convert my points to a position in a raster, and than loop over the values using Python. I already figured out how to get to the data using ReadAsArray:

raster = gdal.Open(inputRaster)
band = raster.GetRasterBand(1)
data = band.ReadAsArray() 

However I am not sure how (if it is even possible once the data is converted to an array) to link the positions in my shapefile to the position in the array.

If this is not possible is there another strategy I should consider?

nmtoken
13.6k5 gold badges39 silver badges91 bronze badges
asked Nov 27, 2014 at 13:13
2
  • To clarify, you want to identify the raster cell containing a point, use that cell as a central cell and calculate the statistics of neighboring cells of a defined area? Commented Nov 27, 2014 at 17:09
  • Yes, that is exactly what I want. Commented Nov 28, 2014 at 7:58

1 Answer 1

3

You should use a scanline and the unpack struct function with Python/GDAL. The scanline/struct method depends on fmttypes and their values can be supplied in a dictionary. For testing my answer I prepared a little raster of 29 rows x 29 columns and selected a region of 9 x 9 pixels in the arbitrary position (xoff,yoff) = (5,7). I registered manually the values in the 9 x 9 data matrix:

(1669, 1666, 1661, 
1670, 1667, 1665, 
1672, 1671, 1669)

and their selected statistics (mean, median, minimun, maximun) were tested with the results of the next code at the Python Console Editor of QGIS:

from osgeo import gdal
import struct
import numpy as np
layer = iface.activeLayer()
provider = layer.dataProvider()
fmttypes = {'Byte':'B', 'UInt16':'H', 'Int16':'h', 'UInt32':'I', 'Int32':'i', 'Float32':'f', 'Float64':'d'}
path= provider.dataSourceUri()
dataset = gdal.Open(path)
band = dataset.GetRasterBand(1)
BandType = gdal.GetDataTypeName(band.DataType)
scanline = band.ReadRaster(5, 7, 3, 3, 3, 3, band.DataType)
values = struct.unpack(fmttypes[BandType] * 9, scanline)
print values
print "mean = ", np.mean(values)
print "median = ", np.median(values)
print "min = ", np.min(values)
print "max = ", np.max(values)
dataset = None

The results were the following:

(1669, 1666, 1661, 1670, 1667, 1665, 1672, 1671, 1669)
mean = 1667.77777778
median = 1669.0
min = 1661
max = 1672

and the statistics coincided. For complete information on format specifiers of 'ReadRaster' function see the Python GDAL documentation.

Taras
35.7k5 gold badges77 silver badges151 bronze badges
answered Feb 8, 2015 at 20:27

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.