When I open a raster stack with GDAL and read it as a numpy array, the nodata values appear in the array as well. I do not want to include these nodata values (mine is 128) in my calculations, so I am looking for a way to exclude them.
Is there a way to keep nodata values out of the numpy array when opening the raster stack? If not, what would you recommend?
My code is here:
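import glob
from osgeo import gdal

outvrt = 'result/raster_stack_vrt.tif'
outtif = 'result/raster_stack.tif'
tifs = glob.glob('data/*.tif')
# Stack the input files as separate bands in a virtual raster, then write it to a GeoTIFF
outds = gdal.BuildVRT(outvrt, tifs, separate=True)
outds = gdal.Translate(outtif, outds)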
2 Answers
You can create a mask with numpy. Open the raster as a numpy array, run the code below, then plot the result:
import numpy as np
import rasterio

src = rasterio.open(inputpath_raster)  # inputpath_raster: path to your raster file
raster = src.read(1)  # first band as a 2-D array
value = 0  # nodata value to mask; set this to your own nodata value
raster = raster.astype('float32')  # a float type is needed to hold NaN
mask = np.ones_like(raster)
mask[raster == value] = np.nan  # nodata cells become NaN, all others stay 1
raster_nan = mask * raster  # NaN wherever the raster held the nodata value
The process is something like this. I think you have to repeat it for each band in your stack.
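If the whole stack is held in one 3-D array, the per-band repetition can be done in a single step. The following is a minimal sketch, not part of the original answer: it assumes the stack has shape (bands, rows, cols), which is what rasterio's read() returns when no band index is given, and it uses a small synthetic array in place of a real file. The names NODATA, stack, stack_nan and band_means are illustrative.

```python
import numpy as np

NODATA = 128  # nodata value from the question

# Synthetic stand-in for a stack of shape (bands, rows, cols),
# e.g. the result of reading every band of a raster at once.
stack = np.array(
    [[[1, 128], [3, 4]],
     [[128, 6], [7, 128]]],
    dtype="uint8",
)

# Convert to float so NaN can be stored, then blank out nodata in every band.
stack_nan = stack.astype("float32")
stack_nan[stack == NODATA] = np.nan

# One mean per band, ignoring the NaN cells.
band_means = np.nanmean(stack_nan, axis=(1, 2))
```

The boolean comparison is applied to all bands at once, so no explicit loop over bands is needed.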
No data value is 128. – GeoMonkey, Jun 4, 2023 at 11:12
The most efficient way to do this is with numpy, using NaN values or masked arrays:
import numpy as np
import rasterio as rio
src = rio.open('file.tif')
arr = src.read(1)
masked_arr = np.where(arr==128, np.nan, arr)
Then you can treat the array as you usually would when dealing with nan values. You can mask the array if you like:
https://numpy.org/devdocs/reference/maskedarray.generic.html
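As a rough illustration of the masked-array route, here is a small sketch using numpy.ma. It assumes the nodata value is 128, as in the question, and uses a synthetic array instead of a band read from file; the names arr, masked, mean_valid and n_valid are illustrative.

```python
import numpy as np

NODATA = 128  # nodata value from the question

# Synthetic stand-in for a single band read from the raster.
arr = np.array([[10, 128], [30, 40]], dtype="int16")

# Mask every cell equal to the nodata value; the original dtype is kept.
masked = np.ma.masked_equal(arr, NODATA)

# Statistics on a masked array skip the masked cells.
mean_valid = masked.mean()
n_valid = masked.count()  # number of unmasked cells
```

Unlike the NaN approach, a masked array keeps the integer dtype, which can matter for large stacks. rasterio's read() also accepts masked=True, which returns a masked array based on the dataset's declared nodata value, provided one is set in the file.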
But if your calculations are simple, you can use numpy's built-in NaN-aware functions:
np.nanmean(masked_arr)