I work with large drone image raster files, on the order of 40,000 x 40,000 and above. I have a large uncompressed GeoTIFF file and I want to use rasterio
to rewrite the file in compressed format. I can do this by loading all data into memory, but is there a way to execute this writing without loading everything to memory?
For my original file I can open it with:
dat = rasterio.open("grid_001.tif")
Then to rewrite the file with compression I tried:
profile = dat.profile.copy()
profile.update(
compress='lzw')
with rasterio.open("grid_001_compressed.tif", 'w', **profile) as dst:
dst.write(dat)
This will give me an error:
ValueError: Source shape (44134, 44162) is inconsistent with given indexes 3
This is an expected error, because when I open
the dataset it creates an iterator or lazy object without actually accessing the data. Now, if I did a command like:
dat = dat.read()
This will load all of the data from the file into memory, and I can an array of dimensions [3, 44134, 44162]
that I can write. BUT, this takes a lot of memory to implement.
Hence, is there a way to perform the same operation, but without loading everything into memory? I am not sure if windowed reads would help in this case or anything.
1 Answer 1
You can use rasterio's window or block reading & writing
dat = rasterio.open("grid_001.tif")
profile = dat.profile.copy()
profile.update(compress='lzw')
with rasterio.open("grid_001_compressed.tif", 'w', **profile) as dst:
for ji, window in dat.block_windows(1)
dst.write(dat.read(window=window), window=window)
-
this is great. Thanks for the direction. Will this work for multiband rasters too? I just ask because it has
dat.block_windows(1)
. My raster is bands, so would I need to iterate over bands, or could I just usedat.block_windows(3)
?krishnab– krishnab2020年07月27日 21:53:14 +00:00Commented Jul 27, 2020 at 21:53 -
2Assuming each band has the same block size (as per the documentation I linked to), then this will read & write all bands at the same timeuser2856– user28562020年07月27日 21:56:26 +00:00Commented Jul 27, 2020 at 21:56
gdal_translate
would work. I was looking for something that was more python and less command line, but your suggestion would work.gdal.Translate('output.tif', openDataset, creationOptions = ['COMPRESS=LZW'])