I have a BigTIFF file that I need to split into tiles with a set tile size and overlap. I have a script for this using PIL:
tile_height = tile_width = 1000
overlap = 80
stride = tile_height - overlap
start_num=0

def crop(infile, tile_height, tile_width, stride, img_dict, prj_name):
    im = Image.open(infile)
    img_width, img_height = im.size
    print(im.size)
    print(img_width * img_height / (tile_height - stride) / (tile_width - stride))
    count = 0
    for r in range(0, img_height-tile_height+1, stride):
        for c in range(0, img_width-tile_width+1, stride):
            #tile = im[r:r+100, c:c+100]
            box = (c, r, c+tile_width, r+tile_height)
            top_pixel = [c,r]
            img_dict[prj_name + "---" + str(count) + ".png"] = top_pixel
            count += 1
            yield im.crop(box)

img = Image
img_dict = {}
# create the dir if it doesn't already exist
if not os.path.exists(img_dir):
    os.makedirs(img_dir)
# break it up into crops
for k, piece in enumerate(crop(infile, tile_height, tile_width, stride, img_dict, prj_name), start_num):
    img=Image.new('RGB', (tile_height, tile_width), (255, 255, 255))
    print(img.size)
    print(piece.size)
    img.paste(piece)
    image_name = prj_name + "---%s.png" % k
    path=os.path.join(img_dir, image_name)
    img.save(path)

#add a json file with all image names and geospatial metadata
full_dict = {"image_name" : infile,
             "image_locations" : img_dict,
             "crs" : str(dataset.crs)
             }
with open(img_dir + '/data.json', 'w') as fp:
    json.dump(full_dict, fp)
I can't use PIL on my other rasters, as they are BigTIFF files, which PIL does not support. I am looking for a way to port this script to another module while keeping these exact parameters. The parameters and naming scheme need to stay exactly the same, because I'm using these tiles for a deep learning model that I have already created.
I have never used something like GDAL before, but I've read that it may be my best bet for BigTIFF tiling. I would really like to find a way to do this in Python.
rasterio should pretty much drop into your existing script since it'll return a numpy array. It wraps gdal but is much more convenient to work with than the Python bindings. It normally comes with BigTIFF support but depends on how it was built, I believe. If you grab it via conda/conda-forge it should, at least. – mikewatt, Apr 3, 2020 at 19:27
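A minimal sketch of what the rasterio route suggested in the comment above might look like, keeping the question's loop bounds, tile numbering and "prj_name---N.png" naming. The helper names `tile_origins` and `crop_with_rasterio` are hypothetical, and the sketch assumes the raster's first three bands are 8-bit R, G, B and that the installed rasterio build supports BigTIFF:

```python
import os


def tile_origins(img_width, img_height, tile_width, tile_height, stride):
    """Yield (count, col, row) in the same order as the PIL loop in the question."""
    count = 0
    for r in range(0, img_height - tile_height + 1, stride):
        for c in range(0, img_width - tile_width + 1, stride):
            yield count, c, r
            count += 1


def crop_with_rasterio(infile, img_dir, prj_name,
                       tile_height=1000, tile_width=1000, overlap=80):
    # Imported here so tile_origins() can be used without these packages.
    import rasterio
    from rasterio.windows import Window
    from PIL import Image

    stride = tile_height - overlap
    img_dict = {}
    os.makedirs(img_dir, exist_ok=True)
    with rasterio.open(infile) as src:
        for count, c, r in tile_origins(src.width, src.height,
                                        tile_width, tile_height, stride):
            # Assumption: bands 1-3 are 8-bit R, G, B.
            arr = src.read([1, 2, 3], window=Window(c, r, tile_width, tile_height))
            name = prj_name + "---" + str(count) + ".png"
            # rasterio returns (bands, rows, cols); PIL expects (rows, cols, bands).
            Image.fromarray(arr.transpose(1, 2, 0).copy()).save(
                os.path.join(img_dir, name))
            img_dict[name] = [c, r]
        crs = str(src.crs)
    return img_dict, crs
```

The returned `img_dict` and `crs` can then be written to data.json in the same way as in the original script.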
2 Answers
You can use a for loop and read one tile at a time with the ReadAsArray() method of the GDAL dataset, passing both an offset and a window size as arguments. This method returns a numpy array, which you can then easily export to a JPG file.
Your code could look something like this:
from osgeo import gdal

# open the TIFF file in read-only mode and get its dimensions
ds = gdal.Open(r'C:\path\to\your\raster.tif', gdal.GA_ReadOnly)
width = ds.RasterXSize
height = ds.RasterYSize

# define tile size and number of pixels to move in each direction
tile_size_x = 256
tile_size_y = 256
stride_x = 128
stride_y = 128

for x_off in range(0, width, stride_x):
    for y_off in range(0, height, stride_y):
        # read tile
        arr = ds.ReadAsArray(x_off, y_off, tile_size_x, tile_size_y)
        # export image using either PIL, gdal or some other library
Of course, you'll need to deal with the edge cases when there are not enough pixels left in the x or y axis.
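One possible way to handle those edge cases is to shrink the read window wherever it would run past the raster boundary. The sketch below is only an illustration: `tile_windows` is a hypothetical helper, not part of GDAL, and clamping is just one option (the script in the question instead skips partial tiles altogether).

```python
def tile_windows(width, height, tile_size_x, tile_size_y, stride_x, stride_y):
    """Yield (x_off, y_off, x_size, y_size), clamped to the raster extent."""
    for y_off in range(0, height, stride_y):
        for x_off in range(0, width, stride_x):
            # Shrink the window if fewer pixels remain than a full tile needs.
            x_size = min(tile_size_x, width - x_off)
            y_size = min(tile_size_y, height - y_off)
            yield x_off, y_off, x_size, y_size


# Intended use with the dataset opened above (not run here):
# for x_off, y_off, x_size, y_size in tile_windows(width, height, 256, 256, 128, 128):
#     arr = ds.ReadAsArray(x_off, y_off, x_size, y_size)
```

Tiles along the right and bottom edges come out smaller than the nominal tile size, so pad them afterwards if the model needs a fixed input shape.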
The gdal_retile.py command-line tool can also be used, as below. Adapt the function to take the tile and overlap sizes as parameters.
import ntpath
import os
from pathlib import Path

def generate_tiles(input_geotiff):
    # name the output directory after the input file, e.g. "raster_tiles"
    targetDir=ntpath.basename(input_geotiff).split('.')[0]+"_tiles"
    Path(targetDir).mkdir(parents=True, exist_ok=True)
    command = "gdal_retile.py -ps 512 512 -overlap 128 -targetDir "+ targetDir + " " + input_geotiff
    print(os.popen(command).read())
    return targetDir
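As a rough sketch of the parameterised version mentioned above, the tile and overlap sizes can be passed in as arguments and the command built as a list, which avoids quoting problems with paths that contain spaces. `build_retile_command` is a hypothetical helper, and the sketch assumes gdal_retile.py is on the PATH and directly executable (on Windows it may need to be run through the Python interpreter instead):

```python
import subprocess
from pathlib import Path


def build_retile_command(input_geotiff, tile_size=512, overlap=128):
    """Return (argument list, target directory) for a gdal_retile.py call."""
    # Same naming rule as above: text before the first dot, plus "_tiles".
    target_dir = Path(input_geotiff).name.split(".")[0] + "_tiles"
    command = [
        "gdal_retile.py",
        "-ps", str(tile_size), str(tile_size),
        "-overlap", str(overlap),
        "-targetDir", target_dir,
        str(input_geotiff),
    ]
    return command, target_dir


def generate_tiles(input_geotiff, tile_size=512, overlap=128):
    command, target_dir = build_retile_command(input_geotiff, tile_size, overlap)
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if gdal_retile.py exits with an error.
    subprocess.run(command, check=True)
    return target_dir
```

For the parameters in the question this would be called as generate_tiles(infile, tile_size=1000, overlap=80). Note that gdal_retile.py picks its own output file names, so the "---N.png" naming from the question is not preserved.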