I'm trying to do something fairly simple. Take two subdatasets from an hdf file, read them as an array, do some math and write the new array back to a geotiff, using the same projection as the input files.
Because I'll be repeating this process for a list of .hdf files where the projection varies, I cannot hardcode the projection string for my outputs. While the projection varies from hdf file to file, it is the same for all subdatasets in a particular hdf file. So I only need to get the proj from one of the subdatasets. I thought it would be fairly easy to store the projection info into a variable, but the normal routines I use aren't working on the hdf file.
Here are few different things I've tried:
ds = open(hdfFile)
sds_b3 = ds.GetSubDatasets()[4][0]
sds_b4 = ds.GetSubDatasets()[5][0]
# this gives me an empty string:
proj = ds.GetProjection()
# both of these give me "AttributeError: 'str' object has no attribute 'GetProjection":
proj = ds.GetSubDatasets()[4][0].GetProjection()
proj = ds.GetSubDatasets()[4][1].GetProjection()
# this last one works in that it prints the correct projection info, in addition
# to a bunch of other metadata, but I don't know how to extract just the projection
# and store it into a variable:
os.system('gdalinfo %s' % sds_b3)
How can I store the projection information in a variable, like 'proj = dataset.GetProjection()' normally would, so that I can use it when writing the new array to a geotiff? ie:
output = gdal.GetDriverByName("GTiff").Create(outputname, nlines, nrows)
output.SetProjection(proj)
I either need to know the correct way to use GetProjection() when using hdf's or how to isolate the projection part of gdalinfo as a variable.
2 Answers 2
This is fairly straightforward if you think of the HDF dataset as a container, where each subdataset is a raster image with its own projection.
Your error is in not opening the subdataset, as GetSubDatasets
only returns the strings you need to access them.
# open the HDF container
hdf_ds = gdal.Open(hdfFile)
# this is just a string of the name of the subdataset
b3_string = hdf_ds.GetSubDatasets()[4][0]
# open the subdataset
sds_b3 = gdal.Open(hdf_ds.GetSubDatasets()[4][0])
# get the projection
proj = sds_b3.GetProjection()
I suspect your issue is happening right in the beginning of the script, perhaps not getting at the datasets properly here:
ds = open(hdfFile)
sds_b3 = ds.GetSubDatasets()[4][0]
sds_b4 = ds.GetSubDatasets()[5][0]
...there doesn't seem to be anything wrong in your sytax for proj = ds.GetProjection()
.
I've been converting GeoTiffs to numpy arrays and back again, and this works for me:
import os, numpy, gdal
# Open the input raster and get some info about it
ds = gdal.Open(inRast)
b = ds.GetRasterBand(1) # Assuming there's only the one band
xSize = b.XSize
ySize = b.YSize
# Getting geotransform and projection information...
geoTrans = ds.GetGeoTransform()
wktProjection = ds.GetProjection() # Well-Known Text.
# Reading to numpy array...
bArr = gdal.Band.ReadAsArray(b)
# do stuff with the array now...
# Save the same-size-and-shape array (or another like it) back to GeoTiff
dst_filename = os.path.join(r"c:\my\save\path", "afilename.tif")
driver = gdal.GetDriverByName('GTiff')
# For a 32-bit integer, other formats possible too:
dataset = driver.Create(dst_filename, xSize, ySize, 1, gdal.GDT_Int32)
# Now we set the projection info from above:
dataset.SetGeoTransform(geoTrans)
dataset.SetProjection(wktProjection)
# And we write the data:
oBand = dataset.GetRasterBand(1)
# oBand.SetNoDataValue(NoDataVal) # In case you want to set this
oBand.WriteArray(bArr)
del dataset # Cleaning up and clearing memory.
-
While this may work for GeoTiffs it doesn't address the issue with HDF datasets.Kersten– Kersten2015年11月04日 07:31:21 +00:00Commented Nov 4, 2015 at 7:31