Exporting GEE satellite imagery to Numpy array

Question 1

I'm trying to extract LANDSAT-8 images around a lat/lon location, using the Google Earth Engine API for Python.

Specifically, I want to generate a bounding box of 1km by 1km around my point, and extract the pixels for the RGB bands, so my output should be 33 by 33 by 3, given that LANDSAT-8 resolution is 30m.

However, I can't get this to work. I'm generating my box using ee.Geometry.Point.buffer and ee.Geometry.Polygon.bounds, and then I'm trying to overlay my geometry over LANDSAT-8 images of 2020, of which I want to get the average value pixel-wise (so the annual average image should be 33 by 33 by 3). After that, I plan to extract the data as a numpy.Array using the ee.Image.sampleRectangle function, but it is not working. Here is a minimal working example:

#importing packages
import numpy as np
import ee
ee.Authenticate()
ee.Initialize()
#converting a lon/lat pair to an ee.point object
point=ee.Geometry.Point( [-122.2036486, 37.4237011] )
# converting the point to a patch: we define a circle with a 500m radius, and then we put a box around it
patch=point.buffer(500).bounds() 
#checking that the patch has approximately the right area 1000^2 m^2
patch.area(1).getInfo()
#defining the image: Landsat 8 collection of images from 2020, with RGB bands. We take the mean of them
imagery=ee.ImageCollection("LANDSAT/LC08/C01/T1").filterDate('2020-01-01', '2020-12-31').select(['B3','B4','B2']).mean()
#getting a matrix from the image for our patch of interest
rect_image = imagery.sampleRectangle(region=patch)
#we should obtain a 33*33 matrix for each one of the 3 bands
#extracting one band to check
band_b4 = rect_image.get('B4')
#checking the shape of the resulting matrix
np.array(band_b4.getInfo()).shape
#the result is 1*1

I feel that one problem here is that the mean() function is reducing collapsing the dimensions of the images, so my "fix" to that is instead using the first() function to get a proper image, even if it is not the average image I want. Here is what happens:

imagery2=ee.ImageCollection("LANDSAT/LC08/C01/T1").filterDate('2020-01-01', '2020-12-31').select(['B3','B4','B2']).first()
#checking the dimensions
imagery2.getInfo()['bands'][1]['dimensions']
#getting a matrix from the image for our patch of interest
rect_image2 = imagery2.sampleRectangle(region=patch)
#in theory, since this is 1000m*1000m and the resolution of the satellite is 30m,
#we should obtain a 33*33 matrix for each one of the 3 bands
#extracting 1 band to check
band_b4_2 = rect_image2.get('B4')
#checking the shape of the resulting matrix
np.array(band_b4_2.getInfo()).shape
#now we get an error

How could I solve this?

I have tried a lot of things like using ee.ImageCollection.filterBound over the image collection, ee.Image.clip, and I have tried to export the images in TIFF or TFRecord formats with the ee.batch.Export.image.toDrive method, but that also hasn't worked.

Question 2

You can export the image to your drive (or gdrive):

Consider l8_image is an image ee object which you want to save.

### Getting the filenames - might be useful
l8_img_meta = l8_img.getInfo()
imagename = l8_img_meta.get('properties',{}).get('PRODUCT_ID') #fetches the name

Storing the image as a .tif

This is a task command to GEE, you will have to wait a bit for it to finish

task_config = {
 'image': l8_img,
 'fileFormat': 'GeoTIFF',
 'folder': '<foldername>',
 'fileNamePrefix': imagename[0:19],
 'description': "clipped area",
 'scale':20,
 'region':poly_area
}
# This is how we order it to start
task = ee.batch.Export.image.toDrive(**task_config )
task.start()
### check task status - you can see if it failed, it's running or finished
task.status()

Operating with rasters in Python:

# required: pyrsgis, rasterio, pyproj
import rasterio
import rasterio.plot
import pyproj
from pyrsgis.convert import changeDimension
### loading the file
s2_data = "path to file"
### loading and checking
ds1, bands = raster.read(s2_data)
print(ds1)
print(bands.shape) 
bandByPixel = changeDimension(bands)/10000. #we have to devide all values by 10k - its a conversion from bits to reflectances
bandByPixel_t = np.transpose(bandByPixel)
print(bandByPixel.shape)
print(bandByPixel_t.shape)
#opening the raster
with rasterio.open(s2_data) as src:
 subset = src.read(7)/10000.
#plotting the raster
plt.figure(figsize=(6,8.5))
plt.imshow(subset)
plt.colorbar(shrink=0.5)
#plt.title(f'Band 4 Subset\n{window}')
plt.xlabel('Column #')
plt.ylabel('Row #')

By here, it should be np.array or easily convertible.

Question 3

Thanks Nuno! My problems were mainly how to take the mean of the ImageCollection and how to properly intersect my Geometry with my image, but I can use what you mentioned to later export the images properly to my desired format.

Question 4

Just checking to confirm that there's not a way to go straight to the np.array (eg, avoiding exporting as a .tif then importing back in?) I'm able to get a list, but would be great to just get the np.appy

img_list =ee.ImageCollection("LANDSAT/LC08/C01/T1").filterDate('2020年01月01日', '2020年12月31日').select(['B3','B4','B2']).mean().reduceRegion(reducer=ee.Reducer.toList(),geometry=patch,scale=30)

print(img_list.getInfo())

Question 5

I actually figured out how to do this in the end (I'm going to answer the question now), but I'm not aware of a good way to avoid the .tif format. In my final approach I used ee.batch.Export.image.toDrive (ee.batch.Export.image.toCloudStorage works in a very similar way) because it generates only one TIFF file with all the channels of the image rather than a ZIP file with one file for each channel. After downloading the file, I used the imageio package, and did something like np.array(imageio.imread(image_path)) to convert the image from TIFF to an np.array.

Question 6

After a comment was made here I was reminded of this question. After a while I was able to solve my issues, I wrote a little tutorial of how to export imagery in my github page, but the most relevant to me was the following:

Defining my area to be clipped didn't worked after using bounds, so I ended up doing the following to convert the patch around the pixel to an ee.Geometry.Rectangle object:

 region= point.buffer(len/2).bounds().getInfo()['coordinates']
 #defining the rectangle
 coords=np.array(region)
 #taking min and maxs of coordinates to define the rectangle
 coords=[np.min(coords[:,:,0]), np.min(coords[:,:,1]), np.max(coords[:,:,0]), np.max(coords[:,:,1])]
 rectangle=ee.Geometry.Rectangle(coords)

The rest of the problems were associated to the arguments used when exporting the function. To clip the image, I used image.filterBounds(rectangle).mean(), where image is an ee.ImageCollection object, and rectangle is the previously defined geometry. Using mean() converts this to an ee.Image object.

With that done, I added the following arguments to either the ee.batch.Export.image.toDriveor ee.batch.Export.image.toCloudStorage besides the obvious ones (like the image we are exporting or the destination of the file:
- region=str(region) to pass the previous list of coordinates of the geometry as a string.
- dimensions="33x33" to pass the number of pixels to be included in the image.
When the image was exported to either of the sources, I used the imageio package, and did something like np.array(imageio.imread(image_path)) to convert the image from TIFF to an np.array.

Nicolas Suarez Nicolas Suarez 656 bronze badges · Accepted Answer · 2021-06-15 06:28:05Z

After a comment was made here I was reminded of this question. After a while I was able to solve my issues, I wrote a little tutorial of how to export imagery in my github page, but the most relevant to me was the following:

Defining my area to be clipped didn't worked after using bounds, so I ended up doing the following to convert the patch around the pixel to an ee.Geometry.Rectangle object:

 region= point.buffer(len/2).bounds().getInfo()['coordinates']
 #defining the rectangle
 coords=np.array(region)
 #taking min and maxs of coordinates to define the rectangle
 coords=[np.min(coords[:,:,0]), np.min(coords[:,:,1]), np.max(coords[:,:,0]), np.max(coords[:,:,1])]
 rectangle=ee.Geometry.Rectangle(coords)

The rest of the problems were associated to the arguments used when exporting the function. To clip the image, I used image.filterBounds(rectangle).mean(), where image is an ee.ImageCollection object, and rectangle is the previously defined geometry. Using mean() converts this to an ee.Image object.

With that done, I added the following arguments to either the ee.batch.Export.image.toDriveor ee.batch.Export.image.toCloudStorage besides the obvious ones (like the image we are exporting or the destination of the file:
- region=str(region) to pass the previous list of coordinates of the geometry as a string.
- dimensions="33x33" to pass the number of pixels to be included in the image.
When the image was exported to either of the sources, I used the imageio package, and did something like np.array(imageio.imread(image_path)) to convert the image from TIFF to an np.array.

Stack Exchange Network

Exporting GEE satellite imagery to Numpy array

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Exporting GEE satellite imagery to Numpy array

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions