2

I have a GeoTIFF image that looks like the following

f = rxr.open_rasterio('myFile.tif')
f5 = f.sel(band=5)
f5.plot.imshow()

enter image description here

I would like to create a Pandas dataframe with the values of each pixel and the coordinates of the centroid of the pixels. How can I find the values of the centroid of each pixel and and two columns lat and lon to the following dataframe?

a = np.ravel(f5)
df = pd.DataFrame({'LSTDay':a})
df.head()
 LSTDay
0 15631.0
1 15647.0
2 15624.0
3 15624.0
4 15590.0

Here the link to the GeoTIFF file

Vince
20.5k16 gold badges49 silver badges65 bronze badges
asked Apr 21, 2021 at 13:52
1

2 Answers 2

7

Your raster (2015LSTDay.tif, band 5)

enter image description here

You can use rasterio with pandas:

import rasterio as rio
import pandas as pd
with rio.open('2015LSTDay.tif') as dataset:
 val = dataset.read(5) # band 5
 no_data=dataset1.nodata
 data = [(dataset1.xy(x,y)[0],dataset1.xy(x,y)[1],val[x,y]) for x,y in np.ndindex(val.shape) if val[x,y] != no_data]
 lon = [i[0] for i in data]
 lat = [i[1] for i in data]
 d = [i[2] for i in data]
 res = pd.DataFrame({"long":lon,'lat':lat,"data":v})
res.head()
 long lat val
0 12.307060 42.0571 15631.0
1 12.315384 42.0571 15647.0
2 12.323708 42.0571 15624.0
3 12.332033 42.0571 15624.0
4 12.340357 42.0571 15590.0

Result

import matplotlib.pyplot as plt
from rasterio.plot import show
fig, ax = plt.subplots(figsize=(8, 8))
show(dataset.read(5), transform=dataset.transform,ax=ax)
ax.plot(res.x,res.y,'ro', markersize=3)

enter image description here

You can use rasterio with GeoPandas:

import geopandas as gpd
from shapely.geometry import Point
with rio.open('2015LSTDay.tif') as dataset:
 val = dataset.read(5) # band 5
 no_data=dataset.nodata
 geometry = [Point(dataset.xy(x,y)[0],dataset.xy(x,y)[1]) for x,y in np.ndindex(val.shape) if val[x,y] != no_data]
 v = [val[x,y] for x,y in np.ndindex(val.shape) if val[x,y] != no_data]
 df = gpd.GeoDataFrame({'geometry':geometry,'data':v})
 df.crs = dataset.crs
df.head()
 geometry data
0 POINT (12.30706 42.05710) 15631.0
1 POINT (12.31538 42.05710) 15647.0
2 POINT (12.32371 42.05710) 15624.0
3 POINT (12.33203 42.05710) 15624.0
4 POINT (12.34036 42.05710) 15590.0

Export to shapefile

df.to_file("points.shap")

You can also use rioxarray as suggested by snowman2:

import rioxarray
rds = rioxarray.open_rasterio("2015LSTDay.tif")
rds = rds.squeeze().drop("spatial_ref").drop("band")
rds.name = "data"
res = rds.to_dataframe().reset_index()
res.head(2)
 band y x data
0 0 42.0571 12.307060 15228.0
1 0 42.0571 12.315384 15246.0

Band 5 only

gr = res.groupby(res.band)
gr.get_group('5').head()
 band y x data
11550 5 42.0571 12.307060 15652.0
11551 5 42.0571 12.315384 15671.5
11552 5 42.0571 12.323708 15702.5
11553 5 42.0571 12.332033 15702.5
11554 5 42.0571 12.340357 15642.5
answered Apr 21, 2021 at 15:56
1
  • thank you. Can we also generate a geopandas dataframe where the geometry is a rectangle, i.e. the size of the pixel? Commented Apr 22, 2021 at 13:06
3
import rioxarray
rds = rioxarray.open_rasterio("file.tif")
rds.to_dataframe()
answered Apr 21, 2021 at 13:59
1
  • What is the relevance of the two links included at the beginning of your answer? If there are three questions for which the same answer applies have you considered whether they are perhaps duplicate questions? Commented Apr 22, 2021 at 0:01

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.