I have a raster image with 3 bands. I would like to convert this image to a CSV file where each row is one pixel and each column is one band, so that I can easily see the three values of each pixel.

This is how I have tried to do it:

import rasterio
import rasterio.features
import rasterio.warp
from matplotlib import pyplot
from rasterio.plot import show
import pandas as pd
import numpy as np
img=rasterio.open("01032020.tif")
show(img,0)
#read image 
array=img.read()
#create np array
array=np.array(array)
#create pandas df
dataset = pd.DataFrame({'Column1': [array[0]], 'Column2': [array[1]],'Column3': [array[2]]})
dataset

and also like this:

dataset = pd.DataFrame({'Column1': [array[0,:,:]], 'Column2': [array[1,:,:]],'Column3': [array[2:,:]]})

but I'm getting a weird table as a result.

I have also tried:

index = [i for i in range(0, len(array[0]))]
dataset = pd.DataFrame({'Column1': array[0], 'Column2': array[1],'Column3': array[2]},index=index)
dataset

but then I get as many table rows as the image has rows, which is still not what I want.

What am I doing wrong?

My goal

Get one pandas table, where each row is a pixel, and it should have 3 columns, one for each band.

asked May 11, 2020 at 14:28

3 Answers

Quick solution

pd.DataFrame(array.reshape([3,-1]).T)

Explanation

  1. Take the array of shape (3, x, y) and flatten the 2nd and 3rd dimensions. From the NumPy docs: "One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions."
reshaped_array = array.reshape([3,-1])
  2. Transpose the array to get an array of shape (x*y, 3)
transposed_array = reshaped_array.T
  3. Build the DataFrame
pd.DataFrame(transposed_array)
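
The three steps can be tried on a small synthetic array, where the result is easy to check by eye. This is only a sketch: the 3x2x4 array stands in for `img.read()`, and the column names `band1` to `band3` are assumptions for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for img.read(): 3 bands, 2 rows, 4 columns.
array = np.arange(3 * 2 * 4).reshape(3, 2, 4)

# Flatten the two spatial dimensions, then transpose to (pixels, bands).
pixels = array.reshape(3, -1).T

# One row per pixel, one column per band (column names are illustrative).
df = pd.DataFrame(pixels, columns=["band1", "band2", "band3"])

# With no path, to_csv returns the CSV text; pass a file name to write a file.
csv_text = df.to_csv(index=False)
print(df.shape)
```

Pixels are listed row by row (NumPy's default C order), so pixel number `i` in the table is at image row `i // 4` and column `i % 4` in this example.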
answered May 11, 2020 at 15:04
  • Thank you for your answer. Is there any way to preserve the coordinates? Commented Oct 30, 2020 at 10:14
  • You will need one or two extra columns that store the index/indices of the original image. I think that's for a new question. -> gis.stackexchange.com/questions/ask Commented Oct 30, 2020 at 10:32
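
The extra index columns mentioned in the comment above could look roughly like this. It is a sketch on a synthetic array, and the column names `row`, `col` and `band1` to `band3` are assumptions for illustration. It stores pixel indices only, not map coordinates.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a raster read as (bands, rows, cols).
array = np.arange(3 * 2 * 4).reshape(3, 2, 4)
bands, rows, cols = array.shape

# np.indices returns the row index and column index of every pixel,
# each as an array of shape (rows, cols).
row_idx, col_idx = np.indices((rows, cols))

# Flatten the indices and each band in the same (row-major) order.
df = pd.DataFrame({"row": row_idx.ravel(), "col": col_idx.ravel()})
for b in range(bands):
    df[f"band{b + 1}"] = array[b].ravel()
```

Because the indices and the bands are flattened in the same order, every table row still refers to a single pixel of the original image.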

Or another simple solution with numpy ravel():

import rasterio as rio
import pandas as pd

src = rio.open('myraster.tif')
# number of bands
src.count
# 3
# read bands
array = src.read()
# convert to a DataFrame
df = pd.DataFrame()
df['band1'] = array[0].ravel()
df['band2'] = array[1].ravel()
df['band3'] = array[2].ravel()
df.head(2)
#    band1  band2  band3
# 0    250    249    254
# 1    250    249    254
df.tail(2)  # last
#           band1  band2  band3
# 78609002    190    182    180
# 78609003    190    186    174
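
The same idea can be written without hard-coding the number of bands. This is a sketch on a synthetic array; the `band1`, `band2`, ... naming follows the answer above, and the array itself is an assumption standing in for `src.read()`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for src.read(): 3 bands, 2 rows, 4 columns.
array = np.arange(3 * 2 * 4).reshape(3, 2, 4)

# Iterating over a 3D array yields one 2D band at a time.
df = pd.DataFrame(
    {f"band{i}": band.ravel() for i, band in enumerate(array, start=1)}
)

# Check that this gives the same table as the reshape-and-transpose approach.
same = np.array_equal(df.to_numpy(), array.reshape(array.shape[0], -1).T)
```

Here `same` being true shows that raveling each band and reshaping the whole array are two routes to the same pixel-by-band table.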

answered May 11, 2020 at 15:20
You can check that here http://shreshai.blogspot.com/

The implementation is for a multiband raster and also keeps the coordinates

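import numpy as np
import rasterio

# RASTER_PATH is the path to the raster file
with rasterio.open(RASTER_PATH) as src:
    # read image as an array of shape (bands, rows, cols)
    image = src.read()
    bands, rows, cols = image.shape
    # flatten each band, then transpose to (rows*cols, bands);
    # reshaping directly to (rows*cols, bands) would mix values between bands
    image1 = image.reshape(bands, -1).T
    print(image1.shape)
    # bounding box of image
    l, b, r, t = src.bounds
    # resolution of image as (pixel width, pixel height)
    res = src.res
    # x and y of each pixel's upper-left corner (assumes a north-up raster)
    x = l + res[0] * np.arange(cols)
    y = t - res[1] * np.arange(rows)
    X, Y = np.meshgrid(x, y)
    print(X.shape)
    # flatten X and Y in the same row-major order as the bands
    newX = X.flatten()
    newY = Y.flatten()
    print(newX.shape)
    # join XY and Z information
    export = np.column_stack((newX, newY, image1))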
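
To get from such an array to the table asked for in the question, the coordinates and bands can be put into a DataFrame and written as CSV. This is a sketch with made-up values: `left`, `top`, `xres` and `yres` stand in for `src.bounds` and `src.res`, the column names are illustrative, and it assumes a north-up raster. It uses pixel centers, which are half a pixel in from the upper-left corner.

```python
import numpy as np
import pandas as pd

# Assumed stand-ins for src.read(), src.bounds and src.res.
image = np.arange(3 * 2 * 4).reshape(3, 2, 4)  # (bands, rows, cols)
bands, rows, cols = image.shape
left, top = 500000.0, 4200000.0
xres, yres = 10.0, 10.0

# Pixel-center coordinates; y decreases as the row index increases.
x = left + (np.arange(cols) + 0.5) * xres
y = top - (np.arange(rows) + 0.5) * yres
X, Y = np.meshgrid(x, y)

# Flatten coordinates and bands in the same row-major order.
df = pd.DataFrame({"x": X.ravel(), "y": Y.ravel()})
for b in range(bands):
    df[f"band{b + 1}"] = image[b].ravel()

# With no path, to_csv returns the CSV text; pass a file name to write a file.
csv_text = df.to_csv(index=False)
```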
answered Feb 25, 2023 at 8:45
