Convert Raster to Numpy Array with only Arcpy and Numpy

Question 1

All the articles I'm finding including GDAL or PIL(?) which I cannot use. So I am trying to convert a raster I have to a Numpy Array with Arcpy and Numpy and then calculate some statistics on it but running into an error. Not much documentation on this and not sure what I am doing wrong. I am running ArcGIS 10.3.1 with Python 2.7.8 on a Windows 7 machine with 16gb ram.

Here is my code:

import arcpy
from arcpy.sa import *
from arcpy import env
import numpy as np
import sys, os, string, glob
#Set Environment
arcpy.CheckOutExtension("spatial")
arcpy.env.overwriteOutput = True
env.extent = "MINOF"
Coordsystem = "GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]]"
#Set Input Data
tcorr = arcpy.Raster(r"G:/P38R37/test4/TcorrCon/Tcorr198.tif")
lowerLeft = arcpy.Point(tcorr.extent.XMin,tcorr.extent.YMin)
cellSize = tcorr.meanCellWidth
#Convert Raster to Numpy Array
#Raster is 7951 columns x 6971 rows, Floating point 32 bit, 211MB in size
tcorr_arr = arcpy.RasterToNumPyArray(tcorr,lowerLeft,7951,6971,nodata_to_value=0)
#Calculate the mean and standard deviation of the array
arr_mean = np.mean(tcorr_arr)
arr_std = np.std(tcorr_arr)
#Take the mean and std and do math
double_std = arr_std * 2
cfactor = mean - double_std
print cfactor

The error I am getting says:

Traceback (most recent call last):
File "C:\Python27\ArcGIS10.3\Lib\site-packages\pythonwin\pywin\framework\scriptutils.py", line 326, in RunScript
exec codeObject in __main__.__dict__
File "C:\Users\mschauer\Documents\Py Scripts\scratch\numpytest_tcorr.py", line 20, in <module>
tcorr_arr = arcpy.RasterToNumPyArray(tcorr,lowerLeft,7951,6971,nodata_to_value=0)
File "C:\ArcGIS\Desktop10.3\ArcPy\arcpy\__init__.py", line 2244, in RasterToNumPyArray
return _RasterToNumPyArray(*args, **kwargs)
MemoryError

Any ideas as to what I am doing wrong?

Question 2

Have you installed the 64bit background geoprocessing patch/installer? If not, you are using the 32bit version of Python and tools, and it won't matter how much ram your machine has, you'll be limited to 2 GB.

Question 3

Its not my machine and I do not have authorization to make modifications to my machine or install patches. And my IT department lives in the dark ages so explaining why I need to get a patch/installer for Python is too much for them. So I'm screwed then?

Question 4

If you are limited in memory it is also a good idea to streamline your numpy calculations so they create less arrays which take up memory: print np.mean(tcorr_arr) - np.std(tcorr_arr)*2.

Question 5

@MattS just to warn you, you've got it working now but you may be shooting yourself in the foot in the future as I had issues with arcgis and the different versions of python (32 vs 64bit). So what I am saying is remember that you made this change as you'll probably need to undo it to get other tools working.

Question 6

Numpy can handle no_data values. If you want to ignore no_data values you need to convert them to np.nan and then use np.nanmean and np.nanstd.

Question 7

One solution to avoid your 32-bit memory woes is to read the raster in chunks on to a memory-mapped array, then do your calculations. Something like this:

# Make a dict of data type references
dtyperef = {'U1': 'bool',
 'U2': 'int8',
 'U4': 'int8',
 'U8': 'uint8',
 'S8': 'int8',
 'U16': 'uint16',
 'S16': 'int16',
 'U32': 'uint32',
 'S32': 'int32',
 'F32': 'float32',
 'F64': 'float64'}
# Get some info
shape = (tcorr.height, tcorr.width)
dtype = dtyperef[tcorr.pixelType]
nodata = tcorr.noDataValue
cellsize_x = tcorr.meanCellWidth
cellsize_y = tcorr.meanCellHeight
topLeft = (tcorr.extent.XMin, tcorr.extent.YMax)
# Create output array using temporary file
import tempfile
with tempfile.TemporaryFile() as tf:
 tcorr_arr = np.memmap(tf, mode='w+', shape=shape, dtype=dtype)
 # Populate array using chunks
 ychunks = range(0, shape[0], 128)
 xchunks = range(0, shape[1], 128)
 for ych in zip(ychunks[:-1], ychunks[1:-1] + [shape[0]]):
 for xch in zip(xchunks[:-1], xchunks[1:-1] + [shape[1]]):
 xblock, yblock = xch[1] - xch[0], ych[1] - ych[0]
 blc = arcpy.Point(topleft[0] + (xch[0] * cellsize_x),
 topLeft[1] - (ych[1] * cellsize_y))
 tcorr_arr[ych[0]:ych[1], xch[0]:xch[1]] = arcpy.RasterToNumPyArray(tcorr, blc, xblock, yblock)
 # Mask the no data value
 tcorr_arr = np.ma.masked_equal(tcorr_arr, nodata)
 # Do some mathy stuff
 cfactor = tcorr_arr.mean() - (tcorr_arr.std() * 2)
print cfactor

Question 8

Can you try just (works fine for me when I just want the math):

tcorr_arr = arcpy.RasterToNumPyArray(tcorr,nodata_to_value=0)

Also, not sure if this is causing the errors, but doesn't seems rigth to me:

says:

cfactor = mean - double_std

Shoud say:

cfactor = arr_mean - double_std

Question 9

Just a workaround if you don't want to store a numpy array in the memory but you just need the mean and the std of a raster

import arcpy
your_raster = arcpy.Raster(r"G:/P38R37/test4/TcorrCon/Tcorr198.tif")
your_mean = your_raster.mean
your_STD = your_raster.standardDeviation

then do your math....

Note that "calculate statistics" could be needed before you ask for the mean and STD, with parameters to exclude some values if needed.

Devin Cairns Devin Cairns 2461 silver badge7 bronze badges · Accepted Answer · 2016-10-18 17:57:49Z

One solution to avoid your 32-bit memory woes is to read the raster in chunks on to a memory-mapped array, then do your calculations. Something like this:

# Make a dict of data type references
dtyperef = {'U1': 'bool',
 'U2': 'int8',
 'U4': 'int8',
 'U8': 'uint8',
 'S8': 'int8',
 'U16': 'uint16',
 'S16': 'int16',
 'U32': 'uint32',
 'S32': 'int32',
 'F32': 'float32',
 'F64': 'float64'}
# Get some info
shape = (tcorr.height, tcorr.width)
dtype = dtyperef[tcorr.pixelType]
nodata = tcorr.noDataValue
cellsize_x = tcorr.meanCellWidth
cellsize_y = tcorr.meanCellHeight
topLeft = (tcorr.extent.XMin, tcorr.extent.YMax)
# Create output array using temporary file
import tempfile
with tempfile.TemporaryFile() as tf:
 tcorr_arr = np.memmap(tf, mode='w+', shape=shape, dtype=dtype)
 # Populate array using chunks
 ychunks = range(0, shape[0], 128)
 xchunks = range(0, shape[1], 128)
 for ych in zip(ychunks[:-1], ychunks[1:-1] + [shape[0]]):
 for xch in zip(xchunks[:-1], xchunks[1:-1] + [shape[1]]):
 xblock, yblock = xch[1] - xch[0], ych[1] - ych[0]
 blc = arcpy.Point(topleft[0] + (xch[0] * cellsize_x),
 topLeft[1] - (ych[1] * cellsize_y))
 tcorr_arr[ych[0]:ych[1], xch[0]:xch[1]] = arcpy.RasterToNumPyArray(tcorr, blc, xblock, yblock)
 # Mask the no data value
 tcorr_arr = np.ma.masked_equal(tcorr_arr, nodata)
 # Do some mathy stuff
 cfactor = tcorr_arr.mean() - (tcorr_arr.std() * 2)
print cfactor

Stack Exchange Network

Convert Raster to Numpy Array with only Arcpy and Numpy

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Convert Raster to Numpy Array with only Arcpy and Numpy

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions