
I have a regularly spaced model grid with 5,000-by-5,000-foot cells (517,300 cells in total), which I need to convert to a numpy array. I am investigating a couple of different methods, using arcpy, by which I can convert the grid to a numpy array using the values in a particular field. They are as follows:

  1. Polygon to raster conversion followed by a raster to numpy array conversion, and
  2. Iteration over the rows of the grid, pulling row/col indices from each row and assigning values to an empty array.

Neither of these methods seems to be particularly fast - is there a more direct way to use the regular geometry of my features to apply the values to an array more efficiently? I will need to run this process on multiple versions of my model to populate parameter arrays for each of the 9 layers with multiple parameters per layer. Any processing time I can gain will be helpful.
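For what it's worth, method 2 can also be vectorized: once the row/col/value triples are pulled into arrays, a single fancy-indexing assignment replaces the per-row loop. A minimal sketch with made-up data (the 1-based indices mirror the grid's row/col fields):

```python
import numpy as np

# Hypothetical row/col/value triples as they might come off a cursor;
# row and col are 1-based, as in the grid attribute table.
rows = np.array([1, 1, 2, 2])
cols = np.array([1, 2, 1, 2])
vals = np.array([10.0, 20.0, 30.0, 40.0])

# Method 2, vectorized: one fancy-indexing assignment instead of a
# Python-level loop over cursor rows.
b = np.full((2, 2), np.nan)
b[rows - 1, cols - 1] = vals
```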

import os
import arcpy
import datetime
import numpy as np
arcpy.CheckOutExtension("Spatial")
arcpy.env.workspace = r'..\gis\geom.gdb'
arcpy.env.overwriteOutput = True
grid = r'..\gis\grid_5000ft_01.gdb\grid'
out_table = r'in_memory\{}'.format('TOP_sas')
arcpy.sa.ZonalStatisticsAsTable(grid, 'OID', 'geodas_fasft', out_table, 'DATA', 'ALL')
grid_lyr = r'in_memory\grid_lyr'
table_vwe = r'in_memory\table_vwe'
arcpy.MakeFeatureLayer_management(grid, grid_lyr)
arcpy.MakeTableView_management(out_table, table_vwe)
arcpy.AddJoin_management(grid_lyr, 'OID', table_vwe, 'OID_', 'KEEP_ALL')
print [field.name for field in arcpy.ListFields(grid_lyr)]
start = datetime.datetime.now()
arcpy.PolygonToRaster_conversion(grid_lyr, '{}.MEAN'.format('TOP_sas'), r'in_memory\ras', '#', '#', 5000.)
a = arcpy.RasterToNumPyArray(r'in_memory\ras')
print datetime.datetime.now() - start
b = np.zeros((739, 700), dtype=np.float)
start = datetime.datetime.now()
with arcpy.da.SearchCursor(grid_lyr, ['grid.row', 'grid.col', '{}.MEAN'.format('TOP_sas')]) as cur:
    for row in cur:
        r, c, val = row
        b[r - 1, c - 1] = val
print datetime.datetime.now() - start
np.savetxt('{}.ref'.format('TOP_sas_a'), a)
np.savetxt('{}.ref'.format('TOP_sas_b'), b)
asked Dec 10, 2014 at 21:15
  • Feature-to-raster conversion is much faster than polygon-to-raster. If this is about saving a raster as text, you might consider exporting it to ASCII. Other options: reading one row at a time with a query in da.SearchCursor, or table-to-numpy conversion... Commented Dec 11, 2014 at 1:04
  • Forgot to mention: a non-GDB raster (i.e., an Esri GRID) seems faster to create, at least on my machine. Commented Dec 11, 2014 at 1:08
  • @FelixIP, I like your idea of using the TableToNumpyArray conversion tool. I can probably find a way to vectorize the problem with numpy instead of using a search cursor to iterate over rows. Commented Dec 11, 2014 at 13:27
  • I think you should edit your update out of your question and into an answer that you will also be able to self-accept. Self-answering questions is perfectly acceptable. Commented Dec 15, 2014 at 23:27

1 Answer


Update (5/14/2015)

In a related question, I posted code for a function that computes zonal statistics for a raster given a regular polygon grid and converts that to a numpy array. Here is the relevant bit of code related to the grid-array conversion:

import numpy as np
import arcpy

def grid_to_array(grid, nrow, ncol, row_col_field_names, val_field_name):
    """
    Returns an array of shape (nrow, ncol) from a polygon grid of the same shape.
    Grid must have row/col values specified.

    vars:
        grid: name of grid feature class/layer; str
        nrow: number of rows; int
        ncol: number of columns; int
        row_col_field_names: names of fields containing row/col values for each cell; list, tuple
        val_field_name: name of field containing values of interest; list, tuple (same as row_col_field_names)
    """
    # Pull the row/col/value fields into a structured array.
    a = arcpy.da.TableToNumPyArray(grid, row_col_field_names + val_field_name, skip_nulls=False)
    # Convert to a recarray with known field names, sort into row-major order, and reshape.
    a_1 = np.rec.fromrecords(a.tolist(), names=['row', 'col', 'val'])
    a_1.sort(order=['row', 'col'])
    b = np.reshape(a_1.val, (nrow, ncol))
    return b
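The core of the conversion — sort by (row, col), then reshape — can be exercised without arcpy. A minimal sketch with simulated records standing in for the TableToNumPyArray output:

```python
import numpy as np

# Simulated table records: (row, col, value) triples in arbitrary order
# for a 2 x 3 grid, with 1-based row/col indices.
recs = [(2, 3, 23.0), (1, 1, 11.0), (2, 1, 21.0),
        (1, 3, 13.0), (2, 2, 22.0), (1, 2, 12.0)]
a_1 = np.rec.fromrecords(recs, names=['row', 'col', 'val'])

# Sorting by row, then col, puts the values in C (row-major) order,
# so a plain reshape recovers the grid.
a_1.sort(order=['row', 'col'])
b = np.reshape(a_1.val, (2, 3))
```

Because the sort key matches numpy's default row-major layout, no explicit index arithmetic is needed — but every cell must be present exactly once for the reshape to line up.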

Original Answer

The grid I was using was stored as a feature class within a geodatabase; printing cursor iterations over the table revealed that it was reading only about 1,000 records at a time, pausing for quite a while before reading the next 1,000. After 45 minutes, the table still had not been fully traversed. By exporting the grid to a shapefile first, I was able to process the grid features much more quickly (about 20 seconds). Final code:

# nrow and ncol are assumed to be defined earlier (739 and 700 for this grid)
for surface in arcpy.ListDatasets('*', 'Raster'):
    out_table = r'in_memory\{}'.format(surface)
    arcpy.sa.ZonalStatisticsAsTable(grid, 'FID', surface, out_table, 'DATA', 'MIN_MAX_MEAN')
    grid_lyr = r'in_memory\grid_lyr'
    table_vwe = r'in_memory\table_vwe'
    arcpy.MakeFeatureLayer_management(grid, grid_lyr)
    arcpy.MakeTableView_management(out_table, table_vwe)
    arcpy.AddJoin_management(grid_lyr, 'FID', table_vwe, 'FID', 'KEEP_ALL')
    a = arcpy.da.TableToNumPyArray(grid_lyr, ['grid.row', 'grid.col', '{}.MEAN'.format(surface)], skip_nulls=True)
    # OLD CODE - much less efficient
    #b = np.zeros((nrow, ncol), dtype=np.float) * np.nan
    #for idx, (row, col, val) in np.ndenumerate(a):
    #    b[row - 1, col - 1] = val
    # NEW CODE - much more efficient (updated 5/13/2015)
    a_1 = np.rec.fromrecords(a.tolist(), names=['row', 'col', 'val'])
    a_1.sort(order=['row', 'col'])
    b = np.reshape(a_1.val, (nrow, ncol))
    np.savetxt(r'ref\{}.dat'.format(surface), b, fmt='%5.0f')
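One caveat: with skip_nulls=True, any cell whose join produced a null is dropped from the table, and the reshape to (nrow, ncol) will then fail on the short array. A defensive variant (pure numpy, illustrative data) pre-fills an NaN grid and scatters by index, so missing cells stay NaN instead of breaking the reshape:

```python
import numpy as np

# Records for a 2 x 2 grid with one cell missing (as would happen if
# skip_nulls=True dropped a null row); row/col are 1-based.
recs = np.rec.fromrecords([(1, 1, 5.0), (1, 2, 6.0), (2, 2, 8.0)],
                          names=['row', 'col', 'val'])

# Pre-fill with NaN so missing cells remain identifiable, then scatter
# the known values into place with fancy indexing.
b = np.full((2, 2), np.nan)
b[recs.row - 1, recs.col - 1] = recs.val
```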
answered Dec 15, 2014 at 23:44
