Methods for optimizing multicore processing in ArcGIS

Question 1

I am interested in learning methods to make full use of the multicore processing power available on a desktop computer. Arc states that background geoprocessing allows the user to utilize multiple cores; however, tasks essentially have to wait in line until the previous task has completed.

Has anyone developed parallel or multithreaded geoprocessing methods in Arc/Python? Are there hardware bottlenecks that prevent multicore processing on individual tasks?

I found an example on Stack Overflow that caught my interest, although it is not a geoprocessing example:

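from multiprocessing import Pool
import numpy

numToFactor = 976

def isFactor(x):
    # Return (x, quotient) if x divides numToFactor evenly, otherwise None.
    result = None
    div = numToFactor // x  # explicit integer division (same result in Python 2 and 3)
    if div * x == numToFactor:
        result = (x, div)
    return result

if __name__ == '__main__':
    pool = Pool(processes=4)
    # Only candidates up to sqrt(numToFactor) need to be checked.
    possibleFactors = range(1, int(numpy.floor(numpy.sqrt(numToFactor))) + 1)
    print('Checking %s' % list(possibleFactors))
    result = pool.map(isFactor, possibleFactors)
    pool.close()
    pool.join()
    cleaned = [x for x in result if x is not None]
    print('Factors are %s' % cleaned)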

Question 2

In my Arc experience, it almost always boils down to either 1) splitting your data into as many chunks as you have cores, processing each chunk, and reassembling the results, or 2) reading everything into memory and letting whichever API you are using handle the threading. Note that this is not meant to discourage.
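Option 1 can be sketched roughly as follows. This is only an illustration, not code from the thread: split_into_chunks, process_chunk and run_in_parallel are hypothetical names, and process_chunk squares numbers as a stand-in for a real geoprocessing call.

```python
from multiprocessing import Pool


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    # Placeholder (assumption): real code would run a geoprocessing step here.
    return [value * value for value in chunk]


def run_in_parallel(items, n_workers):
    """Process one chunk per worker, then reassemble in the original order."""
    chunks = split_into_chunks(items, n_workers)
    with Pool(processes=n_workers) as pool:
        partial_results = pool.map(process_chunk, chunks)
    return [result for part in partial_results for result in part]


if __name__ == '__main__':
    print(run_in_parallel(list(range(10)), 4))
```

Because pool.map returns results in the order of its inputs, the reassembly step is a plain concatenation. Real spatial data would need a merge tool instead.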

Question 3

Thanks valveLondon. Perhaps newer Ivy Bridge technology and the Kepler GPU will allow for more sophisticated processing approaches.

Question 4

Here's a link to a useful blog post about Python multiprocessing from a product engineer on Esri's Analysis and Geoprocessing team: blogs.esri.com/esri/arcgis/2011/08/29/multiprocessing

Question 5

Here is an example of a multicore arcpy script. The process is very CPU-intensive so it scales very well: Porting Avenue code for Producing Building Shadows to ArcPy/Python for ArcGIS Desktop?

Some more general info in this answer: Can concurrent processes be run in a single model?

Question 6

In my experience the biggest problem is managing stability. If you do six weeks of processing in a single night you will also have six weeks of inexplicable errors and bugs.

An alternative approach is to develop standalone scripts that can run independently and fail without causing problems:

1) Split the data into chunks that a single core can process in under 20 minutes (tasks).
2) Build a standalone arcpy script that can process a single task and is as simple as possible (worker).
3) Develop a mechanism to run tasks. Lots of pre-existing Python solutions exist. Alternatively, you can make your own with a simple queue.
4) Write some code to verify that tasks have been completed. This could be as simple as checking that an output file has been written.
5) Merge the data back together.
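A minimal sketch of steps 3 and 4, assuming each task is a standalone script launched in its own process and that a task counts as done once its output file exists. The names build_tasks, run_task and run_all, the worker.py script, and its two command-line arguments are all hypothetical. The 30-minute timeout is likewise an arbitrary choice.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_tasks(chunk_paths, out_dir, worker_script="worker.py"):
    """Pair each chunk with the command that processes it and its expected output."""
    tasks = []
    for chunk in chunk_paths:
        name = os.path.splitext(os.path.basename(chunk))[0]
        output = os.path.join(out_dir, name + "_done.txt")
        tasks.append(([sys.executable, worker_script, chunk, output], output))
    return tasks


def run_task(command, output_path, timeout_s=1800):
    """Run one worker in its own process; succeed only if the output file exists."""
    try:
        subprocess.run(command, timeout=timeout_s, check=True)
    except (subprocess.SubprocessError, OSError):
        return False  # a crash or timeout affects this task only
    return os.path.isfile(output_path)


def run_all(tasks, n_workers=4):
    """Run unfinished tasks, n_workers at a time; return outputs that are missing."""
    pending = [task for task in tasks if not os.path.isfile(task[1])]
    # Threads only wait on child processes, so the heavy work still uses separate cores.
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        results = list(executor.map(lambda task: run_task(*task), pending))
    return [task[1] for task, ok in zip(pending, results) if not ok]
```

Since finished tasks are skipped, calling run_all again after a failure retries only the tasks whose output is still missing.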

Question 7

I've found that this approach, which can include using the multiprocessing module, is a good one. Some extensions, such as Spatial Analyst, don't work very well if you have multiple copies of the same function running simultaneously. Something like what you describe, which allows for a user-controlled form of queuing (i.e., avoids scheduling those tasks at the same time, or avoids using the same geodatabase at once for file-locking reasons), is going to be best.
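One way to approximate this kind of user-controlled queuing is to group tasks by the geodatabase they touch and give each group to a single worker, so no two processes use the same geodatabase at once. The sketch below is an assumption-laden illustration: group_by_workspace, run_group and run_grouped are hypothetical names, and run_group only returns task names where real code would call a geoprocessing tool.

```python
from collections import defaultdict
from multiprocessing import Pool


def group_by_workspace(tasks):
    """Group (workspace, task_name) pairs so each workspace forms one serial queue."""
    groups = defaultdict(list)
    for workspace, task_name in tasks:
        groups[workspace].append((workspace, task_name))
    return list(groups.values())


def run_group(group):
    # Placeholder (assumption): tasks in one group run one after another,
    # so only this process touches the group's geodatabase.
    return [task_name for _workspace, task_name in group]


def run_grouped(tasks, n_workers):
    """Run groups in parallel while keeping tasks within a group sequential."""
    groups = group_by_workspace(tasks)
    with Pool(processes=n_workers) as pool:
        return pool.map(run_group, groups)


if __name__ == '__main__':
    example = [("a.gdb", "slope"), ("b.gdb", "slope"), ("a.gdb", "aspect")]
    print(run_grouped(example, 2))
```

The trade-off is that parallelism is limited by the number of distinct geodatabases, so copying inputs into per-worker scratch geodatabases may be needed to keep all cores busy.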

Accepted answer: blah238, 2012-06-26 17:10:04Z. Its text is the multicore arcpy script example already given under Question 5.
