Python Script Performance as an ArcGIS Tool Versus Stand-Alone

Question 1

Has anyone studied the difference between running a Python script from ArcToolbox and running it as a stand-alone script? I had to write a quick-and-dirty script to convert a set of RGB images to single band by extracting band 1. As a stand-alone script reading from and writing to my PC, it processes 1000 identically sized images in about 350 seconds. Running the same script from ArcToolbox takes about 1250 seconds, roughly 3.6 times as long.

import arcpy
import csv
import time
from os import path

# in_folder, out_folder and outfile are derived from the script tool's parameter (not shown)
arcpy.env.workspace = in_folder
image_list = arcpy.ListRasters()

# Create a CSV file for timing output ('wb' is the csv file mode under Python 2)
with open(outfile, 'wb') as c:
    cw = csv.writer(c)
    cw.writerow(['tile_name', 'finish_time'])
    # Start the timer at 0
    start_time = time.clock()
    for image in image_list:
        # Extract band 1 to create a new single-band raster
        arcpy.CopyRaster_management(path.join(image, 'Band_1'), path.join(out_folder, image))
        # Record the elapsed time at which this tile finished
        cw.writerow([image, time.clock()])

I added some code to record when each tile finishes processing and to export the results as a CSV. I convert the finish times to per-tile processing times in Excel. Graphing the results, the processing time per tile is roughly constant when run as a stand-alone script, but increases linearly from tile to tile when run as an ArcGIS tool.
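The Excel step described above can also be sketched in Python. This is a minimal illustration, not the asker's actual workflow: the function names `per_tile_durations` and `linear_slope` are hypothetical, the timer is assumed to start at 0, and the sample numbers are made up rather than measured. A slope near zero would correspond to the constant stand-alone behaviour, and a clearly positive slope to the growth seen in the tool.

```python
def per_tile_durations(finish_times):
    """Convert cumulative finish times (seconds) into per-tile durations."""
    durations = []
    previous = 0.0  # assumption: the timer starts at 0
    for finish in finish_times:
        durations.append(finish - previous)
        previous = finish
    return durations


def linear_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / float(n)
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Made-up finish times for illustration only, not measured values
example_finish_times = [0.5, 1.0, 1.75]
durations = per_tile_durations(example_finish_times)
print(durations)
print(linear_slope(durations))
```

Feeding the `finish_time` column from each CSV into these two functions would give one slope per run, which makes the stand-alone and script-tool runs directly comparable.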

[Graph: processing time per tile, stand-alone script versus ArcGIS tool]

If the data reads and writes are to a network device, the increase appears to be exponential.

I'm not looking for alternate ways to accomplish this particular task. I want to understand why the performance of this script degrades over time when run as an ArcGIS tool, but not as a stand-alone script. I have noticed this behavior with other scripts as well.

Question 2

Python outside of ArcGIS is much faster. I only use the Python window for very simple scripts, or when I want the ability to drag and drop items into the terminal. My guess is that ArcGIS controls the resource allocation of the interpreter, because the entire software package also needs Python to operate.

Question 3

My recommendation (based on my experience, not quantified performance data like you've provided) is to use arcpy only as a last resort. In the example above, any Python interpreter without arcpy is quite capable of efficiently filtering a directory for rasters and copying them to a new folder.
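A rough sketch of what this comment describes, using only the standard library. The function name `copy_rasters` and the extension list are assumptions for illustration. Note that this copies each file whole; it does not extract band 1, which the original script does through arcpy.

```python
import os
import shutil

# Assumed raster extensions; adjust for the formats actually in use
RASTER_EXTENSIONS = ('.tif', '.tiff', '.jpg', '.png')


def copy_rasters(src_folder, dst_folder, extensions=RASTER_EXTENSIONS):
    """Copy files with a raster-like extension from src_folder to dst_folder.

    Returns the sorted list of file names that were copied.
    """
    if not os.path.isdir(dst_folder):
        os.makedirs(dst_folder)
    copied = []
    for name in sorted(os.listdir(src_folder)):
        src_path = os.path.join(src_folder, name)
        # Skip sub-folders and anything without a matching extension
        if os.path.isfile(src_path) and name.lower().endswith(extensions):
            shutil.copy2(src_path, os.path.join(dst_folder, name))
            copied.append(name)
    return copied
```

Calling `copy_rasters(in_folder, out_folder)` with two folder paths would copy the matching files and return their names.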

Question 4

How much difference is there between 64-bit background geoprocessing and in-process 32-bit?

Question 5

When you say that you are "running a Python script in ArcToolbox" do you mean that you are running a Python Script tool? If so, are you running it with no parameters for your test?

Question 6

@PolyGeo Yes, I created a script tool in an ArcGIS toolbox. It takes 1 parameter, from which in_folder and out_folder are derived. That is all done before the timing measurements start.

Answer

This is my take on things: running a script from ArcToolbox incurs all sorts of hidden costs, because tools try to interact with and update the main application (ArcMap). All tools update metadata, some try to refresh the map window, and the MXD records every tool you run in the geoprocessing history panel. None of these hidden costs occur when running in an IDE.

So running a loop a mere 1000 times means the MXD is storing 1000 logs. As ArcMap is closed, proprietary software, we have no idea how the recording of processing logs actually works. Maybe the rate-limiting step is that the data structure they employed cannot handle that much repetition?
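One way to probe the logging guess would be to switch geoprocessing history logging off before the loop and compare timings. The sketch below uses `arcpy.SetLogHistory`; the wrapper name `set_history_logging` is hypothetical, and whether this setting has any effect on the slowdown is an assumption that the thread does not test.

```python
try:
    import arcpy  # only available inside an ArcGIS Python installation
except ImportError:
    arcpy = None


def set_history_logging(enabled):
    """Toggle geoprocessing history logging.

    Returns True if the setting was applied, False if arcpy is unavailable.
    """
    if arcpy is None:
        return False
    arcpy.SetLogHistory(enabled)
    return True


# Turn logging off before a long loop; the effect on timing is unverified
set_history_logging(False)
```

Running the timed loop once with logging on and once with it off, inside the script tool, would show whether history logging contributes to the growth.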

Another issue is that ArcMap is an event-driven application: things happen when events occur. You pan the map and the map refreshes; you add data and a button becomes enabled. I guess it's possible that tools fire off all sorts of events and the application becomes "overwhelmed" by them when tools are used repetitively, but that's me speculating.

I think one has to weigh up the pros and cons. Exposing a script as a script tool makes it easy to use in the ArcMap environment, especially for non-power users. That's an important consideration if you want your code to be adopted. If it's hardcore number crunching just for you, with no need for intermediate quality control, then run the script in your preferred IDE.

Hornbydd · Accepted Answer · 2019-05-22 14:07:31Z

