I have a Python function that downloads a geopackage from somewhere, does some processing on it, and at some point converts it to an ESRI file geodatabase.
Now, inside a Python script, I want to use this function to download and convert multiple geopackages at once.
In my eyes this should be easy to run in parallel, since there is no overlap whatsoever between the input and output data of the different geopackages.
So I tried implementing that, but somehow arcpy.conversion.FeatureClassToGeodatabase now raised the following error:
WARNING: C:\path\to\file.gpkg\lyr. ERROR 87931
which, with the help of arcpy.GetIDMessage(87931), becomes 'Object: Tool or environment <%1> not found'
I was able to isolate this error and found out that it has to do with the ThreadPoolExecutor. I created a minimal script which I used to reproduce this behaviour:
import asyncio
import concurrent.futures

import arcpy.conversion
import arcpy.management


async def main():
    # works
    await normal_test()
    # doesn't work
    await threadpool_test()


async def normal_test():
    convert()


async def threadpool_test():
    executor = concurrent.futures.ThreadPoolExecutor(4)
    await asyncio.get_event_loop().run_in_executor(executor, convert)


def convert():
    arcpy.management.CreateFileGDB("<path>", "out")
    arcpy.conversion.FeatureClassToGeodatabase("test.gpkg/lyr", "out.gdb")
    print("After converting")


# start whole script
asyncio.run(main())
So, without even running anything in parallel yet, FeatureClassToGeodatabase fails when run inside a ThreadPoolExecutor, but works fine when called normally. CreateFileGDB, however, works in both scenarios.
Does anybody have an idea why some functions work inside a ThreadPoolExecutor while others do not, and maybe know a way to get them to work?
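For context, the fan-out I am ultimately after would look roughly like the sketch below. It only shows the structure of running several independent conversions through an executor and collecting per-input failures; it does not address the arcpy error. The `convert` here is a hypothetical stand-in that takes a layer path and does no real work, and `convert_many` is a name made up for this sketch.

```python
import asyncio
import concurrent.futures


def convert(gpkg_layer):
    """Hypothetical stand-in for the real download-and-convert function."""
    if not gpkg_layer.endswith("/lyr"):
        raise ValueError("unexpected layer path: " + gpkg_layer)
    # Pretend the output geodatabase is named after the geopackage.
    return gpkg_layer.split(".gpkg")[0] + ".gdb"


async def convert_many(gpkg_layers, max_workers=4):
    """Run convert() for every input in a thread pool and gather the outcomes."""
    loop = asyncio.get_running_loop()
    with concurrent.futures.ThreadPoolExecutor(max_workers) as executor:
        tasks = [
            loop.run_in_executor(executor, convert, layer)
            for layer in gpkg_layers
        ]
        # return_exceptions=True returns each failure in place of its result,
        # so one failing input does not hide the outcome of the others.
        return await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == "__main__":
    for outcome in asyncio.run(convert_many(["a.gpkg/lyr", "b.gpkg/lyr"])):
        print(outcome)
```

Results come back in the same order as the inputs, so each entry can be matched to its geopackage and checked with `isinstance(outcome, Exception)`.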
Environment:
I'm using ArcPy from ArcGIS Pro 2.9 with Python 3.7.11.
I'm working on Windows.
Edit: Based on a recommendation in the comments, I also tried the multiprocessing module without asyncio (the convert function is the same as above):
import arcpy.conversion
import arcpy.management
from multiprocessing import Process

if __name__ == "__main__":
    proc = Process(target=convert)
    proc.start()
    proc.join()
    print("Finished")
This code has the same problem: FeatureClassToGeodatabase just doesn't work. This time, however, there isn't even an error. The print statement "After converting" gets executed after about one second and nothing has happened. Certainly no converting...
[...] multiprocessing.set_executable() to point at pythonw.exe, since sys.executable was being set to ArcMap's exe. It would also simplify things to get rid of asyncio, at least for the minimal reproducible example.
[...] ProcessPoolExecutor with asyncio and using process.start etc. directly). That won't produce an error, but somehow it doesn't work either. The FeatureClassToGeodatabase will just not do anything, it seems. If I put a print statement after it, it will get called after a second, but normally this should take about 2-3 minutes...
[...] concurrent.futures without asyncio, was what I meant: docs.python.org/3/library/concurrent.futures.html
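The two suggestions from the comments could be sketched roughly as follows: pointing multiprocessing at pythonw.exe, and using concurrent.futures directly without asyncio. This is an illustration under assumptions rather than a confirmed fix. `convert` is a hypothetical stand-in that does no real conversion, `use_pythonw_for_children` and `convert_all` are names made up here, and pythonw.exe is assumed to sit directly under `sys.exec_prefix`, as in the example in the Python multiprocessing documentation.

```python
import concurrent.futures
import multiprocessing
import os
import sys


def use_pythonw_for_children():
    """Point multiprocessing at pythonw.exe if it exists (Windows layout assumed)."""
    candidate = os.path.join(sys.exec_prefix, "pythonw.exe")
    if os.path.isfile(candidate):
        multiprocessing.set_executable(candidate)
        return True
    return False


def convert(gpkg_path):
    """Hypothetical stand-in for the real download-and-convert function."""
    root, ext = os.path.splitext(gpkg_path)
    if ext != ".gpkg":
        raise ValueError("not a geopackage: " + gpkg_path)
    return root + ".gdb"


def convert_all(inputs, executor_cls=concurrent.futures.ThreadPoolExecutor,
                max_workers=4):
    """Submit every input to an executor and collect results and errors."""
    results, errors = {}, {}
    with executor_cls(max_workers=max_workers) as pool:
        futures = {pool.submit(convert, item): item for item in inputs}
        for future in concurrent.futures.as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:
                # Keep the exception so a failing input is not silently lost.
                errors[item] = exc
    return results, errors
```

Passing `executor_cls=concurrent.futures.ProcessPoolExecutor` switches the same code to processes. On Windows that call needs to happen under the `if __name__ == "__main__":` guard, with `use_pythonw_for_children()` called before the pool is created.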