Before I start, I should mention that I have never worked in Python before.
I am trying to process a large feature-class-to-feature-class update: 3-million-plus rows with 70 columns. I first tried an UpdateCursor nested with a SearchCursor, but performance was extremely slow, around 1 record per minute. I switched to an UpdateCursor with a dictionary, which hugely improved performance, but as I am restricted to 32-bit Python I can only process around 600,000 rows at a time. To get around this I am splitting the feature class into smaller temporary feature classes, running the dictionary update on each of these, and then deleting the temporary feature class and the dictionary before looping back through.
This works perfectly for the first iteration, but then gives a memory error when building the dictionary on the second iteration. I have tried forcing garbage collection to clear the issue, but that didn't work.
I can only guess that there are still references open to the dictionary. Any insights or suggestions would be most welcome.
Here is the offending section of code (please excuse the print statements; they are only there for testing):
import arcpy  # imports live at the top of the full script; shown here so the snippet is self-contained
import gc

# start changes
# split the changes into smaller feature classes
lyr = arcpy.mapping.Layer(sourceFc)
arcpy.SelectLayerByAttribute_management(lyr, "CLEAR_SELECTION")

# collect the OIDs of all changed records
fList = list()
with arcpy.da.SearchCursor(lyr, "OID@", "Change_Type = 'C'") as cursor:
    for row in cursor:
        fList.append(row[0])

# split the OID list into groups and process each group in turn
listGroup = listSplit(fList, outputNum)
for x in listGroup:
    lyr.setSelectionSet("NEW", x)
    arcpy.CopyFeatures_management(lyr, outputFCName)  # + str(nums))

    # process changes: build a lookup for this chunk, then push it into the target
    upDict = {r[0]: r[1:] for r in arcpy.da.SearchCursor(outputFCName, fields)}
    print "Dictionary Created"
    with arcpy.da.UpdateCursor(targetFc, fields) as updateRows:
        for updateRow in updateRows:
            keyValue = updateRow[0]
            if keyValue in upDict:
                for n in range(1, len(fields)):
                    updateRow[n] = upDict[keyValue][n - 1]
                ucount = ucount + 1
                updateRows.updateRow(updateRow)

    # clean up before the next iteration
    del upDict
    gc.collect()
    arcpy.Delete_management(outputFCName)
    print "Processed " + str(ucount) + " records."
- upDict.clear() will reset the contents. – klewis, Feb 21, 2019 at 20:48
- Just reduce the size of the chunk. – FelixIP, Feb 21, 2019 at 21:59
- @klewis already tried that, but thanks for commenting. – Magus, Feb 22, 2019 at 10:18
- @FelixIP reducing the chunk size down to 250,000 records per iteration allows it to complete. I increased this until I found the size at which it fails (500,000). Maybe you could tell me why that makes a difference when the larger chunk size processed fine for one iteration? Is Python slow to release references/memory, or does the memory overflow because of the extra object overhead? – Magus, Feb 22, 2019 at 11:41
1 Answer
@FelixIP – Just reduce the size of the chunk.

Reducing the chunk size down to 250,000 records per iteration allows it to complete. I increased this until I found the size at which it fails (500,000).
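For anyone hitting the same wall, here is a rough sketch of the pattern that worked: the OID list is walked in 250,000-record chunks, and the dictionary is explicitly cleared (as klewis suggested) before the next chunk's dictionary is built. Variable names such as sourceFc, targetFc, outputFCName and fields are as in the question; the exact wiring of the loop below is a simplified approximation, not the production script:

import gc
import arcpy

CHUNK_SIZE = 250000  # largest chunk that completed reliably in 32-bit Python

lyr = arcpy.mapping.Layer(sourceFc)
arcpy.SelectLayerByAttribute_management(lyr, "CLEAR_SELECTION")

# collect the OIDs of all changed records once
oids = [row[0] for row in arcpy.da.SearchCursor(lyr, "OID@", "Change_Type = 'C'")]

ucount = 0
for start in range(0, len(oids), CHUNK_SIZE):
    chunk = oids[start:start + CHUNK_SIZE]
    lyr.setSelectionSet("NEW", chunk)
    arcpy.CopyFeatures_management(lyr, outputFCName)

    # build the lookup for this chunk only
    upDict = {r[0]: r[1:] for r in arcpy.da.SearchCursor(outputFCName, fields)}

    with arcpy.da.UpdateCursor(targetFc, fields) as updateRows:
        for updateRow in updateRows:
            keyValue = updateRow[0]
            if keyValue in upDict:
                for n in range(1, len(fields)):
                    updateRow[n] = upDict[keyValue][n - 1]
                ucount = ucount + 1
                updateRows.updateRow(updateRow)

    # release this chunk's memory before the next pass builds a new dictionary
    upDict.clear()
    del upDict
    gc.collect()
    arcpy.Delete_management(outputFCName)

print "Processed " + str(ucount) + " records."

The key point is that the dictionary only ever holds one chunk's worth of rows and is cleared and deleted before the next dictionary comprehension allocates a new one.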