Using arcpy.da.UpdateCursor to delete duplicate rows where values switch columns

Question 1

I'm trying to dissolve polygons using the Polygon Neighbors output in ArcGIS 10.1, but I don't want to process duplicate rows. The catch is that in a duplicate row the two field values are swapped between columns.

If I run this:

import arcpy

arcpy.env.workspace = r"C:\Users\Ant\Documents\ArcGIS\ITN.gdb"
fc = "BLPUs_PolygonNeighbors"
fields = ["src_OBJECTID", "nbr_OBJECTID"]
# Print each source/neighbor ObjectID pair in the Polygon Neighbors table
with arcpy.da.SearchCursor(fc, fields) as cursor:
    for row in cursor:
        print("{0}, {1}".format(row[0], row[1]))

I get:

1, 2
1, 4
2, 1
2, 3
2, 4
3, 2
3, 4
4, 1
4, 2
4, 3

When I then run my script to dissolve based on these values, I end up with duplicate polygons, such as 1 & 2 as well as 2 & 1. Can someone please help me write an UpdateCursor that goes through the table and deletes these duplicates? I don't know how to detect a duplicate when the field values have switched columns.

Also, what if I then have three polygons to merge? If I had three columns such as OID1, OID2 and OID3, is there an SQL expression that can tell whether the three values in a row have already appeared together in an earlier row, just in a different order? Thanks

Answer 1

This should do it with a single pass.

import arcpy

arcpy.env.workspace = r"C:\Users\Ant\Documents\ArcGIS\ITN.gdb"
fc = "BLPUs_PolygonNeighbors"
fields = ["src_OBJECTID", "nbr_OBJECTID"]
row_pairs = set()  # order-independent pairs seen so far
with arcpy.da.UpdateCursor(fc, fields) as cursor:
    for row in cursor:
        # Sort the values so that (2, 1) and (1, 2) produce the same key
        row_pair = tuple(sorted(row))
        if row_pair in row_pairs:
            cursor.deleteRow()  # same pair as an earlier row, so remove it
        else:
            row_pairs.add(row_pair)
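
To see what the single pass does without touching a geodatabase, here is a minimal sketch that runs the same set-based logic on the ten pairs printed in the question. The helper name `split_duplicates` and the plain list standing in for the cursor are assumptions for illustration; nothing in it calls arcpy.

```python
def split_duplicates(rows):
    """Return (kept, dropped) rows, treating each row as an unordered group of IDs."""
    seen = set()
    kept, dropped = [], []
    for row in rows:
        key = tuple(sorted(row))  # (2, 1) and (1, 2) share the key (1, 2)
        if key in seen:
            dropped.append(row)   # where the cursor version calls deleteRow()
        else:
            seen.add(key)
            kept.append(row)
    return kept, dropped


rows = [(1, 2), (1, 4), (2, 1), (2, 3), (2, 4),
        (3, 2), (3, 4), (4, 1), (4, 2), (4, 3)]
kept, dropped = split_duplicates(rows)
print(kept)     # [(1, 2), (1, 4), (2, 3), (2, 4), (3, 4)]
print(dropped)  # [(2, 1), (3, 2), (4, 1), (4, 2), (4, 3)]
```

Because the key is simply the sorted row, the same approach should work unchanged if the field list has three ID columns instead of two.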

Comment 1 (Paul, Aug 21, 2013 at 22:49)

By creating an empty set, will row_pair only be added if it doesn't already exist in row_pairs? I'm trying to understand why to use that over creating an empty list.

Comment 2 (Jason Scheirer, Aug 21, 2013 at 23:30)

If the row_pair is not yet in the set, it gets added to the set. But if it's already in the set, the row is a known duplicate and it's deleted. A set is useful because it only ever holds one copy of each value, so it won't get excessively big, and it uses a hash to look up values, so searching it is faster than searching a list.
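
A small sketch of that difference in plain Python, using made-up values rather than the real table:

```python
pairs_set = set()
pairs_list = []

for pair in [(1, 2), (1, 2), (1, 4)]:
    pairs_set.add(pair)      # adding a value that is already present changes nothing
    pairs_list.append(pair)  # a list keeps every copy

print(len(pairs_set))        # 2
print(len(pairs_list))       # 3
print((1, 2) in pairs_set)   # True, found by hash lookup rather than by scanning
print((2, 1) in pairs_set)   # False, which is why each row is sorted before the check
```

The last line shows that a set on its own does not treat (2, 1) and (1, 2) as equal; it is the sorting step that makes swapped rows collide.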

Comment 3 (user2581350, Aug 22, 2013 at 0:26)

Thanks, that worked, although I had to use UpdateCursor rather than SearchCursor. I've edited the answer. Cheers

Comment 4 (Paul, Aug 22, 2013 at 0:53)

+1 Hmm, I need to look more into sets then, thanks!

Comment 5 (user2581350, Aug 22, 2013 at 14:21)

I guess it was for the deleteRow :)

Answer 2

You can use sorted() (a built-in Python function) in a generator expression to sort each sub-list separately, and then call sorted() on the entire list so that it is in ascending order and swapped duplicates end up next to each other. Using your example:

cursordata = [[1, 2], [1, 4], [2, 1], [2, 3], [2, 4], [3, 2], [3, 4], [4, 1], [4, 2], [4, 3]]
print(sorted(sorted(x) for x in cursordata))
[[1, 2], [1, 2], [1, 4], [1, 4], [2, 3], [2, 3], [2, 4], [2, 4], [3, 4], [3, 4]]

This can easily be extended to sublists of length n.
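
As a sketch of that extension, and of one way to go from the sorted list to a de-duplicated one, the snippet below turns each sorted sub-list into a tuple so that it can be stored in a set. The three-value rows are made-up sample data, not values from the original table.

```python
triples = [[1, 2, 3], [3, 1, 2], [2, 3, 4], [4, 3, 2], [1, 2, 4]]

# Lists are not hashable, so each sorted row becomes a tuple before it enters the set.
unique_rows = sorted(set(tuple(sorted(row)) for row in triples))
print(unique_rows)  # [(1, 2, 3), (1, 2, 4), (2, 3, 4)]
```

This yields the unique combinations but does not say which table rows to delete; for that, the comparison still has to happen row by row inside an UpdateCursor.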

Accepted answer: the single-pass UpdateCursor solution above, posted by Jason Scheirer on 2013-08-21 21:58:29Z and edited by user2581350 on Aug 22, 2013 at 0:40.


