How can I remove duplicate values within a text field using ArcPy?

The following example shows a list of numbers for each row in a feature class, some of which contain duplicate numbers.

I am aware of UpdateCursors (code snippet below), but I am unsure how to remove the duplicate numbers, which are stored as a string.

with arcpy.da.UpdateCursor(fc, ["field"]) as cursor:
    for row in cursor:
        pass  # unsure how to remove the duplicates here


Aaron
asked Jul 22, 2019 at 5:19
  • Welcome to GIS SE! We're a little different from other sites. We're a Q&A site, not a discussion forum. For questions that involve code we ask that you show us where you are stuck with your own code by including a code snippet in your question. There is an edit button beneath your question which will enable you to do that and a {} button that enables you to format any highlighted code nicely. Please check out our short tour for more about how the site works. Thanks. Commented Jul 22, 2019 at 5:59

1 Answer

Try the following, which uses an Update Cursor. It assumes that 1) your input field (dup_field) and output field (new_field) are both text fields and 2) you have already created new_field to hold the results.

import arcpy

fc = r'C:\path\to\your\geodatabase.gdb\featureclass'
with arcpy.da.UpdateCursor(fc, ("dup_field", "new_field")) as cursor:
    for row in cursor:  # Iterate through each row
        if row[0] is not None:  # Skip rows where the source field is null
            # Convert the text to integers so the unique values sort numerically
            a = sorted(set(int(x) for x in row[0].split(',')))
            b = str(a).strip('[').strip(']')  # Convert back to text and drop the brackets
            row[1] = b  # Write the results
            cursor.updateRow(row)

The following line splits the text on commas and converts each piece to an integer. set() then keeps only the unique values, and sorted() returns them as a list in ascending order:

a = sorted(set(int(x) for x in row[0].split(',')))

This line cleans up the result in a by converting it to text and stripping the brackets, so that the string can be written back to the attribute table:

b = str(a).strip('[').strip(']')
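To try the logic without a geodatabase, the two lines above can be pulled into a small standalone function. This is only a sketch: the name dedupe_sorted is hypothetical, and skipping empty tokens (for example after a trailing comma) is an added assumption that the cursor code above does not make. The output uses the same ", " separator that str(a).strip('[').strip(']') produces.

```python
def dedupe_sorted(text, sep=","):
    """Return the unique integers in a delimited string, sorted ascending.

    dedupe_sorted is a hypothetical helper name, not part of ArcPy.
    """
    if text is None:
        return None
    # Assumption: ignore empty tokens such as those left by a trailing comma.
    tokens = [t.strip() for t in text.split(sep) if t.strip()]
    unique = sorted(set(int(t) for t in tokens))
    # Join with ", " to match the format produced by stripping the list brackets.
    return ", ".join(str(n) for n in unique)


print(dedupe_sorted("3,1,2,3,1"))  # prints: 1, 2, 3
```

Inside the cursor loop this could be used as row[1] = dedupe_sorted(row[0]). Note that any non-numeric token would still raise a ValueError from int().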


answered Jul 22, 2019 at 6:13
  • If the original order is important, it is also possible to use collections.OrderedDict: ",".join(OrderedDict.fromkeys(row[0].split(','))) Commented Jul 22, 2019 at 7:11
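The OrderedDict suggestion in the comment above can be sketched as a standalone function. The name dedupe_keep_order is hypothetical, and stripping whitespace and skipping empty tokens are added assumptions. The values stay as strings, so no integer conversion (and no sorting) takes place.

```python
from collections import OrderedDict


def dedupe_keep_order(text, sep=","):
    """Remove duplicate tokens while keeping first-occurrence order.

    dedupe_keep_order is a hypothetical helper name, not part of ArcPy.
    """
    if text is None:
        return None
    # Assumption: trim whitespace and drop empty tokens before deduplicating.
    tokens = [t.strip() for t in text.split(sep) if t.strip()]
    # OrderedDict.fromkeys keeps one key per distinct token, in insertion order.
    return sep.join(OrderedDict.fromkeys(tokens))


print(dedupe_keep_order("3,1,3,2,1"))  # prints: 3,1,2
```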
