I am trying to speed up my data processing, and I am looking for Python code that can go through my shapefiles, find specific single characters (which I define) in the string fields of the attribute table, and replace them with other defined characters.
For example, I have shapefiles in my "D:\GIS_One" folder. All of them have a few string (text) fields in their attribute tables, with values such as "Day&Night", "Mix!" and "Über". In this example, I would like to replace "&" with "_", "!" with "One" and "Ü" with "U". Only the single characters, not whole words, since I have other code for changing whole words. I am just looking for code that does single-character replacement.
I already tried the approach from "How to iterate fields and remove Null values and spaces" but could not get it to work. I also know how to do it with the Field Calculator and with Excel/LibreOffice, but I am looking for a Python solution so I can speed up my work.
EDIT: As Michael asked, I worked through the first two code samples, and I am now working on a third from the posted link. Here is the code I am using:
import arcpy
fc = ["D:\GIS_One\Ride.shp", "D:\GIS_One\Unknown.shp" ]
fieldList = [f.name for f in arcpy.ListFields(fc) if f.type == "String"]
if fieldList:
    with arcpy.da.UpdateCursor(fc, [fieldList]) as cursor:
        for row in cursor:
            for i in range (len(fieldList)):
                if not row[i]:
                    row[i] == ""
                elif row[i] == "&":
                    row[i] == "_"
                elif row[i] == "!":
                    row[i] == "1"
                elif row[i] == "Ü":
                    row[i] == "U"
            cursor.updateRow(row)
print "Processing complete"
I get this error:
Runtime error
Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "c:\program files\arcgis\desktop10.1\arcpy\arcpy\__init__.py", line 1075, in ListFields
    return gp.listFields(dataset, wild_card, field_type)
  File "c:\program files\arcgis\desktop10.1\arcpy\arcpy\geoprocessing\_base.py", line 344, in listFields
    self._gp.ListFields(*gp_fixargs(args, True)))
IOError: "['D:/GIS_One/Ride.shp', 'D:/GIS_One/Unknown.shp']" does not exist
If I try with a single feature class from a geodatabase, using this code:
import arcpy
fc = "D:\GIS_Temp\TEST.gdb\ONE"
fieldList = [f.name for f in arcpy.ListFields(fc) if f.type == "String"]
if fieldList:
    with arcpy.da.UpdateCursor(fc, [fieldList]) as cursor:
        for row in cursor:
            for i in range (len(fieldList)):
                if not row[i]:
                    row[i] == ""
                elif row[i] == "&":
                    row[i] == "_"
                elif row[i] == "!":
                    row[i] == "One"
                elif row[i] == "Ü":
                    row[i] == "U"
            cursor.updateRow(row)
print "Processing complete"
I get this error:
Runtime error
Traceback (most recent call last):
  File "<string>", line 5, in <module>
TypeError: 'field_names' must be string or non empty sequence of strings
It would be really helpful if I could get the code to work on shapefiles.
- How did it not work? Error messages? Which answer are you using? (There are 3 of them.) Can you include your non-working code? An update cursor would do it with string.replace('&','').replace('!',''), as multiple replaces can be chained in the same operation. – Michael Stimson, Apr 29, 2016 at 2:44
- They're single code points, but they're not single characters. It's no faster to change a single code point after insert, since updates are done at the field level. – Vince, Apr 29, 2016 at 2:46
- You don't want ==; that would only work if the entire string is "&", etc. Look at string.find (tutorialspoint.com/python/string_find.htm): elif row[i].find("&") >= 0: row[i] = row[i].replace("&",""). But really it's not worth searching first; I'd get the value (val = row[i]), just do the replaces (val = val.replace("&","")), and if val != row[i]: row[i] = val, then store. Your error message says something different though: something is wrong with getting the field names. Try printing fieldList to see if it's already a list, in which case you don't need the list brackets on the cursor. – Michael Stimson, Apr 29, 2016 at 3:59
- I think the first error is happening because you're trying to pass a list (fc = ["D:\GIS_One\Ride.shp", "D:\GIS_One\Unknown.shp" ]) to arcpy.ListFields(fc) instead of looping each individual layer through. – TsvGis, Apr 29, 2016 at 5:43
- Also, [f.name for f in arcpy.ListFields(fc) if f.type == "String"] could be changed to [f.name for f in arcpy.ListFields(fc,field_type="String")], as the function already has the ability to search just for text/string fields, but I don't think this will resolve the errors. – TsvGis, Apr 29, 2016 at 5:48
3 Answers
You need to use str.replace() to replace the characters in your field values.
You also need a for fc in fcs: loop to go through your shapefiles, and you need to remove the square brackets from around fieldList in your cursor (this is what is giving you the error 'field_names' must be string or non empty sequence of strings).
And note the u in front of u"Ü" so that you don't get a decode error.
import arcpy
fcs = [r"D:\GIS_One\Ride.shp", r"D:\GIS_One\Unknown.shp"]
for fc in fcs:
    # Only text fields can hold the characters to replace
    fieldList = [f.name for f in arcpy.ListFields(fc) if f.type == "String"]
    if fieldList:
        with arcpy.da.UpdateCursor(fc, fieldList) as cursor:
            for row in cursor:
                for i in range(len(fieldList)):
                    if row[i]:  # skip empty or Null values
                        newFieldvalue = row[i].replace("&", "_").replace("!", "1").replace(u"Ü", "U")
                        row[i] = newFieldvalue
                # Write the row back once all of its fields are cleaned
                cursor.updateRow(row)
print("Processing complete")
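The chained replace calls get long as more characters are added. Here is a minimal sketch of one alternative, assuming a made-up `REPLACEMENTS` table and two hypothetical helpers (`clean_value`, `find_shapefiles`) that are not part of arcpy: keep the character pairs in a dictionary, and collect the shapefile paths from the folder with `glob` instead of typing them out. The helpers are plain Python, so they can be tried without ArcGIS.

```python
# -*- coding: utf-8 -*-
import glob
import os

# Hypothetical replacement table: single character -> replacement text.
REPLACEMENTS = {
    u"&": u"_",
    u"!": u"1",
    u"\u00dc": u"U",  # Ü written as an escape, so the file encoding does not matter
}


def clean_value(value, replacements):
    """Return value with every key of replacements swapped for its mapped text.

    Empty or None values are returned unchanged.
    """
    if not value:
        return value
    for old, new in replacements.items():
        value = value.replace(old, new)
    return value


def find_shapefiles(folder):
    """Return a sorted list of the .shp paths directly inside folder."""
    return sorted(glob.glob(os.path.join(folder, "*.shp")))
```

Inside the cursor loop this would presumably be used as `row[i] = clean_value(row[i], REPLACEMENTS)`, with `fcs = find_shapefiles(r"D:\GIS_One")` in place of the hand-written list.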
- +1 Nice compact str.replace() method. Also nice catch on the unicode string. – Apr 29, 2016 at 5:30
- Wow, this one worked like a charm. Thank you for the explanation; I learned so much from this post. I now know how to loop through shapefiles with the "for fc in fcs:" code, and how to use the replace part of the code. This is very helpful. – Dean7, Apr 30, 2016 at 1:40
I see a couple of things going on here. In your first code example you set fc to a list of two feature classes. The ListFields function expects a single feature class, not a list, so you have to iterate through your feature class list. The next big thing I see is that the special characters you're trying to get rid of are confusing Python. It is best to look up the character codes for them and feed those in with the chr function. (I found this table of ASCII codes with a quick Google search.) One more note about the codes: ASCII itself only covers codes below 128, and in Python 2 chr returns a plain byte string. The umlaut U is 220, so we have to get a little fancy and decode that byte as Latin-1 to make it work. I believe the code below should run just fine on both shapefiles and feature classes in a geodatabase.
import arcpy
fcLst = [r"D:\GIS_One\Ride.shp", r"D:\GIS_One\Unknown.shp"]
for fc in fcLst:
    fieldList = [f.name for f in arcpy.ListFields(fc) if f.type == "String"]
    if fieldList:
        with arcpy.da.UpdateCursor(fc, fieldList) as cursor:
            for row in cursor:
                for i in range(len(fieldList)):
                    if not row[i]:  # skip empty or Null values
                        continue
                    row[i] = row[i].replace(chr(38), "_")  # 38 is "&"
                    row[i] = row[i].replace("!", "1")
                    # 220 is outside ASCII, so decode the byte as Latin-1 to get Ü
                    row[i] = row[i].replace(unicode(chr(220), encoding="latin1"), "U")
                cursor.updateRow(row)
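To see what the chr(220) line is doing, here is a small sketch that runs without arcpy. It uses only built-in functions; `bytearray` is there only so the same lines run on both Python 2 and Python 3, and the variable names are made up for illustration. It checks that byte 220 decoded as Latin-1 is the same character as the escape u"\u00dc" (Ü), which suggests the escape could be passed to replace directly.

```python
# Byte value 220 is outside ASCII (0-127), so it has to be decoded with an
# explicit encoding before it can be compared with text from the cursor.
raw = bytes(bytearray([220]))       # one byte, 0xDC
u_umlaut = raw.decode("latin-1")    # Latin-1 maps byte 220 to U+00DC

# The same character written as a unicode escape.
same_char = u"\u00dc"

# For comparison, "&" is plain ASCII, code 38.
ampersand = bytes(bytearray([38])).decode("ascii")

# Replacing the decoded character works on unicode text.
cleaned = u"\u00dcber".replace(u_umlaut, u"U")
```

Here `u_umlaut == same_char` holds and `cleaned` is "Uber".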
- This one worked too. I really liked how you pointed to the ASCII table with the codes. It could be useful with other characters, if I stumble upon those. – Dean7, Apr 30, 2016 at 1:51
- How is it possible to change characters like Š, Č, Đ, Ž? I found the codes for those characters and put them into the brackets, and used Windows-1250 for the encoding, but something is missing. – Dean7, May 12, 2016 at 18:29
- I found working codes for those characters for Central European languages. – Dean7, May 12, 2016 at 21:33
I am posting this as an answer because it could be useful to other users who work with Central European characters. I think the same approach works for other characters and languages, but the character code and the encoding must be changed to match the language. This is an example for the Central European encoding and the character Đ. This site could be useful for the codes.
import arcpy
fcLst = [r"PATH"]  # Path to your shapefile
for fc in fcLst:
    fieldList = [f.name for f in arcpy.ListFields(fc) if f.type == "String"]
    if fieldList:
        with arcpy.da.UpdateCursor(fc, fieldList) as cursor:
            for row in cursor:
                for i in range(len(fieldList)):
                    # Use the character's code from the linked site and the encoding
                    # that matches your characters. This example is Central European
                    # (CP852): code 209 is Đ, which is replaced with D.
                    if row[i]:  # skip empty or Null values
                        row[i] = row[i].replace(unicode(chr(209), encoding="CP852"), "D")
                cursor.updateRow(row)
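The numeric code of a character depends on the code page, which may be why a code looked up for one encoding fails with another. Here is a small sketch using only Python's built-in codecs (the names `codes`, `CENTRAL_EUROPEAN` and `strip_marks` are made up for illustration): it asks Python for the byte value of Đ under CP852 and Windows-1250, and shows a replacement table keyed by unicode escapes that avoids byte codes altogether.

```python
# -*- coding: utf-8 -*-
D_STROKE = u"\u0110"  # Đ

# Ask each code page which byte value it uses for Đ.
codes = {}
for encoding in ("cp852", "cp1250"):
    encoded = bytearray(D_STROKE.encode(encoding))
    codes[encoding] = encoded[0]

# Hypothetical table keyed by unicode escapes; no code page lookup needed.
CENTRAL_EUROPEAN = {
    u"\u0160": u"S",  # Š
    u"\u010c": u"C",  # Č
    u"\u0110": u"D",  # Đ
    u"\u017d": u"Z",  # Ž
}


def strip_marks(text):
    """Replace each character listed in CENTRAL_EUROPEAN with its plain letter."""
    for old, new in CENTRAL_EUROPEAN.items():
        text = text.replace(old, new)
    return text
```

Here `codes` comes out as 209 for CP852 and 208 for Windows-1250, so the same number cannot be reused across the two encodings. The table only lists uppercase letters; lowercase forms would need their own entries.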