When I load a CSV file to join with ArcPy, ArcGIS converts any string field whose contents look numeric to an integer type. As a workaround, I have to add another field to the feature class I'm joining to, holding an integer version of the field I originally intended to join on.
There are two approaches I've tried to convert the contents of the original field to an int:
Approach 1:
arcpy.AddField_management(layerName, "BG10_INT", "LONG", 12, "", "", "", "NULLABLE", "")
expression = "toInt(!BG10!)"
codeblock = """def toInt(value):
if value.isdigit():
return int(value)
else:
return 0"""
arcpy.CalculateField_management (layerName, "BG10_INT", expression, "PYTHON", codeblock)
Approach 2:
rows = arcpy.UpdateCursor(layerName)  # cursor creation implied by the original snippet
for row in rows:
    if row.BG10.isdigit():
        row.BG10_INT = int(row.BG10)
    else:
        row.BG10_INT = 0
    rows.updateRow(row)
With either approach, I get The value type is incompatible with the field type. [BG10_INT]. How can something as simple as converting a string to an int be such a challenge?
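One sanity check worth running first is to list the fields and see what types ArcGIS actually created (a minimal diagnostic; layerName is the layer from the snippets above):
import arcpy
# Prints the type ArcGIS assigned to each field (String, Integer, Double, ...);
# a mismatch here would explain the incompatible-type error above.
for field in arcpy.ListFields(layerName):
    print("{}: {}".format(field.name, field.type))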
After several comments and an answer suggesting I define a schema.ini file for the CSV instead of the above conversions, I tried the following approach:
Approach 3:
env.workspace = "C:/data/PCA"
f = open("C:/data/PCA/schema.ini", "w")
text = ["[bg_to_tr.csv]", "ColNameHeader=True", "Format=CSVDelimited", "Col1=BG10 Text Width 12", "Col2=PrefBUC1 Text Width 10"]
for row in text:
    f.write(row + "\n")  # plain file objects have no writerow(); write each line explicitly
f.close()
env.qualifiedFieldNames = False
inFeatures = "C:/geodatabases/Intermediate.gdb/BG_sample"
layerName = "BG_sample_lyr"
joinTable = "bg_to_tr.csv"
joinField = "BG10"
arcpy.MakeFeatureLayer_management(inFeatures, layerName)
outFeature = "C:/geodatabases/Intermediate.gdb/BG_sample_w_tr"
arcpy.AddJoin_management(layerName, joinField, joinTable, joinField)
arcpy.CopyFeatures_management(layerName, outFeature)
This approach only works when I perform the join manually within ArcMap (without ArcPy). For my purposes, however, this needs to be done with ArcPy. Is there an explanation for why it only seems to work in ArcMap? The CSV file was joined without adding it directly to the ArcMap TOC.
Here is a sample of the contents of the csv file:
"BG10","PrefBUC1"
"170312106022","0002804408"
"170312106023","0002804408"
"170312105012","0002804408"
If I open the CSV file in ArcMap, it recognizes schema.ini and shows the BG10 column as a String of length 12. When I look at the join output in BG_sample_w_tr, however, it indicates that the column was read as data type Double.
If it matters, neither table contains null values for the joined fields.
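One way to check whether ArcPy honors schema.ini at all, independently of ArcMap, is to list the CSV's fields directly (a quick diagnostic using the paths above):
import arcpy
# If schema.ini is being read, BG10 should report as String;
# if ArcPy falls back to type inference, it will report Double.
for field in arcpy.ListFields("C:/data/PCA/bg_to_tr.csv"):
    print("{}: {}".format(field.name, field.type))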
2 Answers
You can take care of this before ever trying to load the data into ArcGIS by creating a schema.ini file. Simply use a text editor to save a file called schema.ini (no other extension) in the same folder as your CSV. Then, explicitly specify the data types for the columns whose types you don't want ArcGIS to infer.
For example, if your CSV were called mydata.csv and the 4th column were the one ArcGIS is loading as text, your schema.ini would look like this:
[mydata.csv]
ColNameHeader=True
Format=CSVDelimited
Col4=BG10 Long
ArcGIS always checks for schema.ini files, and will therefore always load the column as an integer if that is how you've specified it. If multiple CSV files are present in the same folder, they can all be controlled from the same schema.ini.
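If the join still misbehaves against the raw CSV even with schema.ini in place, a related workaround (a sketch, not part of the answer above) is to import the CSV into the geodatabase first, so the join sees fixed field types:
import arcpy
# Convert the CSV to a geodatabase table; the field types are fixed at
# import time, and the resulting table joins reliably from ArcPy.
# Paths assume the layout from the question.
arcpy.TableToTable_conversion("C:/data/PCA/bg_to_tr.csv",
                              "C:/geodatabases/Intermediate.gdb",
                              "bg_to_tr")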
- See edit. Joining after creating a schema.ini file only seems to work when manually joining within ArcMap, but not by using ArcPy. (baobobs, Mar 3, 2015)
- Are there any rows where the "BG10" value is not an integer (even nulls)? How exactly are the "BG10" values encoded in the CSV, if you open it in a text editor? (nmpeterson, Mar 3, 2015)
- Neither table contains null values for the joined fields. The contents of the BG10 field are all strings. (baobobs, Mar 4, 2015)
- Try creating a Table View of your CSV with arcpy.MakeTableView_management(), and then joining the Table View to the Feature Layer instead of the raw CSV. (nmpeterson, Mar 4, 2015)
- That didn't work either. I know this deserves a separate question, but the data I'm joining was in Python list form prior to writing to a CSV. If I could simply join the Python list to the feature layer, that would be most preferable. I know this could be done with UpdateCursor, but that would be a nested loop, not the fastest way to do things. Otherwise, an alternate option is converting the list into a dbf instead of a CSV. I'll tinker with this when I get back to the office. (baobobs, Mar 7, 2015)
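For what it's worth, the UpdateCursor route from the last comment doesn't have to be a nested loop: loading the list into a dictionary first makes it a single pass. A minimal sketch, assuming the list l holds (BG10, TR) string pairs as in the answer below, and that a text field TR has already been added to the feature class (that field name is an assumption):
import arcpy
# Build a lookup from the Python list: one dictionary probe per feature,
# rather than re-scanning the list for every row.
lookup = {str(bg): str(tr) for bg, tr in l}
fc = "C:/geodatabases/Intermediate.gdb/BG_sample"
with arcpy.da.UpdateCursor(fc, ["BG10", "TR"]) as cursor:  # arcpy.da requires 10.1+
    for row in cursor:
        row[1] = lookup.get(row[0], "")  # empty string when no match
        cursor.updateRow(row)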
I found a solution with the use of dbfpy. I converted my Python list to a dbf using this module and joined the resulting table to the feature layer. This sidesteps the type-inference problem because a dBASE table stores each field's type explicitly, so there is nothing for ArcGIS to guess.
from dbfpy import dbf  # pip install dbfpy (Python 2)

db = dbf.Dbf("C:/data/test.dbf", new=True)
db.addField(
    ("BG10", "C", 12),
    ("TR", "C", 10),
)
for i in l:
    rec = db.newRecord()
    rec["BG10"] = str(i[0])
    rec["TR"] = str(i[1])
    rec.store()
try:
    db.close()
except AttributeError:  # workaround for an AttributeError dbfpy raised on close
    pass
inFeatures = "C:/data/Intermediate.gdb/BG_sample"
layerName = "BG_sample_lyr"
joinTable = "C:/data/test.dbf"
joinField = "BG10"
arcpy.MakeFeatureLayer_management(inFeatures, layerName)
outFeature = "C:/geodatabases/Intermediate.gdb/BG_sample_w_tr"
arcpy.AddJoin_management(layerName, joinField, joinTable, joinField)
arcpy.CopyFeatures_management(layerName, outFeature)