Script not working as expected

Question 1

I'm new to Python and I wrote the script below. It is supposed to check the text in the fields fld_o and fld_d and write a value (just numbers in a sequence) to a newly created field, fld_name1. While looping, if it comes across a text in either of the two fields that already appeared in any previous row, in either field, it should write the value from that previous row to the current row in the new field.

The script ran without problems, but the results are not as expected. The two highlighted rows both have Werl in common, so how can they have different numbers in the new field? When the loop reaches the first highlighted row, it should add Werl to the dictionary, assign it the same value as Hamm (i.e. 54), and write that to the new field. Later, when it reaches the second highlighted row, it should take the already assigned value of 54 from Werl, assign it to Wickede, and write it to the new field as well. At least that's what I expected.

import arcpy
arcpy.env.overwriteOutput = True
fc_in = 'E:/SUT/Thesis/Geodata/NRW_Gemeinde_SpatialJoin_SelectCopy.shp'
fld_o = 'GEN'
fld_d = 'GEN_1'
fld_name1 = 'Sub_Region'
arcpy.AddField_management(fc_in, fld_name1, 'SHORT')
dct = {}
sub_region = 0
with arcpy.da.UpdateCursor(fc_in, (fld_o, fld_d, fld_name1)) as cursor:
    for row in cursor:
        origin = row[0]
        dest = row[1]
        if origin in dct and dest in dct:
            row[2] = dct[origin]
        elif origin in dct:
            row[2] = dct[origin]
            dct[dest] = dct[origin]
        elif dest in dct:
            row[2] = dct[dest]
            dct[origin] = dct[dest]
        else:
            sub_region = sub_region + 1
            row[2] = sub_region
            dct[origin] = sub_region
            dct[dest] = sub_region
        cursor.updateRow(row)

[Screenshot of the attribute table showing the two highlighted rows]

Question 2

I think this will perform what you are looking for. It iterates the first two fields of each row and compares each value to the dct.

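dct = {}
subregion = 0
with arcpy.da.UpdateCursor(fc_in, (fld_o, fld_d, fld_name1)) as cursor:
    for row in cursor:  # iterate the cursor itself
        # Look at the origin and destination values of this row
        for i in row[:2]:
            if i not in dct:
                # First time this name is seen: give it the next number
                dct[i] = subregion
                subregion += 1
            # After the inner loop, row[2] holds the destination's number
            row[2] = dct[i]
        cursor.updateRow(row)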

You don't say what preference should be given if both of the strings have occurred before in either of the two fields... If that matters you'd need a bit more.

EDIT: As to why your script isn't working correctly: it's tricky to figure it out from here because in your screenshot you have sorted the rows by the "GEN_1" field. The cursor will iterate the rows based on the FID field, so I'd recommend sorting the table by that field before further debugging efforts...

If you want the script to iterate the rows based on the GEN_1 field, you'll have to migrate this shapefile to a geodatabase; then you can use a SQL clause (ORDER BY) to sort the cursor before you iterate it.
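To see how much the row order matters, the question's if/elif/else logic can be replayed on plain tuples without arcpy. This is only a sketch: `label_rows` and the sample rows are made up for illustration and are not taken from the original data.

```python
def label_rows(rows):
    """Replay the question's if/elif/else logic on (origin, dest) tuples."""
    dct = {}
    sub_region = 0
    labels = []
    for origin, dest in rows:
        if origin in dct and dest in dct:
            # Both names seen before: the origin's number is written, even
            # if the destination already carries a different number.
            label = dct[origin]
        elif origin in dct:
            label = dct[origin]
            dct[dest] = label
        elif dest in dct:
            label = dct[dest]
            dct[origin] = label
        else:
            # Neither name seen before: start a new number.
            sub_region += 1
            label = sub_region
            dct[origin] = label
            dct[dest] = label
        labels.append(label)
    return labels


# Hypothetical rows: the same three pairs in two different orders.
chained = [("a", "b"), ("b", "c"), ("c", "d")]
shuffled = [("a", "b"), ("c", "d"), ("b", "c")]

print(label_rows(chained))   # [1, 1, 1]
print(label_rows(shuffled))  # [1, 2, 1]
```

In the second ordering the last two rows share c but end up with 2 and 1, because the branch where both names are already known writes the origin's number and never reconciles it with the destination's. Something like this may be what happens with Werl, though that can't be confirmed from the screenshot alone.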

EDIT 2: I'm sure the script is working just fine, but the way you have set up the if/elif/else branches is a little confusing to me, as is the premise. In your description of what you want, you say that "if it comes across a text in either of the two fields which was already there in any of the previous rows in either of the fields then it should write the same value from that previous row to this row in the new field."

The problem is that with this script you are never checking rows, only single items in each row. When you update the dct for a single item you will lose the possibility of referencing a "row". Basically, I'm not convinced that the goal you are trying to accomplish makes sense. If you could fill in the x's below with the desired numbers, it would help me figure out exactly what you're actually trying to accomplish.

gen | gen_1 | sub_region
a | b | XXXX
e | b | XXXX
b | c | XXXX
c | d | XXXX
c | e | XXXX
a | f | XXXX
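If the goal is one number per group of names that are linked through any chain of rows, one possible approach (not what either script above does) is a union-find pass over all rows, followed by a second pass that numbers the groups. A minimal sketch in plain Python, assuming that interpretation of the goal; `find` and `group_ids` are hypothetical helper names:

```python
def find(parent, name):
    """Return the representative of name's group, compressing the path."""
    parent.setdefault(name, name)
    root = name
    while parent[root] != root:
        root = parent[root]
    # Point every name on the path straight at the root.
    while parent[name] != root:
        next_name = parent[name]
        parent[name] = root
        name = next_name
    return root


def group_ids(rows):
    """Give each (origin, dest) row the number of its connected group."""
    parent = {}
    # Pass 1: merge the groups of origin and dest for every row.
    for origin, dest in rows:
        root_o = find(parent, origin)
        root_d = find(parent, dest)
        if root_o != root_d:
            parent[root_d] = root_o
    # Pass 2: number the groups in order of first appearance, starting at 1.
    ids = {}
    result = []
    for origin, _dest in rows:
        root = find(parent, origin)
        if root not in ids:
            ids[root] = len(ids) + 1
        result.append(ids[root])
    return result


# The table above as tuples: every name is linked to every other one.
table = [("a", "b"), ("e", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("a", "f")]
print(group_ids(table))  # [1, 1, 1, 1, 1, 1]

# Hypothetical rows with a late link and an unrelated pair.
print(group_ids([("a", "b"), ("c", "d"), ("b", "c"), ("x", "y")]))  # [1, 1, 1, 2]
```

Because the groups are merged before any number is written, the result does not depend on the row order. With a feature class this would presumably mean reading the pairs first (for example with arcpy.da.SearchCursor) and writing the numbers back in a second pass with an UpdateCursor.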

Question 3

Thanks, this seems like another nice way to do it. But it would be helpful if you could point out the problem in my script. And by the way if both strings have occurred before then the same value from the row before is to be written in the current row in the new field. I have already tried to do that in my script.

Question 4

added a little more info to the answer...

Question 5

Actually, I wrote the script assuming that the rows would be sorted on the GEN field. I just migrated the shapefile to a geodatabase and sorted by the GEN field using an ORDER BY SQL clause, but it still doesn't give the correct result. I don't know why something the script should clearly do is not being done, although the script runs without errors.

mr.adam · Accepted Answer · 2015-06-12 18:52:12Z

