Comparing two sequential values in order to do an operation within sequence in another field

Question 1

I have a shapefile representing points along a line. I have the downstream distance of each point on their respective line. What I would like to do is sequence the two fields in the attribute table based on the spatially joined stream id arcid and downstream distance DSDistance. I would then like to run an operation such that the difference in 'DSDistance' between two points with the same stream id is calculated:

fid arcid DSDistance to fid arcid DSDistance Distance
1 1 10 1 1 10 10
2 5 10 4 1 15 5
3 3 7 3 3 7 7
4 1 15 2 5 10 10
5 5 20 5 5 20 10
6 5 27 6 5 27 7

The code I have written to do this operation is below:

fields = ["arcid","DSDistance"]
sql_orderby = "ORDER BY" + ", ".join("{} ASC".format(field) for field in fields)
with arcpy.UpdateCursor(home + '/PourPoints1.shp',"",sql_clause=(None, sql_orderby)) as cursor: 
firstTime = True
for row in cursor:
 if firstTime:
 previous = row.getValue('arcid') 
 previous1 = row.getValue('DSDistance') 
 firstTime = False
 else:
 current = row.getValue('arcid')
 if current == previous:
 row.setValue('Distance', row.getValue('DSDistance') - previous1)
 rows.updateRow(row)
 previous1 = row.getValue('DSDistance')
 else:
 row.setValue('Distance', row.getValue('DSDistance'))
 rows.updateRow(row)
del row
del rows

I have written this code from sort rows, subtract current from next, ... and updating field on previous row.

Update using the solution below I receive an error:

Traceback (most recent call last): File "H:\Projects260円\Storm Water\GIS\PythonScript\table\BasinScript.py", line 267, in inCursor.insertRow(newRow) SystemError: error return without exception set

My workspace destination is a folder

home = r'H:/Projects/260/Storm Water/GIS/PythonScript/Workspace'
# Path to input table
tblPath = home + '/PourPoints1.shp'
# Create a defaultdict
dd = collections.defaultdict(list)
dd1 = collections.defaultdict(list)
# create an output table
print("Creating output table...")
arcpy.env.overwriteOutput=True
arcpy.CreateTable_management(home,"tblOutput.dbf")
arcpy.AddField_management(home + '/tblOutput.dbf',"arcid","LONG")
arcpy.AddField_management(home+ '/tblOutput.dbf',"DSDistance","LONG")
arcpy.AddField_management(home+ '/tblOutput.dbf',"Dist","LONG")
arcpy.AddField_management(home+ '/tblOutput.dbf',"TARGET_FID","LONG")
# read data into dictionary
print("Reading input table...")
with arcpy.da.SearchCursor(tblPath,["arcid","DSDistance","TARGET_FID"]) as cursor:
 for row in cursor:
 arcid = row[0]
 dist = row[1]
 dd[arcid].append(dist)
 fid = row[2]
 dd1[arcid].append(fid)
for k, v in dd1.items():
 dd[k] = [dd[k], v]
# sort dictionary by key into tuples of lists
sortedlist = sorted(dd.items())
# Process sortedlist
print("Writing to output table...")
with arcpy.da.InsertCursor(home + '/tblOutput.dbf',["arcid","DSDistance","Dist","TARGET_FID"]) as inCursor:
 for tup in sortedlist:
 arcid = tup[0] # e.g. 5
 disList = sorted(tup[1][0]) # e.g [10,20,27]
 distList1 = tup[1][1]
 if len(disList) > 1:
 # create a second list that is an offset of disList
 offList = list(disList) # Copy list
 offList.insert(0,0) # insert zero at start of list
 offList.pop() # remove last value, so using example above you will have [0,10,20]
 # Compute offset value
 zipobj = zip(disList,offList)
 diffList = list()
 for i1,i2 in zipobj:
 diffList.append(i1-i2)
 # Write output to table, here we are accessing the values in the list by the index position i
 for i in range(len(disList)):
 newRow = (arcid,disList[i],diffList[i],disList1[i])
 inCursor.insertRow(newRow)
 else:
 # Just one entry in input table so write it out
 newRow = (arcid,disList[0],disList[0],disList1[0])
 inCursor.insertRow(newRow)

Update: Adding the tblOutput path extension .dbf successfully produced a table with the solution I was looking for. Modified the solution to bring across a unique field ID to make joining the output table to the shape file easier.

Update: FelixIP's solution sets a warning:

extract["DELTA"] = extract[chainageField].shift(-1)-extract[chainageField] SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

Update: Updated the solution to bring a unique field from input table over to output table.

Question 2

The first rule of ArcPy cursors is Only use DA cursors. The second rule is: Don't trust ORDER BY to function on all data sources.

Question 3

The error may be down to the fact that you are creating an INFO table to store your output. I don't know but maybe cursors don't work with them? You declare a folder location H:/Projects/260/Storm Water/GIS/PythonScript/Workspace and then you create a table called tblOutput and give it no file extension. A simple solution is to explicitly state its a dBase file by putting .dbf at the end of it. Try that? My original code creates a table inside a geodatabase hence no file name extension.

Question 4

Learning how to functionally apply python to spatial data. Thank you for clarification of changing the path name. Your solution worked.

Question 5

Below is the code you need:

import arcpy
import collections
# Path to input table
tblPath = r"C:\Scratch\fGDB_Scratch.gdb\testtable"
# Create a defaultdict
dd = collections.defaultdict(list)
# create an output table
print("Creating output table...")
arcpy.env.overwriteOutput=True
arcpy.CreateTable_management(r"C:\Scratch\fGDB_Scratch.gdb","tblOutput")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","arcid","LONG")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","DSDist","LONG")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","Dist","LONG")
# read data into dictionary
print("Reading input table...")
with arcpy.da.SearchCursor(tblPath,["arcid","DSDist"]) as cursor:
 for row in cursor:
 arcid = row[0]
 dist = row[1]
 dd[arcid].append(dist)
# sort dictionary by key into tuples of lists
sortedlist = sorted(dd.items())
# Process sortedlist
print("Writing to output table...")
with arcpy.da.InsertCursor(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput",["arcid","DSDist","Dist"]) as inCursor:
 for tup in sortedlist:
 arcid = tup[0] # e.g. 5
 disList = sorted(tup[1]) # e.g [10,20,27]
 if len(disList) > 1:
 # create a second list that is an offset of disList
 offList = list(disList) # Copy list
 offList.insert(0,0) # insert zero at start of list
 offList.pop() # remove last value, so using example above you will have [0,10,20]
 # Compute offset value
 zipobj = zip(disList,offList)
 diffList = list()
 for i1,i2 in zipobj:
 diffList.append(i1-i2)
 # Write output to table, here we are accessing the values in the list by the index position i
 for i in range(len(disList)):
 newRow = (arcid,disList[i],diffList[i])
 inCursor.insertRow(newRow)
 else:
 # Just one entry in input table so write it out
 newRow = (arcid,disList[0],disList[0])
 inCursor.insertRow(newRow)

It works by reading the data into a dictionary, sorting, creating an off-set version of the list of distances and then computes the difference by taking advantage of the zip() function to iterate over both lists and finally writing it back to the output table.

Results are shown below:

Results

The simplest (path of least resistance) solution to join back the offset distance to the original pour point data is to use the Make Query Table tool. For this to work both the pour point and tblOutput need to be in the same File GeoDatabase. Then run the tool as shown below to create a temporary layer which you can right click on and export from the TOC.

Make Query Table tool

Results are:

Results

Question 6

Ill give this a shot. Will the tblOutput be a new shape file with the adjusted table or will I need to join the table to the old shape file?

Question 7

Another way to ask this is, would it be possible to instead create three new fields in my shape file and process the sorted list within those three fields in my current point shape file?

Question 8

I tried using your code but it throws a SystemError:

Traceback (most recent call last): File "x", line 267, in <module> inCursor.insertRow(newRow) SystemError: error return without exception set

Question 9

I cannot comment on the error until I see the version of code that you are running and all the full path names. This code is designed to create a table in a file geodatabase, which must already exist and in my case in a folder called scratch. You could join the output of the table back to your shapefile if you so desired, or adapt the code to write it back into the shapefile.

Question 10

To make the join process easier, what modifications do you suggest to keep a unique field ID from the shape file across to the tblOutput?

Question 11

This is a perfect case for using pandas sort, shift and series math.

Picture shows input points and their distance along line:

enter image description here

This is points' table:

enter image description here

Script:

import pandas as pd
pd.options.mode.chained_assignment = None
fid, groupField, chainageField = ["OID@", "NEAR_FID","D_ALONG"]
DF = pd.DataFrame(arcpy.da.TableToNumPyArray("points",[fid, groupField, chainageField]))
DF.sort_values(by=[chainageField],inplace=True)
bigDict = {}
RIVERS = set(DF[groupField])
for river in RIVERS:
 extract = DF[DF[groupField]==river]
 extract["DELTA"] = extract[chainageField].shift(-1)-extract[chainageField]
 extract["DELTA"].fillna(0,inplace=True)
 for row in extract.itertuples():
 bigDict[row[1]]=row[-1]
with arcpy.da.UpdateCursor("points",[fid,"DSDistance"]) as cursor:
 for oid,dsd in cursor:
 cursor.updateRow((oid,bigDict[oid]))

Output shows points labelled by their chainage and distance to next one downstream:

enter image description here

Updated shorter version of script:

import pandas as pd
fid, groupField, chainageField = ["OID@", "NEAR_FID","D_ALONG"]
DF = pd.DataFrame(arcpy.da.TableToNumPyArray("points",[fid, groupField, chainageField]))
DF.sort_values(by=[chainageField],inplace=True)
DF.set_index("OID@",inplace=True)
bigDict = {}
RIVERS = set(DF[groupField])
for river in RIVERS:
 extract = DF[DF[groupField]==river]
 DELTA = extract[chainageField].shift(-1)-extract[chainageField]
 DELTA.iloc[-1]=0
 smallDict = DELTA.to_dict()
 bigDict.update(smallDict)
with arcpy.da.UpdateCursor("points",[fid,"DSDistance"]) as cursor:
 for oid,dsd in cursor:
 cursor.updateRow((oid,bigDict[oid]))

Question 12

In your arcpy.da.UpdateCursor call you have "DSDistance" but in your list of variables you use "D_Along". For clarification, your "D_along" field is the distance along the line and the "DSDistance" is the downstream distance between points, correct?

Question 13

Yes to both questions.

Question 14

I updated my post with an error from the solution you provided when I integrated it into my code. I wasn't sure how to rectify it.

Question 15

Did you get correct result? No such warning when I ran it from map document

Question 16

It threw the error before finishing and didn't output the results.

Hornbydd Hornbydd 44.9k5 gold badges42 silver badges84 bronze badges · Accepted Answer · 2022-06-21 17:21:10Z

Below is the code you need:

import arcpy
import collections
# Path to input table
tblPath = r"C:\Scratch\fGDB_Scratch.gdb\testtable"
# Create a defaultdict
dd = collections.defaultdict(list)
# create an output table
print("Creating output table...")
arcpy.env.overwriteOutput=True
arcpy.CreateTable_management(r"C:\Scratch\fGDB_Scratch.gdb","tblOutput")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","arcid","LONG")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","DSDist","LONG")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","Dist","LONG")
# read data into dictionary
print("Reading input table...")
with arcpy.da.SearchCursor(tblPath,["arcid","DSDist"]) as cursor:
 for row in cursor:
 arcid = row[0]
 dist = row[1]
 dd[arcid].append(dist)
# sort dictionary by key into tuples of lists
sortedlist = sorted(dd.items())
# Process sortedlist
print("Writing to output table...")
with arcpy.da.InsertCursor(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput",["arcid","DSDist","Dist"]) as inCursor:
 for tup in sortedlist:
 arcid = tup[0] # e.g. 5
 disList = sorted(tup[1]) # e.g [10,20,27]
 if len(disList) > 1:
 # create a second list that is an offset of disList
 offList = list(disList) # Copy list
 offList.insert(0,0) # insert zero at start of list
 offList.pop() # remove last value, so using example above you will have [0,10,20]
 # Compute offset value
 zipobj = zip(disList,offList)
 diffList = list()
 for i1,i2 in zipobj:
 diffList.append(i1-i2)
 # Write output to table, here we are accessing the values in the list by the index position i
 for i in range(len(disList)):
 newRow = (arcid,disList[i],diffList[i])
 inCursor.insertRow(newRow)
 else:
 # Just one entry in input table so write it out
 newRow = (arcid,disList[0],disList[0])
 inCursor.insertRow(newRow)

It works by reading the data into a dictionary, sorting, creating an off-set version of the list of distances and then computes the difference by taking advantage of the zip() function to iterate over both lists and finally writing it back to the output table.

Results are shown below:

Results

The simplest (path of least resistance) solution to join back the offset distance to the original pour point data is to use the Make Query Table tool. For this to work both the pour point and tblOutput need to be in the same File GeoDatabase. Then run the tool as shown below to create a temporary layer which you can right click on and export from the TOC.

Make Query Table tool

Results are:

Results

Ill give this a shot. Will the tblOutput be a new shape file with the adjusted table or will I need to join the table to the old shape file?
Another way to ask this is, would it be possible to instead create three new fields in my shape file and process the sorted list within those three fields in my current point shape file?
I tried using your code but it throws a SystemError: Traceback (most recent call last): File "x", line 267, in <module> inCursor.insertRow(newRow) SystemError: error return without exception set
I cannot comment on the error until I see the version of code that you are running and all the full path names. This code is designed to create a table in a file geodatabase, which must already exist and in my case in a folder called scratch. You could join the output of the table back to your shapefile if you so desired, or adapt the code to write it back into the shapefile.
To make the join process easier, what modifications do you suggest to keep a unique field ID from the shape file across to the tblOutput?

Stack Exchange Network

Comparing two sequential values in order to do an operation within sequence in another field

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Comparing two sequential values in order to do an operation within sequence in another field

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions