I have a shapefile representing points along a line. I have the downstream distance of each point on their respective line. What I would like to do is sequence the two fields in the attribute table based on the spatially joined stream id arcid
and downstream distance DSDistance
. I would then like to run an operation such that the difference in 'DSDistance' between two points with the same stream id is calculated:
fid arcid DSDistance to fid arcid DSDistance Distance
1 1 10 1 1 10 10
2 5 10 4 1 15 5
3 3 7 3 3 7 7
4 1 15 2 5 10 10
5 5 20 5 5 20 10
6 5 27 6 5 27 7
The code I have written to do this operation is below:
fields = ["arcid","DSDistance"]
sql_orderby = "ORDER BY" + ", ".join("{} ASC".format(field) for field in fields)
with arcpy.UpdateCursor(home + '/PourPoints1.shp',"",sql_clause=(None, sql_orderby)) as cursor:
firstTime = True
for row in cursor:
if firstTime:
previous = row.getValue('arcid')
previous1 = row.getValue('DSDistance')
firstTime = False
else:
current = row.getValue('arcid')
if current == previous:
row.setValue('Distance', row.getValue('DSDistance') - previous1)
rows.updateRow(row)
previous1 = row.getValue('DSDistance')
else:
row.setValue('Distance', row.getValue('DSDistance'))
rows.updateRow(row)
del row
del rows
I have written this code from sort rows, subtract current from next, ... and updating field on previous row.
Update using the solution below I receive an error:
Traceback (most recent call last): File "H:\Projects260円\Storm Water\GIS\PythonScript\table\BasinScript.py", line 267, in inCursor.insertRow(newRow) SystemError: error return without exception set
My workspace destination is a folder
home = r'H:/Projects/260/Storm Water/GIS/PythonScript/Workspace'
# Path to input table
tblPath = home + '/PourPoints1.shp'
# Create a defaultdict
dd = collections.defaultdict(list)
dd1 = collections.defaultdict(list)
# create an output table
print("Creating output table...")
arcpy.env.overwriteOutput=True
arcpy.CreateTable_management(home,"tblOutput.dbf")
arcpy.AddField_management(home + '/tblOutput.dbf',"arcid","LONG")
arcpy.AddField_management(home+ '/tblOutput.dbf',"DSDistance","LONG")
arcpy.AddField_management(home+ '/tblOutput.dbf',"Dist","LONG")
arcpy.AddField_management(home+ '/tblOutput.dbf',"TARGET_FID","LONG")
# read data into dictionary
print("Reading input table...")
with arcpy.da.SearchCursor(tblPath,["arcid","DSDistance","TARGET_FID"]) as cursor:
for row in cursor:
arcid = row[0]
dist = row[1]
dd[arcid].append(dist)
fid = row[2]
dd1[arcid].append(fid)
for k, v in dd1.items():
dd[k] = [dd[k], v]
# sort dictionary by key into tuples of lists
sortedlist = sorted(dd.items())
# Process sortedlist
print("Writing to output table...")
with arcpy.da.InsertCursor(home + '/tblOutput.dbf',["arcid","DSDistance","Dist","TARGET_FID"]) as inCursor:
for tup in sortedlist:
arcid = tup[0] # e.g. 5
disList = sorted(tup[1][0]) # e.g [10,20,27]
distList1 = tup[1][1]
if len(disList) > 1:
# create a second list that is an offset of disList
offList = list(disList) # Copy list
offList.insert(0,0) # insert zero at start of list
offList.pop() # remove last value, so using example above you will have [0,10,20]
# Compute offset value
zipobj = zip(disList,offList)
diffList = list()
for i1,i2 in zipobj:
diffList.append(i1-i2)
# Write output to table, here we are accessing the values in the list by the index position i
for i in range(len(disList)):
newRow = (arcid,disList[i],diffList[i],disList1[i])
inCursor.insertRow(newRow)
else:
# Just one entry in input table so write it out
newRow = (arcid,disList[0],disList[0],disList1[0])
inCursor.insertRow(newRow)
Update:
Adding the tblOutput path extension .dbf
successfully produced a table with the solution I was looking for. Modified the solution to bring across a unique field ID to make joining the output table to the shape file easier.
Update: FelixIP's solution sets a warning:
extract["DELTA"] = extract[chainageField].shift(-1)-extract[chainageField] SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
Update: Updated the solution to bring a unique field from input table over to output table.
-
The first rule of ArcPy cursors is Only use DA cursors. The second rule is: Don't trust ORDER BY to function on all data sources.Vince– Vince2022年06月21日 23:22:01 +00:00Commented Jun 21, 2022 at 23:22
-
1The error may be down to the fact that you are creating an INFO table to store your output. I don't know but maybe cursors don't work with them? You declare a folder location H:/Projects/260/Storm Water/GIS/PythonScript/Workspace and then you create a table called tblOutput and give it no file extension. A simple solution is to explicitly state its a dBase file by putting .dbf at the end of it. Try that? My original code creates a table inside a geodatabase hence no file name extension.Hornbydd– Hornbydd2022年06月22日 09:07:28 +00:00Commented Jun 22, 2022 at 9:07
-
1Learning how to functionally apply python to spatial data. Thank you for clarification of changing the path name. Your solution worked.rweber– rweber2022年06月22日 14:26:39 +00:00Commented Jun 22, 2022 at 14:26
2 Answers 2
Below is the code you need:
import arcpy
import collections
# Path to input table
tblPath = r"C:\Scratch\fGDB_Scratch.gdb\testtable"
# Create a defaultdict
dd = collections.defaultdict(list)
# create an output table
print("Creating output table...")
arcpy.env.overwriteOutput=True
arcpy.CreateTable_management(r"C:\Scratch\fGDB_Scratch.gdb","tblOutput")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","arcid","LONG")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","DSDist","LONG")
arcpy.AddField_management(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput","Dist","LONG")
# read data into dictionary
print("Reading input table...")
with arcpy.da.SearchCursor(tblPath,["arcid","DSDist"]) as cursor:
for row in cursor:
arcid = row[0]
dist = row[1]
dd[arcid].append(dist)
# sort dictionary by key into tuples of lists
sortedlist = sorted(dd.items())
# Process sortedlist
print("Writing to output table...")
with arcpy.da.InsertCursor(r"C:\Scratch\fGDB_Scratch.gdb\tblOutput",["arcid","DSDist","Dist"]) as inCursor:
for tup in sortedlist:
arcid = tup[0] # e.g. 5
disList = sorted(tup[1]) # e.g [10,20,27]
if len(disList) > 1:
# create a second list that is an offset of disList
offList = list(disList) # Copy list
offList.insert(0,0) # insert zero at start of list
offList.pop() # remove last value, so using example above you will have [0,10,20]
# Compute offset value
zipobj = zip(disList,offList)
diffList = list()
for i1,i2 in zipobj:
diffList.append(i1-i2)
# Write output to table, here we are accessing the values in the list by the index position i
for i in range(len(disList)):
newRow = (arcid,disList[i],diffList[i])
inCursor.insertRow(newRow)
else:
# Just one entry in input table so write it out
newRow = (arcid,disList[0],disList[0])
inCursor.insertRow(newRow)
It works by reading the data into a dictionary, sorting, creating an off-set version of the list of distances and then computes the difference by taking advantage of the zip() function to iterate over both lists and finally writing it back to the output table.
Results are shown below:
The simplest (path of least resistance) solution to join back the offset distance to the original pour point data is to use the Make Query Table tool. For this to work both the pour point and tblOutput need to be in the same File GeoDatabase. Then run the tool as shown below to create a temporary layer which you can right click on and export from the TOC.
Results are:
-
Ill give this a shot. Will the tblOutput be a new shape file with the adjusted table or will I need to join the table to the old shape file?rweber– rweber2022年06月21日 17:31:47 +00:00Commented Jun 21, 2022 at 17:31
-
Another way to ask this is, would it be possible to instead create three new fields in my shape file and process the sorted list within those three fields in my current point shape file?rweber– rweber2022年06月21日 17:45:26 +00:00Commented Jun 21, 2022 at 17:45
-
I tried using your code but it throws a
SystemError
:Traceback (most recent call last): File "x", line 267, in <module> inCursor.insertRow(newRow) SystemError: error return without exception set
rweber– rweber2022年06月21日 18:47:13 +00:00Commented Jun 21, 2022 at 18:47 -
I cannot comment on the error until I see the version of code that you are running and all the full path names. This code is designed to create a table in a file geodatabase, which must already exist and in my case in a folder called scratch. You could join the output of the table back to your shapefile if you so desired, or adapt the code to write it back into the shapefile.Hornbydd– Hornbydd2022年06月21日 19:38:10 +00:00Commented Jun 21, 2022 at 19:38
-
To make the join process easier, what modifications do you suggest to keep a unique field ID from the shape file across to the tblOutput?rweber– rweber2022年06月22日 16:39:29 +00:00Commented Jun 22, 2022 at 16:39
This is a perfect case for using pandas sort, shift and series math.
Picture shows input points and their distance along line:
This is points' table:
Script:
import pandas as pd
pd.options.mode.chained_assignment = None
fid, groupField, chainageField = ["OID@", "NEAR_FID","D_ALONG"]
DF = pd.DataFrame(arcpy.da.TableToNumPyArray("points",[fid, groupField, chainageField]))
DF.sort_values(by=[chainageField],inplace=True)
bigDict = {}
RIVERS = set(DF[groupField])
for river in RIVERS:
extract = DF[DF[groupField]==river]
extract["DELTA"] = extract[chainageField].shift(-1)-extract[chainageField]
extract["DELTA"].fillna(0,inplace=True)
for row in extract.itertuples():
bigDict[row[1]]=row[-1]
with arcpy.da.UpdateCursor("points",[fid,"DSDistance"]) as cursor:
for oid,dsd in cursor:
cursor.updateRow((oid,bigDict[oid]))
Output shows points labelled by their chainage and distance to next one downstream:
Updated shorter version of script:
import pandas as pd
fid, groupField, chainageField = ["OID@", "NEAR_FID","D_ALONG"]
DF = pd.DataFrame(arcpy.da.TableToNumPyArray("points",[fid, groupField, chainageField]))
DF.sort_values(by=[chainageField],inplace=True)
DF.set_index("OID@",inplace=True)
bigDict = {}
RIVERS = set(DF[groupField])
for river in RIVERS:
extract = DF[DF[groupField]==river]
DELTA = extract[chainageField].shift(-1)-extract[chainageField]
DELTA.iloc[-1]=0
smallDict = DELTA.to_dict()
bigDict.update(smallDict)
with arcpy.da.UpdateCursor("points",[fid,"DSDistance"]) as cursor:
for oid,dsd in cursor:
cursor.updateRow((oid,bigDict[oid]))
-
In your
arcpy.da.UpdateCursor
call you have"DSDistance"
but in your list of variables you use"D_Along"
. For clarification, your"D_along"
field is the distance along the line and the"DSDistance"
is the downstream distance between points, correct?rweber– rweber2022年06月22日 14:40:33 +00:00Commented Jun 22, 2022 at 14:40 -
Yes to both questions.FelixIP– FelixIP2022年06月22日 19:06:11 +00:00Commented Jun 22, 2022 at 19:06
-
I updated my post with an error from the solution you provided when I integrated it into my code. I wasn't sure how to rectify it.rweber– rweber2022年06月22日 19:10:33 +00:00Commented Jun 22, 2022 at 19:10
-
Did you get correct result? No such warning when I ran it from map documentFelixIP– FelixIP2022年06月22日 19:21:27 +00:00Commented Jun 22, 2022 at 19:21
-
It threw the error before finishing and didn't output the results.rweber– rweber2022年06月22日 19:25:29 +00:00Commented Jun 22, 2022 at 19:25
Explore related questions
See similar questions with these tags.