I have a few hundred GPS data points (saved in an ArcGIS Pro 2.9 file geodatabase), some of which were logged at the same time with different speeds. I want to write a Python script that compares each row's DateTime field and, where the values are identical, updates the first occurrence with the lowest speed (int field) logged.

INPUT:

|OID| DateTime |Speed|
| 1 | 12/11/2021 16:41:21 | 14 |
| 2 | 12/11/2021 16:41:21 | 12 |
| 3 | 12/11/2021 16:41:32 | 20 |
| 4 | 12/11/2021 16:41:32 | 25 |

TARGET:

|OID| DateTime |Speed|
| 1 | 12/11/2021 16:41:21 | 12 |
| 2 | 12/11/2021 16:41:32 | 20 |

Below is the code I used to delete exact duplicates:

from collections import defaultdict

import arcpy

# 'SpeedDate' holds the concatenated speed and date values
check_fields = ['OBJECTID', 'SpeedDate']
found_values = defaultdict(set)
with arcpy.da.UpdateCursor(input_dataset, check_fields) as cursor:
    for row in cursor:
        for i, column in enumerate(check_fields):
            if row[i] in found_values[column]:
                # Value already seen in this column: delete the duplicate row
                cursor.deleteRow()
                break
        else:
            # No duplicate found: remember this row's values
            for column, value in zip(check_fields, row):
                found_values[column].add(value)

I'm pretty new to using UpdateCursor and ArcPy in general. I assume the first part of this code will still be useful in creating a set of unique values.

How do I go about assigning the first occurrence a value based on a second row comparison?

PolyGeo
asked Jan 3, 2022 at 18:07

3 Answers

Create a list of tuples using da.SearchCursor. Sort the list by date and speed in descending order, so the lowest speed for each date comes last. Build a dictionary of datetime: lowest speed, then a dictionary of OID: lowest speed, and use that to update the feature class.

import arcpy
fc = r'C:\GIS\ArcMap_default_folder\Default.gdb\datetime_speed'
fieldlist = ['OID@','DateTime','Speed']
all_rows = [row for row in arcpy.da.SearchCursor(in_table=fc, field_names=fieldlist)]
#[(1, datetime.datetime(2010, 4, 24, 0, 0), 0.22), (2, datetime.datetime(2010, 4, 24, 0, 0), 0.9), (13, datetime.datetime(2010, 4, 19, 0, 0), 0.21), (14, datetime.datetime(2010, 4, 19, 0, 0), 0.06), (16, datetime.datetime(2010, 4, 19, 0, 0), 1.0)]
all_rows.sort(key=lambda x: (x[1], x[2]), reverse=True)
#The lowest speed is last for each date:
#[(2, datetime.datetime(2010, 4, 24, 0, 0), 0.9), (1, datetime.datetime(2010, 4, 24, 0, 0), 0.22), (16, datetime.datetime(2010, 4, 19, 0, 0), 1.0), (13, datetime.datetime(2010, 4, 19, 0, 0), 0.21), (14, datetime.datetime(2010, 4, 19, 0, 0), 0.06)]
tempdict = {row[1]:row[2] for row in all_rows}
#Store each date as key and lowest speed in dictionary:
#{datetime.datetime(2010, 4, 24, 0, 0): 0.22, datetime.datetime(2010, 4, 19, 0, 0): 0.06}
#Create a dictionary of OID:lowest speed
updatedict = {row[0]:tempdict[row[1]] for row in all_rows}
#{2: 0.22, 1: 0.22, 16: 0.06, 13: 0.06, 14: 0.06}
#Update the feature class. I store the lowest speeds in a new field
with arcpy.da.UpdateCursor(fc, ['OID@','lowest_speed']) as cursor:
    for row in cursor:
        if row[0] in updatedict:
            row[1] = updatedict[row[0]]
            cursor.updateRow(row)
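
The code above writes the lowest speed to every row that shares a timestamp. To get closer to the TARGET table in the question, where only the first occurrence of each timestamp survives, the same rows can be reduced in one pass without sorting. The following is a minimal sketch in plain Python, assuming rows arrive as (OID, DateTime, Speed) tuples like those returned by the SearchCursor, and assuming "first occurrence" means the lowest OID. The function name `plan_first_occurrence_updates` is hypothetical.

```python
from datetime import datetime


def plan_first_occurrence_updates(rows):
    """Return ({first_oid: lowest_speed}, {oids_to_delete}) for (oid, dt, speed) rows.

    Assumption: the "first occurrence" of a timestamp is the row with the lowest OID.
    """
    rows = list(rows)   # allow a second pass even if a cursor/generator was passed in
    first_oid = {}      # DateTime -> lowest OID seen for that timestamp
    lowest_speed = {}   # DateTime -> lowest speed seen for that timestamp
    for oid, dt, speed in rows:
        first_oid[dt] = min(oid, first_oid.get(dt, oid))
        lowest_speed[dt] = min(speed, lowest_speed.get(dt, speed))
    updates = {first_oid[dt]: lowest_speed[dt] for dt in first_oid}
    deletes = {oid for oid, dt, _ in rows if oid != first_oid[dt]}
    return updates, deletes


# Sample rows taken from the INPUT table in the question
t1 = datetime(2021, 12, 11, 16, 41, 21)
t2 = datetime(2021, 12, 11, 16, 41, 32)
sample = [(1, t1, 14), (2, t1, 12), (3, t2, 20), (4, t2, 25)]
updates, deletes = plan_first_occurrence_updates(sample)
```

With these two results, an UpdateCursor over ['OID@', 'Speed'] could call updateRow for OIDs found in `updates` and deleteRow for OIDs found in `deletes`.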

answered Jan 3, 2022 at 18:53

I think a simpler approach is to use Python to call the Summary Statistics tool, grouping by the date field and returning the minimum speed value. This can be done with one line of code, so no dictionaries or sets are required. Review the code sample at the bottom of the help page to get you going.
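
For reference, a call along these lines might look like the sketch below. It assumes the field names DateTime and Speed from the question; the wrapper name `summarize_min_speed` and the paths in the usage comment are hypothetical. Note that Summary Statistics writes a new table with one row per DateTime value (the minimum is typically stored in a field named MIN_Speed) rather than editing the input.

```python
# Statistic to request: the minimum of the Speed field.
STATS_FIELDS = [["Speed", "MIN"]]


def summarize_min_speed(in_table, out_table, case_field="DateTime"):
    """Run Summary Statistics, grouping by case_field and taking MIN(Speed)."""
    # Imported inside the function so this file can be loaded outside
    # an ArcGIS Pro Python environment.
    import arcpy

    return arcpy.analysis.Statistics(in_table, out_table, STATS_FIELDS, case_field)


# Usage (hypothetical paths):
# summarize_min_speed(r"C:\path\to\data.gdb\gps_points",
#                     r"C:\path\to\data.gdb\gps_min_speed")
```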

answered Jan 4, 2022 at 10:13
  • But your coding approach is interesting too. Commented Jan 4, 2022 at 10:18

Logically, you don't really have a first record and second/subsequent record. If both data points were captured at the same time, as evidenced by the same date-time stamp, then you simply have a slow-speed record and faster records.

I mention this because you don't have to re-assign the slowest speed to the "first" record, you just have to remove all the records but the slowest one. The easiest way to do it is to rely on SQL to sort the data in the cursor before you process it.

The following assumes you are working in ArcGIS Pro.

(arcgispro-py3) C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3>python
Python 3.7.11 [MSC v.1927 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import arcpy
>>> from arcpy.management import CreateTable, AddField
>>> from datetime import datetime
>>>
>>> # Define records for test
>>> values = (
...     ("2021-12-11 16:41:21", 14),
...     ("2021-12-11 16:41:21", 12),
...     ("2021-12-11 16:41:32", 20),
...     ("2021-12-11 16:41:32", 25)
... )
>>>
>>> # Create test table
>>> table = CreateTable("memory", "tmp_tbl")
>>> AddField(table, "DateTime", "DATE")
<Result 'memory\\tmp_tbl'>
>>> AddField(table, "Speed", "LONG")
<Result 'memory\\tmp_tbl'>
>>>
>>> # Populate test table with test records
>>> with arcpy.da.InsertCursor(table, ("DateTime", "Speed")) as cur:
...     for dt, speed in values:
...         cur.insertRow((datetime.fromisoformat(dt), speed))
...
1
2
3
4
>>> # Print out records to verify table population
>>> print(*arcpy.da.SearchCursor(table, "*"), sep="\n")
(1, datetime.datetime(2021, 12, 11, 16, 41, 21), 14)
(2, datetime.datetime(2021, 12, 11, 16, 41, 21), 12)
(3, datetime.datetime(2021, 12, 11, 16, 41, 32), 20)
(4, datetime.datetime(2021, 12, 11, 16, 41, 32), 25)
>>>
>>> # Loop over records deleting all records but slowest by date-time groups
>>> sql = "ORDER BY DateTime, Speed"
>>> with arcpy.da.UpdateCursor(table, ("DateTime", "Speed"), sql_clause=(None, sql)) as cur:
...     dt_prev, speed = next(cur)
...     for dt, speed in cur:
...         if dt == dt_prev:
...             cur.deleteRow()
...         dt_prev = dt
...
>>> # Print out records to verify deletions worked correctly
>>> print(*arcpy.da.SearchCursor(table, "*"), sep="\n")
(2, datetime.datetime(2021, 12, 11, 16, 41, 21), 12)
(3, datetime.datetime(2021, 12, 11, 16, 41, 32), 20)
>>>
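
The cursor loop above depends entirely on the sort order: once rows are ordered by DateTime and then Speed, the first row in each timestamp group is the slowest, so every later row with the same timestamp can be dropped. A minimal plain-Python sketch of that logic, which can be checked without a geodatabase, might look like this (the function name `slowest_per_timestamp` is hypothetical):

```python
from datetime import datetime


def slowest_per_timestamp(rows):
    """Keep only the slowest (dt, speed) row for each timestamp."""
    kept = []
    dt_prev = None
    # Sorting the tuples by (dt, speed) mirrors "ORDER BY DateTime, Speed".
    for dt, speed in sorted(rows):
        if dt != dt_prev:
            # First row of a new timestamp group, so it has the lowest speed.
            kept.append((dt, speed))
        dt_prev = dt
    return kept


# Same test records as the session above
t1 = datetime(2021, 12, 11, 16, 41, 21)
t2 = datetime(2021, 12, 11, 16, 41, 32)
records = [(t1, 14), (t1, 12), (t2, 20), (t2, 25)]
kept = slowest_per_timestamp(records)
```

Unlike the cursor version, which calls next(cur) before looping and would raise StopIteration on an empty table, this sketch simply returns an empty list when given no rows.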
answered Jan 26, 2022 at 18:13
