0

I have a script that I'm using to sort through a very large feature layer with thousands of records and create a summary table of them by type. It also calculates the count of records of the last 5 years to the current date for each year rather than the entire calendar year and then calculates the average. My output table is the type, average, and current count for the year. Right now I have this outputting to a CSV but I'm trying to get that CSV to overwrite a hosted feature table. I've referenced some posts on the Esri Community forums and it seems like this should be possible but I keep getting an error. Here is my code:

import arcpy, os, uuid
import pandas as pd
import datetime
from datetime import timedelta
from arcgis.features import GeoAccessor, FeatureLayer
from arcgis.gis import GIS
# variables
url = 'arcgis online url'
username = 'username'
password = 'password'
hostedTableID = 'item ID'
inputfc = r'feature layer from sde on server'
outputCSVFile = r'csv in folder on server'
# get table from AGOL
gis = GIS(url, username, password)
CrimeTrendsTable = gis.content.get(hostedTableID)
CrimeTrendsTableLyr = CrimeTrendsTable.tables[0]
crimeLayer = FeatureLayer(CrimeTrendsTableLyr.url, gis=gis)
# truncate table
crimeLayer.manager.truncate()
# import feature class and create slice of the data set in a new dataframe
df = pd.DataFrame.spatial.from_featureclass(inputfc)
df = df.sort_values(['OFFENSES_YEAR', 'CRIME_STAT_TYPE', 'FROM_DATE'], ascending= [True, True, True], ignore_index=True)
df = df.loc[:, ['OFFENSES_YEAR', 'CRIME_STAT_TYPE', 'FROM_DATE']]
# set variables for today and the previous 5 years
today = datetime.datetime.today()
todate1 = today - timedelta(365)
todate2 = today - timedelta(730)
todate3 = today - timedelta(1095)
todate4 = today - timedelta(1460)
todate5 = today - timedelta(1825)
this_year = datetime.datetime.today().year
one_year_ago = this_year - 1
two_years_ago = this_year - 2
three_years_ago = this_year - 3
four_years_ago = this_year - 4
five_years_ago = this_year - 5
# create new dataframes for current and each previous year to date
df0 = df.loc[(df['OFFENSES_YEAR']==this_year) & (df['FROM_DATE'] < today)]
df1 = df.loc[(df['OFFENSES_YEAR']==one_year_ago) & (df['FROM_DATE'] < todate1)]
df2 = df.loc[(df['OFFENSES_YEAR']==two_years_ago) & (df['FROM_DATE'] < todate2)]
df3 = df.loc[(df['OFFENSES_YEAR']==three_years_ago) & (df['FROM_DATE'] < todate3)]
df4 = df.loc[(df['OFFENSES_YEAR']==four_years_ago) & (df['FROM_DATE'] < todate4)]
df5 = df.loc[(df['OFFENSES_YEAR']==five_years_ago) & (df['FROM_DATE'] < todate5)]
# create tables for each current and previous year grouped by crime type, add together previous year tables and average
table0 = df0.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count().reset_index()\
 .rename(columns={"OFFENSES_YEAR" : this_year})
table1 = df1.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table2 = df2.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table3 = df3.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table4 = df4.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table5 = df5.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
avgtable = pd.concat([table1, table2, table3, table4, table5]).groupby('CRIME_STAT_TYPE').mean().reset_index()\
 .rename(columns={"OFFENSES_YEAR" : "previous5yearAverage"})
trendstable = avgtable.join(table0.set_index('CRIME_STAT_TYPE'), on='CRIME_STAT_TYPE').fillna(0)
trendstable.to_csv(outputCSVFile, sep='\t', encoding='utf-8')
# update hosted table from csv file
csvDF = GeoAccessor.from_table(outputCSVFile)
adds_fs = csvDF.spatial.to_featureset()
adds_dict = adds_fs.to_dict()
adds = adds_dict["features"]
crimeLayer.edit_features(adds=adds)

Here's an example of some data from my csv table:

CRIME_STAT_TYPE previous5yearAverage 2024
0 AGGRAVATED ASSAULT 12.000000 9.0
1 ALL OTHER OFFENSES 54.800000 46.0
2 ANIMAL CRUELTY 1.666667 1.0
3 ARSON 1.000000 0.0
4 BAD CHECKS 1.750000 1.0
5 BURGLARY/BREAKING AND ENTERING 44.400000 50.0
45 WIRE FRAUD 6.000000 0.0

This is the error I'm getting when I run the adds_fs = csvDF.spatial.to_featureset() line:

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
In [60]:
Line 1: adds_fs = csvDF.spatial.to_featureset()
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in to_featureset:
Line 3573: return FeatureSet.from_dict(self.__feature_set__)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in __feature_set__:
Line 3297: if self.sr is None:
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in sr:
Line 3503: for g in self._data[self.name]
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\frame.py, in __getitem__:
Line 3505: indexer = self.columns.get_loc(key)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\indexes\base.py, in get_loc:
Line 3631: raise KeyError(key) from err
KeyError: None
---------------------------------------------------------------------------

Also here is a screenshot of the CSV. I think something is happening with how pandas is creating the CSV. When trying to read the CSV file, it is not being read correctly. See below:

CSV Output in Python Console

I'm wondering if it has something to do with the delimiter being used or something else?

I'm very new to all of this.

Vince
20.5k16 gold badges49 silver badges65 bronze badges
asked Mar 28, 2024 at 22:14
3
  • 1
    First - you are outputting a CSV file from a data frame, and then immediately reading the CSV back into a data frame using the GeoAccessor.from_table() function. It would be more efficient and reliable to just kick off directly from the dataframe that already has the data sitting there. The second question would be, does your data have a spatial component or is this just a non-spatial table? Commented Apr 18, 2024 at 21:00
  • Also, how large is this dataset? It may not be best practice for the API to use the edit_features() function and may be better advised to switch to an append method. Commented Apr 18, 2024 at 21:03
  • @AustinAverill - thanks for the advice! It's my first time using python and pandas. I ended up pushing the data to an xls instead of a csv and that worked for some reason. It's just a non-spatial table. The original dataset is quite large but by the time it's filtered and summarized it's not that big at all. I'll post my work around as the answer. Commented Apr 23, 2024 at 17:38

1 Answer 1

1

So I couldn't get the csv to work but I was able to get it to work using an xls file. The code is as follows:

finaltable.to_excel(outputXLSFile)
# update hosted table from xls file
finaltable = arcpy.ExcelToTable_conversion(outputXLSFile, outputTable, 'Sheet1')
arcpy.DeleteField_management(outputTable, ["COL_A"])
arcpy.management.Append(inputs=[outputTable], target=CrimeTrendsTableLyr.url)[0]
answered Apr 23, 2024 at 17:41

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.