3

I have a shapefile of evenly spaced points, each with species information. I don't have the exact distributions, so I did a spatial join between those points and a grid to get a spatial reference. Then I intersected that grid with a map of municipalities to find out approximately where these species are located. Because several grid cells overlap each municipality, the attribute table ended up with several rows for the same municipality. Those rows hold different species lists, and some species are shared from one row to another.

As in the figure, my goal is a single row per municipality with a single list of species, i.e. to somehow combine all the rows. Species are separated by the | symbol, which is how I received the data.

I have tried the Merge, Join, Spatial Join and Dissolve functions, as well as converting the attribute table to Excel and doing the work manually, but that takes a lot of time because there are more than 400 municipalities and more than 8 groups of organisms that I am working with. I found the Concatenate Row Values tool (which is what I really need), but in ArcGIS it only worked with a small amount of data (https://www.arcgis.com/home/item.html?id=52dfcef46fdb4c76bfbc08dc01570f3c).

enter image description here

asked Sep 9, 2019 at 14:24
  • It's unclear to me why you are using a grid as an intermediate step rather than doing a spatial join between your points and your municipalities. Could you explain? Commented Sep 9, 2019 at 14:34
  • Hi! A point is not associated with a municipality but with a certain radius. All the info of that sampled area is stored at that point, so certainty of the distribution is lost. A direct spatial join was the first thing I tried, but it left 90 of the 426 municipalities without info, which isn't correct. So I drew a grid in which each point is a cell center, to spread the information out. Each cell can therefore intersect several municipalities. Unfortunately, I don't have the exact distribution of the species data, so I have to work with an estimated number of species per municipality. Commented Sep 9, 2019 at 14:53
  • Do you have several species per point before you join with the grid? Commented Sep 9, 2019 at 14:59
  • 1
    Are you looking for a way to get from table 1 to table 2, or a better way to produce table 2 without having to produce table 1? Commented Sep 9, 2019 at 15:02
  • 1
    BERA: If by delimiter you mean the symbol, then it is "|". I am using ArcMap 10.7. Commented Sep 10, 2019 at 8:02

3 Answers

4

You mentioned in your question that you had been exporting to Excel.

If all you seek is your concatenated results in a table, this solution will output a csv file in the same fashion as your example.

import csv
import arcpy  # already available in the ArcMap Python window

Fields = ['Municipality', 'Scientific_Name', 'Species_Richness']
tempDict = {}
# Read only the first two fields; Species_Richness is computed below.
with arcpy.da.SearchCursor("YourTable", Fields[:-1]) as cursor:
    for row in cursor:
        if row[0] not in tempDict:
            tempDict[row[0]] = []
        # Add each species only once per municipality.
        for i in row[1].split('|'):
            if i not in tempDict[row[0]]:
                tempDict[row[0]].append(i)
for key in tempDict:
    tempDict[key].sort()
# 'wb' suits the Python 2.7 that ships with ArcMap;
# on Python 3 use open(path, 'w', newline='') instead.
csv_file = open(r"C:\Data\YourPath\Species.csv", 'wb')
csvfile = csv.writer(csv_file)
csvfile.writerow(Fields)
for key in tempDict:
    # One row per municipality: name, joined species list, species count.
    csvfile.writerow([key, '|'.join(tempDict[key]), len(tempDict[key])])
csv_file.close()

In this example I used your field names with underscores, because ArcGIS only allows spaces in field aliases, and aliases don't seem to work with the search cursor.
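To try the merging logic without ArcGIS, here is a minimal sketch that keeps the species in a set per municipality. The function name `merge_species` and the sample rows are made up for illustration, and skipping empty values is my assumption about how NULL attributes should be treated, not something stated in the question.

```python
from collections import defaultdict


def merge_species(rows, delimiter="|"):
    """Combine (municipality, species_string) rows into one entry per municipality.

    Returns {municipality: (joined_species, species_count)}.
    """
    species_by_muni = defaultdict(set)
    for municipality, species_string in rows:
        if not species_string:
            # Assumption: NULL or empty attribute values carry no species.
            continue
        # A set ignores species that were already added for this municipality.
        species_by_muni[municipality].update(
            name.strip()
            for name in species_string.split(delimiter)
            if name.strip()
        )
    return {
        muni: (delimiter.join(sorted(names)), len(names))
        for muni, names in species_by_muni.items()
    }


# Hypothetical sample rows in the same shape as the attribute table.
rows = [(1, "A|C|D"), (1, "B|D"), (1, "A"), (5, "K")]
merged = merge_species(rows)
for muni in sorted(merged):
    print(muni, merged[muni])
```

In ArcMap the rows could come from something like arcpy.da.SearchCursor("YourTable", ["Municipality", "Scientific_Name"]) instead of the sample list. A municipality whose rows are all empty will not appear in the result.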

answered Sep 9, 2019 at 21:25
  • +1, but you should use a set instead of a list, otherwise you will still have duplicates and you will thus overestimate the species richness. (And you could 1) add lists with tempDict[row[0]] += row[1].split('|') 2)sort and add the count in a single loop instead of 3, but this is just a detail). Commented Sep 10, 2019 at 6:14
  • Thank you for the code! I'm not advanced at all with Python, but I tried it and got this message: Runtime error Traceback (most recent call last): File "<string>", line 6, in <module> RuntimeError: General function failure Commented Sep 10, 2019 at 10:16
  • ACF307 I'm not sure why you are getting that error, but you will have to change the code if the field names I provided are not accurate. I also used "YourTable" as the table name; you will need to edit the code to reflect whatever your table, feature class or shapefile is called in the TOC in ArcMap. radouxju my code accounts for duplicates, so species richness should not be overcounted. Commented Sep 10, 2019 at 17:51
  • F_Kellner I carefully changed all the field and table names to match my data, but this error still happens: Runtime error Traceback (most recent call last): File "<string>", line 26, in <module> NameError: name 'csv' is not defined Commented Sep 11, 2019 at 9:42
  • ACF307 I made an edit to the code. But you need to import csv. Commented Sep 12, 2019 at 14:49
1

If you have table 1 you can use arcpy and pandas to create table 2. Adjust and execute in the Python window:

import arcpy
import pandas as pd

table = r'C:\data.gdb\points'  # Change to match your table / feature class
muni = 'Municipality'  # Change to match your field name
sname = 'Scientific Name'  # Change to match your field name
species_delimiter = '|'  # Change to match species delimiter
species = 'Species richness'

# Create a pandas dataframe using da.SearchCursor to read the table.
# If you already have table 1 as csv, use pd.read_csv instead.
df = pd.DataFrame.from_records(data=arcpy.da.SearchCursor(table, [muni, sname]), columns=[muni, sname])
# Split each string into a list of species
df[sname] = df[sname].apply(lambda x: x.split(species_delimiter))
# Group by municipality and concatenate the species lists
df = df.groupby(muni)[sname].apply(lambda lists: sum(lists, [])).reset_index()
# Create a column with the count of unique species
df[species] = df[sname].apply(lambda x: len(set(x)))
# Drop duplicate species and convert back to string (set order is arbitrary)
df[sname] = df[sname].apply(lambda x: species_delimiter.join(set(x)))
df.to_clipboard()  # Paste into Excel. Or: df.to_csv(r'C:\species.csv')

From:

    Municipality  Scientific Name
0   1             A|C|D
1   1             B|D
2   1             A
3   2             C|X|Y|Z
4   2             A|C|Z
5   2             X|Y|Z
6   3             M|O
7   3             M|N|O|P
8   4             A|B|C
9   4             A|C
10  5             K

To:

   Municipality  Scientific Name  Species richness
0  1             B|A|D|C          4
1  2             Y|X|Z|A|C        5
2  3             N|M|O|P          4
3  4             B|A|C            3
4  5             K                1
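
For comparison, the same From/To transformation can be sketched without arcpy by reshaping to one row per species first. This assumes pandas 0.25 or later (for `explode`) and rebuilds the sample table by hand; the variable names `long_df`, `grouped` and `result` are mine. Unlike the output above, the species are sorted alphabetically here so the result is repeatable.

```python
import pandas as pd

# Sample data copied from the "From" table.
df = pd.DataFrame({
    "Municipality": [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5],
    "Scientific Name": ["A|C|D", "B|D", "A", "C|X|Y|Z", "A|C|Z", "X|Y|Z",
                        "M|O", "M|N|O|P", "A|B|C", "A|C", "K"],
})

# One row per (municipality, species) pair, with duplicates removed.
long_df = df.copy()
long_df["Scientific Name"] = long_df["Scientific Name"].str.split("|")
long_df = long_df.explode("Scientific Name").drop_duplicates()

# Collapse back to one row per municipality.
grouped = long_df.groupby("Municipality")["Scientific Name"]
result = pd.DataFrame({
    "Scientific Name": grouped.agg(lambda s: "|".join(sorted(s))),
    "Species richness": grouped.nunique(),
}).reset_index()
print(result)
```

The long form is also handy if you later want to count in how many municipalities each species occurs.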
answered Sep 9, 2019 at 18:33
-1

I would go with PostgreSQL/PostGIS and a split/apply/combine approach.

  1. Import your table(s).

  2. Then transform/split your data with a LATERAL join.

  3. SELECT DISTINCT on two columns (Municipality, Scientific Name).

  4. Then find "Species Richness" by counting the distinct rows from step 3, and create a new table with:

SELECT COUNT(Municipality), Municipality FROM YOURTABLE GROUP BY Municipality

  5. Finally, combine your data again (from step 3) with string_agg and join that table with the "Species Richness" table from step 4.

Something like that... I don't know if it will work but that's the logic.
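
The logic of these steps can be tried out from Python with the built-in sqlite3 module. This is only a stand-in under stated assumptions, not the PostgreSQL solution itself: SQLite is used because it needs no server, the split of step 2 is done in Python rather than with a LATERAL join, and SQLite's group_concat takes the place of string_agg. The table name `long_form` and the sample rows are made up.

```python
import sqlite3

# Hypothetical sample rows: (municipality, species string).
rows = [(1, "A|C|D"), (1, "B|D"), (1, "A"), (2, "C|X|Y|Z"), (2, "A|C|Z")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE long_form (municipality INTEGER, species TEXT)")

# Step 2 stand-in: split each species string into one row per species.
conn.executemany(
    "INSERT INTO long_form VALUES (?, ?)",
    [(muni, name) for muni, names in rows for name in names.split("|")],
)

# Steps 3 to 5: distinct pairs, then count and re-join per municipality.
query = """
    SELECT municipality,
           group_concat(species, '|') AS scientific_name,
           COUNT(*) AS species_richness
    FROM (SELECT DISTINCT municipality, species
          FROM long_form
          ORDER BY municipality, species)
    GROUP BY municipality
    ORDER BY municipality
"""
result = conn.execute(query).fetchall()
conn.close()

for municipality, scientific_name, species_richness in result:
    print(municipality, scientific_name, species_richness)
```

Counting only works after the DISTINCT step; counting the raw rows would count shared species more than once. SQLite does not guarantee the order of names inside group_concat, so sort them afterwards if the order matters.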

answered Sep 9, 2019 at 20:04
