Combine multiple fields from several rows to just one row

Question 1

I have a shapefile of evenly spaced points, each with species information. I don't have the exact distributions, so I did a spatial join between those points and a grid to get a spatial reference. Then I intersected that grid with a map of municipalities to find out approximately where these species are located. However, several grid cells overlap each municipality, so the resulting attribute table has several rows for the same municipality. Those rows hold different species lists, and some species are shared from one row to another.

As in the figure, my goal is to have a single row per municipality with a single list of species, that is, to somehow combine all the rows. Species are separated by the | symbol, which is how I got the data.

I have tried the Merge, Join, Spatial Join and Dissolve tools, and I also converted the attribute table to Excel to do the work manually, but that takes a lot of time because there are more than 400 municipalities and more than 8 groups of organisms that I am working with. I found the Concatenate Row Values tool (which is what I really need), but in ArcGIS it only worked with small amounts of data (https://www.arcgis.com/home/item.html?id=52dfcef46fdb4c76bfbc08dc01570f3c).

enter image description here

Question 2

It's unclear to me why you are using a grid as an intermediate step rather than doing a spatial join between your points and your municipalities. Could you explain?

Question 3

Hi! A point is not associated with a municipality but with a certain radius. All the info for that sampled area is stored at that point, so certainty about the distribution is lost. A direct spatial join was the first thing I did, but it left 90 of the 426 municipalities without info, which isn't correct. So I drew a grid in which each point is a cell center, which spreads the information over an area, and each cell can intersect several municipalities. Unfortunately, I don't have the exact distribution of the species data, so I have to work with an estimated number of species per municipality.

Question 4

Do you have several species per point before you join with the grid?

Question 5

Are you looking for a way to get from table 1 to table 2, or a better way to produce table 2 without having to produce table 1?

Question 6

BERA: If you mean the delimiter symbol, it is "|". I am using ArcMap 10.7.

Question 7

You mentioned in your questions that you had been exporting to Excel.

If all you seek is your concatenated results in a table, this solution will output a CSV file in the same fashion as your example.

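import csv
import arcpy

# The first two names must match the fields in your table;
# the third is only used as a column header in the CSV.
Fields = ['Municipality', 'Scientific_Name', 'Species_Richness']
cursor = arcpy.da.SearchCursor("YourTable", Fields[:-1])
tempDict = {}
# Collect the species of each municipality, skipping ones already seen.
for row in cursor:
    if row[0] not in tempDict:
        tempDict[row[0]] = []
        for i in row[1].split('|'):
            if i not in tempDict[row[0]]:
                tempDict[row[0]].append(i)
    else:
        for i in row[1].split('|'):
            if i not in tempDict[row[0]]:
                tempDict[row[0]].append(i)
# Sort each species list alphabetically.
for key in tempDict:
    tempDict[key].sort()
# Append the species count as the last item of each list.
for key in tempDict:
    tempDict[key].append(len(tempDict[key]))
# 'wb' is the csv mode for Python 2 (ArcMap); in Python 3 use open(path, 'w', newline='').
csv_file = open(r"C:\Data\YourPath\Species.csv", 'wb')
csvfile = csv.writer(csv_file)
csvfile.writerow(Fields)
# One row per municipality: name, joined species list, species count.
for key in tempDict:
    csvfile.writerow([key, ('|').join(tempDict[key][:-1]), tempDict[key][-1:][0]])
csv_file.close()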

In this example I used your field names with underscores, because Arc only allows spaces in field aliases, and aliases don't seem to work with the search cursor.

Question 8

+1, but you should use a set instead of a list, otherwise you will still have duplicates and will thus overestimate the species richness. (You could also 1) add lists with tempDict[row[0]] += row[1].split('|') and 2) sort and add the count in a single loop instead of 3, but this is just a detail.)
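A minimal sketch of the set-based variant suggested in this comment. It uses plain Python tuples in place of the arcpy cursor rows so it can be run anywhere; the function name `combine_species` and the sample rows are illustrative assumptions, not part of the original answer. The list-based code above already checks for duplicates, so this is a simplification rather than a correction.

```python
from collections import defaultdict


def combine_species(rows, delimiter="|"):
    """Merge (municipality, species_string) rows into one record per municipality.

    Returns a list of (municipality, joined_species, richness) tuples.
    """
    species_by_muni = defaultdict(set)
    for municipality, species_string in rows:
        # A set drops duplicates automatically; empty fragments are skipped.
        species_by_muni[municipality].update(
            name for name in species_string.split(delimiter) if name
        )
    # Sort the species and count them in a single pass per municipality.
    return [
        (muni, delimiter.join(sorted(names)), len(names))
        for muni, names in sorted(species_by_muni.items())
    ]


# Hypothetical sample rows standing in for the SearchCursor output.
rows = [(1, "A|C|D"), (1, "B|D"), (1, "A"), (2, "C|X|Y|Z"), (2, "A|C|Z")]
for record in combine_species(rows):
    print(record)
```

With arcpy, the same function could presumably be fed directly from `arcpy.da.SearchCursor("YourTable", ['Municipality', 'Scientific_Name'])`, since the cursor yields one tuple per row.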

Question 9

Thank you for the code! I'm not advanced at all with Python, but I tried it and got this message: Runtime error Traceback (most recent call last): File "<string>", line 6, in <module> RuntimeError: General function failure

Question 10

ACF307 I'm not sure why you are getting that error, but you will have to change the code if the field names I provided don't match yours. I also used "YourTable" as the table name, so you will need to edit the code to reflect the name of your table, feature class or shapefile as it appears in the TOC in ArcMap. radouxju my code accounts for duplicates, so species richness should not be overcounted.

Question 11

F_Kellner I carefully changed all the field and table names to match my data, but I still get this error: Runtime error Traceback (most recent call last): File "<string>", line 26, in <module> NameError: name 'csv' is not defined

Question 12

ACF307 I made an edit to the code. But you need to import csv.

Question 13

If you have table 1 you can use arcpy and pandas to create table 2. Adjust and execute in python window:

import arcpy
import pandas as pd

table = r'C:\data.gdb\points'  # Change to match your table / feature class
muni = 'Municipality'  # Change to match your fieldname
sname = 'Scientific Name'  # Change to match your fieldname
species_delimiter = '|'  # Change to match species delimiter
species = 'Species richness'
# Create pandas dataframe using da.SearchCursor to read table. If you already have table 1 as csv, use pd.read_csv
df = pd.DataFrame.from_records(data=arcpy.da.SearchCursor(table, [muni, sname]), columns=[muni, sname])
df[sname] = df[sname].apply(lambda x: x.split(species_delimiter))  # Split string into list of species
df = df.groupby(muni)[sname].apply(sum).reset_index()  # Group by muni and concatenate the species lists
df[species] = df[sname].apply(lambda x: len(set(x)))  # Create column as count of unique species
df[sname] = df[sname].apply(lambda x: species_delimiter.join(set(x)))  # Drop duplicate species and convert back to string
df.to_clipboard()  # Paste into excel. Or: df.to_csv(r'C:\species.csv')

From:

   Municipality Scientific Name
0             1           A|C|D
1             1             B|D
2             1               A
3             2         C|X|Y|Z
4             2           A|C|Z
5             2           X|Y|Z
6             3             M|O
7             3         M|N|O|P
8             4           A|B|C
9             4             A|C
10            5               K

To:

  Municipality Scientific Name Species richness
0            1         B|A|D|C                4
1            2       Y|X|Z|A|C                5
2            3         N|M|O|P                4
3            4           B|A|C                3
4            5               K                1

Question 14

I would go with PostgreSQL/PostGIS and a split/apply/combine approach.

1. Import your table(s).
2. Then transform/split your data with a LATERAL join.
3. SELECT DISTINCT on two columns (Municipality, Scientific Name).
4. Then find "Species Richness" and create a new table with:

   SELECT COUNT(Municipality), Municipality FROM YOURTABLE GROUP BY Municipality

5. Finally, combine your data again (from step 3) with string_agg and join that table with the "Species Richness" table from step 4.

Something like that... I don't know if it will work but that's the logic.
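To make the logic of this answer concrete, here is a rough sketch of the same split / distinct / count / combine sequence written in plain Python rather than SQL. It is not PostgreSQL code and does not touch a database; the function name `summarize_long_format` and the sample rows are hypothetical and only illustrate the steps.

```python
from collections import Counter


def summarize_long_format(rows, delimiter="|"):
    """Mirror the split / distinct / count / combine steps on in-memory rows."""
    # Split: one (municipality, species) pair per species, i.e. "long" format.
    pairs = [
        (muni, name)
        for muni, species_string in rows
        for name in species_string.split(delimiter)
    ]
    # Distinct on both columns (sorted so the output order is predictable).
    distinct_pairs = sorted(set(pairs))
    # Count: species richness is the number of distinct pairs per municipality.
    richness = Counter(muni for muni, _ in distinct_pairs)
    # Combine: aggregate the species names back into one delimited string.
    names = {}
    for muni, name in distinct_pairs:
        names.setdefault(muni, []).append(name)
    return [
        (muni, delimiter.join(names[muni]), richness[muni])
        for muni in sorted(names)
    ]


# Hypothetical rows in the same shape as table 1.
rows = [(3, "M|O"), (3, "M|N|O|P"), (5, "K")]
for record in summarize_long_format(rows):
    print(record)
```

The intermediate long format (one row per municipality and species) is what makes both the distinct step and the count straightforward, which is the point of splitting before aggregating.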

Accepted Answer: F_Kellner · 2019-09-09 21:25:20Z (the arcpy/csv answer shown above)
