I have a list with two sets of coordinates and two attributes associated with them. The list was populated from my email so the structure can be modified if need be.
Current list structure:
[(((long1, lat1), (long2, lat2)), 'attribute1', attribute2), ...]
So an example would be:
[(((-87.932083, 26.886283), (-87.921784, 29.892553)), 'Re: Locate Message 2164792294', datetime.date(2021, 5, 27)), (((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)), ...]
I am trying to turn this list into a MultiLineString using shapely and then into a GeoSeries using GeoPandas. Or at least that's what I believe I should be doing. I had success with just a list of geometries (see below), but I can't figure out how to complete this process with the attributes attached.
Geometry only:
[((-97.932083, 29.886283), (-97.921784, 29.892553)), ((-97.940304, 29.890503), (-97.938405, 29.891903)), ...]
My ultimate goal is to plot and perform further spatial analytics on this data.
2 Answers
Here's a quick attempt to deal with your data structure. Please feel free to edit things around to make it behave exactly how you want it:
# Importing necessary libraries
import pandas as pd
import geopandas as gpd
import shapely.geometry
import datetime
# Setting test input data
my_list = (
[(((-87.932083, 26.886283), (-87.921784, 29.892553)), 'Re: Locate Message 2164792294', datetime.date(2021, 5, 27)),
(((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
(((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
(((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
(((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
(((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
(((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
(((-86.940304, 26.890503), (-95.938405, 24.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20))])
# List of Dataframes that will later be concatenated into one large dataframe
pre_dfs = []
# Looping over all "rows" in `my_list`
for this_item in my_list:
    # Generating a shapely geometry
    geometry = shapely.geometry.LineString(this_item[0])
    msg = this_item[1]
    date = this_item[2]
    # Creating a single-row-DataFrame.
    this_df = pd.DataFrame({'geometry': [geometry],
                            'msg': [msg],
                            'date': [date]})
    # Appending this single-row-DataFrame to the `pre_dfs` list
    pre_dfs.append(this_df)
# Concatenating all the separate dataframes into one big DataFrame
single_df = pd.concat(pre_dfs, ignore_index=True).reset_index(drop=True)
# Finally, generating the actual GeoDataFrame that can be manipulated
geo_df = gpd.GeoDataFrame(single_df,
                          geometry='geometry',
                          crs='epsg:4326')
Once you have this set up, you can run geo_df.plot() to plot it and do a whole bunch of other operations.
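As a rough illustration of what those other operations might look like, here is a minimal sketch that filters rows by date and estimates each line's length in meters. It rebuilds a small two-row frame so it runs on its own. The cutoff date and the use of `estimate_utm_crs()` to pick a projection are my assumptions, not something from the question, and because these sample lines span several UTM zones the lengths are only approximate.

```python
import datetime

import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString

# Two sample rows copied from the question
rows = [
    (((-87.932083, 26.886283), (-87.921784, 29.892553)),
     'Re: Locate Message 2164792294', datetime.date(2021, 5, 27)),
    (((-86.940304, 26.890503), (-95.938405, 24.891903)),
     'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
]

geo_df = gpd.GeoDataFrame(
    {
        'msg': [r[1] for r in rows],
        # Convert to pandas datetimes so comparisons are straightforward
        'date': pd.to_datetime([r[2] for r in rows]),
    },
    geometry=[LineString(r[0]) for r in rows],
    crs='EPSG:4326',
)

# Attribute filter: keep rows dated on or after an assumed cutoff
recent = geo_df[geo_df['date'] >= pd.Timestamp('2021-05-25')]

# Lengths measured in degrees are not meaningful, so reproject to a
# metric CRS first (assumption: an estimated UTM zone is good enough here)
projected = geo_df.to_crs(geo_df.estimate_utm_crs())
lengths_m = projected.length
```

Any projected CRS suited to your study area could be swapped in for the estimated UTM zone.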
The big key here is inside that for loop, in which we parse each of the elements in my_list and generate a regular Pandas DataFrame for each element. If your object structure is a bit more complex, you can just tailor that part of the code to match exactly what you want.
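If tailoring the loop gets awkward, one alternative is to reshape each item into a GeoJSON-style feature dictionary first. The sketch below uses only the standard library. `to_feature` is a hypothetical helper name, and the property names `msg` and `date` simply mirror the column names used in the loop. As far as I know, a list of such features can then be handed to `gpd.GeoDataFrame.from_features(features, crs='EPSG:4326')`, but that step is not shown or tested here.

```python
import datetime


def to_feature(item):
    """Turn one ((start, end), message, date) item into a GeoJSON-style feature."""
    coords, msg, date = item
    return {
        'type': 'Feature',
        'geometry': {
            'type': 'LineString',
            # Each point is stored as a [longitude, latitude] pair
            'coordinates': [list(pt) for pt in coords],
        },
        # Dates are stored as ISO strings so the dict stays JSON-friendly
        'properties': {'msg': msg, 'date': date.isoformat()},
    }


my_list = [
    (((-87.932083, 26.886283), (-87.921784, 29.892553)),
     'Re: Locate Message 2164792294', datetime.date(2021, 5, 27)),
    (((-86.940304, 26.890503), (-95.938405, 24.891903)),
     'Re: Locate Message 2163994250', datetime.date(2021, 5, 20)),
]

features = [to_feature(item) for item in my_list]
```

Keeping the parsing in one small function means a change in the incoming structure only has to be handled in one place.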
- Ahh! This makes a lot of sense, especially with the comments you included. I was stuck looking for the 'right way' to format the incoming data and assumed there was a built-in function that would convert it to a gdf. I appreciate your time. – Ryan Bobo, Jun 22, 2021 at 17:35
- No worries, I'm happy I was able to help! =) – Felipe D., Jun 22, 2021 at 17:44
- I believe that @CodeBard's answer is even simpler and easier to deal with: you don't have to manually "parse" each column. – Felipe D., Jun 24, 2021 at 14:49
Another solution would be to directly convert your list to a DataFrame.
import datetime
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString

data = [
    (((-97.932083, 29.886283), (-97.921784, 29.892553)), 'Re: Locate Message 2164792294', datetime.date(2021, 5, 27)),
    (((-97.940304, 29.890503), (-97.938405, 29.891903)), 'Re: Locate Message 2163994250', datetime.date(2021, 5, 20))
]
df = pd.DataFrame(data, columns=['coordinates', 'attribute1', 'attribute2'])
# Build one LineString per row from the coordinate tuples
gdf = gpd.GeoDataFrame(df, crs="EPSG:4326", geometry=[LineString(x) for x in df['coordinates']])
gdf = gdf.drop(['coordinates'], axis=1) # drop coordinate tuples, if not needed anymore
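Since the question mentions MultiLineString: if a single combined geometry is ever needed, a minimal sketch follows. Note that merging everything into one MultiLineString leaves nowhere to attach the per-line attributes, which is presumably why one LineString per row is the more useful layout here. The variable names are illustrative only.

```python
from shapely.geometry import MultiLineString

# Geometry-only list, as in the question
lines = [
    ((-97.932083, 29.886283), (-97.921784, 29.892553)),
    ((-97.940304, 29.890503), (-97.938405, 29.891903)),
]

# One geometry object holding every line as a separate part
multi = MultiLineString(lines)
n_parts = len(multi.geoms)
```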