I'm trying to get the start/End points of a Shapefile which geometry type is lines, using Shapely.
Here's what I tried :
import geopandas
from shapely.geometry import *
import fiona
schema = {
'geometry': 'Point',
'properties': {'id': 'int'},
}
for line in fiona.open("./Shapefiles/tl.shp"):
# Write a new Shapefile
with fiona.open('./Outputs/result.shp', 'w', 'ESRI Shapefile', schema) as c:
startEndPoints = Point(line['geometry']['coordinates'][0]), Point(line['geometry']['coordinates'][-1])
c.write({
'geometry': mapping(startEndPoints),
'properties': {'id': 'int'},
})
After running the script I got this error :
Traceback (most recent call last):
File "C:/Users/User/Documents/Scripts/startEndPoints.py", line 15, in <module>
'geometry': mapping(startEndPoints),
File "C:\Program Files (x86)\Python27\lib\site-packages\shapely\geometry\geo.py", line 75, in mapping
return ob.__geo_interface__
AttributeError: 'tuple' object has no attribute '__geo_interface__'
Process finished with exit code 1
Any idea on what the problem is? Am I missing something?
By the way if I just put :
from shapely.geometry import Point
import fiona
for line in fiona.open("./Shapefiles/tl.shp"):
print Point(line['geometry']['coordinates'][0]), Point(line['geometry']['coordinates'][-1])
I get the results :
POINT (-122.4136252231131 37.77444689220926) POINT (-122.4123122222042 37.77339989262978)
POINT (-122.4166172233352 37.77561789207194) POINT (-122.4168352231301 37.77544489272536)
POINT (-122.4254669503756 37.80586514841474) POINT (-122.4253414477887 37.80523738337594)
POINT (-122.4120802227251 37.77329289282841) POINT (-122.4116882221326 37.77360289269223)
POINT (-122.403928221227 37.76968489249516) POINT (-122.4038392206211 37.7686898923368)
POINT (-122.4034062214653 37.77768889337421) POINT (-122.402952221485 37.77804389367528)
POINT (-122.4083362215629 37.77378789309976) POINT (-122.4073822219715 37.77454389279772)
POINT (-122.3988001599508 37.77642529193749) POINT (-122.4004541112463 37.77511695816931)
POINT (-122.3995072203815 37.7666648926454) POINT (-122.3991442197841 37.76637989275558)
POINT (-122.3954705174111 37.75052611064412) POINT (-122.3954953388096 37.75079277249778)
POINT (-122.4096502215665 37.77017689206735) POINT (-122.4086142215983 37.76934889254507)
POINT (-122.40679422218 37.77255089251806) POINT (-122.4065396542004 37.77275206969875)
POINT (-122.3911729658918 37.76952536147485) POINT (-122.3910978699473 37.76875434317915)
POINT (-122.408514985576 37.76433570878993) POINT (-122.4084132127141 37.76323936432929)
....
Process finished with exit code 0
My only problem is saving these endpoints in another shapefile. Any ideas?
I managed to solve the problem with a workaround actually. What I did is save the startPoints and endPoints each in a different shapefile. Here's my code :
from shapely.geometry import *
import fiona
schema = {'geometry': 'Point' }
with fiona.open("./TLShapefile/TimeLimits.shp") as input:
input.schema['geometry'] = "Point"
# write the startPoint shapefile
with fiona.open('./Outputs/endPoints.shp', 'w', 'ESRI Shapefile', input.schema.copy(), input.crs) as output:
for line in input:
geom = shape(line['geometry'])
# Write a new Shapefile for the startPoints, do the same thing for endPoints afterwards
startPoints = Point(line['geometry']['coordinates'][0])
#endPoints = Point(line['geometry']['coordinates'][-1]) #For the end points
elem['geometry'] = mapping(startPoints)
#elem['geometry'] = mapping(endPoints) #For the end points
output.write(elem)
Now I have seperate shapefiles for the start points and the end points. Although my main goal is to get one shapefile with the startpoints and endpoints all together.
I have updated my code based on Gene's answer, using Shapely :
from shapely.geometry import *
import fiona
schema = {
'geometry': 'Point',
'properties': { 'AGENCY': 'str',
'CONFLICT' :'int',
'CREATED_DA' :'str',
'CREATED_US' :'str',
'CREATED__1' :'str',
'CREATED__2' :'str',
'DAYS' :'str',
'EDITOR' :'str',
'ENACTED' :'str'
with fiona.open("tl.shp") as input:
# write the startPoint shapefile
with fiona.open('./Outputs/startEndPoints.shp', 'w', 'ESRI Shapefile', schema, input.crs) as output:
for line in input:
startEndPoints = MultiPoint([line['geometry']['coordinates'][0], line['geometry']['coordinates'][-1]])
for point in startEndPoints:
output.write({'geometry': mapping(point), 'properties': {
'AGENCY': 'str',
'CONFLICT' :'int',
'CREATED_DA' :'str',
'CREATED_US' :'str',
'CREATED__1' :'str',
'CREATED__2' :'str',
'DAYS' :'str',
'EDITOR' :'str',
'ENACTED' :'str'}})
When I run the script now I get this :
Process finished with exit code -1073741819 (0xC0000005)
2 Answers 2
Why import geopandas
if you don't use it ? (GeoPandas = Pandas + Fiona + Shapely)
1) Following the suggestion of bugmenot123
The schema for a MultiLineString
schema = {
'geometry': 'MultiPoint',
'properties': {'id': 'int'},
}
And
with fiona.open("lines.shp") as inp:
with fiona.open('result.shp', 'w', 'ESRI Shapefile', schema) as out:
for line in inp:
startEndPoints = MultiPoint([line['geometry']['coordinates'][0], line['geometry']['coordinates'][-1]])
out.write({'geometry': mapping(startEndPoints),'properties': {'id': 1}})
2) But a shapefile has no MultiPoint schema geometries: a shapefile that indicates Point in its schema may yield either Point or MultiPoint features (simple list of Point).
Therefore:
schema = {
'geometry': 'Point',
'properties': {'id': 'int'},
}
with fiona.open("lines.shp") as inp:
with fiona.open('result2.shp', 'w', 'ESRI Shapefile', schema2) as out:
for line in inp:
startEndPoints = MultiPoint([line['geometry']['coordinates'][0], line['geometry']['coordinates'][-1]])
for point in startEndPoints:
out.write({'geometry': mapping(point),'properties': {'id': 1}})
3) Now, if you want to use GeoPandas only
import geopandas as gpd
from geopandas.geoseries import Point
lines = gpd.GeoDataFrame.from_file("lines.shp")
startpts = gpd.GeoSeries([Point(list(pt['geometry'].coords)[0]) for i,pt in lines.iterrows()])
endpts = gpd.GeoSeries([Point(list(pt['geometry'].coords)[-1]) for i,pt in lines.iterrows()])
points = startpts.append(endpts)
result = gpd.GeoDataFrame(points,columns=['geometry'])
# add a field
result['id']= result.index
# save the resulting shapefile
result.to_file("res_geopandas.shp")
New
# columns of the GeoDataframe line
print lines.columns.values
[u'AFFLEUR' u'IDENT' u'NATURE' 'geometry']
print lines['IDENT']
0 3560025
1 3560023
2 3560024
Name: IDENT, dtype: int64
Suppose that I want to add the IDENT column to the results
# create a GeoDataFrame with the startpts GeoSerie
dfstartpts = gpd.GeoDataFrame(startpts,columns=['geometry'])
# add a column and copy the values of IDENT
dfstartpts['ident'] = lines['IDENT']
# same for the endpoints
dfendpts = gpd.GeoDataFrame(endpts,columns=['geometry'])
dfsendpts['ident'] = lines['IDENT']
# "merge" the dataframes
result = dfstartpts.append(dfendpts)
print result.columns.values
['geometry' 'ident']
print result['ident']
0 3560025
1 3560023
2 3560024
0 3560025
1 3560023
2 3560024
Name: ident, dtype: int64
-
Thank you again for your detailed answer @gene! I just saw it. I actually just posted an answer below to my post with a workaround that I managed to do. Please refer to it. As for ur suggestions I just tried the second one and I had this error : ValueError: Record does not match collection schema: ['id'] != [u'CREATED_DA', u'CREATED_US', u'RPPAREA3', ..... The error comes from this line : out.write({'geometry': mapping(point),'properties': {'id': 1}})GeoSal– GeoSal2016年07月22日 12:43:31 +00:00Commented Jul 22, 2016 at 12:43
-
1The error comes with the fields of the shapefile that not match the shapefile schema. The schema presented here is with one field that contains an integer and no other thing ('id': 1). If you want other fields, you need to add them to the schema.gene– gene2016年07月22日 12:56:14 +00:00Commented Jul 22, 2016 at 12:56
-
I updated my post please refer to my last code. I added the other fields indeed and I get this error now : Process finished with exit code -1073741819 (0xC0000005)GeoSal– GeoSal2016年07月22日 14:52:26 +00:00Commented Jul 22, 2016 at 14:52
-
Tried ur 3rd suggestion using Geopandas and it works perfectly ! The only thing is that the output gets projected in a different place. Even if I add the line regarding the transformation of the input using to_crsGeoSal– GeoSal2016年07月22日 15:20:23 +00:00Commented Jul 22, 2016 at 15:20
-
1
You are trying to save two Point objects into a schema that wants one Point. You could either use MultiPoint instead or construct a linestring for each start-end pair.
-
Thank u for ur answer. I tried replacing Point in the schema with Multipoint, but I'm still having the same error.GeoSal– GeoSal2016年07月22日 08:57:59 +00:00Commented Jul 22, 2016 at 8:57
-
I just replaced Point(line['geometry']['coordinates'][0]), Point(line['geometry']['coordinates'][-1]) with MultiPoint(line['geometry']['coordinates'][0]), MultiPoint(line['geometry']['coordinates'][-1]) , and now I have this error : AttributeError: 'float' object has no attribute '_ndim'GeoSal– GeoSal2016年07月22日 09:01:51 +00:00Commented Jul 22, 2016 at 9:01
-
I am not entirely sure what your end goal is, could you extend your question? Do you just need all the points or do you need to know which two belong(ed) together or do you need to know which ones are start points and end points or do you need to know which ones are start and end point pairs?bugmenot123– bugmenot1232016年07月22日 11:26:54 +00:00Commented Jul 22, 2016 at 11:26
-
My end goal is to save the the startpoints and endpoints all together in one shapefile.GeoSal– GeoSal2016年07月22日 12:53:56 +00:00Commented Jul 22, 2016 at 12:53
-
I will be projecting these points later on a lines' shapefile the splitting these lines on the start/end pointsGeoSal– GeoSal2016年07月22日 13:03:19 +00:00Commented Jul 22, 2016 at 13:03
'CONFLICT' :'int
(one field of the shapefile) ,for example, and you write a value in the table'CONFLICT' :56
(row value) and not the repetition of the field definition as value