I have a Postgres DB with a lot of lines that are too long and so now I want to split those lines into smaller ones.
This is why I started writing a little python script for that job. Basically it fetches my lines like this:
cursor.execute("SELECT ogc_fid, id, ST_AsGeoJSON(wkb_geometry) as geometry, ST_Length(wkb_geometry) as length, height FROM cont OFFSET " + str(offset) + " LIMIT " + str(limit) )
Here's my first problem. In order to further work with the data I have to select the geometry as JSON. I'll come to that later.
Now I iterate over the lines and make something like this:
sqlCommand = "INSERT INTO " + table + " (ogc_fid, id, wkb_geometry, height) VALUES(" + ogc_fid + ", " + row_id + ", ST_Line_Substring(ST_GeomFromGeoJSON(\'" + geometry + "\'), " + str(rangeFrom) + ", " + str(min(rangeFrom + stepWidth, 1)) + "), " + str(height) + ");"
cursor.execute(sqlCommand)
So if I didn't use ST_AsGeoJSON
in the select, I couldn't append it to my command cause it can only append a string.
That way my whole script is very slow. It always has to convert my geometries into JSON and then back from JSON.
Is there a datatype that python can handle?
Is there a more efficient way of dealing with geometry fields?
I heard of GeoDjango. Could that help here?
1 Answer 1
You could try using geoalchemy2
. Personally I would look at using the Object Relational Mapper (ORM) model for working with your data, for example:
from sqlalchemy import create_engine
# Enter your database connection below
engine = create_engine('postgresql://gis:gis@localhost/gis', echo=True)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, Float
from geoalchemy2 import Geometry
Base = declarative_base()
class Cont(Base):
__tablename__ = "cont"
ogc_fid = Column(Integer, primary_key=True)
id = Column(Integer)
wkb_geometry = Column(Geometry("LINESTRING"))
height = Column(Float)
# Other database columns here
From there you can setup a session and easily query the database:
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
session = Session()
And write a query to get back the lines you want to update:
query = session.query(Cont).filter(Cont.geom.ST_Length() > tolerance)
Queries are iterators, so it's easy to loop over them and work with the data, especially as geoalchemy2 has inbuilt shapely integration.
from geoalchemy2.shape import from_shape, to_shape
new_rows = []
for row in query:
shapely_geometry = to_shape(row.wkb_geometry)
# ... some operations to create `new_geom`
new_rows.append(
Cont(
ogc_fid=row.ogc_fid,
id=row.id,
wkb_geometry=new_geom.wkt, #shapely geometries have a wkt property
height=row.height
)
)
Then you can commit all your rows at once:
session.add_all(new_rows)
session.commit()
Or commit by chunk by splitting your new rows into chunks (credit):
for chunk in (new_rows[i:i+chunksize] for i in range(0, len(new_rows), chunksize)):
session.add_all(chunk)
session.commit()
-
you are super awesome!!!! thank you so much for your help!!!!Georg– Georg2015年05月17日 18:10:22 +00:00Commented May 17, 2015 at 18:10
-
One more question... So, once I have a row, I'd like to figure out it's length.
shapely_geometry.ST_Length()
didn't did the trick for me. And once I have it I'd need to runST_Line_Substring
to split the line. Sorry to bother you, but my python skills are very limited... would be so super awesome if you might tell me how to do those two commands!!!Georg– Georg2015年06月02日 11:32:05 +00:00Commented Jun 2, 2015 at 11:32 -
I managed to figure it out! What took 2 weeks before took 6hours now :) thanks again!!!!Georg– Georg2015年06月03日 06:06:45 +00:00Commented Jun 3, 2015 at 6:06
shapely