How can I read in a layer's data using the python osgeo
/ogr
library and then export the whole attribute table and an extra column with the geometries coded as WKT to a CSV format?
I know that this is possible using the ogr2ogr
command-line function. The command to run from my terminal would be something like this:
ogr2ogr -overwrite -f CSV -lco GEOMETRY=AS_WKT -skipfailures "path\to\output.csv" "path\to\input.gdb" "input_layer_name"
The command above takes the "input_layer_name"
layer from the input GDB and exports it to a CSV file containing all columns in the original attribute table and another column called WKT
that contains geometries' WKTs.
But I'm not interested in using the ogr2ogr
application. I'm trying to do this from within a Python script. Sadly, using python's subprocess
library to run the command above from inside Python is not an option here.
I've cobbled up a script that almost does what I need, but it still has a few issues:
import os
from osgeo import ogr
# Use OGR specific exceptions
ogr.UseExceptions()
# Definitions for input file name and layer name
inDriverName = 'OpenFileGDB'
inGDBPath = 'path/to/input.gdb'
inLayerName = 'input_layer_name'
inDriver = ogr.GetDriverByName(inDriverName)
inDataSource = inDriver.Open(inGDBPath, 0)
inLayer = inDataSource.GetLayerByName(inLayerName)
inLayerIDColname = inLayer.GetFIDColumn()
# Name of the new CSV
outFile = "path/to/output.csv"
outLayerName = inLayerName + '_wkt'
outWKTColName = 'WKT'
outDriver = ogr.GetDriverByName("CSV")
# Remove output output CSV if it already exists
if os.path.exists(outFile):
outDriver.DeleteDataSource(outFile)
# Create the output CSV
outDataSource = outDriver.CreateDataSource(outFile)
outLayer = outDataSource.CreateLayer(outLayerName)
# Adding ID and WKT fields
idField = ogr.FieldDefn(inLayerIDColname, ogr.OFTInteger)
wktField = ogr.FieldDefn(outWKTColName, ogr.OFTString)
outLayer.CreateField(idField)
outLayer.CreateField(wktField)
outFeatureDefn = outLayer.GetLayerDefn()
# Making sure the feature reader is reset
inLayer.ResetReading()
# Iterating over every feature in the input layer
for this_inFeature in inLayer:
# Extracting the input feature's ID, geometry and WKT
this_FID = this_inFeature.GetFID()
this_inGeom = this_inFeature.GetGeometryRef()
this_inWkt = this_inGeom.ExportToIsoWkt()
# Adding the new ID and WKT data to the output file
outFeature = ogr.Feature(outFeatureDefn)
outFeature.SetField(inLayerIDColname, this_FID)
outFeature.SetField(outWKTColName, this_inWkt)
outLayer.CreateFeature(outFeature)
# Clearing the input Feature
this_inFeature = None
# Releasing the input and output files
inDataSource = None
outDataSource = None
inLayer = None
outLayer = None
I currently see three problems with the method I display above:
- I need to "manually" iterate over every feature and export them one by one, which can be pretty slow. Is there an already-built vectorized or one-line approach for this?
- The output ID field is generated with quotes (see image below). It should be exported as an integer.
- The approach above only exports the ID and WKT fields. How can I make sure all original fields get included alongside the newly-created WKT column?
I know this is a long post, but ultimately, my question is pretty simple. I just want to figure out how read in layers from osgeo
/ogr
and export them to CSV with a WKT column inside Python.
ID field generated with quotes
PS: I know that the specific input file doesn't matter much, but here's the input file I'm working with: ZipFile with GDB. The zipfile contains some other stuff too, but I'm focusing on the GDB - specifically, the layer called "TxDOT_Roadway_Linework".
1 Answer 1
Something like (untested):
gdal.VectorTranslate('path/to/output.csv', 'path/to/input.gdb', layers=[input_layer_name], format='CSV', layerCreationOptions=['GEOMETRY=AS_WKT'])
-
Hi there! Thanks for the quick response! I ran the command, but it didn't go through. Nothing actually happens. When I try to store the output of the command in a variable, the variable just contains
None
. And theoutput.csv
never gets created. Any idea what's up? I'm going to try playing around with thelayerCreationOptions
.Felipe D.– Felipe D.2021年10月05日 20:20:31 +00:00Commented Oct 5, 2021 at 20:20 -
1I just tested and didn't change anything except the filepaths and it worked as expected and wrote out a CSV with WKT geometry and the call to gdal.VectorTranslate returned a
<osgeo.gdal.Dataset;
object. When GDAL returnsNone
it means there was an error, to actually see the error you need to setgdal.UseExceptions()
user2856– user28562021年10月05日 20:58:04 +00:00Commented Oct 5, 2021 at 20:58 -
Aaahhhh!!! Got it! I believe the problem is with the projection. I'm getting this error:
RuntimeError: PROJ: proj_create_from_wkt: Cannot find proj.db
.Felipe D.– Felipe D.2021年10月05日 21:04:56 +00:00Commented Oct 5, 2021 at 21:04 -
Sweeeeeet!!! I got it to run!!! The error above seems to pop up often when installing GDAL in conda environments (see here). I just needed to include
os.environ['GDAL_DATA'] = r'C:\ProgramData\Anaconda3\envs\myenv\Library\share\gdal'
andos.environ['PROJ_LIB'] = r'C:\ProgramData\Anaconda3\envs\myenv\Library\share\proj'
and it worked. Thanks a bajillion!!!!!Felipe D.– Felipe D.2021年10月05日 21:13:12 +00:00Commented Oct 5, 2021 at 21:13 -
1Sorry, no idea then, sounds like you need to ask a whole new questionuser2856– user28562021年10月05日 23:24:01 +00:00Commented Oct 5, 2021 at 23:24
Explore related questions
See similar questions with these tags.