1

How can I read in a layer's data using the python osgeo/ogr library and then export the whole attribute table and an extra column with the geometries coded as WKT to a CSV format?

I know that this is possible using the ogr2ogr command-line function. The command to run from my terminal would be something like this:

ogr2ogr -overwrite -f CSV -lco GEOMETRY=AS_WKT -skipfailures "path\to\output.csv" "path\to\input.gdb" "input_layer_name"

The command above takes the "input_layer_name" layer from the input GDB and exports it to a CSV file containing all columns in the original attribute table and another column called WKT that contains geometries' WKTs.

But I'm not interested in using the ogr2ogr application. I'm trying to do this from within a Python script. Sadly, using python's subprocess library to run the command above from inside Python is not an option here.

I've cobbled up a script that almost does what I need, but it still has a few issues:

import os 
from osgeo import ogr
# Use OGR specific exceptions
ogr.UseExceptions()
# Definitions for input file name and layer name
inDriverName = 'OpenFileGDB'
inGDBPath = 'path/to/input.gdb'
inLayerName = 'input_layer_name'
inDriver = ogr.GetDriverByName(inDriverName)
inDataSource = inDriver.Open(inGDBPath, 0)
inLayer = inDataSource.GetLayerByName(inLayerName)
inLayerIDColname = inLayer.GetFIDColumn()
# Name of the new CSV
outFile = "path/to/output.csv"
outLayerName = inLayerName + '_wkt'
outWKTColName = 'WKT'
outDriver = ogr.GetDriverByName("CSV")
# Remove output output CSV if it already exists
if os.path.exists(outFile):
 outDriver.DeleteDataSource(outFile)
# Create the output CSV
outDataSource = outDriver.CreateDataSource(outFile)
outLayer = outDataSource.CreateLayer(outLayerName)
# Adding ID and WKT fields
idField = ogr.FieldDefn(inLayerIDColname, ogr.OFTInteger)
wktField = ogr.FieldDefn(outWKTColName, ogr.OFTString)
outLayer.CreateField(idField)
outLayer.CreateField(wktField)
outFeatureDefn = outLayer.GetLayerDefn()
# Making sure the feature reader is reset
inLayer.ResetReading()
# Iterating over every feature in the input layer
for this_inFeature in inLayer:
 # Extracting the input feature's ID, geometry and WKT
 this_FID = this_inFeature.GetFID()
 this_inGeom = this_inFeature.GetGeometryRef()
 this_inWkt = this_inGeom.ExportToIsoWkt()
 
 # Adding the new ID and WKT data to the output file
 outFeature = ogr.Feature(outFeatureDefn)
 outFeature.SetField(inLayerIDColname, this_FID)
 outFeature.SetField(outWKTColName, this_inWkt)
 outLayer.CreateFeature(outFeature)
 
 # Clearing the input Feature
 this_inFeature = None
# Releasing the input and output files
inDataSource = None
outDataSource = None
inLayer = None
outLayer = None

I currently see three problems with the method I display above:

  • I need to "manually" iterate over every feature and export them one by one, which can be pretty slow. Is there an already-built vectorized or one-line approach for this?
  • The output ID field is generated with quotes (see image below). It should be exported as an integer.
  • The approach above only exports the ID and WKT fields. How can I make sure all original fields get included alongside the newly-created WKT column?

I know this is a long post, but ultimately, my question is pretty simple. I just want to figure out how read in layers from osgeo/ogr and export them to CSV with a WKT column inside Python.

ID field generated with quotes

PS: I know that the specific input file doesn't matter much, but here's the input file I'm working with: ZipFile with GDB. The zipfile contains some other stuff too, but I'm focusing on the GDB - specifically, the layer called "TxDOT_Roadway_Linework".

Vince
20.5k16 gold badges49 silver badges65 bronze badges
asked Oct 5, 2021 at 18:41

1 Answer 1

1

Something like (untested):

gdal.VectorTranslate('path/to/output.csv', 'path/to/input.gdb', layers=[input_layer_name], format='CSV', layerCreationOptions=['GEOMETRY=AS_WKT'])
answered Oct 5, 2021 at 20:00
11
  • Hi there! Thanks for the quick response! I ran the command, but it didn't go through. Nothing actually happens. When I try to store the output of the command in a variable, the variable just contains None. And the output.csv never gets created. Any idea what's up? I'm going to try playing around with the layerCreationOptions. Commented Oct 5, 2021 at 20:20
  • 1
    I just tested and didn't change anything except the filepaths and it worked as expected and wrote out a CSV with WKT geometry and the call to gdal.VectorTranslate returned a <osgeo.gdal.Dataset; object. When GDAL returns None it means there was an error, to actually see the error you need to set gdal.UseExceptions() Commented Oct 5, 2021 at 20:58
  • Aaahhhh!!! Got it! I believe the problem is with the projection. I'm getting this error: RuntimeError: PROJ: proj_create_from_wkt: Cannot find proj.db. Commented Oct 5, 2021 at 21:04
  • Sweeeeeet!!! I got it to run!!! The error above seems to pop up often when installing GDAL in conda environments (see here). I just needed to include os.environ['GDAL_DATA'] = r'C:\ProgramData\Anaconda3\envs\myenv\Library\share\gdal' and os.environ['PROJ_LIB'] = r'C:\ProgramData\Anaconda3\envs\myenv\Library\share\proj' and it worked. Thanks a bajillion!!!!! Commented Oct 5, 2021 at 21:13
  • 1
    Sorry, no idea then, sounds like you need to ask a whole new question Commented Oct 5, 2021 at 23:24

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.