
At my internship, I have been asked to learn Python on the job. It's been a rough road, as I had never coded before, so please forgive me if my problem is elementary.

The lightning CSV files I am trying to convert have over 400,000 rows.
The headers are: mon, day, year, hr, min, sec, lat, lon, ht, type, Ip, Id

I found the answers to this question to be potential solutions, but I am struggling to modify the code below to convert my CSVs into shapefiles.

The path for the CSVs is C:\Users\zherran\Desktop\shp\enwest2015*.csv

The path where I want the shapefiles is C:\Users\zherran\Desktop\shp\shapefiles

import shapefile as shp
import csv
out_file = 'enwest3days.shp'
#Set up blank lists for data
mon,day,year,hr,min,sec,lat,lon,ht,type,Ip,Id=[],[],[],[],[],[],[],[],[],[],[],[]
# sample row mon,day,year,hr,min,sec,lat,lon,ht,type,Ip,Id
# 6,2,2015,7,27,16.6,41.5,-101.4,13789.5,1,-7900,26abb8a
#read data from csv file and store in lists
with open('enwest*.csv', 'rb') as csvfile:
    r = csv.reader(csvfile, delimiter=';')
    for i,row in enumerate(r):
        if i > 0: #skip header
            mon.append(int(row[0]))
            day.append(int(row[1]))
            year.append(int(row[2]))
            hr.append(int(row[3]))
            min.append(int(row[4]))
            sec.append(float(row[5]))
            lat.append(float(row[6]))
            lon.append(float(row[7]))
            ht.append(float(row[8]))
            type.append(int(row[9]))
            Ip.append(float(row[10]))
            Id.append(complex(row[11]))
#Set up shapefile writer and create empty fields
w = shp.Writer(shp.POINT)
w.autoBalance = 1 #ensures geometry and attributes match
w.field('X','F',10,8)
w.field('Y','F',10,8)
w.field('Date','D')
w.field('Target','C',50)
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
#loop through the data and write the shapefile
for j,k in enumerate(x):
    w.point(k,y[j]) #write the geometry
    w.record(k,y[j],date[j], target[j], id_no[j]) #write the attributes
#Save shapefile
w.save(out_file)
nmtoken
asked Jun 9, 2016 at 20:54
  • You should add the code that you have written. Commented Jun 9, 2016 at 20:55
  • The reason I ask to see your code is that it sounds as if you are having difficulty with Python syntax and programming concepts generally: with just the posted code from someone else, we have no idea what you don't understand. Commented Jun 9, 2016 at 21:19
  • Do you have to use Python? If this is just a task you need to get done, you could try using the GDAL/OGR command-line tools, specifically ogr2ogr. This post is extremely useful for what you've got: gis.stackexchange.com/questions/127518/…. If you are set on using Python, I would also recommend using the GDAL/OGR Python modules. Commented Jun 9, 2016 at 21:42
  • As @Dave-Evans said, do you have your heart set on 'shapefile'? OGR is available for Python and has a driver for CSV, which would simplify all this code down to a few lines. Commented Jun 9, 2016 at 23:01
  • Can you explain what is actually not working? Commented Jun 10, 2016 at 8:41

1 Answer


a) With the solution you are using, PyShp (the shapefile module), you need:

1) to extract the fields from the CSV file
2) to define the fields of the shapefile (w.field('Target','C',50))
3) to construct the geometry from the lon and lat fields (w.point(float(i['x']),float(i['y']))); see CSV to Shapefile

b) With Fiona, it is easier, but you still have the same problem of defining the fields.
c) osgeo/ogr complicates things...
d) Therefore, be modern: to convert CSV files directly to shapefiles, the best solution is now Pandas, GeoPandas (which uses Fiona) and Shapely. You do not have to worry about the fields or their definitions (pandas.DataFrame.from_csv).

from pandas import DataFrame
from geopandas import GeoDataFrame
from shapely.geometry import Point
# convert the csv file to a DataFrame
data = DataFrame.from_csv('enwest2015_1.csv', index_col=False)
# extract the geometry from the DataFrame
points = [Point(row['lon'], row['lat']) for key, row in data.iterrows()]
#convert the DataFrame to a GeoDataFrame 
geo_df = GeoDataFrame(data,geometry=points)
# save the resulting shapefile
geo_df.to_file('enwest2015_1.shp', driver='ESRI Shapefile') 
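
For comparison, the three steps listed under a) can be sketched with PyShp roughly as follows. This is only an illustration under some assumptions: the CSV is comma-delimited (as in the sample row in the question), PyShp 2.x is installed, and the names CONVERTERS, read_rows and write_points as well as the field sizes are made up for this example.

```python
import csv
import io

# Column name -> converter, in the header order given in the question.
# "Id" stays a string because the sample value (26abb8a) is not a number.
CONVERTERS = {
    "mon": int, "day": int, "year": int, "hr": int, "min": int,
    "sec": float, "lat": float, "lon": float, "ht": float,
    "type": int, "Ip": float, "Id": str,
}


def read_rows(csvfile):
    """Step 1: yield one dict per CSV row with every field converted."""
    reader = csv.DictReader(csvfile, skipinitialspace=True)
    for row in reader:
        yield {name: convert(row[name]) for name, convert in CONVERTERS.items()}


def write_points(rows, out_path):
    """Steps 2 and 3: define the fields, then write geometry and attributes."""
    import shapefile  # PyShp 2.x (assumption); imported here so read_rows works without it

    with shapefile.Writer(out_path, shapeType=shapefile.POINT) as w:
        for name, convert in CONVERTERS.items():
            if convert is int:
                w.field(name, "N", 10)      # integer field (illustrative size)
            elif convert is float:
                w.field(name, "F", 19, 8)   # float field (illustrative size)
            else:
                w.field(name, "C", 50)      # text field
        for row in rows:
            w.point(row["lon"], row["lat"])                 # x = lon, y = lat
            w.record(*[row[name] for name in CONVERTERS])   # same order as the fields


# Small in-memory example built from the sample row in the question.
sample = io.StringIO(
    "mon,day,year,hr,min,sec,lat,lon,ht,type,Ip,Id\n"
    "6,2,2015,7,27,16.6,41.5,-101.4,13789.5,1,-7900,26abb8a\n"
)
rows = list(read_rows(sample))
print(rows[0]["lon"], rows[0]["lat"], rows[0]["Id"])
# write_points(rows, "enwest3days")  # uncomment once PyShp is installed
```

For a real file, something like open(path, newline='') in place of the in-memory sample should work the same way.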

For the paths, it is a pure Python problem.
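
One possible way to handle the paths, sketched with only the standard library: glob expands the enwest2015*.csv pattern and os.path builds a matching output name in the shapefiles folder. The helper name shapefile_path_for is hypothetical; the two folders are the ones given in the question.

```python
import glob
import os


def shapefile_path_for(csv_path, out_dir):
    """Return out_dir/<CSV base name>.shp for the given CSV path."""
    base = os.path.splitext(os.path.basename(csv_path))[0]
    return os.path.join(out_dir, base + ".shp")


# Raw strings keep the backslashes of Windows paths literal.
csv_pattern = r"C:\Users\zherran\Desktop\shp\enwest2015*.csv"
out_dir = r"C:\Users\zherran\Desktop\shp\shapefiles"

# open() does not expand wildcards, so the pattern is expanded with glob first.
for csv_path in sorted(glob.glob(csv_pattern)):
    print(csv_path, "->", shapefile_path_for(csv_path, out_dir))
```

Each csv_path and its output path can then be passed to whichever conversion code you choose.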

answered Jun 10, 2016 at 8:50
  • from_csv is deprecated. Use pandas.read_csv() Commented Dec 30, 2022 at 6:20
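
Following the comment above, the same conversion could be sketched with pandas.read_csv instead of DataFrame.from_csv. The function names load_strikes and save_as_shapefile are hypothetical, and the coordinate reference system (EPSG:4326, i.e. plain WGS84 longitude/latitude) is an assumption, since the question does not state one.

```python
import io

import pandas as pd


def load_strikes(csv_source):
    """Read a lightning CSV (path or file-like object) into a DataFrame."""
    # dtype keeps the Id column as text; values such as 26abb8a are not numbers
    return pd.read_csv(csv_source, skipinitialspace=True, dtype={"Id": str})


def save_as_shapefile(data, shp_path):
    """Build point geometries from lon/lat and write an ESRI Shapefile."""
    import geopandas as gpd  # imported here so load_strikes works without it

    geo_df = gpd.GeoDataFrame(
        data,
        geometry=gpd.points_from_xy(data["lon"], data["lat"]),
        crs="EPSG:4326",  # assumption: coordinates are WGS84 lon/lat
    )
    geo_df.to_file(shp_path, driver="ESRI Shapefile")


# Small in-memory example built from the sample row in the question.
sample = io.StringIO(
    "mon,day,year,hr,min,sec,lat,lon,ht,type,Ip,Id\n"
    "6,2,2015,7,27,16.6,41.5,-101.4,13789.5,1,-7900,26abb8a\n"
)
data = load_strikes(sample)
print(data[["lon", "lat", "Id"]])
# save_as_shapefile(data, "enwest2015_1.shp")  # requires GeoPandas
```

points_from_xy avoids the row-by-row loop over iterrows, which may matter with 400,000 rows.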
