1

I have written the following Python code to convert various input files into a shapefile using Pandas.

The coordinates of the input file are x,y which is what my shapefile outputs to, but I would like to add in some code which converts the input file to lat/long.

However, I'm unsure how to go about this

import pandas as pd
import os
import geopandas as gpd
from shapely.geometry import Point 
input_file = "...input_file.csv" 
file_extension = os.path.splitext(input_file)[-1].lower()
output_file = input_file[:-4]
if file_extension == ".xyz" or ".asc":
 df = pd.read_table(input_file, skiprows=2, sep=r',円|\t', engine='python', names=['x', 'y', 'z'])
 df.columns = ["x", "y", "z"]
elif file_extension == ".txt" or ".text" or ".csv":
 df = pd.read_csv(input_file, sep=',円|\t')
 df.columns = ["x", "y", "z"]
 
gdf = gpd.GeoDataFrame(df, geometry=df.apply(lambda row: Point(row.x,row.y,row.z), axis=1,), crs='EPSG:4326')
gdf.to_file(f"{output_file}.shp") 
shapefile = f"{output_file}.shp"
print("Shapefile Created!")
Taras
35.8k5 gold badges77 silver badges152 bronze badges
asked May 10, 2022 at 9:32
2

1 Answer 1

1

Firstly I recommend the os.path.splitext() function from the os.path module to get file_extension and output_file in a bit more elegant way:

from os.path import splitext
input_file = '.../input_file.csv' 
output_file, file_extension = splitext(input_file)
print(output_file, file_extension)

Let's assume there is a .csv-file called 'points.csv', see image below:

input

On this stage the GeoDataFrame will be created:

import pandas as pd
import geopandas as gpd
from os.path import splitext
absolute_path_to_file = '.../input_file.csv'
file_name, file_extension = splitext(absolute_path_to_file)
if file_extension == ".txt" or ".text" or ".csv":
 df = pd.read_table(absolute_path_to_file, sep=',')
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['x'], df['y'], df['z']), crs='EPSG:25832')
print(gdf)

that results in (keep in mind that the point's geometry representation is POINT Z):

 id x y z geometry
0 1 513072.05 5402445.83 250.0 POINT Z (513072.050 5402445.830 250.000)
1 2 513212.88 5402852.80 245.0 POINT Z (513212.880 5402852.800 245.000)
2 3 513733.74 5403658.99 239.0 POINT Z (513733.740 5403658.990 239.000)

Now it is possible to get latitude, longitude, and altitude using:

  • the GeoPandas's to_crs() function as was suggested by @user2856:

     import pandas as pd
     import geopandas as gpd
     from os.path import splitext
     from pyproj import CRS
     absolute_path_to_file = '.../input_file.csv'
     file_name, file_extension = splitext(absolute_path_to_file)
     if file_extension == ".txt" or ".text" or ".csv":
     df = pd.read_table(absolute_path_to_file, sep=',')
     gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['x'], df['y'], df['z']), crs='EPSG:25832')
     crs = CRS.from_string('EPSG:4326')
     gdf = gdf.to_crs(crs=crs)
     print(gdf)
     id x y z geometry
     0 1 513072.05 5402445.83 250.0 POINT Z (9.17792 48.77488 250.00000)
     1 2 513212.88 5402852.80 245.0 POINT Z (9.17985 48.77854 245.00000)
     2 3 513733.74 5403658.99 239.0 POINT Z (9.18697 48.78578 239.00000)
    
  • Pandas and a custom function with the pyproj's Transformer class:

     import pandas as pd
     from os.path import splitext
     from pyproj import CRS, Transformer
     def xyz_to_latlonalt(x, y, z, crs_in, crs_out):
     crs_from = CRS.from_user_input(crs_in)
     crs_to = CRS.from_user_input(crs_out)
     proj = Transformer.from_crs(crs_from, crs_to, always_xy=True)
     coordinates = proj.transform(x, y, z)
     return coordinates
     absolute_path_to_file = '.../input_file.csv'
     file_name, file_extension = splitext(absolute_path_to_file)
     if file_extension == ".txt" or ".text" or ".csv":
     df = pd.read_table(absolute_path_to_file, sep=',')
     df['lat'] = df.apply(lambda row: xyz_to_latlonalt(row['x'], row['y'], row['z'], 25832, 4326)[0], axis=1)
     df['lon'] = df.apply(lambda row: xyz_to_latlonalt(row['x'], row['y'], row['z'], 25832, 4326)[1], axis=1)
     df['alt'] = df.apply(lambda row: xyz_to_latlonalt(row['x'], row['y'], row['z'], 25832, 4326)[2], axis=1)
     df = df.drop(['x', 'y', 'z'], axis=1)
     print(df)
     id lat lon alt
     0 1 9.177920 48.774878 250.0
     1 2 9.179850 48.778536 245.0
     2 3 9.186966 48.785778 239.0
    
answered May 10, 2022 at 10:54

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.