I am not a developer or PostgreSQL DB admin, so this may be basic questions.
Logistics: Windows 10 server / pgAdmin 4 / Postgres 10 / Python 2.7.13
I'm using a python script to ingest external data, create a CSV file and copy that into Postgres 10. I keep getting the following error: Psycopg2.ProgrammingError: syntax error at or near "VALUES"
I have a two part question - 1) I can not see the syntax error in the following sql statement
def insert_csv_data(sqlstmt):
with get_conn('pg') as db:
cur = db.cursor()
sqlcopy = "COPY irwin (fire_id,name,type,acres,date_time,state,county,admin_unit,land_cat,commander,perc_cntnd,cont_date,gacc,lat,long,geom,updated,imo) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,ST_SetSRID(ST_MakePoint(%s, %s),4326)%s,%s) FROM STIN DELIMITER ',' CSV HEADER"
with open(csv_file, 'r') as f:
#next(f)# Skipping the first line header row
cur.copy_expert(sqlcopy, f, size=8000)
db.commit()
cur.close()
And 2) Once that is resolved I'm expecting to get an error about the geometry column in postgres. If someone would also peek at the code snippets and let me know if anything jumps out I would SO APPRECIATE IT!
This snippet pulls the external data in order but I don't think I've coded this correctly to pull the lat/long into the geom field.
# Lat 15 - double
if not attributes['InitialLatitude'] is None:
lat = str(attributes['InitialLatitude']).replace('\n', '')
else:
lat = '0'
#Long 16 - double
if not attributes['InitialLongitude'] is None:
long = str(attributes['InitialLongitude']).replace('\n', '')
else:
long = '0'
# geom is not defined - script is dumping the geometry into the IMO field
geom = str(attributes['InitialLatitude']) + ' ' + str(attributes['InitialLongitude'])
I added a Geom header to the csv data. Please help - thanks!
2 Answers 2
sqlcopy = "COPY irwin (fire_id,name,type,acres,date_time,state,county,admin_unit,land_cat,commander,perc_cntnd,cont_date,gacc,lat,long,geom,updated,imo) FROM STDIN WITH FORMAT CSV, HEADER")
copy doesn't do values...
, the columns in the input must be correctly formatted for the columns you list, also format CSV
is usually a better choice than delimiter ','
what copy expert actually does is open a channel to the postgresql copy command. and then squirt the file contents through that channel. so you need to format the query for what postgres wants.
if using python to copy data to postgresql, the fastest & stable method is via pandas.
And it is been the standard in their document. Here is the steps.
- read_csv by pandas to dataframe
- dataframe to postgresql in special method to speed up.
the pg highspeed insert method for you reference:
Alternative to_sql() method for DBs that support COPY FROM
import csv from io import StringIO
def psql_insert_copy(table, conn, keys, data_iter): """ Execute SQL statement inserting data
Parameters
----------
table : pandas.io.sql.SQLTable
conn : sqlalchemy.engine.Engine or sqlalchemy.engine.Connection
keys : list of str
Column names
data_iter : Iterable that iterates the values to be inserted
"""
# gets a DBAPI connection that can provide a cursor
dbapi_conn = conn.connection
with dbapi_conn.cursor() as cur:
s_buf = StringIO()
writer = csv.writer(s_buf)
writer.writerows(data_iter)
s_buf.seek(0)
columns = ', '.join('"{}"'.format(k) for k in keys)
if table.schema:
table_name = '{}.{}'.format(table.schema, table.name)
else:
table_name = table.name
sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
table_name, columns)
cur.copy_expert(sql=sql, file=s_buf)
https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#sql-queries
hope it will help you.
printf
command (or whatever it is in Python) and printed out your value ofsqlcopy
and tried it with the psql client directly?