I want to put my SQL code results into a table/data frame in Python. I have typed the code below and wanted to know what additional code I need to use in order to do this:
import pandas as pd
f = pd.read_csv("/Users/sandeep/Downloads/Python/baseball.csv", header=0)
q = """
select player, year,
case when team='CHN' then 1 else 0 end as team_flag
from f
where year=2006;
"""
2 Answers
Your pandas code suggests that you are reading from a .csv file, not an SQL database. In that case you do not need to do anything else: f already holds the DataFrame.
The syntax for obtaining a DataFrame from an SQL table is described in the pandas documentation for read_sql.
At minimum, you will need two arguments: a query and a connection (which may be a string). Your query may be something like SELECT player, year FROM table_name WHERE team='CHN' AND year=2006. The connection string can be sqlite:////full_path.sqlite or sql_flavor://user:password@host:port/database.
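As a minimal sketch of the query-plus-connection pattern, the example below copies a small DataFrame into an in-memory SQLite database and reads the question's query back with pd.read_sql. The sample rows and the table name baseball are made up for illustration; with the real file, f would come from pd.read_csv instead.

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for the contents of baseball.csv.
f = pd.DataFrame(
    {
        "player": ["a", "b", "c"],
        "year": [2006, 2006, 2005],
        "team": ["CHN", "NYA", "CHN"],
    }
)

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Copy the DataFrame into a SQL table (the name "baseball" is an assumption).
f.to_sql("baseball", conn, index=False)

q = """
select player, year,
       case when team = 'CHN' then 1 else 0 end as team_flag
from baseball
where year = 2006;
"""

# read_sql takes the query and the connection and returns a DataFrame.
result = pd.read_sql(q, conn)
conn.close()
print(result)
```

Here the connection is a sqlite3 connection object rather than a connection string; passing a string URL instead should also work, but that route relies on SQLAlchemy being installed.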
It is arguably easier to add your indicator within the DataFrame rather than using SQL; something like df['team_flag'] = (df['team'] == 'CHN').astype(int) should do it.
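For comparison, here is a rough pandas-only version of the query from the question, with no SQL involved. The sample rows are invented for illustration, and the column names player, year and team are taken from the question's query.

```python
import pandas as pd

# Hypothetical stand-in for the contents of baseball.csv.
df = pd.DataFrame(
    {
        "player": ["a", "b", "c"],
        "year": [2006, 2006, 2005],
        "team": ["CHN", "NYA", "CHN"],
    }
)

# Equivalent of "where year=2006"; copy() gives an independent DataFrame.
out = df.loc[df["year"] == 2006, ["player", "year", "team"]].copy()

# Equivalent of the CASE expression: a boolean comparison cast to 1/0.
out["team_flag"] = (out["team"] == "CHN").astype(int)

# Drop the helper column so only the selected columns remain.
out = out.drop(columns="team")
print(out)
```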
Use pandas.read_sql():
df = pd.read_sql(q, db)
To create the db connection, I suggest you use the following:
import psycopg2
db = psycopg2.connect("dbname='template1' user='dbuser' host='localhost' password='dbpass'")
With "from f", is your intention to get values from a CSV file? Why are you using SQL to query a CSV?