I have a SQL database from which I need to analyze a very big table. I would like to use python for that.
Unfortunately I cannot access the SQL database directly via python.
Are there any suggestions to what to export the table to be able to work with it?
I tried to export to SQL file and to import that into pandas dataframe but of course out of memory.
I tried to access the database from python directly using pymysql using
db=pymysql.connect(host="localhost", db="all_data")
but I get "can't connect to MySQL server Win Error 10061"?
File type is .sql.
Many thanks
-
1Try to export as CSV or similar than load into PandasMike67– Mike672020年07月30日 14:03:50 +00:00Commented Jul 30, 2020 at 14:03
-
What is the database type?s3dev– s3dev2020年07月30日 14:10:23 +00:00Commented Jul 30, 2020 at 14:10
-
it is a .sql fileRolf12– Rolf122020年07月30日 15:07:01 +00:00Commented Jul 30, 2020 at 15:07
-
No, apologies. What type of database is it; Oracle, MySQL, SQLServer, etc.? Each has its own method for exporting data, some easier to work with than others.s3dev– s3dev2020年07月30日 15:21:54 +00:00Commented Jul 30, 2020 at 15:21
-
ah sorry, mysqlRolf12– Rolf122020年07月30日 15:26:12 +00:00Commented Jul 30, 2020 at 15:26
2 Answers 2
I had this problem before... Try this:
#import libraries
from sqlalchemy import create_engine
import pandas as pd
#set login parameters
db = #enter your database schema name here
user = #enter username to login
pw = #type the password here
#connect to database
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
.format(user=user,pw=pw, db=db))
#load data from MySQL into a dataframe
df = pd.read_sql_query("Select * from table", engine)
10 Comments
df = pd.read_sql(sql=query, con=engine) will do the job. Additionally, user might need to add the database port to the connstr.I am pretty sure something like this would work
import pyodbc
import pandas as pd
conn = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\Users\Ron\Desktop\testdb.accdb;')
SQL_Query = pd.read_sql_query(
'''select
product_name,
product_price_per_unit,
units_ordered,
((units_ordered) * (product_price_per_unit)) AS revenue
from tracking_sales''', conn)
df = pd.DataFrame(SQL_Query, columns=['field1','field2',...])
You can see a good example on this link: https://datatofish.com/sql-to-pandas-dataframe/