I have a SQL table in Snowflake, 100K rows and 15 columns. I want to import this table into my Jupyter notebook using Dask for further analysis. I am primarily doing this as a form of practice, since I am new to Dask.
import snowflake.connector

connection_parameters = {
    'user': user,
    'password': password,
    'account': account,
    'warehouse': warehouse,
    'database': database,
    'schema': schema
}

conn = snowflake.connector.connect(**connection_parameters)
(These are not my actual credentials; I changed them so I can post here. Eventually, I will use environment variables, but I'm going the simple route first.)
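For when I get there, here is a minimal sketch of the environment-variable approach I plan to switch to (the SNOWFLAKE_* variable names are placeholders I made up):

import os

# Placeholder variable names; export these in the shell before starting Jupyter.
connection_parameters = {
    'user': os.environ['SNOWFLAKE_USER'],
    'password': os.environ['SNOWFLAKE_PASSWORD'],
    'account': os.environ['SNOWFLAKE_ACCOUNT'],
    'warehouse': os.environ['SNOWFLAKE_WAREHOUSE'],
    'database': os.environ['SNOWFLAKE_DATABASE'],
    'schema': os.environ['SNOWFLAKE_SCHEMA'],
}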
query = """
SELECT * FROM my_table
"""
from dask.distributed import Client
from dask import delayed
from dask.dataframe import from_delayed

client = Client()  # Start a local Dask cluster

@delayed
def fetch_data(query):
    # Run the query and return the full result set as a pandas DataFrame
    cur = conn.cursor()
    cur.execute(query)
    return cur.fetch_pandas_all()
Everything seems to work fine up to this point.
import dask.dataframe as dd

try:
    with Client() as client:
        ddf = dd.from_delayed([delayed(fetch_data)(query) for _ in range(10)])
        result_df = ddf.compute()
except Exception as e:
    print(f"Error occurred: {e}")
When I run this last code block, I get this error:
Error occurred: ('Could not serialize object of type HighLevelGraph'
I am stuck on this step, so any help would be appreciated. Again, I am new to Dask, so if there is a simpler method I haven't considered, please recommend it; my end goal is to bring my data from Snowflake into Dask.
One suggestion I have come across is to create the conn object within the delayed function, but I am not sure how to structure that; my attempt at that pattern is sketched below.
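For reference, here is the variant I have been trying, a minimal sketch that assumes two things: the connection should be opened inside the function so nothing unpicklable is captured from the notebook, and the function should be wrapped with delayed exactly once instead of being decorated and then wrapped again (fetch_data_v2 is just my name for the rewritten function):

import snowflake.connector
import dask.dataframe as dd
from dask import delayed

def fetch_data_v2(query):
    # Open the connection inside the task so each worker creates its own
    # connection instead of Dask trying to serialize one from the notebook.
    conn = snowflake.connector.connect(**connection_parameters)
    try:
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetch_pandas_all()
    finally:
        conn.close()

# Wrap with delayed exactly once; fetch_data_v2 itself is undecorated, so
# from_delayed receives plain Delayed objects rather than nested ones.
ddf = dd.from_delayed([delayed(fetch_data_v2)(query)])
result_df = ddf.compute()

I also dropped the range(10) loop, since computing the same query ten times would just stack ten copies of the same 100K rows into one DataFrame.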