read_csv from filehandle, when turned into a view, stops working with No files found that match the pattern "DUCKDB_INTERNAL_OBJECTSTORE://... #477
Description
What happens?
cursor.read_csv(filehandle) returns a DuckDBPyRelation object on which you can call .to_view(viewname) but the view isn't usable later, once the returned DuckDBPyRelation object has gone out of scope. If you try to do something like cursor.read_csv(filehandle).to_view('viewname') then it doesn't work at all.
This doesn't seem to be a problem for opening a csv by filename, or for relations made into tables, just for csvs opened from filehandles and made into views. I think I can understand why it's happening, but it is
(In case you're wondering, I'm opening files from filehandles as a workaround for duckdb/duckdb#12232 ... so more typically with bz2.open(filename) as fh: cursor.read_csv(fh).to_view(viewname) or similar, but using a StringIO makes for a simpler demo to reproduce.)
To Reproduce
import duckdb
from io import StringIO
cursor = duckdb.connect()
csv_file = StringIO("foo,bar\nhello,world")
rel = cursor.read_csv(csv_file)
rel.to_view("view1")
print(rel.alias)
print(cursor.sql("select * from view1"))
csv_file = StringIO("foo,bar\nhello,world")
cursor.read_csv(csv_file).to_view("view2")
print(cursor.sql("select * from view2"))
The first way works, the second way doesn't:
$ python x.py
DUCKDB_INTERNAL_OBJECTSTORE://b38cc260dcc16094
┌─────────┬─────────┐
│   foo   │   bar   │
│ varchar │ varchar │
├─────────┼─────────┤
│ hello   │ world   │
└─────────┴─────────┘
Traceback (most recent call last):
  File "/home/nick/Work/wehi/countess/x.py", line 14, in <module>
    print(cursor.sql("select * from view2"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
duckdb.duckdb.IOException: IO Error: No files found that match the pattern "DUCKDB_INTERNAL_OBJECTSTORE://ce4251be44deb137"
It also fails with the same error if rel is deleted or goes out of scope before the SQL query of the view. Note also that read_csv(filename).to_view(viewname) works fine.
OS:
Linux 6.8.0 x86_64
DuckDB Package Version:
1.5.3 from pypi
Also source build duckdb-python 1.6.0-dev45 @ ab63b5f
w/ duckdb v1.5.2-4685-g01eda16d6e
Python Version:
3.12.3
Full Name:
Nick Moore
Affiliation:
Mnemote Pty Ltd
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with stable releases 1.5.3 and 1.3.1
I have tested with 1.6.0.dev45 @ ab63b5f
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration to reproduce the issue?
- Yes, I have