read_csv from filehandle, when turned into a view, stops working with No files found that match the pattern "DUCKDB_INTERNAL_OBJECTSTORE://... #477

Open

Description

What happens?

cursor.read_csv(filehandle) returns a DuckDBPyRelation object on which you can call .to_view(viewname), but the view stops being usable once the returned DuckDBPyRelation object has gone out of scope. If you chain the calls, as in cursor.read_csv(filehandle).to_view('viewname'), the view doesn't work at all.

This doesn't seem to be a problem for opening a csv by filename, or for relations made into tables, just for csvs opened from filehandles and made into views. I think I can understand why it's happening, but it is

(In case you're wondering, I'm opening files from filehandles as a workaround for duckdb/duckdb#12232, so the more typical usage is with bz2.open(filename) as fh: cursor.read_csv(fh).to_view(viewname) or similar, but using a StringIO makes for a simpler demo to reproduce.)

To Reproduce

import duckdb
from io import StringIO

cursor = duckdb.connect()

csv_file = StringIO("foo,bar\nhello,world")
rel = cursor.read_csv(csv_file)
rel.to_view("view1")
print(rel.alias)
print(cursor.sql("select * from view1"))

csv_file = StringIO("foo,bar\nhello,world")
cursor.read_csv(csv_file).to_view("view2")
print(cursor.sql("select * from view2"))

The first way works, the second way doesn't:

$ python x.py 
DUCKDB_INTERNAL_OBJECTSTORE://b38cc260dcc16094
┌─────────┬─────────┐
│   foo   │   bar   │
│ varchar │ varchar │
├─────────┼─────────┤
│ hello   │ world   │
└─────────┴─────────┘
Traceback (most recent call last):
  File "/home/nick/Work/wehi/countess/x.py", line 14, in <module>
    print(cursor.sql("select * from view2"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
duckdb.duckdb.IOException: IO Error: No files found that match the pattern "DUCKDB_INTERNAL_OBJECTSTORE://ce4251be44deb137"

It also fails with the same error if rel is deleted or goes out of scope before the SQL query of the view. Note also that read_csv(filename).to_view(viewname) works fine.

OS:

Linux 6.8.0 x86_64

DuckDB Package Version:

1.5.3 from pypi

Also source build duckdb-python 1.6.0-dev45 @ ab63b5f
w/ duckdb v1.5.2-4685-g01eda16d6e

Python Version:

3.12.3

Full Name:

Nick Moore

Affiliation:

Mnemote Pty Ltd

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with stable releases 1.5.3 and 1.3.1, and with 1.6.0.dev45 @ ab63b5f.

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration to reproduce the issue?

  • Yes, I have
