502 questions
0 votes, 1 answer, 42 views
DuckDB AWS Credential refresh
I am running a DuckDB job in an AWS Fargate container. It is accessing .parquet files on S3 through the DuckLake extension. The container assumes an IAM role which DuckDB can access via CREATE SECRET (...
1 vote, 1 answer, 46 views
How to authenticate with a service account in a Dataproc cluster for a DuckDB connection to BigQuery
I'm trying to authenticate with an already pre-signed-in service account (SA) in a Dataproc cluster.
I'm configuring a DuckDB connection with the BigQuery extension and I can't seem to reuse the ...
1 vote, 1 answer, 123 views
DuckDB: how to fine tune parameters?
I have several ndjson files that are nearly 800GB. They come from parsing the Wikipedia dump. I would like to remove duplicate HTML. As such, I group by "html" and pick the JSON with the ...
-3 votes, 1 answer, 127 views
Is there a way to directly convert CSV to Parquet with DuckDB in Java?
I am doing some tests comparing DuckDB usage among different languages etc, and I've noticed something strange.
In python you can do the following:
duckdb.read_csv(inputFile, max_line_size=10000000, ...
1 vote, 1 answer, 142 views
DuckDB: out-of-memory problem of groupby-max
I have several ndjson files that are nearly 800GB. They come from parsing the Wikipedia dump. I would like to remove duplicate HTML. As such, I group by "html" and pick the JSON with the ...
Best practices
1 vote, 0 replies, 43 views
Dictionary Encoding VS ENUM type
To meet certain data analysis requirements, I am migrating from a self-hosted local MySQL database to PolarDB. During the migration, I discovered that many data analysis tools offer a technique called ...
1 vote, 0 answers, 80 views
How do I get DuckDB CLI to successfully log errors?
I'm using the DuckDB CLI, version 1.4.2 on macOS.
The plan is to use DuckDB as part of a CLI pipeline rather than from inside a Python script.
I've already got the tooling to build out the script ...
Advice
0 votes, 3 replies, 38 views
Issue statement per row
I have a table export like this:
id  file_name   json_content
1   out_1.json  {...}
2   out_2.json  {...}
Now I want to do a COPY (SELECT json_content FROM export) TO file_name for each row.
At first I thought ...
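One way to approach the per-row export described above is to generate one COPY statement per row and run the resulting script. The following is only a minimal sketch: the helper names `sql_quote` and `build_copy_statements` are hypothetical, it assumes `id` uniquely identifies each row, and it deliberately leaves out any COPY format options, which would need to be chosen to suit the desired output.

```python
from typing import Iterable, List, Tuple


def sql_quote(text: str) -> str:
    """Return text as a single-quoted SQL string literal (quotes doubled)."""
    return "'" + text.replace("'", "''") + "'"


def build_copy_statements(rows: Iterable[Tuple[int, str]]) -> List[str]:
    """Build one COPY statement per (id, file_name) pair.

    Assumption: id is unique in the export table, so the WHERE clause
    selects exactly one row for each output file.
    """
    statements = []
    for row_id, file_name in rows:
        statements.append(
            f"COPY (SELECT json_content FROM export WHERE id = {int(row_id)}) "
            f"TO {sql_quote(file_name)};"
        )
    return statements


if __name__ == "__main__":
    # The two sample rows from the table in the question.
    for stmt in build_copy_statements([(1, "out_1.json"), (2, "out_2.json")]):
        print(stmt)
```

The (id, file_name) pairs would first have to be read from the export table; the generated statements can then be executed one by one or written to a script file.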
0 votes, 1 answer, 82 views
DuckDB in lightweight incremental Foundry transform
I am unable to use DuckDB in an incremental lightweight transform. The docs say to access the duckdb object from the context, but that fails.
from transforms.api import transform, incremental, Input, ...
0 votes, 0 answers, 32 views
tbl_summary using duckplyr verbs leads to incorrectly sorted outputs
I'm trying to create output using gtsummary::tbl_summary, which I've done many times before like this:
library(dplyr)
library(gtsummary)
iris |>
tbl_summary()
I want the species variable to ...
0 votes, 1 answer, 72 views
In DuckDB, can there be proper UTF-8 output in duckbox mode to Windows console? [closed]
Edit:
Since the question is off-topic here, I marked it for closing/migration to SuperUser. It was not migrated so far, so I recreated this question and answer at SuperUser.
I cannot get non-ASCII ...
1 vote, 1 answer, 57 views
C API duckdb_create_list_value is always returning NULL [closed]
I am trying to insert a list of values into an INTEGER[] column. For that I am creating a list value, because I need to use the appender. But duckdb_create_list_value is always returning NULL. I am using v 1.4.1 C ...
1 vote, 0 answers, 114 views
Problem updating Postgres ENUM from DuckDB
I'm on DuckDB 1.4.1 and having difficulty updating a Postgres 17.6 ENUM field, status:
CREATE TYPE mystatus_enum AS ENUM (
'IN_STOCK', 'OUT_OF_STOCK', 'NOT_FOUND', 'NOT_A_PRODUCT'
);
CREATE ...
1 vote, 1 answer, 173 views
Avoiding duckdb OutOfRangeException when multiplying Decimals
I'm working with DuckDB and have several client-provided SQL expressions that use DECIMAL(38,10) columns (fixed precision with 10 digits after the decimal point).
For example:
SELECT S1__AMOUNT * ...
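The arithmetic behind this kind of overflow can be sketched in a few lines. This is an illustration under a stated assumption, not a description of DuckDB's confirmed behaviour: it assumes that multiplying two DECIMAL values adds their scales and caps the total width at 38 digits. The helper names `multiply_type` and `integer_digits` are hypothetical.

```python
from typing import Tuple

MAX_WIDTH = 38  # largest DECIMAL precision

DecimalType = Tuple[int, int]  # (width, scale)


def multiply_type(left: DecimalType, right: DecimalType) -> DecimalType:
    """Result type of left * right under the assumed rule.

    Assumption: scales add, widths add, and the width is capped at 38.
    A scale above 38 cannot be represented at all.
    """
    (w1, s1), (w2, s2) = left, right
    scale = s1 + s2
    if scale > MAX_WIDTH:
        raise OverflowError(f"scale {scale} exceeds {MAX_WIDTH} digits")
    return min(w1 + w2, MAX_WIDTH), scale


def integer_digits(decimal_type: DecimalType) -> int:
    """Digits left of the decimal point for a (width, scale) type."""
    width, scale = decimal_type
    return width - scale


if __name__ == "__main__":
    amount = (38, 10)
    product = multiply_type(amount, amount)
    print(product, integer_digits(product))
```

Under that assumption each extra DECIMAL(38,10) factor takes ten digits away from the integer part, which is why casting operands to a smaller scale before multiplying is a common workaround.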
0 votes, 1 answer, 130 views
DuckDB Wasm limitation
I don't know how to check or increase the memory limit of DuckDB Wasm.
I'm using Chrome and I import some Parquet files into the browser; one of them has 234 MB of data.
I did my research and the limit ...