-
Notifications
You must be signed in to change notification settings - Fork 925
Design for inserting multiple rows #2385
-
Heya. A recurring feature request (#2254, #2002 etc) is the ability to insert multiple rows at once.
For PostgreSQL we have CopyFrom support, which works quite well - but COPY FROM can only INSERT, so you need to use a temp table if you want some form of updating.
For MySQL I have PR #2220 pending, but MySQL's LOAD DATA LOCAL INFILE comes with a bunch of caveats that are scary. It is more powerful than Postgres's COPY FROM through, but I've yet to build support the LOAD DATA statement which actually allows using those features.
So that still leaves us with a niche unfilled: A simple INSERT INTO with multiple rows.
I've been looking into the design of it and ran into a few things I'd like to discuss.
- How do we signal to sqlc that we want to use this bulk insert? I've considered
INSERT INTO table VALUES sqlc.repeat((?, ?, NOW()))
but the parsers choke on that. My latest thinking is to use Allow for extra parameters to the queryType #2375 and have the syntax-- name: MultipleInsert :execrows multiple
. - Method signature: I propose to simply refuse any parameters outside of the to-be-inserted row, so the argument can simply be a slice.
- Implementation: I'm thinking to split the query into three parts: the prologue:
INSERT INTO table VALUES
, a repeatable(?, ?, NOW())
and the epilogueON CONFLICT ...
. However, it's proven non-trivial to extract that from the sql parsers. PostgreSQL's AST only exposes this for A_Const nodes (Expose node position in input string pganalyze/pg_query_go#90 ). MySQL exposes positions of AST nodes, but not lengths - so we can't find the parentheses of the values with 100% certaintly. The sqlite parser seems to expose positions, so that's at least easy. pg_query_go and tidb/parser do expose their scanners/lexers though, so we could probably use that to parse simple queries. That does add an entire parsing path for each engine however.(削除) Or we could just use a simple regexp ;) (削除ここまで)
Beta Was this translation helpful? Give feedback.
All reactions
-
👍 12 -
👀 4
Replies: 4 comments 5 replies
-
Isn't the query just a regular INSERT INTO table (col1, col2) VALUES (?, ?) ON CONFLICT ...
? Currently, the generator generates correct code for that kind of query already. For supporting :execrows
, don't we just have to add code to the generated output which accept a slice of arguments?
Beta Was this translation helpful? Give feedback.
All reactions
-
@Jillie is talking about bulk inserts, which is not supported:
INSERT INTO table (col1, col2)
VALUES (?, ?), (?, ?), (?, ?) -- etc.
Beta Was this translation helpful? Give feedback.
All reactions
-
Yeah, it's what I was referring to. As the generator could add support for execrows
which expects a slice for the values perhaps.
Beta Was this translation helpful? Give feedback.
All reactions
-
Any traction on this issue? I recently ran into this exact problem with respect to sqlite. I have an unknown length of records that I would like to insert into my table and am currently just looping through to do the inserts.
Beta Was this translation helpful? Give feedback.
All reactions
-
It can also be processed first using database/sql
.
Beta Was this translation helpful? Give feedback.
All reactions
-
This is a pretty serious gap in functionality right now, forcing us to hand-code batch insertion code.
I like the idea of sqlc.repeat()
, but it seems to me the main hurdle is how to map input values to rows. Strictly speaking, only those parameters inside a repeat should be repeated, and other parameters should not. For example, here's an insert we use:
WITH prev_vals AS (
SELECT materialized_document_id
FROM view_entities
WHERE connection_id = @connection_id AND source_document_id = @source_document_id
)
INSERT INTO ...
...and the outer parameters are not useful to repeat.
What if we could do this:
INSERT INTO foo (a, b, c)
VALUES sqlc.values(@inserts) (@a, @b, @c)
This would be akin to a partial function that, for each @inserts
record, evaluated the right-hand side, adding a comma between each set. I haven't looked at the parser code and whether it would support such a thing.
Beta Was this translation helpful? Give feedback.
All reactions
-
This method of using unnest()
looks like a nice workaround until this is supported, and has the benefit of seemingly being faster.
Beta Was this translation helpful? Give feedback.
All reactions
-
I spoke too soon. There is a bug preventing unnest()
from working: #3507.
Beta Was this translation helpful? Give feedback.
All reactions
-
Also, unnest()
does not support preserving nested arrays, so it will not work for inserts into array columns.
Beta Was this translation helpful? Give feedback.