Assume I have a table named people, where id is a primary key:
+-----------+---------+---------+
| id        | fname   | lname   |
| (integer) | (text)  | (text)  |
+===========+=========+=========+
| 1         | Daniel  | Edwards |
| 2         | Fred    | Holt    |
| 3         | Henry   | Smith   |
+-----------+---------+---------+
I'm trying to write a row duplication query which is robust enough to account for schema changes to the table. Any time I add a column to the table, I don't want to have to go back and modify the duplication query.
I know I can do this, which will duplicate record id 2 and give the duplicated record a new id:
INSERT INTO people (fname, lname) SELECT fname, lname FROM people WHERE id = 2;
However, if I add an age column, I'll need to modify the query to also account for the age column.
Obviously I can't do the following, because it would also duplicate the primary key, resulting in a "duplicate key value violates unique constraint" error. And I don't want the two rows to share the same id anyway:
INSERT INTO people SELECT * FROM people WHERE id = 2
With that said, what would be a reasonable approach to solving this challenge? I would prefer to stay away from stored procedures, but I'm not 100% against them, I suppose ...
3 Answers
Simple with hstore
If you have the additional module hstore installed (instructions in link below), there is a surprisingly simple way to replace the value(s) of individual field(s) without knowing anything about other columns:
Basic example: duplicate the row with id = 2 but replace 2 with 3:
INSERT INTO people
SELECT (p #= hstore('id', '3')).* FROM people p WHERE id = 2;
Details:
- How to set value of composite variable field using dynamic SQL
- Assign to NEW by key in a Postgres trigger
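To see what the #= operator does before inserting anything, here is a minimal sketch. It assumes you are allowed to create the hstore extension in the current database, and it uses a throwaway temp table modeled on the question (the serial id column is my assumption).

```sql
-- Assumption: you have the privilege to install the extension.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Throwaway sample table modeled on the question (serial id is an assumption).
CREATE TEMP TABLE people (id serial PRIMARY KEY, fname text, lname text);
INSERT INTO people (fname, lname)
VALUES ('Daniel', 'Edwards'), ('Fred', 'Holt'), ('Henry', 'Smith');

-- p is the whole row; #= overwrites only the fields named in the hstore value.
-- Nothing is inserted here, the query just displays the modified row.
SELECT (p #= hstore('id', '4')).*
FROM   people p
WHERE  id = 2;
```

All columns other than id come back unchanged, which is why the same expression keeps working after columns are added to the table.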
Assuming (since it's not defined in the question) that people.id is a serial column with an attached sequence, you'll want the next value from the sequence. We can determine the sequence name with pg_get_serial_sequence().
Or you can just hard-code the sequence name if it's not going to change.
We would have this query:
INSERT INTO people
SELECT (p #= hstore('id', nextval(pg_get_serial_sequence('people', 'id'))::text)).*
FROM people p WHERE id = 2;
This works, but suffers from a weakness in the Postgres query planner: the expression is evaluated separately for every single column in the row, wasting sequence numbers and performance. To avoid this, move the expression into a subquery and decompose the row only once:
INSERT INTO people
SELECT (p1).*
FROM (
SELECT p #= hstore('id', nextval(pg_get_serial_sequence('people', 'id'))::text) AS p1
FROM people p WHERE id = 2
) sub;
Probably fastest for a single (or few) row(s) at once.
json / jsonb
If you don't have hstore installed and can't install additional modules, you can do a similar trick with json_populate_record() or jsonb_populate_record(). That capability was undocumented in older versions; it is documented since Postgres 13.
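A minimal sketch of the jsonb variant, following the same subquery pattern as the hstore query above. The sample table and its serial id are assumptions, and jsonb_build_object() requires Postgres 9.5 or later.

```sql
-- Throwaway sample table modeled on the question (serial id is an assumption).
CREATE TEMP TABLE people (id serial PRIMARY KEY, fname text, lname text);
INSERT INTO people (fname, lname)
VALUES ('Daniel', 'Edwards'), ('Fred', 'Holt'), ('Henry', 'Smith');

-- The row p serves as the base record; only the keys present in the
-- jsonb object ('id' here) are overwritten, all other columns are kept.
INSERT INTO people
SELECT (p1).*
FROM  (
   SELECT jsonb_populate_record(
             p
           , jsonb_build_object('id', nextval(pg_get_serial_sequence('people', 'id')))
          ) AS p1
   FROM   people p
   WHERE  id = 2
   ) sub;
```

The subquery is there for the same reason as in the hstore version: the row is built once and then decomposed.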
Transient temporary table
Another simple solution would be to use a transient temporary table like this:
BEGIN;
CREATE TEMP TABLE people_tmp ON COMMIT DROP AS
SELECT * FROM people WHERE id = 2;
UPDATE people_tmp SET id = nextval(pg_get_serial_sequence('people', 'id'));
INSERT INTO people TABLE people_tmp;
COMMIT;
I added ON COMMIT DROP to drop the table automatically at the end of the transaction. Consequently, I also wrapped the operation into a transaction of its own. Neither is strictly necessary.
This offers a wide range of additional options - you can do anything with the row before inserting, but it's going to be a bit slower due to the overhead of creating and dropping a temp table.
This solution works for a single row or for any number of rows at once. Each row gets a new default value from the sequence automatically.
The INSERT uses the short (SQL standard) notation TABLE people_tmp, which is equivalent to SELECT * FROM people_tmp.
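As a sketch of the "any number of rows" and "do anything with the row" points, the following variant copies two rows at once and also adjusts another column before inserting. The sample table and the ' (copy)' suffix are purely illustrative assumptions.

```sql
-- Throwaway sample table modeled on the question (serial id is an assumption).
CREATE TEMP TABLE people (id serial PRIMARY KEY, fname text, lname text);
INSERT INTO people (fname, lname)
VALUES ('Daniel', 'Edwards'), ('Fred', 'Holt'), ('Henry', 'Smith');

BEGIN;

CREATE TEMP TABLE people_tmp ON COMMIT DROP AS
SELECT * FROM people WHERE id IN (1, 3);      -- copy several rows at once

UPDATE people_tmp
SET    id    = nextval(pg_get_serial_sequence('people', 'id'))  -- evaluated once per row
     , lname = lname || ' (copy)';                              -- any other adjustment

INSERT INTO people TABLE people_tmp;

COMMIT;
```

Only the columns you want to change are named; everything else travels through SELECT * and TABLE untouched.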
Dynamic SQL
For many rows at once, dynamic SQL is going to be fastest. Concatenate the columns from the system table pg_attribute or from the information schema and execute it dynamically in a DO statement or write a function for repeated use:
CREATE OR REPLACE FUNCTION f_row_copy(_tbl regclass, _id int, OUT row_ct int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE (
      SELECT format('INSERT INTO %1$s(%2$s) SELECT %2$s FROM %1$s WHERE id = $1',
                    _tbl, string_agg(quote_ident(attname), ', '))
      FROM   pg_attribute
      WHERE  attrelid = _tbl
      AND    NOT attisdropped  -- no dropped (dead) columns
      AND    attnum > 0        -- no system columns
      AND    attname <> 'id'   -- exclude id column
      )
   USING _id;                  -- passed as $1 to the dynamic statement
   GET DIAGNOSTICS row_ct = ROW_COUNT;  -- directly assign OUT parameter
END
$func$;
Call:
SELECT f_row_copy('people', 9);
Works for any table with an integer column named id. You could easily make the column name dynamic, too ...
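A possible sketch of that variation, with the key column passed as a parameter. The function name f_row_copy_col and the parameter _col are hypothetical, the key column is still assumed to be an integer with a default that generates new values, and the sample table is only there to make the snippet runnable.

```sql
-- Throwaway sample table modeled on the question (serial id is an assumption).
CREATE TEMP TABLE people (id serial PRIMARY KEY, fname text, lname text);
INSERT INTO people (fname, lname)
VALUES ('Daniel', 'Edwards'), ('Fred', 'Holt'), ('Henry', 'Smith');

-- Hypothetical variant of f_row_copy(): the key column name is a parameter.
CREATE OR REPLACE FUNCTION f_row_copy_col(_tbl regclass, _col name, _id int, OUT row_ct int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE (
      SELECT format('INSERT INTO %1$s(%2$s) SELECT %2$s FROM %1$s WHERE %3$I = $1',
                    _tbl, string_agg(quote_ident(attname), ', '), _col)
      FROM   pg_attribute
      WHERE  attrelid = _tbl
      AND    NOT attisdropped  -- no dropped (dead) columns
      AND    attnum > 0        -- no system columns
      AND    attname <> _col   -- exclude the key column
      )
   USING _id;                  -- passed as $1 to the dynamic statement
   GET DIAGNOSTICS row_ct = ROW_COUNT;
END
$func$;

SELECT f_row_copy_col('people', 'id', 2);
```

The %I specifier quotes the column name as an identifier, so the function stays safe against SQL injection through _col.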
Maybe not your first choice since you wanted to stay away from stored procedures, but then again, it's not a "stored procedure" anyway ...
Advanced solution
A serial column is a special case. If you want to fill more or all columns with their respective default values, it gets more sophisticated.
Try to create a trigger on insert:
CREATE TRIGGER name BEFORE INSERT
In this trigger you make the ID NULL. When the trigger is finished the insert is done and Postgres will provide an ID. I assume that you have defined the ID as DEFAULT NEXTVAL('A_SEQUENCE'::REGCLASS).
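One caveat, as far as I know: column defaults are applied before BEFORE ROW triggers fire, so setting NEW.id to NULL would not by itself pull a value from the sequence and would instead hit the NOT NULL part of the primary key. A sketch that assigns the next sequence value explicitly is below. The function and trigger names are hypothetical, and the sample table with a serial id is an assumption.

```sql
-- Throwaway sample table modeled on the question (serial id is an assumption).
CREATE TEMP TABLE people (id serial PRIMARY KEY, fname text, lname text);
INSERT INTO people (fname, lname)
VALUES ('Daniel', 'Edwards'), ('Fred', 'Holt'), ('Henry', 'Smith');

-- Hypothetical trigger function: always overwrite the supplied id.
CREATE OR REPLACE FUNCTION people_new_id()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   NEW.id := nextval(pg_get_serial_sequence('people', 'id'));
   RETURN NEW;
END
$func$;

CREATE TRIGGER people_new_id
BEFORE INSERT ON people
FOR EACH ROW EXECUTE PROCEDURE people_new_id();

-- With the trigger in place, the naive duplication no longer collides:
INSERT INTO people SELECT * FROM people WHERE id = 2;
```

Note that this overrides every id supplied on insert, not only duplicated ones, which is the trade-off discussed in the comments.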
- This will work, but it's quite a "sneaky" solution that I think would end up causing problems in the long run. I, personally, would avoid doing this if at all possible. If he DOES choose to do this, I hope he would FIRST do a SELECT inside the trigger to see if the supplied ID existed; if it did, then, and only then, set NEW.id to NULL. – Joishi Bodio, Nov 25, 2015 at 19:16
- He can do that, but if you choose to use NEXTVAL('A_SEQUENCE'::REGCLASS), you never provide an ID yourself for a new entry. – Marco, Nov 25, 2015 at 20:24
- That depends on how code and/or external libraries in your code will use the database. Some may manually query the SEQ.NEXTVAL and then send the generated ID in an INSERT statement. I just wouldn't trust that the table/sequence/trigger trio would end up behaving "as expected" all the time. Thus my first comment. – Joishi Bodio, Nov 25, 2015 at 21:04
The dynamic SQL approach works great; I had been looking for this for a few years.
If you have more than one excluded column, simply add another condition:
AND attname <> 'id' -- exclude id column
AND attname <> 'second_col_name' -- exclude second_col_name
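If the list of excluded columns grows, a single array comparison may read better than a chain of conditions. A small sketch that only builds the column list; lname stands in for second_col_name here, and the sample table is an assumption:

```sql
-- Throwaway sample table modeled on the question (serial id is an assumption).
CREATE TEMP TABLE people (id serial PRIMARY KEY, fname text, lname text);

-- Same catalog lookup as in the function, with all excluded columns in one array.
SELECT string_agg(quote_ident(attname), ', ' ORDER BY attnum) AS column_list
FROM   pg_attribute
WHERE  attrelid = 'people'::regclass
AND    NOT attisdropped                        -- no dropped (dead) columns
AND    attnum > 0                              -- no system columns
AND    attname <> ALL ('{id,lname}'::name[]);  -- exclude every listed column
```

The array could also be passed into the function as an extra parameter instead of being hard-coded.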
age is kind of an anti-pattern for a column. (One should rather store the birthday.)