Duplicate row with Primary Key in PostgreSQL

Question 1

Assume I have a table as follows named people, where id is a Primary Key:

+-----------+---------+---------+
| id        | fname   | lname   |
| (integer) | (text)  | (text)  |
+===========+=========+=========+
| 1         | Daniel  | Edwards |
| 2         | Fred    | Holt    |
| 3         | Henry   | Smith   |
+-----------+---------+---------+

I'm trying to write a row duplication query which is robust enough to account for schema changes to the table. Any time I add a column to the table, I don't want to have to go back and modify the duplication query.

I know I can do this, which will duplicate record id 2 and give the duplicated record a new id:

INSERT INTO people (fname, lname) SELECT fname, lname FROM people WHERE id = 2;

However if I add an age column, I'll need to modify the query to also account for the age column.

Obviously I can't do the following, because it would also copy the primary key and fail with a "duplicate key value violates unique constraint" error. Besides, I don't want the two rows to share the same id anyway:

INSERT INTO people SELECT * FROM people WHERE id = 2;

With that said, what would be a reasonable approach to solving this challenge? I would prefer to stay away from stored procedures, but I'm not 100% against them, I suppose ...

Comment

Aside: maybe you should use a different example, since age is kind of an anti-pattern for a column. (One should rather store the birthday.)

Answer

Simple with `hstore`

If you have the additional module hstore installed (instructions in link below), there is a surprisingly simple way to replace the value(s) of individual field(s) without knowing anything about other columns:

Basic example: duplicate the row with id = 2 but replace 2 with 3:

INSERT INTO people
SELECT (p #= hstore('id', '3')).* FROM people p WHERE id = 2;
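As a minimal sketch of what the basic example relies on: the `#=` operator from the hstore module replaces fields of a record with the matching values from an hstore. The snippet below assumes the sample `people` table from the question and that you are allowed to create extensions in the database. Previewing the result with a plain SELECT before inserting is just a suggestion, not part of the original answer.

```sql
-- Install the module once per database (needs sufficient privileges).
CREATE EXTENSION IF NOT EXISTS hstore;

-- Preview the modified row without inserting anything.
-- Only the field named in the hstore ('id') is replaced;
-- all other columns are carried over unchanged.
-- With the sample table this returns one row: 3 | Fred | Holt
SELECT (p #= hstore('id', '3')).*
FROM   people p
WHERE  id = 2;
```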

Details:

Assuming (since it's not defined in the question) that people.id is a serial column with an attached sequence, you'll want the next value from the sequence. We can determine the sequence name with pg_get_serial_sequence(). Details:

PostgreSQL SELECT primary key as "serial" or "bigserial"

Or you can just hard-code the sequence name if it's not going to change.
We would have this query (struck out in the original answer):

INSERT INTO people SELECT (p #= hstore('id', nextval(pg_get_serial_sequence('people', 'id'))::text)).* FROM people p WHERE id = 2;

Which works, but suffers from a weakness in the Postgres query planner: the expression is evaluated separately for every single column in the row, wasting sequence numbers and performance. To avoid this, move the expression into a subquery and decompose the row only once:

INSERT INTO people
SELECT (p1).*
FROM  (
   SELECT p #= hstore('id', nextval(pg_get_serial_sequence('people', 'id'))::text) AS p1
   FROM   people p
   WHERE  id = 2
   ) sub;

Probably fastest for a single (or few) row(s) at once.
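The remark about hard-coding the sequence name could look roughly like this. The name `people_id_seq` is an assumption (it is what a serial column named `id` on `people` typically gets); check the real name with the first statement before relying on it.

```sql
-- Look up the sequence that feeds people.id.
-- Returns NULL if no sequence is owned by that column.
SELECT pg_get_serial_sequence('people', 'id');

-- Same duplication query with the sequence name hard-coded.
-- ASSUMPTION: the sequence is called people_id_seq.
INSERT INTO people
SELECT (p1).*
FROM  (
   SELECT p #= hstore('id', nextval('people_id_seq')::text) AS p1
   FROM   people p
   WHERE  id = 2
   ) sub;
```

Hard-coding saves a function call per row but breaks silently if the sequence is ever renamed or replaced, so it only makes sense where the schema is stable.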

json / jsonb

If you don't have hstore installed and can't install additional modules, you can do a similar trick with json_populate_record() or jsonb_populate_record(). (An earlier version of this answer warned that the capability was undocumented and might be unreliable; that warning has since been struck out.) Update: the feature is documented since Postgres 13. See:

How to set value of composite variable field using dynamic SQL
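A sketch of the jsonb variant, assuming the same `people` table with a serial `id`. It uses the row itself as the base record for jsonb_populate_record() and overrides only `id`. This is an illustration modeled on the hstore query in this answer, not code taken from the linked post.

```sql
INSERT INTO people
SELECT (p1).*
FROM  (
   -- Fields present in the JSON object replace the matching fields of p;
   -- all other fields are taken from the base record p.
   SELECT jsonb_populate_record(
             p
           , jsonb_build_object('id', nextval(pg_get_serial_sequence('people', 'id')))
          ) AS p1
   FROM   people p
   WHERE  id = 2
   ) sub;
```

The subquery serves the same purpose as in the hstore version: the row is built once and only then decomposed with `(p1).*`, so nextval() is not called once per column.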

Transient temporary table

Another simple solution would be to use a transient temporary table like this:

BEGIN;
CREATE TEMP TABLE people_tmp ON COMMIT DROP AS
SELECT * FROM people WHERE id = 2;
UPDATE people_tmp SET id = nextval(pg_get_serial_sequence('people', 'id'));
INSERT INTO people TABLE people_tmp;
COMMIT;

I added ON COMMIT DROP to drop the table automatically at the end of the transaction. Consequently, I also wrapped the operation into a transaction of its own. Neither is strictly necessary.

This offers a wide range of additional options - you can do anything with the row before inserting, but it's going to be a bit slower due to the overhead of creating and dropping a temp table.

This solution works for a single row or for any number of rows at once. Each row gets a new default value from the sequence automatically.

The INSERT uses the short (SQL standard) notation TABLE people_tmp, which is equivalent to SELECT * FROM people_tmp.
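To illustrate the multi-row case and the "do anything with the row before inserting" point, here is a sketch that copies two rows at once and tweaks a column along the way. The ids 1 and 3 come from the sample table in the question; the ' (copy)' suffix is purely illustrative.

```sql
BEGIN;

-- Stage the rows to duplicate; new columns on people are picked up automatically.
CREATE TEMP TABLE people_tmp ON COMMIT DROP AS
SELECT * FROM people WHERE id IN (1, 3);

-- nextval() is volatile, so it is evaluated once per staged row.
UPDATE people_tmp
SET    id    = nextval(pg_get_serial_sequence('people', 'id'))
     , lname = lname || ' (copy)';   -- optional edit before inserting

INSERT INTO people TABLE people_tmp;

COMMIT;
```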

Dynamic SQL

For many rows at once, dynamic SQL is going to be fastest. Concatenate the columns from the system table pg_attribute or from the information schema and execute it dynamically in a DO statement or write a function for repeated use:

CREATE OR REPLACE FUNCTION f_row_copy(_tbl regclass, _id int, OUT row_ct int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE (
      SELECT format('INSERT INTO %1$s(%2$s) SELECT %2$s FROM %1$s WHERE id = $1',
                    _tbl, string_agg(quote_ident(attname), ', '))
      FROM   pg_attribute
      WHERE  attrelid = _tbl
      AND    NOT attisdropped  -- no dropped (dead) columns
      AND    attnum > 0        -- no system columns
      AND    attname <> 'id'   -- exclude id column
      )
   USING _id;
   GET DIAGNOSTICS row_ct = ROW_COUNT;  -- directly assign OUT parameter
END
$func$;

Call:

SELECT f_row_copy('people', 9);

Works for any table with an integer column named id. You could easily make the column name dynamic, too ...
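One way the column name might be made dynamic is sketched below. The function name `f_row_copy_by` and the parameter `_col` are hypothetical, and the key column is still assumed to be an integer filled by a column default.

```sql
CREATE OR REPLACE FUNCTION f_row_copy_by(_tbl regclass, _col name, _id int, OUT row_ct int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE (
      -- %3$I quotes the key column as an identifier; $1 stays a bind parameter.
      SELECT format('INSERT INTO %1$s(%2$s) SELECT %2$s FROM %1$s WHERE %3$I = $1',
                    _tbl, string_agg(quote_ident(attname), ', '), _col)
      FROM   pg_attribute
      WHERE  attrelid = _tbl
      AND    NOT attisdropped  -- no dropped (dead) columns
      AND    attnum > 0        -- no system columns
      AND    attname <> _col   -- exclude the key column
      )
   USING _id;
   GET DIAGNOSTICS row_ct = ROW_COUNT;
END
$func$;

-- Usage: duplicate the row of people whose id is 2.
SELECT f_row_copy_by('people', 'id', 2);
```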

Maybe not your first choice since you wanted to stay away from stored procedures, but then again, it's not a "stored procedure" anyway ...

A serial column is a special case. If you want to fill more or all columns with their respective default values, it gets more sophisticated. Consider this related answer:

Generate DEFAULT values in a CTE UPSERT using PostgreSQL 9.3

Answer

Try to create a trigger on insert:

CREATE TRIGGER name BEFORE INSERT

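In this trigger, you discard the supplied ID (the suggestion here is to set it to NULL) so that the new row gets a fresh one once the insert goes through. Be aware that PostgreSQL applies column defaults before BEFORE triggers fire, so setting NEW.id to NULL does not bring the default back; the trigger has to assign the next sequence value itself. I assume that you have defined the ID as DEFAULT NEXTVAL('A_SEQUENCE'::REGCLASS).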
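A rough sketch of such a trigger for the `people` table follows. The function and trigger names are made up for illustration. The sketch assigns nextval() explicitly, because a column default is not re-applied after a BEFORE trigger sets the column to NULL.

```sql
-- Hypothetical trigger function: always replace the incoming id with a new one.
CREATE OR REPLACE FUNCTION people_force_new_id()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   NEW.id := nextval(pg_get_serial_sequence('people', 'id'));
   RETURN NEW;
END
$func$;

-- EXECUTE FUNCTION needs Postgres 11 or later; older versions use EXECUTE PROCEDURE.
CREATE TRIGGER people_force_new_id
BEFORE INSERT ON people
FOR EACH ROW EXECUTE FUNCTION people_force_new_id();

-- With the trigger in place, the query from the question no longer collides:
INSERT INTO people SELECT * FROM people WHERE id = 2;
```

Note that this overrides every id supplied on insert, including ones the application sets deliberately.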

Comment

This will work, but it's quite a "sneaky" solution that I think would end up causing problems in the long run. Personally, I would avoid doing this if at all possible. If the asker does choose to do this, I hope they would first do a SELECT inside the trigger to see whether the supplied ID already exists, and set NEW.id to NULL only if it does.

Comment

They can do that, but if you choose to use NEXTVAL('A_SEQUENCE'::REGCLASS), you never provide an ID yourself for a new entry.

Comment

That depends on how your code and/or external libraries will use the database. Some may manually query SEQ.NEXTVAL and then send the generated ID in an INSERT statement. I just wouldn't trust the table/sequence/trigger trio to behave "as expected" all the time. Hence my first comment.

Answer

The dynamic SQL works great; I had been looking for something like this for a few years.

If you have more than one column to exclude, simply add a condition for each:

AND attname <> 'id' -- exclude id column
AND attname <> 'second_col_name' -- exclude second_col_name
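As the list of excluded columns grows, a single array comparison may be easier to maintain than a chain of AND conditions. The standalone query below is a sketch of that idea. It only builds the column list, which is handy for checking what the dynamic statement would copy. `second_col_name` is the placeholder from above, not a real column of the sample table.

```sql
-- Build the list of columns to copy, skipping every name in the array.
SELECT string_agg(quote_ident(attname), ', ' ORDER BY attnum) AS column_list
FROM   pg_attribute
WHERE  attrelid = 'people'::regclass
AND    NOT attisdropped                                   -- no dropped (dead) columns
AND    attnum > 0                                         -- no system columns
AND    attname <> ALL ('{id,second_col_name}'::name[]);   -- exclude listed columns
```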

score 23 · Accepted Answer · 2015-11-26 03:58:26Z (this is the answer above that begins with "Simple with `hstore`")
