I have a unique index on a column that I use for UPSERT. When I try to update that column with the expression `v = v + 1`, the unique index breaks:
```sql
CREATE TABLE test(v bigint, data jsonb DEFAULT '{}'::jsonb);
INSERT INTO test(v) SELECT vv FROM generate_series(0, 10000) AS vv;
CREATE UNIQUE INDEX uniq_ind ON test(v);
UPDATE test SET v = v + 1;
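The last statement fails with an error along these lines (the exact key value depends on the physical row order):

```
ERROR:  duplicate key value violates unique constraint "uniq_ind"
DETAIL:  Key (v)=(1) already exists.
```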
What I've tried:
- Using a deferred constraint, but it doesn't work with UPSERT.
- Using `CLUSTER` to order rows on disk, so the update runs in a specific order and doesn't break the index. The problem is I'd have to call `CLUSTER` before each query, which is very expensive.
- Implementing UPSERT by hand, which seems complicated and non-performant (the Postgres wiki agrees with me: https://wiki.postgresql.org/wiki/UPSERT#PostgreSQL_.28today.29).
- Using a multicolumn unique index on `(v, flag)`. For this I need to add a `flag` column and replace the index:

```sql
ALTER TABLE test ADD COLUMN flag bool DEFAULT false;
CREATE UNIQUE INDEX uniq_ind ON test(v, flag);
```
Then UPDATE and UPSERT look like:

```sql
-- UPDATE
UPDATE test SET v = v + 1, flag = true;
UPDATE test SET flag = false;

-- UPSERT
INSERT INTO test(v) VALUES (123)
ON CONFLICT (v, flag) DO UPDATE SET v = EXCLUDED.v;
```
But it costs twice as much as a simple update. For now it's the most suitable solution.
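For reference, the deferred-constraint variant I tried could look like this (a sketch; the constraint name is just illustrative). It fixes the plain UPDATE, because uniqueness is then checked only at commit, but PostgreSQL does not accept a deferrable unique constraint as an `ON CONFLICT` arbiter, so it doesn't help with UPSERT:

```sql
ALTER TABLE test ADD CONSTRAINT test_v_uniq UNIQUE (v)
    DEFERRABLE INITIALLY IMMEDIATE;

BEGIN;
SET CONSTRAINTS test_v_uniq DEFERRED;
UPDATE test SET v = v + 1;  -- uniqueness checked only at COMMIT
COMMIT;
```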
What alternatives are there for the UPDATE case or the UPSERT case, so I can:
- Efficiently UPSERT rows (this operation is dominant).
- Update many records with expressions like `v = v + 1`.
Scale is around 1k-10k rows per update, and around 1m-10m records in the table.
1 Answer
I second Laurenz' comment: typically it's best to avoid such an update on a UNIQUE column to begin with.
If that's not possible, one workaround would be to order rows in a subquery and self-join:
```sql
UPDATE test t
SET    v = t.v + 1
FROM  (SELECT * FROM test ORDER BY v DESC) t_ordered  -- additional WHERE clauses?
WHERE  t_ordered.v = t.v;
```
db<>fiddle here
This usually works. But no rows are locked in the subquery, so the command is not bullet-proof against concurrent writes. If you want that, you'll have to write-lock the whole table, or use `SERIALIZABLE` transaction isolation. Either is expensive for concurrent access.
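To illustrate the table-lock variant (a sketch; `EXCLUSIVE` mode blocks concurrent writers until commit while still allowing plain reads):

```sql
BEGIN;
LOCK TABLE test IN EXCLUSIVE MODE;  -- blocks concurrent writers until COMMIT

UPDATE test t
SET    v = t.v + 1
FROM  (SELECT v FROM test ORDER BY v DESC) t_ordered
WHERE  t_ordered.v = t.v;
COMMIT;
```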
Also, you mentioned:

> 1k-10k rows per update. And around 1m-10m records in the table.
So the UPDATE can still conflict with rows that are not updated.
To update the whole table, consider dropping the UNIQUE constraint (or index) before the update and recreating it after, in the same transaction. That takes an exclusive write-lock, of course. But that seems OK while updating the whole table. Recreating the index is cheaper than incrementally updating all rows anyway, and you get a pristine (de-bloated, reindexed) unique index as a side effect.
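A sketch of that approach, using the index from the question:

```sql
BEGIN;
DROP INDEX uniq_ind;                      -- takes an exclusive lock on test
UPDATE test SET v = v + 1;                -- no per-row uniqueness checks now
CREATE UNIQUE INDEX uniq_ind ON test(v);  -- rebuilt from scratch, no bloat
COMMIT;
```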
Conflicts with FK constraints pointing to the UNIQUE column, though ...
- I've tried to simplify the example. Actually I use `row` and `col` for storing data for Excel-like table cells, so the real structure is something like `cells(row int, col int, sheet_id uuid, data text)`, with a UNIQUE index on `(col, row, sheet_id)`. Also I lazily write cells, so I am using UPSERT for this. – Olleggerr, Sep 14, 2021 at 11:49