I'm running concurrent Postgres queries like this:
UPDATE foo SET bar = bar + 1 WHERE baz = 1234
Each query affects a fixed number K of rows. I can't find a way to enforce the order in which the rows are updated, so I end up with deadlocks. Currently I fix the problem by enforcing the order by hand, but this means I have to execute many more queries than I normally would, while also raising the search complexity from O(log N + K) to O(K log N).
Is there a way to improve performance without ending up vulnerable to deadlocks? I suspect that replacing the (baz) index with a (baz, id) index might work, provided that Postgres updates the rows in the same order in which it has scanned them. Is this an approach worth pursuing?
1 Answer
There is no ORDER BY in an SQL UPDATE command. Postgres updates rows in arbitrary order.
To avoid deadlocks with absolute certainty, you could run your statements in serializable transaction isolation. But that's more expensive, and you must be prepared to retry transactions after a serialization failure.
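As a rough sketch of what the serializable variant could look like (the retry itself has to live in the client application, which is assumed here and not shown):

```sql
-- Sketch: run the update under SERIALIZABLE isolation.
BEGIN ISOLATION LEVEL SERIALIZABLE;

UPDATE foo
SET    bar = bar + 1
WHERE  baz = 1234;

COMMIT;

-- If any statement (including COMMIT) fails with SQLSTATE 40001
-- (serialization_failure), issue ROLLBACK and re-run the whole
-- transaction from BEGIN.
```

The extra cost comes from both the additional bookkeeping in Postgres and the work thrown away on each retry.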
Your best course of action is probably to lock explicitly with SELECT ... ORDER BY ... FOR UPDATE in a subquery, CTE, or a standalone SELECT in a transaction, in the default READ COMMITTED isolation level. Quoting Tom Lane on pgsql-general:
Should be all right --- the FOR UPDATE locking is always the last step in the SELECT pipeline.
This should do the job:
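BEGIN;
-- Lock all target rows in a deterministic order first
SELECT 1
FROM foo
WHERE baz = 1234
ORDER BY bar
FOR UPDATE;
-- The rows are already locked, so the update order no longer matters
UPDATE foo
SET bar = bar + 1
WHERE baz = 1234;
COMMIT;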
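The CTE form mentioned above could be sketched as a single statement. This assumes foo has a unique, never-updated id column (the question's proposed (baz, id) index suggests one, but its exact definition is an assumption here):

```sql
-- Sketch: lock the rows in id order inside a CTE, then update the locked rows.
WITH locked AS (
   SELECT id
   FROM   foo
   WHERE  baz = 1234
   ORDER  BY id      -- assumed unique, immutable column
   FOR    UPDATE
)
UPDATE foo f
SET    bar = f.bar + 1
FROM   locked l
WHERE  f.id = l.id;
```

Since every concurrent statement acquires its row locks in the same id order, two of them should not end up waiting on each other in a cycle. A single statement also saves a round trip compared to the explicit transaction.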
A multicolumn index on (baz, bar) might be perfect for performance. But since bar is obviously updated a lot, a single-column index on just (baz) might be better overall. It depends on a couple of factors: How many rows per baz? Are HOT updates possible without the multicolumn index? ...
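To make the trade-off concrete, here is a sketch of the two index candidates and one way to check how many updates are HOT. The index names are made up for illustration:

```sql
-- Candidate 1: supports ORDER BY bar within one baz, but every update of bar
-- must also touch this index, which rules out HOT updates.
CREATE INDEX foo_baz_bar_idx ON foo (baz, bar);

-- Candidate 2: bar is not indexed, so updates of bar can be HOT
-- (provided no other index covers bar and the page has free space).
CREATE INDEX foo_baz_idx ON foo (baz);

-- Compare total updates with HOT updates for the table.
SELECT n_tup_upd, n_tup_hot_upd
FROM   pg_stat_user_tables
WHERE  relname = 'foo';
```

You would normally keep only one of the two indexes. Comparing n_tup_hot_upd with n_tup_upd under a realistic workload shows how much the multicolumn index costs in practice.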
If baz is updated concurrently, there is still a small chance of conflicts in a corner case. The manual:
It is possible for a SELECT command running at the READ COMMITTED transaction isolation level and using ORDER BY and a locking clause to return rows out of order. ...
Also, if you should have a unique constraint involving bar, consider a DEFERRABLE constraint to avoid unique violations within the same command.
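A minimal sketch of such a constraint, assuming for illustration that the unique constraint spans (baz, bar); the constraint name and column list are hypothetical:

```sql
-- Sketch: a deferrable unique constraint is checked at the end of the
-- statement instead of after each row.
ALTER TABLE foo
  ADD CONSTRAINT foo_baz_bar_uni UNIQUE (baz, bar)
  DEFERRABLE INITIALLY IMMEDIATE;
```

With a non-deferrable constraint, bar = bar + 1 can fail on a transient duplicate when one row is moved onto a value that another row is about to vacate in the same command.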
If I'm ordering by id or some other unique column instead of bar, there shouldn't be a corner case or a performance hit, right? – Aleksei Averchenko, Jun 17, 2014 at 12:46
@AlexeiAverchenko: Yes, a unique column that is never updated would be perfect for this, along with a multicolumn index including this column in second position. – Erwin Brandstetter, Jun 17, 2014 at 12:49