Let's say we have two tables user and book.

At the application level, to ensure data integrity and avoid race conditions, we lock entities at the beginning of a transaction.

Question 1:

Let's say on application level we have two requests running at the same time:

Request 1 (app level):

start trx
lock for update user with id 1
lock for update book with id 2
do changes
commit

Request 2 (app level):

start trx
lock for update book with id 2
lock for update user with id 1
do changes
commit

It can happen that both requests acquire their first lock before either acquires its second: request 1 locks user 1 and request 2 locks book 2, which results in a deadlock.

How can we avoid this? My only idea is to write down somewhere a lock order for all tables, like user -> book, and ensure we follow it.

Question 2:

If we run two queries like the following in different requests to lock entities before mutation:

  • Request 1: SELECT * FROM user WHERE id IN (1, 2) FOR UPDATE
  • Request 2: SELECT * FROM user WHERE id IN (2, 1) FOR UPDATE

Can this result in a deadlock? For example, if request 1's query for some reason locks user 1 first, then execution switches to request 2, and only after that does request 1 come back to lock user 2.

asked Apr 6, 2024 at 11:51

3 Answers


Question 1

...

How can we avoid this? My only idea is to write down somewhere a lock order for all tables, like user -> book, and ensure we follow it.

You obviously already understand the essence of how the deadlock arises. In the most minimal example, two threads each acquire and hold one lock, and each needs to acquire another lock in order to proceed, but the lock each needs is the one already held by the other.

This situation, where it is unexpected, arises essentially as the result of a design flaw by the programmer. Database engines detect (at runtime) the deadlocks these defects cause and resolve them by terminating one of the threads and releasing all its locks, but engines do not statically analyse code to discern the possibility of deadlock between two threads and act to pre-empt them, and they do not handle deadlocks in any intelligent way besides permanently aborting a thread and all its uncommitted work.

More broadly, it is one of the shortfalls in the principle of "isolation" between concurrent algorithms, which SQL engines are (in the popular understanding) supposed to provide.

Nevertheless, the deadlock monitoring that database engines perform is better than what came before, where an entire computer system would simply hang permanently when it encountered one of these concurrent programming defects.

As you already note, one way of achieving a correct design is to ensure that locks are never taken in an order which may cause the conflict. This might involve an order in which tables are locked, but it potentially also involves an order in which rows are locked within the same table. For example, a standard pattern is to update a table of account balances in an order always going from low to high account numbers - because if you go both ways, you will eventually deadlock even though the locks are purely within the scope of a single table.
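
For instance, here is a minimal sketch of that pattern, assuming a hypothetical account table with integer id and numeric balance columns. Every transaction touches the rows in ascending id order, so no transaction can end up waiting for a lock held by another that is waiting on it in turn:

    -- Both transactions lock account rows in ascending id order,
    -- so neither can hold a lock the other needs next.
    BEGIN;
    SELECT * FROM account WHERE id IN (1, 2) ORDER BY id FOR UPDATE;
    UPDATE account SET balance = balance - 100 WHERE id = 1;
    UPDATE account SET balance = balance + 100 WHERE id = 2;
    COMMIT;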

In practice, because SQL engines decide query plans for themselves (by default), including the locking necessary and the ordering of lock placement, and because it is not possible to specify all orderings in a straightforward and explicit way using SQL, then it is often far easier to identify the necessary orderings than to actually make the engine obey them.

Also, whilst the necessary analysis is simple enough when it concerns a limited number of concurrent algorithms on a limited number of tables, it becomes a serious effort to perform the analysis and determine an appropriate locking order for the entire database of a non-trivial application.

Another strategy is to coarsen the granularity of locks, so that the possibility of a conflicting order is designed out. For example, instead of row-locking (in the example I give about updating account balances), escalate to table-locking so that only one table lock is necessary to cover all rows being updated, and a thread either holds that sole lock or it doesn't.
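
A minimal sketch of that escalation in PostgreSQL syntax, again using the hypothetical account table: the single LOCK TABLE statement replaces the individual row locks, so there is no ordering to get wrong (at the cost of concurrency):

    -- One coarse table-level lock up front instead of many row locks;
    -- a second transaction simply waits for this one to commit.
    BEGIN;
    LOCK TABLE account IN EXCLUSIVE MODE;
    UPDATE account SET balance = balance - 100 WHERE id = 2;
    UPDATE account SET balance = balance + 100 WHERE id = 1;
    COMMIT;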

(EDIT: @LaurenzAlbe from the comments points out that frequent high-level table locks can interrupt autovacuum in Postgres, which can lead to problems like table bloat in the long run.

Like anything with databases, it shouldn't be assumed that any solution to one problem doesn't cause constraints or problems elsewhere.

A deadlock-free approach to algorithms is not necessarily the most performant approach on average, under all circumstances, as I already obliquely implied at the end of this answer.)

A third strategy is to implement some kind of custom serialisation mechanism, which does not directly make the locking of multiple algorithms compatible when they are executed concurrently, but instead simply prevents a concurrent schedule of execution from arising at all between two algorithms that are otherwise prone to deadlock.
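
In PostgreSQL, one way to sketch such a mechanism is with an advisory lock that all of the deadlock-prone code paths take first; the key 42 below is an arbitrary, application-chosen identifier:

    -- Every transaction prone to this deadlock takes the same advisory
    -- lock first, so only one of them runs its critical section at a
    -- time; the lock is released automatically at commit or rollback.
    BEGIN;
    SELECT pg_advisory_xact_lock(42);
    -- ... lock and modify user and book rows in any order ...
    COMMIT;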

And finally, the risk of deadlocks can be accepted and handled. The deadlock victims which might be aborted can be placed in a priority order to determine which gets aborted and which has priority to proceed, and additional logic can be added to algorithms to cause the work to be retried in the event of being the victim of a deadlock resolution.

Sometimes, even in a system that is designed with a properly-analysed lock order, it may be necessary (for simple performance reasons) to go against the order and therefore to risk deadlock.

A system that was originally designed with a locking order might also be changed in a way that requires that order to be revised. Instead of engaging in the complicated analysis and rework needed to change an existing order around which a large database application has already been written, a new algorithm may be designed to follow the order on a best-effort basis only, and to simply retry on deadlock.

answered Apr 6, 2024 at 15:22
  • @LaurenzAlbe, no, not AI generated (second time someone has said this recently!), just a complicated topic that requires a certain amount of words. And yes, certainly taking table-level locks could have other implications on performance, but the OP asked for ways to avoid deadlocks, and I could hardly avoid mentioning the possibility of changing the lock granularity. Commented Apr 8, 2024 at 7:55
  • Sorry about the wrong suspicion. Commented Apr 8, 2024 at 8:25

Ad question 1:

The two transactions will deadlock if run concurrently. Locking the rows in the same order in both transactions will always avoid the deadlock.

Ad question 2:

The two queries can deadlock just like in the previous case. Rows are locked in the order in which they are searched for and found by the statement. That order depends on the execution plan, which you can check with EXPLAIN.
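
One common way to make that order deterministic, assuming PostgreSQL and the user table from the question, is to add an explicit ORDER BY so that, in the typical plan, both requests lock the rows in ascending id order:

    -- "user" is quoted because it is a reserved word in PostgreSQL.
    -- With ORDER BY, the usual plan locks the rows in ascending id
    -- order in both requests, avoiding the deadlock.
    SELECT * FROM "user" WHERE id IN (1, 2) ORDER BY id FOR UPDATE;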

answered Apr 8, 2024 at 8:09

Question 1 could deadlock if the first statement of each transaction has already acquired its row level lock in the first table before needing the second lock.

One of the transactions will eventually be selected to roll back, allowing the other to complete.

The solution as you noted would be to acquire the locks in a consistent order.

For Question 2: both queries should get an ACCESS SHARE lock to read the table and should not block each other, as they are reading the table and not changing data. One ACCESS SHARE lock does not conflict with another.

That lock would conflict only with an ACCESS EXCLUSIVE lock, I believe (or some form of SELECT FOR UPDATE).

They are also each a single statement. One may wait for the other to complete, but there is no reason to deadlock: neither is holding an exclusive lock while waiting on another exclusively held lock.

A deadlock occurs when two transactions wish to acquire conflicting locks on the same objects in an order that cannot be resolved by waiting for the other to complete.

For Reference: https://www.postgresql.org/docs/current/explicit-locking.html

answered Apr 6, 2024 at 13:26
  • Updated question 2. I meant that if I need to update multiple entities in one table and lock them before mutation in one query, can a deadlock happen if another request also locks the same entities? My question is - is it safe, or do I need to do select * from user where id in (1,2) for update ASC, or can a deadlock still happen no matter what I do? Commented Apr 7, 2024 at 0:11
