Suppose there are two tables, t1 and t2. t1 has a boolean column ct1.
There are two scenarios:
- If ct1 is false, create new entries in t2 and make ct1 true
- If ct1 is true, just return
How can this scenario be handled in Postgres when several such queries run concurrently?
The possible race condition is: the first query sees ct1 as false and then creates entries in t2; the second query also sees ct1 as false before the first query can set it to true.
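The interleaving described above can be sketched as two sessions running a naive check-then-act sequence. This is only an illustration of the problem, not a recommended pattern. The id column in t1, the columns t1_id and col1 in t2, the value 1 and the absence of a unique constraint on t2 are all assumptions made for the example.

```sql
-- Assumed starting state: t1 has a row with id = 1 and ct1 = false.

-- Session A
BEGIN;
SELECT ct1 FROM t1 WHERE id = 1;                 -- sees false

-- Session B (concurrently)
BEGIN;
SELECT ct1 FROM t1 WHERE id = 1;                 -- also sees false

-- Session A
INSERT INTO t2 (t1_id, col1) VALUES (1, 'foo');
UPDATE t1 SET ct1 = true WHERE id = 1;
COMMIT;

-- Session B: still acting on the stale value it read earlier
INSERT INTO t2 (t1_id, col1) VALUES (1, 'foo');  -- second, unwanted set of entries
UPDATE t1 SET ct1 = true WHERE id = 1;
COMMIT;
```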
1 Answer
An UPDATE takes a write lock on the row automatically, which prevents concurrent transactions from doing the same until the lock is released (that is, until your transaction has finished).
So this should do the trick:
WITH upd AS (
   UPDATE t1
   SET    ct1 = true
   WHERE  t1.id = $t1_id  -- your input here
   AND    ct1 = false
   RETURNING t1.id AS t1_id  -- t1 has a column id, not t1_id
   )
INSERT INTO t2(t1_id, col1)
SELECT t1_id, 'foo'  -- or your input for t2 here
FROM   upd;  -- only if UPDATE found a row
This assumes a PK t1.id to allow multiple rows in t1. Your example makes it seem like there is a single row in t1. The same solution would work for that simple case: just remove the condition on t1.id from the query.
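For the single-row case, the statement might reduce to something like the following. It is a sketch under the assumption that t2 then needs no reference back to t1; col1 and 'foo' are placeholders, as above.

```sql
WITH upd AS (
   UPDATE t1
   SET    ct1 = true
   WHERE  ct1 = false      -- no filter on t1.id: t1 is assumed to hold one row
   RETURNING ct1
   )
INSERT INTO t2 (col1)
SELECT 'foo'               -- or your input for t2 here
FROM   upd;                -- only if UPDATE found a row
```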
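t1.ct1 must be defined NOT NULL. Otherwise a row where ct1 is NULL would never match ct1 = false, and nothing would be inserted for it.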
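If the column is currently nullable, the constraint could be added along these lines. This assumes that existing NULL values should be treated as false and that false is a sensible default, both of which are guesses about the schema.

```sql
-- Backfill first, or SET NOT NULL fails on existing NULL values
UPDATE t1 SET ct1 = false WHERE ct1 IS NULL;

ALTER TABLE t1
   ALTER COLUMN ct1 SET DEFAULT false,
   ALTER COLUMN ct1 SET NOT NULL;
```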
If the UPDATE finds no row (the row in t1 with t1.id = $t1_id already has ct1 = true, or does not exist), then nothing happens.
If concurrent transactions wait to update the same row, they will wake up once this transaction has finished.
If your transaction commits, ct1 is true now, and the recheck for the others will return no qualifying row, i.e. concurrent transactions are finished, too.
If your transaction rolls back, the next one in line gets to update t1 (and hence also insert rows in t2).
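The waiting behavior can be tried out with two psql sessions at the default read committed level, roughly as below. The id value 1 and the t2 columns are assumptions for the example.

```sql
-- Session A
BEGIN;
WITH upd AS (
   UPDATE t1
   SET    ct1 = true
   WHERE  id = 1
   AND    ct1 = false
   RETURNING id
   )
INSERT INTO t2 (t1_id, col1)
SELECT id, 'foo'
FROM   upd;               -- A now holds the row lock on t1 (id = 1)

-- Session B: runs the very same statement and blocks on that row lock

-- Session A, either:
COMMIT;                   -- B wakes up, rechecks ct1 = false, finds no row, inserts nothing
-- or:
ROLLBACK;                 -- B wakes up, the row still qualifies, B updates t1 and inserts into t2
```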
Note: Normally, queries in CTEs can execute in any order. But since the outer INSERT references the UPDATE, a sequence of operations is established.
-
That's a great point of view, which has helped make things clearer for me. Here is my understanding so far: start a transaction with the default isolation level (read committed). The first transaction will update the ct1 column, and that will force any concurrent transaction to roll back, which is how read committed isolation is supposed to work. Found a good slide deck about isolation levels: postgresql.org/files/developer/concurrency.pdf – Aman Gupta, Jan 10, 2017 at 17:36
-
@AmanGupta: Actually, concurrent transactions will not roll back. They just don't find a row to update (the recheck for ct1 = false excludes the row) and do nothing. – Erwin Brandstetter, Jan 10, 2017 at 21:08
-
Yes, correct, Erwin. – Aman Gupta, Jan 11, 2017 at 6:22
-
How about this: using Sequelize (Node.js), which uses repeatable read as the default isolation level. Each transaction will have a snapshot of the db as of the start of the transaction. It won't update ct1 to true until the insertion in t2 is done (I can't make it true at the beginning). When the transaction comes to update ct1: 1. if it is still unchanged, the transaction updates and commits; 2. if it has been updated by another concurrent transaction, it will roll back (this is how the repeatable read isolation level behaves) and throw an error, so it needs to be retried and will see ct1 as true this time. – Aman Gupta, Jan 11, 2017 at 7:02
-
I have edited the question to explain why I can't set ct1 to true in the first query; otherwise your solution was perfect. – Aman Gupta, Jan 11, 2017 at 7:03
UPDATE plus conditional INSERT in Postgres. Different in principle. We do not need the Postgres UPSERT implementation here (INSERT ... ON CONFLICT ... DO UPDATE). Concurrency control is simple even with the default isolation level read committed.
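For the follow-up requirement from the comments (ct1 may only become true after the rows in t2 exist), one possible variant is to lock the row explicitly first and flip the flag last. The sketch below is not part of the answer above. It is an assumption about how this could look under read committed, wrapped in a DO block so the condition can be checked in PL/pgSQL. The id value 1, the variable name _done and the t2 columns are hypothetical.

```sql
DO
$$
DECLARE
   _done boolean;
BEGIN
   -- Take the row lock up front; concurrent callers wait here
   SELECT ct1 INTO _done
   FROM   t1
   WHERE  id = 1
   FOR    UPDATE;

   -- _done is NULL if the row does not exist, so nothing happens in that case
   IF _done = false THEN
      INSERT INTO t2 (t1_id, col1) VALUES (1, 'foo');
      UPDATE t1 SET ct1 = true WHERE id = 1;   -- flag is set only after the insert
   END IF;
END
$$;
```

A waiting caller gets the lock once the first transaction finishes, reads the then-current value of ct1 and skips the IF branch if the first transaction committed.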