How to handle concurrent write access to two related tables

Question 1

Suppose there are two tables, t1 and t2. t1 has a boolean column ct1. There are two scenarios:

If ct1 is false, create new entries in t2 and set ct1 to true.
If ct1 is true, just return.

How can I handle this in Postgres when several such queries run concurrently?

The possible race condition: the first query sees ct1 as false and creates entries in t2, but the second query also sees ct1 as false before the first query can set it to true.
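The race can be pictured as an interleaving of two sessions that each run a naive check-then-act sequence. This is only an illustrative sketch: the key column t1.id, the value 1, and the column t2.col1 are assumptions, not part of the question. It is meant to be read as a timeline across two connections, not run as a single script.

```sql
-- Session A
BEGIN;
SELECT ct1 FROM t1 WHERE id = 1;                 -- returns false

-- Session B (A has not committed yet)
BEGIN;
SELECT ct1 FROM t1 WHERE id = 1;                 -- also returns false

-- Session A
INSERT INTO t2 (t1_id, col1) VALUES (1, 'foo');
UPDATE t1 SET ct1 = true WHERE id = 1;
COMMIT;

-- Session B (still acting on its earlier read)
INSERT INTO t2 (t1_id, col1) VALUES (1, 'foo');  -- unwanted second batch of rows
UPDATE t1 SET ct1 = true WHERE id = 1;
COMMIT;
```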

Comment 1

It is very similar to this: stackoverflow.com/questions/108403/…

Comment 2

@McNets: The suggested answer is somewhat related, yes. But the referenced answer is about UPSERT in SQL Server, while this one is about UPDATE plus conditional INSERT in Postgres. Different in principle. We do not need the Postgres UPSERT implementation here (INSERT ... ON CONFLICT ... DO UPDATE). Concurrency control is simple even with the default isolation level READ COMMITTED.

score 3 · Accepted Answer · 2017-01-10 14:44:07Z

An UPDATE takes a write lock on the row automatically, which prevents concurrent transactions from doing the same until the lock is released, i.e. until your transaction has committed or rolled back.

So this should do the trick:

WITH upd AS (
   UPDATE t1
   SET    ct1 = true
   WHERE  t1.id = $t1_id   -- your input here
   AND    ct1 = false
   RETURNING t1.id AS t1_id
   )
INSERT INTO t2(t1_id, col1)
SELECT t1_id, 'foo'        -- or your input for t2 here
FROM   upd;                -- only if UPDATE found a row

This assumes a PK t1.id to allow multiple rows in t1. Your example makes it seem like there is a single row in t1. The same solution would work for that simple case, just remove the predicate on t1.id from the query.

t1.ct1 must be defined NOT NULL. Otherwise a row with ct1 IS NULL would never match ct1 = false and would be skipped silently.
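For reference, a minimal schema that the query above could run against might look like the following. Only t1.id, t1.ct1, t2.t1_id and t2.col1 appear in the answer; the surrogate key t2_id, the data types, the default, and the foreign key are assumptions made for illustration.

```sql
CREATE TABLE t1 (
    id  integer PRIMARY KEY,
    ct1 boolean NOT NULL DEFAULT false   -- NOT NULL so that "ct1 = false" is never NULL
);

CREATE TABLE t2 (
    t2_id serial  PRIMARY KEY,                  -- hypothetical surrogate key
    t1_id integer NOT NULL REFERENCES t1 (id),  -- assumed link back to t1
    col1  text
);
```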

If the UPDATE finds no row (ct1 is already true for the row with t1.id = $t1_id, or that row does not exist), then nothing happens.

If concurrent transactions wait to update the same row, they will wake up once this transaction has finished.

If your transaction commits, ct1 is true now, and the recheck of the WHERE condition in the waiting transactions returns no qualifying row, so they finish without doing anything.

If your transaction rolls back, the next one in line gets to update t1 (and hence also insert rows in t2).
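The commit case can be sketched as a two-session timeline under the default READ COMMITTED level. The id value 1 and the literal 'foo' are placeholders, and the comments describe the behavior one would expect to observe in psql. It is meant to be read across two connections rather than run as one script.

```sql
-- Session A
BEGIN;
WITH upd AS (
   UPDATE t1
   SET    ct1 = true
   WHERE  id = 1
   AND    ct1 = false
   RETURNING id
   )
INSERT INTO t2 (t1_id, col1)
SELECT id, 'foo'
FROM   upd;          -- reports INSERT 0 1; A now holds the row lock on t1

-- Session B runs the identical statement while A is still open:
-- its UPDATE blocks on the row lock held by A.

-- Session A
COMMIT;              -- releases the lock

-- Session B wakes up, rechecks "ct1 = false" against the committed row,
-- finds no qualifying row, and reports INSERT 0 0.
```

Had Session A issued ROLLBACK instead, Session B's recheck would still see ct1 = false, so B would perform the update and the insert itself.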

Note: Normally, data-modifying statements in CTEs execute concurrently with each other and with the main query, in no predictable order. But since the outer INSERT consumes the rows returned by the UPDATE, a sequence of operations is established.

That's a great point of view, which has helped to make things clearer for me. Here is my understanding so far: start the transaction with the default isolation level (READ COMMITTED). The first transaction will update the ct1 column, and that will force any concurrent transaction to roll back, which is how READ COMMITTED isolation is supposed to work. Found a good slide deck about isolation levels: postgresql.org/files/developer/concurrency.pdf

@AmanGupta: Actually, concurrent transactions will not roll back. They just don't find a row to update (the recheck for ct1 = false excludes the row) and do nothing.

Yes, correct, Erwin.

How about this: I am using Sequelize (Node.js), which uses REPEATABLE READ as the default isolation level. Each transaction gets a snapshot of the database at its start. I won't set ct1 to true until the insertion into t2 is done (I can't make it true at the beginning). When the transaction comes to update ct1: 1. If it is still unchanged, the transaction updates and commits. 2. If it was updated by another concurrent transaction, it rolls back (this is how the REPEATABLE READ isolation level behaves) and throws an error, so it needs to be retried and will see ct1 as true this time.

I have edited the question to explain why I can't set ct1 to true as the first step; otherwise your solution was perfect.
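For the constraint raised in the last comments (ct1 cannot be set before the rows in t2 exist), one possible variation is to take the row lock explicitly with SELECT ... FOR UPDATE and set the flag last. This is a hedged sketch and not part of the accepted answer; it assumes READ COMMITTED, the id value 1 and 'foo' are placeholders, and the branch on "no row returned" has to live in application code.

```sql
BEGIN;   -- default isolation level READ COMMITTED

-- Lock the row up front. Concurrent callers block here until we finish,
-- then recheck "ct1 = false" and get no row back once we have committed.
SELECT id
FROM   t1
WHERE  id = 1
AND    ct1 = false
FOR    UPDATE;

-- Application logic (not shown): if the SELECT returned no row,
-- COMMIT and return without touching t2.

INSERT INTO t2 (t1_id, col1) VALUES (1, 'foo');

UPDATE t1 SET ct1 = true WHERE id = 1;   -- flag is set last

COMMIT;  -- releases the row lock
```

Under REPEATABLE READ, a transaction that waited on this lock would instead fail with a serialization error (SQLSTATE 40001) once the other transaction commits its change, so the caller would need a retry loop as described in the comment above.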
