Understanding Lock Behavior in SQL Server Deadlock Scenario with INSERT

Question 1

Hello SQL Server Experts,

I'm currently experimenting with SQL Server deadlocks and have intentionally created one to better understand locking behavior. After setting up a test table, I ran two separate sessions with the following indefinite while loops:

Test table:

DROP TABLE IF EXISTS dbo.table1;
CREATE TABLE dbo.table1 (
 id INT IDENTITY,
 c1 CHAR(4000) DEFAULT REPLICATE('a', 4000),
 c2 NVARCHAR(MAX) DEFAULT REPLICATE('x', 9000),
 c3 DATETIME DEFAULT GETDATE()
);
WITH a AS (SELECT 1 AS i UNION ALL SELECT 1),
 b AS (SELECT 1 AS i FROM a x, a y),
 c AS (SELECT 1 AS i FROM b x, b y),
 d AS (SELECT 1 AS i FROM c x, c y),
 e AS (SELECT 1 AS i FROM d x, d y),
 f AS (SELECT 1 AS i FROM e x, e y),
cte AS (SELECT ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) AS n FROM f)
INSERT INTO dbo.table1
SELECT TOP (500000)
 REPLICATE('a', 4000),
 REPLICATE('x', 9000),
 GETDATE()
FROM cte;

Session 1:

WHILE 1 = 1
BEGIN
 BEGIN TRAN;
 DELETE FROM table1 
 WHERE id % 5000 = 1
 AND id >= 240000;
 INSERT INTO table1 (id, c1, c2, c3)
 SELECT 240000, REPLICATE('a', 4000), REPLICATE('x', 9000), GETDATE();
 WAITFOR DELAY '00:00:31';
 ROLLBACK;
END;

Session 2:

WHILE 1 = 1
BEGIN
 BEGIN TRAN;
 UPDATE table1
 SET c1 = 'updated'
 WHERE id % 5000 = 1;
 ROLLBACK;
END;

Occasionally, this setup leads to a deadlock, and the deadlock event XML suggests that the INSERT statement is attempting to acquire an update (U) lock on a page. My understanding is that an INSERT should always request an exclusive (X) lock. Why is it trying to obtain an update lock in this scenario?

Here is the deadlock event XML for reference:

<deadlock>
 <victim-list>
 <victimProcess id="process236f8773848" />
 </victim-list>
 <process-list>
 <process id="process236f8773848" taskpriority="0" logused="255572" waitresource="PAGE: 34:1:2011908 " waittime="8975" ownerId="100233793" transactionname="user_transaction" lasttranstarted="2024年02月09日T12:23:01.017" XDES="0x2374648c470" lockMode="U" schedulerid="33" kpid="16868" status="suspended" spid="60" sbid="0" ecid="0" priority="0" trancount="3" lastbatchstarted="2024年02月09日T12:27:05.243" lastbatchcompleted="2024年02月09日T12:23:50.570" lastattention="2024年02月09日T12:23:50.570" clientapp="Microsoft SQL Server Management Studio - Query" hostname="myhost" hostpid="14044" loginname="dev\mydb" isolationlevel="read committed (2)" xactid="100233793" currentdb="34" currentdbname="mydb" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
 <executionStack>
 <frame procname="adhoc" line="5" stmtstart="52" stmtend="142" sqlhandle="0x02000000c996cc147055426e69966dd635d985db5bc3879a0000000000000000000000000000000000000000">
unknown </frame>
 <frame procname="adhoc" line="5" stmtstart="86" stmtend="218" sqlhandle="0x0200000087908a07b9eddfc8d6177991560c8f41cbc5dfad0000000000000000000000000000000000000000">
unknown </frame>
 </executionStack>
 <inputbuf>
WHILE 1 = 1
BEGIN
 BEGIN TRAN;
 DELETE table1 
 WHERE id % 5000 = 1
 AND id &gt;= 240000;
 INSERT INTO dbo.table1
 (
 id,
 c1,
 c2,
 c3
 )
 SELECT 240000,
 REPLICATE('a', 4000),
 REPLICATE('x', 9000),
 GETDATE();
 WAITFOR DELAY '00:00:31'
 ROLLBACK;
END; </inputbuf>
 </process>
 <process id="process236f8705c28" taskpriority="0" logused="972624" waitresource="PAGE: 34:1:695567 " waittime="2630" ownerId="100234046" transactionname="user_transaction" lasttranstarted="2024年02月09日T12:23:05.680" XDES="0x237cbf8c470" lockMode="U" schedulerid="11" kpid="21068" status="suspended" spid="72" sbid="0" ecid="0" priority="0" trancount="3" lastbatchstarted="2024年02月09日T12:27:07.600" lastbatchcompleted="2024年02月09日T12:23:55.770" lastattention="2024年02月09日T12:23:55.770" clientapp="Microsoft SQL Server Management Studio - Query" hostname="myhost" hostpid="14044" loginname="dev\mydb" isolationlevel="read committed (2)" xactid="100234046" currentdb="34" currentdbname="mydb" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
 <executionStack>
 <frame procname="adhoc" line="4" stmtstart="72" stmtend="164" sqlhandle="0x02000000b2a857012fa9eca41c1d4ad00b393826d7588f120000000000000000000000000000000000000000">
unknown </frame>
 <frame procname="adhoc" line="4" stmtstart="82" stmtend="204" sqlhandle="0x02000000b67c153ad5ca4c0ec4c49b7f6d4158e6b17ea53f0000000000000000000000000000000000000000">
unknown </frame>
 </executionStack>
 <inputbuf>
WHILE 1 = 1
BEGIN
 BEGIN TRAN;
 UPDATE table1
 SET c1 = 'updated'
 WHERE id % 5000 = 1
 ROLLBACK;
END; </inputbuf>
 </process>
 </process-list>
 <resource-list>
 <pagelock fileid="1" pageid="2011908" dbid="34" subresource="FULL" objectname="mydb.dbo.table1" id="lock23c698f3e00" mode="X" associatedObjectId="72057594046251008">
 <owner-list>
 <owner id="process236f8705c28" mode="X" />
 </owner-list>
 <waiter-list>
 <waiter id="process236f8773848" mode="U" requestType="wait" />
 </waiter-list>
 </pagelock>
 <pagelock fileid="1" pageid="695567" dbid="34" subresource="FULL" objectname="mydb.dbo.table1" id="lock23856f2c200" mode="X" associatedObjectId="72057594046251008">
 <owner-list>
 <owner id="process236f8773848" mode="X" />
 </owner-list>
 <waiter-list>
 <waiter id="process236f8705c28" mode="U" requestType="wait" />
 </waiter-list>
 </pagelock>
 </resource-list>
</deadlock>

Additionally, since the INSERT statement adds a row that would match the DELETE condition (note the id equals to 240000), I would assume that the DELETE has already placed exclusive locks on the rows or pages, and thus, they should be held until the end of the transaction. Why then is the INSERT operation trying to get an update lock for the row/page where the record 240000 will be located?

Moreover, I'm seeking advice on how to resolve and prevent such a deadlock. What strategies or best practices can I employ to avoid this situation?

I've searched for answers online without success. Could someone please help clarify these locking behaviors for me?

Thank you in advance for your insights!

Question 2

As it is a heap there is no specific "row/page where the record 240000 will be located"

Question 3

@MartinSmith I know a heap is basically a bunch of records randomly placed in some pages. But even in this case, a row will still be place some where, right?

Question 4

It seems to be an extremely complicated setup for something that is only designed to test a principle.

If I understand the code correctly, you begin with a test table with 500,000 sequentially and uniquely numbered rows. Only the id column really matters in analysing the logic - the rest of the columns are just padding to make each row consume more space (intentional or not?).

The id column is not in fact an identity for each row, because there is no primary key (or other constraint) defined.

Hence, the INSERT can proceed to add a row with id=240000 twice. This falsifies one of your claims:

Additionally, since the INSERT statement adds a row that would match the DELETE condition (note the id equals to 240000)

If there was a primary key, the insert would always fail, because the preceding DELETE does not remove rows with id=240000 (it removes only those that satisfy the filter id % 5000 = 1, which in plain English means it removes every 5000th row which begins with a 1; so 1, 5001, 10001, ... 240001, etc.)

My understanding is that an INSERT should always request an exclusive (X) lock. Why is it trying to obtain an update lock in this scenario?

What the deadlock XML actually describes is that each process holds an exclusive lock on a page for itself, whilst each process is also seeking an update lock on a page that is already locked by the other.

The main conflict in principle appears to be between the DELETE and UPDATE statements, since these are both seeking to operate upon the same data.

My guess is that the algorithm for the DELETE blocks the algorithm for the UPDATE after the UPDATE has already taken some locks, then finally the INSERT cannot proceed because of locks already taken by the UPDATE, and there is occasionally a coincidence where the INSERT happens to want to affect the same page as is already locked by the UPDATE, even though at a row level there cannot be any conflict.

What strategies or best practices can I employ to avoid this situation?

Avoid nonsensical test scenarios, I guess. Unindexed heap tables are an extremely uncommon animal, and for there to be intensive concurrent activity on such a table that would almost certainly benefit from an index, is probably even less common again.

Without understanding the purpose for this configuration - I assume it actually had no purpose, except to cause a deadlock scenario - it's impossible to advise meaningfully on what alterations to design would avoid the situation whilst still fulfilling the underlying purpose.

Potential solutions for real-world scenarios would include indexing the data differently, operating on less data at once in a single transaction, taking larger-grained locks at the outset of the transaction (locking the entire table), avoiding a 31 second wait in the middle of a transaction, and automatically retrying in the event of a deadlock.

In this particular case, it's also possible that hinting for smaller-granularity (row-level instead of page level) locks would solve the deadlock, since that would avoid the case where otherwise separate data happens to fall on the same page, and two processes therefore end up competing for locks on the same page even though their activity concerns different rows on that page.

Steve Steve 8875 silver badges7 bronze badges · Answer 1 · 2024-02-10 12:26:12Z

It seems to be an extremely complicated setup for something that is only designed to test a principle.

If I understand the code correctly, you begin with a test table with 500,000 sequentially and uniquely numbered rows. Only the id column really matters in analysing the logic - the rest of the columns are just padding to make each row consume more space (intentional or not?).

The id column is not in fact an identity for each row, because there is no primary key (or other constraint) defined.

Hence, the INSERT can proceed to add a row with id=240000 twice. This falsifies one of your claims:

Additionally, since the INSERT statement adds a row that would match the DELETE condition (note the id equals to 240000)

If there was a primary key, the insert would always fail, because the preceding DELETE does not remove rows with id=240000 (it removes only those that satisfy the filter id % 5000 = 1, which in plain English means it removes every 5000th row which begins with a 1; so 1, 5001, 10001, ... 240001, etc.)

My understanding is that an INSERT should always request an exclusive (X) lock. Why is it trying to obtain an update lock in this scenario?

What the deadlock XML actually describes is that each process holds an exclusive lock on a page for itself, whilst each process is also seeking an update lock on a page that is already locked by the other.

The main conflict in principle appears to be between the DELETE and UPDATE statements, since these are both seeking to operate upon the same data.

My guess is that the algorithm for the DELETE blocks the algorithm for the UPDATE after the UPDATE has already taken some locks, then finally the INSERT cannot proceed because of locks already taken by the UPDATE, and there is occasionally a coincidence where the INSERT happens to want to affect the same page as is already locked by the UPDATE, even though at a row level there cannot be any conflict.

What strategies or best practices can I employ to avoid this situation?

Avoid nonsensical test scenarios, I guess. Unindexed heap tables are an extremely uncommon animal, and for there to be intensive concurrent activity on such a table that would almost certainly benefit from an index, is probably even less common again.

Without understanding the purpose for this configuration - I assume it actually had no purpose, except to cause a deadlock scenario - it's impossible to advise meaningfully on what alterations to design would avoid the situation whilst still fulfilling the underlying purpose.

Potential solutions for real-world scenarios would include indexing the data differently, operating on less data at once in a single transaction, taking larger-grained locks at the outset of the transaction (locking the entire table), avoiding a 31 second wait in the middle of a transaction, and automatically retrying in the event of a deadlock.

In this particular case, it's also possible that hinting for smaller-granularity (row-level instead of page level) locks would solve the deadlock, since that would avoid the case where otherwise separate data happens to fall on the same page, and two processes therefore end up competing for locks on the same page even though their activity concerns different rows on that page.

Stack Exchange Network

Understanding Lock Behavior in SQL Server Deadlock Scenario with INSERT

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Understanding Lock Behavior in SQL Server Deadlock Scenario with INSERT

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions