TL;DR: The question below boils down to: When inserting a row, is there a window of opportunity between the generation of a new Identity
value and the locking of the corresponding row key in the clustered index, where an external observer could see a newer Identity
value inserted by a concurrent transaction? (In SQL Server.)
Detailed version
I have a SQL Server table with an Identity
column called CheckpointSequence
, which is the key of the table's clustered index (which also has a number of additional nonclustered indexes). Rows are inserted into the table by several concurrent processes and threads (at isolation level READ COMMITTED
, and without IDENTITY_INSERT
). At the same time, there are processes periodically reading rows from the clustered index, ordered by that CheckpointSequence
column (also at isolation level READ COMMITTED
, with the READ COMMITTED SNAPSHOT
option being turned off).
I currently rely on the fact that the reading processes can never "skip" a checkpoint. My question is: Can I rely on this property? And if not, what could I do to make it true?
Example: When rows with identity values 1, 2, 3, 4, and 5 are inserted, the reader must not see the row with value 5 prior to seeing the one with value 4. Tests show that the query, which contains an ORDER BY CheckpointSequence
clause (and a WHERE CheckpointSequence > -1
clause), reliably blocks whenever row 4 is to be read, but not yet committed, even if row 5 has already been committed.
I believe that at least in theory, there may be a race condition here that might cause this assumption to break. Unfortunately, documentation on Identity
doesn't say a lot about how Identity
works in the context of multiple concurrent transactions, it only says "Each new value is generated based on the current seed & increment." and "Each new value for a particular transaction is different from other concurrent transactions on the table." (MSDN)
My reasoning is, it must work somehow like this:
- A transaction is started (either explicitly or implicitly).
- An identity value (X) is generated.
- The corresponding row lock is taken on the clustered index based on the identity value (unless lock escalation kicks in, in which case the whole table is locked).
- The row is inserted.
- The transaction is committed (possibly quite a lot of time later), so the lock is removed again.
I think that between step 2 and 3, there is a very tiny window where
- a concurrent session could generate the next identity value (X+1) and execute all the remaining steps,
- thus allowing a reader coming exactly at that point of time to read the value X+1, missing the value of X.
Of course, the probability of this seems extremely low; but still - it could happen. Or could it?
(If you're interested in the context: This is the implementation of NEventStore's SQL Persistence Engine. NEventStore implements an append-only event store where every event gets a new, ascending checkpoint sequence number. Clients read events from the event store ordered by checkpoint in order to perform computations of all sorts. Once an event with checkpoint X has been processed, clients only consider "newer" events, i.e., events with checkpoint X+1 and above. Therefore, it is vital that events can never be skipped, as they'd never be considered again. I'm currently trying to determine if the Identity
-based checkpoint implementation meets this requirement. These are the exact SQL statements used: Schema, Writer's query, Reader's Query.)
If I'm right and the situation described above could arise, I can see only two options of dealing with them, both of which are unsatisfactory:
- When seeing a checkpoint sequence value X+1 before having seen X, dismiss X+1 and try again later. However, because
Identity
can of course produce gaps (e.g., when the transaction is rolled back), X might never come. - So, same approach, but accept the gap after n milliseconds. However, what value of n should I assume?
Any better ideas?
-
Have you tried using Sequence instead of identity? With identity, I don't think you can reliably predict which insert will get a particular identity value but this should not be a problem using a sequence. Of course that changes how you do things now though.Antoine Hernandez– Antoine Hernandez2016年03月28日 16:56:24 +00:00Commented Mar 28, 2016 at 16:56
-
@SoleDBAGuy Wouldn't a Sequence make the race condition I described above even more likely? I produce a new Sequence value X (replacing step 2 above), then insert a row (steps 3 and 4). Between 2 and 3, there is a possibility that someone else might produce the next Sequence value X+1, commits it, and a reader reads that value X+1 before I even get to inserting my row with Sequence value X.Fabian Schmied– Fabian Schmied2016年03月29日 06:16:02 +00:00Commented Mar 29, 2016 at 6:16
6 Answers 6
When inserting a row, is there a window of opportunity between the generation of a new Identity value and the locking of the corresponding row key in the clustered index, where an external observer could see a newer Identity value inserted by a concurrent transaction?
Yes.
The allocation of identity values is independent of the containing user transaction. This is one reason that identity values are consumed even if the transaction is rolled back. The increment operation itself is protected by a latch to prevent corruption, but that is the extent of the protections.
In the specific circumstances of your implementation, the identity allocation (a call to CMEDSeqGen::GenerateNewValue
) is made before the user transaction for the insert is even made active (and so before any locks are taken).
By running two inserts concurrently with a debugger attached to allow me to freeze one thread just after the identity value is incremented and allocated, I was able to reproduce a scenario where:
- Session 1 acquires an identity value (3)
- Session 2 acquires an identity value (4)
- Session 2 performs its insert and commits (so row 4 is fully visible)
- Session 1 performs its insert and commits (row 3)
After step 3, a query using row_number under locking read committed returned the following:
In your implementation, this would result in Checkpoint ID 3 being skipped incorrectly.
The window of misopportunity is relatively small, but it exists. To give a more realistic scenario than having a debugger attached: An executing query thread can yield the scheduler after step 1 above. This allows a second thread to allocate an identity value, insert and commit, before the original thread resumes to perform its insert.
For clarity, there are no locks or other synchronization objects protecting the identity value after it is allocated and before it is used. For example, after step 1 above, a concurrent transaction can see the new identity value using T-SQL functions like IDENT_CURRENT
before the row exists in the table (even uncommitted).
Fundamentally, there are no more guarantees around identity values than documented:
- Each new value is generated based on the current seed & increment.
- Each new value for a particular transaction is different from other concurrent transactions on the table.
That really is it.
If strict transactional FIFO processing is required, you likely have no choice but to serialize manually. If the application has less oneous requirements, you have more options. The question isn't 100% clear in that regard. Nevertheless, you may find some useful information in Remus Rusanu's article Using Tables as Queues.
As Paul White answered absolutely correct there is a possibility for temporarily "skipped" identity rows. Here is just a small piece of code to reproduce this case for your own.
Create a database and a testtable:
create database IdentityTest
go
use IdentityTest
go
create table dbo.IdentityTest (ID int identity, c1 char(10))
create clustered index CI_dbo_IdentityTest_ID on dbo.IdentityTest(ID)
Perform concurrent inserts and selects on this table in a C# console program:
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading;
namespace IdentityTest
{
class Program
{
static void Main(string[] args)
{
var insertThreads = new List<Thread>();
var selectThreads = new List<Thread>();
//start threads for infinite inserts
for (var i = 0; i < 100; i++)
{
insertThreads.Add(new Thread(InfiniteInsert));
insertThreads[i].Start();
}
//start threads for infinite selects
for (var i = 0; i < 10; i++)
{
selectThreads.Add(new Thread(InfiniteSelectAndCheck));
selectThreads[i].Start();
}
}
private static void InfiniteSelectAndCheck()
{
//infinite loop
while (true)
{
//read top 2 IDs
var cmd = new SqlCommand("select top(2) ID from dbo.IdentityTest order by ID desc")
{
Connection = new SqlConnection("Server=localhost;Database=IdentityTest;Integrated Security=SSPI;Application Name=IdentityTest")
};
try
{
cmd.Connection.Open();
var dr = cmd.ExecuteReader();
//read first row
dr.Read();
var row1 = int.Parse(dr["ID"].ToString());
//read second row
dr.Read();
var row2 = int.Parse(dr["ID"].ToString());
//write line if row1 and row are not consecutive
if (row1 - 1 != row2)
{
Console.WriteLine("row1=" + row1 + ", row2=" + row2);
}
}
finally
{
cmd.Connection.Close();
}
}
}
private static void InfiniteInsert()
{
//infinite loop
while (true)
{
var cmd = new SqlCommand("insert into dbo.IdentityTest (c1) values('a')")
{
Connection = new SqlConnection("Server=localhost;Database=IdentityTest;Integrated Security=SSPI;Application Name=IdentityTest")
};
try
{
cmd.Connection.Open();
cmd.ExecuteNonQuery();
}
finally
{
cmd.Connection.Close();
}
}
}
}
}
This console prints a line for every case when one of the reading threads "misses" an entry.
-
1Nice code but you only check for consecutive ids ("//write line if row1 and row are not consecutive"). There may be gaps produced that your code will print. That doesn't mean that these gaps will be filled later.ypercubeᵀᴹ– ypercubeᵀᴹ2016年04月13日 10:06:27 +00:00Commented Apr 13, 2016 at 10:06
-
1Because the code doesn't trigger a scenario where
IDENTITY
would produce gaps (such as rolling back a transaction), the printed lines indeed show "skipped" values (or at least they did when I ran and checked it on my machine). Very nice repro sample!Fabian Schmied– Fabian Schmied2016年04月14日 07:08:05 +00:00Commented Apr 14, 2016 at 7:08
It is best to not expect the identities to be consecutive because there are many scenarios that can leave gaps. It is better to consider the identity like an abstract number and to not attach any business meaning to it.
Basically, gaps can happen if you roll back INSERT operations (or explicitly delete rows), and duplicates can occur if you set the table property IDENTITY_INSERT to ON.
Gaps can occur when:
- Records are deleted.
- An error has occurred when attempting to insert a new record (rolled back)
- An update/insert with explicit value (identity_insert option).
- Incremental value is more than 1.
- A transaction rolls back.
The identity property on a column has never guaranteed:
• Uniqueness
• Consecutive values within a transaction. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.
• Consecutive values after server restart.
• Reuse of values.
If you cannot use identity values because of this, create a separate table holding a current value and manage access to the table and number assignment with your application. This does have the potential of impacting performance.
https://msdn.microsoft.com/en-us/library/ms186775(v=sql.105).aspx
https://msdn.microsoft.com/en-us/library/ms186775(v=sql.110).aspx
-
I think gaps are not my primary problem - my main problem is ascending visibility of values. (I.e., say, identity value 7 must not be observable to a query ordering by that value before identity value 6 is.)Fabian Schmied– Fabian Schmied2016年03月29日 06:11:21 +00:00Commented Mar 29, 2016 at 6:11
-
1I have seen identity values commit like: 1, 2, 5, 3, 4.stacylaray– stacylaray2016年03月29日 06:43:10 +00:00Commented Mar 29, 2016 at 6:43
-
Sure, this is easily reproducible, e.g., using the scenario from Lennart's answer. The question I'm struggling with is whether I can observe that commit order when using a query with an
ORDER BY CheckpointSequence
clause (which happens to be the order of the clustered index). I think it boils down to the question whether the generation of an Identity value is anyhow linked to the locks taken by the INSERT statement, or if these are simply two unrelated actions performed by SQL Server one after the other.Fabian Schmied– Fabian Schmied2016年03月29日 06:48:13 +00:00Commented Mar 29, 2016 at 6:48 -
1What is the query? If using read committed then, in your example, order by would show 1, 2, 3, 5 because they have been committed and 4 has not, i.e. dirty read. Also, your explanation of NEventStore states "Therefore, it is vital that events can never be skipped, as they'd never be considered again."stacylaray– stacylaray2016年03月29日 07:43:59 +00:00Commented Mar 29, 2016 at 7:43
-
The query is given above (gist.github.com/fschmied/47f716c32cb64b852f90) - it's paged, but boils down to a simple
SELECT ... FROM Commits WHERE CheckpointSequence > ... ORDER BY CheckpointSequence
. I don't think this query would read past the locked row 4, or would it? (In my experiments, it blocks when the query tries to acquire the KEY lock for row 4.)Fabian Schmied– Fabian Schmied2016年03月29日 07:48:42 +00:00Commented Mar 29, 2016 at 7:48
I suspect that it occasionally can lead to trouble, troubles that become worse when the server is under heavy load. Consider two transaction:
- T1: insert into T ... -- say 5 get inserted
- T2: insert into T ... -- say 6 get inserted
- T2: commit
- Reader sees 6 but not 5
- T1: commit
In the above scenario your LAST_READ_ID will be 6, so 5 will never be read.
-
My tests seem to indicate that this scenario is not a problem because Reader (step 4) will block (until T1 has released its locks) when it tries to read the row with value 5. Am I missing something?Fabian Schmied– Fabian Schmied2016年03月29日 06:29:27 +00:00Commented Mar 29, 2016 at 6:29
-
You might be right, I don't know the locking mechanism in SQL server all that well (hence I suspect in my answer).Lennart - Slava Ukraini– Lennart - Slava Ukraini2016年03月29日 06:49:01 +00:00Commented Mar 29, 2016 at 6:49
-
Depends on reader's isolation level. It my see both, block, or see only 6.Michael Green– Michael Green2016年04月13日 16:35:16 +00:00Commented Apr 13, 2016 at 16:35
Running this script:
BEGIN TRAN;
INSERT INTO dbo.Example DEFAULT VALUES;
COMMIT;
Below are the locks I see acquired and released as captured by an Extended Event session:
name timestamp associated_object_id mode object_id resource_type session_id resource_description
lock_acquired 2016年03月29日 06:37:28.9968693 1585440722 IX 1585440722 OBJECT 51
lock_acquired 2016年03月29日 06:37:28.9969268 7205759890195415040 IX 0 PAGE 51 1:1235
lock_acquired 2016年03月29日 06:37:28.9969306 7205759890195415040 RI_NL 0 KEY 51 (ffffffffffff)
lock_acquired 2016年03月29日 06:37:28.9969330 7205759890195415040 X 0 KEY 51 (29cf3326f583)
lock_released 2016年03月29日 06:37:28.9969579 7205759890195415040 X 0 KEY 51 (29cf3326f583)
lock_released 2016年03月29日 06:37:28.9969598 7205759890195415040 IX 0 PAGE 51 1:1235
lock_released 2016年03月29日 06:37:28.9969607 1585440722 IX 1585440722 OBJECT 51
Note the RI_N KEY lock acquired immediately before the X key lock for the the new row being created. This short-lived range lock will prevent a concurrent insert from acquiring another RI_N KEY lock since RI_N locks are incompatible. The window you mentioned between steps 2 and 3 is not a concern because the range lock is acquired before the row lock on the newly generated key.
As long as your SELECT...ORDER BY
begins the scan before the desired newly-inserted rows, I would expect the behavior you desire in the default READ COMMITTED
isolation level as long as the database READ_COMMITTED_SNAPSHOT
option is turned off.
-
2According to technet.microsoft.com/en-us/library/…, two locks with
RangeI_N
are compatible, i.e., don't block each other (the lock is mostly there for blocking on an existing serializable reader).Fabian Schmied– Fabian Schmied2016年03月30日 09:38:44 +00:00Commented Mar 30, 2016 at 9:38 -
@FabianSchmied, interesting. That topic conflicts with the lock compatibility matrix in technet.microsoft.com/en-us/library/ms186396(v=sql.105).aspx, which shows the locks are not compatible. The insert example in the link you mentioned does state the same behavior as shown in the trace in my answer (short-lived insert range lock to test the range before the exclusive key lock).Dan Guzman– Dan Guzman2016年03月30日 11:03:10 +00:00Commented Mar 30, 2016 at 11:03
-
2Actually, the matrix says "N" for "no conflict" (not for "not compatible") :)Fabian Schmied– Fabian Schmied2016年03月31日 09:40:21 +00:00Commented Mar 31, 2016 at 9:40
From my understanding of SQL Server the default behaviour is for the second query to not display any results until the first query has been committed. If the first query does a ROLLBACK instead of a COMMIT, then you will have a missing ID in your column.
Basic Configuration
Database Table
I created a database table with the following structure:
CREATE TABLE identity_rc_test (
ID4VALUE INT IDENTITY (1,1),
TEXTVALUE NVARCHAR(20),
CONSTRAINT PK_ID4_VALUE_CLUSTERED
PRIMARY KEY CLUSTERED (ID4VALUE, TEXTVALUE)
)
Database Isolation Level
I checked the isolation level of my database with the following statement:
SELECT snapshot_isolation_state,
snapshot_isolation_state_desc,
is_read_committed_snapshot_on
FROM sys.databases WHERE NAME = 'mydatabase'
Which returned the following result for my database:
snapshot_isolation_state snapshot_isolation_state_desc is_read_committed_snapshot_on
0 OFF 0
(This is the default setting for a database in SQL Server 2012)
Test Scripts
The following scripts were executed using the standard SQL Server SSMS client settings and the standard SQL Server settings.
Client connections settings
The client has been set to use the Transaction Isolation Level READ COMMITTED
as per the Query Options in SSMS.
Query 1
The following query was executed in a Query window with the SPID 57
SELECT * FROM dbo.identity_rc_test
BEGIN TRANSACTION [FIRST_QUERY]
INSERT INTO dbo.identity_rc_test (TEXTVALUE) VALUES ('Nine')
/* Commit is commented out to prevent the INSERT from being commited
--COMMIT TRANSACTION [FIRST_QUERY]
--ROLLBACK TRANSACTION [FIRST_QUERY]
*/
Query 2
The following query was executed in a Query window with the SPID 58
BEGIN TRANSACTION [SECOND_QUERY]
INSERT INTO dbo.identity_rc_test (TEXTVALUE) VALUES ('Ten')
COMMIT TRANSACTION [SECOND_QUERY]
SELECT * FROM dbo.identity_rc_test
The query doesn't complete and is waiting for the eXclusive lock to be release on a PAGE.
Script to determine locking
This script displays the locking occurring on the database objects for the two transactions:
SELECT request_session_id, resource_type,
resource_description,
resource_associated_entity_id,
request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id IN (57, 58)
And here are the results:
58 DATABASE 0 S GRANT
57 DATABASE 0 S GRANT
58 PAGE 1:79 72057594040549300 IS GRANT
57 PAGE 1:79 72057594040549300 IX GRANT
57 KEY (a0aba7857f1b) 72057594040549300 X GRANT
58 KEY (a0aba7857f1b) 72057594040549300 S WAIT
58 OBJECT 245575913 IS GRANT
57 OBJECT 245575913 IX GRANT
The results show that query window one (SPID 57) has a Shared lock (S) on the DATABASE an Intended eXlusive (IX) lock on the OBJECT, an Intended eXlusive (IX) lock on the PAGE it wants to insert to and an eXclusive lock (X) on the KEY it is has inserted, but not yet committed.
Because of the uncommitted data, the second query (SPID 58) has a Shared lock (S) on the DATABASE level, an Intended Shared (IS) lock on the OBJECT, an Intended Shared (IS) lock on the page a Shared (S) lock on the KEY with a request status WAIT.
Summary
The query in the first query window executes without committing. Because the second query can only READ COMMITTED
data it waits either until the timeout occurs or until the transaction has been committed in the first query.
This is from my understanding the default behaviour of Microsoft SQL Server.
You should observe that the ID is indeed in sequence for subsequent reads by SELECT statements if the first statement COMMITs.
If the first statement does a ROLLBACK then you will find a missing ID in the sequence, but still with the ID in ascending order (provided you created the INDEX with the default or ASC option on the ID column).
Update:
(Bluntly) Yes, you can rely on the identity column functioning correctly, until you encounter an issue. There is only one HOTFIX regarding SQL Server 2000 and the identity column on Microsoft's website.
If you couldn't rely on the identity column updating correctly, I think there would be more hotfixes or patches on Microsoft's website.
If you have a Microsoft Support Contract you could always open up an Advisory Case and ask for additional information.
-
1Thanks for the analysis, but my question is if there is a time window between the generation of the next
Identity
value and the acquisition of the KEY lock on the row (where concurrent reads/writers could fall into). I don't think this is proven impossible by your observations because one cannot stop query execution and analyze locks during that ultra-short time window.Fabian Schmied– Fabian Schmied2016年03月30日 09:21:31 +00:00Commented Mar 30, 2016 at 9:21 -
No you cannot stop the statements, but my (slow) observation is what happens on a fast/normal basis. As soon as one SPID acquires a lock to insert data, the other one will be unable to acquire the same lock. The faster statement will have the advantage of having already acquired the lock and the ID in sequence. The next statement will receive the next ID after the lock has been released.John K. N.– John K. N.2016年03月30日 10:28:16 +00:00Commented Mar 30, 2016 at 10:28
-
1On a normal basis, your observations match my own (and also my expectations) - that's good to know. I wonder if there are exceptional situations where they won't hold, though.Fabian Schmied– Fabian Schmied2016年03月30日 10:57:16 +00:00Commented Mar 30, 2016 at 10:57
Explore related questions
See similar questions with these tags.