Hello people smarter than me! I've created a sort-of-a-queue table system, but it seems too simple to be safe from race conditions. Am I missing something, or is the following actually safe from race conditions?
The Schema
I have a table, let's call it ProductQueue:
CREATE TABLE dbo.ProductQueue
(
    SerialId BIGINT PRIMARY KEY,
    QueuedDateTime DATETIME NOT NULL -- Only using this for reference, no functionality is tied to it
);
I have a procedure for adding to the queue called AddToProductQueue:
CREATE PROCEDURE dbo.AddToProductQueue (@SerialId BIGINT)
AS
BEGIN
    INSERT INTO dbo.ProductQueue (SerialId, QueuedDateTime)
    OUTPUT Inserted.SerialId
    SELECT @SerialId, GETDATE();
END
I also have a procedure for removing from the queue called RemoveFromProductQueue:
CREATE PROCEDURE dbo.RemoveFromProductQueue (@SerialId BIGINT)
AS
BEGIN
    DELETE FROM dbo.ProductQueue
    OUTPUT Deleted.SerialId
    WHERE SerialId = @SerialId;
END
Note, SerialId is globally unique for a Product in the source database / system. I.e. no two instances of a Product can ever have the same SerialId. That's the extent of it on the database side.
The Workflow
- I have an application process that runs hourly.
- That process gets a variable list of SerialIds from the source system.
- It iteratively calls the AddToProductQueue procedure on each SerialId in its list.
- If the procedure tries to insert a SerialId that already exists in the ProductQueue table, it throws a primary key violation error, and the application process catches that error and skips that SerialId. (A minimal demonstration of this follows the list.)
- Otherwise, the procedure successfully adds that SerialId to the ProductQueue table and returns it back to the application process.
- The application process then adds that successfully queued SerialId to a separate list.
- After the application process finishes iterating its list of all candidate SerialIds to enqueue, it then iterates its new list of successfully queued SerialIds and does external work on them, in a separate thread per SerialId. (This work is all unrelated to the database.)
- Finally, as each thread finishes its external work, the last step in that asynchronous thread is to remove that SerialId from the ProductQueue table by calling the RemoveFromProductQueue procedure. (Note that a new database context object is instantiated and a new connection is created for each asynchronous call to this procedure, so that it is thread-safe on the application side.)
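To illustrate that claim-by-insert behavior, a minimal demonstration (the SerialId value 1001 is just an example):

EXEC dbo.AddToProductQueue @SerialId = 1001; -- succeeds and returns 1001 to the caller
EXEC dbo.AddToProductQueue @SerialId = 1001; -- fails with error 2627 (primary key violation),
                                             -- which the application catches and skips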
Additional Information
- There aren't any indexes on the ProductQueue table, and it'll never have more than 1,000 rows in it at one time. (Actually, most times it'll literally only have a couple of rows.)
- The same SerialId can become a candidate again to be re-added to the queue table on a future execution of the application process.
- There are no safeguards preventing a second instance of the application process from running concurrently, either by accident or if the first instance took more than 1 hour to run, etc. (This is the concurrent part I'm most concerned about.)
- The transaction isolation level of the database (and of the connections being made) where the queue table and procedures live is the default isolation level of Read Committed.
Potential Problems
- The running instance of the application process crashes in an unhandled way, leaving SerialIds stuck in the queue table. This is acceptable for the business needs, and we plan to have exception reports to help us manually remediate this case.
- The application process gets executed multiple times concurrently and grabs some of the same SerialIds between instances in their initial source lists. I can't think of any negative ramifications of this case yet, since the enqueuing procedure is atomic, and the actual list of SerialIds that the application process will work on should be self-contained due to that atomic enqueuing procedure. We don't care which instance of the application process actually processes each SerialId, as long as the same SerialId isn't processed concurrently by both process instances.
4 Answers
The only thing you're asking the database engine to do in this scenario is to enforce the PRIMARY KEY. It will do that under all conditions and isolation levels, of course.
The potential race conditions are all external to the database, considerations that aren't really on-topic here.
That said, the only way I can think of the database being involved in a race condition is for the process of adding candidate serial IDs to be wrapped in a database transaction, but you didn't mention anything about that.
Perhaps it's possible for the processes to use a database transaction in an unintended way, for example if you're using an ORM that does helpful things by magic without being explicitly asked. Or perhaps you are using implicit transactions.
In that scenario, app instance A would start adding its (long list of) serial IDs all within a single database transaction. Meanwhile, app instance B (with an equally long list, including at least one present in A's list) would become blocked on the insert of an overlapping serial ID (because instance A hasn't committed its transaction yet).
In an unfortunate sequence of events, instance A would finish asynchronously processing a duplicate serial ID (toward the start of its list), removing it from the queue, before blocked instance B can perform its insert of that same ID (towards the end of its list).
In that case, both A and B would successfully process the same serial ID.
This seems unlikely given the process outline you have described, but not impossible.
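To make that sequence concrete, here is a minimal sketch in two sessions (the serial IDs 17 and 42 and the session labels are illustrative):

-- Session A (instance A, whose inserts are all wrapped in one transaction):
BEGIN TRANSACTION;
EXEC dbo.AddToProductQueue @SerialId = 42;   -- toward the start of A's list
EXEC dbo.AddToProductQueue @SerialId = 17;
-- ... A keeps inserting the rest of its long list ...

-- Session B (instance B, running concurrently):
EXEC dbo.AddToProductQueue @SerialId = 17;   -- blocks on A's uncommitted key lock

-- Session A commits and starts processing; its worker for 42 finishes quickly:
COMMIT TRANSACTION;
EXEC dbo.RemoveFromProductQueue @SerialId = 42;

-- Session B is now unblocked; 17 fails with a PK violation and is skipped,
-- but by the time B reaches 42 (toward the end of its list) the row is gone,
-- so this insert succeeds and B processes 42 a second time:
EXEC dbo.AddToProductQueue @SerialId = 42;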
I'm not sure why the name of your table has the word "queue" in it, but it looks to me that what you really want is a locking mechanism.
To answer your question directly: if you don't care about stale locks, and as long as the INSERT transaction, the processing, and the DELETE transaction are in a causal relationship (i.e. happen in this order, each after the previous step reports success), I don't see any potential for double-processing.
That said, one way to address both issues that you list under "potential problems" in an automated way would be implementing the single-instance variant of the locking algorithm Redlock.
Each instance has its own unique id (say, a guid or something that you generate every time you run your processing application)
Whenever the instance grabs a SerialId, it tries to obtain a lock on it:

CREATE TABLE ItemLock
(
    serialId BIGINT NOT NULL PRIMARY KEY,
    instance UNIQUEIDENTIFIER NOT NULL,
    expires DATETIME NOT NULL
);

WITH source (serialId, instance, expires) AS
(
    SELECT @serialId, @instance, @expires
)
MERGE INTO ItemLock target
USING source
    ON target.serialId = source.serialId
WHEN NOT MATCHED BY TARGET THEN
    INSERT (serialId, instance, expires)
    VALUES (source.serialId, source.instance, source.expires)
WHEN MATCHED AND (target.instance = source.instance OR target.expires <= @now) THEN
    UPDATE SET instance = source.instance, expires = source.expires
OUTPUT INSERTED.serialId;
The output of the query, if any, will be the serial id you have a lock on. If there is no output, you don't have the lock. (A usage sketch follows these steps.)
Pass the current timestamp in the variable @now and the timestamp five minutes in the future in the variable @expires.
Run a keep-alive thread that submits this query every minute or however often you need it, extending the value in the variable @expires, say, five minutes past now every time. You can amend the query to extend locks that you haven't processed so far in a single batch. Make sure to commit right away.

When you're done processing, delete the lock only if it's yours:
DELETE FROM ItemLock WHERE serialId = @serialId AND instance = @instance
This way:
- If your application dies, the lock will die by itself in 5 minutes.
- No concurrent processing will happen, as long as you successfully obtain the lock and keep it alive.
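For illustration, a usage sketch that wraps the MERGE above in a procedure (the name TryAcquireItemLock and the five-minute TTL are assumptions, not part of the answer):

CREATE PROCEDURE dbo.TryAcquireItemLock (@serialId BIGINT, @instance UNIQUEIDENTIFIER)
AS
BEGIN
    DECLARE @now DATETIME = GETDATE();
    DECLARE @expires DATETIME = DATEADD(MINUTE, 5, @now); -- lock TTL

    -- Take the lock if it's free, expired, or already ours
    WITH source (serialId, instance, expires) AS
    (
        SELECT @serialId, @instance, @expires
    )
    MERGE INTO ItemLock target
    USING source
        ON target.serialId = source.serialId
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (serialId, instance, expires)
        VALUES (source.serialId, source.instance, source.expires)
    WHEN MATCHED AND (target.instance = source.instance OR target.expires <= @now) THEN
        UPDATE SET instance = source.instance, expires = source.expires
    OUTPUT INSERTED.serialId; -- a returned row means this instance holds the lock
END

The keep-alive thread can then extend all of this instance's locks in one batch, as the answer suggests:

UPDATE ItemLock
SET expires = DATEADD(MINUTE, 5, GETDATE())
WHERE instance = @instance; -- @instance is the guid generated for this run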
Of course, you can just install a real Redis instance and use a ready-made implementation of Redlock in your application, of which there are plenty.
- Hey, thanks for your answer! Couple things: "not sure why the name of your table has the word "queue" in it, but it looks to me that what you really want is a locking mechanism." - Bingo! That's why my question introduces it as "sort-of-a-queue". But yes, it's really a self-isolating locking system. Secondly, I appreciate the effort of an atomic upsert, but I'd rather avoid the risk of bugs with the MERGE statement and just hold the proper locks against the table within a transaction so I can run multiple statements as needed. Though I don't think I'd ever want to do an UPDATE with... – J.D., Mar 14, 2024 at 21:12
- ...my current use cases. The goal is to ensure once a specific SerialId is locked by an instance of the application process, only that same process can work on that SerialId so long as that instance of the SerialId row exists in the table. Finally, "If your application dies, the lock will die by itself in 5 minutes." - It seems like the business rules would prefer to let the lock persist in such a scenario so that manual intervention can be had. Ultimately this application process is syncing data (in a non-transactional way) to a 3rd party system with limited controls / access. – J.D., Mar 14, 2024 at 21:13
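As a sketch of the transaction-based alternative described in the comments (holding explicit locks instead of using MERGE; the UPDLOCK/HOLDLOCK hint combination is an assumption about how one might implement it):

BEGIN TRANSACTION;

-- UPDLOCK + HOLDLOCK takes and holds a key-range lock for this SerialId,
-- so concurrent callers serialize here instead of racing
IF NOT EXISTS
(
    SELECT 1
    FROM dbo.ProductQueue WITH (UPDLOCK, HOLDLOCK)
    WHERE SerialId = @SerialId
)
BEGIN
    INSERT INTO dbo.ProductQueue (SerialId, QueuedDateTime)
    VALUES (@SerialId, GETDATE());

    -- ... run any additional statements that must happen under the same lock ...
END

COMMIT TRANSACTION;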
I found the Service Broker to be completely unsatisfactory, imposing a lot of overhead. The conversations require cleanup. You are better off with your table.
I would also recommend allowing row locks on the index behind your primary key.
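For reference, a sketch of what that looks like (the index name PK_ProductQueue is an assumption; check the actual name with sp_helpindex):

-- Row locks are allowed by default, but if they've been disabled on the index:
ALTER INDEX PK_ProductQueue ON dbo.ProductQueue
SET (ALLOW_ROW_LOCKS = ON);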
Nothing wrong with what you are doing, but SQL Server already has a built-in service, Service Broker, that handles and manages data queues. I would humbly suggest you invest your time in that.
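For a flavor of what that involves, a minimal sketch of a Service Broker queue for these SerialIds (the object names and the single-service, DEFAULT-contract setup are illustrative, not from the answer):

-- One-time setup (the database must have ENABLE_BROKER set)
CREATE QUEUE dbo.ProductBrokerQueue;
CREATE SERVICE ProductBrokerService
    ON QUEUE dbo.ProductBrokerQueue ([DEFAULT]);

-- Enqueue: send a SerialId as a message
DECLARE @handle UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @handle
    FROM SERVICE ProductBrokerService
    TO SERVICE 'ProductBrokerService'
    ON CONTRACT [DEFAULT]
    WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @handle (N'1001');

-- Dequeue: atomically receive one message, waiting up to 5 seconds
DECLARE @dialog UNIQUEIDENTIFIER, @serialId NVARCHAR(20);
WAITFOR
(
    RECEIVE TOP (1)
        @dialog   = conversation_handle,
        @serialId = CAST(message_body AS NVARCHAR(20))
    FROM dbo.ProductBrokerQueue
), TIMEOUT 5000;

-- The conversation cleanup the other answer complains about:
IF @dialog IS NOT NULL
    END CONVERSATION @dialog;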
- Hey, I didn't downvote you, but I'd be happy to upvote you if you wanted to link to what you're referring to and expand on your answer a little bit by describing it further (which you can probably pull from the docs). 🙂 – J.D., Mar 15, 2024 at 12:53