This statement:
INSERT INTO deleteme
SELECT #t.id, 'test' FROM #t
LEFT JOIN deleteme ON deleteme.id = #t.id
WHERE deleteme.id IS NULL;
...will fail with a primary key violation in the concurrent scenario, i.e. when several sessions insert the same key (one that is not yet in deleteme) at the same time, under the default transaction isolation level of READ COMMITTED:
Error: Violation of PRIMARY KEY constraint 'PK_DeleteMe'. Cannot insert duplicate key in object 'dbo.deleteme'.
What is the best way to prevent that?
The table looks something like this:
CREATE TABLE [dbo].[DeleteMe](
[id] [uniqueidentifier] NOT NULL,
[Value] [varchar](max) NULL,
CONSTRAINT [PK_DeleteMe] PRIMARY KEY ([id] ASC));
Update
From comments:
Why do you have multiple sessions, which can obviously pull the same id from somewhere, using the same permanent table without having some kind of session key to tell them apart? Even without a duplicate error, how are you going to know which rows belong to which session?
This is a stored procedure that is called by an external service that populates data in this table. The service generates the record id and it provides no guarantee that it won't send the same data twice. Or send the same data at the same time.
The database code is supposed to discard all the records with the same id, if one already exists. The service never sends two records with the same id in the same batch.
2 Answers
If you really need to run multiple threads at the same time, you can enable the IGNORE_DUP_KEY option on the primary key.
With this option, an insert that would produce a duplicate key raises a warning instead of an error: the row(s) that would violate uniqueness are silently discarded, and the rest of the insert succeeds.
CREATE TABLE [dbo].[DeleteMe](
[id] [uniqueidentifier] NOT NULL,
[Value] [varchar](max) NULL,
CONSTRAINT [PK_DeleteMe]
PRIMARY KEY ([id] ASC)
WITH (IGNORE_DUP_KEY = ON));
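For example (a sketch against the table above), re-running the same insert no longer raises error 2627; SQL Server discards the duplicate row and emits the informational message "Duplicate key was ignored.":

```sql
DECLARE @id uniqueidentifier = NEWID();

INSERT INTO dbo.DeleteMe (id, Value) VALUES (@id, 'first');   -- row inserted
INSERT INTO dbo.DeleteMe (id, Value) VALUES (@id, 'second');  -- row discarded;
-- the batch continues with the warning "Duplicate key was ignored."

SELECT COUNT(*) FROM dbo.DeleteMe WHERE id = @id;  -- 1
```

Note that the discard is per row: in a multi-row INSERT ... SELECT, only the offending rows are dropped and the rest are still inserted.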
Paul White has a detailed explanation of the IGNORE_DUP_KEY option. Thanks Paul.
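If you would rather keep the primary key strict and avoid the warning, another common pattern (a sketch, adapted to the question's tables) is to serialize the existence check and the insert with UPDLOCK and SERIALIZABLE hints, so that concurrent sessions block on the key range instead of racing past each other's uncommitted inserts:

```sql
BEGIN TRANSACTION;

INSERT INTO dbo.DeleteMe (id, Value)
SELECT t.id, 'test'
FROM #t AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.DeleteMe AS d WITH (UPDLOCK, SERIALIZABLE)
    WHERE d.id = t.id
);

COMMIT TRANSACTION;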
The name of the table DeleteMe suggests that you are accumulating IDs into this table and then periodically deleting rows with these IDs from some other permanent table. Is that true?
If it is, then you can allow DeleteMe to have duplicate IDs. Just have a non-unique index on id instead of a unique one (and since id is a uniqueidentifier, it makes sense to make this index non-clustered as well):
CREATE TABLE [dbo].[DeleteMe](
[id] [uniqueidentifier] NOT NULL,
[Value] [varchar](max) NULL
);
CREATE NONCLUSTERED INDEX [IX_ID] ON [dbo].[DeleteMe]
(
[id] ASC
);
If you use DeleteMe to remove rows from another MainTable like this, then it doesn't matter whether id in DeleteMe is unique or not:
DELETE FROM MainTable
WHERE MainTable.ID IN (SELECT id FROM DeleteMe)
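Duplicates in DeleteMe are harmless here because IN is a semi-join: each MainTable row is deleted at most once, no matter how many times its id appears in DeleteMe. An equivalent form (a sketch, assuming the MainTable above), which SQL Server typically optimizes the same way:

```sql
DELETE FROM MainTable
WHERE EXISTS (
    SELECT 1
    FROM DeleteMe
    WHERE DeleteMe.id = MainTable.ID
);
```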
BTW, the query that populates DeleteMe becomes much simpler as well:
INSERT INTO DeleteMe (id, Value)
SELECT #t.id, 'test'
FROM #t
- At first I couldn't understand what the downvote was for, but it has now occurred to me that the reason could be this part of the question: "The service generates the record id..." If the service generates the ID, it's unlikely the ID will be used the way your answer assumes. – Andriy M, Aug 21, 2015 at 6:36
- @AndriyM, somebody decided to downvote all my questions on SO and this answer today... As for this question, it is not really clear what the ultimate goal is. If you really need to filter out duplicates, then a good solution would be IGNORE_DUP_KEY, as Aaron said. I tried to find this hint in the docs earlier but couldn't, because it is not in the section with table/query hints; it's an index option. But it may be OK to store duplicate IDs; it depends on how they are used later. – Vladimir Baranov, Aug 21, 2015 at 7:29
#t must have duplicates at least across sessions. If you have two sessions that are both trying to insert the same primary key value at the same time, one of them will have to get an error. If you want to prevent the error, you'd have to prevent two sessions from trying to insert the same primary key value at the same time. In this case, that would mean ensuring that whatever populates #t populates a different set of rows in each session.