
We have encountered this strange error three times over the past few days, after being error-free for 8 weeks, and I'm stumped.

This is the error message:

Executing the query "EXEC dbo.MergeTransactions" failed with the following error:
"Cannot insert duplicate key row in object 'sales.Transactions' with unique index
'NCI_Transactions_ClientID_TransactionDate'.
The duplicate key value is (1001, 2018-12-14 19:16:29.00, 304050920).".

The index named in the error is not unique. If you notice, the duplicate key value in the error message doesn't even line up with the index: the index has only two key columns, yet three values are reported. The strange thing is that if I rerun the proc, it succeeds.

This is the most recent link I could find that describes my issue, but I don't see a solution there.

https://www.sqlservercentral.com/forums/topic/error-cannot-insert-duplicate-key-row-in-a-non-unique-index

A couple things about my scenario:

  • The proc is updating the TransactionID (the clustered primary key). I think this is what is causing the error, but I don't know why. We'll be removing that logic.
  • Change tracking is enabled on the table
  • The transaction runs at READ UNCOMMITTED isolation

There are 45 fields in each table; I mainly listed the ones used in indexes. I'm updating the TransactionID (clustered key) in the update statement, unnecessarily. It's strange that we had no issues for months until last week, and it only happens sporadically when run via SSIS.

Table

USE [DB]
GO
/****** Object: Table [sales].[Transactions] Script Date: 5/29/2019 1:37:49 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND type in (N'U'))
BEGIN
CREATE TABLE [sales].[Transactions]
(
 [TransactionID] [bigint] NOT NULL,
 [ClientID] [int] NOT NULL,
 [TransactionDate] [datetime2](2) NOT NULL,
 /* snip*/
 [BusinessUserID] [varchar](150) NOT NULL,
 [BusinessTransactionID] [varchar](150) NOT NULL,
 [InsertDate] [datetime2](2) NOT NULL,
 [UpdateDate] [datetime2](2) NOT NULL,
 CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED 
(
 [TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [DB_Data]
) ON [DB_Data]
END
GO
USE [DB]
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND name = N'NCI_Transactions_ClientID_TransactionDate')
begin
CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
 [ClientID] ASC,
 [TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE) ON [DB_Data]
END
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_Units]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_Units] DEFAULT ((0)) FOR [Units]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_ISOCurrencyCode]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_ISOCurrencyCode] DEFAULT ('USD') FOR [ISOCurrencyCode]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_InsertDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_InsertDate] DEFAULT (sysdatetime()) FOR [InsertDate]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_UpdateDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_UpdateDate] DEFAULT (sysdatetime()) FOR [UpdateDate]
END
GO

Temporary table

It has the same columns as the mgdata table, including the relevant fields, and it also has a non-unique clustered index:
(
 [BusinessTransactionID] [varchar](150) NULL,
 [BusinessUserID] [varchar](150) NULL,
 [PostalCode] [varchar](25) NULL,
 [TransactionDate] [datetime2](2) NULL,
 [Units] [int] NOT NULL,
 [StartDate] [datetime2](2) NULL,
 [EndDate] [datetime2](2) NULL,
 [TransactionID] [bigint] NULL,
 [ClientID] [int] NULL
) 
CREATE CLUSTERED INDEX ##workingTransactionsMG_idx ON #workingTransactions (TransactionID)
It is populated in batches (500k rows at a time), something like this:
IF OBJECT_ID(N'tempdb.dbo.#workingTransactions') IS NOT NULL DROP TABLE #workingTransactions;
select fields 
into #workingTransactions
from import.Transactions
where importrowid between two number ranges -- pseudocode
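
Expanded a little, the batched load looks roughly like the sketch below (simplified; the variable names and the column list are placeholders rather than the real code):

DECLARE @BatchStart bigint = 1;
DECLARE @BatchSize  bigint = 500000;  -- 500k rows per batch
DECLARE @MaxRowID   bigint = (SELECT MAX(importrowid) FROM import.Transactions);

WHILE @BatchStart <= @MaxRowID
BEGIN
    IF OBJECT_ID(N'tempdb.dbo.#workingTransactions') IS NOT NULL
        DROP TABLE #workingTransactions;

    -- pull the next slice of import rows into the working table
    SELECT TransactionID, ClientID, TransactionDate /* , ...remaining fields... */
    INTO #workingTransactions
    FROM import.Transactions
    WHERE importrowid BETWEEN @BatchStart AND @BatchStart + @BatchSize - 1;

    CREATE CLUSTERED INDEX ##workingTransactionsMG_idx ON #workingTransactions (TransactionID);

    EXEC dbo.MergeTransactions;  -- proc that updates/inserts sales.Transactions from the working table

    SET @BatchStart += @BatchSize;
END;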

Primary Key

 CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED 
(
 [TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [Data]
) ON [Data]

Non-clustered index

CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
 [ClientID] ASC,
 [TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE)

Sample update statement

-- updates every field
update t 
set 
 t.transactionid = s.transactionid,
 t.[CityCode]=s.[CityCode],
 t.TransactionDate=s.[TransactionDate],
 t.[ClientID]=s.[ClientID],
 t.[PackageMonths] = s.[PackageMonths],
 t.UpdateDate = @UpdateDate
 FROM #workingTransactions s
 JOIN [DB].[sales].[Transactions] t 
 ON s.[TransactionID] = t.[TransactionID]
 WHERE HASHBYTES('SHA2_256', CONCAT(S.[BusinessTransactionID], '|', S.[BusinessUserID], '|' /* , ...remaining fields... */))
  <> HASHBYTES('SHA2_256', CONCAT(T.[BusinessTransactionID], '|', T.[BusinessUserID], '|' /* , ...remaining fields... */))

My question is, what is going on under the hood? And what is the solution? For reference, the link above mentions this:

At this point, I have a few theories:

  • Bug related to memory pressure or a large parallel update plan, but I would expect a different type of error, and so far I cannot correlate low resources with the timeframe of these isolated and sporadic errors.
  • A bug in the UPDATE statement or data is causing an actual duplicate violation on the primary key, but some obscure SQL Server bug results in an error message that cites the wrong index name.
  • Dirty reads resulting from read uncommitted isolation are causing a large parallel update to double insert. But the ETL developers claim the default read committed is used, and it's hard to determine exactly what isolation level the process actually uses at runtime.

I suspect that if I tweak the execution plan as a workaround, perhaps with a MAXDOP (1) hint or a session trace flag to disable the spool operation, the error will just go away, but it's unclear how this would impact performance.
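
If we go the MAXDOP route, I assume the workaround would just be a query hint appended to the update, along these lines (sketch only):

UPDATE t
SET t.[ClientID] = s.[ClientID],
    t.[TransactionDate] = s.[TransactionDate],
    t.UpdateDate = SYSDATETIME()
FROM #workingTransactions AS s
JOIN [DB].[sales].[Transactions] AS t
    ON s.[TransactionID] = t.[TransactionID]
OPTION (MAXDOP 1);  -- force a serial plan for this statement only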

Version

Microsoft SQL Server 2017 (RTM-CU13) (KB4466404) - 14.0.3048.4 (X64) Nov 30 2018 12:57:58 Copyright (C) 2017 Microsoft Corporation Enterprise Edition (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393: )

asked Jun 18, 2019 at 14:54

1 Answer

My question is, what is going on under the hood? And what is the solution?

It is a bug. The problem is that it only happens occasionally, and will be tough to reproduce. Still, your best chance is to engage Microsoft support. Update processing is mind-bendingly complex, so this will require a very detailed investigation.

For an example of the sort of complexities involved, have a look at my posts MERGE Bug with Filtered Indexes and Incorrect Results with Indexed Views. Neither of those relate directly to your issue, but they do give a flavour.

Write a deterministic update

That's all rather generic of course. Perhaps more usefully, I can say that you should look to rewrite your current UPDATE statement. As the documentation says:

Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.

Your UPDATE is not deterministic, and the results are therefore undefined. You should change it so that at most one source row is identified for each target row. Without that change, the result of the update may not reflect any individual source row.

Example

Let me show you an example, using tables loosely modelled on those given in the question:

CREATE TABLE dbo.Transactions
(
 TransactionID bigint NOT NULL,
 ClientID integer NOT NULL,
 TransactionDate datetime2(2) NOT NULL,
 CONSTRAINT PK_dbo_Transactions
 PRIMARY KEY CLUSTERED (TransactionID),
 INDEX dbo_Transactions_ClientID_TranDate
 (ClientID, TransactionDate)
);
CREATE TABLE #Working
(
 TransactionID bigint NULL,
 ClientID integer NULL,
 TransactionDate datetime2(2) NULL,
 INDEX cx CLUSTERED (TransactionID)
);

To keep things simple, put one row in the target table, and four rows in the source:

INSERT dbo.Transactions 
 (TransactionID, ClientID, TransactionDate)
VALUES 
 (1, 1, '2019-01-01');
INSERT #Working 
 (TransactionID, ClientID, TransactionDate)
VALUES 
 (1, 2, NULL),
 (1, NULL, '2019-03-03'),
 (1, 3, NULL),
 (1, NULL, '2019-02-02');

All four source rows match the target on TransactionID, so which one will be used if we run an update (like the one in the question) that joins on TransactionID alone?

UPDATE T
SET T.TransactionID = W.TransactionID,
 T.ClientID = W.ClientID,
 T.TransactionDate = W.TransactionDate
FROM #Working AS W
JOIN dbo.Transactions AS T
 ON T.TransactionID = W.TransactionID;

(Updating the TransactionID column is not important for the demo; you can comment it out if you like.)

The first surprise is that the UPDATE completes without an error, despite the target table not allowing nulls in any column (all the candidate rows contain a null).

The important point is that the result is undefined, and in this case produces an outcome that matches none of the source rows:

SELECT
 T.TransactionID,
 T.ClientID,
 T.TransactionDate
FROM dbo.Transactions AS T;
╔═══════════════╦══════════╦════════════════════════╗
║ TransactionID ║ ClientID ║ TransactionDate        ║
╠═══════════════╬══════════╬════════════════════════╣
║             1 ║        2 ║ 2019-03-03 00:00:00.00 ║
╚═══════════════╩══════════╩════════════════════════╝

db<>fiddle demo

More details: The ANY Aggregate is Broken
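
For example, one way to make the demo update deterministic is to number the source rows per TransactionID and join only the first row from each group. The ORDER BY used as the tie-breaker here is just an assumption for the sketch; you would choose whatever rule correctly identifies the row that should win:

WITH NumberedSource AS
(
    SELECT
        W.TransactionID,
        W.ClientID,
        W.TransactionDate,
        rn = ROW_NUMBER() OVER (
            PARTITION BY W.TransactionID
            ORDER BY W.TransactionDate DESC)  -- assumed tie-breaker
    FROM #Working AS W
)
UPDATE T
SET T.ClientID = NS.ClientID,
    T.TransactionDate = NS.TransactionDate
FROM NumberedSource AS NS
JOIN dbo.Transactions AS T
    ON T.TransactionID = NS.TransactionID
WHERE NS.rn = 1;

With the demo data every candidate row contains a NULL, so this version fails the NOT NULL constraints instead of completing silently, which is arguably preferable; with clean, de-duplicated source data it updates each target row from exactly one well-defined source row.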

The update should be written such that it would succeed if written as the equivalent MERGE statement, which does check for attempts to update the same target row more than once. I do not generally recommend using MERGE directly, because it has been subject to so many implementation bugs, and normally has worse performance.
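
For illustration only (not a recommendation to switch to MERGE), the MERGE equivalent of the demo update refuses to run against the four-row source, reporting that it attempted to update the same row more than once:

MERGE dbo.Transactions AS T
USING #Working AS W
    ON W.TransactionID = T.TransactionID
WHEN MATCHED THEN
    UPDATE SET
        T.ClientID = W.ClientID,
        T.TransactionDate = W.TransactionDate;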

As a bonus, you may find that rewriting your current update to be deterministic makes your occasional bug problem go away as well. The product bug will still exist for people who write non-deterministic updates, of course.

answered Jun 23, 2019 at 13:06
