It's documented that recursive variable assignment (like this ↓↓↓↓) can sometimes return incorrect results:

SELECT @var = @var + [Name]
FROM dbo.People
...

But does that apply even when I am sure only one row will be returned?

Example (Fiddle):

CREATE TABLE dbo.People
(
 Id INT IDENTITY PRIMARY KEY,
 [Name] varchar(50) NOT NULL
);
INSERT INTO dbo.People([Name]) VALUES ('Bob');
DECLARE @var varchar(50) = '';
/*Is this bad even if the query always returns only one row?*/
SELECT @var = @var + [Name]
FROM dbo.People
WHERE Id = 1;
SELECT [@var] = @var;

The documentation says that "In this case, it isn't guaranteed that @Var would be updated on a row by row basis." which (to me) suggests that a single-row select should be fine, right?

Another case where one might want to use this is when calculating an aggregate value and adding it to an existing variable:

SELECT @Total = @Total + SUM(InvoiceTotal) /*single row, aggregate of several values*/
FROM dbo.Invoice
WHERE ...;
/*Notice no GROUP BY*/
asked Nov 24, 2024 at 7:51

1 Answer

sometimes maybe

There may be cases I haven't found where single-row assignment behaves unexpectedly, but when the single row is guaranteed by the query itself (rather than by ORDER BY with TOP or OFFSET/FETCH filtering), that guarantee is relatively strong.
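If you would rather have the single-row assumption enforced than assumed, one option is SET with a scalar subquery. This is only a minimal sketch: the table variable @People is a stand-in for the question's dbo.People so that it runs on its own.

```sql
/*Stand-in for dbo.People from the question*/
DECLARE @People table
(
    Id int IDENTITY PRIMARY KEY,
    [Name] varchar(50) NOT NULL
);

INSERT INTO @People ([Name])
VALUES ('Bob');

DECLARE
    @var varchar(50) = '';

/*A scalar subquery raises an error if it returns more than one row,
  where SELECT @var = ... would not complain*/
SET @var = @var +
(
    SELECT p.[Name]
    FROM @People AS p
    WHERE p.Id = 1
);

SELECT
    [@var] = @var;
```

One difference to keep in mind: if no row matches, the subquery yields NULL and @var becomes NULL, whereas the SELECT form leaves @var untouched.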

As far as I've ever seen, the problem is mostly for quirky updates and string concatenation. String concatenation is also the example given in your linked documentation page. Both of these rely on undocumented behavior, which is dangerous at best. The number of things that can change to produce incorrect results is tough to control fully.

It's probably worth noting that you can get around the problems with string concatenation by using FOR XML PATH instead, too. Microsoft only documented STRING_AGG in their example.
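A rough sketch of both alternatives, for comparison. The table variable @People stands in for the question's dbo.People, and the second row, the ', ' separator, and ordering by Id are my own assumptions for illustration. STRING_AGG needs SQL Server 2017 or later.

```sql
/*Stand-in for dbo.People from the question, with an extra row*/
DECLARE @People table
(
    Id int IDENTITY PRIMARY KEY,
    [Name] varchar(50) NOT NULL
);

INSERT INTO @People ([Name])
VALUES ('Bob'), ('Alice');

/*SQL Server 2017+: documented aggregate with an explicit order*/
SELECT
    Names = STRING_AGG(p.[Name], ', ') WITHIN GROUP (ORDER BY p.Id)
FROM @People AS p;

/*Older versions: FOR XML PATH, with STUFF removing the leading separator*/
SELECT
    Names =
        STUFF
        (
            (
                SELECT
                    ', ' + p.[Name]
                FROM @People AS p
                ORDER BY p.Id
                FOR XML PATH(''), TYPE
            ).value('.', 'varchar(max)'),
            1,
            2,
            ''
        );
```

The TYPE directive plus .value() is there so that characters like & and < in the names come back as themselves rather than as XML entities.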

The multi-row aggregate is somewhat more interesting to me under two scenarios:

  • NULL variables
  • Isolation levels

On NULLs

Here you really only have to be careful to initialize your starting variable with a value.

DECLARE
 @Total bigint; /*Implicitly assigned NULL*/
SELECT @Total = @Total + SUM(u.Reputation)
FROM dbo.Users AS u
WHERE u.DisplayName = N'john';
SELECT
 Total = @Total;
GO
DECLARE
 @Total bigint = 0; /*Explicitly assigned 0*/
SELECT @Total = @Total + SUM(u.Reputation)
FROM dbo.Users AS u
WHERE u.DisplayName = N'john';
SELECT
 Total = @Total;
GO

The first query will return NULL, because NULL plus anything is NULL, and the second query will return the correct total. NULL values among the rows being summed will of course not change things, since SUM ignores them.
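One related case is easy to check: an aggregate without GROUP BY still returns a single row when nothing matches, and SUM over no rows is NULL, so even an initialized variable ends up NULL. A small sketch, where the table variable @Users and its two rows are hypothetical stand-ins for dbo.Users:

```sql
/*Hypothetical stand-in for dbo.Users*/
DECLARE @Users table
(
    DisplayName nvarchar(40) NOT NULL,
    Reputation int NOT NULL
);

INSERT INTO @Users (DisplayName, Reputation)
VALUES (N'john', 10), (N'john', 32);

DECLARE
    @Total bigint = 0;

/*No rows match, so SUM returns NULL and 0 + NULL is NULL*/
SELECT @Total = @Total + SUM(u.Reputation)
FROM @Users AS u
WHERE u.DisplayName = N'nobody';

SELECT
    Total = @Total; /*NULL*/

SET @Total = 0;

/*ISNULL keeps the running total intact when nothing matches*/
SELECT @Total = @Total + ISNULL(SUM(u.Reputation), 0)
FROM @Users AS u
WHERE u.DisplayName = N'nobody';

SELECT
    Total = @Total; /*0*/
```

Whether a silent 0 or a visible NULL is the better outcome for "no matching rows" depends on what you do with the total afterwards.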

Isolation

This really only applies if you need a true "point in time" aggregate, but that largely depends on what you're doing with the total afterwards and its importance.

Consider that under most isolation levels, even Read Committed, you may miss or double count rows under concurrency with modification queries.

To avoid that, you would need to use either Snapshot or Serializable isolation, both of which explicitly disallow dirty, phantom, and non-repeatable reads.

Read Committed Snapshot Isolation may also provide adequate results, depending on what you're trying to avoid and how strict you need the reads to be.
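For the shape of it, here is a minimal sketch of taking the aggregate under Serializable. The table variable @Invoice and its rows are hypothetical stand-ins for the question's dbo.Invoice, so this shows only where the statements go, not the concurrency behavior itself. Snapshot would look the same with SNAPSHOT in place of SERIALIZABLE, but it requires ALLOW_SNAPSHOT_ISOLATION to be ON for the database.

```sql
/*Hypothetical stand-in for dbo.Invoice*/
DECLARE @Invoice table
(
    Id int IDENTITY PRIMARY KEY,
    InvoiceTotal decimal(19, 4) NOT NULL
);

INSERT INTO @Invoice (InvoiceTotal)
VALUES (100.00), (250.50);

DECLARE
    @Total decimal(19, 4) = 0;

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

    SELECT @Total = @Total + ISNULL(SUM(i.InvoiceTotal), 0)
    FROM @Invoice AS i;

    /*Anything that must agree with @Total goes here, inside the same transaction*/

COMMIT TRANSACTION;

/*Put the session back to the default level*/
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

SELECT
    Total = @Total;
```

Serializable holds range locks until the transaction ends, so it is worth keeping the transaction as short as the work allows.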

answered Nov 24, 2024 at 16:44
