Is recursive variable assignment bad even if recursively assigning single row?

Question 1

It's documented that recursive variable assignment (like the example below) can sometimes return incorrect results:

SELECT @var = @var + [Name]
FROM dbo.People
...

But does that apply even when I am sure only one row will be returned?

Example (Fiddle):

CREATE TABLE dbo.People
(
 Id INT IDENTITY PRIMARY KEY,
 [Name] varchar(50) NOT NULL
);
INSERT INTO dbo.People([Name]) VALUES ('Bob');
DECLARE @var varchar(50) = '';
/*Is this bad even if the query always returns only one row?*/
SELECT @var = @var + [Name]
FROM dbo.People
WHERE Id = 1;
SELECT [@var] = @var;

The documentation says that "In this case, it isn't guaranteed that @Var would be updated on a row by row basis." which (to me) suggests that a single-row select should be fine, right?

Another example where one might want to use this is when calculating an aggregate value and adding it to an existing variable:

SELECT @Total = @Total + SUM(InvoiceTotal) /*single row, aggregate of several values*/
FROM dbo.Invoice
WHERE ...;
/*Notice no GROUP BY*/

Question 2

sometimes maybe

There may be some cases I haven't found where this would behave unexpectedly with single-row assignment, but single-row guarantees (ones that don't depend on ORDER BY with TOP or OFFSET/FETCH filtering) are relatively strong.

As far as I've ever seen, the problem is mostly with quirky updates and string concatenation. The string concatenation method is the example given in your linked documentation page, too. Both of these rely on undocumented behavior, which is dangerous at best. The number of things that can change to produce incorrect results is tough to control fully.

It's probably worth noting that you can get around the problems with string concatenation by using FOR XML PATH instead, too. Microsoft only documented STRING_AGG in their example.
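As a rough sketch of those two alternatives, here is how the concatenation could look against the dbo.People table from the question. The comma separator and the ORDER BY p.Id ordering are my assumptions for illustration, and STRING_AGG requires SQL Server 2017 or later.

```sql
DECLARE @var varchar(max);

/* SQL Server 2017+: STRING_AGG builds the string in one aggregate,
   so @var is assigned once rather than once per row. */
SELECT @var = STRING_AGG(CONVERT(varchar(max), p.[Name]), ',')
              WITHIN GROUP (ORDER BY p.Id)
FROM dbo.People AS p;

SELECT [STRING_AGG result] = @var;

/* Older versions: FOR XML PATH('') concatenates in a subquery.
   TYPE plus .value() avoids XML entity escaping (e.g. & becoming &amp;),
   and STUFF removes the leading separator. */
SELECT @var = STUFF(
    (
        SELECT ',' + p.[Name]
        FROM dbo.People AS p
        ORDER BY p.Id
        FOR XML PATH(''), TYPE
    ).value('.', 'varchar(max)'),
    1, 1, '');

SELECT [FOR XML PATH result] = @var;
```

In both forms the variable is not referenced on the right-hand side, so neither depends on the row-by-row assignment behavior.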

The multi-row aggregate is somewhat more interesting to me under two scenarios:

NULL variables
Isolation levels

On NULLs

Here you really only have to be careful to initialize your starting variable with a value.

DECLARE
 @Total bigint; /*Implicitly assigned NULL*/
SELECT @Total = @Total + SUM(u.Reputation)
FROM dbo.Users AS u
WHERE u.DisplayName = N'john';
SELECT
 Total = @Total;
GO
DECLARE
 @Total bigint = 0; /*Explicitly assigned 0*/
SELECT @Total = @Total + SUM(u.Reputation)
FROM dbo.Users AS u
WHERE u.DisplayName = N'john';
SELECT
 Total = @Total;
GO

The first query will return NULL, because adding anything to NULL yields NULL, and the second query will return the correct total. NULL values among the rows being summed will of course not change things, since SUM ignores them.
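One related case that may be worth guarding against: when no rows match the filter, a SUM without GROUP BY still returns a single row, and its value is NULL, so even an initialized variable ends up NULL. A minimal sketch, assuming the same dbo.Users table as above; the display name is a made-up value chosen so that no rows match.

```sql
DECLARE
 @Total bigint = 0; /*Explicitly assigned 0*/

/* With no matching rows, SUM returns NULL; ISNULL turns that into 0
   so the starting value of @Total survives the assignment. */
SELECT @Total = @Total + ISNULL(SUM(u.Reputation), 0)
FROM dbo.Users AS u
WHERE u.DisplayName = N'a name that matches nobody'; /*hypothetical value*/

SELECT
 Total = @Total;
```

Without the ISNULL wrapper, the same query would leave @Total as NULL.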

Isolation

This really only applies if you need a true "point in time" aggregate, but that largely depends on what you're doing with the total afterwards and its importance.

Consider that under most isolation levels, even Read Committed, you may miss or double count rows under concurrency with modification queries.

To avoid that, you would need to use either Snapshot or Serializable isolation, both of which explicitly disallow dirty, phantom, and non-repeatable reads.

Read Committed Snapshot Isolation may also provide adequate results, depending on what you're trying to avoid and how strict you need the reads to be.
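If you do need the stricter behavior, a sketch of running the aggregate under Snapshot isolation might look like the following. It assumes the database already has ALLOW_SNAPSHOT_ISOLATION turned on (the commented-out ALTER DATABASE line shows the setting) and reuses the dbo.Users query from above.

```sql
/* One-time database setting, assumed to be in place already: */
/* ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON; */

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

DECLARE
 @Total bigint = 0;

BEGIN TRANSACTION;

/* Reads see a consistent view of committed data as of the start of the transaction. */
SELECT @Total = @Total + SUM(u.Reputation)
FROM dbo.Users AS u
WHERE u.DisplayName = N'john';

COMMIT TRANSACTION;

SELECT
 Total = @Total;
```

Swapping SNAPSHOT for SERIALIZABLE gives the lock-based equivalent, at the cost of blocking concurrent modifications to the rows and ranges being read.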

score 3 · Accepted Answer · 2024-11-24 16:44:36Z

