Take the following queries:
DECLARE @X VARCHAR(200) = '1,2,3,4'
SELECT
*,
dbo.aUserDefinedScalarFunction(4) AS ScalarValue
FROM
MyTable T
INNER JOIN dbo.aUserDefineTableFunction(@X) A ON T.SomeID=A.SomeID
WHERE
(T.ID1 IS NULL OR T.ID1 IN (SELECT [value] FROM STRING_SPLIT(@X,',')))
AND
(T.ID2 IS NULL OR T.ID2 IN (SELECT Value FROM dbo.MySplitterFunction(@X))
I normally create indexed #tempTables for the WHERE
conditions above, I have found that to perform better on large datasets. However, I still can't find definitive answers to the following questions:
Will the query analyzer optimize aUserDefinedScalarFunction(4) As ScalarValue or does it get evaluated for each record?
Will the INNER JOIN dbo.aUserDefineTableFunction(@X) get materialized into a temp table once or will it be executed for each record? The function returns table (not a table variable).
Does the result of SELECT [value] FROM STRING_SPLIT(@X,',') get optimized or is it evaluated for each comparison?
Does the result of SELECT Value FROM dbo.MySplitterFunction(@X) get optimized or is it evaluated during each comparison?
-
Better question is why you are storing multiple pieces of info in the same column in the first place, and why you are using scalar functions given their inherent performance issuesCharlieface– Charlieface2022年05月01日 14:35:28 +00:00Commented May 1, 2022 at 14:35
-
@Charlieface - The above is not something I would normally use. I just put it there to clarify the questions I had.Ross Bush– Ross Bush2022年05月02日 19:48:00 +00:00Commented May 2, 2022 at 19:48
2 Answers 2
This depends on the function. For a deterministic function specified with schemabinding, the optimizer will usually be able to evaluate its result just once, caching the answer.
For details, see:
Inline table-valued functions are expanded into the calling query text before optimization, so there is no special opportunity for early evaluation and caching. That does not mean the execution plan won't feature any cache-and-reuse optimizations. Where applicable, the optimizer will consider those (e.g. a spool) just as it would for a query without an inline function.
Generally, the optimizer will be able to tell that
SPLIT_STRING
on a constant input returns deterministic results. The plan is likely to evaluate the split just once, but it does depend on plan shape.Assuming the splitter function is a multi-statement table-valued function, the result is very likely to be evaluated once. For details, and how you can tell if caching occurs from the execution plan, see:
All that said, you are right to be cautious about all this. The optimizer will frequently be able to evaluate expressions once and reuse the result, but this is not guaranteed, and often requires expert analysis to determine.
Forcing materialization using temporary tables is frequently the most robust solution since it guarantees a particular order and number of evaluations.
-
Thanks for the informative reply! I will look into the links you posted.Ross Bush– Ross Bush2022年04月29日 12:19:59 +00:00Commented Apr 29, 2022 at 12:19
Every row will evaluate the functions in the where clause. Research SARGABLE query. https://www.sqlshack.com/how-to-use-sargable-expressions-in-t-sql-queries-performance-advantages-and-examples/
DECLARE @X VARCHAR(200) = '1,2,3,4'
DECLARE @StringElements TABLE (Snippet VARCHAR(200))
INSERT INTO @StringElements (Snippet)
(SELECT [value] FROM STRING_SPLIT(@X,','))
DECLARE @XBits TABLE (Snippet VARCHAR(200))
INSERT INTO @XBits
SELECT Value FROM dbo.MySplitterFunction(@X)
DECLARE @ScalarValue INT = dbo.aUserDefinedScalarFunction(4) --AS ScalarValue
SELECT
*,
@ScalarValue AS ScalarValue
FROM
MyTable T
INNER JOIN dbo.aUserDefineTableFunction(@X) A ON T.SomeID=A.SomeID
WHERE
(T.ID1 IS NULL OR T.ID1 IN (SELECT snippet [value] FROM @StringElements))
AND
(T.ID2 IS NULL OR T.ID2 IN (SELECT Snippet [Value] FROM @Xbits))