In SQL SERVER 2008 I'm thinking about a query such as:
SELECT ...
FROM ...
WHERE ...
AND @a=1
UNION
SELECT ...
FROM ...
WHERE ...
AND @a=2
Given that @a
is clearly the same value, and therefore either the first SELECT
returns results or the second SELECT
does, will the query optimizer identify one of the queries as false and not run it? Would it be better for me to put them in an IF
statement? The actual query is a bit more complicated than written of course, so the if
would make it like 4 different queries, which would be a pain to maintain.
I'm hoping the optimizer is smart enough to see the 'and false' and not waste time with it.
3 Answers 3
With respect to your question
will the query optimizer identify one of the queries as false and not run it?
You'd need to check the execution plans. An example below where it doesn't.
DECLARE @P INT =1
SET STATISTICS IO ON;
SELECT name
FROM sys.objects
WHERE type='P' AND @P=1
UNION
SELECT name
FROM master..spt_values
WHERE number >0 AND @P=0
Output
Table 'Worktable'. Scan count 0, logical reads 0
Table 'sysschobjs'. Scan count 1, logical reads 5
Notice that master..spt_values
was not touched in the query execution. The reason for this is that the seek on this table is under a filter with a start up expression predicate of [@P]=(0)
. As this condition is not met the seek is never actually executed.
Both the Startup Expression and number of executions are shown in the operator tool tips as below.
Plan
With regard to the other part of your question
Would it be better for me to put them in an IF statement?
Yes probably despite that. You're then guaranteed that the behaviour that you get is that which you want. Also even though master..spt_values
doesn't have any pages read above it does still have an IS
lock taken out against it (which doesn't happen in the IF ... ELSE
version)
Better still might be to use an If
that calls separate stored procedures for each branch. You haven't shown the rest of your code but one possible issue with these mutually exclusive branches might be parameter sniffing. The batch will be compiled as per the first set of parameters seen. These might be wholly unsuitable for the other branch(es). Splitting into child batches allows each branch to be compiled separately taking into account the parameter values that actually occurred when that branch was executed.
-
However, the Distinct Sort step might be unnecessary, right? We select from at most one object, and name is presumably unique in each of them, is it not.A-K– A-K2012年10月05日 18:22:08 +00:00Commented Oct 5, 2012 at 18:22
-
2@AlexKuznetsov - Agreed. They probably need
UNION ALL
and even if they don't and theDISTINCT
semantics are actually desired they might get a better plan havingDISTINCT
in the two individual queries.Martin Smith– Martin Smith2012年10月05日 18:25:09 +00:00Commented Oct 5, 2012 at 18:25
SQL Server optimizer is closed source, so we cannot see what exactly is going on under the hood. Of course, there are tons of documentation, and just as surely there are lots of cases when this thing does not behave exactly as documented.
In my experience, all documentation from all vendors tends to get out of sync quite soon, sometimes before the first release ;). The only 100% accurate source of information is source code, pun intended.
So, even if you observe some behavior right now, there is no guarantee it cannot change later on. The reason: we have no idea what is involved, how the optimizer is making its decisions. So, we do not know if we have tested all the situations.
As such, I would keep it simple:
IF @a=1 BEGIN ;
SELECT ... FROM ... ;
END ELSE BEGIN ;
SELECT ... FROM ... ;
END ;
To reuse code and reduce maintenance costs, I would have both selects utilize an inline UDF to keep common functionality in one place.
Besides, as demonstrated in Martin Smith's excellent answer, UNION leads to a possibly unnecessary sort, which may be very expensive.
In general, it pays to write shorter, more specific queries - you have a better chance to get a good execution plan.
The query optimizer must come up with an execution plan that works for any value of @a. This plan will be executed when @a=1 and the very same execution plan will be executed when @a=2. Therefore by definition the answer to your question is NO: all queries in the UNION will be executed. There is simply no other way.
-
2Just because it is in the plan doesn't mean it is executed though.Martin Smith– Martin Smith2012年10月05日 18:13:46 +00:00Commented Oct 5, 2012 at 18:13
-
@Martin: that is splitting hairs. While true that there are some plan portions than may not be executed at runtime (eg. dynamic partition elimination, a nested loop that never probes, a TOP satisfied before jumping into an UNION etc) that is really the exception. Both as a way to explain how it works and for the vast majority of real use cases the plan is always executed entirely.Remus Rusanu– Remus Rusanu2012年10月05日 18:50:14 +00:00Commented Oct 5, 2012 at 18:50
-
For the case given in the OP a filter with startup predicate seems quite common though.Martin Smith– Martin Smith2012年10月05日 18:57:08 +00:00Commented Oct 5, 2012 at 18:57
-
@Martin: is a very slippery slope. Explicit IF branches are million times safer. And ultimately this question is the hallmark of wrong API design, the one jack-of-all-trades procedure that can return sprockets or widgets depending on what argument is called with.Remus Rusanu– Remus Rusanu2012年10月05日 19:15:31 +00:00Commented Oct 5, 2012 at 19:15
-
Not to mention that the OP has an assumption about boolean short circuit, which is incorrect: rusanu.com/2009/09/13/…Remus Rusanu– Remus Rusanu2012年10月05日 19:19:56 +00:00Commented Oct 5, 2012 at 19:19
WHERE
criteria in parallel to make it faster, which in your case would be slower.