I had a query (for Postgres and Informix) with a NOT IN
clause containing a subquery that in some cases returned NULL
values, causing that clause (and the entire query) to fail to return anything.
What's the best way to understand this? I thought of NULL as something without a value, and therefore wasn't expecting the query to fail, but obviously that's not the correct way to think of NULL.
1 Answer
Boolean logic - or three-valued logic
- IN is shorthand for a series of OR conditions
- x NOT IN (1, 2, NULL) is the same as NOT (x = 1 OR x = 2 OR x = NULL)
- ... is the same as x <> 1 AND x <> 2 AND x <> NULL
- ... is the same as true AND true AND unknown ** (when x is neither 1 nor 2)
- ... = unknown **
- ... which is almost the same as false in this case, as it will not pass the WHERE condition **
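The steps above can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for Postgres or Informix (an assumption; the NULL handling shown is the standard behavior, but it is worth rerunning on your own engine), and the table t is hypothetical.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for Postgres/Informix.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")  # hypothetical one-column table
conn.execute("INSERT INTO t VALUES (3)")

# Evaluate each step of the expansion; sqlite3 returns SQL NULL (unknown) as None.
steps = conn.execute(
    "SELECT x <> 1, x <> 2, x <> NULL, x NOT IN (1, 2, NULL) FROM t"
).fetchone()
print(steps)  # (1, 1, None, None)

# WHERE keeps a row only when the condition is true, so unknown drops the row.
with_null = conn.execute("SELECT x FROM t WHERE x NOT IN (1, 2, NULL)").fetchall()
without_null = conn.execute("SELECT x FROM t WHERE x NOT IN (1, 2)").fetchall()
print(with_null)     # []
print(without_null)  # [(3,)]
```

The same row is returned once the NULL is removed from the list, which shows that the NULL alone is what empties the result.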
Now, this is why folk use EXISTS + NOT EXISTS rather than IN + NOT IN. Also see The use of NOT logic in relation to indexes for more.
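To illustrate the difference, here is a minimal sketch comparing the two forms against a subquery that returns a NULL. The customers and orders tables are hypothetical, and SQLite (via Python's sqlite3 module) is assumed as a stand-in for Postgres or Informix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one order row has a NULL customer_id.
conn.executescript("""
    CREATE TABLE customers (id INTEGER);
    CREATE TABLE orders (customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1), (NULL);
""")

# NOT IN: the NULL in the subquery result makes the predicate unknown
# for every customer that has no matching order.
not_in = conn.execute("""
    SELECT id FROM customers
    WHERE id NOT IN (SELECT customer_id FROM orders)
    ORDER BY id
""").fetchall()

# NOT EXISTS: only asks whether a matching row exists, so the NULL row
# simply never matches and does not affect the outcome.
not_exists = conn.execute("""
    SELECT c.id FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```

Filtering the subquery with WHERE customer_id IS NOT NULL would also make the NOT IN form return customers 2 and 3, at the cost of having to remember to add it.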
** Note: unknown is treated the same as false at the end of an expression in a WHERE condition. While the expression is being evaluated, it is still unknown. See @kgrittn's comment below for why.
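The gap between unknown and false can be seen by negating the comparison. The sketch below assumes SQLite (through Python's sqlite3 module) as a stand-in engine and a hypothetical table t: if x <> NULL were really false, wrapping it in NOT would select the row, and it does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")  # hypothetical one-column table
conn.execute("INSERT INTO t VALUES (3)")

# Both the comparison and its negation evaluate to NULL (unknown), shown as None.
values = conn.execute("SELECT x <> NULL, NOT (x <> NULL) FROM t").fetchone()
print(values)  # (None, None)

# Neither condition is true, so neither query returns the row.
plain = conn.execute("SELECT x FROM t WHERE x <> NULL").fetchall()
negated = conn.execute("SELECT x FROM t WHERE NOT (x <> NULL)").fetchall()
print(plain)    # []
print(negated)  # []
```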
- Even with the clarification it is technically wrong, and in a way that could burn someone. For example, if you view x <> NULL as resolving to FALSE, you would expect NOT (x <> NULL) to evaluate to TRUE, and it doesn't. Both evaluate to UNKNOWN. The trick is that a row is selected only if the WHERE clause (if present) evaluates to TRUE; a row is omitted if the clause evaluates to either FALSE or UNKNOWN. This behavior (in general, and for the NOT IN predicate in particular) is mandated by the SQL standard. – kgrittn, May 3, 2012 at 14:35
- Also, NULL NOT IN (some_subquery) should not return the outer row except if some_subquery doesn't return any rows, which is why the execution plan when both columns are nullable can be considerably more expensive. SQL Server Example – Martin Smith, Jun 18, 2012 at 7:08
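The empty-subquery case from the last comment can be sketched as follows, again assuming SQLite (via Python's sqlite3 module) as a stand-in engine. The tables outer_t and inner_t are hypothetical, and the sketch says nothing about execution plans, only about which rows come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: the outer row holds a NULL, the inner table starts empty.
conn.executescript("""
    CREATE TABLE outer_t (x INTEGER);
    CREATE TABLE inner_t (y INTEGER);
    INSERT INTO outer_t VALUES (NULL);
""")

query = "SELECT count(*) FROM outer_t WHERE x NOT IN (SELECT y FROM inner_t)"

# Empty subquery: there is nothing to compare against, so NOT IN is true
# even though x is NULL, and the outer row is returned.
empty_case = conn.execute(query).fetchone()[0]
print(empty_case)  # 1

# As soon as the subquery returns a row, NULL NOT IN (...) becomes unknown
# and the outer row is dropped.
conn.execute("INSERT INTO inner_t VALUES (7)")
nonempty_case = conn.execute(query).fetchone()[0]
print(nonempty_case)  # 0
```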