I'm really confused by this SQL here, it's used in our app but I have no idea how it works. I put it in from examples I found online and it seems to have no issues.
What it's doing is grabbing the latest timestamp of a set. We couldn't use ORDER BY
as there had to be multiple groupings. The actual query is much larger and this is just a cut down version showing the fundamental selection.
select * from table1 as main
WHERE NOT (EXISTS (select * from table1 sub
WHERE sub.datetime > main.datetime));
(Live online version: http://sqlfiddle.com/#!17/290cc/2)
What I think should be happening in my mind:
Two identical tables selected, the constraint in the subquery is saying:
Only select the rows from the subquery which have a date greater than any other row in the main query.
In this case:
Subquery id:1 has only a smaller than or equal to date value from all rows in the main query so it doesn't get selected.
Subquery id:2 has a date which is larger than id:1 from the main query selection, so id:2 gets selected once. It is not larger than anything else.
Subquery id:3 has a date which is larger than both id:1 and id:2 from the main query selection, so it gets selected twice.
So the return from the subquery should be (2, 3, 3). So the main query should do a selection which not exists in that set, which should return id:1, but it returns id:3.
Where is my misunderstanding?
1 Answer 1
Well, there are some misunderstandings about what the subquery is doing.
First of all, EXISTS
works in this case by evaluating the subquery for every row of the main query, and returns true
if the subquery returns even a single row.
Since you are using WHERE NOT EXISTS(....)
, what's happening for every row from the main query is the following:
For
id = 1
: The subquery evaluates if there exists a row in the same table, where the date is greater than the date forid = 1
. And the results would be ids 2 and 3, henceEXISTS
evaluates totrue
, andNOT EXISTS
eliminates this rowFor
id = 2
: Same as before, in this case the subquery would return id 3, henceEXISTS
evaluates totrue
, andNOT EXISTS
eliminates this rowFor
id = 3
: There are no other rows in this table that have a date greater than it, soEXISTS
evaluates tofalse
, andNOT EXISTS
returnstrue
, so you get this row as a result
-
Brilliant, thanks a lot. I have an extension I'd like to do to this query then, and with your description and my new understanding it should be easy. If I had a new column 'modifier' which is nullable, and I wanted to choose the latest row (by datetime, as before) which has a concrete 'modifier' value, or the latest null 'modifier' row if no concrete value exists - how would I do this? To me this should simply be sqlfiddle.com/#!17/b94de/3 Which should return the Y row (id:2), but it doesn't, have I misunderstood again?Firepanda– Firepanda2017年08月24日 13:48:20 +00:00Commented Aug 24, 2017 at 13:48
-
To me the above sqlfiddle says: Don't select those which has a row in the same table which has a datetime greater than it (as you said before), or those rows which have a null modifier compared to rows of the same table which do have a value for the modifier.Firepanda– Firepanda2017年08月24日 13:54:28 +00:00Commented Aug 24, 2017 at 13:54
-
1@Firepanda: Your additional question looks too elaborated for asking in a comment. If you can't figure out the solution on your own, feel free to ask a new question, it's perfectly all right. You can add a link to this question if you think that can help, but please remember that questions still need to stand on their own and not rely on external information, thank you.Andriy M– Andriy M2017年08月24日 15:02:34 +00:00Commented Aug 24, 2017 at 15:02