How does a subquery use the main query columns?

Question 1

I'm really confused by this SQL here, it's used in our app but I have no idea how it works. I put it in from examples I found online and it seems to have no issues.

What it's doing is grabbing the latest timestamp of a set. We couldn't use ORDER BY as there had to be multiple groupings. The actual query is much larger and this is just a cut down version showing the fundamental selection.

select * from table1 as main
WHERE NOT (EXISTS (select * from table1 sub
WHERE sub.datetime > main.datetime));

(Live online version: http://sqlfiddle.com/#!17/290cc/2)

What I think should be happening in my mind:

Two identical tables selected, the constraint in the subquery is saying:

Only select the rows from the subquery which have a date greater than any other row in the main query.

In this case:

Subquery id:1 has only a smaller than or equal to date value from all rows in the main query so it doesn't get selected.
Subquery id:2 has a date which is larger than id:1 from the main query selection, so id:2 gets selected once. It is not larger than anything else.
Subquery id:3 has a date which is larger than both id:1 and id:2 from the main query selection, so it gets selected twice.

So the return from the subquery should be (2, 3, 3). So the main query should do a selection which not exists in that set, which should return id:1, but it returns id:3.

Where is my misunderstanding?

Question 2

Well, there are some misunderstandings about what the subquery is doing.

First of all, EXISTS works in this case by evaluating the subquery for every row of the main query, and returns true if the subquery returns even a single row.

Since you are using WHERE NOT EXISTS(....), what's happening for every row from the main query is the following:

For id = 1: The subquery evaluates if there exists a row in the same table, where the date is greater than the date for id = 1. And the results would be ids 2 and 3, hence EXISTS evaluates to true, and NOT EXISTS eliminates this row
For id = 2: Same as before, in this case the subquery would return id 3, hence EXISTS evaluates to true, and NOT EXISTS eliminates this row
For id = 3: There are no other rows in this table that have a date greater than it, so EXISTS evaluates to false, and NOT EXISTS returns true, so you get this row as a result

Question 3

Brilliant, thanks a lot. I have an extension I'd like to do to this query then, and with your description and my new understanding it should be easy. If I had a new column 'modifier' which is nullable, and I wanted to choose the latest row (by datetime, as before) which has a concrete 'modifier' value, or the latest null 'modifier' row if no concrete value exists - how would I do this? To me this should simply be sqlfiddle.com/#!17/b94de/3 Which should return the Y row (id:2), but it doesn't, have I misunderstood again?

Question 4

To me the above sqlfiddle says: Don't select those which has a row in the same table which has a datetime greater than it (as you said before), or those rows which have a null modifier compared to rows of the same table which do have a value for the modifier.

Question 5

@Firepanda: Your additional question looks too elaborated for asking in a comment. If you can't figure out the solution on your own, feel free to ask a new question, it's perfectly all right. You can add a link to this question if you think that can help, but please remember that questions still need to stand on their own and not rely on external information, thank you.

Lamak Lamak 2,5761 gold badge24 silver badges30 bronze badges · Accepted Answer · 2017-08-24 13:03:20Z

Well, there are some misunderstandings about what the subquery is doing.

First of all, EXISTS works in this case by evaluating the subquery for every row of the main query, and returns true if the subquery returns even a single row.

Since you are using WHERE NOT EXISTS(....), what's happening for every row from the main query is the following:

For id = 1: The subquery evaluates if there exists a row in the same table, where the date is greater than the date for id = 1. And the results would be ids 2 and 3, hence EXISTS evaluates to true, and NOT EXISTS eliminates this row
For id = 2: Same as before, in this case the subquery would return id 3, hence EXISTS evaluates to true, and NOT EXISTS eliminates this row
For id = 3: There are no other rows in this table that have a date greater than it, so EXISTS evaluates to false, and NOT EXISTS returns true, so you get this row as a result

Brilliant, thanks a lot. I have an extension I'd like to do to this query then, and with your description and my new understanding it should be easy. If I had a new column 'modifier' which is nullable, and I wanted to choose the latest row (by datetime, as before) which has a concrete 'modifier' value, or the latest null 'modifier' row if no concrete value exists - how would I do this? To me this should simply be sqlfiddle.com/#!17/b94de/3 Which should return the Y row (id:2), but it doesn't, have I misunderstood again?
To me the above sqlfiddle says: Don't select those which has a row in the same table which has a datetime greater than it (as you said before), or those rows which have a null modifier compared to rows of the same table which do have a value for the modifier.
@Firepanda: Your additional question looks too elaborated for asking in a comment. If you can't figure out the solution on your own, feel free to ask a new question, it's perfectly all right. You can add a link to this question if you think that can help, but please remember that questions still need to stand on their own and not rely on external information, thank you.

Stack Exchange Network

How does a subquery use the main query columns?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

How does a subquery use the main query columns?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions