In SQL, as far as I know, the logical query processing order, which is the conceptual interpretation order, starts with FROM in the following way:
- FROM
- WHERE
- GROUP BY
- HAVING
- SELECT
- ORDER BY
Following this list it's easy to see why you can't have SELECT aliases in a WHERE clause, because the alias hasn't been created yet. T-SQL (SQL Server) follows this strictly and you can't use SELECT aliases until you've passed SELECT.
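To make the ordering concrete, here is a small sketch of the WHERE case. The table and column names are borrowed from the example query in this question, and the alias OrderYear and the derived-table name o are only illustrative:

```sql
-- Rejected: WHERE is logically evaluated before SELECT,
-- so the alias OrderYear does not exist yet at this point.
SELECT YEAR(orderdate) AS OrderYear
FROM Sales.Orders
WHERE OrderYear = 2013;

-- One common workaround: define the alias in a derived table,
-- so the outer query's FROM step already "sees" it as a column.
SELECT OrderYear
FROM (
    SELECT YEAR(orderdate) AS OrderYear
    FROM Sales.Orders
) AS o
WHERE OrderYear = 2013;
```

The second form should work the same way in both T-SQL and MySQL, since the alias is created in an inner query that is logically complete before the outer WHERE runs.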
But in MySQL it's possible to use SELECT aliases in the HAVING clause even though it should (logically) be processed before the SELECT clause. How can this be possible?
To give an example:
SELECT YEAR(orderdate), COUNT(*) as Amount
FROM Sales.Orders
GROUP BY YEAR(orderdate)
HAVING Amount>1;
The statement is invalid in T-SQL (because HAVING is referring to the SELECT alias Amount)...
Msg 207, Level 16, State 1, Line 5
Invalid column name 'Amount'.
...but works just fine in MySQL.
Based upon this, I'm wondering:
- Is MySQL taking a shortcut in the SQL rules to help the user? Maybe using some kind of pre-analysis?
- Or is MySQL using a different conceptual interpretation order than the one I thought all RDBMSs were following?
2 Answers
Well, when you have a question of this sort, the best source of information IMHO is the MySQL documentation. Now to the point: this is the behavior of the MySQL extension to GROUP BY, which is enabled by default.
MySQL Extensions to GROUP BY
MySQL extends this behavior to permit the use of an alias in the HAVING clause for the aggregated column
If you want standard behavior, you can disable this extension with the sql_mode ONLY_FULL_GROUP_BY:
SET [SESSION | GLOBAL] sql_mode = ONLY_FULL_GROUP_BY;
If you try to execute the above-mentioned query in ONLY_FULL_GROUP_BY sql_mode, you'll get the following error message:
Non-grouping field 'Amount' is used in HAVING clause: SELECT YEAR(orderdate), COUNT(*) as Amount FROM Orders GROUP BY YEAR(orderdate) HAVING Amount> 1
Here is SQLFiddle demo
Therefore it's up to you how to configure and use your instance of MySQL.
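As a rough sketch of how one might check and adjust this setting for the current session only: which modes are on by default, and whether ONLY_FULL_GROUP_BY affects aliases in HAVING, can differ between MySQL versions, so treat this as an assumption to verify against the documentation for your version.

```sql
-- Show the SQL modes currently active for this session.
SELECT @@SESSION.sql_mode;

-- Add ONLY_FULL_GROUP_BY to the existing modes instead of replacing them.
-- NULLIF turns an empty mode list into NULL, which CONCAT_WS skips,
-- so no stray leading comma is produced.
SET SESSION sql_mode = CONCAT_WS(',', NULLIF(@@SESSION.sql_mode, ''), 'ONLY_FULL_GROUP_BY');

-- Confirm the change.
SELECT @@SESSION.sql_mode;
```

Appending to the existing value avoids silently switching off other modes, which a plain SET sql_mode = ONLY_FULL_GROUP_BY would do.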
- You're absolutely right about the documentation. I just never thought it could be so clearly written as you quoted it above :) Thanks for finding it... – Ohlin, Sep 25, 2013 at 6:12
- This answer doesn't answer "Is MySQL doing pre-analysis or is MySQL using a different conceptual interpretation?". – Pacerier, May 8, 2015 at 5:03
- @Pacerier MySQL is "doing pre-analysis," of course, because the query optimizer considers all facets of the query while choosing what it believes will be the best query plan. The notion of a "different conceptual interpretation" betrays a misunderstanding of the fact that the server is free to implement the conceptual model in any way that produces a valid result. ORDER BY, for example, might actually be handled much earlier than it theoretically is, if the optimizer finds that rows can be initially read in order from an index that's already in the desired order. – Michael - sqlbot, Feb 20, 2016 at 12:36
Good question.
I think you should run these queries
EXPLAIN SELECT YEAR(orderdate), COUNT(*) as Amount
FROM Sales.Orders
GROUP BY YEAR(orderdate)
HAVING Amount>1;
SHOW WARNINGS;
and check how the query is rewritten. I am pretty sure the query optimizer replaces Amount with COUNT(*):
SELECT YEAR(orderdate), COUNT(*) as Amount
FROM Sales.Orders
GROUP BY YEAR(orderdate)
HAVING COUNT(*)>1;
Like it does with
select
*
from
test
where
id = 5 - 3
After the query optimizer, it's something like this:
select
test.id as 'id'
from
test
where
test.id = 2
- SELECT C, ROW_NUMBER() OVER (ORDER BY X) AS RN FROM T GROUP BY C HAVING RN = 1 will be problematic as the ROW_NUMBER runs after the HAVING
- SELECT @rownum:=@rownum + 1 as row ... . Maybe the reason why they support SELECT aliases simply is because they can, due to the fact that they don't support things that would make it impossible... who knows? :)
- HAVING and SELECT clause can be interchanged. So, there is no ambiguity in doing this and can simplify the looks of the code when there are monstrous expressions in SELECT.
- distincts) ... with the Alias in the Having despite the same Explain output. So some variation with the Optimizer is happening.
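On the ROW_NUMBER concern raised in the comments: since window functions are logically evaluated after HAVING, a filter on the row number is usually written against a derived table instead. A minimal sketch, reusing the placeholder names T, C and X from that comment; the derived-table name ranked and the use of MIN(X) (to make the ordering valid once the rows are grouped by C) are assumptions for illustration:

```sql
-- The inner query groups and numbers the rows; the outer query
-- filters on the row number, which by then is an ordinary column.
SELECT C, RN
FROM (
    SELECT C,
           ROW_NUMBER() OVER (ORDER BY MIN(X)) AS RN
    FROM T
    GROUP BY C
) AS ranked
WHERE RN = 1;
```

This needs a server that supports window functions (for MySQL, that means version 8.0 or later).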