I'm trying to migrate a query from Oracle to SQL Server 2014.
Here is my query which works great in Oracle:
select
count(distinct A) over (partition by B) / count(*) over() as A_B
from MyTable
Here is the error i got after tried to run this query in SQL Server 2014.
Use of DISTINCT is not allowed with the OVER clause
Anyone know what is the problem? Is such as kind of query possible in SQL Server? Please advise.
5 Answers 5
Anyone know what is the problem? Is such as kind of query possible in SQL Server?
No it isn't currently implemented. See the following connect item request.
OVER clause enhancement request - DISTINCT clause for aggregate functions
Another possible variant would be
SELECT M.A,
M.B,
T.A_B
FROM MyTable M
JOIN (SELECT CAST(COUNT(DISTINCT A) AS NUMERIC(18,8)) / SUM(COUNT(*)) OVER() AS A_B,
B
FROM MyTable
GROUP BY B) T
ON EXISTS (SELECT M.B INTERSECT SELECT T.B)
the cast to NUMERIC
is there to avoid integer division. The reason for the join clause is explained here.
It can be replaced with ON M.B = T.B OR (M.B IS NULL AND T.B IS NULL)
if preferred (or simply ON M.B = T.B
if the B
column is not nullable).
This gives the distinct count(*) for A partitioned by B:
dense_rank() over (partition by B order by A)
+ dense_rank() over (partition by B order by A desc)
- 1
-
3Interesting solution. I suppose it should have a disclaimer that it works when
A
is non-nullable only (as I think it counts nulls as well).ypercubeᵀᴹ– ypercubeᵀᴹ2016年10月10日 11:56:42 +00:00Commented Oct 10, 2016 at 11:56 -
It should be
abs(dense_rank - dense_rank) + 1
I believe.norcalli– norcalli2017年09月20日 19:56:01 +00:00Commented Sep 20, 2017 at 19:56
You can take the max value of dense_rank()
to get the distinct count of A partitioned by B.
To take care of the case where A can have null values you can use first_value
to figure out if a null is present in the partition or not and then subtract 1 if it is as suggested by Martin Smith in the comment.
select (max(T.DenseRankA) over(partition by T.B) -
cast(iif(T.FirstA is null, 1, 0) as numeric(18, 8))) / T.TotalCount as A_B
from (
select dense_rank() over(partition by T.B order by T.A) DenseRankA,
first_value(T.A) over(partition by T.B order by T.A) as FirstA,
count(*) over() as TotalCount,
T.A,
T.B
from MyTable as T
) as T
Try doing a subquery, grouping by A, B, and including the count. Then in your outer query, your count(distinct) becomes a regular count, and your count(*) becomes a sum(cnt).
select
count(A) over (partition by B) * 1.0 /
sum(cnt) over() as A_B
from
(select A, B, count(*) as cnt
from MyTable
group by A, B) as partial;
SQL Server for now does not allow using Distinct
with windowed functions.
But once you remember how windowed functions work (in simplistic terms: they're applied to result set of the query), you can work around that:
select B,
min(count(distinct A)) over (partition by B) / max(count(*)) over() as A_B
from MyTable
group by B
-
it's fascinating that this works, but does it accomplish anything more than just a simple count(distinct A) since you are already grouping by B?ScottEdwards2000– ScottEdwards20002024年11月20日 00:03:51 +00:00Commented Nov 20, 2024 at 0:03
-
1@ScottEdwards2000 - That wasn't the question. ;-) But I share the sentiment, so to answer: I don't know. I did a lot of work using windowed functions, but not for that. At a glance, I have no idea what is the point of this calculation, so the only way to find out is to test it.AcePL– AcePL2024年11月20日 07:43:18 +00:00Commented Nov 20, 2024 at 7:43
Explore related questions
See similar questions with these tags.
MyTable
? Or are distinct rows enough? And you don't need to consider the division by zero error if there are no rows inMyTable
?