I'm using SQL Server 2012 and trying to calculate some aggregate functions and percentiles on a data set with two columns (id and time). In my research I found some solutions, but they don't work for me (maybe it's because of my outdated SQL version)
| id | time_seconds |
|---|---|
| 1 | 120 |
| 2 | 10 |
| 3 | 50 |
| 4 | 80 |
| 5 | 60 |
| 6 | 42 |
| 7 | 96 |
I'll tried something like that:
SELECT
MIN(time_seconds) AS min,
MAX(time_seconds) AS max,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY time_seconds) OVER () AS '95 percentil'
from table
If i run this query, it wants me to make an include "time_seconds" in Group by, but then the aggregate function doesnt work anymore.
The Output id like to become is something like this :
| min | max | 95 percentil |
|---|---|---|
| 10 | 120 | 110 |
Thanks for your help!
-
95 percentil result should be 112.8 correct?APB Reports– APB Reports2023年06月06日 08:54:33 +00:00Commented Jun 6, 2023 at 8:54
2 Answers 2
Unfortunately PERCENTILE_CONT works as a window function only, not as an aggregate function. This is a (rare) case when select distinct may come useful:
select distinct
min(time_seconds) over() as min_time_seconds,
max(time_seconds) over() as max_time_seconds,
percentile_cont(0.95) within group (order by time_seconds) over () as percentil_95
from mytable
Basically this takes the problem the opposite way: since we can't have an aggregate percentile computation, we turn other aggregate functions to window functions: the window computations yield the same values on all rows, so we can then deduplicate the resultset with distinct.
| min_time_seconds | max_time_seconds | percentil_95 |
|---|---|---|
| 10 | 120 | 112.8 |
fiddle - this is SQL Server 2014, since db<>fiddle does not support 2012.
Comments
Try this:
;WITH cte AS (
SELECT
CAST(COUNT(*) OVER() AS float) AS total_rows,
ROW_NUMBER() OVER (ORDER BY time_seconds) AS row_number,
time_seconds
FROM (
SELECT 1 AS id, 120 AS time_seconds UNION ALL
SELECT 2 AS id, 10 AS time_seconds UNION ALL
SELECT 3 AS id, 50 AS time_seconds UNION ALL
SELECT 4 AS id, 80 AS time_seconds UNION ALL
SELECT 5 AS id, 60 AS time_seconds UNION ALL
SELECT 6 AS id, 42 AS time_seconds UNION ALL
SELECT 7 AS id, 96 AS time_seconds
) AS YourTable
),
percentiles AS (
SELECT
time_seconds,
100.0 * ((row_number - 1) / (total_rows - 1)) AS percentile,
LEAD(time_seconds) OVER (ORDER BY row_number) AS next_time_seconds,
100.0 * (row_number / (total_rows - 1)) AS next_percentile
FROM cte
)
SELECT
MIN(time_seconds) AS min_time_seconds,
MAX(time_seconds) AS max_time_seconds,
(
SELECT
time_seconds + (next_time_seconds - time_seconds) * (95 - percentile) / (next_percentile - percentile)
FROM percentiles
WHERE percentile <= 95 AND next_percentile >= 95
) AS percentile_95
FROM cte;
Comments
Explore related questions
See similar questions with these tags.