Wednesday, March 02, 2016
Easy running totals with windowing functions
Back in the pre SQL Server 2012 days, if you wanted to do a running count, you either had to run a subquery or you could use a variable. This was slow because for each row the query that did the sum would be executed. With the additions to the windowing functions in SQL Server 2012, this is now running much faster.
Let's take a look, first create the following table
let's check that data we just inserted into the temporary table
The output looks like this
What we want is the following
For each row, we want to have the date that the row starts on and also the date when it end, we also want a running total as well. If there is no row after the current row for that id, we want the end date to be 9999年12月31日.
So we will use a couple of functions. The first one is LEAD, LEAD accesses data from a subsequent row in the same result set without the use of a self-join. So the LEAD part looks like this
LEAD(dateadd(dd,-1,SomeDate),1,'99991231') OVER (PARTITION BY id ORDER BY SomeDate) as Enddate,
What we are doing is subtracting 1 from the date in the subsequent row (ateadd(dd,-1,SomeDate))
We are using 1 as the offset since we want to apply this to the next row. Finally if there is no subsequent row, we want to use the date 9999年12月31日 instead of NULL
To do the running count, we will do the following
SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS RunningTotal
What this means in English is for each id ordered by date, sum up the charge values for the rows between the preceding rows and the current row. Here is what all that stuff means.
ROWS BETWEEN
Specifies the rows that make up the range to use as implied by
UNBOUNDED PRECEDING
Specifies that the window starts at the first row of the partition. UNBOUNDED PRECEDING can only be specified as window starting point.
CURRENT ROW
Specifies that the window starts or ends at the current row when used with ROWS or the current value when used with RANGE.
CURRENT ROW can be specified as both a starting and ending point.
And here is the query
And running that query, gives us the running count as well as the dates
Let's take a look, first create the following table
CREATE TABLE #test(Id tinyint,SomeDate date, Charge decimal(20,10)) insert #test SELECT 1,'20120101',1000 UNION ALL SELECT 1,'20120401',200 UNION ALL SELECT 1,'20120501',300 UNION ALL SELECT 1,'20120601',600 UNION ALL SELECT 2,'20120101',100 UNION ALL SELECT 2,'20130101',500 UNION ALL SELECT 2,'20140101',-800 UNION ALL SELECT 3,'20120101',100
let's check that data we just inserted into the temporary table
SELECT * FROM #test
The output looks like this
Id SomeDate Charge 1 2012年01月01日 1000.0000000000 1 2012年04月01日 200.0000000000 1 2012年05月01日 300.0000000000 1 2012年06月01日 600.0000000000 2 2012年01月01日 100.0000000000 2 2013年01月01日 500.0000000000 2 2014年01月01日 -800.0000000000 3 2012年01月01日 100.0000000000
What we want is the following
id StartDate Enddate Charge RunningTotal 1 2012年01月01日 2012年03月31日 1000.0000000000 1000.0000000000 1 2012年04月01日 2012年04月30日 200.0000000000 1200.0000000000 1 2012年05月01日 2012年05月31日 300.0000000000 1500.0000000000 1 2012年06月01日 9999年12月31日 600.0000000000 2100.0000000000 2 2012年01月01日 2012年12月31日 100.0000000000 100.0000000000 2 2013年01月01日 2013年12月31日 500.0000000000 600.0000000000 2 2014年01月01日 9999年12月31日 -800.0000000000 -200.0000000000 3 2012年01月01日 9999年12月31日 100.0000000000 100.0000000000
For each row, we want to have the date that the row starts on and also the date when it end, we also want a running total as well. If there is no row after the current row for that id, we want the end date to be 9999年12月31日.
So we will use a couple of functions. The first one is LEAD, LEAD accesses data from a subsequent row in the same result set without the use of a self-join. So the LEAD part looks like this
LEAD(dateadd(dd,-1,SomeDate),1,'99991231') OVER (PARTITION BY id ORDER BY SomeDate) as Enddate,
What we are doing is subtracting 1 from the date in the subsequent row (ateadd(dd,-1,SomeDate))
We are using 1 as the offset since we want to apply this to the next row. Finally if there is no subsequent row, we want to use the date 9999年12月31日 instead of NULL
To do the running count, we will do the following
SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS RunningTotal
What this means in English is for each id ordered by date, sum up the charge values for the rows between the preceding rows and the current row. Here is what all that stuff means.
ROWS BETWEEN
Specifies the rows that make up the range to use as implied by
UNBOUNDED PRECEDING
Specifies that the window starts at the first row of the partition. UNBOUNDED PRECEDING can only be specified as window starting point.
CURRENT ROW
Specifies that the window starts or ends at the current row when used with ROWS or the current value when used with RANGE.
CURRENT ROW can be specified as both a starting and ending point.
And here is the query
SELECT id, someDate as StartDate, LEAD(dateadd(dd,-1,SomeDate),1,'99991231') OVER (PARTITION BY id ORDER BY SomeDate) as Enddate, Charge, SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM #test ORDER BY id, SomeDate
And running that query, gives us the running count as well as the dates
id StartDate Enddate Charge RunningTotal 1 2012年01月01日 2012年03月31日 1000.0000000000 1000.0000000000 1 2012年04月01日 2012年04月30日 200.0000000000 1200.0000000000 1 2012年05月01日 2012年05月31日 300.0000000000 1500.0000000000 1 2012年06月01日 9999年12月31日 600.0000000000 2100.0000000000 2 2012年01月01日 2011年12月31日 100.0000000000 100.0000000000 2 2012年01月01日 2011-13-31 500.0000000000 600.0000000000 2 2012年01月01日 9999年12月31日 -800.0000000000 -200.0000000000 3 2012年01月01日 9999年12月31日 100.0000000000 100.0000000000That's all for this post...
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment