Showing posts with label analytical functions. Show all posts
Showing posts with label analytical functions. Show all posts
Friday, March 04, 2016
Gotcha with the LEAD analytical function and date datatype
The other day some code that used the LEAD analytical function stopped working suddenly. It started to throw an error.
Running this query to see what is in this table
Here are our results
What will happen if someone changes the column from a datetime to a date? Maybe they looked at the data and saw that all the values had only dates and no time components... why use 8 bytes of storage when you can use 3 instead?
To find out exactly what will happen, run the following DDL statement....
Now go run your query again
Here is the error you get
The error is the same as if you would run the following
Msg 206, Level 16, State 2, Line 1
Operand type clash: date is incompatible with int
The code wasn't touched in a long time... so what could be the issue. Let's reproduce it here
First create the following table
CREATE TABLE testRunningTotal(Id tinyint,SomeDate datetime, Charge decimal(20,2)) INSERT testRunningTotal SELECT 1,'20120101',1000 UNION ALL SELECT 1,'20120401',200 UNION ALL SELECT 1,'20120501',300 UNION ALL SELECT 1,'20120601',600 UNION ALL SELECT 2,'20120101',100 UNION ALL SELECT 2,'20130101',500 UNION ALL SELECT 2,'20140101',-800 UNION ALL SELECT 3,'20120101',100
Running this query to see what is in this table
SELECT * FROM testRunningTotal
IdSomeDateCharge
12012年01月01日 00:00:00.0001000.00
12012年04月01日 00:00:00.000200.00
12012年05月01日 00:00:00.000300.00
12012年06月01日 00:00:00.000600.00
22012年01月01日 00:00:00.000100.00
22013年01月01日 00:00:00.000500.00
22014年01月01日 00:00:00.000-800.00
32012年01月01日 00:00:00.000100.00
Now run our fancy query...
SELECT id, someDate as StartDate, LEAD(SomeDate -1,1,'99991231') OVER (PARTITION BY id ORDER BY SomeDate) as Enddate, Charge, SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM testRunningTotal ORDER BY id, SomeDate
Here are our results
idStartDateEnddateChargeRunningTotal
12012年01月01日 00:00:00.0002012年03月31日 00:00:00.0001000.001000.00
12012年04月01日 00:00:00.0002012年04月30日 00:00:00.000200.001200.00
12012年05月01日 00:00:00.0002012年05月31日 00:00:00.000300.001500.00
12012年06月01日 00:00:00.0009999年12月31日 00:00:00.000600.002100.00
22012年01月01日 00:00:00.0002012年12月31日 00:00:00.000100.00100.00
22013年01月01日 00:00:00.0002013年12月31日 00:00:00.000500.00600.00
22014年01月01日 00:00:00.0009999年12月31日 00:00:00.000-800.00-200.00
32012年01月01日 00:00:00.0009999年12月31日 00:00:00.000100.00100.00
To see what the query actually does, visit this post Easy running totals with windowing functions
Let's take a closer look at the LEAD analytical function, this is what Books On Line has on its page
LEAD ( scalar_expression [ ,offset ] , [ default ] )
OVER ( [ partition_by_clause ] order_by_clause )
offset
The number of rows forward from the current row from which to obtain a value. If not specified, the default is 1. offset can be a column, subquery, or other expression that evaluates to a positive integer or can be implicitly converted to bigint. offset cannot be a negative value or an analytic function.
Looks all cool to me..nothing that stands out that could cause a problem...
What will happen if someone changes the column from a datetime to a date? Maybe they looked at the data and saw that all the values had only dates and no time components... why use 8 bytes of storage when you can use 3 instead?
To find out exactly what will happen, run the following DDL statement....
ALTER TABLE testRunningTotal ALTER COLUMN SomeDate date
Now go run your query again
SELECT id, someDate as StartDate, LEAD(SomeDate -1,1,'99991231') OVER (PARTITION BY id ORDER BY SomeDate) as Enddate, Charge, SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM testRunningTotal ORDER BY id, SomeDate
Here is the error you get
Msg 206, Level 16, State 2, Line 1
Operand type clash: date is incompatible with int
The error is the same as if you would run the following
SELECT CONVERT(DATE,GETDATE() )-1
Instead what you have to do is use the DATEADD function to do arithmetic with dates
SELECT DATEADD(dd,-1,CONVERT(DATE,GETDATE()))
So to fix our query from before, we want to change the -1 to be DATEADD(dd,-1 instead, here is what our query looks like
SELECT id, someDate as StartDate, LEAD(dateadd(dd,-1,SomeDate),1,'99991231') OVER (PARTITION BY id ORDER BY SomeDate) as Enddate, Charge, SUM(Charge) OVER (PARTITION BY id ORDER BY SomeDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM testRunningTotal ORDER BY id, SomeDate
And here are our results, it's all good now
idStartDateEnddateChargeRunningTotal
12012年01月01日2012年03月31日1000.001000.00
12012年04月01日2012年04月30日200.001200.00
12012年05月01日2012年05月31日300.001500.00
12012年06月01日9999年12月31日600.002100.00
22012年01月01日2012年12月31日100.00100.00
22013年01月01日2013年12月31日500.00600.00
22014年01月01日9999年12月31日-800.00-200.00
32012年01月01日9999年12月31日100.00100.00
I still think that maybe the LEAD function should do this conversion internally since all we are doing is specifying the OFFSET...what is your opinion?
Subscribe to:
Posts (Atom)