3
\$\begingroup\$

I'm trying to find an efficient way to aggregate data for reporting. Let's say I need to aggregate the following data in 5-second intervals:

CREATE TABLE RawData
(
 Result FLOAT,
 CaptureTime DATETIME
);
INSERT INTO RawData VALUES
(2.3, '2018-04-01 00:00:00'),
(2.5, '2018-04-01 00:00:01'),
(2.8, '2018-04-01 00:00:02'),
(2.8, '2018-04-01 00:00:03'),
(3.4, '2018-04-01 00:00:04'),
(5.1, '2018-04-01 00:00:05'),
(2.2, '2018-04-01 00:00:06'),
(4.1, '2018-04-01 00:00:07'),
(4.3, '2018-04-01 00:00:08'),
(5.9, '2018-04-01 00:00:09'),
...

The aggregated data will be stored in another table:

CREATE TABLE AggregateData
(
 Result FLOAT,
 StartCaptureTime DATETIME,
 EndCaptureTime DATETIME
);

As an example: if my aggregate function is SUM, the data in AggregateData would be:

+--------+----------------------+---------------------+
| Result | StartCaptureTime | EndCaptureTime |
+--------+----------------------+---------------------+
| 13.8 | 2018年04月01日 00:00:00 | 2018年04月01日 00:00:05 |
| 21.6 | 2018年04月01日 00:00:05 | 2018年04月01日 00:00:10 |
+--------+----------------------+---------------------+

The best solution that I've come up with uses a loop:

CREATE PROCEDURE [dbo].[spPerformAggregateCalculation]
 @UpdateIntervalSeconds INT,
 @StartTime DATETIME,
 @EndTime DATETIME
AS
BEGIN
 SET NOCOUNT ON;
 DECLARE @RunningStartTime DATETIME = @StartTime;
 DECLARE @RunningEndTime DATETIME = DATEADD(SECOND, @UpdateIntervalSeconds, @RunningStartTime);
 DECLARE @AggregateValue FLOAT;
 WHILE @RunningStartTime < @EndTime
 BEGIN
 SELECT @AggregateValue =
 (
 SELECT SUM(Result)
 FROM RawData
 WHERE CaptureTime >= @RunningStartTime AND CaptureTime < @RunningEndTime
 );
 INSERT INTO AggregateData VALUES
 (@AggregateValue, @RunningStartTime, @RunningEndTime);
 SELECT
 @RunningStartTime = @RunningEndTime,
 @RunningEndTime = DATEADD(SECOND, @UpdateIntervalSeconds, @RunningEndTime);
 END
END
EXEC spPerformAggregateCalculation
 @UpdateIntervalSeconds = 5,
 @StartTime = '2018-04-01 00:00:00',
 @EndTime = '2018-04-01 00:00:10'

Is there a more efficient way to get this data? I'll be performing more complex aggregations on millions of rows on a daily basis, so every bit helps.

P.S. In case "increase your interval" is offered as a suggestion, my data will actually be aggregated in much larger intervals than 5 seconds; I just whittled the problem down for the example.


Update

Below was my final solution, based dnoeth's answer:

CREATE PROCEDURE [dbo].[spPerformAggregateCalculation]
 @UpdateIntervalSeconds INT,
 @StartTime DATETIME,
 @EndTime DATETIME
AS
BEGIN
 SET NOCOUNT ON;
 WITH cte AS
 (
 SELECT 
 -- Adjust each CaptureTime to the beginning of the interval.
 DATEADD(SECOND, -DATEDIFF(SECOND, @StartTime , CaptureTime) % @UpdateIntervalSeconds, CaptureTime) as StartCaptureTime,
 Result
 FROM RawData
 WHERE CaptureTime >= @StartTime
 )
 INSERT INTO AggregateData
 SELECT
 SUM(Result),
 StartCaptureTime,
 DATEADD(SECOND, @UpdateIntervalSeconds, StartCaptureTime)
 FROM cte
 GROUP BY StartCaptureTime
 ORDER BY StartCaptureTime;
END
asked Apr 28, 2018 at 14:26
\$\endgroup\$
7
  • \$\begingroup\$ Is it always 5 rows. Can time skip or duplicate? \$\endgroup\$ Commented Apr 30, 2018 at 10:48
  • \$\begingroup\$ Good question. It is always 5 rows and there are no skips or duplicates. \$\endgroup\$ Commented Apr 30, 2018 at 11:57
  • \$\begingroup\$ Then there is a very simple solution but you don't to be enthused with my answer. \$\endgroup\$ Commented Apr 30, 2018 at 12:01
  • \$\begingroup\$ A simple solution, in addition to the one you already answered, or are you referring to your answer below? I haven't had time to test it yet. \$\endgroup\$ Commented Apr 30, 2018 at 12:56
  • \$\begingroup\$ Should the intervals always start at 00 or at the minimum CaptureTime? And what are those much larger intervals than 5 seconds? \$\endgroup\$ Commented Apr 30, 2018 at 12:59

2 Answers 2

3
\$\begingroup\$

There's no need for a cursor or recursion, both are not really performant for larger amounts of data.

Simply adjust the CaptureTime to the start of each interval.

declare @UpdateIntervalSeconds integer = 5;
declare @minTime datetime = (select min(CaptureTime) from RawData);
-- if you want to start at exactly `00` instead of `01`,`02` etc. 
-- you can modify `@minTime` using another DATEADD/DATEDIFF
with cte as
 ( select 
 -- adjust each CaptureTime to the begin of an interval
 DATEADD(SECOND, -DATEDIFF(SECOND, @minTime , CaptureTime) % @UpdateIntervalSeconds, CaptureTime) as StartCaptureTime
 ,Result
 from RawData
 )
select StartCaptureTime
 ,DATEADD(SECOND, @UpdateIntervalSeconds, StartCaptureTime) AS EndCaptureTime
 ,sum(Result)
from cte
group by StartCaptureTime
order by StartCaptureTime
answered Apr 30, 2018 at 13:31
\$\endgroup\$
1
  • \$\begingroup\$ Very clever! I just had to add a filter to the CTE to make it work with a passed-in Start Time instead processing the entire table at once. Thanks for your help! \$\endgroup\$ Commented Apr 30, 2018 at 22:33
1
\$\begingroup\$

I don't see any problems with the cursor.

I think a recursive cte is more efficient and cleaner here.

INSERT INTO @RawData VALUES
(2.3, '2018-04-01 00:00:00'),
(2.5, '2018-04-01 00:00:01'),
(2.8, '2018-04-01 00:00:02'),
(2.8, '2018-04-01 00:00:03'),
(3.4, '2018-04-01 00:00:04'),
(5.1, '2018-04-01 00:00:05'),
(2.2, '2018-04-01 00:00:06'),
(4.1, '2018-04-01 00:00:07'),
(4.3, '2018-04-01 00:00:08'),
(5.9, '2018-04-01 00:00:09');
declare @start datetime = (select min(CaptureTime) from @RawData);
declare @end datetime = (select max(CaptureTime) from @RawData);
with cte as 
( select @start st, DATEADD(ss, 4, @start) as ed
 union all 
 select DATEADD(ss, 5, st), DATEADD(ss, 5, ed)
 from cte 
 where ed < @end
)
select sum(rd.Result), cte.st, cte.ed 
 from @RawData rd 
 join cte 
 on rd.CaptureTime between cte.st and cte.ed 
 group by cte.st, cte.ed;
answered Apr 30, 2018 at 11:06
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.