Optimize simple query in SQL Server

Question 1

This should be a quite easy query, but I honestly think its execution time can be improved.

select idTag,MAX(pctimestamp) AS PCTIMESTAMP,getdate() AS NOW, datediff(SECOND,MAX(pctimestamp),getdate()) AS DELAY 
from ValuesTagsOPC
group by IdTag

This query returns 1386 rows from 'ValuesTagsOPC' table, that contains about 40 million rows, and has the following structure, retrieved by the create script:

CREATE TABLE [dbo].[ValuesTagsOPC](
 [IdTag] [int] NOT NULL,
 [TTimeStamp] [datetime] NOT NULL,
 [PCTimeStamp] [datetime] NOT NULL,
 [Value] [nvarchar](50) NOT NULL,
 [Quality] [int] NOT NULL,
 CONSTRAINT [PK_ValuesTagsOPC] PRIMARY KEY CLUSTERED 
(
 [IdTag] ASC,
 [PCTimeStamp] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

Obviously, there is a clustered index on its primary key. Time and IO statistics from SQL Server are the following (sorry, it's in spanish):

(1386 filas afectadas)
Tabla 'ValuesTagsOPC'. Recuento de exámenes 5, lecturas lógicas 224612, lecturas físicas 0, lecturas anticipadas 0, lecturas lógicas de LOB 0, lecturas físicas de LOB 0, lecturas anticipadas de LOB 0.
 Tiempos de ejecución de SQL Server:
 Tiempo de CPU = 9578 ms, tiempo transcurrido = 3019 ms.

I've checked that the estimated execution plan is the same as the real one, and it tells me that 94 % of the cost comes from the cluster index scan. What I don't understand is why it is not performing a seek instead of a scan, since all required fields in the query are included in the clustered index....

Thanks in advance!!

Question 2

Yes, indeed, there is a table DescriptionTagsOPC which has as PK idTag, and has a cluster index on it. I guess by your suggestion that I should add a join in the where clause, matching both idTag. I've tried it but the result is exactly the same. The query is this:

select v.idTag,MAX(v.pctimestamp) AS PCTIMESTAMP,getdate() AS NOW, datediff(SECOND,MAX(pctimestamp),getdate()) AS DELAY from ValuesTagsOPC v, DescriptionTagsOPC d where v.IdTag = d.IdTag group by v.IdTag

Question 3

It has no WHERE clause so it must process and aggregate all 40 million rows. SQL Server will not take advantage of the index order and skip scan ahead to the next IdTag once it has found the MAX for the current group but will continue processing the other rows in that group. Each group has an average of about 30,000 rows.

As you have another table that lists the 1,386 distinct IdTag types then you could try the following instead.

SELECT D.IdTag,
 V.PCTimeStamp,
 V.Now,
 datediff(SECOND, V.PCTimeStamp, V.Now) AS DELAY
FROM DescriptionTagsOPC D
 CROSS APPLY (SELECT TOP 1 *,
 getdate() AS Now
 FROM ValuesTagsOPC V
 WHERE D.IdTag = V.IdTag
 ORDER BY PCTimeStamp DESC) V

To replace the scan of 40 million rows with 1,386 seeks.

If that table was not available then a recursive CTE could be used to achieve similar results.

WITH RecursiveCTE
AS (
 SELECT TOP 1 IdTag, PCTimeStamp
 FROM ValuesTagsOPC
 ORDER BY IdTag DESC, PCTimeStamp DESC
 UNION ALL
 SELECT R.IdTag, R.PCTimeStamp
 FROM (
 SELECT V.*,
 rn = ROW_NUMBER() OVER (ORDER BY V.IdTag DESC, V.PCTimeStamp DESC)
 FROM ValuesTagsOPC V
 JOIN RecursiveCTE R
 ON V.IdTag < R.IdTag
 ) R
 WHERE R.rn = 1
 )
SELECT IdTag,
 PCTimeStamp,
 getdate() AS NOW,
 datediff(SECOND, PCTimeStamp, getdate()) AS DELAY
FROM RecursiveCTE
OPTION (MAXRECURSION 0);

Martin Smith Martin Smith 88.4k15 gold badges258 silver badges357 bronze badges · Accepted Answer · 2013-02-07 09:33:34Z

It has no WHERE clause so it must process and aggregate all 40 million rows. SQL Server will not take advantage of the index order and skip scan ahead to the next IdTag once it has found the MAX for the current group but will continue processing the other rows in that group. Each group has an average of about 30,000 rows.

As you have another table that lists the 1,386 distinct IdTag types then you could try the following instead.

SELECT D.IdTag,
 V.PCTimeStamp,
 V.Now,
 datediff(SECOND, V.PCTimeStamp, V.Now) AS DELAY
FROM DescriptionTagsOPC D
 CROSS APPLY (SELECT TOP 1 *,
 getdate() AS Now
 FROM ValuesTagsOPC V
 WHERE D.IdTag = V.IdTag
 ORDER BY PCTimeStamp DESC) V

To replace the scan of 40 million rows with 1,386 seeks.

If that table was not available then a recursive CTE could be used to achieve similar results.

WITH RecursiveCTE
AS (
 SELECT TOP 1 IdTag, PCTimeStamp
 FROM ValuesTagsOPC
 ORDER BY IdTag DESC, PCTimeStamp DESC
 UNION ALL
 SELECT R.IdTag, R.PCTimeStamp
 FROM (
 SELECT V.*,
 rn = ROW_NUMBER() OVER (ORDER BY V.IdTag DESC, V.PCTimeStamp DESC)
 FROM ValuesTagsOPC V
 JOIN RecursiveCTE R
 ON V.IdTag < R.IdTag
 ) R
 WHERE R.rn = 1
 )
SELECT IdTag,
 PCTimeStamp,
 getdate() AS NOW,
 datediff(SECOND, PCTimeStamp, getdate()) AS DELAY
FROM RecursiveCTE
OPTION (MAXRECURSION 0);

Stack Exchange Network

Optimize simple query in SQL Server

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Optimize simple query in SQL Server

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions