Optimizing query on a partitioned table

Question 1

Tried to find an answer for this, but couldn't. So bear with me and apologies if this has been answered elsewhere. It may be just me not being able to find it.

Here's the setup: SQL Server 2008 R2 Enterprise Edition. The table is partitioned on column1, an integer holding dates as YYYYMMDD. It has a clustered index and a couple of nonclustered (NC) indexes, all aligned and using the same partition function.

Facts: the table has close to 300 million records. The query would return around 2-3% of the data (around 6 million records).

Here's the query:

SELECT column1, column2, column3 FROM table1 WHERE column1 = 20131120

I expected this query to use the clustered index (seek or scan), since all the data is there and no additional I/O would be needed. Instead, the query optimizer chose an NC index defined on a single column, column5, which does not appear anywhere in the query above. The index has no included columns. Statistics were all updated yesterday with a full sample.

My explanation is that the optimizer uses the partitioning column for partition elimination, and for that it does not necessarily need the clustered index: any aligned index will do, since they all use the same partition function.

Am I way off with my expectations, or is there something I'm missing?
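
One way to check the partition-elimination idea is to see which partition the filter value maps to and how many rows sit in each partition. This is only a sketch: the post names the partition scheme but not the partition function, so pf_table1 below is a hypothetical name that would need to be replaced with the real one returned by the first query.

```sql
-- Find the partition function behind each partition scheme.
SELECT ps.name AS partition_scheme,
       pf.name AS partition_function
FROM sys.partition_schemes AS ps
JOIN sys.partition_functions AS pf
    ON pf.function_id = ps.function_id;

-- Which partition does the filtered value fall into?
-- pf_table1 is a hypothetical name (assumption); use the name found above.
SELECT $PARTITION.pf_table1(20131120) AS partition_number;

-- Row counts per partition for the clustered index (index_id = 1).
SELECT p.partition_number,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.table1')
  AND p.index_id = 1
ORDER BY p.partition_number;
```

If elimination is happening, the actual execution plan for the query should show only one partition being touched, whichever index is used.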

Update

Here's the DDL. As you can see, the second index is indeed narrower, but as I understand it, it only covers column1 (the partitioning column is included in every partitioned index even if it is not explicitly part of the key). That may explain why it is being chosen. I had to anonymize the DDL first, but kept the actual column positions in place in the code.

CREATE TABLE [dbo].[table1](
    [c1] [bigint] NOT NULL,
    [c2] [int] NOT NULL,
    [c3] [smallint] NOT NULL,
    [c4] [int] NOT NULL,
    [c5] [int] NOT NULL,
    [c6] [int] NOT NULL,
    [c7] [int] NOT NULL,
    [column1] [int] NOT NULL,
    [c8] [int] NOT NULL,
    [column3] [int] NOT NULL,
    [c9] [int] NOT NULL,
    [c10] [int] NOT NULL,
    [c11] [int] NOT NULL,
    [column2] [bigint] NOT NULL,
    [column5] [int] NOT NULL,
    [c13] [bigint] NOT NULL,
    [c14] [datetime] NOT NULL,
    [c15] [datetime] NOT NULL,
    [c16] [datetime] NULL,
    [c17] [datetime] NULL,
    [c18] [tinyint] NOT NULL,
    [c19] [tinyint] NOT NULL,
    [c20] [tinyint] NOT NULL,
    [c21] [tinyint] NOT NULL,
    [c22] [tinyint] NOT NULL,
    [c23] [int] NOT NULL,
    [c24] [int] NOT NULL,
    [c25] [int] NOT NULL,
    [c26] [int] NOT NULL,
    [c27] [int] NOT NULL,
    [c28] [int] NOT NULL,
    [c29] [bigint] NOT NULL,
    [c30] [int] NULL,
    CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED
    (
        [column1] ASC,
        [column2] ASC,
        [column3] ASC
    )
    WITH ...
    (
        PAD_INDEX = OFF,
        STATISTICS_NORECOMPUTE = OFF,
        IGNORE_DUP_KEY = OFF,
        ALLOW_ROW_LOCKS = ON,
        ALLOW_PAGE_LOCKS = ON
    )
)
ON [partition_scheme]([column1])
GO

CREATE NONCLUSTERED INDEX [IX_table1_column5] ON [dbo].[table1]
(
    [column5] ASC
)
WITH
(
    PAD_INDEX = OFF,
    STATISTICS_NORECOMPUTE = OFF,
    SORT_IN_TEMPDB = OFF,
    IGNORE_DUP_KEY = OFF,
    DROP_EXISTING = OFF,
    ONLINE = OFF,
    ALLOW_ROW_LOCKS = ON,
    ALLOW_PAGE_LOCKS = ON
)
ON [partition_scheme]([column1])
GO

Question 2

NC indexes include all the key columns of the clustered index (not only column1, as you seem to think).

wBob · Accepted Answer · 2013-11-21 16:44:44Z

If you have a composite clustered index on (at least) (column1, column2, column3), then a nonclustered index on column5 covers (column1, column2, column3, column5) and would be narrower than the clustered index, especially on a wide fact table. So that might be why.
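
To see how much narrower the nonclustered index really is, the page counts of each index can be compared. The following is a minimal sketch that assumes the table and index names from the DDL in the question; it only reads metadata, so it is cheap to run.

```sql
-- Pages and rows per index on dbo.table1, summed over all partitions.
-- A covering index with far fewer pages is cheaper to read for the same rows.
SELECT i.name                 AS index_name,
       i.type_desc            AS index_type,
       SUM(s.used_page_count) AS used_pages,
       SUM(s.row_count)       AS row_count
FROM sys.dm_db_partition_stats AS s
JOIN sys.indexes AS i
    ON i.object_id = s.object_id
   AND i.index_id  = s.index_id
WHERE s.object_id = OBJECT_ID(N'dbo.table1')
GROUP BY i.name, i.type_desc
ORDER BY used_pages DESC;
```

Both indexes hold one entry per table row, so the difference in used_pages comes down to row width: the clustered index stores every column of the table, while the index on column5 stores only column5 plus the clustering key.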

Can you post the DDL and indexes for your table please?

I'm just going to reproduce the relevant section from "Query Tuning and Optimization", CH3 here, as hopefully it will help clear up what looks like a misunderstanding about covering indexes:

CREATE TABLE T_heap (a int, b int, c int, d int, e int, f int)
CREATE INDEX T_heap_a ON T_heap (a)
CREATE INDEX T_heap_bc ON T_heap (b, c)
CREATE INDEX T_heap_d ON T_heap (d) INCLUDE (e)
CREATE UNIQUE INDEX T_heap_f ON T_heap (f)
CREATE TABLE T_clu (a int, b int, c int, d int, e int, f int)
CREATE UNIQUE CLUSTERED INDEX T_clu_a ON T_clu (a)
CREATE INDEX T_clu_b ON T_clu (b)
CREATE INDEX T_clu_ac ON T_clu (a, c)
CREATE INDEX T_clu_d ON T_clu (d) INCLUDE (e)
CREATE UNIQUE INDEX T_clu_f ON T_clu (f)

[Image: covered columns for each index]

Hopefully that makes sense!
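
To make the covering rules concrete, here are a few queries against the T_heap and T_clu tables defined above. This is a sketch, not part of the book excerpt: the index hints are only there to force each index, and the plan comments describe what one would expect to see rather than captured output.

```sql
-- T_clu_b is keyed on (b) but also carries the clustering key (a),
-- so it covers a and b: no lookup is needed.
SELECT a, b FROM T_clu WITH (INDEX(T_clu_b)) WHERE b = 1;

-- T_clu_d is keyed on (d), includes e, and carries a: it covers a, d, e.
SELECT a, d, e FROM T_clu WITH (INDEX(T_clu_d)) WHERE d = 1;

-- Column c is not in T_clu_b, so this query needs a key lookup
-- into the clustered index for every matching row.
SELECT a, b, c FROM T_clu WITH (INDEX(T_clu_b)) WHERE b = 1;

-- A heap has no clustering key, so T_heap_a covers only a;
-- asking for b as well requires a RID lookup into the heap.
SELECT a, b FROM T_heap WITH (INDEX(T_heap_a)) WHERE a = 1;
```

The same rule applies to the table in the question: the index on column5 carries column1, column2 and column3 because they form the clustering key.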

Hi wBob... Sorry I couldn't post the DDL here, but it was too long for a comment. Please see it above, as an answer to my own question. Still learning how to use this site :)
Hi wBob... Thank you again. I was referring to the second index (the NC one) as only covering column1. And yes, the clustered one is a composite index on 3 columns (including the partitioning column).
Hi wBob. Thanks, that was the "issue". The NC index yielded less I/O as it was smaller. Here's another similar situation: link. Thanks again all for your comments.
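
For anyone wanting to reproduce the I/O comparison, a rough sketch is to force each index in turn and compare the logical reads reported in the Messages output. It assumes the names from the DDL in the question, and each query returns millions of rows, so it is best tried on a test system.

```sql
SET STATISTICS IO ON;

-- Force the clustered index (index id 1).
SELECT column1, column2, column3
FROM dbo.table1 WITH (INDEX(1))
WHERE column1 = 20131120;

-- Force the nonclustered index on column5, which also carries
-- the clustering key columns column1, column2 and column3.
SELECT column1, column2, column3
FROM dbo.table1 WITH (INDEX(IX_table1_column5))
WHERE column1 = 20131120;

SET STATISTICS IO OFF;
```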
