SQL Server not Using Index

Question 1

I am absolutely stumped as to why my query is not using what I think is a selective index.

My model consists of Claims, Contacts, and Phone Numbers. Each Claim has one Contact, and each Contact has many Phone Numbers. A Claim has a Status and a Phone Number has a Type. [Diagram: Simplified Model]

I have added an index on tClaim for Status that includes Name and ContactID.

create index Status on tClaim(Status) include (Name,ContactID)

I have added an index on the Phones for the ContactID and Type that includes the Number.

create index ContactID_Type on tContactPhone(ContactID,Type) include (Number)

I am trying to write a query that returns all Claims that have a status of 'Won' and the corresponding 'Home' phone for the Claim's Contact. I have tried it two ways: one without the join to Contacts and one with it. Neither generates the plan I expect.

-- Query 1: join tClaim directly to tContactPhone
select 
 c.ID,
 c.Name,
 p.Number 
from
 tClaim c 
 left join tContactPhone p on
 c.ContactID=p.ContactID and p.Type='Home'
where
 c.Status = 'Won'

-- Query 2: join through tContact
select 
 c.ID,
 c.Name,
 p.Number
from
 tClaim c
 inner join tContact co on 
 co.id=c.ContactID
 left join tContactPhone p on
 co.ID=p.ContactID and p.Type='Home'
where
 c.Status = 'Won'

The plan I get back refuses to use tContactPhone.ContactID_Type. It suggests indexing by Type, which doesn't make sense to me, because Type seems less selective than ContactID.

Paste The Plan

Here is the script I used to create a sample data set to test. Please note my actual data set is much larger, named better, and has a lot more fields; but this is distilled down to replicate my situation [AKA I don't even like the naming conventions and data generation, but it gets the job done :)]

/*
 Create Tables and Constraints
*/
CREATE TABLE tContact(
 [ID] [int] IDENTITY(1,1) NOT NULL,
 [Name] [nvarchar](100) NOT NULL,
 CONSTRAINT [pkey_tContact] PRIMARY KEY CLUSTERED 
 (
 [ID] ASC
 )
)
GO
CREATE TABLE tContactPhone(
 [ID] [int] IDENTITY(1,1) NOT NULL,
 [ContactID] [int] NOT NULL,
 [Type] [nvarchar](25) NOT NULL,
 [Number] [nvarchar](12) NOT NULL,
 CONSTRAINT [pkey_tContactPhone] PRIMARY KEY CLUSTERED 
 (
 [ID] ASC
 )
)
GO
ALTER TABLE tContactPhone WITH CHECK ADD CONSTRAINT FK_tContactPhones FOREIGN KEY(ContactID)
REFERENCES tContact ([ID])
GO
ALTER TABLE tContactPhone CHECK CONSTRAINT FK_tContactPhones
GO
CREATE TABLE tClaim(
 [ID] [int] IDENTITY(1,1) NOT NULL,
 [ContactID] [int] NOT NULL,
 [Name] [nvarchar](50) NOT NULL,
 [Status] nvarchar(10) not null,
 CONSTRAINT [pkey_tClaim] PRIMARY KEY CLUSTERED 
 (
 [ID] ASC
 )
)
GO
ALTER TABLE tClaim WITH CHECK ADD CONSTRAINT FK_tClaim FOREIGN KEY(ContactID)
REFERENCES tContact ([ID])
GO
/*
 Add Test Data
*/
declare @Count int = 0
declare @ContactID int =0
while(@Count<100000)
begin
 set @Count = @Count+1
 insert into tContact(Name)
 select 'Name' + convert(nvarchar(10),@Count) 
 set @ContactID= SCOPE_IDENTITY()
 insert into tContactPhone(ContactID,Number,Type)
 select @ContactID,@Count+1,'Home'
 union select @ContactID,@Count+1,'Cell'
 insert into tClaim(ContactID,Name,Status)
 select @ContactID, convert(nvarchar(10),@ContactID)+'_ClaimName',case @Count % 25 when 0 then 'Won' else 'Closed' end
end
/*
 Add Indexes for Queries
*/
create index Status on tClaim(Status) include (Name,ContactID)
create index ContactID_Type on tContactPhone(ContactID,Type) include (Number)

Question 2

Thank you for providing detailed information (and your execution plans :), this is interesting. Just out of curiosity, do you see any difference in your execution plans if you switch the Type and ContactID columns around in the index on your tContactPhone table? That is, change the index creation script to: create index ContactID_Type on tContactPhone(Type, ContactID) include (Number). (Make sure to drop the old index too.)
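Spelled out as a script, the swap suggested above might look like the following sketch. The index name Type_ContactID is an assumption chosen here so the name matches the new key order; the comment itself keeps the name ContactID_Type.

```sql
-- Drop the original (ContactID, Type) index first, as the comment advises.
DROP INDEX ContactID_Type ON tContactPhone;

-- Recreate it with the key columns swapped so that Type is the leading key.
-- Type_ContactID is a hypothetical name used only for this illustration.
CREATE INDEX Type_ContactID
    ON tContactPhone (Type, ContactID)
    INCLUDE (Number);
```

With Type as the leading key, a predicate such as p.Type = 'Home' can be evaluated as a seek rather than as a filter on a scan.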

Question 3

What about using the following indexes?

create index Status on tClaim(Status, ContactID) include (Name);
create index ContactID_Type on tContactPhone(Type, ContactID) include (Number);

They should result in a merge join.

Question 4

It still reads way too many rows from the phone table: 100k phone rows read for 4k claims/contacts. Link To The Plan

Question 5

@JoshuaGrippo Did the performance improve, i.e. does it execute faster? I see your updated execution plan is simpler now and uses an index seek operation (as opposed to an index scan).

Question 6

It is faster. It does ~470 logical reads vs. ~900, which means it touches less data. The plan is simpler because a scan became a seek, but it still reads 100k rows when there are only 4k matching rows. I am just dumbfounded as to why it does not try to use ContactID. When I have only the Type_ContactID index, it even suggests that I create the ContactID_Type index and says it would give a 96.2 improvement, but once that index is created it doesn't get used. I am trying to think like SQL Server, and I would expect that to mean using ContactID as the more selective key.
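Logical-read figures like the ones quoted above can be collected with SET STATISTICS IO. The following is only a sketch that wraps the first query from the question; the exact numbers will depend on which indexes exist at the time it runs.

```sql
-- Report per-table logical reads in the Messages output for each statement.
SET STATISTICS IO ON;

SELECT
    c.ID,
    c.Name,
    p.Number
FROM tClaim c
LEFT JOIN tContactPhone p
    ON c.ContactID = p.ContactID
   AND p.Type = 'Home'
WHERE c.Status = 'Won';

-- Turn the reporting back off for the rest of the session.
SET STATISTICS IO OFF;
```

Running the same batch before and after an index change gives a like-for-like comparison of how many pages each table contributes.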

score 2 · Answer 1 · 2021-05-21 17:23:51Z

You can get almost as good a plan with fewer indexes if you cluster tContactPhone by (ContactID, ID) instead of having a clustered index on ID and a separate non-clustered index on ContactID, e.g.:

CREATE TABLE tContactPhone(
 [ContactID] [int] NOT NULL,
 [ID] [int] IDENTITY(1,1) NOT NULL,
 [Type] [nvarchar](25) NOT NULL,
 [Number] [nvarchar](12) NOT NULL,
 CONSTRAINT [pkey_tContactPhone] PRIMARY KEY CLUSTERED 
 (
 [ContactID],[ID] 
 )
)

This is generally a better-performing pattern for "child tables" as the clustered index also supports the foreign key.
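On a table that already exists, the same change could be sketched roughly as follows. This assumes the constraint name pkey_tContactPhone from the sample script in the question and that no foreign key references this primary key; it is an illustration, not a tested migration script.

```sql
-- Drop the existing clustered primary key on ID.
ALTER TABLE tContactPhone
    DROP CONSTRAINT pkey_tContactPhone;

-- Re-add the primary key clustered on (ContactID, ID), so rows for the
-- same contact are stored together.
ALTER TABLE tContactPhone
    ADD CONSTRAINT pkey_tContactPhone
    PRIMARY KEY CLUSTERED (ContactID, ID);
```

Changing the clustered index causes any non-clustered indexes on the table to be rebuilt, so on a large table this is worth trying on a copy first.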

This seems like a better pattern. For the moment, I ignored that I cannot make this change in production, because of a framework we use. I dropped my old pk and added the one you suggested. It still tried to use my ContactID_Type index. I dropped that index and then it still scanned the PK. Link to the Plan

score 1 · Answer 2 · Razvan Socol · 2021-05-21 18:13:03Z

How many won claims do you have (as a percentage of the total claims)? If the percentage is lower than 1% (or if you have fewer than a thousand claims), it may make sense to perform a Nested Loops join with an index seek in ContactID_Type for each contact. Otherwise, hash joins (or merge joins) would probably be better, because using index seeks to read a large portion of the tContactPhone table would be less efficient than reading the entire table with a scan.
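One way to answer the percentage question above is a grouped count per Status. This is a minimal sketch against the tClaim table from the question; the column aliases are made up for the example.

```sql
-- Count claims per status and express each count as a share of all claims.
SELECT
    Status,
    COUNT(*) AS ClaimCount,
    100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS PercentOfClaims
FROM tClaim
GROUP BY Status;
```

With the sample data script from the question, every 25th claim is 'Won', so the 'Won' row should come out at about 4% of the total.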

If a hash join is used with the ContactID_Type index, the Type column cannot be used for a seek, because Type is not the leading key of that index. To seek on Type, the optimizer needs an index that has Type as its first key column. That is why it suggests the index on the Type column: it hopes to read fewer rows from that index.

I think on average we are going to have 1-10% of the claims be won, but I cannot guarantee this. For the example data, I just forced it to be 4% of claims based on a recent sample. I am not sure how to actually force the engine to choose a hash join vs. a merge join, etc. Is that a thing I can do? Just for giggles, I selected my 4k claims into a temp table, joined that to tContactPhone, and got the same result as above. I then added an index on Type_ContactID and it still gave me the 100k rows read, which was not helpful, but at least it tried to seek this time.
– Joshua Grippo, May 21, 2021 at 22:28

You can force a particular join type by specifying it in the JOIN clause (for example INNER LOOP JOIN) or for the entire query by adding OPTION (LOOP JOIN) at the end. The second form applies the join type to all the joins in the query, while the first form also forces the join order, so use both carefully.
– Razvan Socol, May 22, 2021 at 3:49
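
For reference, the two hint forms described in the comment above might be applied to the first query from the question as sketched below. They are shown only as a diagnostic for comparing plans, not as a recommended fix.

```sql
-- Form 1: a join hint on one join. This also forces the join order
-- for the query, as the comment warns.
SELECT c.ID, c.Name, p.Number
FROM tClaim c
LEFT LOOP JOIN tContactPhone p
    ON c.ContactID = p.ContactID
   AND p.Type = 'Home'
WHERE c.Status = 'Won';

-- Form 2: a query-level hint. Every join in the query uses nested loops.
SELECT c.ID, c.Name, p.Number
FROM tClaim c
LEFT JOIN tContactPhone p
    ON c.ContactID = p.ContactID
   AND p.Type = 'Home'
WHERE c.Status = 'Won'
OPTION (LOOP JOIN);
```

Comparing the logical reads of a hinted run against the unhinted plan shows whether the seek-per-contact approach would actually touch fewer pages on a given data set.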

I didn't know you could add a hint to a join to help shape the plan. While I appreciate the suggestion, it is not really what I am looking for. Just as I would rather not use [options], I don't want to use [hints] either. I want to figure out why the engine is not making the plan that I think the majority of people would expect it to make, and/or how to index properly so that it does make a reasonable plan.
– Joshua Grippo, May 23, 2021 at 18:44
