We have a huge SQL Server 2017 database containing more than 5 billion internet pages. One of the tables contains the information about each page: PageID is the clustered primary key, and UriKey has a nonclustered index so we can look up pages by their Uri. The table has many more columns related to the page as well.
When we join that table to other tables on UriKey, the query uses the nonclustered index only if UriKey is the only column we select from it. As soon as we add another column, the plan switches to a clustered index scan, which can take days to complete.
This query uses the correct index and takes less than a second to complete: https://www.brentozar.com/pastetheplan/?id=rkk1vIb07
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
This query selects one extra column (p2.Domain_ID) and takes more than a day to complete: https://www.brentozar.com/pastetheplan/?id=ByOpUIb0m
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
When we specify the index to use with a hint, it works like a charm: https://www.brentozar.com/pastetheplan/?id=SkvkKIbAm
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2 WITH (INDEX(SE_Pages_UriKey))
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
I have noticed this strange behavior with other keys and queries as well, and I didn't see it when we were still on SQL Server 2016. It is making our database very slow.
If we really have to specify the index to use every time, wouldn't it be better to move to a NoSQL database?
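For reference, the relevant layout looks roughly like this (a simplified sketch: the real table has many more columns and the exact types may differ):
-- Simplified sketch of the table in question (columns omitted,
-- types approximated; the real definition is not shown here).
CREATE TABLE dbo.SE_Pages
(
    PageID    bigint       NOT NULL,
    UriKey    varchar(800) NOT NULL,  -- actual type/length unknown
    Domain_ID int          NOT NULL,
    -- ... many more page-related columns ...
    CONSTRAINT PK_SE_Pages PRIMARY KEY CLUSTERED (PageID)
);

CREATE NONCLUSTERED INDEX SE_Pages_UriKey
    ON dbo.SE_Pages (UriKey);  -- used for lookups by page Uri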
Lenard, it seems like you are querying an extra column that isn't in the nonclustered index, and once you ask for a column other than the key column, it will do a clustered index scan. If you can alter the index, I believe you can add Domain_ID as an included column. (Biju jose, Nov 20, 2018 at 10:10)
@Bijujose that's correct for sure, but Domain_ID isn't the only column I need; there are a lot more possible columns. I can of course create an index with all the other columns included, but that means many TB of extra data, and it will slow down updates and inserts because both the nonclustered index and the clustered index have to be updated for every little change. (Gregory Liénard, Nov 20, 2018 at 10:31)
Liénard, yeah, you are right. So are you pulling millions of records to the front end? (Biju jose, Nov 20, 2018 at 10:33)
@Bijujose no, most of the time it is only a selection of a few thousand. (Gregory Liénard, Nov 20, 2018 at 10:51)
A Better Way To Select Star. (Erik Reasonable Rates Darling, Nov 20, 2018 at 12:49)
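For reference, the included-column change discussed in the comments would look roughly like this (a sketch; the real index definition isn't shown in the question):
-- Sketch: rebuild the existing nonclustered index with Domain_ID as an
-- included column, so queries that also select Domain_ID stay covered.
CREATE NONCLUSTERED INDEX SE_Pages_UriKey
    ON dbo.SE_Pages (UriKey)
    INCLUDE (Domain_ID)
    WITH (DROP_EXISTING = ON, ONLINE = ON);  -- ONLINE = ON requires Enterprise Edition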
1 Answer
Sometimes the person writing T-SQL has more accurate information than the query optimizer. In all of your query plans, SQL Server has a final cardinality estimate of 5,360,890,000 rows, which is how many rows are in the dbo.SE_Pages table. Only a single row is ever returned, though, and in the comments you said:
most of the time it [the result set] is only a selection of a few thousand
Something is going very wrong with cardinality estimation. From the optimizer's point of view, it's giving you the best plan that it can: it believes every plan option is expensive because billions of rows need to be returned, and when returning all rows, a clustered index scan is generally costed much cheaper than a nonclustered index seek plus a key lookup for every single row. That bad cardinality estimate is what produces the plan with the scan that takes days. The key to getting better performance without index hints is to fix the cardinality estimate.
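As an illustration of what "fixing the estimate" can look like in practice, one general pattern (a sketch, not verified against your plans or data) is to materialize the computed target URIs into a temp table first, so the second join against SE_Pages compares UriKey to a plain column with its own statistics rather than to a string expression:
-- Sketch only: pre-compute the concatenated keys so the optimizer
-- estimates the final join from a small temp table instead of an
-- expression it can't estimate well.
SELECT r.source,
       r.target,  -- list any other Temp_KeyboostRedirects columns you need
       'https://www.keyboost.nl' + r.target AS TargetUri
INTO #redirects
FROM Temp_KeyboostRedirects AS r;

SELECT kps.Page_ID, p.PageID, t.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN #redirects AS t
    ON p.UriKey LIKE '%/' + t.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = t.TargetUri
WHERE Account_ID = -2147482063;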
You said that this problem seemed to start after you upgraded from SQL Server 2016 to SQL Server 2017. Did you gather statistics after the upgrade? The two most likely causes are that there's something wrong with the statistics for dbo.SE_Pages, or that Microsoft made a change in SQL Server 2017 that's harmful to your workload.
You can test whether statistics are the cause by updating them for that table. You can test whether SQL Server 2017 is the problem by changing the compatibility level of the relevant database to 130, although this could have other effects on your application. If you need another temporary workaround that doesn't involve an index hint, try FAST 1. That hint encourages the optimizer to prefer plans which return the first row as quickly as possible; in your case, it's a way to disfavor plans that do a scan.
If you need additional help, you'll need to add more information to your question. The histogram for UriKey and the table structure (along with an explanation as to why you're partitioning by UriKey) are likely to be helpful.
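A minimal sketch of those three tests in T-SQL, assuming Spidir2 is the database that holds SE_Pages (adjust names to match your environment):
-- 1. Rule out stale statistics on the big table.
USE Spidir2;
UPDATE STATISTICS dbo.SE_Pages;  -- add WITH FULLSCAN for a thorough (but slow) refresh

-- 2. Check whether the 2017 optimizer is the cause by reverting to the
--    SQL Server 2016 compatibility level (may affect other queries).
ALTER DATABASE Spidir2 SET COMPATIBILITY_LEVEL = 130;

-- 3. Keep the query unchanged and add a row goal to steer the optimizer
--    away from the big scan.
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
OPTION (FAST 1);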