We have a huge SQL Server 2017 database containing more than 5 billion internet pages. One of the tables contains the information about each page: PageID is the clustered primary key, and UriKey has a nonclustered index so we can look up pages by their Uri. The table has many more columns related to the page as well.
When we join that table to other tables on UriKey, the query uses the nonclustered index only if UriKey is the only column we select from it. As soon as we add another column, the plan switches to a clustered index scan, which can take days to complete.
This query uses the correct index and takes less than a second to complete: https://www.brentozar.com/pastetheplan/?id=rkk1vIb07
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
This query selects one extra column (p2.Domain_ID) and takes more than a day to complete: https://www.brentozar.com/pastetheplan/?id=ByOpUIb0m
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
When we specify the index to use with a hint, it works like a charm: https://www.brentozar.com/pastetheplan/?id=SkvkKIbAm
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2 WITH (INDEX(SE_Pages_UriKey))
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
I have noticed this strange behavior with other keys and queries as well, and I didn't see it when we were still on SQL Server 2016. It is making our database very slow.
If we really have to specify the index to use every time, wouldn't it be better to move to a NoSQL database?
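For reference, the relevant layout looks roughly like this (a simplified sketch: the real table has many more columns and the exact types may differ):
-- Simplified sketch of the table in question (columns omitted,
-- types approximated; the real definition is not shown here).
CREATE TABLE dbo.SE_Pages
(
    PageID    bigint       NOT NULL,
    UriKey    varchar(800) NOT NULL,  -- actual type/length unknown
    Domain_ID int          NOT NULL,
    -- ... many more page-related columns ...
    CONSTRAINT PK_SE_Pages PRIMARY KEY CLUSTERED (PageID)
);

CREATE NONCLUSTERED INDEX SE_Pages_UriKey
    ON dbo.SE_Pages (UriKey);  -- used for lookups by page Uri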
Lenard, it seems like you are querying an extra column that isn't in the nonclustered index, and once you ask for a column other than the key column, it will do a clustered index scan. If you can alter the index, I believe you can add Domain_ID as an included column. (Biju jose, Nov 20, 2018 at 10:10)
@Bijujose that's correct for sure, but Domain_ID isn't the only column I need; there are a lot more possible columns. I can of course create an index with all the other columns included, but that means many TB of extra data, and it will slow down updates and inserts because both the nonclustered index and the clustered index have to be updated for every little change. (Gregory Liénard, Nov 20, 2018 at 10:31)
Liénard, yeah, you are right. So are you pulling millions of records to the front end? (Biju jose, Nov 20, 2018 at 10:33)
@Bijujose no, most of the time it is only a selection of a few thousand. (Gregory Liénard, Nov 20, 2018 at 10:51)
A Better Way To Select Star. (Erik Reasonable Rates Darling, Nov 20, 2018 at 12:49)
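For reference, the included-column change discussed in the comments would look roughly like this (a sketch; the real index definition isn't shown in the question):
-- Sketch: rebuild the existing nonclustered index with Domain_ID as an
-- included column, so queries that also select Domain_ID stay covered.
CREATE NONCLUSTERED INDEX SE_Pages_UriKey
    ON dbo.SE_Pages (UriKey)
    INCLUDE (Domain_ID)
    WITH (DROP_EXISTING = ON, ONLINE = ON);  -- ONLINE = ON requires Enterprise Edition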
1 Answer
Sometimes the person writing T-SQL has more accurate information than the query optimizer. In all of your query plans, SQL Server has a final cardinality estimate of 5,360,890,000 rows, which is how many rows are in the dbo.SE_Pages table. Only a single row is ever returned, though, and in the comments you said:
most of the time it [the result set] is only a selection of a few thousand
Something is going very wrong with cardinality estimation. From the optimizer's point of view, it's giving you the best plan that it can: it believes every plan option is expensive because billions of rows need to be returned, and when returning all rows, a clustered index scan is generally costed much cheaper than a nonclustered index seek plus a key lookup for every single row. That bad cardinality estimate is what produces the plan with the scan that takes days. The key to getting better performance without index hints is to fix the cardinality estimate.
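As an illustration of what "fixing the estimate" can look like in practice, one general pattern (a sketch, not verified against your plans or data) is to materialize the computed target URIs into a temp table first, so the second join against SE_Pages compares UriKey to a plain column with its own statistics rather than to a string expression:
-- Sketch only: pre-compute the concatenated keys so the optimizer
-- estimates the final join from a small temp table instead of an
-- expression it can't estimate well.
SELECT r.source,
       r.target,  -- list any other Temp_KeyboostRedirects columns you need
       'https://www.keyboost.nl' + r.target AS TargetUri
INTO #redirects
FROM Temp_KeyboostRedirects AS r;

SELECT kps.Page_ID, p.PageID, t.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN #redirects AS t
    ON p.UriKey LIKE '%/' + t.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = t.TargetUri
WHERE Account_ID = -2147482063;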
You said that this problem seemed to start after you upgraded from SQL Server 2016 to SQL Server 2017. Did you gather statistics after the upgrade? The two most likely causes are that there's something wrong with the statistics for dbo.SE_Pages, or that Microsoft made a change in SQL Server 2017 that's harmful to your workload.
You can test whether statistics are the cause by updating them for that table. You can test whether SQL Server 2017 is the problem by changing the compatibility level of the relevant database to 130, although this could have other effects on your application. If you need another temporary workaround that doesn't involve an index hint, try FAST 1. That hint encourages the optimizer to prefer plans which return the first row as quickly as possible; in your case, it's a way to disfavor plans that do a scan.
If you need additional help, you'll need to add more information to your question. The histogram for UriKey and the table structure (along with an explanation as to why you're partitioning by UriKey) are likely to be helpful.
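A minimal sketch of those three tests in T-SQL, assuming Spidir2 is the database that holds SE_Pages (adjust names to match your environment):
-- 1. Rule out stale statistics on the big table.
USE Spidir2;
UPDATE STATISTICS dbo.SE_Pages;  -- add WITH FULLSCAN for a thorough (but slow) refresh

-- 2. Check whether the 2017 optimizer is the cause by reverting to the
--    SQL Server 2016 compatibility level (may affect other queries).
ALTER DATABASE Spidir2 SET COMPATIBILITY_LEVEL = 130;

-- 3. Keep the query unchanged and add a row goal to steer the optimizer
--    away from the big scan.
SELECT kps.Page_ID, p.PageID, r.*, p2.UriKey, p2.Domain_ID
FROM LB_Keywords_Pages_Static AS kps
INNER JOIN Spidir2.dbo.SE_Pages AS p
    ON kps.Page_ID = p.PageID
INNER JOIN Temp_KeyboostRedirects AS r
    ON p.UriKey LIKE '%/' + r.source + '/%'
INNER JOIN Spidir2.dbo.SE_Pages AS p2
    ON p2.UriKey = 'https://www.keyboost.nl' + r.target
WHERE Account_ID = -2147482063
OPTION (FAST 1);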