I have a query that runs very fast with distinct than without distinct. when running both queries, the one with distinct takes 13% execution time, the one without distinct takes 87% execution time.
These are TSQL query in SQL Server 2014 on a database with 110 compatibility mode.
Strange thing is that the query with distinct is using seeks while the query without distinct is using 1 scan. Both query are identical and have the same where clause.
Can you help me understand why the query with the distinct is faster and why the query without distinct is not using a seek ?
Query plans : www.brentozar.com/pastetheplan/?id=ryr18jJsb
Queries:
select
SMS_R_SYSTEM.ItemKey
,SMS_R_SYSTEM.Name0,SMS_R_SYSTEM.SMS_Unique_Identifier0
,SMS_R_SYSTEM.Resource_Domain_OR_Workgr0
,SMS_R_SYSTEM.Client0
from vSMS_R_System as SMS_R_SYSTEM
inner join Add_Remove_Programs_DATA
on Add_Remove_Programs_DATA.MachineID = SMS_R_System.ItemKey
inner join Add_Remove_Programs_64_DATA
on Add_Remove_Programs_64_DATA.MachineID = SMS_R_System.ItemKey
where Add_Remove_Programs_DATA.DisplayName00 = 'aze'
or Add_Remove_Programs_64_DATA.DisplayName00 = 'aze'
;
select distinct
SMS_R_SYSTEM.ItemKey
,SMS_R_SYSTEM.Name0,SMS_R_SYSTEM.SMS_Unique_Identifier0
,SMS_R_SYSTEM.Resource_Domain_OR_Workgr0
,SMS_R_SYSTEM.Client0
from vSMS_R_System as SMS_R_SYSTEM
inner join Add_Remove_Programs_DATA
on Add_Remove_Programs_DATA.MachineID = SMS_R_System.ItemKey
inner join Add_Remove_Programs_64_DATA
on Add_Remove_Programs_64_DATA.MachineID = SMS_R_System.ItemKey
where Add_Remove_Programs_DATA.DisplayName00 = 'aze'
or Add_Remove_Programs_64_DATA.DisplayName00 = 'aze'
;
1 Answer 1
From the way I read your execution plans, your DISTINCT
query is selecting columns only from vSMS_R_System
, so the optimizer chooses to read that data first (retrieving the DISTINCT
values used in your SELECT
) and then uses a NESTED LOOP
to join up against the other tables.
The non-DISTINCT
query forces the optimizer to assume that many more rows might need to be read from vSMS_R_System
, so it chooses a MERGE JOIN
to scan the other tables and then a NESTED LOOP
to match up against vSMS_R_System
.
-
I was expecting the optimizer to seek the index for the non-distinct query. This is part that is confusing me since there is an obvious index to use. the distinct query is using it.Florent.K– Florent.K2017年09月29日 14:11:46 +00:00Commented Sep 29, 2017 at 14:11
Explore related questions
See similar questions with these tags.
DISTINCT
being present. It may help to know what keys and indexes are defined on those tables so I suggest editing that information into the question. Though as the first plan is scanning ~400K and ~4M rows from theAdd_Remove_*
tables, ifDisplayName00 = 'aze'
is particularly selective it could help that plan massively to have an index on each of those columns, perhapsINCLUDE
ingMachineID
'.