I am trying to tune the below query which takes 15-16 seconds no matter what value is passed in as a parameter, the query is:
select distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
from datagatheringruntime dgr
inner join processentitymapping pem on pem.entityid = dgr.entityid
inner join document d on d.entityid = pem.entityid or d.unitofworkid = pem.processid
left join PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
where rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
OPTION(RECOMPILE)
I have updated the statistics for all the tables touched by the query (excluding the DataGatheringRuntime
table which is quite big at ~100GB
) and have tried re-factoring the query using a CTE
but get the same execution plan and need some assistance.
The actual execution plan can be found here:
https://www.brentozar.com/pastetheplan/?id=ByUVIqlFE
It's clear from the execution plan that the problem lies with the outer input on the nested loop join
specifically with the lazy table spool
following the scan
of the non-clustered IX_Camunda_1
index on the Document
table but I have no idea how to tackle that issue and would appreciate any guidance.
-
Could you please provide DDL of these tables. I am not sure if this is relevant here but, have you tried using join options like inner loop join.Learning_DBAdmin– Learning_DBAdmin2019年04月02日 08:22:40 +00:00Commented Apr 2, 2019 at 8:22
2 Answers 2
I would try removing the OR
clause in the join between document
and processingentitymapping
You could do that with UNION
SELECT distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
FROM datagatheringruntime dgr
INNER JOIN processentitymapping pem on pem.entityid = dgr.entityid
INNER JOIN document d on d.entityid = pem.entityid
LEFT JOIN PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
WHERE rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
UNION
SELECT distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
FROM datagatheringruntime dgr
INNER JOIN processentitymapping pem on pem.entityid = dgr.entityid
INNER JOIN document d on d.unitofworkid = pem.processid
LEFT JOIN PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
WHERE rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
OPTION(RECOMPILE);
The reason being that the table spool is feeding the NESTED LOOPS
operator
enter image description here
And on this nested loops operator is the OR
predicate.
enter image description here
Filtering out until we have 9 rows remaining.
Changing the OR
to a UNION
should remove the spool, you might have to look into indexing after removing the OR
.
Indexes that could improve performance after rewriting with UNION
CREATE INDEX IX_EntityId
on document(EntityId)
INCLUDE(DocumentPath, DocumentName, DateCreated, PendingCorrespondenceId);
CREATE INDEX IX_UnitOfWorkId
on document(UnitOfWorkId)
INCLUDE(DocumentPath, DocumentName, DateCreated, PendingCorrespondenceId);
See here for another example on this
-
thank you so much for that, the query now executes in under one second (without any index changes). Thanks for posting that link too, you've helped me big time and I've learned something new, much appreciated!Fza– Fza2019年04月02日 09:08:05 +00:00Commented Apr 2, 2019 at 9:08
-
@Fza Awesome, good to hear :). No problem, happy to help!Randi Vertongen– Randi Vertongen2019年04月02日 09:27:49 +00:00Commented Apr 2, 2019 at 9:27
Instead of processing DataGatheringRuntime table which is quite big at ~100GB)
mutiples times
process it single time by putting them in #temp
table (削除) or CTE
(削除ここまで)
Then remove Distinct
. If there is duplicate data then find the reason behind duplicate data
and remove duplicate data by writing correct query.
What is the purpose of Distinct
and UNION
in same query ?
Create table #temp(entityid int,processid int)
select pem.entityid,pem.processid
from datagatheringruntime dgr
inner join processentitymapping pem on pem.entityid = dgr.entityid
where rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
--OPTION(RECOMPILE)
select d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
from document d
left join PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
where exists(select 1 from #temp pem where d.entityid = pem.entityid )
UNION ALL
select d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
from document d
left join PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
where exists(select 1 from #temp pem where d.unitofworkid = pem.processid )
-
CTEs don't persist data.Erik Reasonable Rates Darling– Erik Reasonable Rates Darling2019年08月02日 10:03:33 +00:00Commented Aug 2, 2019 at 10:03
Explore related questions
See similar questions with these tags.