SQL Server expensive nested loops join and lazy table spool

Question 1

I am trying to tune the below query which takes 15-16 seconds no matter what value is passed in as a parameter, the query is:

select distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
from datagatheringruntime dgr 
 inner join processentitymapping pem on pem.entityid = dgr.entityid 
 inner join document d on d.entityid = pem.entityid or d.unitofworkid = pem.processid
 left join PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
 where rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
OPTION(RECOMPILE)

I have updated the statistics for all the tables touched by the query (excluding the DataGatheringRuntime table which is quite big at ~100GB) and have tried re-factoring the query using a CTE but get the same execution plan and need some assistance.

The actual execution plan can be found here:

https://www.brentozar.com/pastetheplan/?id=ByUVIqlFE

It's clear from the execution plan that the problem lies with the outer input on the nested loop join specifically with the lazy table spool following the scan of the non-clustered IX_Camunda_1 index on the Document table but I have no idea how to tackle that issue and would appreciate any guidance.

enter image description here

Question 2

Could you please provide DDL of these tables. I am not sure if this is relevant here but, have you tried using join options like inner loop join.

Question 3

I would try removing the OR clause in the join between document and processingentitymapping

You could do that with UNION

SELECT distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
FROM datagatheringruntime dgr 
INNER JOIN processentitymapping pem on pem.entityid = dgr.entityid 
INNER JOIN document d on d.entityid = pem.entityid 
LEFT JOIN PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
WHERE rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
UNION
SELECT distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
FROM datagatheringruntime dgr 
INNER JOIN processentitymapping pem on pem.entityid = dgr.entityid 
INNER JOIN document d on d.unitofworkid = pem.processid
LEFT JOIN PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
WHERE rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
OPTION(RECOMPILE);

The reason being that the table spool is feeding the NESTED LOOPS operator enter image description here

And on this nested loops operator is the OR predicate. enter image description here

Filtering out until we have 9 rows remaining.

Changing the OR to a UNION should remove the spool, you might have to look into indexing after removing the OR.

Indexes that could improve performance after rewriting with UNION

CREATE INDEX IX_EntityId
on document(EntityId)
INCLUDE(DocumentPath, DocumentName, DateCreated, PendingCorrespondenceId);
CREATE INDEX IX_UnitOfWorkId
on document(UnitOfWorkId)
INCLUDE(DocumentPath, DocumentName, DateCreated, PendingCorrespondenceId);

See here for another example on this

Question 4

thank you so much for that, the query now executes in under one second (without any index changes). Thanks for posting that link too, you've helped me big time and I've learned something new, much appreciated!

Question 5

@Fza Awesome, good to hear :). No problem, happy to help!

Question 6

Instead of processing DataGatheringRuntime table which is quite big at ~100GB) mutiples times process it single time by putting them in #temp table ~~(削除) or CTE (削除ここまで)~~

Then remove Distinct. If there is duplicate data then find the reason behind duplicate data and remove duplicate data by writing correct query.

What is the purpose of Distinct and UNION in same query ?

Create table #temp(entityid int,processid int)
select pem.entityid,pem.processid
from datagatheringruntime dgr 
 inner join processentitymapping pem on pem.entityid = dgr.entityid 
 where rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
 --OPTION(RECOMPILE)
 select d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
from document d
left join PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
 where exists(select 1 from #temp pem where d.entityid = pem.entityid )
 UNION ALL
 select d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
from document d
left join PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
 where exists(select 1 from #temp pem where d.unitofworkid = pem.processid )

Question 7

CTEs don't persist data.

score 5 · Accepted Answer · 2019-04-02 08:17:17Z

I would try removing the OR clause in the join between document and processingentitymapping

You could do that with UNION

SELECT distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
FROM datagatheringruntime dgr 
INNER JOIN processentitymapping pem on pem.entityid = dgr.entityid 
INNER JOIN document d on d.entityid = pem.entityid 
LEFT JOIN PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
WHERE rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
UNION
SELECT distinct d.documentpath as path, d.documentname as name, d.datecreated as created, pc.DateProcessed
FROM datagatheringruntime dgr 
INNER JOIN processentitymapping pem on pem.entityid = dgr.entityid 
INNER JOIN document d on d.unitofworkid = pem.processid
LEFT JOIN PendingCorrespondence pc on pc.PendingCorrespondenceId = d.PendingCorrespondenceId
WHERE rootid = @P0 and dgr.name in('cust_pn', 'case_pn')
OPTION(RECOMPILE);

The reason being that the table spool is feeding the NESTED LOOPS operator enter image description here

And on this nested loops operator is the OR predicate. enter image description here

Filtering out until we have 9 rows remaining.

Changing the OR to a UNION should remove the spool, you might have to look into indexing after removing the OR.

Indexes that could improve performance after rewriting with UNION

CREATE INDEX IX_EntityId
on document(EntityId)
INCLUDE(DocumentPath, DocumentName, DateCreated, PendingCorrespondenceId);
CREATE INDEX IX_UnitOfWorkId
on document(UnitOfWorkId)
INCLUDE(DocumentPath, DocumentName, DateCreated, PendingCorrespondenceId);

See here for another example on this

thank you so much for that, the query now executes in under one second (without any index changes). Thanks for posting that link too, you've helped me big time and I've learned something new, much appreciated!

Stack Exchange Network

SQL Server expensive nested loops join and lazy table spool

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

SQL Server expensive nested loops join and lazy table spool

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions