1
\$\begingroup\$

I'm having performance issues with this query. If I remove the status column, it runs very fast; but adding the subquery in the column section delays the query way too much (1.02 min execution). How can I modify this query so it runs faster while still getting the desired data?

The reason I put that subquery there is because I needed the status for the latest activity; some activities have null status so I have to ignore them.

Establishments: 6.5k rows - EstablishmentActivities: 70k rows - Status: 2 (Active, Inactive)

SELECT DISTINCT
 est.id,
 est.trackingNumber,
 est.NAME AS 'establishment',
 actTypes.NAME AS 'activity',
 (
 SELECT stat3.NAME
 FROM SACPAN_EstablishmentActivities eact3
 INNER JOIN SACPAN_ActivityTypes at3
 ON eact3.activityType_FK = at3.code
 INNER JOIN SACPAN_Status stat3
 ON stat3.id = at3.status_FK
 WHERE eact3.establishment_FK = est.id
 AND eact3.rowCreatedDT = (
 SELECT MAX(est4.rowCreatedDT)
 FROM SACPAN_EstablishmentActivities est4
 INNER JOIN SACPAN_ActivityTypes at4
 ON est4.establishment_fk = est.id
 AND est4.activityType_FK = at4.code
 WHERE est4.establishment_fk = est.id
 AND at4.status_FK IS NOT NULL
 )
 AND at3.status_FK IS NOT NULL
 ) AS 'status',
 est.authorizationNumber,
 reg.NAME AS 'region',
 mun.NAME AS 'municipality',
 ISNULL(usr.NAME, '') + ISNULL(+ ' ' + usr.lastName, '')
 AS 'created',
 ISNULL(usr2.NAME, '') + ISNULL(+ ' ' + usr2.lastName, '')
 AS 'updated',
 est.rowCreatedDT,
 est.rowUpdatedDT,
 CASE WHEN est.rowUpdatedDT >= est.rowCreatedDT
 THEN est.rowUpdatedDT
 ELSE est.rowCreatedDT
 END AS 'LatestCreatedOrModified'
FROM SACPAN_Establishments est
INNER JOIN SACPAN_EstablishmentActivities eact
 ON est.id = eact.establishment_FK
INNER JOIN SACPAN_ActivityTypes actTypes
 ON eact.activityType_FK = actTypes.code
INNER JOIN SACPAN_Regions reg
 ON est.region_FK = reg.code --
INNER JOIN SACPAN_Municipalities mun
 ON est.municipality_FK = mun.code
INNER JOIN SACPAN_ContactEstablishments ce
 ON ce.establishment_FK = est.id
INNER JOIN SACPAN_Contacts con
 ON ce.contact_FK = con.id
--JOIN SACPAN_Status stat ON stat.id = actTypes.status_FK
INNER JOIN SACPAN_Users usr
 ON usr.id = est.rowCreatedBy_FK
LEFT JOIN SACPAN_Users usr2
 ON usr2.id = est.rowUpdatedBy_FK
WHERE (con.ssn = @ssn OR @ssn = '*')
 AND eact.rowCreatedDT = (
 SELECT MAX(eact2.rowCreatedDT)
 FROM SACPAN_EstablishmentActivities eact2
 WHERE eact2.establishment_FK = est.id
)
--AND est.id = 6266
ORDER BY 'LatestCreatedOrModified' DESC

I can't upload the execution plan because of rep points, but there's a 21% on one of the establishment activity tables.

Phrancis
20.5k6 gold badges69 silver badges155 bronze badges
asked Mar 4, 2013 at 1:40
\$\endgroup\$
1
  • \$\begingroup\$ Please provide DDL query for the tables and execution plan \$\endgroup\$ Commented Mar 4, 2013 at 10:52

2 Answers 2

3
\$\begingroup\$

Nitpicking

I find your table aliases to be difficult to follow. Take this with a grain of salt coming from an outsider to your organization. I find eact, eact3, eact4, etc. to be a bit confusing, for example. Granted, if there is an internal, consistent standard, disregard this part.

SELECT DISTINCT

This is a pretty expensive operation, and usually better to use GROUP BY instead. That way you run your uniqueness check on the result set instead of the source data set (which could be a LOT larger) and only on the columns you think/know may have duplicate values. For example:

GROUP BY est.id, est.trackingNumber, actTypes.NAME

Create a stored statement

Given the look of this whole script, this doesn't smell like an ad-hoc query to me. Assuming that's accurate, you would benefit from creating a stored statement, in this case likely a stored procedure. This commented line at the bottom kind of gave me a hint:

-- AND est.id = 6266

For example:

CREATE PROCEDURE FooBar
 @Establishment_ID INT
AS
BEGIN
/* your query here */
AND est.id = @Establishment_ID -- your commented line
ORDER BY 'LatestCreatedOrModified' DESC
END;
GO

Then, whenever needed:

EXEC FooBar 6266;

This way, the execution plan will be stored after it is called for the first time, speeding up (sometimes dramatically) execution after it is ran once.

Compliments

I have to note that your formatting, capitalization and aliasing are consistent. Kudos for that.

The elephant in the room

To me, the slowness of this query could be largely due to multiple levels of nested subqueries, each with multiple join operations. It appears that you join the same tables over and over, and this could be mitigated to some extent by having your nested where clauses in your outer query, whenever possible.

This:

 FROM SACPAN_EstablishmentActivities eact3 --(1)
 INNER JOIN SACPAN_ActivityTypes at3 -- (2)
 ON eact3.activityType_FK = at3.code
 INNER JOIN SACPAN_Status stat3 -- (3)

That:

 FROM SACPAN_EstablishmentActivities est4 -- (1)
 INNER JOIN SACPAN_ActivityTypes at4 -- (2)

And the other:

INNER JOIN SACPAN_EstablishmentActivities eact -- (1)
 ON est.id = eact.establishment_FK
INNER JOIN SACPAN_ActivityTypes actTypes -- (2)
 ON eact.activityType_FK = actTypes.code
INNER JOIN SACPAN_Regions reg
 ON est.region_FK = reg.code --
INNER JOIN SACPAN_Municipalities mun
 ON est.municipality_FK = mun.code
INNER JOIN SACPAN_ContactEstablishments ce
 ON ce.establishment_FK = est.id
INNER JOIN SACPAN_Contacts con
 ON ce.contact_FK = con.id
--JOIN SACPAN_Status stat -- (3)

Notice I've commented out where you are joining all the same tables. You could more than likely work most these conditions into your where clause and avoid unneeded overhead. It's impossible for me to test it out, since I don't have access to your DB, but try to eliminate subqueries/CTEs as much as you can help it.

answered Aug 13, 2014 at 3:01
\$\endgroup\$
2
\$\begingroup\$

Your code is calling several tables multiple times when it doesn't seem like it needs to. Can you pull out your Activities related tables into common table expressions and just link to those in your main query. Rough example:

;with created as
(
 select establishment_fk, max(rowCreatedDT) as maxCreated
 FROM SACPAN_EstablishmentActivities
 JOIN SACPAN_ActivityTypes at4
 ON activityType_FK = at4.code
 WHERE at4.status_FK IS NOT NULL 
),
 status as 
(
 SELECT eact3.establishment_fk, stat3.NAME as status, at3.NAME as activity
 FROM SACPAN_EstablishmentActivities eact3
 INNER JOIN SACPAN_ActivityTypes at3
 ON eact3.activityType_FK = at3.code
 INNER JOIN SACPAN_Status stat3
 ON stat3.id = at3.status_FK
 where exists
 (select 1 from created 
 where created.establishment_fk = eact3.establishment_fk
 and created.maxCreated = eact3.rowCreatedDT)
)
SELECT DISTINCT
 est.id,
 est.trackingNumber,
 est.NAME AS 'establishment',
 actTypes.NAME AS 'activity',
 AS 'status',
 est.authorizationNumber,
 reg.NAME AS 'region',
 mun.NAME AS 'municipality',
 ISNULL(usr.NAME, '') + ISNULL(+ ' ' + usr.lastName, '')
 AS 'created',
 ISNULL(usr2.NAME, '') + ISNULL(+ ' ' + usr2.lastName, '')
 AS 'updated',
 est.rowCreatedDT,
 est.rowUpdatedDT,
 CASE WHEN est.rowUpdatedDT >= est.rowCreatedDT
 THEN est.rowUpdatedDT
 ELSE est.rowCreatedDT
 END AS 'LatestCreatedOrModified'
FROM SACPAN_Establishments est
INNER JOIN status on status.establishment_fk = est.id
INNER JOIN SACPAN_Regions reg
 ON est.region_FK = reg.code --
INNER JOIN SACPAN_Municipalities mun
 ON est.municipality_FK = mun.code
INNER JOIN SACPAN_ContactEstablishments ce
 ON ce.establishment_FK = est.id
INNER JOIN SACPAN_Contacts con
 ON ce.contact_FK = con.id
INNER JOIN SACPAN_Users usr
 ON usr.id = est.rowCreatedBy_FK
LEFT JOIN SACPAN_Users usr2
 ON usr2.id = est.rowUpdatedBy_FK
WHERE (con.ssn = @ssn OR @ssn = '*')

That may be a little rough, but the idea here is to pull out what your Max Created date is just once and then just get your status as one set. I'm fairly certain that a large part of your slowness on the status subquery is how it's having to run; if you can get it into one set that you can then connect to the main query, will hopefully help.

answered Mar 12, 2013 at 23:35
\$\endgroup\$
1
  • 1
    \$\begingroup\$ I would like to say, while CTEs make the code much easier to read, they generally don't improve performance, because the CTE has to be ran every time it's called, just like a subquery. But I do like how your query looks a lot better. \$\endgroup\$ Commented Aug 13, 2014 at 3:24

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.