I'm dealing with performance issues with a 3rd-party view that's pulling from our production db. Troubleshooting is currently being directed at the hardware of our SQL Server box. Just to give you an idea of our production database's size, the largest table is around 32mil records. There are no major performance issues within our production environment itself.
I'm concerned that the view itself is the issue; that we may end up bending over backwards on hardware overhauls (adding more SSDs, indexing, and potentially even partitioning production tables) for nothing. I only have a few years' experience with SQL, so I wanted someone with a more trained eye to see if anything jumps out as being bad practice in this view?
A simple select top 10 * from new_view
statement takes over 40 seconds to run. I'm pasting a redacted version of the query below, just to show structure.
SELECT
A.xxxxx,
A.xxxxxxxxx,
A.xxxxxx,
E.xxxxxxx + '-' + substring(E.xxxxx, 1, LEN(E.xxxxx) - 1) + '-' + SUBSTRING(E.xxxxx, LEN(E.xxxxx), 1) [xxxxx],
A.xxxxxxx,
A.xxxxxxx,
A.xxxxxx,
A.xxxxxxxx,
SUM(ISNULL(B.xxx_xxx, 0)),
SUM(ISNULL(B.xxxxxx_xxxx, 0)),
A.xxxxx,
A.xxxxx,
F.xxxxx,
A.xxxxx,
A.xxxxx,
C.xxxxxxxxxxxx,
C.xxxxxxxxxxxxxxxxxxx,
C.xxxxxxxxxxx,
A.xxxxxxxx,
C.xxxxxxxxxxxxxxxx,
C.xxxxxxxxxxxxxx,
C.xxxxxxxxxxxxxxx,
C.xxxxxxxxxx,
D.xxxxxxxxx,
E.xxxxxxxxxx,
E.xxxxxxx,
CASE
WHEN
EXISTS
(
SELECT
1
FROM
Production.dbo.xxxxxxxxxxxxxxx A1
WHERE
A1.xxxxxx = D .xxxxxx
AND A1.xxxxxx = 'xxxxxx')
THEN
CAST(1 AS bit)
ELSE
CAST(0 AS bit)
END [xxxxxxxxx],
CASE
WHEN
EXISTS
(
SELECT
1
FROM
Production.dbo.xxxxxxxxx A1
WHERE
A1.[KEY] = D.xx_xx
AND A1.xxxxxxx = 'xxxxxxxxxx'
AND A1.xx_xxxx IS NOT NULL)
THEN
CAST(1 AS bit)
ELSE
CAST(0 AS bit)
END [xxxxxxxx],
G.xxxxxxxxxxxxx,
G.xxxxxxxxxxxx,
G.xxxxxxxxxxxxxxxxxx,
H.xxxxxxxxxxxxxx,
I.xxxxxxxxx,
J.xxxxxxxx,
K.xxxxxxxxx,
L.xxxxxxx,
CASE
WHEN
A.xxxx = 'xxxx'
AND A.xxxxxxx LIKE 'x%'
THEN
CAST(1 AS BIT)
ELSE
CAST(0 AS BIT)
END [xxxxxxxx],
CASE
WHEN
A.xxxxxxxx = 'xxxx'
AND A.xxxxxxx LIKE 'x%'
THEN
CAST(1 AS BIT)
ELSE
CAST(0 AS BIT)
END [xxxxxxxxx]
FROM
Production.dbo.xxxxxxxx A
LEFT OUTER JOIN
Production.dbo.xxxxxxxxx F
ON
A.xxxxxxx = F.xx_xx
INNER JOIN
Production.dbo.xxxxx B
ON
A.xxID = B.xxID
OUTER APPLY
(
SELECT
TOP 1 A1.xxxxxxID,
A1.xxxxx,
B1.xxxxxxxxxxx,
A1.xxxxxxxxxx,
A1.xxxxxxxx,
A1.xxxxxxxx,
A1.xxxxxxxxx,
A1.xxxxxxxxxx,
A1.xxxxxxxxxxxx
FROM
dbo.xxxxxxx A1
INNER JOIN
dbo.xxxxxxxx B1
ON
A1.xxxID = B1.xxxID
WHERE
A1.xxxID = A.xxxID
ORDER BY
A1.xxxxxx DESC) C
INNER JOIN
Production.dbo.xxxxxxxxx D
ON
A.xxxID = D .xxxxID
INNER JOIN
Production.dbo.xxxxxxxxx E
ON
D .xxxID = E.xxxxID
AND E.xxxx = 'Y'
LEFT OUTER JOIN
dbo.xxxxxxxxxxxxx G
ON
A.xxCD = G.xxCD
LEFT OUTER JOIN
(
SELECT
A1.xxxID,
STRING_AGG(A1.xxxxx, ',') [xxxxxxxxxxx]
FROM
dbo.xxxxxxxxxxxxxxx A1
GROUP BY
A1.xxxxxxxxxID) H
ON
G.xxxxxxxxxxID = H.xxxxxxxxxxxID
OUTER APPLY
(
SELECT
TOP 1 A1.xxxxID
FROM
Production.dbo.xxxxxxx A1
WHERE
A1.xxxxID = D.xxx_ID
AND A1.xxxxxx IN ('xxxxxxx')
ORDER BY
A1.xxxxxx DESC) I
OUTER APPLY
(
SELECT
TOP 1 A1.xxxxxID
FROM
Production.dbo.xxxxxxxxx A1
WHERE
A1.xxxxID = D.xxxxID
AND A1.xxxxxx IN ('xxxxx')
ORDER BY
A1.xxxxx DESC) J
OUTER APPLY
(
SELECT
TOP 1 A1.xxxxxxxx
FROM
Production.dbo.xxxxxxxx A1
WHERE
A1.xxxxID = D.xxxID
AND A1.xxxxx IN ('xxxxxx')
ORDER BY
A1.xxxxxx DESC) K
LEFT OUTER JOIN
Production.dbo.xxxxxxxx L
ON
E.xxxID = L.xxxx_id
GROUP BY
A.xxxxx,
A.xxxxxxxxx,
A.xxxxx,
A.xxxxxxx,
A.xxxxxxx,
A.xxxxxxxxx,
A.xxxxxxxxxx,
A.xxxxx,
A.xxxxxxx,
F.xxxxxx,
A.xxxxxxx,
A.xxxxxx,
C.xxxxxxxxxxxxxxx,
C.xxxxxxxxxxxxxxxxxxx,
A.xxxxxxxx,
C.xxxxxxxxxxxxxx,
C.xxxxxxxxxxxxxxx,
C.xxxxxxxxxxxxxxxx,
C.xxxxxxxxxxx,
E.xxxxxxxxxx,
E.xxxxxxxxx,
D.xxxxxxxxx,
D.xxxxxx,
D.xxxxx,
C.xxxxxxxxxxx,
G.xxxxxxxxxxxxxxxxxxxxx,
G.xxxxxxxxxxxxxxxxxxxxxxxxxxxx,
G.xxxxxxxxxxxxxxxxxxxxxxxxxx,
H.xxxxxxxxxxxxxxx,
E.xxxxx,
E.xxxxxxxx,
I.xxxxxxxxx,
J.xxxxxxxxx,
K.xxxxxxxxx,
L.xxxxxxx,
A.xxxxxxxxx
Anonymized version (.pesession
file) of the execution plan:
https://jkj7.com/files/exeplan_anonymized.pesession
2 Answers 2
Anonymized plans are a misery to look at, but you'll want to start by addressing these Eager Index Spools:
Looking at the query you provided, they are coming from the three OUTER APPLY subqueries.
You can find out more about fixing those in these posts, both by me:
- How Eager Index Spool Operators Hurt Query Performance In SQL Server
- Indexing SQL Server Queries For Performance: Fixing An Eager Index Spool
If you need further help, please provide an actual execution plan. Estimated plans leave much to be desired as far as details go.
-
Thank you for the guidance on the outer apply. I realize you're limited on details here, but zooming out a bit, would you agree this is a query-design issue and not a hardware/db-size one? [FYI, our hardware is running on flash-SSD's with avg. latency of .006ms, peak ~1ms.]James Johnson– James Johnson2024年07月30日 21:20:53 +00:00Commented Jul 30, 2024 at 21:20
-
1@JamesJohnson if you'd like that sort of assessment, I am available for consulting.Erik Reasonable Rates Darling– Erik Reasonable Rates Darling2024年07月30日 21:29:25 +00:00Commented Jul 30, 2024 at 21:29
Subqueries in a column expression are often a source of performance issues. SQL Server generally can't (or won't try to) optimize those, so they get run in sequence, over and over again. I would try converting those CASE WHEN EXISTS(SELECT 1 ...) THEN CAST(1 AS bit) ELSE CAST(0 AS bit) END
statement to use an OUTER APPLY
. This doesn't always work, but it's a quick and easy thing to do that sometimes has a big pay-off.
It should also be noted that the larger the GROUP BY
clause is, the slower the query generally runs. If there's any logical way to group on a smaller set of predicates, possibly in an inner query before taking on everything else, that give you a performance boost too.