In a number of the databases I'm helping to maintain, there is a pattern where the code passes in a list of IDs as an XML string to a stored procedure. A user-defined function turns them into a table which is then used to match IDs. The function that accomplishes this looks like this:
ALTER function [dbo].[XMLIdentifiers] (@xml xml)
returns table
as
return (
--get the ids from the xml
select Item.value('.', 'int') as id from @xml.nodes('//id') as T(Item)
)
This works fine when we test it in a wide variety of scenarios in SSMS. But if it's called within a stored procedure, it will run extremely slowly and the execution plan will show it spending the large majority of time parsing these IDs. This is true whether we write the results to a temp table, join to them or use them in a subquery.
Example from an execution plan using the above function:
(image: execution plan)
Can anyone offer insight as to why we are seeing such poor performance from these queries? Is there a better way to parse the values from XML? Most of the databases using this pattern are on SQL Server 2014 or 2016.
- First and foremost I'd try to get them passed in another format - it should be trivial for the app to convert those into a delimited list or a TVP. How many IDs are typically being sent over? Can you provide the actual SP contents as well and the full execution plan? – LowlyDBA - John M, Dec 15, 2017 at 14:52
- What datatype is the value you use as a parameter to the function? (n)varchar() or xml? – Mikael Eriksson, Dec 15, 2017 at 14:58
- They are being passed in with the XML datatype. – Wayne Rossi, Dec 15, 2017 at 14:59
- The costs shown for XML operators are not at all reliable. They are based only on estimates, with no idea of the size or complexity of the XML that will actually be received. With the temp table approach, how long does the statement that parses and inserts into the temp table actually take to execute? – Martin Smith, Dec 15, 2017 at 15:21
- Even 1 second to parse out 3 values is way too slow. I don't believe the query you have shown here is the reason your procedure is slow. – Mikael Eriksson, Dec 15, 2017 at 22:24
1 Answer
To answer part of your question, yes there is a better way to parse the values from the XML.
Always(*) extract the text() from the XML node at the earliest opportunity.
(* "It depends", but always test to make sure.)
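In your case, this means changing the nodes() method to target the text() node as part of the XPath query. Applying value('.') to an element asks for the concatenated text of the element and all of its descendants, whereas targeting text() directly gives the engine a single text node to convert: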
ALTER function [dbo].[XMLIdentifiers] (@xml xml)
returns table
as
return (
--get the ids from the xml
select Item.value('.', 'int') as id from @xml.nodes('//id/text()') as T(Item)
)
The simple test below shows a dramatic improvement in the statistics output, and the execution plan changes significantly as well.
declare @x xml;
-- Build a test document of up to 100,000 <n> elements
select @x = (
    select top(100000) row_number() over(order by @@spid) as [n]
    from sys.columns as [a], sys.columns as [b]
    for xml auto, elements, type
);
declare @c int;
set statistics io,time on;
-- Usual suspect
select @c = nd.value('.','int')
from @x.nodes('//n') x(nd);
-- Using text() when we might also need other parts of the node
select @c = nd.value('(./text())[1]','int')
from @x.nodes('//n') x(nd);
-- Using text() when that is all we need from the node
select @c = nd.value('.','int')
from @x.nodes('//n/text()') x(nd);
set statistics io,time off;
RESULTS
Not using text(): CPU time = 1281 ms, elapsed time = 1459 ms.
(image: Without text node)
Using text() late: CPU time = 719 ms, elapsed time = 739 ms.
(image: With late text node)
Using text() early: CPU time = 406 ms, elapsed time = 473 ms.
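Before swapping the XPath inside the function, it may be worth confirming that both forms return the same ids for your own data. Here is a minimal sketch, assuming a small hypothetical payload (the `<ids>` wrapper and the sample values are made up for illustration; only the `//id` path comes from the question):

```sql
-- Hypothetical sample input: the <ids> wrapper and the values are assumptions.
declare @ids xml = N'<ids><id>1</id><id>2</id><id>3</id></ids>';

-- Original form: value() is applied to each <id> element.
select Item.value('.', 'int') as id
from @ids.nodes('//id') as T(Item);

-- Revised form: nodes() shreds straight down to the text node.
select Item.value('.', 'int') as id
from @ids.nodes('//id/text()') as T(Item);

-- Returns no rows when every id produced by the original form
-- is also produced by the revised form.
select Item.value('.', 'int') as id
from @ids.nodes('//id') as T(Item)
except
select Item.value('.', 'int') as id
from @ids.nodes('//id/text()') as T(Item);
```

One difference to keep in mind: an empty `<id/>` element has no text node, so the `//id/text()` form returns no row for it. If empty elements can appear in your input, test how each form handles them.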