I have a SELECT query against a linked server. It looks similar to this:
SELECT @Var1 = (SELECT COL1
FROM [LinkedServer].[dbo].[TableA]
WHERE <Condition1>),
@Var2 = (SELECT COL1
FROM [LinkedServer].[dbo].[TableA]
WHERE <Condition2>),
@Var3 = (SELECT COL1
FROM [LinkedServer].[dbo].[TableA]
WHERE <Condition3>),
...
FROM [LinkedServer].[dbo].[TableB]
WHERE <Condition4>
So there are multiple SELECT statements against the same table on this linked server. Is there a way to improve the performance? Is copying TableA to a temp table on the local database server a bad idea? TableA is large.
[UPDATE1]: Updated my query
SELECT VAR_1,
(SELECT SOL
FROM [LinkedServer].[dbo].[TableSol]
WHERE COL_1 = [TableA].COL_1
AND COL_2 = [TableA].COL_2
AND COL_A = [TableA].COL_A) AS VAR_2,
(SELECT SOL
FROM [LinkedServer].[dbo].[TableSol]
WHERE COL_1 = [TableA].COL_3
AND COL_2 = [TableA].COL_4
AND COL_A = [TableA].COL_A) AS VAR_3,
(SELECT SOL
FROM [LinkedServer].[dbo].[TableSol]
WHERE COL_1 = [TableA].COL_5
AND COL_2 = [TableA].COL_6
AND COL_A = [TableA].COL_A) AS VAR_4
FROM [LinkedServer].[dbo].[TableA]
INNER JOIN [LinkedServer].[dbo].[TableB] ON [TableA].COL_10 = [TableB].COL_11
It took 15 seconds to run the above query. I changed it to use a crosstab like below:
SELECT VAR_1,
SOL.VAR_2,
SOL.VAR_3,
SOL.VAR_4
FROM [LinkedServer].[dbo].[TableA]
INNER JOIN [LinkedServer].[dbo].[TableB] ON [TableA].COL_10 = [TableB].COL_11
CROSS APPLY (SELECT
CASE
WHEN COL_1 = [TableA].COL_1 AND COL_2 = [TableA].COL_2 THEN SOL
ELSE ''
END AS VAR_2,
CASE
WHEN COL_1 = [TableA].COL_3 AND COL_2 = [TableA].COL_4 THEN SOL
ELSE ''
END AS VAR_3,
CASE
WHEN COL_1 = [TableA].COL_5 AND COL_2 = [TableA].COL_6 THEN SOL
ELSE ''
END AS VAR_4
FROM [LinkedServer].[dbo].[TableSol]
WHERE COL_A = [TableA].COL_A
) AS SOL (VAR_2, VAR_3, VAR_4)
But it took much longer (more than 10 minutes; I stopped it because it did not finish). Did I miss something?
2 Answers
The query can be rewritten with a crosstab, in order to avoid hitting TableA multiple times:
DECLARE @TableA TABLE (col1 int, col2 int);
DECLARE @TableB TABLE (col3 int, col4 int);
INSERT INTO @TableA
VALUES
(1,1),
(2,2),
(3,3),
(4,4);
INSERT INTO @TableB
VALUES
(1,1),
(2,2),
(3,3),
(4,4);
DECLARE @Var1 int, @Var2 int, @Var3 int;
SELECT @Var1 = CA.var1,
@Var2 = CA.var2,
@Var3 = CA.var3
FROM @TableB AS TB
CROSS APPLY (
SELECT
SUM(CASE WHEN col2 = 1 THEN col1 END) AS var1,
-- ^^ This is <Condition1>
SUM(CASE WHEN col2 = 2 THEN col1 END) AS var2,
-- ^^ This is <Condition2>
SUM(CASE WHEN col2 = 3 THEN col1 END) AS var3
-- ^^ This is <Condition3>
FROM @TableA AS TA
) AS CA(var1, var2, var3)
WHERE TB.col3 > 2 -- <Condition4>
SELECT @Var1, @Var2, @Var3;
This should make it faster. If it doesn't, please post the execution plan, so that we can investigate it further.
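For the updated query in the question, the three correlated lookups can also be written as three joins to TableSol, which is a different technique from the crosstab above. The sketch below is only an illustration: it uses local table variables as stand-ins for the remote tables, the column types and sample rows are assumptions, the join to TableB is left out for brevity, and it assumes each lookup matches at most one row (as the scalar subqueries in the question already require).

```sql
-- Hypothetical local stand-ins for the remote tables; column types are assumed.
DECLARE @TableA TABLE (VAR_1 int, COL_1 int, COL_2 int, COL_3 int,
                       COL_4 int, COL_5 int, COL_6 int, COL_A int);
DECLARE @TableSol TABLE (COL_1 int, COL_2 int, COL_A int, SOL varchar(10));

-- Made-up sample rows, for illustration only.
INSERT INTO @TableA
VALUES (100, 1, 1, 2, 2, 3, 3, 7);

INSERT INTO @TableSol
VALUES (1, 1, 7, 'one'),
       (2, 2, 7, 'two'),
       (3, 3, 7, 'three'),
       (1, 1, 8, 'other');

-- One LEFT JOIN per lookup; LEFT JOIN keeps the TableA row (with NULL)
-- when a lookup finds no match, like the scalar subqueries do.
SELECT TA.VAR_1,
       S2.SOL AS VAR_2,
       S3.SOL AS VAR_3,
       S4.SOL AS VAR_4
FROM @TableA AS TA
LEFT JOIN @TableSol AS S2
       ON S2.COL_1 = TA.COL_1 AND S2.COL_2 = TA.COL_2 AND S2.COL_A = TA.COL_A
LEFT JOIN @TableSol AS S3
       ON S3.COL_1 = TA.COL_3 AND S3.COL_2 = TA.COL_4 AND S3.COL_A = TA.COL_A
LEFT JOIN @TableSol AS S4
       ON S4.COL_1 = TA.COL_5 AND S4.COL_2 = TA.COL_6 AND S4.COL_A = TA.COL_A;
```

Whether this performs better against a linked server depends on how much of the join is sent to the remote side, so it is worth timing rather than assuming.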
- Hi, I changed my query, but it takes even longer. I cannot post the execution plan as I do not have enough privileges on the linked server. – rcs, Sep 20, 2016 at 7:57
You can execute the query at the linked server as a pass-through query using the 4-part name notation and sp_executesql:
DECLARE @var1 int, @var2 int, @var3 int, @var4 int;
EXEC [linkedserver].[databasename].sys.sp_executesql
N'
SELECT @var1 = VAR_1,
@var2 = (
SELECT SOL
FROM [dbo].[TableSol]
WHERE COL_1 = [TableA].COL_1
AND COL_2 = [TableA].COL_2
AND COL_A = [TableA].COL_A),
@var3 = (
SELECT SOL
FROM [dbo].[TableSol]
WHERE COL_1 = [TableA].COL_3
AND COL_2 = [TableA].COL_4
AND COL_A = [TableA].COL_A),
@var4 = (
SELECT SOL
FROM [dbo].[TableSol]
WHERE COL_1 = [TableA].COL_5
AND COL_2 = [TableA].COL_6
AND COL_A = [TableA].COL_A)
FROM [dbo].[TableA]
INNER JOIN [dbo].[TableB]
ON [TableA].COL_10 = [TableB].COL_11
',
N'@var1 int OUTPUT, @var2 int OUTPUT, @var3 int OUTPUT, @var4 int OUTPUT',
@var1 OUTPUT,
@var2 OUTPUT,
@var3 OUTPUT,
@var4 OUTPUT;
SELECT @var1, @var2, @var3, @var4;
If you don't need to assign to variables, you can use this syntax:
EXEC [linkedserver].[databasename].sys.sp_executesql
N'
SELECT VAR_1,
(
SELECT SOL
FROM [dbo].[TableSol]
WHERE COL_1 = [TableA].COL_1
AND COL_2 = [TableA].COL_2
AND COL_A = [TableA].COL_A),
(
SELECT SOL
FROM [dbo].[TableSol]
WHERE COL_1 = [TableA].COL_3
AND COL_2 = [TableA].COL_4
AND COL_A = [TableA].COL_A),
(
SELECT SOL
FROM [dbo].[TableSol]
WHERE COL_1 = [TableA].COL_5
AND COL_2 = [TableA].COL_6
AND COL_A = [TableA].COL_A)
FROM [dbo].[TableA]
INNER JOIN [dbo].[TableB]
ON [TableA].COL_10 = [TableB].COL_11
'
- What if the SELECT statement returns more than one row? If I loop using a CURSOR, it will hurt performance as well. – rcs, Sep 20, 2016 at 9:48
- From your original question, it looked like you needed to return a single row and populate some variables from the results. If that is not the case any more, don't do it. – spaghettidba, Sep 20, 2016 at 9:51
- From the original question, yes, the query was returning a single row. I previously used a cursor to loop. In my updated query, I removed the cursor to improve the performance. – rcs, Sep 20, 2016 at 9:52
- Cool. Then don't assign to variables and don't pass them. I'll update the answer. – spaghettidba, Sep 20, 2016 at 9:53
- OK, I just tried this one, but it does not seem to give a noticeable improvement. I used a larger dataset: the initial query takes 1 min 47 secs, and using EXEC [linkedserver].[databasename].sys.sp_executesql takes 1 min 45 secs. – rcs, Sep 22, 2016 at 6:03
OPENQUERY does not allow variables. I need some variables to be passed through the query.
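On passing variables: unlike OPENQUERY, sp_executesql accepts parameters, so the pass-through call shown in the answer can take input values as well. A minimal sketch, assuming the linked server allows remote procedure calls (the RPC Out option); the variable @colA, its type, and its value are hypothetical and only there for illustration:

```sql
-- @colA is a hypothetical input value; adjust the name and type to the real filter.
DECLARE @colA int = 7;

EXEC [linkedserver].[databasename].sys.sp_executesql
    N'
    SELECT VAR_1
    FROM [dbo].[TableA]
    WHERE COL_A = @colA;
    ',
    N'@colA int',   -- parameter definition seen by the remote statement
    @colA = @colA;  -- local variable passed as the parameter value
```

The statement text is still compiled and run on the remote server; only the parameter value is sent with the call.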