I have a select statement which is in fact a subquery within a larger select statement that is built up programmatically. The problem is that if I elect to include this subquery, it acts as a bottleneck and the whole query becomes painfully slow.
An example of the data is as follows:
Payment
.Receipt_no | .Person | .Payment_date | .Type | .Reversed
2           | John    | 01/02/2001    | PA    |
1           | John    | 01/02/2001    | GX    |
3           | David   | 15/04/2003    | PA    |
6           | Mike    | 26/07/2002    | PA    | R
5           | John    | 01/01/2001    | PA    |
4           | Mike    | 13/05/2000    | GX    |
8           | Mike    | 27/11/2004    | PA    |
7           | David   | 05/12/2003    | PA    | R
9           | David   | 15/04/2003    | PA    |
The subquery is as follows:
    select Payment.Person,
           Payment.amount
    from Payment
    inner join (select min([min_Receipt].Person) 'Person',
                       min([min_Receipt].Receipt_no) 'Receipt_no'
                from Payment [min_Receipt]
                inner join (select min(Person) 'Person',
                                   min(Payment_date) 'Payment_date'
                            from Payment
                            where Payment.reversed != 'R' and Payment.Type != 'GX'
                            group by Payment.Person) [min_date]
                    on [min_date].Person = [min_Receipt].Person
                   and [min_date].Payment_date = [min_Receipt].Payment_date
                where [min_Receipt].reversed != 'R' and [min_Receipt].Type != 'GX'
                group by [min_Receipt].Person) [1stPayment]
        on [1stPayment].Receipt_no = Payment.Receipt_no
This retrieves the first payment of each person by .Payment_date (ascending), then .Receipt_no (ascending), where .Type is not 'GX' and .Reversed is not 'R', as follows:
Payment
.Receipt_no | .Person | .Payment_date
5           | John    | 01/01/2001
3           | David   | 15/04/2003
8           | Mike    | 27/11/2004
(Struck out) I am unable to move the subquery out to a temporary table as temporary tables are simply not supported within the programming language used by my application. (End of struck-out text)
Edit: The statement above is incorrect. Temporary tables are supported, and therefore this is a valid option.
Following a post on Stack Overflow, the query was rewritten as follows.
Query 1.
    -- Columns are qualified with the alias "a"; once Payment is aliased,
    -- the prefix "Payment." no longer resolves.
    select min(a.Person) 'Person',
           min(a.receipt_no) 'receipt_no'
    from
        Payment a
    where
        a.type <> 'GX' and (a.reversed not in ('R') or a.reversed is null)
        and a.payment_date =
            (select min(i.payment_date) from Payment i
             where i.Person = a.Person and i.type <> 'GX'
               and (i.reversed not in ('R') or i.reversed is null))
    group by a.Person
I added this as a subquery within my much larger query; however, it still ran very slowly. So I tried rewriting the query while avoiding aggregate functions, and came up with the following.
Query 2.
    SELECT
        receipt_no,
        person,
        payment_date,
        amount
    FROM
        payment a
    WHERE
        receipt_no IN
            (SELECT
                 TOP 1 i.receipt_no
             FROM
                 payment i
             WHERE
                 (i.reversed NOT IN ('R') OR i.reversed IS NULL)
                 AND i.type <> 'GX'
                 AND i.person = a.person
             -- ASC on both keys, so that TOP 1 is the earliest payment
             -- (DESC on payment_date would return the latest one instead)
             ORDER BY i.payment_date ASC, i.receipt_no ASC)
I wouldn't necessarily think of this as being more efficient. In fact, if I run the two queries side by side on my larger data set, Query 1 completes in a matter of milliseconds whereas Query 2 takes several seconds.
However, if I then add them as subqueries within a much larger query, the larger query takes hours to complete using Query 1 and 40 seconds using Query 2.
I can only attribute this to the use of aggregate functions in one and not the other.
1 Answer
I see that in your question you said:
"I am unable to move the subquery out to a temporary table as temporary tables are simply not supported within the programming language used by my application."
But, have you considered calling a stored procedure instead? Is this even an option, considering the limitations with the programming language?
If this is a viable option, you could simply have the results of your subquery inserted into a temp table transparently & encapsulate all the logic in the stored procedure.
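As a rough sketch of that idea, the procedure below stages the "first payment per person" logic in temp tables and then returns the matching rows. The procedure name `dbo.usp_GetFirstPayments` and the temp table names are hypothetical. The table and column names are taken from the question, and the sketch assumes that `Receipt_no` is unique and that an unreversed payment may have either an empty or a NULL `Reversed` value. It avoids anything newer than SQL Server 2000 syntax.

```sql
-- Hypothetical procedure name; Payment and its columns come from the question.
CREATE PROCEDURE dbo.usp_GetFirstPayments
AS
BEGIN
    SET NOCOUNT ON

    -- Step 1: earliest qualifying payment date per person.
    SELECT  p.Person,
            MIN(p.Payment_date) AS Payment_date
    INTO    #min_date
    FROM    Payment p
    WHERE   p.Type <> 'GX'
      AND   (p.Reversed <> 'R' OR p.Reversed IS NULL)
    GROUP BY p.Person

    -- Step 2: lowest receipt number on that date, per person.
    SELECT  p.Person,
            MIN(p.Receipt_no) AS Receipt_no
    INTO    #first_payment
    FROM    Payment p
    INNER JOIN #min_date d
            ON  d.Person = p.Person
            AND d.Payment_date = p.Payment_date
    WHERE   p.Type <> 'GX'
      AND   (p.Reversed <> 'R' OR p.Reversed IS NULL)
    GROUP BY p.Person

    -- Step 3: return the first payments. The larger query would join
    -- to #first_payment here instead of embedding the subquery.
    SELECT  p.Receipt_no,
            p.Person,
            p.Payment_date,
            p.amount
    FROM    Payment p
    INNER JOIN #first_payment f
            ON f.Receipt_no = p.Receipt_no

    DROP TABLE #min_date
    DROP TABLE #first_payment
END
```

The application would then only need to issue `EXEC dbo.usp_GetFirstPayments`. Because the temp tables are created with `SELECT ... INTO`, their character columns keep the collation of the source columns, which sidesteps collation conflicts on the joins.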
Edit
I got to thinking about this some more, and perhaps the columns that you're using in your JOIN condition are of different collations. While this will usually result in a specific error message, there may be some implicit collation conversion occurring instead (see: MSDN: Collation Precedence (Transact-SQL)) between the sub-query & the data being joined.
Here are a few links about collation that might be useful to you:
Difference between collation SQL_Latin1_General_CP1_CI_AS and Latin1_General_CI_AS
SQL SERVER – Find Collation of Database and Table Column Using T-SQL
SQL SERVER – Change Collation of Database Column – T-SQL Script
Also, you may be able to trick your programming language into using a temp table with syntax like this:
    SELECT *
    FROM tempdb..#MyTempTable
Just keep in mind that sometimes the temp database has a different collation than the data you're working with too, in which case you'll need to explicitly convert the data to/from each collation.
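To make the collation point concrete, here is a minimal sketch of how one might inspect the column collations and force both sides of a join to agree. The temp table definition, including the `varchar(50)` type for `Person`, is an assumption for illustration only. `Payment` and its columns come from the question.

```sql
-- List the collation of each character column on the Payment table.
SELECT TABLE_NAME, COLUMN_NAME, COLLATION_NAME
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  TABLE_NAME = 'Payment'
  AND  COLLATION_NAME IS NOT NULL

-- Hypothetical temp table created with explicit DDL: its character
-- columns take tempdb's default collation, which may differ from Payment's.
-- The varchar(50) type is an assumption, not taken from the question.
CREATE TABLE #MyTempTable
(
    Person     varchar(50),
    Receipt_no int
)

-- Convert both sides to the current database's default collation
-- so that the comparison is made under a single collation.
SELECT p.Receipt_no,
       p.Person
FROM   Payment p
INNER JOIN #MyTempTable t
        ON t.Person COLLATE database_default = p.Person COLLATE database_default

DROP TABLE #MyTempTable
```

Applying `COLLATE` to a column in a join predicate can stop the optimizer from using an index on that column, so when the collations already match it is probably better to leave the clause out.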
Completely agree with Alexander... also try adding a few indexes to speed things up a bit more. – Paritosh, Nov 9, 2012 at 2:17
2 people completely agree, but this answer has neither up-votes nor edits to improve whatever could be better explained? I'm baffled. – ANeves, Nov 12, 2012 at 8:30
The statement I made with regard to my programming language not supporting temporary tables is incorrect, and I have marked it as such. I'm not sure where I drew this conclusion from. Therefore, using a temporary table is a valid option. From what I can tell, there are no differences in collation between the columns included in the joins, though the information you posted on collation within SQL made for interesting reading, most of which I was unaware of. – DMK, Nov 12, 2012 at 11:51
... RANK() or equivalent?
RANK() is not available in SQL Server 2k. :(