\$\begingroup\$

I'm working with this query to get counts from three different tables, and to group the results by account ID:

SELECT accountId, SUM(ApplicantsCount) as ApplicantsCount, SUM(ApprovedCount) as ApprovedCount, SUM(ScreenedCount) as ScreenedCount
FROM (
 SELECT application.accountId, COUNT(*) ApplicantsCount, 0 ApprovedCount, 0 ScreenedCount
 FROM application
 [accountIdCondition]
 GROUP BY application.accountId
 UNION
 SELECT application.accountId, 0, COUNT(*), 0
 FROM application
 JOIN termsofapproval ON application.accountId = termsofapproval.accountId
 JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
[accountIdCondition] 
 GROUP BY application.accountId
 UNION
 SELECT application.accountId, 0, 0, COUNT(*)
 FROM application
 JOIN screened 
 ON application.id = screened.applicationId
 [accountIdCondition]
 GROUP BY application.accountId
 ) CountsTable
GROUP BY CountsTable.accountId

However, it runs too slowly and times out the task handler I'm calling it from. Is there a way I can write this to run faster?

Jamal
asked Oct 23, 2014 at 20:35
\$\endgroup\$

1 Answer

\$\begingroup\$

I took the liberty of throwing this into an SQLFiddle.

If I play with the query and run the screened and approved subqueries against the data I chose, I see a condition which may or may not be a bug. Consider my data, where there are multiple applications for the same accountId, and your subquery:

SELECT application.accountId, 0, COUNT(*), 0
FROM application
JOIN termsofapproval ON application.accountId = termsofapproval.accountId
JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
[accountIdCondition] 
GROUP BY application.accountId

That query will duplicate the count of approved users if there are multiple applications for the same accountId, because every application row for the account is joined to every termsofapproval row for that account. In the SQLFiddle I have, it returns a count of 4 for only 2 distinct approvaluserjoin records.
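To make the duplication concrete, here is a minimal sketch that reproduces it with Python's built-in sqlite3 module. The table definitions and rows are assumptions made up for illustration (they are not the data from the SQLFiddle), and SQLite stands in for whatever database you actually run:

```python
import sqlite3

# In-memory database; the schema and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE termsofapproval (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE approvaluserjoin (id INTEGER PRIMARY KEY, termsofapprovalId INTEGER);

    -- Account 1: two applications, one terms record, two approved users.
    INSERT INTO application VALUES (1, 1), (2, 1);
    INSERT INTO termsofapproval VALUES (10, 1);
    INSERT INTO approvaluserjoin VALUES (100, 10), (101, 10);
""")

# The approved-count subquery from the question (without the account filter).
inflated = conn.execute("""
    SELECT application.accountId, COUNT(*)
    FROM application
    JOIN termsofapproval ON application.accountId = termsofapproval.accountId
    JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
    GROUP BY application.accountId
""").fetchall()

# Same joins, but counting each approvaluserjoin row only once.
distinct = conn.execute("""
    SELECT application.accountId, COUNT(DISTINCT approvaluserjoin.id)
    FROM application
    JOIN termsofapproval ON application.accountId = termsofapproval.accountId
    JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
    GROUP BY application.accountId
""").fetchall()

print(inflated)  # [(1, 4)]  two applications x two approved users
print(distinct)  # [(1, 2)]
```

Each of the two application rows pairs with both approvaluserjoin rows, which is where the 4 comes from.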

It is likely that in your data it is not possible to get that condition though... right?

Regardless, I believe the more logical representation of your query is as follows (which I also have in an SQLFiddle):

SELECT application.accountId ,
 count(distinct application.id) as ApplicantsCount,
 count(distinct approvaluserjoin.id) as ApprovedCount,
 count(distinct screened.id) as ScreenedCount
FROM application
left join termsofapproval on application.accountId = termsofapproval.accountId
left join approvaluserjoin on approvaluserjoin.termsofapprovalId = termsofapproval.id
left join screened on screened.applicationId = application.id
where application.accountId <= 2
group by application.accountId

Note how there is only one query, with no subqueries or UNION, built from left outer joins. Also note that COUNT ignores NULL values, so the NULLs produced by the outer joins do not contribute to the counts, and an account with no matching rows gets 0. The count(distinct ...) construct lets you count the things you are interested in, even if the join returns the same row several times.

You will need to understand the query carefully: its implications are different from yours, and it may be more accurate than what you have (or less accurate).
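As a rough check of that behaviour, the sketch below runs the single-query form against a small in-memory SQLite database using Python's sqlite3 module. The schema and rows are assumptions invented for the example, the sample `where` filter is left out, and an `ORDER BY` is added only to make the output order predictable:

```python
import sqlite3

# In-memory database; the schema and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE termsofapproval (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE approvaluserjoin (id INTEGER PRIMARY KEY, termsofapprovalId INTEGER);
    CREATE TABLE screened (id INTEGER PRIMARY KEY, applicationId INTEGER);

    -- Account 1: two applications, two approved users, one screened application.
    -- Account 2: one application and nothing else.
    INSERT INTO application VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO termsofapproval VALUES (10, 1);
    INSERT INTO approvaluserjoin VALUES (100, 10), (101, 10);
    INSERT INTO screened VALUES (1000, 1);
""")

rows = conn.execute("""
    SELECT application.accountId,
           COUNT(DISTINCT application.id)      AS ApplicantsCount,
           COUNT(DISTINCT approvaluserjoin.id) AS ApprovedCount,
           COUNT(DISTINCT screened.id)         AS ScreenedCount
    FROM application
    LEFT JOIN termsofapproval ON application.accountId = termsofapproval.accountId
    LEFT JOIN approvaluserjoin ON approvaluserjoin.termsofapprovalId = termsofapproval.id
    LEFT JOIN screened ON screened.applicationId = application.id
    GROUP BY application.accountId
    ORDER BY application.accountId
""").fetchall()

print(rows)  # [(1, 2, 2, 1), (2, 1, 0, 0)]
```

Account 2 has no approval or screening rows, so the outer joins fill those columns with NULL and the counts come out as 0 rather than the account disappearing from the result.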

answered Oct 23, 2014 at 22:52
\$\endgroup\$
  • That's a good catch on the SQL error, thanks. That shouldn't happen often, but I think it would be possible. All in all, that's a great improvement. Commented Oct 24, 2014 at 14:46
