Grouping counts from tables by account ID

Question 1

I'm working with this query to get counts from three different tables, and to group the results by account ID:

SELECT accountId , SUM(ApplicantsCount) as ApplicantsCount, SUM(ApprovedCount) as ApprovedCount, SUM(ScreenedCount) as ScreenedCount
FROM (
 SELECT application.accountId, COUNT(*) ApplicantsCount, 0 ApprovedCount, 0 ScreenedCount
 FROM application
 [accountIdCondition]
 GROUP BY application.accountId
 UNION
 SELECT application.accountId, 0, COUNT(*), 0
 FROM application
 JOIN termsofapproval ON application.accountId = termsofapproval.accountId
 JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
 [accountIdCondition]
 GROUP BY application.accountId
 UNION
 SELECT application.accountId, 0, 0, COUNT(*)
 FROM application
 JOIN screened 
 ON application.id = screened.applicationId
 [accountIdCondition]
 GROUP BY application.accountId
 ) CountsTable
GROUP BY CountsTable.accountId

However, it runs too slowly, and times out the task handler I'm calling it in. Is there a way I can write this to run faster?

Answer

I took the liberty of putting this into an SQLFiddle.

If I play with the query and run the screened and approved subqueries against the data I chose, I see a condition that may or may not be a bug. Consider my data, where there are multiple applications for the same accountId, and then your subquery:

SELECT application.accountId, 0, COUNT(*), 0
FROM application
JOIN termsofapproval ON application.accountId = termsofapproval.accountId
JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
[accountIdCondition] 
GROUP BY application.accountId

That query will duplicate the count of approved users if there are multiple applications for the same accountId. In the SQLFiddle I have, it returns a count of 4 for only 2 distinct approvaluserjoin records.

It is likely that in your data it is not possible to get that condition though... right?
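The double counting described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database engine, and the schema and rows are hypothetical, invented only to match the table and column names in the question (they are not the SQLFiddle data). The [accountIdCondition] placeholder is simply left out.

```python
import sqlite3

# Hypothetical minimal schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE termsofapproval (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE approvaluserjoin (id INTEGER PRIMARY KEY, termsofapprovalId INTEGER);

    -- Account 1 has two applications, one termsofapproval row, two approved users.
    INSERT INTO application VALUES (1, 1), (2, 1);
    INSERT INTO termsofapproval VALUES (10, 1);
    INSERT INTO approvaluserjoin VALUES (100, 10), (101, 10);
""")

# The approved subquery from the question, without the account filter.
approved_sql = """
    SELECT application.accountId, COUNT(*)
    FROM application
    JOIN termsofapproval ON application.accountId = termsofapproval.accountId
    JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
    GROUP BY application.accountId
"""
approved_rows = conn.execute(approved_sql).fetchall()
distinct_approved = conn.execute("SELECT COUNT(*) FROM approvaluserjoin").fetchone()[0]

print(approved_rows)      # [(1, 4)]: each approved user is repeated once per application
print(distinct_approved)  # 2
```

Because the join to termsofapproval is on accountId rather than on the application's id, every application row of the account is paired with every approved user, so the count scales with the number of applications.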

Regardless, I believe the more logical representation of your query is as follows (I also have it in an SQLFiddle):

SELECT application.accountId ,
 count(distinct application.id) as ApplicantsCount,
 count(distinct approvaluserjoin.id) as ApprovedCount,
 count(distinct screened.id) as ScreenedCount
FROM application
left join termsofapproval on application.accountId = termsofapproval.accountId
left join approvaluserjoin on approvaluserjoin.termsofapprovalId = termsofapproval.id
left join screened on screened.applicationId = application.id
where application.accountId <= 2
group by application.accountId

Note how there is only one SELECT, built with left outer joins, instead of three unioned subqueries. Also note that COUNT ignores NULL values, so the NULLs produced by the outer joins do not contribute to the counts. The count(distinct ...) construct lets you count the things you are interested in, even if the query returns them multiple times in different contexts.
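As a rough check of both points (NULLs being ignored and duplicates being collapsed), here is a sketch that runs the left-join query through Python's built-in sqlite3 module. The rows are hypothetical and chosen only for illustration, and the account filter is omitted so that both accounts show up.

```python
import sqlite3

# Hypothetical schema and rows; the names follow the question, the data does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE termsofapproval (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE approvaluserjoin (id INTEGER PRIMARY KEY, termsofapprovalId INTEGER);
    CREATE TABLE screened (id INTEGER PRIMARY KEY, applicationId INTEGER);

    -- Account 1: two applications, two approved users, one screening.
    -- Account 2: one application, nothing approved or screened.
    INSERT INTO application VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO termsofapproval VALUES (10, 1);
    INSERT INTO approvaluserjoin VALUES (100, 10), (101, 10);
    INSERT INTO screened VALUES (1000, 1);
""")

counts = conn.execute("""
    SELECT application.accountId,
           COUNT(DISTINCT application.id)      AS ApplicantsCount,
           COUNT(DISTINCT approvaluserjoin.id) AS ApprovedCount,
           COUNT(DISTINCT screened.id)         AS ScreenedCount
    FROM application
    LEFT JOIN termsofapproval ON application.accountId = termsofapproval.accountId
    LEFT JOIN approvaluserjoin ON approvaluserjoin.termsofapprovalId = termsofapproval.id
    LEFT JOIN screened ON screened.applicationId = application.id
    GROUP BY application.accountId
    ORDER BY application.accountId
""").fetchall()

print(counts)  # [(1, 2, 2, 1), (2, 1, 0, 0)]
```

Account 1 produces four joined rows, yet each DISTINCT count reports the true number of applications, approved users, and screenings. Account 2 has only NULLs on the right-hand side of the joins, so its approved and screened counts come out as 0.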

You will need to understand the query carefully: its implications are different from yours, and it may be more accurate than what you have (or less accurate).

Comment

That's a good catch on the SQL error, thanks. That shouldn't happen often, but I think it would be possible. All in all, that's a great improvement.

rolfl · Accepted Answer · 2014-10-23 22:52:20Z

