In following query I'm use both Common Table Expression and Outer queries. Apparently both looks same to me.
If I summarize my requirement, I have employees in SalaryTrans
table and I want to find out amounts to be recovered from their salaries monthly. Table Rcovery
contains all recovery data. Table Instalment
contains all previously paid amounts so I want to sum all paid amounts and then it is deducted from total amount to calculate balance
amount.
So what I'm doing here is in CTE
, I'm getting two numeric fields called recovered_amount
and total_amount
. In second query I'm checking whether recovered_amount
is not null and if it's not null I'm calculating the balance
.
In outer query I have a criteria to filter out less than 0 balances from the result.
WITH CTE (employee_id, RID, instalment, reference, recovered_amount, total_amount) AS
(SELECT SalaryTrans.employee_id, Recovery.RID, Recovery.amount / Recovery.duration AS instalment,
SalaryTrans.reference, SUM(Instalment.recovered_amount) AS recovered_amount, Recovery.amount AS total_amount
FROM Recovery INNER JOIN
SalaryTrans ON Recovery.emp_id = SalaryTrans.employee_id LEFT OUTER JOIN
Instalment ON Recovery.RID = Instalment.recovery_id
GROUP BY SalaryTrans.employee_id, SalaryTrans.reference, Recovery.RID, Recovery.amount / Recovery.duration, Recovery.amount)
SELECT RID, Balance, reference FROM
(SELECT RID,
"Balance" = CASE WHEN recovered_amount !=NULL
THEN total_amount-recovered_amount
ELSE total_amount END ,
reference FROM CTE) AS Result
WHERE Balance >0
This query gives intended output. But I want to know Can I do whole thing using CTE or simply should I use outer queries in this case. Also any idea to make it look better?
Could you please review this and give your feedback?
2 Answers 2
Common table expressions
In general, common table expressions are an excellent way to build complex queries. For databases that support CTEs, I fully recommend using them in preference to correlated subqueries, since chaining is much more readable than nesting. I would suggest that you rewrite the nested Result
subquery as a CTE as well, if there weren't a completely superior approach (see Rewrite below).
If you're going to use CTEs, name them properly. CTE
is just about the least informative name possible for a CTE.
Schema
You have some inconsistencies in the schema. For example, Recovery
has an emp_id
column, but SalaryTrans
has an employee_id
column. You should also name the columns with consistent capitalization.
Nitpicks
Looking inside your CTE, I see a few minor problems. I won't go into too much detail, since the whole query should be rewritten anyway.
I'm going to make a guess that the Recovery
table has RID
as its primary key, and that the SalaryTrans
table has reference
as its primary key. (It would be nice to include such information about the schema when asking questions — it makes reviews easier to write.) Since Recovery.amount / Recovery.duration AS instalment
is never used in the main query, there is no point to computing it in the CTE. Also, it would be redundant to GROUP BY Recovery.amount / Recovery.duration
, since there would never be two different values of Recovery.amount / Recovery.duration
for any given RID
. (The only effect it has would be to trigger a division-by-zero error if duration
is zero, and I doubt that that was intentional.)
Your CTE should mention the columns, tables, and GROUP BY
columns in a logical and consistent order.
Part of the reason you have so many GROUP BY
columns is that you are joining SalaryTrans
prematurely.
Rewrite
The LEFT OUTER JOIN
is problematic. Assuming that the Instalment
table is sanely designed such that the recovered_amount
column is non-nullable, the LEFT OUTER JOIN
introduces a needless null-handling complication. Rather than joining the Recovery
and Instalment
tables side-by-side, you want to combine them vertically.
WITH RecoveryEntries AS (
SELECT RID, emp_id, amount
FROM Recovery
UNION ALL
SELECT Recovery.RID, Recovery.emp_id, -Instalment.recovered_amount
FROM Recovery
INNER JOIN Instalment
ON Recovery.RID = Instalment.recovery_id
), NetRecovery AS (
SELECT RID, emp_id, SUM(amount) AS Balance
FROM RecoveryEntries
GROUP BY RID, emp_id
)
SELECT NetRecovery.RID, NetRecovery.Balance, SalaryTrans.reference
FROM NetRecovery
INNER JOIN SalaryTrans
ON NetRecovery.emp_id = SalaryTrans.employee_id
WHERE NetRecovery.Balance > 0
-
\$\begingroup\$ Nice review, Thank you very much!. One thing to clarify... I need
Recovery.amount / Recovery.duration AS instalment
too in the final result. So what's the best way to add this? \$\endgroup\$CAD– CAD2014年07月06日 05:11:55 +00:00Commented Jul 6, 2014 at 5:11 -
\$\begingroup\$ To get
instalment
, if I rejoinRecovery
table in the finalSELECT...
, I can get the required output. Is it ok? \$\endgroup\$CAD– CAD2014年07月06日 05:54:27 +00:00Commented Jul 6, 2014 at 5:54 -
1\$\begingroup\$ We've covered all three ways: 1)
LEFT OUTER JOIN
, as in your original question; 2) Treatinstalment
just likeRecovery.emp_id
; 3) RejoinRecovery
. Each has advantages and disadvantages. \$\endgroup\$200_success– 200_success2014年07月06日 06:19:54 +00:00Commented Jul 6, 2014 at 6:19
I find the query rather difficult to read due to your indentation style. For your benefit and for the benefit of other reviewers, I've reformatted your query, changing only the whitespace:
WITH CTE (employee_id, RID, instalment, reference, recovered_amount, total_amount) AS (
SELECT SalaryTrans.employee_id
, Recovery.RID
, Recovery.amount / Recovery.duration AS instalment
, SalaryTrans.reference
, SUM(Instalment.recovered_amount) AS recovered_amount
, Recovery.amount AS total_amount
FROM Recovery
INNER JOIN SalaryTrans
ON Recovery.emp_id = SalaryTrans.employee_id
LEFT OUTER JOIN Instalment
ON Recovery.RID = Instalment.recovery_id
GROUP BY SalaryTrans.employee_id
, SalaryTrans.reference
, Recovery.RID
, Recovery.amount / Recovery.duration
, Recovery.amount
)
SELECT RID, Balance, reference
FROM (
SELECT RID
, "Balance" = CASE
WHEN recovered_amount != NULL THEN total_amount - recovered_amount
ELSE total_amount
END
, reference
FROM CTE
) AS Result
WHERE Balance > 0
In particular, note:
- For anything other than a trivial
SELECT
, use a separate line for each column being selected. (My use of leading commas may be controversial. I find that it helps prevent silly mistakes when editing the query to add and remove columns.) - The indentation matches the clauses of each expression. For example,
- The big-picture structure is
WITH
(CTEs)SELECT
(something).- Within the CTE,
SELECT
takes aFROM
clause and aGROUP BY
clause.- The
FROM
clause consists of twoJOIN
s.
- The
- The main
SELECT
has aFROM
clause and aWHERE
clause.- Within the
CASE
, make it more obvious that theTHEN
is associated withWHEN
.
- Within the
- Within the CTE,
- The big-picture structure is
- Use a space on each side of binary operators such as
!=
,-
, and>
.