I have the following Oracle SQL code that works, but I would like to try and find a way to clean it up so it's not so repetitive. It is looking at orders for a part ordered each FY in an Oracle Database.
SELECT
-- Attribute
R.ASN,
-- Code for FY17
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 8 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 7 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 8 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 7 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY17,
-- Code for FY18
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 7 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 6 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 7 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 6 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY18,
-- Code for FY19
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 6 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 5 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 6 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 5 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY19,
-- Code for FY20
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 5 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 4 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 5 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 4 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY20,
-- Code for FY21
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 4 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 3 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 4 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 3 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY21,
-- Code for FY22
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 3 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 2 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 3 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 2 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY22,
-- Code for FY23
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 2 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 1 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 2 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 1 || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY23,
-- Code for FY24
COUNT(CASE
WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 1 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) +
COUNT(CASE
WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE(EXTRACT(YEAR FROM SYSDATE) - 1 || '-10-01', 'YYYY-MM-DD')
AND TO_DATE(EXTRACT(YEAR FROM SYSDATE) || '-09-30', 'YYYY-MM-DD')
THEN R.ASN
END) AS FY24
-- Table name in Oracle Database
FROM CLS_REQN_SHIP_ARCH R
WHERE R.ASN = 'S100-1A3PS1'
GROUP BY R.ASN;
I have tried loops, but loops are not supported in Oracle SQL within a single SQL query. Is there anything else I could look at to clean this up?
2 Answers 2
tidy data
Column names like FY17 or FY24 are not helpful. You want just a single FY column, which has values like 2017 or 2024 in different rows.
Hadley Wickham describes the concept in some detail.
looping
Every SELECT statement loops over rows of a relation. That can be a base table, or the result of a JOIN.
Create a "fiscal_year" table, with "start" and "end" columns corresponding to the October and September dates of interest. Do a JOIN against that to obtain the desired COUNT results.
dealing with NULLs
We see the same BETWEEN expression multiple times,
once for checking if date_bb is within a given fiscal year, and
once for checking date_alloc.
It would be more convenient to check whether
COALESCE(date_bb, date_alloc)
is within the date range.
extra comments
In the current code there's little need for comments like this:
-- Code for FY17
A simple blank line, or even an empty comment, would suffice.
The (visually prominent) trailing AS FY17 is enough
to tell us about the stanza, plus it actually matters for the output.
My DRY concern is that sometimes the author or a maintenance engineer will make copy-n-paste mistakes, and might for example mislabel the subsequent FY18 stanza as "Code for FY17". If we can avoid saying the same thing in two places, that is typically the prudent path to follow.
-
\$\begingroup\$ Honest question (out of ignorance): I'm trying to envision the "JOIN" proposal. My best guess would require ~365 rows/year (maybe only ~200 "business days (M-F)") each with a unique date and all with the same translation "fiscal year" value (an INT ??) (for grouping and counting)... I'm presuming the OP's table is full of items listed in (sales?) transactions, 'n' transactions logged every day of the calendar year. If you've time, expanding on your 'JOIN' approach would be informative (for some of us amateurs..) Thanks!
:-)\$\endgroup\$user272752– user2727522024年08月17日 13:22:17 +00:00Commented Aug 17, 2024 at 13:22 -
1\$\begingroup\$ @Fe2O3, the OP didn't offer a reprex with example schema + data (which is fine), so it's not convenient for me to modify and test the OP code. We see eight result columns, based on counts of
date_bbordate_allocfalling within a date range. So do eight INSERTs in the newfiscal_yeartable. The first row will mention'2016年10月01日'and'2017年09月30日'. Cartesian cross product (JOIN) between the tables lets us loop over eight fiscal years, producing eight "tidy" result rows. Given a SqlFiddle I would fiddle with it. \$\endgroup\$J_H– J_H2024年08月17日 16:20:06 +00:00Commented Aug 17, 2024 at 16:20 -
\$\begingroup\$ Thanks... I think... It's likely I'm ignorant of the SQL mechanism that connects a given date to another table's row of only start/end dates of a range... It's okay, though, as I don't have to write/maintain the code.
:-)Cheers! \$\endgroup\$user272752– user2727522024年08月17日 21:51:41 +00:00Commented Aug 17, 2024 at 21:51
"Be kind to my mistakes"
--Kate Bush
It's been more than 2 decades since I dabbled in SQL. My meager hope is to help the OP understand what it is they are seeking to achieve.
Shallow simplification
...
COUNT(CASE WHEN R.DATE_BB IS NOT NULL
AND R.DATE_BB BETWEEN TO_DATE('2016-10-01', 'YYYY-MM-DD')
AND TO_DATE('2017-09-30', 'YYYY-MM-DD')
THEN R.ASN END)
+ COUNT(CASE WHEN R.DATE_BB IS NULL
AND R.DATE_ALLOC BETWEEN TO_DATE('2016-10-01', 'YYYY-MM-DD')
AND TO_DATE('2017-09-30', 'YYYY-MM-DD')
THEN R.ASN END)
AS FY17,
...
There are 8 blocks like this returning 8 columns of "counts" for each R.ASN in the table returned by the WHERE clause.
Those columns will be returned in ascending year sequence (left-to-right), making it, perhaps, more difficult to determine current ("interesting") values from historical values.
Approximately 6 weeks from posting this answer, the column titles returned by the OP's query will be "off by one" from what they should be.
- Is
current + 7 years backsignificant? Could5years (total) be as useful? - Can someone maintain this code once each year (after 01 Oct) to simply shift the oldest year to the new-newest year, adapting the obvious 4 instances where a hard-coded date appears there? (Also, change the column title to, for instance,
FY25and fix up two commas?)
Appropriateness
- One must presume that the OP's code is an MRE, as it shows a single hard-coded item identifier being used.
- One must consider the level of expertise available now and in the future that can be brought to bear on this project.
- Running the OP's code unaltered on Oct 15th will produce erroneous but seemingly plausible results. This is worse than failing as it may go undetected until it is too late to remedy!
- Running the suggested revised code unaltered on Oct 15th will give results that draw immediate attention to the necessity to trivially update the code for the new financial year.
- Most anyone able to read, think and reason, irrespective of their familiarity with SQL, could adapt the program to function correctly after Sep 30th.
Suggestions
- Ask the users if 8 years of history is necessary. Is anyone interested in that much history? Some people are easily overloaded and can become flummoxed. Sometimes, less is more.
- Re-order the columns in descending year sequence (so that the current year is on the left and the most ancient year is on the right.)
- Consider archiving ancient transaction data and purging it from the database (to improve performance.) There may be 3-5 queries in a year about what happened in the last few decades, but thousands of queries every day about what's happening in the last few days or weeks.
- Urgent: Has Y2K already been forgotten?
Change those "column labels" to 4 digit years ASAP.