1
\$\begingroup\$

DB version is: Oracle 12c. Restricted to the use of inline PL/SQL.

The end goal here is to get a count of open items by day, inclusive of created date, exclusive of the end date.

The data structure is approximately:

| id | open_date | close_date |
|----|------------|------------|
| a | 01/01/2020 | 01/04/2020 |
| b | 01/02/2020 | 01/05/2020 |

The end result would be something like:

| date | open_item |
|------------|-----------|
| 01/01/2020 | 1 |
| 01/02/2020 | 2 |
| 01/03/2020 | 2 |
| 01/04/2020 | 1 |
| 01/05/2020 | 0 |

the start date in all this is user input, but that's not terribly relevant.

What I had done is generate a table of dates using that user input date in a cte like so:

with caledar_dates as (
select user_input_date + rownum-1 dates
from dual connect by rownum < sysdate-user_input_date+1
)

Then joined the items table to the calendar_dates table in another CTE like so:

item_list as (
 select
 a.dates,
 b.id
 from calendar_dates a
 left join ticket_table b on a.dates>=b.created_date-1 and a.dates<b.closed_date
)

Finally, I then simply

select count(b.id), dates from item_list group by dates 

There may be a few syntax errors above since I am truncating the actual code. This query works; however, it seems like an inefficient, wordy, ugly approach to the problem and if the user inputs a date a few years back, the query takes a few minutes to run.

Looking for an alternative and hopefully more efficient approach.

Not sure if this went on SO or here so happy to post over there if that seems the better venue.

Mahalo's in advance.

asked Feb 4, 2020 at 1:48
\$\endgroup\$
3
  • \$\begingroup\$ Welcome to Code Review! "I am truncating the actual code" is usually not something that is liked very much here on this site. Unfortunately my knowledge on your specific topic is not deep enough to be able to judge if crucial information is missing because of the abbreviated code. \$\endgroup\$ Commented Feb 4, 2020 at 8:34
  • \$\begingroup\$ Thanks AlexV. The code that is removed is just to remove information specific to the database I am working, ie other columns I am pulling + a few column renames. The time difference between running the truncated code vs. the actual code is not measurable on my side. \$\endgroup\$ Commented Feb 4, 2020 at 15:36
  • \$\begingroup\$ One thing to note, I would recommend using TRUNC(b.created_date) instead of b.created_date-1. I know it seems trivial, but I ran into an issue with this a few years back where something happened exactly at midnight (DATE datatype so down to the second) and it was a huge pain to debug what was going wrong when I pulled too much info back. \$\endgroup\$ Commented Apr 2, 2020 at 15:22

1 Answer 1

2
\$\begingroup\$

In looking at your query, I think there may be a few things you can do to improve your performance. The first is that you can aggregate your ticket_table ahead of time based on the request interval. Second, unless it is needed for the logic that you redacted, I would avoid using the CTE once you start doing the real logic. You may be depriving the optomizer of information it could use. You might try something like the following query:

WITH query_interval AS
(
 SELECT TO_DATE('09/01/2012', 'MM/DD/YYYY') AS START_DATE, TRUNC(SYSDATE) AS END_DATE
 FROM DUAL
),
interval_dates AS
(
 SELECT qi.start_date + ROWNUM - 1 AS I_DATE
 FROM query_interval qi
 CONNECT BY ROWNUM < qi.end_date - qi.start_date
)
SELECT i.i_date, SUM(sub.cycle_count)
FROM interval_dates i
LEFT JOIN (SELECT GREATEST(TRUNC(tt.start_date), qi.start_date) AS OPEN_DATE,
 LEAST(TRUNC(tt.end_date)-1, qi.end_date) AS CLOSE_DATE,
 COUNT(*) AS CYCLE_COUNT
 FROM ticket_table tt
 CROSS JOIN query_interval qi
 WHERE GREATEST(TRUNC(tt.start_date), qi.start_date) <= LEAST(TRUNC(tt.end_date)-1, qi.end_date)
 GROUP BY GREATEST(TRUNC(tt.start_date), qi.start_date), LEAST(TRUNC(tt.end_date)-1, qi.end_date)) sub ON I_DATE BETWEEN sub.open_date AND sub.close_date
GROUP BY i.i_date
ORDER BY i.i_date;

Most of this is similar to what you already had, but there are some differences. First you notice that I added a CTE with just the interval. This allows me to perform the aggregation that I mentioned earlier. The aggregation works like this:

  1. First, we limit our search to tickets which were open at some point during our query interval. That I'm using the GREATEST <= LEAST to determine an overlap between the ticket and the query interval.
  2. We clip tickets that hang over the ends of our search interval. You really don't care that they are open before your search, just that they are open during your search.
  3. We aggregate all cycles which now have the same start and end dates.

After that, the logic is pretty similar to what you already had. A left join between the full list of dates and our modified table. Then instead of using COUNT, we use SUM.

answered Apr 2, 2020 at 16:15
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.