4

I have a road_insp table:

create table road_insp
 ( 
 insp_id int,
 road_id int,
 insp_date date, 
 condition number(38,2)
 ) ;
insert into road_insp (insp_id, road_id, insp_date, condition) values (1,100,to_date('01-APR-04','DD-MON-YY'),.9);
insert into road_insp (insp_id, road_id, insp_date, condition) values (2,100,to_date('01-APR-11','DD-MON-YY'),.7);
insert into road_insp (insp_id, road_id, insp_date, condition) values (3,100,to_date('01-MAR-12','DD-MON-YY'),.7);
insert into road_insp (insp_id, road_id, insp_date, condition) values (4,100,to_date('01-MAR-17','DD-MON-YY'),.6);
insert into road_insp (insp_id, road_id, insp_date, condition) values (5,100,to_date('01-MAR-18','DD-MON-YY'),.6);
insert into road_insp (insp_id, road_id, insp_date, condition) values (6,200,to_date('01-JUN-10','DD-MON-YY'),.4);
insert into road_insp (insp_id, road_id, insp_date, condition) values (7,200,to_date('01-JUN-12','DD-MON-YY'),.3);
commit;
select
 insp_id,
 road_id,
 extract(year from insp_date) as insp_year,
 condition as condition
from
 road_insp
order by
 road_id,
 insp_date
;
 INSP_ID ROAD_ID INSP_YEAR CONDITION
---------- ---------- ---------- ----------
 1 100 2004 .9 
 2 100 2011 <-gap .7 
 3 100 2012 .7
 4 100 2017 <-gap .6 
 5 100 2018 .6
 6 200 2010 .4
 7 200 2012 <-gap .3 

I can visualize road condition over time like this (road #100):

enter image description here


However, as noted in red, there are time gaps between the inspections.

Inspections for road #100 are missing for these years:

  • 2005
  • 2006
  • 2007
  • 2008
  • 2009
  • 2010
  • 2013
  • 2014
  • 2015
  • 2016

The gaps exist because not all of the roads in the City are inspected every year. While I wouldn't consider the gaps to be errors in the data, I am concerned that when I visualize the data, the gaps paint a misleading picture. I say this, because, the time intervals between inspections are certainly not constant, yet when I look at the graph, I'm lead to believe that the time intervals are constant (which is not true).

To compensate for this quirk, I would like to fill in the gaps between inspections using a select query in a view. Using such a view, the data could be visualized like this:

enter image description here

+---------+---------+------------+-----------+
| INSP_ID | ROAD_ID | INSP_DATE | CONDITION |
+---------+---------+------------+-----------+
| 1 | 100 | 2004年04月01日 | .9 |
| | 100 | 2005年01月01日 | .9 |
| | 100 | 2006年01月01日 | .9 |
| | 100 | 2007年01月01日 | .9 |
| | 100 | 2008年01月01日 | .9 |
| | 100 | 2009年01月01日 | .9 |
| | 100 | 2010年01月01日 | .9 |
| 2 | 100 | 2011年04月01日 | .7 |
| 3 | 100 | 2012年03月01日 | .7 |
| | 100 | 2013年01月01日 | .7 |
| | 100 | 2014年01月01日 | .7 |
| | 100 | 2015年01月01日 | .7 |
| | 100 | 2016年01月01日 | .7 |
| 4 | 100 | 2017年03月01日 | .6 |
| 5 | 100 | 2018年03月01日 | .6 |
+---------+---------+------------+-----------+
| 6 | 200 | 2010年06月01日 | .4 |
| | 200 | 2011年01月01日 | .4 |
| 7 | 200 | 2012年06月01日 | .35 |
+---------+---------+------------+-----------+

As you can see, the time intervals are now constant and more intuitive to interpret.

The filler records would be populated like this:

  • insp_id would be null
  • road_id would be the id of the last completed inspection
  • insp_date would be January 1st of the applicable year
  • condition would be the condition of the last completed inspection
asked Nov 11, 2017 at 5:22
0

1 Answer 1

3
with
 date_range as (select min(insp_date) mind, max(insp_date) maxd from road_insp),
 insp_dates as (select add_months(trunc(mind, 'YYYY'), (level - 1)*12) as insp_date 
 from date_range connect by level <= extract(year from maxd) - extract(year from mind))
select
 ri.insp_id, ri.road_id, 
 nvl(case when extract(year from ri.insp_date) = extract(year from id.insp_date) 
 then ri.insp_date else id.insp_date end, ri.insp_date) as insp_date,
 ri.condition
from
 insp_dates id right join
 (select insp_id, road_id, insp_date, condition as condition,
 lead(insp_date) over (partition by road_id order by insp_date) as next_insp_date
 from road_insp) ri 
 on id.insp_date >= trunc(ri.insp_date, 'YYYY') 
 and id.insp_date < trunc(ri.next_insp_date, 'YYYY')
order by insp_id, road_id, insp_date
;

Generate dates between min and max insp_date.

Find the next insp_date for each inspection to have them in 1 row.

Join the generated dates to the original dataset on date ranges based on year, excluding the upper bound, because that is the next inspection with existing data. Outer join, so the last inspection for a road (where next inspection is null) is included as well.

If there is generated insp_date is in the same year as the original data, use the original data, otherwise use the generated insp_date.

 INSP_ID ROAD_ID INSP_DATE CONDITION
---------- ---------- ---------- ----------
 1 100 2004年04月01日 .9
 1 100 2005年01月01日 .9
 1 100 2006年01月01日 .9
 1 100 2007年01月01日 .9
 1 100 2008年01月01日 .9
 1 100 2009年01月01日 .9
 1 100 2010年01月01日 .9
 2 100 2011年04月01日 .7
 3 100 2012年03月01日 .7
 3 100 2013年01月01日 .7
 3 100 2014年01月01日 .7
 3 100 2015年01月01日 .7
 3 100 2016年01月01日 .7
 4 100 2017年03月01日 .6
 5 100 2018年03月01日 .6
 6 200 2010年06月01日 .4
 6 200 2011年01月01日 .4
 7 200 2012年06月01日 .3
answered Nov 11, 2017 at 9:59
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.