I had originally posted this in SO but reasoned that it would be more appropriate here.
I have a view in an sqlite database such as this:
SQL Fiddle (http://www.sqlfiddle.com/#!5/ae95b/1)
This represents a list of boxes and items that go into each box. The items are always ordered by box_start
and the ranges box_start
and box_end
never overlap.
As shown in the first row there's a box which can store items with codes from 1 inclusive to 3 inclusive. For example the items that would go in the box A would be 'a' and 'b'.
The box-items grouping is denoted by the coloring, rows with the same color means that those are a group.
There are also items without a designated box (box label is null), such as items 'c', 'f', 'g', 'h'.
I need to write a query, if it is possible at all, to create ad-hoc boxes for items without a box and group contiguous items into the same box as follows:
As seen in the image the item 'c' and its corresponding box were labeled with '4-4' and the previously unassigned items 'f', 'g', 'h' are now grouped under the same box labeled '8-10' which corresponds to 'min(box_start
of f,g,h)-max(box_end
of f,g,h)'
I not sure how to do this in SQLite. I thought about using somekind of recursive query with CTEs but don't have a clue how to do it.
1 Answer 1
After some work I have the following query:
select
min(box_start) as box_start,
box_end,
box_label,
is_box,
item_code,
item
from
(
select
box_start,
box_end,
box_label,
is_box,
item_code,
item
from
table1
where
box_label is not null
union all
select
table1.box_start as box_start,
table1.box_end as box_end,
intervals.A || '-' || intervals.B as box_label,
table1.is_box as is_box,
table1.item_code as item_code,
table1.item as item
from
(
select
box_start,
box_end,
box_label,
is_box,
A,
B,
max(max_interval_size) as max_interval_size
from
(
select
box_start,
box_end,
box_label,
is_box,
A,
B,
max(interval_size) as max_interval_size
from
(
select
fixed_table.box_start as box_start,
fixed_table.box_end as box_end,
fixed_table.box_label as box_label,
fixed_table.is_box as is_box,
fixed_table.box_start as A,
windowed_table.box_end as B,
(windowed_table.box_end - fixed_table.box_start) as interval_size
from
table1 fixed_table
join table1 windowed_table on
fixed_table.box_start <= windowed_table.box_end
where
interval_size >= 0
and fixed_table.box_label is null
and windowed_table.box_label is null
and fixed_table.is_box = 'FALSE'
and windowed_table.is_box = 'FALSE'
except
select
without_a_box.*
from
(
select
fixed_table.box_start as box_start,
fixed_table.box_end as box_end,
fixed_table.box_label as box_label,
fixed_table.is_box as is_box,
fixed_table.box_start as A,
windowed_table.box_end as B,
(windowed_table.box_end - fixed_table.box_start) as interval_size
from
table1 fixed_table
join table1 windowed_table on
fixed_table.box_start <= windowed_table.box_end
where
interval_size >= 0
and fixed_table.box_label is null
and windowed_table.box_label is null
and fixed_table.is_box = 'FALSE'
and windowed_table.is_box = 'FALSE'
) as without_a_box
,
(
select distinct
with_box.box_start as start_with_box
from
table1 with_box
where
with_box.is_box = 'FALSE'
and with_box.box_label is not null
) as items_inside_a_box
where
items_inside_a_box.start_with_box > without_a_box.A
and items_inside_a_box.start_with_box < without_a_box.B
) as without_intervals_that_intersect_boxed_items
group by
A
) as final
group by
B
) as intervals
join table1 on
table1.box_start >= intervals.A
and table1.box_end <= intervals.B
and table1.box_label is null
)
group by
box_label,
is_box,
item_code,
item
order by
box_start,
item_code
SQL Fiddle: http://www.sqlfiddle.com/#!7/4a643e/142
Even though it seems to do its job I am not sure is correct in all cases and if it won't be a performance bottleneck.
I hope someone has a better solution