Given a set of points $\{p_1, p_2, \dots p_n\}$ and a set of intervals $I =\{[a_1, b_1], \dots [a_m, b_m]\}$, you are asked to find a set of subintervals $S = \{[c_1, d_1], \dots [c_m, d_m]\}$ where $[c_i, d_i] \subseteq [a_i, b_i]$ s.t. each $p_j \in \bigcup_{[c, d] \in S} [c, d]$ and $\sum_{[c_i, d_i] \in S}|d_i - c_i|$ is minimized. Informally you have to find a subinterval for each given interval, s.t. each point is covered by a subinterval and the sum of the sizes of the subintervals is minimized.
Is there a polynomial algorithm for it?
1 Answer 1
There might exist a polynomial time algorithm for this. Here are some pointers:
Suppose the set of points sorted in ascending order. The intervals are also sorted in ascending order by their start time $a_i$. If not, they can be sorted in $O(n\log n)$ and $O(m\log m)$ time, respectively. Also note that for each $i$, we have $b_i \ge a_i$ and $d_i \ge c_i$.
Now it is easy to observe that intervals in $S$ must be non-overlapping; otherwise, we can always adjust the bounds of such overlapping intervals and get a reduced cost.
Also notice that, for any interval $i$, both $c_i$ and $d_i$ have to be some points $p$ and $p'$ that are covered by $[c_i,d_i] \subseteq [a_i,b_i]$.
Thus, we have $c_1 = p_1$ and $c_m = p_n$. Also keep in mind that for some interval $i$, we may have $c_i = d_i$, and thus do not contribute to the cost.
Now since $c_1$ and $d_m$ are fixed, the objective function $minimize: \sum_{i=1}^m (d_i - c_i)$ can be re written as $minimize: (d_m - c_1) - \sum_{i=1}^{m-1} (c_{i+1} - d_i)$. Which is equivalent to $maximize: \sum_{i=1}^{m-1} (c_{i+1} - d_i)$. This means we have to maximally space out the consecutive intervals.
If two intervals $[a_i,b_i]$ and $[a_{i+1},b_{i+1}]$ are partially overlapping (i.e., $a_{i+1} < b_i$), then there must be a pair of consecutive elements $p_j$ and $p_{j+1}$ in that overlap region such that $(p_{j+1} - p_j)$ is the maximum among all such possible pairs. This can be determined in polynomial time.
PS: I have not yet been able to figure out how to deal with the following case:
When two intervals $[a_i,b_i]$ and $[a_{i+1},b_{i+1}]$ are such that the latter is fully contained within the former one.
I am posting this an answer so that others may use the above facts in some nicer way to come up with a working algorithm.
Explore related questions
See similar questions with these tags.