0
$\begingroup$

Consider the following problem from Jeff Erickson: Algorithms that also appears in this post, which wants us to prove a lower bound for the problem.

Suppose you are given an array $A[1 .. n]$ of integers, each of which may be positive, negative, or zero. A contiguous subarray $A[i .. j]$ is called a positive interval if the sum of its entries is greater than zero. Describe and analyze an algorithm to compute the minimum number of positive intervals that cover every positive entry in $A$. For example, given the following array as input, your algorithm should output 3ドル$. If every entry in the input array is negative, your algorithm should output 0ドル$.
$${+}3\ {-}5\ {+}7\ {-}4\ {+}1\ {-}8\ {+}3\ {-}7\ {+}5\ {-}9\ {+}5\ {-}2\ {+}4$$

My question is, is there a greedy approach for this problem?

I have an idea, find the largest positive number, then according to whether there is some positive numbers in each side, go left or right to cover more positive number. But I think this idea doesn't work.

喜欢算法和数学
39.2k4 gold badges35 silver badges96 bronze badges
asked Apr 29, 2022 at 23:28
$\endgroup$
3
  • $\begingroup$ It looks like your idea is a bit too complicated and unlikely to work. $\endgroup$ Commented Apr 30, 2022 at 0:42
  • $\begingroup$ We ask that references satisfy scholarly requirements: please include at least the title of the book and the full name of the author, and the chapter and exercise number, so that others with the same question are more likely to be able to find this page by search. Thank you! $\endgroup$ Commented Apr 30, 2022 at 20:05
  • $\begingroup$ Please clarify in the question: are intervals allowed to overlap? $\endgroup$ Commented May 18, 2023 at 7:55

3 Answers 3

1
$\begingroup$

There is a simple greedy algorithm.

A greedy approach

The greedy idea is obvious: starting from the very left, extend the coverage to the right as far as possible each time.

However, it may not be obvious to get it right. We should search each next positive interval by the following two criteria:

  1. It should cover the next uncovered positive entry.
  2. It should cover as many entries as possible after that entry among all positive intervals that covers that entry.

To boost performance, after the next $u$ncovered positive entry $A[u]$ is found, we will find the optimal start of the positive interval to cover $A[u]$ and the most entries after $A[u]$. That optimal start is an entry such that the sum of entries between it and $A[u]$ is the largest. In fact, that optimal start can be ignored; knowing that largest sum is enough.

Let $\text{sum}(A[i..j])=A[i]+A[i+1]+\cdots+A[j]$, the sum of entries in $A[i..j]$.

Here is the greedy algorithm.

$u\leftarrow0.$ $\ rightmost\leftarrow0.$ $\ answer\leftarrow 0.$ Repeat the following 3ドル$ steps.

  1. $u\leftarrow$ the least index greater than $rightmost$ at which the entry of $A$ is positive.
    If there is no such index, return $answer$.
    Otherwise, $answer\leftarrow answer+1$. We will cover $A[u]>0$ and as many entries as possible after it with one positive interval.
  2. $surplus\leftarrow$ the largest sum among $\text{sum}(A[u..u])$, $\text{sum}(A[u-1..u])$, $\cdots$, $\text{sum}(A[1..u])$.
    For the sake of correctness proof, let $start$ be the corresponding starting index of $surplus$, which is the optimal start of the desired positive interval.
  3. $rightmost\leftarrow$ the largest index $k$ such that $ surplus+\text{sum}(A[u+1..k])>0$, where $u+1\le k\le n$. If there is no such index, $rightmost\leftarrow u$.
    Conceptually, we select the positive interval $A[start, rightmost]$.

Complexity Analysis

Each one of step 1, 2 and 3 takes $O(n)$ time.

$u$ starts at 0ドル$ and ends at most $n$. Since $u\le n$ always increases at step 1 unless the algorithm returns, step 1 and hence the whole loop body of step 1, 2, 3 can repeat at most $n$ times. Hence the whole algorithm takes $O(n^2)$ time.

The post here optimizes the approach above to run in $O(n)$ time.

Correctness proof

Let $A[s_1..r_1], A[s_2..r_2], \cdots$ be the positive intervals selected successively by the greedy algorithm. So $r_1<r_2<\cdots$.
Let $A[s_1'..r_1'], A[s_2'..r_2'], \cdots$ be some positive intervals that covers all positive entries in $A$. WLOG, assume $r_1'<r_2'<\cdots$.

Let $s_0=r_0=s_0'=r_0'=0$.

It is enough to prove the greedy algorithm "stays ahead", i.e., $r_i\ge r_i'$ for all $i$.

The case $i=0$ is trivial. Suppose $r_i\ge r_i'$ for some $i$.

Let $u$ and $start=s_{i+1}$ be the values found in the step 1 and 2 of the algorithm that leads to the selected interval $A[s_{i+1}..r_{i+1}]$, i.e., $A[u]$ is the positive entry after $A[r_i]$ and $\text{sum}(A[start..u])$ is the largest among $\text{sum}(A[*..u])$.

Since $r_i'\le r_i<u$, $A[u]$ is not covered by any interval $A[s_j'.. r_j']$, 1ドル\le j\le i$. Hence $A[u]$ is in some $A[s_k'..r_k']$, where $k\ge i+1$. $$ \text{sum}(A[u+1..r_k'])=\text{sum}(A[s_k'..r_k'])-\text{sum}(A[s_k'..u])>0-\text{sum}(A[start..u])$$ By the step 3ドル$ of the algorithm, the inequality above implies $r_{i+1}\ge r_k'$. Hence $r_{i+1}\ge r_k'\ge r_{i+1}'$.

A Python implementation

While the indices above start with 1ドル$, they start with 0ドル$ in the implementation.

from itertools import accumulate
def minimum_cover(a):
 n = len(a)
 answer = 0
 rightmost = -1
 while True:
 uncovered = rightmost + 1
 while uncovered < n and a[uncovered] <= 0:
 uncovered += 1
 if uncovered == n:
 break
 surplus = max(accumulate(a[uncovered::-1]))
 new_rightmost = uncovered + 1 + max(
 (idx for idx, sm in enumerate(accumulate(a[uncovered + 1:n]))
 if surplus + sm > 0), default=-1)
 if new_rightmost > rightmost:
 rightmost = new_rightmost
 answer += 1
 return answer
print(minimum_cover([1, 7, -9, 2, -10, -5, 5, -10, 4, -2, 3, -1, 2])) # 2
print(minimum_cover([1, -3, 3, -2, 1, -2, 1])) # 2
answered Apr 30, 2022 at 0:41
$\endgroup$
1
  • $\begingroup$ Overlapping positive intervals are needed to cover [1, -2, 1, -2, 3, -2, 1, -2, 1] with minimum number of positive intervals. Ditto for [1, -7, 3, -5, 9, -6, 4, -8, 2]. $\endgroup$ Commented May 17, 2023 at 13:54
1
$\begingroup$

The accepted greedy solution doesn't work.

I don't think there is a greedy algorithm. There is a counterexample to the other solution, where they use the heuristic of adding the largest positive interval starting at the current positive integer.

Consider the counterexample:

[1, -10, 10, -8, 3, -6, 4, -10, 8]

The proposed greedy algorithm will cover the array with 3 intervals:

(1, -10, 10) (3, -6, 4) (8)

But the array can be optimally covered with just 2 intervals:

(1) (10, -8, 3, -6, 4, -10, 8)
answered Oct 11, 2022 at 4:14
$\endgroup$
1
  • $\begingroup$ Thanks for catching my error. Please check my updated answer. $\endgroup$ Commented Dec 9, 2022 at 14:01
0
$\begingroup$

When we have an array like this: [1, -6, 5, -100, 101, -8, 3, -6, 4, -90, 8],

and apply the above greedy algorithm to it, the outcome is 2

However, I think that the correct answer should be 3.

answered May 17, 2023 at 9:00
$\endgroup$
3
  • $\begingroup$ It is not required that the positive intervals are disjoint. sum([1, -6, 5, -100, 101]$=1>0$.$\quad$ sum([101, -8, 3, -6, 4, -90, 8]$=12>0$. The correct outcome should be 2ドル$. $\endgroup$ Commented May 17, 2023 at 12:44
  • $\begingroup$ Please clarify in your answer: Does above greedy algorithm refer to ErroR's or John L's? $\endgroup$ Commented May 18, 2023 at 9:34
  • $\begingroup$ It refers to John L's. but I misunderstood the problem. I didn't know overlapping was possible. Thank you for pointing out the part I misunderstood. $\endgroup$ Commented May 18, 2023 at 13:33

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.