Find the length of the longest subarray having sum greater than k

Question 1

I tried to solve this problem but could not do it better than $O(n^2)$.

My Algorithm:

1. Calculate prefix sum
2. for i in 1..n:
 for j in 1..i:
 if presum[i] - presum[j-1] > k:
 ans = max(ans, i - j)

Edit: I have found a solution which works in $O(n\log n)$. This solution does a binary search on the length of the subarray and chooses the maximum length of the subarray satisfying the property.

The code is as follows:

#include<stdio.h>
long long int a[1000005]={0},sum[1000005]={0},n,k;
long long int check(long long int len)
{
long long int i,c=0;
for(i=len;i<=n;i++)
{
 if((sum[i]-sum[i-len])>=k)
 c++;
}
 return c;
 }
int main()
{
 long long int i;
 scanf("%lld",&n);
 scanf("%lld",&k);
 for(i=1;i<=n;i++)
 {
 scanf("%lld",&a[i]);
 
 sum[i]=a[i]+sum[i-1];
 }
 long long int low=0,high=n,ans=0,count=0;
 while(low<=high)
 {
 long long int mid=(low+high)/2;
 long long int p=check(mid);
 if(p>0)
 {
 ans=mid;
 low=mid+1;
 }
 else
 {
 high=mid-1;
 }
}
if(ans==0)
 printf("-1\n");
else
 printf("%lld\n",ans);
return 0; 
 }

My doubts: How does this algorithm discards the search space?

More specifically, what I want to know that suppose there does not exist any subarray of length $len$, then how can we show that there won't exist any subarray of length $>len$. The above code does the same way, whenever there does not exist any subarray of length $mid$ it removes the other half and continues to search for an answer in the lower half by decreasing the $high$ to $mid-1$.

Question 2

does the subarray have to be contiguous?

Question 3

yes the subarray has to be contiguous

Question 4

(You present a 1. - without either 2., 0. or 1.1. that looks pointless.) How about accumulating postfix sums, too?

Question 5

The algorithm you posted is not correct. It is doing binary search on a non-monotone function. For example, [100, 100, -100, -100, 100, 100] And k=50, there exists a valid interval of length 3 and 5, but none with length 4.

Question 6

Thanks for your review!

Question 7

We can improve your code and speed it up from $O(n^2)$ to $O(n \cdot \log n)$ with one change.

First precompute all prefix sum. Let $P_t$ = the sum of the first $t$ elements. Notice that $P_t$ is defined from 0ドル \leq t \leq n$ and in particular $P_0=0$. This step is $O(n)$ so it is ok.

Now you need to find two index 0ドル \leq j < i \leq n$ such that $P_i - P_j > k$ and $i - j$ is maximum.

The next step in your solution is fixing the right most index ($i$), so we need to figure out a way to find fast the leftmost $j$ such that $P_i - P_j > k$. For a fixed $i$ the condition can be converted to:

What is the smallest $j$ such that $P_i - k > P_j$?

We might be tempted to do binary search here, but it is not correct since $P_j$ is not sorted... but wait, we can just get rid of "not sorted elements" easily. Let's say that if there exist two index $a < b$ such that $P_a \leq P_b$, then we are not interested at all in the element at position $b$, since it will never be the leftmost index of an optimal interval, given that we can extend that interval up to position $a$.

With this in mind we can define an index $j$ important, if there is no index $k < j$ such that $P_k \leq P_j$, i.e. $j$ is the smallest element in the prefix ending at $j$. One nice property of important elements, is that they form a strictly decreasing subsequence of array $P$, and for any fixed $i$, the optimal election of $j$ is an important index.

Since the important index array forms a strictly decreasing subsequence we can perform binary search without any issue. To build such array, we inserts elements one by one, if the new element is less than the minimum element in the array then it is important and should be inserted, otherwise, we don't add this element. The minimum element of the important index array will always be the last one.

Pseudocode:

def largest_subarray(array, k):
 n := length of array
 P := prefix sum of array
 # Remember that P[0] = 0, P[1] = array[1], ...
 # List of important index in increasing order.
 # Notice the values such indexes are pointing too are decreasing.
 # That is P[J[0]], P[J[1]], ... is strictly decreasing
 # Initially there is only index 0
 J := [0]
 # Answer of this function.
 longest_length := -1
 for i in 1..n:
 # Check if there is at least one solution ending
 # at index i. i.e. If P[i] - min(P[j < i]) > k
 
 # Index of the smallest element
 last_j := J[-1]
 if P[i] - P[J[-1]] > k:
 # There is at least one solution, so we can run binary search
 # to find the optimal solution. i.e the smallest index j
 # such that P[i] - P[j] > k
 # Since j belongs to J, we are finding the smallest index
 # p such that P[i] - P[J[p]] > k
 p := bin_search()
 longest_length := max(longest_length, i - J[p])
 # Check if i is important
 if P[i] < P[last_j]:
 J.push(i)
 
 return longest_length

Question 8

Hey, can you please explain how does binary searching on the length of the subarray work ? I have added the code in edit section.

Question 9

Here is an $O(n)$ algorithm.

Compute the array of the prefix sums of the given array. This can be done in $O(n)$ time.

Let $p=(p_0, p_1, \cdots, p_n)$ be the array of the prefix sums, where $p_0=0$, $p_1=a_1$, $p_2=a_1+a_2$, etc. Now the problem is to find the largest $j-i$ such that $j>i$ and $p_j-p_i>k$.

Suppose $(p_i,p_j)$ is a pair such that $j>i$ and $p_j-p_i>k$. If $j-i$ reaches the maximum with $(p_i,p_j)$, then

$p_i$ should be the smallest element among $p_0, p_1, \cdots, p_i$; otherwise, we can replace $p_i$ by that smallest elements and obtain a larger $j-i$.
$p_j$ should be the largest element among $p_j, p_{j+1}, \cdots, p_n$; otherwise, we can find replace $p_j$ by that largest element and obtain a larger $j-i$.

That means, we need only to search for $(p_i,p_j)$ where $p_i$ is the smallest element in some prefix of $p$ and $p_j$ is the largest element in some suffix of $p$. This search can be done by the handy two-pointer technique.

First, we retain the smallest element so far as we traverse $p$ from left to right, to form an array of length at most $n$, from which we will choose $p_i$. Call it array left. We will let left_p be the index of $p_i$ in left.
Second, we retain the largest element so far as we traverse $p$ from right to left, to form an array of length at most $n$, from which we will choose $p_j$. Call it array right. We will let right_p be the index of $p_j$ in right.
For each left_p, we will find right_p furthest to the right such that $p_j-p_i>k$ and keep track of the maximum $j-i$ found so far, after which we will we will increase left_p. Although this step may involve a nested loop, it will be done in $O(n)$ time since in each iteration, either left_p and right_p will be increased.

Here is an implementation in Python.

Marcelo Fornet 1,1976 silver badges15 bronze badges · Answer 1 · 2020-08-19 00:27:10Z

We can improve your code and speed it up from $O(n^2)$ to $O(n \cdot \log n)$ with one change.

First precompute all prefix sum. Let $P_t$ = the sum of the first $t$ elements. Notice that $P_t$ is defined from 0ドル \leq t \leq n$ and in particular $P_0=0$. This step is $O(n)$ so it is ok.

Now you need to find two index 0ドル \leq j < i \leq n$ such that $P_i - P_j > k$ and $i - j$ is maximum.

The next step in your solution is fixing the right most index ($i$), so we need to figure out a way to find fast the leftmost $j$ such that $P_i - P_j > k$. For a fixed $i$ the condition can be converted to:

What is the smallest $j$ such that $P_i - k > P_j$?

We might be tempted to do binary search here, but it is not correct since $P_j$ is not sorted... but wait, we can just get rid of "not sorted elements" easily. Let's say that if there exist two index $a < b$ such that $P_a \leq P_b$, then we are not interested at all in the element at position $b$, since it will never be the leftmost index of an optimal interval, given that we can extend that interval up to position $a$.

With this in mind we can define an index $j$ important, if there is no index $k < j$ such that $P_k \leq P_j$, i.e. $j$ is the smallest element in the prefix ending at $j$. One nice property of important elements, is that they form a strictly decreasing subsequence of array $P$, and for any fixed $i$, the optimal election of $j$ is an important index.

Since the important index array forms a strictly decreasing subsequence we can perform binary search without any issue. To build such array, we inserts elements one by one, if the new element is less than the minimum element in the array then it is important and should be inserted, otherwise, we don't add this element. The minimum element of the important index array will always be the last one.

Pseudocode:

def largest_subarray(array, k):
 n := length of array
 P := prefix sum of array
 # Remember that P[0] = 0, P[1] = array[1], ...
 # List of important index in increasing order.
 # Notice the values such indexes are pointing too are decreasing.
 # That is P[J[0]], P[J[1]], ... is strictly decreasing
 # Initially there is only index 0
 J := [0]
 # Answer of this function.
 longest_length := -1
 for i in 1..n:
 # Check if there is at least one solution ending
 # at index i. i.e. If P[i] - min(P[j < i]) > k
 
 # Index of the smallest element
 last_j := J[-1]
 if P[i] - P[J[-1]] > k:
 # There is at least one solution, so we can run binary search
 # to find the optimal solution. i.e the smallest index j
 # such that P[i] - P[j] > k
 # Since j belongs to J, we are finding the smallest index
 # p such that P[i] - P[J[p]] > k
 p := bin_search()
 longest_length := max(longest_length, i - J[p])
 # Check if i is important
 if P[i] < P[last_j]:
 J.push(i)
 
 return longest_length

Hey, can you please explain how does binary searching on the length of the subarray work ? I have added the code in edit section.

喜欢算法和数学 39.2k4 gold badges35 silver badges96 bronze badges · Answer 2 · 2020-09-03 09:18:25Z

Here is an $O(n)$ algorithm.

Compute the array of the prefix sums of the given array. This can be done in $O(n)$ time.

Let $p=(p_0, p_1, \cdots, p_n)$ be the array of the prefix sums, where $p_0=0$, $p_1=a_1$, $p_2=a_1+a_2$, etc. Now the problem is to find the largest $j-i$ such that $j>i$ and $p_j-p_i>k$.

Suppose $(p_i,p_j)$ is a pair such that $j>i$ and $p_j-p_i>k$. If $j-i$ reaches the maximum with $(p_i,p_j)$, then

$p_i$ should be the smallest element among $p_0, p_1, \cdots, p_i$; otherwise, we can replace $p_i$ by that smallest elements and obtain a larger $j-i$.
$p_j$ should be the largest element among $p_j, p_{j+1}, \cdots, p_n$; otherwise, we can find replace $p_j$ by that largest element and obtain a larger $j-i$.

That means, we need only to search for $(p_i,p_j)$ where $p_i$ is the smallest element in some prefix of $p$ and $p_j$ is the largest element in some suffix of $p$. This search can be done by the handy two-pointer technique.

First, we retain the smallest element so far as we traverse $p$ from left to right, to form an array of length at most $n$, from which we will choose $p_i$. Call it array left. We will let left_p be the index of $p_i$ in left.
Second, we retain the largest element so far as we traverse $p$ from right to left, to form an array of length at most $n$, from which we will choose $p_j$. Call it array right. We will let right_p be the index of $p_j$ in right.
For each left_p, we will find right_p furthest to the right such that $p_j-p_i>k$ and keep track of the maximum $j-i$ found so far, after which we will we will increase left_p. Although this step may involve a nested loop, it will be done in $O(n)$ time since in each iteration, either left_p and right_p will be increased.

Here is an implementation in Python.

Stack Exchange Network

Find the length of the longest subarray having sum greater than k

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Find the length of the longest subarray having sum greater than k

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions