I am unable to understand the logic behind O(n)
solution of this classical problem- Maximum length sub-array having given sum
(negative numbers included),
I read that it can be solved in O(n) time complexity and O(n) space complexity
, but the logic behind it is still unclear to me. I have read this solution:
- Initialize sum = 0 and maxLen = 0.
- Create a hash table having (sum, index) tuples.
- For i = 0 to n - 1, perform the following steps:
- Accumulate arr[i] to sum.
- If sum == k, update maxLen = i + 1.
- Check whether sum is present in the hash table or not. If not present, then add it to the hash table as (sum, i) pair.
- Check if (sum - k) is present in the hash table or not. If present, then obtain index of (sum - k) from the hash table as index. Now check if maxLen < (i - index), then update maxLen = (i - index).
- Return maxLen.
Examples:
Input : arr[] = { 10, 5, 2, 7, 1, 9 }, k = 15
Output : 4
The sub-array is {5, 2, 7, 1}.
Input : arr[] = {-5, 8, -14, 2, 4, 12}, k = -5
Output : 5
Taken from this link: https://www.geeksforgeeks.org/longest-sub-array-sum-k/
Could someone please elaborate on the solution provided for this problem?
2 Answers 2
This is an example of a dynamic algorithm. I will adapt the algorithm in the link, as I don't think that it is written in the most helpful way.
- Initialise
sum = 0
andmaxLen = 0
. - Create a hash table having
(sum, index)
tuples. - For
i = 0 to n-1
, perform the following steps:- Set
sum = sum + arr[i]
. - If
sum == k
, updatemaxLen = i+1
, continue from here to next iteration. - Check whether
sum
is present in the hash table or not. If not present, then add it to the hash table as(sum, i)
pair. - Check if
(sum-k)
is present in the hash table or not. If present, then obtain(sum-k, index)
from the hash table. IfmaxLen < (i-index)
, then updatemaxLen = (i-index)
. ReturnmaxLen
.
- Set
If sum == k
at case 3.3, then we know that the sub-array [a[0],...,a[i]]
is the maximum-length sub-array whose sum is k
. Otherwise, we continue to 3.3. If there was no sub-array until now whose sum we equal to the current sum
, then we add (sum, i)
to the hash table. The reason we don't overwrite existing entries will become apparent at the end.
We now continue to 3.4. If we came across a sub-array S
whose sum was sum - k
, then there must be pair of the form (sum - k, index)
in the hash table. S
is therefore the sub-array of the form [a[0], ..., a[index]]
.
We obtain a new sub-array [a[index + 1], ..., a[i]]
whose sum is k
and whose length is i - index
as follows: Remove all elements in S
from the sub-array [a[0], ..., a[i]]
. Its sum must be sum - (sum - k) = k
. Now the question is whether the length of the new sub-array i - index > maxLen
. If it is, set maxLen = i - index
.
Now we can see why we didn't update existing entries in 3.3. We do not want to add larger arrays (greater index
) for a given sum. This would have made i - index
smaller.
This is a polynomial-time algorithm, as we look at each element exactly once, and the hash table accesses are also constant time.
A sum of consecutive elements in a list can be expressed as:
$a_j + a_{j+1} + \ldots +a_k = (a_1 + \ldots + a_k) - (a_1 + \ldots + a_{j-1})$
Let's write
$S_n := a_1 + \ldots + a_n$
Then if we just go through the list calculating $S_i$ and remembering which numbers we got, we can check at each step whether there's a front part we can subtract off to get a given sum of consecutive elements.
Explore related questions
See similar questions with these tags.