Largest subarray with sum equal to 0

Question 1

This is a typical interview question:

Given an array that contains both positive and negative elements without 0, find the largest subarray whose sum equals 0.

I am not sure whether this handles all the edge cases. If someone can comment on that, it would be excellent. I also want to extend this to a sum equal to any value k, not just 0. How should I go about it? Any pointers on optimizing this further would also be helpful.

    from collections import Counter
    def sub_array_sum(array,k=0):
        start_index = -1
        hash_sum = {}
        current_sum = 0
        keys = set()
        best_index_hash = Counter()
        for i in array:
            start_index += 1
            current_sum += i
            if current_sum in hash_sum:
                hash_sum[current_sum].append(start_index)
                keys.add(current_sum)
            else:
                if current_sum == 0:
                    best_index_hash[start_index] = [(0,start_index)]
                else:
                    hash_sum[current_sum] = [start_index]
        if keys:
            for k_1 in keys:
                best_start = hash_sum.get(k_1)[0]
                best_end_list = hash_sum.get(k_1)[1:]
                for best_end in best_end_list:
                    if abs(best_start-best_end) in best_index_hash:
                        best_index_hash[abs(best_start-best_end)].append((best_start+1,best_end))
                    else:
                        best_index_hash[abs(best_start-best_end)] = [(best_start+1,best_end)]
        if best_index_hash:
            (bs,be) = best_index_hash[max(best_index_hash.keys(),key=int)].pop()
            print array[bs:be+1]
        else:
            print "No sub array with sum equal to 0"
    def Main():
        a = [6,-2,8,5,4,-9,8,-2,1,2]
        b = [-8,8]
        c = [7,8,-1,1]
        d = [2200,300,-6,6,5,-9]
        e = [-9,-6,8,6,-14,9,6]
        sub_array_sum(a)
        sub_array_sum(b)
        sub_array_sum(c)
        sub_array_sum(d)
        sub_array_sum(e)
    if __name__ == '__main__':
        Main()
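
One way to probe the edge cases is to compare the function against a slow but obviously correct reference. The following is only a sketch, not part of the posted code: `brute_force_longest` is a hypothetical helper, written for Python 3, that returns a half-open `(start, end)` pair so that `arr[start:end]` is a longest subarray summing to `k`, or `None` if there is none.

```python
def brute_force_longest(arr, k=0):
    """O(n^2) reference: try every start and extend to every end."""
    best = None
    for start in range(len(arr)):
        total = 0
        for end in range(start, len(arr)):
            total += arr[end]
            # Keep the first longest match found; ties go to the earlier start.
            if total == k and (best is None or end + 1 - start > best[1] - best[0]):
                best = (start, end + 1)
    return best


if __name__ == "__main__":
    for sample in ([6, -2, 8, 5, 4, -9, 8, -2, 1, 2], [-8, 8], [1, 2, 3]):
        print(sample, brute_force_longest(sample))
```

Inputs where a zero-sum prefix competes with a later zero-sum run, such as `[1, 2, -3, 9, 5, -5]`, are worth including when comparing the two.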

Question 2

Are only subarrays with contiguous elements from the original array allowed, or arbitrary elements?

Question 3

Yes. Only contiguous elements.

Question 4

I think that contiguous subsequences are called substrings.

vnp · Answer 1 · 2014-12-03 08:04:45Z

I have to admit I didn't fully understand the algorithm. Anyway, it seems suboptimal due to the use of heavyweight containers and nested loops, and it is really hard to follow.

I would rather start with an observation that replacing the array with its partial sums reduces the problem to finding two equal elements as distant as possible (two partial sums being equal means the sum in between is 0). This is pretty trivial:

Calculate partial sums (\$O(n)\$)
Stable sort (sum(i), i) tuples (\$O(n\log^2 n)\$)
Scan for largest equal range (\$O(n)\$)
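
The three steps above could be sketched roughly as follows. This is an illustration under my own assumptions, not code from the answer: the function name `longest_zero_sum_by_sorting` is hypothetical, it targets Python 3, and it prepends a 0 to the partial sums so that subarrays starting at index 0 are found as well. It relies on Python's built-in `sorted`, which is stable; sorting `(sum, index)` tuples orders equal sums by index anyway.

```python
from itertools import accumulate


def longest_zero_sum_by_sorting(arr):
    """Return (start, end) so that arr[start:end] is a longest zero-sum slice, or None."""
    # Step 1: partial sums, with a leading 0 standing for the empty prefix.
    prefix = [0] + list(accumulate(arr))
    # Step 2: sort (sum, index) pairs; equal sums become adjacent, indices ascending.
    pairs = sorted((s, i) for i, s in enumerate(prefix))
    # Step 3: scan runs of equal sums; the first and last index of a run
    # are the two equal partial sums that lie farthest apart.
    best = None
    run_start = 0
    for j in range(1, len(pairs) + 1):
        if j == len(pairs) or pairs[j][0] != pairs[run_start][0]:
            lo, hi = pairs[run_start][1], pairs[j - 1][1]
            if hi > lo and (best is None or hi - lo > best[1] - best[0]):
                best = (lo, hi)
            run_start = j
    return best


if __name__ == "__main__":
    data = [6, -2, 8, 5, 4, -9, 8, -2, 1, 2]
    span = longest_zero_sum_by_sorting(data)
    print(span, data[span[0]:span[1]] if span else None)
```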

Just to explain why this works: build an array of partial sums \$a'_i = a_0 + \dots + a_i\$. Then \$a'_i = a'_{i+k} \Rightarrow a_0 + \dots + a_i = a_0 + \dots + a_i + a_{i+1} + \dots + a_{i+k} \Rightarrow 0 = a_{i+1} + \dots + a_{i+k}\$.
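
A quick numeric check of that identity on the first test array from the question may help. This is just a sketch in Python 3; `equal_prefix_pairs` is a hypothetical helper name, and it deliberately uses a quadratic scan because it is meant as a demonstration, not as the solution.

```python
from itertools import accumulate


def equal_prefix_pairs(a):
    """Yield (i, j) with i < j whose partial sums a'_i and a'_j are equal."""
    prefix = list(accumulate(a))  # prefix[i] == a[0] + ... + a[i]
    for i in range(len(prefix)):
        for j in range(i + 1, len(prefix)):
            if prefix[i] == prefix[j]:
                yield i, j


a = [6, -2, 8, 5, 4, -9, 8, -2, 1, 2]
for i, j in equal_prefix_pairs(a):
    # Equal partial sums mean the elements after i, up to and including j, cancel out.
    assert sum(a[i + 1 : j + 1]) == 0
    print(i, j, a[i + 1 : j + 1])
```

Note that a zero-sum subarray starting at index 0 corresponds to \$a'_i = 0\$ rather than to a pair of equal partial sums, which is why implementations usually prepend a 0 to the partial sums.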

coderodde · Answer 2 · 2017-07-31 16:34:56Z

Here is a linear-time solution. The only drawback is that it relies on dictionaries; each dictionary operation is still expected \$O(1)\$, but the constant factors are a little larger. Also, I would implement a class that represents a solution to the problem:

    class ZeroSubarray:
        def __init__(self, arr, from_index, to_index):
            self.arr = arr
            self.from_index = from_index
            self.to_index = to_index
        def __str__(self):
            ret = ""
            ret += "from_index=" + str(self.from_index)
            ret += ", to_index=" + str(self.to_index)
            ret += ", subarray=["
            sep = ""
            for i in range(self.from_index, self.to_index):
                ret += sep
                sep = ", "
                ret += str(self.arr[i])
            ret += "]"
            return ret
    def zero_subarray_length(arr):
        cumulative_array = [0 for i in range(len(arr) + 1)]
        # This is O(n), where n = len(arr):
        for i in range(1, len(cumulative_array)):
            cumulative_array[i] = cumulative_array[i - 1] + arr[i - 1]
        map = {}
        # This is O(n) as well
        for index in range(len(cumulative_array)):
            current_value = cumulative_array[index]
            if current_value not in map.keys():
                list = [index]
                map[current_value] = list
            else:
                map[current_value].append(index)
        best_distance = 0
        best_start_index = -1
        # O(n) too. Each index into cumulative_array is stored in a dictionary
        # and so the size of the dict is O(n):
        for value, list in map.items():
            min_index = list[0]
            max_index = list[-1]
            current_distance = max_index - min_index
            if best_distance < current_distance:
                best_distance = current_distance
                best_start_index = min_index
        return ZeroSubarray(arr, best_start_index, best_start_index + best_distance)
    print(zero_subarray_length([2, 3, 6, -1, -4]))

Hope that helps.
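
On the question's follow-up about a target sum other than 0: the same partial-sum idea appears to carry over if, for each running total, you look up `total - k` instead of `total`. Below is a minimal single-pass sketch of that variant, not the code from this answer. It assumes Python 3, `longest_subarray_with_sum` is a hypothetical name, and it stores only the first position of each partial sum instead of a list of positions.

```python
def longest_subarray_with_sum(arr, k=0):
    """Return (start, end) so that arr[start:end] is a longest slice summing to k, or None."""
    first_seen = {0: 0}  # partial sum -> shortest prefix length that produces it
    best = None
    total = 0
    for end, x in enumerate(arr, start=1):
        total += x
        # A prefix with sum total - k ending at `start` means arr[start:end] sums to k.
        start = first_seen.get(total - k)
        if start is not None and (best is None or end - start > best[1] - best[0]):
            best = (start, end)
        # Only the earliest position matters for the longest match.
        first_seen.setdefault(total, end)
    return best


if __name__ == "__main__":
    data = [2, 3, 6, -1, -4]
    for target in (0, 5, 8):
        print(target, longest_subarray_with_sum(data, target))
```

Returning `None` when nothing matches also makes the "no such subarray" case explicit to the caller.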
