3
\$\begingroup\$

This is a typical interview question:

Given an array that contains both positive and negative elements without 0, find the largest subarray whose sum equals 0.

I am not sure whether this handles all the edge cases. If someone can comment on that, it would be excellent. I also want to extend this to a sum equal to any value k, not just 0. How should I go about it? Any pointers on optimizing this further would also be helpful.

from collections import Counter


def sub_array_sum(array,k=0):
    start_index = -1
    hash_sum = {}
    current_sum = 0
    keys = set()
    best_index_hash = Counter()
    for i in array:
        start_index += 1
        current_sum += i
        if current_sum in hash_sum:
            hash_sum[current_sum].append(start_index)
            keys.add(current_sum)
        else:
            if current_sum == 0:
                best_index_hash[start_index] = [(0,start_index)]
            else:
                hash_sum[current_sum] = [start_index]
    if keys:
        for k_1 in keys:
            best_start = hash_sum.get(k_1)[0]
            best_end_list = hash_sum.get(k_1)[1:]
            for best_end in best_end_list:
                if abs(best_start-best_end) in best_index_hash:
                    best_index_hash[abs(best_start-best_end)].append((best_start+1,best_end))
                else:
                    best_index_hash[abs(best_start-best_end)] = [(best_start+1,best_end)]
    if best_index_hash:
        (bs,be) = best_index_hash[max(best_index_hash.keys(),key=int)].pop()
        print(array[bs:be+1])
    else:
        print("No sub array with sum equal to 0")


def Main():
    a = [6,-2,8,5,4,-9,8,-2,1,2]
    b = [-8,8]
    c = [7,8,-1,1]
    d = [2200,300,-6,6,5,-9]
    e = [-9,-6,8,6,-14,9,6]
    sub_array_sum(a)
    sub_array_sum(b)
    sub_array_sum(c)
    sub_array_sum(d)
    sub_array_sum(e)


if __name__ == '__main__':
    Main()
Jamal
asked Dec 3, 2014 at 6:54
\$\endgroup\$
3
  • \$\begingroup\$ Are only subarrays with contiguous elements from the original array allowed, or arbitrary elements? \$\endgroup\$ Commented Dec 3, 2014 at 9:23
  • \$\begingroup\$ Yes. Only contiguous elements. \$\endgroup\$ Commented Dec 3, 2014 at 9:24
  • \$\begingroup\$ I think that contiguous subsequences are called substrings. \$\endgroup\$ Commented Aug 1, 2017 at 6:54

2 Answers

4
\$\begingroup\$

I have to admit I didn't fully understand the algorithm. Anyway, it seems suboptimal because of the heavyweight containers and nested loops, and it is really hard to follow.

I would rather start with an observation that replacing the array with its partial sums reduces the problem to finding two equal elements as distant as possible (two partial sums being equal means the sum in between is 0). This is pretty trivial:

  • Calculate partial sums (\$O(n)\$)

  • Stable sort (sum(i), i) tuples (\$O(n\log^2 n)\$)

  • Scan for largest equal range (\$O(n)\$)
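The three steps above can be sketched in Python roughly as follows. This is only an illustration, not the answerer's code: the function name `longest_zero_sum_by_sorting` and the `(start, end)` slice convention are assumptions made for the example. Because each tuple carries its index, an ordinary `sorted` call already orders equal sums by position, so the sketch does not depend on the sort being stable.

```python
def longest_zero_sum_by_sorting(arr):
    """Return (start, end) so that arr[start:end] is the longest zero-sum slice, or None."""
    # Step 1: partial sums. prefix[i] is the sum of arr[:i]; prefix[0] = 0
    # lets a zero-sum slice start at index 0.
    prefix = [0]
    for value in arr:
        prefix.append(prefix[-1] + value)

    # Step 2: sort (sum, index) pairs so equal sums become adjacent,
    # ordered by index within each group.
    pairs = sorted((s, i) for i, s in enumerate(prefix))

    # Step 3: scan each group of equal sums; its first and last indices
    # delimit the widest zero-sum slice for that sum.
    best = None
    group_start = 0
    for pos in range(1, len(pairs) + 1):
        if pos == len(pairs) or pairs[pos][0] != pairs[group_start][0]:
            first = pairs[group_start][1]
            last = pairs[pos - 1][1]
            if last > first and (best is None or last - first > best[1] - best[0]):
                best = (first, last)
            group_start = pos
    return best
```

For the question's first test array, `[6,-2,8,5,4,-9,8,-2,1,2]`, the partial sum 21 occurs at positions 5 and 10, so the sketch returns `(5, 10)`, i.e. the slice `[-9, 8, -2, 1, 2]`.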

answered Dec 3, 2014 at 8:04
\$\endgroup\$
1
  • \$\begingroup\$ Just to explain why this works: you create an array with elements \$ a'_i = a_0+...+a_i \$. So \$ a'_i = a'_{(i+k)} \Rightarrow a_0+...+a_i = a_0+...+a_i + ... +a_{(i+k)} \Rightarrow 0 = a_{(i+1)} + ... + a_{(i+k)} \$ \$\endgroup\$ Commented Dec 3, 2014 at 10:35
0
\$\begingroup\$

Here is a linear-time solution. The only drawback is that it relies on dictionaries; each dictionary operation is still \$O(1)\$ on average, but the constant factors are a little larger. Also, I would implement a class that represents a solution to the problem:

class ZeroSubarray:
    def __init__(self, arr, from_index, to_index):
        self.arr = arr
        self.from_index = from_index
        self.to_index = to_index

    def __str__(self):
        ret = ""
        ret += "from_index=" + str(self.from_index)
        ret += ", to_index=" + str(self.to_index)
        ret += ", subarray=["
        sep = ""
        for i in range(self.from_index, self.to_index):
            ret += sep
            sep = ", "
            ret += str(self.arr[i])
        ret += "]"
        return ret


def zero_subarray_length(arr):
    cumulative_array = [0 for i in range(len(arr) + 1)]
    # This is O(n), where n = len(arr):
    for i in range(1, len(cumulative_array)):
        cumulative_array[i] = cumulative_array[i - 1] + arr[i - 1]

    # Maps each cumulative sum to the (ascending) list of indices where it occurs.
    # Named to avoid shadowing the built-ins map and list.
    indices_by_sum = {}
    # This is O(n) as well:
    for index in range(len(cumulative_array)):
        current_value = cumulative_array[index]
        if current_value not in indices_by_sum:
            indices_by_sum[current_value] = [index]
        else:
            indices_by_sum[current_value].append(index)

    best_distance = 0
    best_start_index = -1
    # O(n) too. Each index into cumulative_array is stored in the dictionary
    # exactly once, so the total size of the dict is O(n):
    for value, indices in indices_by_sum.items():
        min_index = indices[0]
        max_index = indices[-1]
        current_distance = max_index - min_index
        if best_distance < current_distance:
            best_distance = current_distance
            best_start_index = min_index

    # If no zero-sum subarray exists, both indices are -1 and the subarray is empty.
    return ZeroSubarray(arr, best_start_index, best_start_index + best_distance)


print(zero_subarray_length([2, 3, 6, -1, -4]))
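The question also asks about a target sum other than 0. One possible way to cover that, sketched here under assumptions and not taken from the code above, is to keep only the first index at which each running sum appears and look up `running - k` in a single pass. The function name `longest_subarray_with_sum` and the `(start, end)` return convention are hypothetical.

```python
def longest_subarray_with_sum(arr, k=0):
    """Return (start, end) so that arr[start:end] is the longest slice summing to k, or None."""
    # Maps a prefix sum to the smallest index at which it occurs.
    # The empty prefix (sum 0) occurs at index 0.
    first_seen = {0: 0}
    running = 0
    best = None
    for end, value in enumerate(arr, start=1):
        running += value
        # arr[start:end] sums to k exactly when prefix[start] == running - k.
        start = first_seen.get(running - k)
        if start is not None and (best is None or end - start > best[1] - best[0]):
            best = (start, end)
        # Record only the earliest index for each prefix sum.
        first_seen.setdefault(running, end)
    return best
```

With `k=0` this solves the same problem as the dictionary-of-lists version, but stores one index per distinct sum instead of every occurrence. For example, `longest_subarray_with_sum([1, 2, 3], k=5)` returns `(1, 3)`, the slice `[2, 3]`.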

Hope that helps.

answered Jul 31, 2017 at 16:34
\$\endgroup\$
