Nsum problem: find all n elements in an array whose sum equals a given value

Question 1

I am pretty new to Python, so I want to listen to any advice that improve my coding style in a "pythonic" way, even about naming style of variables. The following code reflects a very general paradigm that formulates a problem as depth-first search in a tree, which I find interesting and powerful.

Suppose arr = [3, 5, 2, 1, 1, 7, 10, 2], we want to find all n=2 elements whose sum equals val=8. The algorithm should return a list of tuples, where each tuple is INDICES of n=2 elements (at different positions of the array) whose sum is val=8: [(3,5), (4,5), (0,1)], so we have,

arr[3]+arr[5] = 1+7 = 8; 
arr[4]+arr[5] = 1+7 = 8; 
arr[0]+arr[1] = 3+5 = 8.

Here is my code:

def nsum(arr, n, val): 
 if arr is None or len(arr) < n:
 return [] 
 # first sort the array by indice, so that we can search sequentially
 # with early stopping
 sorted_indice = sorted(xrange(len(arr)), key=lambda i: arr[i])
 size = len(arr)
 def nsum_recursive(arr, prev_indice, prev_sum, result):
 ''' This is the main algorithm that is a recursive function.
 Think about the problem as tree search using depth-first search,
 Each node/state is determined by (prev_indice, prev_sum), where
 prev_indice is a list of currently explored indice, whose sum is
 prev_sum. When len(prev_indice)==n, we reach a leaf node. Then we
 can check whether the sum equals val.
 '''
 # termination condition (leaf nodes)
 if len(prev_indice) == n:
 if prev_sum == val:
 result.append(tuple(sorted_indice[i] for i in prev_indice))
 else:
 current_count = len(prev_indice)
 # depth-first search
 for idx in range(
 0 if not prev_indice else prev_indice[-1]+1, 
 size-(n-current_count-1)):
 current_sum = prev_sum + arr[sorted_indice[idx]]
 # early stopping (pruning of branches)
 if current_sum>val and idx+1<size and arr[sorted_indice[idx+1]]>=0:
 break
 nsum_recursive(arr, prev_indice+[idx], 
 current_sum, result)
 # start DFS from the root node
 result = []
 nsum_recursive(arr, [], 0, result)
 return result

Question 2

[(3,5), (4,5), (0,1)] - I don't see how 4+5 or 0+1 equals 8.

Question 3

Sorry for the confusion, they are "INDICES" of the array. So arr[4] + arr[5] = 1 + 7 = 8.

Question 4

the array is supposed to have only positive numbers? An index can be repeated?

Question 5

The array can have negative numbers, not just positive numbers. An index will be used only once in any single tuple of indices of n numbers, that is, those n numbers whose sum is val should be at different positions of an array.

Question 6

def nsum(arr, n, val):

A documentation string here is important. The names of the variables are so vague that even knowing the task in advance is difficult to understand their meaning.

 if arr is None or len(arr) < n:
 return []

This check is non very useful. It does not decrease the algorithm complexity. In particular the check arr is None is wrong... you want the function to fail if the arr is not a list, otherwise you hide possible errors in the caller's code.

 # first sort the array by indice, so that we can search sequentially
 # with early stopping
 sorted_indice = sorted(xrange(len(arr)), key=lambda i: arr[i])
 size = len(arr)

invert the two lines above, so that you can use size in the previous line. Why to use xrange instead of range? The name of the variable sorted_indice is not very good for my taste: it is not "sorted" (maybe it is "sorting") and what does "indice" means? Maybe you mean "indices"?

 def nsum_recursive(arr, prev_indice, prev_sum, result):

Here the complexity is too much... we have four variables plus the three of the enclosing function: very difficult to keep track of all of them. arr is repeated. result should be the return value.

 ''' This is the main algorithm that is a recursive function.
 Think about the problem as tree search using depth-first search,
 Each node/state is determined by (prev_indice, prev_sum), where
 prev_indice is a list of currently explored indice, whose sum is
 prev_sum. When len(prev_indice)==n, we reach a leaf node. Then we
 can check whether the sum equals val.
 '''

The explanation is not very clear. You describe what are the nodes of the tree but what are the children of a node?

 # termination condition (leaf nodes)
 if len(prev_indice) == n:
 if prev_sum == val:
 result.append(tuple(sorted_indice[i] for i in prev_indice))
 else:

The two nested if are better replaced by a single if and the program should return at once, so that else is not required. This would reduce the indentation of your code. Also it seems that the check for equality and the check for early stopping are quite similar and could be done together.

 current_count = len(prev_indice)
 # depth-first search
 for idx in range(
 0 if not prev_indice else prev_indice[-1]+1, 
 size-(n-current_count-1)):

here it is very difficult to understand if the range of numbers is correct.

 current_sum = prev_sum + arr[sorted_indice[idx]]
 # early stopping (pruning of branches)
 if current_sum>val and idx+1<size and arr[sorted_indice[idx+1]]>=0:
 break

You can make the early stopping better (and maybe simpler) by checking that the current_sum plus the product of the remaining terms by the smallest term is larger that the target sum. You could also make the opposite check: if summing the larger possibile terms you cannot reach the target sum.

 nsum_recursive(arr, prev_indice+[idx], 
 current_sum, result)
 # start DFS from the root node
 result = []
 nsum_recursive(arr, [], 0, result)
 return result

score 2 · Accepted Answer · 2014-09-27 22:43:19Z

def nsum(arr, n, val):

A documentation string here is important. The names of the variables are so vague that even knowing the task in advance is difficult to understand their meaning.

 if arr is None or len(arr) < n:
 return []

This check is non very useful. It does not decrease the algorithm complexity. In particular the check arr is None is wrong... you want the function to fail if the arr is not a list, otherwise you hide possible errors in the caller's code.

 # first sort the array by indice, so that we can search sequentially
 # with early stopping
 sorted_indice = sorted(xrange(len(arr)), key=lambda i: arr[i])
 size = len(arr)

invert the two lines above, so that you can use size in the previous line. Why to use xrange instead of range? The name of the variable sorted_indice is not very good for my taste: it is not "sorted" (maybe it is "sorting") and what does "indice" means? Maybe you mean "indices"?

 def nsum_recursive(arr, prev_indice, prev_sum, result):

Here the complexity is too much... we have four variables plus the three of the enclosing function: very difficult to keep track of all of them. arr is repeated. result should be the return value.

 ''' This is the main algorithm that is a recursive function.
 Think about the problem as tree search using depth-first search,
 Each node/state is determined by (prev_indice, prev_sum), where
 prev_indice is a list of currently explored indice, whose sum is
 prev_sum. When len(prev_indice)==n, we reach a leaf node. Then we
 can check whether the sum equals val.
 '''

The explanation is not very clear. You describe what are the nodes of the tree but what are the children of a node?

 # termination condition (leaf nodes)
 if len(prev_indice) == n:
 if prev_sum == val:
 result.append(tuple(sorted_indice[i] for i in prev_indice))
 else:

The two nested if are better replaced by a single if and the program should return at once, so that else is not required. This would reduce the indentation of your code. Also it seems that the check for equality and the check for early stopping are quite similar and could be done together.

 current_count = len(prev_indice)
 # depth-first search
 for idx in range(
 0 if not prev_indice else prev_indice[-1]+1, 
 size-(n-current_count-1)):

here it is very difficult to understand if the range of numbers is correct.

 current_sum = prev_sum + arr[sorted_indice[idx]]
 # early stopping (pruning of branches)
 if current_sum>val and idx+1<size and arr[sorted_indice[idx+1]]>=0:
 break

You can make the early stopping better (and maybe simpler) by checking that the current_sum plus the product of the remaining terms by the smallest term is larger that the target sum. You could also make the opposite check: if summing the larger possibile terms you cannot reach the target sum.

 nsum_recursive(arr, prev_indice+[idx], 
 current_sum, result)
 # start DFS from the root node
 result = []
 nsum_recursive(arr, [], 0, result)
 return result

Stack Exchange Network

Nsum problem: find all n elements in an array whose sum equals a given value

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Nsum problem: find all n elements in an array whose sum equals a given value

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions