Divide array into three disjoint sets with equal sum

Question 1

Problem definition : Array partition problem

Given an array of positive integers, divide it into three disjoint subsets having equal sum. These disjoint sets cover the complete array.

Example Input: [7, 6, 1, 7] Output: [[7], [6, 1], [7]]

Input: [7, 6, 2, 7] Output: Not possible to partition in three parts

Input: [[1, 1, 1, 1, 3, 2 ] Output:[[1, 1, 1], [1, 2], [3]]

Solution Brute force solution of examining all possible subsets in iterative approach.

# Gets all possible subsets of input_array which doesn't include element from skip_indices
def get_subsets(input_array, skip_indices=[]):
 def get_incremental_subsets(index, subsets): 
 new_subsets = []
 new_subsets.append([index]) 
 for count in range(len(subsets)):
 current_subset = subsets[count].copy()
 current_subset.append(index)
 if len(current_subset) == 1:
 new_subsets.append([current_subset])
 else:
 new_subsets.append(current_subset)
 return subsets + new_subsets
 subsets = []
 for count in range(len(input_array)):
 if count not in skip_indices:
 subsets = get_incremental_subsets(count, subsets) 
 return subsets
# Sum of a subset
def get_subset_sum(input_array, subset_indices):
 subset_sum = 0
 for subset_index in subset_indices:
 subset_sum = subset_sum + input_array[subset_index]
 return subset_sum
# Find partition input_array in two subsets such that each subset have sum equal to sum_value. 
# Subset exclude elements present in skip_indices
def partition_two(input_array, skip_indices, sum_value):
 input_array_sum = sum(input_array)
 for index in skip_indices:
 input_array_sum = input_array_sum - input_array[index]
 
 if input_array_sum != 2*sum_value: 
 return None
 
 subsets = get_subsets(input_array, skip_indices)
 for index in range(len(subsets)):
 if get_subset_sum(input_array, subsets[index]) == sum_value:
 return subsets[index]
 return None
 
def partition_three(input_array): 
 # Check if input_array sum is actually multiple of 3
 input_array_sum = sum(input_array)
 if input_array_sum%3 != 0: 
 return None 
 target_sum = input_array_sum/3
 
 #Iterate over all subsets
 for subset in get_subsets(input_array):
 # Skip subsets where sum is not equal to target_sum
 if get_subset_sum(input_array, subset) != target_sum: 
 continue
 
 # If subset with target_sum is found, try finding two subsets with target_sum in remaining array 
 subset_1 = partition_two(input_array, subset, target_sum)
 if subset_1 != None: 
 return [subset,\
 subset_1,\
 list(set([index for index in range(len(input_array))]) - set(subset) - set(subset_1))]
 return None
 
print(partition_three([1, 1, 1 ])) # Possible. Subset indices are [[0], [1], [2]]
print(partition_three([1, 1, 1, 3, 2, 1 ])) # Possible. Subset indices are [[0, 1, 2], [3], [4, 5]]
print(partition_three([1, 1, 1, 1, 3, 2 ])) # Possible. Subset indices are [[0, 1, 2], [4], [3, 5]]
print(partition_three([1, 5 ])) # Not possible
print(partition_three([])) # Not possible
print(partition_three([7, 3, 2, 1, 5, 4, 8 ])) # Possible. Subset indices are [[0, 1], [3, 4, 5], [2, 6]
print(partition_three([1, 3, 6, 2, 7, 1, 2, 8 ])) # Possible. Subset indices are [[0, 1, 2], [3, 4, 5], [6, 7]]
print(partition_three([7, 6, 1, 7 ])) #Possible. Subset indices are [[0], [1, 2], [3]]
print(partition_three([7, 6, 2, 7 ])) # Not possible
print(partition_three([17, 17, 17, 34, 34, 34, 59, 59, 59])) # Possible. Subset indices are [[0, 3, 6], [1, 4, 7], [8, 2, 5]]
print(partition_three([20, 23, 25, 30, 49, 45, 27, 30, 30, 40, 22, 19])) # Possible. Subset indices are [[0, 2, 3, 5], [1, 6, 7, 9], [8, 10, 11, 4]]

Review ask

Functional correctness
Algorithm improvements
Python usage

Question 2

Your problem description has output of list values. Your solution outputs list indices. Why?

Question 3

PEP 8

The Style Guide for Python Code has a several conventions all Python programmers should follow. You deviate from the guidelines in a few minor areas.

Binary operators should be surrounded by a space. You have 2*sum_value, input_array_sum%3 and input_array_sum/3, which should be written 2 * sum_value, input_array_sum % 3 and input_array_sum / 3
You have a long statement continued over 3 lines using backslashes. The backslashes are completely unnecessary, since the backslashes are contained inside brackets ([...]). The backslashes can be simply removed.
Spaces should not proceed ]. Many of your test cases have a space before the closing ], such as [1, 1, 1 ]. The final space should be removed ([1, 1, 1])

Speed improvements

def get_subset_sum(input_array, subset_indices):
 subset_sum = 0
 for subset_index in subset_indices:
 subset_sum = subset_sum + input_array[subset_index]
 return subset_sum

This code creates a subset_sum variable, assigns an integer to it, retrieves the value of the variable, adds another number to it (creating a new number object), and reassigns that to the variable, and repeats.

A minor improvement would be to use the += operator.

def get_subset_sum(input_array, subset_indices):
 subset_sum = 0
 for subset_index in subset_indices:
 subset_sum += input_array[subset_index]
 return subset_sum

but it would be much better to let the Python interpreter keep track of the intermediate results itself, without having to keep reassigning it to a variable:

def get_subset_sum(input_array, subset_indices):
 return sum(input_array[subset_index] for subset_index in subset_indices)

Fast reject

You immediately return None if the sum of the input array is not a multiple of 3. There are other checks you can make:

If the array length is less that 3, you could immediately stop, since one or more partitions would be empty.
If any element is larger that target_sum no solution is possible.

[1, 5] fails both the above conditions, but you run the entire algorithm on the input anyway futilely searching for a solution. With 2 elements, the failure is fairly fast, but with 99 ones, and one 99, you start exhaustively looking for subsets that total 66, of which there are many, but one subset will always have to include the 99, which means no solution is possible.

Algorithmic improvement

Your example cases contain many duplicate values. By searching all subset of indices, there are 24 different ways of selecting 4 ones from a list containing 4 ones, but the order does not matter when you end up summing these. If instead you maintained a count of all values (ie, using collections.Counter) you could avoid generating equivalent subsets which have already been examined, by selecting subsets containing 0 to n of those repeated values.

AJNeufeld AJNeufeld 35.3k5 gold badges41 silver badges103 bronze badges · Accepted Answer · 2021-10-15 21:58:50Z

PEP 8

The Style Guide for Python Code has a several conventions all Python programmers should follow. You deviate from the guidelines in a few minor areas.

Binary operators should be surrounded by a space. You have 2*sum_value, input_array_sum%3 and input_array_sum/3, which should be written 2 * sum_value, input_array_sum % 3 and input_array_sum / 3
You have a long statement continued over 3 lines using backslashes. The backslashes are completely unnecessary, since the backslashes are contained inside brackets ([...]). The backslashes can be simply removed.
Spaces should not proceed ]. Many of your test cases have a space before the closing ], such as [1, 1, 1 ]. The final space should be removed ([1, 1, 1])

Speed improvements

def get_subset_sum(input_array, subset_indices):
 subset_sum = 0
 for subset_index in subset_indices:
 subset_sum = subset_sum + input_array[subset_index]
 return subset_sum

This code creates a subset_sum variable, assigns an integer to it, retrieves the value of the variable, adds another number to it (creating a new number object), and reassigns that to the variable, and repeats.

A minor improvement would be to use the += operator.

def get_subset_sum(input_array, subset_indices):
 subset_sum = 0
 for subset_index in subset_indices:
 subset_sum += input_array[subset_index]
 return subset_sum

but it would be much better to let the Python interpreter keep track of the intermediate results itself, without having to keep reassigning it to a variable:

def get_subset_sum(input_array, subset_indices):
 return sum(input_array[subset_index] for subset_index in subset_indices)

Fast reject

You immediately return None if the sum of the input array is not a multiple of 3. There are other checks you can make:

If the array length is less that 3, you could immediately stop, since one or more partitions would be empty.
If any element is larger that target_sum no solution is possible.

[1, 5] fails both the above conditions, but you run the entire algorithm on the input anyway futilely searching for a solution. With 2 elements, the failure is fairly fast, but with 99 ones, and one 99, you start exhaustively looking for subsets that total 66, of which there are many, but one subset will always have to include the 99, which means no solution is possible.

Algorithmic improvement

Your example cases contain many duplicate values. By searching all subset of indices, there are 24 different ways of selecting 4 ones from a list containing 4 ones, but the order does not matter when you end up summing these. If instead you maintained a count of all values (ie, using collections.Counter) you could avoid generating equivalent subsets which have already been examined, by selecting subsets containing 0 to n of those repeated values.

Stack Exchange Network

Divide array into three disjoint sets with equal sum

1 Answer 1

PEP 8

Speed improvements

Fast reject

Algorithmic improvement

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Divide array into three disjoint sets with equal sum

1 Answer 1

PEP 8

Speed improvements

Fast reject

Algorithmic improvement

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions