Problem definition : Array partition problem
Given an array of positive integers, divide it into three disjoint subsets having equal sum. These disjoint sets cover the complete array.
Example Input: [7, 6, 1, 7] Output: [[7], [6, 1], [7]]
Input: [7, 6, 2, 7] Output: Not possible to partition in three parts
Input: [[1, 1, 1, 1, 3, 2 ] Output:[[1, 1, 1], [1, 2], [3]]
Solution Brute force solution of examining all possible subsets in iterative approach.
# Gets all possible subsets of input_array which doesn't include element from skip_indices
def get_subsets(input_array, skip_indices=[]):
def get_incremental_subsets(index, subsets):
new_subsets = []
new_subsets.append([index])
for count in range(len(subsets)):
current_subset = subsets[count].copy()
current_subset.append(index)
if len(current_subset) == 1:
new_subsets.append([current_subset])
else:
new_subsets.append(current_subset)
return subsets + new_subsets
subsets = []
for count in range(len(input_array)):
if count not in skip_indices:
subsets = get_incremental_subsets(count, subsets)
return subsets
# Sum of a subset
def get_subset_sum(input_array, subset_indices):
subset_sum = 0
for subset_index in subset_indices:
subset_sum = subset_sum + input_array[subset_index]
return subset_sum
# Find partition input_array in two subsets such that each subset have sum equal to sum_value.
# Subset exclude elements present in skip_indices
def partition_two(input_array, skip_indices, sum_value):
input_array_sum = sum(input_array)
for index in skip_indices:
input_array_sum = input_array_sum - input_array[index]
if input_array_sum != 2*sum_value:
return None
subsets = get_subsets(input_array, skip_indices)
for index in range(len(subsets)):
if get_subset_sum(input_array, subsets[index]) == sum_value:
return subsets[index]
return None
def partition_three(input_array):
# Check if input_array sum is actually multiple of 3
input_array_sum = sum(input_array)
if input_array_sum%3 != 0:
return None
target_sum = input_array_sum/3
#Iterate over all subsets
for subset in get_subsets(input_array):
# Skip subsets where sum is not equal to target_sum
if get_subset_sum(input_array, subset) != target_sum:
continue
# If subset with target_sum is found, try finding two subsets with target_sum in remaining array
subset_1 = partition_two(input_array, subset, target_sum)
if subset_1 != None:
return [subset,\
subset_1,\
list(set([index for index in range(len(input_array))]) - set(subset) - set(subset_1))]
return None
print(partition_three([1, 1, 1 ])) # Possible. Subset indices are [[0], [1], [2]]
print(partition_three([1, 1, 1, 3, 2, 1 ])) # Possible. Subset indices are [[0, 1, 2], [3], [4, 5]]
print(partition_three([1, 1, 1, 1, 3, 2 ])) # Possible. Subset indices are [[0, 1, 2], [4], [3, 5]]
print(partition_three([1, 5 ])) # Not possible
print(partition_three([])) # Not possible
print(partition_three([7, 3, 2, 1, 5, 4, 8 ])) # Possible. Subset indices are [[0, 1], [3, 4, 5], [2, 6]
print(partition_three([1, 3, 6, 2, 7, 1, 2, 8 ])) # Possible. Subset indices are [[0, 1, 2], [3, 4, 5], [6, 7]]
print(partition_three([7, 6, 1, 7 ])) #Possible. Subset indices are [[0], [1, 2], [3]]
print(partition_three([7, 6, 2, 7 ])) # Not possible
print(partition_three([17, 17, 17, 34, 34, 34, 59, 59, 59])) # Possible. Subset indices are [[0, 3, 6], [1, 4, 7], [8, 2, 5]]
print(partition_three([20, 23, 25, 30, 49, 45, 27, 30, 30, 40, 22, 19])) # Possible. Subset indices are [[0, 2, 3, 5], [1, 6, 7, 9], [8, 10, 11, 4]]
Review ask
- Functional correctness
- Algorithm improvements
- Python usage
-
\$\begingroup\$ Your problem description has output of list values. Your solution outputs list indices. Why? \$\endgroup\$AJNeufeld– AJNeufeld2021年10月15日 21:18:11 +00:00Commented Oct 15, 2021 at 21:18
1 Answer 1
PEP 8
The Style Guide for Python Code has a several conventions all Python programmers should follow. You deviate from the guidelines in a few minor areas.
- Binary operators should be surrounded by a space. You have
2*sum_value
,input_array_sum%3
andinput_array_sum/3
, which should be written2 * sum_value
,input_array_sum % 3
andinput_array_sum / 3
- You have a long statement continued over 3 lines using backslashes. The backslashes are completely unnecessary, since the backslashes are contained inside brackets (
[...]
). The backslashes can be simply removed. - Spaces should not proceed
]
. Many of your test cases have a space before the closing]
, such as[1, 1, 1 ]
. The final space should be removed ([1, 1, 1]
)
Speed improvements
def get_subset_sum(input_array, subset_indices):
subset_sum = 0
for subset_index in subset_indices:
subset_sum = subset_sum + input_array[subset_index]
return subset_sum
This code creates a subset_sum
variable, assigns an integer to it, retrieves the value of the variable, adds another number to it (creating a new number object), and reassigns that to the variable, and repeats.
A minor improvement would be to use the +=
operator.
def get_subset_sum(input_array, subset_indices):
subset_sum = 0
for subset_index in subset_indices:
subset_sum += input_array[subset_index]
return subset_sum
but it would be much better to let the Python interpreter keep track of the intermediate results itself, without having to keep reassigning it to a variable:
def get_subset_sum(input_array, subset_indices):
return sum(input_array[subset_index] for subset_index in subset_indices)
Fast reject
You immediately return None
if the sum of the input array is not a multiple of 3. There are other checks you can make:
- If the array length is less that 3, you could immediately stop, since one or more partitions would be empty.
- If any element is larger that
target_sum
no solution is possible.
[1, 5]
fails both the above conditions, but you run the entire algorithm on the input anyway futilely searching for a solution. With 2 elements, the failure is fairly fast, but with 99 ones, and one 99, you start exhaustively looking for subsets that total 66, of which there are many, but one subset will always have to include the 99, which means no solution is possible.
Algorithmic improvement
Your example cases contain many duplicate values. By searching all subset of indices, there are 24 different ways of selecting 4 ones from a list containing 4 ones, but the order does not matter when you end up summing these. If instead you maintained a count of all values (ie, using collections.Counter
) you could avoid generating equivalent subsets which have already been examined, by selecting subsets containing 0 to n of those repeated values.
Explore related questions
See similar questions with these tags.