Array partition using dynamic programming

Question 1

What modification should I apply to the dynamic programming implementation of two partition problem to solve the following task:

You are given an array of positive integers as input, denote it C. The program should decide if it is possible to partition the array into two equal sum subsequences. You are allowed to remove some elements from the array, but not all, in order to make such partition feasible.

Example:

Suppose the input is 4 5 11 17 9. Two partition is possible if we remove 11 and 17. My question is what adjustments to my two partition implementation I should make to determine if two partition is possible (may or may not require to remove some elements) or output that two partition is impossible even if some elements are removed. The program should run in O(sum^2 * C) time.

Here is my two partition implementation in Python:

def two_partition(C):
 n = len(C)
 s = sum(C)
 
 if s % 2 != 0: return False
 
 T = [[False for _ in range(n + 1)] for _ in range(s//2 + 1)]
 for i in range(n + 1): T[0][i] = True
 
 for i in range(1, s//2 + 1):
 for j in range(1, n + 1):
 T[i][j] = T[i][j-1]
 if i >= C[j-1]:
 T[i][j] = T[i][j] or T[i-C[j-1]][j-1]
 
 return T[s // 2][n]

For example, with input [2, 3, 1] the expected output is {2,1} and {3}. This makes it is possible to partition the array into two equal subsets. We don't need to remove any elements in this case. In the above example of 4 5 11 17 9, the two subsets are possible if we remove 11 and 17. This leaves {4,5} and {9}.

Question 2

Start with determining how your table should be dimensioned. Your current algo calls for a C*s/2 table, and the complexity is O(C*s). The required algo should have O(C*s*s) complexity, so...

Question 3

Create a 3 dimensional array indexed by sum of 1st partition, sum of 2nd partition and number of elements. T[i][j][k] if only true if it's possible to have two disjoint subsets with sum i & j respectively within the first k elements.

To calculate it, you need to consider three possibilities for each element. Either it's present in first set, or second set, or it's removed entirely. Doing this in a loop for each combination of sum possible generates the required array in O(sum ^ 2 * C).

To find the answer to your question, all you need to check is that there is some sum i such that T[i][i][n] is true. This implies that there are two distinct subsets both of which sum to i, as required by the question.

If you need to find the actual subsets, doing so is easy using a simple backtracking function. Just check which of the three possibilities are possible in the back_track functions and recurse.

Here's a sample implementation:

def back_track(T, C, s1, s2, i):
 if s1 == 0 and s2 == 0: return [], []
 if T[s1][s2][i-1]:
 return back_track(T, C, s1, s2, i-1)
 elif s1 >= C[i-1] and T[s1 - C[i-1]][s2][i-1]:
 a, b = back_track(T, C, s1 - C[i-1], s2, i-1)
 return ([C[i-1]] + a, b)
 else:
 a, b = back_track(T, C, s1, s2 - C[i-1], i-1)
 return (a, [C[i-1]] + b)
def two_partition(C):
 n = len(C)
 s = sum(C)
 T = [[[False for _ in range(n + 1)] for _ in range(s//2 + 1)] for _ in range(s // 2 + 1)]
 for i in range(n + 1): T[0][0][i] = True
 for s1 in range(0, s//2 + 1):
 for s2 in range(0, s//2 + 1):
 for j in range(1, n + 1):
 T[s1][s2][j] = T[s1][s2][j-1]
 if s1 >= C[j-1]:
 T[s1][s2][j] = T[s1][s2][j] or T[s1-C[j-1]][s2][j-1]
 if s2 >= C[j-1]:
 T[s1][s2][j] = T[s1][s2][j] or T[s1][s2-C[j-1]][j-1]
 for i in range(1, s//2 + 1):
 if T[i][i][n]:
 return back_track(T, C, i, i, n)
 return False
print(two_partition([4, 5, 11, 9]))
print(two_partition([2, 3, 1]))
print(two_partition([2, 3, 7]))

Question 4

@גלעדברקן Whoops, the function was returning true or false so I totally missed that part. It can be easily done by backtracking on successful i & j. If T1[i][j-1] is true as well, then jth element is skipped, else it is needed. decrement j until you hit 0. Apply the same procedure in reverse for T2.

Question 5

Just want to clarify what is the role of T2 and why the last loop guarantees that two partition is possible. Can you briefly explain?

Question 6

@user1812 it doesn't. This function returns False for [2,3,1]. The idea is broken and cannot be fixed. Your complexity estimate is correct.

Question 7

@n.m. But, the answer should be false for [2, 3, 1]. The partition is supposed to be in two subarrays, not subsets. How is that possible for [2, 3, 1]?

Question 8

@user1812 The last loop guarantees that there is some sum i, such that you can find a subset among first j elements and a subset from j+1 onwards both of which equal i. Removing all the elements other than the those two subsets will lead to two subarrays with equal sum.

Question 9

To determine if it's possible, keep a set of unique differences between the two parts. For each element, iterate over the differences seen so far; subtract and add the element. We're looking for the difference 0.

4 5 11 17 9
0 (empty parts)
|0 ± 4| = 4
set now has 4 and empty-parts-0
|0 ± 5| = 5
|4 - 5| = 1
|4 + 5| = 9
set now has 4,5,1,9 and empty-parts-0
|0 ± 11| = 11
|4 - 11| = 7
|4 + 11| = 15
|5 - 11| = 6
|5 + 11| = 16
|1 - 11| = 10
|1 + 11| = 12
|9 - 11| = 2
|9 + 11| = 20
... (iteration with 17)
|0 ± 9| = 9
|4 - 9| = 5
|4 + 9| = 13
|5 - 9| = 4
|5 + 9| = 14
|1 - 9| = 8
|1 + 9| = 10
|9 - 9| = 0
Bingo!

Python code:

def f(C):
 diffs = set()
 for n in C:
 new_diffs = [n]
 for d in diffs:
 if d - n == 0:
 return True
 new_diffs.extend([abs(d - n), abs(d + n)])
 diffs = diffs.union(new_diffs)
 return False

Output:

> f([2, 3, 7, 2])
=> True
> f([2, 3, 7])
=> False
> f([7, 1000007, 1000000])
=> True

Question 10

Dynamic programming solution is preferred. Also please justify your solution, i.e why your algorithm works and running time.

Question 11

@user1812 It seems like dynamic programming to me - the overlapping subproblems are the achievable differences between the two disjoint subsets. It works because it tries to place each element either in one subset (-) or the other (+) and leaves the choice of neither in the set of seen differences. Running time is O(|C| * number-of-achievable-differences).

Question 12

I quickly adapted code for searching of three equal-sums subsets to given problem.

Algorithm tries to put every item A[idx] in the first bag, or in the second bag (both are real bags) or in the third (fake) bag (ignored items). Initial values (available space) in the real bags are half of overall sum. This approach as-is has exponential complexity (decision tree with 3^N leaves)

But there is a lot of repeating distributions, so we can remember some state and ignore branches with no chance, so a kind of DP - memoization is used. Here mentioned state is set of available space in real bags when we use items from the last index to idx inclusively.

Possible size of state storage might reach N * sum/2 * sum/2

Working Delphi code (is not thoroughly tested, seems has a bug with ignored items output)

function Solve2(A: TArray<Integer>): string;
var
 Map: TDictionary<string, boolean>;
 Lists: array of TStringList;
 found: Boolean;
 s2: integer;
function CheckSubsetsWithItem(Subs: TArray<Word>; idx: Int16): boolean;
var
 key: string;
 i: Integer;
begin
 if (Subs[0] = Subs[1]) and (Subs[0] <> s2) then begin
 found:= True;
 Exit(True);
 end;
 if idx < 0 then
 Exit(False);
 //debug map contains current rests of sums in explicit representation
 key := Format('%d_%d_%d', [subs[0], subs[1], idx]);
 if Map.ContainsKey(key) then
 //memoisation
 Result := Map.Items[key]
 else begin
 Result := false;
 //try to put A[idx] into the first, second bag or ignore it
 for i := 0 to 2 do begin
 if Subs[i] >= A[idx] then begin
 Subs[i] := Subs[i] - A[idx];
 Result := CheckSubsetsWithItem(Subs, idx - 1);
 if Result then begin
 //retrieve subsets themselves at recursion unwindning
 if found then
 Lists[i].Add(A[idx].ToString);
 break;
 end
 else
 //reset sums before the next try
 Subs[i] := Subs[i] + A[idx];
 end;
 end;
 //remember result - memoization
 Map.add(key, Result);
 end;
end;
var
 n, sum: Integer;
 Subs: TArray<Word>;
begin
 n := Length(A);
 sum := SumInt(A);
 s2 := sum div 2;
 found := False;
 Map := TDictionary<string, boolean>.Create;
 SetLength(Lists, 3);
 Lists[0] := TStringList.Create;
 Lists[1] := TStringList.Create;
 Lists[2] := TStringList.Create;
 if CheckSubsetsWithItem([s2, s2, sum], n - 1) then begin
 Result := '[' + Lists[0].CommaText + '], ' +
 '[' + Lists[1].CommaText + '], ' +
 ' ignored: [' + Lists[2].CommaText + ']';
 end else
 Result := 'No luck :(';
end;
begin
 Memo1.Lines.Add(Solve2([1, 5, 4, 3, 2, 16,21,44, 19]));
 Memo1.Lines.Add(Solve2([1, 3, 9, 27, 81, 243, 729, 6561]));
end;
[16,21,19], [1,5,4,2,44], ignored: [3]
No luck :(

merlyn 2,3711 gold badge20 silver badges26 bronze badges · Accepted Answer · 2018-11-12 06:20:11Z

Create a 3 dimensional array indexed by sum of 1st partition, sum of 2nd partition and number of elements. T[i][j][k] if only true if it's possible to have two disjoint subsets with sum i & j respectively within the first k elements.

To calculate it, you need to consider three possibilities for each element. Either it's present in first set, or second set, or it's removed entirely. Doing this in a loop for each combination of sum possible generates the required array in O(sum ^ 2 * C).

To find the answer to your question, all you need to check is that there is some sum i such that T[i][i][n] is true. This implies that there are two distinct subsets both of which sum to i, as required by the question.

If you need to find the actual subsets, doing so is easy using a simple backtracking function. Just check which of the three possibilities are possible in the back_track functions and recurse.

Here's a sample implementation:

def back_track(T, C, s1, s2, i):
 if s1 == 0 and s2 == 0: return [], []
 if T[s1][s2][i-1]:
 return back_track(T, C, s1, s2, i-1)
 elif s1 >= C[i-1] and T[s1 - C[i-1]][s2][i-1]:
 a, b = back_track(T, C, s1 - C[i-1], s2, i-1)
 return ([C[i-1]] + a, b)
 else:
 a, b = back_track(T, C, s1, s2 - C[i-1], i-1)
 return (a, [C[i-1]] + b)
def two_partition(C):
 n = len(C)
 s = sum(C)
 T = [[[False for _ in range(n + 1)] for _ in range(s//2 + 1)] for _ in range(s // 2 + 1)]
 for i in range(n + 1): T[0][0][i] = True
 for s1 in range(0, s//2 + 1):
 for s2 in range(0, s//2 + 1):
 for j in range(1, n + 1):
 T[s1][s2][j] = T[s1][s2][j-1]
 if s1 >= C[j-1]:
 T[s1][s2][j] = T[s1][s2][j] or T[s1-C[j-1]][s2][j-1]
 if s2 >= C[j-1]:
 T[s1][s2][j] = T[s1][s2][j] or T[s1][s2-C[j-1]][j-1]
 for i in range(1, s//2 + 1):
 if T[i][i][n]:
 return back_track(T, C, i, i, n)
 return False
print(two_partition([4, 5, 11, 9]))
print(two_partition([2, 3, 1]))
print(two_partition([2, 3, 7]))

@גלעדברקן Whoops, the function was returning true or false so I totally missed that part. It can be easily done by backtracking on successful i & j. If T1[i][j-1] is true as well, then jth element is skipped, else it is needed. decrement j until you hit 0. Apply the same procedure in reverse for T2.
Just want to clarify what is the role of T2 and why the last loop guarantees that two partition is possible. Can you briefly explain?
@user1812 it doesn't. This function returns False for [2,3,1]. The idea is broken and cannot be fixed. Your complexity estimate is correct.
@n.m. But, the answer should be false for [2, 3, 1]. The partition is supposed to be in two subarrays, not subsets. How is that possible for [2, 3, 1]?
@user1812 The last loop guarantees that there is some sum i, such that you can find a subset among first j elements and a subset from j+1 onwards both of which equal i. Removing all the elements other than the those two subsets will lead to two subarrays with equal sum.

CollectivesTM on Stack Overflow

Array partition using dynamic programming

3 Answers 3

18 Comments

2 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

CollectivesTM on Stack Overflow

3 Answers 3

18 Comments

2 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related