Here is a recursive implementation of the Subset sum problem:
using System;
namespace Exercise
{
class SubsetSum
{
static void Main(string[] args)
{
set = new int[] { 2, 3, 1, -1};
PrintSet(set, "Initial Set.");
sum = 4;
Console.WriteLine("Wanted sum = {0}", sum);
FindSubsetSum();
}
//-------------------------------------------------------------
static int[] set;
static int[] subSetIndexes;
static int sum;
static int numberOfSubsetSums;
//------------------------------------------------------------
/*
Method: FindSubsetSum()
*/
private static void FindSubsetSum()
{
numberOfSubsetSums = 0;
int numberOfElements = set.Length;
FindPowerSet(numberOfElements);
}
//-------------------------------------------------------------
/*
Method: FindPowerSet(int n, int k)
*/
private static void FindPowerSet(int n)
{
// Super set - all sets with size: 0, 1, ..., n - 1
for (int k = 0; k <= n - 1; k++)
{
subSetIndexes = new int[k];
CombinationsNoRepetition(k, 0, n - 1);
}
if (numberOfSubsetSums == 0)
{
Console.WriteLine("No subsets with wanted sum exist.");
}
}
//-------------------------------------------------------------
/*
Method: CombinationsNoRepetition(int k, int iBegin, int iEnd);
*/
private static void CombinationsNoRepetition(int k, int iBegin, int iEnd)
{
if (k == 0)
{
PrintSubSet();
return;
}
for (int i = iBegin; i <= iEnd; i++)
{
subSetIndexes[k - 1] = i;
++iBegin;
CombinationsNoRepetition(k - 1, iBegin, iEnd);
}
}
}
}
Input:
-
Output:
Initial Set.
{2 ,3 ,1 ,-1}
Wanted sum = 4
(1 ,3)
(-1 ,3 ,2)
The algorithm is based on finding the Power set except the trivial empty set of indexes, which is in turn based on finding all the combinationswithout repetition of indexes with size 1, 2,..., n. Then the sets of indexes corresponding to array elements with the wanted sum are printed.
Any comments regarding the style and implementation will be appreciated.
Is the complexity of this algorithm 2n, where n - number of elements in the array?
Is there more efficient approach to that problem, probably iteration instead of recursion?
Here are the helper functions:
private static void PrintSubSet()
{
int currentSubsetSum = 0;
// accumulate sum of current subset
for (int i = 0; i < subSetIndexes.Length; i++)
{
currentSubsetSum += set[subSetIndexes[i]];
}
// if wanted sum: print current subset elements
if (currentSubsetSum == sum)
{
++numberOfSubsetSums;
Console.Write("(");
for (int i = 0; i < subSetIndexes.Length; i++)
{
Console.Write(set[subSetIndexes[i]]);
if (i < subSetIndexes.Length - 1)
{
Console.Write(" ,");
}
}
Console.WriteLine(")");
}
}
//-------------------------------------------------------------
private static void PrintSet(int[] arr, string label = "")
{
Console.WriteLine(label);
Console.Write("{");
for (int i = 0; i < arr.Length; i++)
{
Console.Write(arr[i]);
if (i < arr.Length - 1)
{
Console.Write(" ,");
}
}
Console.WriteLine("}");
}
2 Answers 2
My major critique of your code is that you mix up all kinds of concerns all over the place. The code that computes the subsets also does the printing and the counting and all that stuff.
Some thoughts on how to improve this:
The problem is the subset sum problem. Its input is a set of integers. So why do you do everything in arrays? Make a class
Set
, or use an existing class, and have that represent your set of integers.Once you have that class, build basic operations out of it. So for example, a
Set<T>
should have aPowerSet
member, which returnsSet<Set<T>>
, or even better,EnumeratePowerSet
, which returnsIEnumerable<Set<T>>
.Now that you have those operations, you can use LINQ to do queries against them.
In short, I'd like to see a method like this:
// Takes a set and a sum, returns a sequence of subsets that
// sum to the given value.
static IEnumerable<Set<int>> SubsetSum(Set<int> items, int sum)
{
return from subset in items.EnumeratePowerSet()
where subset.Sum() == sum
select subset;
}
static void Display(IEnumerable<Set<int>> subsets)
{
if (!subsets.Any())
Console.WriteLine("No subsets sum to that value.");
else
foreach(var subset in subsets)
PrintSubset(subset);
Now, can you implement the EnumeratePowerSet
method on Set<T>
?
Is the complexity of this algorithm 2n, where n - number of elements in the array?
Yes.
Is there more efficient approach to that problem, probably iteration instead of recursion?
There is no known algorithm that does better than exponential in general. (There is no proof that there is no such algorithm. The discovery of such an algorithm or a proof that none exists would be a major result in computer science.)
However there are algorithms that do better on typical subset sum problems given certain constraints. There is a huge amount of research on this problem; I leave it to you to consult your local university library.
-
\$\begingroup\$ Thank you once more for the great answer! I should acquaint myself with the mechanics of iterators and class inheritance in C# as my current knowledge on the subject is coming mostly from C++. Having said that, from what I understand
Set<int>
must be a generic class inheriting from, let's sayList
and should allow iteration (implement IEnumerable)? The memberAny()
is "is empty" returningtrue
if there are subsets. Now,EnumeratePowerSet()
should involve some LINQ and a lambda (need to learn as well) expression taking advantage of the class properties? \$\endgroup\$Ziezi– Ziezi2016年10月14日 06:25:34 +00:00Commented Oct 14, 2016 at 6:25 -
1\$\begingroup\$ @Ziezi there's an
ISet<T>
interface in System.Collections.Generic you could use as a base for your set. There's alsoHashSet<T>
if you'd like to use an existing implementation of a set. \$\endgroup\$Johnbot– Johnbot2016年10月14日 08:13:34 +00:00Commented Oct 14, 2016 at 8:13
Complexity
The time complexity of your algorithm is \$O(n*2^n)\,ドル because you find \2ドル^n\$ subsets, and then you compute the sum of each subset which takes \$O(n)\$ time per subset. It would be faster if you kept a running sum as you generated each subset, so that when you reached the end of the generating a subset you would have the sum already computed. That way you could achieve the desired \$O(2^n)\$ time complexity.
Here is how you could do this:
private static void CombinationsNoRepetition(int k, int iBegin, int iEnd, int curSum)
{
if (k == 0)
{
if (curSum == sum)
PrintSubSet();
return;
}
for (int i = iBegin; i <= iEnd; i++)
{
subSetIndexes[k - 1] = i;
++iBegin;
CombinationsNoRepetition(k - 1, iBegin, iEnd, curSum + set[i]);
}
}
where your initial call to the function passes in 0
for curSum
.
Bug
Your program doesn't find an answer if the entire set is a solution. For example, change your input set to {1, 1, 1, 1}
and it won't find any solution. You need to change this line in FindPowerSet()
:
for (int k = 0; k <= n - 1; k++)
to this
for (int k = 0; k <= n; k++)
Other things
- It's confusing that you call your array variable
set
. - It's extremely hard to understand your code because your functions do things other than what their names imply. At first, I couldn't even figure out how your program worked until I found out that
PrintSubSet()
was responsible for determining whether the current subset was a solution. I thinkCheckSubSet()
would be a better name. - You use static variables all over so it's hard to know what functions are using/modifying what variables. For a toy problem such as this, it's not a big deal. But in general, whenever you write functions that have side effects (meaning they change global state somewhere), it's becomes difficult to reason through what happens when you call those functions. It would be better if your functions accepted their input through arguments and returned their results through return values. So instead of just
CheckSubSet()
, it would beisSubSet = CheckSubSet(subset)
.