Algorithm to optimize grouping

Question 1

I would like to know if there's a known algorithm or best practice way to do the following:

I have a collection with a subcollection, for example:

R1 R2 R3
-- -- --
M M M
N N
L L
 A

What i need is an algorithm to get the following result:

R1, R2: M N L
R2: A
R3: M

This is -not- what i want, it has more repeating values for R than the above:

R1, R2, R3: M
R1, R2: N L
R2: A

I need to group in way that i get the most optimized groups of R. The least amount of groups of R the better so i get the largest sub collections.

Another example (with the most obvious result):

R1 R2 R3
-- -- --
M M A
V V B
L L C

Should result in:

R1, R2: M V L
R3: A B C

I need to do this in LINQ/C#.

Any solutions? Tips? Links?

Question 2

It's not clear what you are trying to minimize or maximize. Do you always want to list the largest set common to more than one R group? Suppose you have R1: ABCD; R2: ABCD; R3: ABC ? Then would you like to get R1,R2: ABCD; R3: ABC ?

Question 3

I want the least combinations for R. The more "R"s in a group, the better. In your example: Yes, you're right.

Question 4

You're "not what I want" has the most R's in a group. Do you mean you want the least number of total R's in all groups ? I think that's what you're trying to say..

Question 5

Did anyone else initially misread the title as "algorithm to optimize groping"?

Question 6

You're right. I guess my explanation isn't the best. By the more "R"s in a group i meant that ideally i have them all in 1 group. The less they repeat the better.

Question 7

I think since you want the least total R's, you want to reverse your K/V and use the values as keys so your initial list becomes:

M: R1, R2, R3
N: R1, R2
L: R1, R2
A: R2

Then you can say the count of members intersecting ( http://msdn.microsoft.com/en-us/library/system.linq.enumerable.intersect.aspx ) of M and N is 2, which means you get n-(2 sets * 2 members) number of R's, and if you intersect L you get n-(3*2) number of R's. and if you intersect A as well you get n-(4*1) number of R's, so obviously your best choice is the second intersection set. Unfortunately this is a naive implementation with O(n!) time because you have to interset all possible sets to find the greatest set of intersections which will yield the largest minimization of R's to portray all sets.

After you find the best optimization in this technique where the largest intersection count * sets crossed, remove all the R's in the intersection set from the sets you intersected, then repeat the terribly inefficient combinatoric intersection comparisons against the sets as they are left. Rinse and repeat the comparisons to find largest intersections and remove them from the sets until no Rs are left in any sets.

There is surely a more efficient comparison technique than to intersect each set combination as I'm suggesting. I'm just suggesting an algorithm that should work (I believe) which is likely the naive approach.

Jimmy Hoffa Jimmy Hoffa 16.2k3 gold badges71 silver badges80 bronze badges · Answer 1 · 2012-11-28 18:26:55Z

I think since you want the least total R's, you want to reverse your K/V and use the values as keys so your initial list becomes:

M: R1, R2, R3
N: R1, R2
L: R1, R2
A: R2

Then you can say the count of members intersecting ( http://msdn.microsoft.com/en-us/library/system.linq.enumerable.intersect.aspx ) of M and N is 2, which means you get n-(2 sets * 2 members) number of R's, and if you intersect L you get n-(3*2) number of R's. and if you intersect A as well you get n-(4*1) number of R's, so obviously your best choice is the second intersection set. Unfortunately this is a naive implementation with O(n!) time because you have to interset all possible sets to find the greatest set of intersections which will yield the largest minimization of R's to portray all sets.

After you find the best optimization in this technique where the largest intersection count * sets crossed, remove all the R's in the intersection set from the sets you intersected, then repeat the terribly inefficient combinatoric intersection comparisons against the sets as they are left. Rinse and repeat the comparisons to find largest intersections and remove them from the sets until no Rs are left in any sets.

There is surely a more efficient comparison technique than to intersect each set combination as I'm suggesting. I'm just suggesting an algorithm that should work (I believe) which is likely the naive approach.

Stack Exchange Network

Algorithm to optimize grouping

1 Answer 1

There is surely a more efficient comparison technique than to intersect each set combination as I'm suggesting. I'm just suggesting an algorithm that should work (I believe) which is likely the naive approach.

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Algorithm to optimize grouping

1 Answer 1

There is surely a more efficient comparison technique than to intersect each set combination as I'm suggesting. I'm just suggesting an algorithm that should work (I believe) which is likely the naive approach.

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions