Algorithm to find minimum number of elements that occur in all sets

Question 1

I have n sets that each contain some elements, for example:

{2, 3}, {6, 3}, {2, 7, 3}, {7}

Now the goal is to find the minimum number of elements that occur in all sets. For this example it would be 2 ({3, 7}). Some more examples:

{2}, {4} - minimum is 2 ({2, 4})
{4, 15, 3}, {3, 2}, {24, 4, 2}, {24}, {15, 6} - minimum is 3 ({15, 2, 24} or {15, 3, 24} or {3, 24, 6})

I don't need to know the set of those minimum number of elements necessarily, just the size of that set is enough.

I've written a recursive algorithm myself already that goes over all possibilities, but the execution time really gets out of hand as the number of different elements rises. Is there an algorithm available (maybe already written but I couldn't find it) for this specific problem? I've looked at the set cover problem, it looks a bit like this one but it's not exactly the same.

Question 2

This could be a better fit on Code Review if you were to post your current code.

Question 3

Thinking aloud: I would take an iterative approach. You resultSet starts with all elements that are members of singleton sets. In your original example, 7. Then, I would remove 7 from all sets (removing {7} entirely rather than leaving it empty), and calculate the intersection of all remaining sets. If the intersection is empty, you find the smallest set, and add all of its elements to the result, and repeat. If it's not empty (in this case, it will be {3}), remove the intersection elements from all sets. Repeat until all original sets are "checked off".

Question 4

This might be more appropriate for cs.stackexchange.com who deal more with these kinds of cs/algorithm problems. Their answers are more language agnostic (psuedo-code and such), but should set you in the right direction with regards to the overall approach

Question 5

Never mind, my greedy approach takes too many elements. The intersection of all remaining sets contains more elements than strictly necessary. Of each such intersection, only one of its elements is necessary to "hit" all of the input sets. Picking one of them strategically is ... hard.

Question 6

The problem you are describing is the hitting set problem, and as you suggest is closely related to the set cover problem. Finding a hitting set is NP-hard, so it is no surprise that your implementation will be slow as the number of elements rises!

You can approximate the solution with various approaches such as a greedy algorithm, but this is very much a case of 'pick your battles': whatever approach you choose, you are trading correctness for time.

Question 7

To get from this problem to set cover, you have to reverse the relationship between sets and numbers. So instead of putting number 3 in sets 1, 2 and 3, you make set 3 contain numbers 1, 2 and 3. Then it's set cover.

Question 8

The problem is NP-complete. That means (look it up in CS) nobody has found an algorithm that is always fast, and likely never will.

You can try to keep the execution time lower. Remove all empty sets. If there is a set with a single element, like 24 in your example, that number goes to your solution because you need it, then you remove that number from the problem. If there is an element common to all sets then you take it and have the solution. If a set is a subset of another, remove the superset. If an element is only in one set, remove it. If an element always comes with the same other element then remove it. So far it’s easy.

Then you switch to your recursive approach. Count how many sets contain each element, and you want to try the ones contained in the largest number of sets first. Keep track of the best solution found and stop examining across paths that can’t give better results. For example if you have 100 sets and covered them with 10 numbers, then you only check numbers recursively that are in at least 12 sets because now you only care about nine or fewer numbers covering all sets.

Aaron Stockdill Aaron Stockdill 542 bronze badges · Accepted Answer · 2020-10-05 15:00:23Z

The problem you are describing is the hitting set problem, and as you suggest is closely related to the set cover problem. Finding a hitting set is NP-hard, so it is no surprise that your implementation will be slow as the number of elements rises!

You can approximate the solution with various approaches such as a greedy algorithm, but this is very much a case of 'pick your battles': whatever approach you choose, you are trading correctness for time.

To get from this problem to set cover, you have to reverse the relationship between sets and numbers. So instead of putting number 3 in sets 1, 2 and 3, you make set 3 contain numbers 1, 2 and 3. Then it's set cover.

Stack Exchange Network

Algorithm to find minimum number of elements that occur in all sets

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Algorithm to find minimum number of elements that occur in all sets

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions