Subset counting algorithm

Question 1

I have a following problem i want to solve efficiently :

Input: Set s of {Pass,Fail}^k vectors, m - minimum percent of vectors

Output : Set of Sets of indexes: Where each set Contains indexes of Pass word in given vectors, and number of vectors that contains Pass in those indexes is over m percent from total vectors.

Example: {(Pass,Fail,Pass,Fail),(Pass,Fail,Pass,Pass)} where k is 4 and number of set elements is 2 and m 60% , output will be {{0,2},{0},{2}}

The output is group of sets where every set contains vector indexes that for at least 60% of vectors value was Pass

Question 2

Could you clarify what the output is? At present, it's very hard to understand. For example, what does {1,2} in your example correspond to?

Question 3

I am confused about the example -- the way I read your question, the output would be [ #2 ( 0, 2, 3 ) ], where #2 indicates the list of indices is from the second set. The first set fails because only 50% of the tests Pass. How does {{0,2},{0},{2}} convey this information? Note -- I assumed 0-based indexing for the set elements.

Question 4

Questions about algorithm design are on-topic here.

Question 5

This should give you the indices. The next step is to construct the power set out of these.

http://www.google.com/#sclient=psy&hl=en&source=hp&q=python+power+set+generator&pbx=1&oq=python+power+set&aq=1v&aqi=g1g-v1&aql=&gs_sm=e&gs_upl=771l3682l0l6333l16l13l0l2l2l0l263l1455l4.6.2l12l0&bav=on.2,or.r_gc.r_pw.&fp=116a3ca570808e30&biw=1536&bih=790

You can look up what enumerate and list comprehensions and tuples are. Let me know if you are stuck translating this into C#.

s = ((True,False,True,False),(True,False,True,True))
indecies = [0] + [i+1 for i, vctr in enumerate(s) if sum(vctr) >= 0.6 * len(el)]
indecies
[0, 2]

Notes: A) You always need to prepend the 0 index, which represents the empty set. B) The way in which I prepended it here is inefficient, but you can find a better way. C) The complexity so far is proportional to the total number of grades. D) You can make 60% a parameter rather than a hard-coded constant. E) It was hard to parse out your requirements, so I assumed that you wanted to filter out the loosing vectors, and then compute a power set of the remaining indices.

Job Job 6,4593 gold badges35 silver badges54 bronze badges · Answer 1 · 2011-08-14 16:31:26Z

This should give you the indices. The next step is to construct the power set out of these.

http://www.google.com/#sclient=psy&hl=en&source=hp&q=python+power+set+generator&pbx=1&oq=python+power+set&aq=1v&aqi=g1g-v1&aql=&gs_sm=e&gs_upl=771l3682l0l6333l16l13l0l2l2l0l263l1455l4.6.2l12l0&bav=on.2,or.r_gc.r_pw.&fp=116a3ca570808e30&biw=1536&bih=790

You can look up what enumerate and list comprehensions and tuples are. Let me know if you are stuck translating this into C#.

s = ((True,False,True,False),(True,False,True,True))
indecies = [0] + [i+1 for i, vctr in enumerate(s) if sum(vctr) >= 0.6 * len(el)]
indecies
[0, 2]

Notes: A) You always need to prepend the 0 index, which represents the empty set. B) The way in which I prepended it here is inefficient, but you can find a better way. C) The complexity so far is proportional to the total number of grades. D) You can make 60% a parameter rather than a hard-coded constant. E) It was hard to parse out your requirements, so I assumed that you wanted to filter out the loosing vectors, and then compute a power set of the remaining indices.

Stack Exchange Network

Subset counting algorithm

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Subset counting algorithm

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions