Pairwise Combinations Generator

Question 1

Below is an algorithm to generate all pairwise combinations for a set of random variables. See here for an explanation of what constitutes a minimal all-pairs set. The algorithm below works but I feel that it is inefficient. Any suggestions would be welcome.

from operator import itemgetter
from itertools import combinations, product, zip_longest, chain, permutations
def pairwise(*variables, **kwargs):
 num_vars = len(variables)
 #Organize the variable states into pairs for comparisons
 indexed_vars = sorted(
 ([(index, v) for v in vals] for index, vals in enumerate(variables)),
 key=len,
 reverse=True)
 var_combos = combinations(range(num_vars), 2)
 var_products = (product(indexed_vars[i], indexed_vars[j]) for i, j in var_combos)
 vars = chain.from_iterable(zip_longest(*var_products))
 #Initialize the builder list
 builders = []
 seen_pairs = set()
 #Main algorithm loop
 for pair in vars:
 if not pair or pair in seen_pairs:
 continue
 d_pair = dict(pair)
 #Linear search through builders to find an empty slot
 #First pass: look for sets that have intersections
 for builder in builders:
 intersection = builder.keys() & d_pair.keys()
 if len(intersection) == 1:
 v = intersection.pop()
 if builder[v] == d_pair[v]:
 builder.update(d_pair)
 #Update seen pairs
 seen_pairs.update(combinations(builder.items(), 2))
 break
 else:
 #Second Pass
 for builder in builders:
 intersection = builder.keys() & d_pair.keys()
 if not len(intersection):
 builder.update(d_pair)
 #Update seen pairs
 seen_pairs.update(combinations(builder.items(), 2))
 break
 else:
 #No empty slots/pairs identified. Make a new builder
 builders.append(d_pair)
 seen_pairs.add(pair)
 #Fill in remaining values
 complete_set = set(range(num_vars))
 defaults = {var[0][0]: var[0][1] for var in indexed_vars}
 for builder in builders:
 if len(builder) == num_vars:
 yield tuple(val for index, val in sorted(builder.items()))
 else:
 for key in complete_set - builder.keys():
 builder[key] = defaults[key]
 yield tuple(val for index, val in sorted(builder.items()))

Usage Example:

u_vars = [
 [1, 2, 3, 4],
 ["A", "B", "C"],
 ["a", "b"],
 ]
result = list(pairwise(*u_vars))
print(result) #prints list of length 12

Question 2

This is not generating correct pairwise combinations.

Input data

[['Requests'],
 ['Create','Edit'],
 ['Contact Name','Email','Phone','Subject','Description','Status','Product Name','Request Owner','Created By','Modified By','Request Id','Resolution','To Address','Account Name','Priority','Channel','Category','Sub Category'],
 ['None','is','isn\'t','contains','doesn\'t contain','starts with','ends with','is empty','is not empty'],
 ['Contact Name','Email','Phone','Subject','Description','Status','Product Name','Request Owner','Mark Spam','Resolution','To Address','Due Date','Priority','Channel','Category','Sub Category']]

Result by Microsoft PICT

http://pastebin.com/cZdND9UA

Result by your code http://pastebin.com/EC6xv4vG

This generates only for NONE in most of cases such as this

Requests Create Account Name None Category Requests Create Account Name None Channel Requests Create Account Name None Description Requests Create Account Name None Due Date

Question 3

Hmmm... I'm not sure why this is. When I was using smaller inputs, it was working fine. Let me find out what's wrong with it.

Question 4

Apparently, the problem is nontrivial. I don't really have time to compare algorithms but here is a paper that I found useful: research.ibm.com/haifa/projects/verification/mdt/papers/…

Question 5

Every other generator except PICT, I have tested failed in large sets not in small sets. Yes the problem is nontrivial one. I also working on it. If I port that algorithm I will post it.

Question 6

That would be awesome. I think the main problem is the heuristic I'm using to determine the order in which I add items to the array. I reformulated the code (same algorithm, just neater code) in the answer.

Rajasankar Rajasankar 1112 bronze badges · Answer 1 · 2012-08-06 04:14:48Z

This is not generating correct pairwise combinations.

Input data

[['Requests'],
 ['Create','Edit'],
 ['Contact Name','Email','Phone','Subject','Description','Status','Product Name','Request Owner','Created By','Modified By','Request Id','Resolution','To Address','Account Name','Priority','Channel','Category','Sub Category'],
 ['None','is','isn\'t','contains','doesn\'t contain','starts with','ends with','is empty','is not empty'],
 ['Contact Name','Email','Phone','Subject','Description','Status','Product Name','Request Owner','Mark Spam','Resolution','To Address','Due Date','Priority','Channel','Category','Sub Category']]

Result by Microsoft PICT

http://pastebin.com/cZdND9UA

Result by your code http://pastebin.com/EC6xv4vG

This generates only for NONE in most of cases such as this

Requests Create Account Name None Category Requests Create Account Name None Channel Requests Create Account Name None Description Requests Create Account Name None Due Date

Hmmm... I'm not sure why this is. When I was using smaller inputs, it was working fine. Let me find out what's wrong with it.
Apparently, the problem is nontrivial. I don't really have time to compare algorithms but here is a paper that I found useful: research.ibm.com/haifa/projects/verification/mdt/papers/…
Every other generator except PICT, I have tested failed in large sets not in small sets. Yes the problem is nontrivial one. I also working on it. If I port that algorithm I will post it.
That would be awesome. I think the main problem is the heuristic I'm using to determine the order in which I add items to the array. I reformulated the code (same algorithm, just neater code) in the answer.

Stack Exchange Network

Pairwise Combinations Generator

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Pairwise Combinations Generator

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions