Below is an algorithm to generate all pairwise combinations for a set of random variables. See here for an explanation of what constitutes a minimal all-pairs set. The algorithm below works but I feel that it is inefficient. Any suggestions would be welcome.
from operator import itemgetter
from itertools import combinations, product, zip_longest, chain, permutations
def pairwise(*variables, **kwargs):
num_vars = len(variables)
#Organize the variable states into pairs for comparisons
indexed_vars = sorted(
([(index, v) for v in vals] for index, vals in enumerate(variables)),
key=len,
reverse=True)
var_combos = combinations(range(num_vars), 2)
var_products = (product(indexed_vars[i], indexed_vars[j]) for i, j in var_combos)
vars = chain.from_iterable(zip_longest(*var_products))
#Initialize the builder list
builders = []
seen_pairs = set()
#Main algorithm loop
for pair in vars:
if not pair or pair in seen_pairs:
continue
d_pair = dict(pair)
#Linear search through builders to find an empty slot
#First pass: look for sets that have intersections
for builder in builders:
intersection = builder.keys() & d_pair.keys()
if len(intersection) == 1:
v = intersection.pop()
if builder[v] == d_pair[v]:
builder.update(d_pair)
#Update seen pairs
seen_pairs.update(combinations(builder.items(), 2))
break
else:
#Second Pass
for builder in builders:
intersection = builder.keys() & d_pair.keys()
if not len(intersection):
builder.update(d_pair)
#Update seen pairs
seen_pairs.update(combinations(builder.items(), 2))
break
else:
#No empty slots/pairs identified. Make a new builder
builders.append(d_pair)
seen_pairs.add(pair)
#Fill in remaining values
complete_set = set(range(num_vars))
defaults = {var[0][0]: var[0][1] for var in indexed_vars}
for builder in builders:
if len(builder) == num_vars:
yield tuple(val for index, val in sorted(builder.items()))
else:
for key in complete_set - builder.keys():
builder[key] = defaults[key]
yield tuple(val for index, val in sorted(builder.items()))
Usage Example:
u_vars = [
[1, 2, 3, 4],
["A", "B", "C"],
["a", "b"],
]
result = list(pairwise(*u_vars))
print(result) #prints list of length 12
1 Answer 1
This is not generating correct pairwise combinations.
Input data
[['Requests'],
['Create','Edit'],
['Contact Name','Email','Phone','Subject','Description','Status','Product Name','Request Owner','Created By','Modified By','Request Id','Resolution','To Address','Account Name','Priority','Channel','Category','Sub Category'],
['None','is','isn\'t','contains','doesn\'t contain','starts with','ends with','is empty','is not empty'],
['Contact Name','Email','Phone','Subject','Description','Status','Product Name','Request Owner','Mark Spam','Resolution','To Address','Due Date','Priority','Channel','Category','Sub Category']]
Result by Microsoft PICT
Result by your code http://pastebin.com/EC6xv4vG
This generates only for NONE in most of cases such as this
Requests Create Account Name None Category Requests Create Account Name None Channel Requests Create Account Name None Description Requests Create Account Name None Due Date
-
\$\begingroup\$ Hmmm... I'm not sure why this is. When I was using smaller inputs, it was working fine. Let me find out what's wrong with it. \$\endgroup\$Joel Cornett– Joel Cornett2012年08月06日 17:24:52 +00:00Commented Aug 6, 2012 at 17:24
-
\$\begingroup\$ Apparently, the problem is nontrivial. I don't really have time to compare algorithms but here is a paper that I found useful: research.ibm.com/haifa/projects/verification/mdt/papers/… \$\endgroup\$Joel Cornett– Joel Cornett2012年08月07日 00:10:04 +00:00Commented Aug 7, 2012 at 0:10
-
\$\begingroup\$ Every other generator except PICT, I have tested failed in large sets not in small sets. Yes the problem is nontrivial one. I also working on it. If I port that algorithm I will post it. \$\endgroup\$Rajasankar– Rajasankar2012年08月07日 01:40:51 +00:00Commented Aug 7, 2012 at 1:40
-
\$\begingroup\$ That would be awesome. I think the main problem is the heuristic I'm using to determine the order in which I add items to the array. I reformulated the code (same algorithm, just neater code) in the answer. \$\endgroup\$Joel Cornett– Joel Cornett2012年08月07日 02:18:33 +00:00Commented Aug 7, 2012 at 2:18