Is it possible to make this grouped permutation algorithm more efficient?

Question 1

The following program starts with hundreds of permutations and shrinks them down to a small result set. Furthermore, I am only looking for the first valid result. What techniques can I use to make this more efficient?

#!/usr/bin/python3
from itertools import permutations, zip_longest
def get_result():
 checked = []
 for permutation in permutations_of_items_in_bins(items='aabbb', number_of_bins=3, items_per_bin=2):
 if permutation in checked:
 continue
 checked.append(permutation)
 if is_valid_result(permutation):
 return permutation
 # this is for debug only
 for x in sorted(checked):
 print(_format(x))
 return None
def is_valid_result(result):
 return False # in reality, this may return True
# from: http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks
def _get_chunks(n, iterable, padvalue=None):
 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
 return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)
def _format(iterable):
 return ','.join([''.join(c) for c in iterable])
def permutations_of_items_in_bins(items: str, items_per_bin: int, number_of_bins: int, pad='z'):
 # if not enough items to fit in all the bins, pad them with a value representing "empty"
 items = items.rjust(number_of_bins * items_per_bin, pad)
 # get all possible sort orders of the items
 _permutations = permutations(items)
 # that are unique
 unique_permutations = set(_permutations)
 # for each sort order, split them into bins in that order
 for permutation in unique_permutations:
 # split items into bins
 chunks = list(_get_chunks(items_per_bin, permutation))
 # item sort order in bins doesn't matter, so let's always sort the items in each bin
 normalized = [tuple(sorted(x)) for x in chunks]
 yield normalized
get_result()
"""
aa,bb,bz
aa,bz,bb
ab,ab,bz
ab,az,bb
ab,bb,az
ab,bz,ab
az,ab,bb
az,bb,ab
bb,aa,bz
bb,ab,az
bb,az,ab
bb,bz,aa
bz,aa,bb
bz,ab,ab
bz,bb,aa
"""

Question 2

Have you tried profiling it yet? Where are the hot spots?

Question 3

Techniques are no good without knowing what you need to fix. So do that first.

Question 4

I avoided profiling it because I want to rewrite this to other programming languages. I understand Python itertools uses optimization in C code, while other languages may not have that. But I thought the problem here was pretty obvious: the processing time goes up (exponentially?) as you increase the number of items, number of bins, and size of bins. I think @idan-arye's answer solves that problem brilliantly!

Question 5

This is my first time posting an algorithm question. Maybe there is a better way I could have explained that?

Question 6

So what if you're going to use different languages? If your algorithm is doing something it doesn't need to do, it will waste time in any language. Anything you fix in python will carry over to language X. As I said do this first, and do it again in whatever language.

Question 7

Instead of generation everything and removing the duplicates, generate them without duplicates in the first place:

def permutations_of_items_in_bins(items, items_per_bin, number_of_bins, pad='z'):
 # Convert items to a dictionary that tracks how many we have of each
 pile = {}
 for item in items:
 try:
 pile[item] += 1
 except KeyError:
 pile[item] = 1
 # Pad to make sure we can generate combinations of the required length
 num_missing = items_per_bin * number_of_bins - sum(pile.values())
 if 0 < num_missing:
 try:
 pile[pad] += num_missing
 except KeyError:
 pile[pad] = num_missing
 # We need this for iteration on possibilities and for ordering the bins
 unique_items = list(sorted(pile.keys()))
 # While each combination is yielded the pile is updated to remove the items in that combination
 def combinations_consuming(size=items_per_bin, items=unique_items):
 if size == 0:
 yield ()
 return
 items = items[:] # create a new list
 while items:
 item = items[0]
 if pile[item]:
 pile[item] -= 1
 for suffix in combinations_consuming(size - 1, items):
 yield (item, *suffix)
 pile[item] += 1
 items.pop(0)
 def generate_bins(num=number_of_bins):
 if num == 0:
 yield ()
 return
 for combination in combinations_consuming():
 # The nested call to generate_bins will not be able to use the items already in combination
 for suffix in generate_bins(num - 1):
 yield (combination, *suffix)
 yield from generate_bins()

Question 8

This is excellent! It even yields already-sorted results. Thank you! By any chance, does it use any algorithm or concept that has a name?

Idan Arye Idan Arye 12.1k33 silver badges43 bronze badges · Accepted Answer · 2016-10-27 01:47:26Z

Instead of generation everything and removing the duplicates, generate them without duplicates in the first place:

def permutations_of_items_in_bins(items, items_per_bin, number_of_bins, pad='z'):
 # Convert items to a dictionary that tracks how many we have of each
 pile = {}
 for item in items:
 try:
 pile[item] += 1
 except KeyError:
 pile[item] = 1
 # Pad to make sure we can generate combinations of the required length
 num_missing = items_per_bin * number_of_bins - sum(pile.values())
 if 0 < num_missing:
 try:
 pile[pad] += num_missing
 except KeyError:
 pile[pad] = num_missing
 # We need this for iteration on possibilities and for ordering the bins
 unique_items = list(sorted(pile.keys()))
 # While each combination is yielded the pile is updated to remove the items in that combination
 def combinations_consuming(size=items_per_bin, items=unique_items):
 if size == 0:
 yield ()
 return
 items = items[:] # create a new list
 while items:
 item = items[0]
 if pile[item]:
 pile[item] -= 1
 for suffix in combinations_consuming(size - 1, items):
 yield (item, *suffix)
 pile[item] += 1
 items.pop(0)
 def generate_bins(num=number_of_bins):
 if num == 0:
 yield ()
 return
 for combination in combinations_consuming():
 # The nested call to generate_bins will not be able to use the items already in combination
 for suffix in generate_bins(num - 1):
 yield (combination, *suffix)
 yield from generate_bins()

This is excellent! It even yields already-sorted results. Thank you! By any chance, does it use any algorithm or concept that has a name?

Stack Exchange Network

Is it possible to make this grouped permutation algorithm more efficient?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Is it possible to make this grouped permutation algorithm more efficient?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions