
The following program starts with hundreds of permutations and shrinks them down to a small result set. Furthermore, I am only looking for the first valid result. What techniques can I use to make this more efficient?

#!/usr/bin/python3
from itertools import permutations, zip_longest

def get_result():
    checked = []
    for permutation in permutations_of_items_in_bins(items='aabbb', number_of_bins=3, items_per_bin=2):
        if permutation in checked:
            continue
        checked.append(permutation)
        if is_valid_result(permutation):
            return permutation
    # this is for debug only
    for x in sorted(checked):
        print(_format(x))
    return None

def is_valid_result(result):
    return False  # in reality, this may return True

# from: http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks
def _get_chunks(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

def _format(iterable):
    return ','.join([''.join(c) for c in iterable])

def permutations_of_items_in_bins(items: str, items_per_bin: int, number_of_bins: int, pad='z'):
    # if not enough items to fit in all the bins, pad them with a value representing "empty"
    items = items.rjust(number_of_bins * items_per_bin, pad)
    # get all possible sort orders of the items
    _permutations = permutations(items)
    # that are unique
    unique_permutations = set(_permutations)
    # for each sort order, split them into bins in that order
    for permutation in unique_permutations:
        # split items into bins
        chunks = list(_get_chunks(items_per_bin, permutation))
        # item sort order in bins doesn't matter, so let's always sort the items in each bin
        normalized = [tuple(sorted(x)) for x in chunks]
        yield normalized

get_result()
"""
aa,bb,bz
aa,bz,bb
ab,ab,bz
ab,az,bb
ab,bb,az
ab,bz,ab
az,ab,bb
az,bb,ab
bb,aa,bz
bb,ab,az
bb,az,ab
bb,bz,aa
bz,aa,bb
bz,ab,ab
bz,bb,aa
"""
asked Oct 26, 2016 at 23:11
  • Have you tried profiling it yet? Where are the hot spots? Commented Oct 26, 2016 at 23:49
  • Techniques are no good without knowing what you need to fix. So do that first. Commented Oct 27, 2016 at 12:29
  • I avoided profiling it because I want to rewrite this to other programming languages. I understand Python itertools uses optimization in C code, while other languages may not have that. But I thought the problem here was pretty obvious: the processing time goes up (exponentially?) as you increase the number of items, number of bins, and size of bins. I think @idan-arye's answer solves that problem brilliantly! Commented Oct 27, 2016 at 15:25
  • This is my first time posting an algorithm question. Maybe there is a better way I could have explained that? Commented Oct 27, 2016 at 15:32
  • So what if you're going to use different languages? If your algorithm is doing something it doesn't need to do, it will waste time in any language. Anything you fix in python will carry over to language X. As I said do this first, and do it again in whatever language. Commented Oct 27, 2016 at 16:45
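As the commenters suggest, profiling answers "where is the time going?" before any rewrite. A minimal cProfile sketch, using a small stand-in workload that mirrors the question's list-based duplicate check rather than the full program:

```python
import cProfile
import io
import pstats
from itertools import permutations

# Stand-in workload mirroring the question's duplicate check:
# membership tests against a list are O(n), which profiling makes visible.
def dedup_with_list():
    checked = []
    for p in permutations('aabbbz'):
        if p not in checked:
            checked.append(p)
    return len(checked)

profiler = cProfile.Profile()
profiler.enable()
dedup_with_list()
profiler.disable()

# Print the ten most expensive calls by cumulative time
s = io.StringIO()
pstats.Stats(profiler, stream=s).sort_stats('cumulative').print_stats(10)
print(s.getvalue())
```

The same exercise in any target language (e.g. a sampling profiler) carries over, which is the commenters' point: the algorithmic cost dominates regardless of language.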

1 Answer


Instead of generating everything and then removing the duplicates, generate them without duplicates in the first place:

def permutations_of_items_in_bins(items, items_per_bin, number_of_bins, pad='z'):
    # Convert items to a dictionary that tracks how many we have of each
    pile = {}
    for item in items:
        try:
            pile[item] += 1
        except KeyError:
            pile[item] = 1
    # Pad to make sure we can generate combinations of the required length
    num_missing = items_per_bin * number_of_bins - sum(pile.values())
    if 0 < num_missing:
        try:
            pile[pad] += num_missing
        except KeyError:
            pile[pad] = num_missing
    # We need this for iteration on possibilities and for ordering the bins
    unique_items = list(sorted(pile.keys()))

    # While each combination is yielded the pile is updated to remove the items in that combination
    def combinations_consuming(size=items_per_bin, items=unique_items):
        if size == 0:
            yield ()
            return
        items = items[:]  # create a new list
        while items:
            item = items[0]
            if pile[item]:
                pile[item] -= 1
                for suffix in combinations_consuming(size - 1, items):
                    yield (item, *suffix)
                pile[item] += 1
            items.pop(0)

    def generate_bins(num=number_of_bins):
        if num == 0:
            yield ()
            return
        for combination in combinations_consuming():
            # The nested call to generate_bins will not be able to use the items already in combination
            for suffix in generate_bins(num - 1):
                yield (combination, *suffix)

    yield from generate_bins()
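As a quick cross-check (not part of the answer; a standalone sketch reusing the question's own brute-force approach), the expected number of unique arrangements for `items='aabbb'`, 3 bins of 2 can be counted independently, and it matches the 15 lines of the question's debug listing:

```python
from itertools import permutations, zip_longest

def brute_force_count(items, items_per_bin, number_of_bins, pad='z'):
    # Mirror the question's approach: pad, permute, chunk, sort each bin, dedup
    items = items.rjust(number_of_bins * items_per_bin, pad)
    seen = set()
    for p in permutations(items):
        chunks = zip_longest(*[iter(p)] * items_per_bin)
        seen.add(tuple(tuple(sorted(c)) for c in chunks))
    return len(seen)

print(brute_force_count('aabbb', 2, 3))  # → 15, one per line of the debug listing
```

The answer's generator should yield exactly this many tuples, but without ever materializing the 720 raw permutations.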
answered Oct 27, 2016 at 1:47
  • This is excellent! It even yields already-sorted results. Thank you! By any chance, does it use any algorithm or concept that has a name? Commented Oct 27, 2016 at 15:18
