I wrote a 2D greedy bin packing algorithm using Python 3.6
Heres a quick summary:
The algorithm consists of two classes (which I will attach at the end of this file along with a link to my github repo): BinPack and BinTree. The BinTree class is a tree that represents a single 2D bin. The BinPack class is the main interface, ranks BinTrees by available space, and selects BinTrees for item insertion.
Item insertion into a BinTree is done by traversing down from the root node until an unoccupied node is found which can fit the item. Then that node is resized to the size of the item, and two child nodes are created (if necessary) the size of the remainder space on the left and the bottom of the resized node.
BinPack picks which BinTree to insert into using an AVL Tree. Each BinTree Tree maintains a value for its largest unoccupied child. These values are used to rank all the BinTrees Trees in the AVL Tree. Then the best fit heuristic is used to pick the Tree where an insertion would result in the smallest leftover space. If all the Trees are full a new Tree is created.
Here is an example of usage:
In [15]: import binpack
In [34]: pack = binpack.BinPack(bin_size=(4, 8))
In [16]: pack.insert((2,2), (2,4), (4,5), (10,5))
In [17]: pack.print_stats()
Out[17]:
{0: {'area': 32,
'efficiency': 0.875,
'height': 8,
'items': [(CornerPoint(x=0, y=0), Item(width=4, height=5)),
(CornerPoint(x=0, y=5), Item(width=2, height=2)),
(CornerPoint(x=2, y=5), Item(width=2, height=2))],
'width': 4},
1: {'area': 32,
'efficiency': 0.25,
'height': 8,
'items': [(CornerPoint(x=0, y=0), Item(width=2, height=4))],
'width': 4},
2: {'area': 32,
'efficiency': 0.625,
'height': 8,
'items': [(CornerPoint(x=0, y=0), Item(width=4, height=5))],
'width': 4},
3: {'area': 32,
'efficiency': 0.25,
'height': 8,
'items': [(CornerPoint(x=0, y=0), Item(width=2, height=4))],
'width': 4},
'oversized': [Item(width=10, height=5)]}
My Questions:
Is my BinTree class a good way of representing 2D bins? It seemed really intuitive it to me but it does create complications such as:
Inserting a (3,4) item then a (1,8) item into a (4,8) Tree the (1,8) item wont fit. The (3,4) insertion will cause the BinTree to be resized to (3,4) with a right child sized (1,4) and bottom child sized (4,4). This leaves no single node with the space for the (1,8) item.
I know that AVL Trees are recommended for ranking bins in one dimensional greedy binpack algorithms. But with two dimensions the largest available space becomes less relevant. One Tree could have a bigger largest child area but then a second Tree but the second Tree could have a smaller empty node that would fit the item better. Rather then ranking each bin, would it be better to rank all the unoccupied nodes of all the trees in the AVL tree?
Lastly, are there any huge code smells or bad ideas in my implementation? I've largely come up with this algorithm based on 1D greedy algorithm examples i've seen online. Everything I find talking about 2+ dimensions tends to be highly technical and focus on mathematical proofs over implementation details.
My Code:
BinTree
from typing import NamedTuple
from collections import deque
class CornerPoint(NamedTuple):
"""
namedtuple representing the top left corner of each
BinTree object.
"""
x: int
y: int
class Item(NamedTuple):
"""
namedtuple representing objects added to BinTree.
"""
width: int
height: int
class BinTree:
"""
Each BinTree instance has two children (left and bottom)
and width (int), height (int), and occupied (bool) properties.
"""
def __init__(self, dims: tuple = (4, 8)) -> None:
self.corner = CornerPoint(0, 0)
self.dims = dims
self.occupied = False
self.parent = None
self.right = None
self.bottom = None
if not self.occupied:
self.largest_child = tuple(self.dims)
else:
self.largest_child = None
def _check_dim(item_dim: int, bin_dim: int) -> bool:
"""
Checks if the item will fit the bin in the specified
dimension.
"""
pass
def insert(self, item: Item) -> bool:
"""
Recursive item insertion
Takes an Item namedtuple
Inserts recursively as a side-effect
Returns True or False if Item fit in bin
"""
if not self.occupied and item.width <= self.dims[0] and item.height <= self.dims[1]:
if self.dims[1] - item.height > 0:
self.bottom = BinTree(dims=[self.dims[0], self.dims[1]-item.height])
self.bottom.parent = self
if self.dims[0] - item.width > 0:
self.right = BinTree(dims=[self.dims[0]-item.width, item.height])
self.right.parent = self
self.dims = (item.width, item.height)
self.occupied = item
if self.right:
self.right.corner = CornerPoint(self.dims[0] + self.corner.x, self.corner.y)
if self.bottom:
self.bottom.corner = CornerPoint(self.corner.x, self.dims[1] + self.corner.y)
self.calc_largest_child()
return True
else:
if (self.right and self.right.largest_child[0] >= item.width and
self.right.largest_child[1] >= item.height):
self.right.insert(item)
elif (self.bottom and self.bottom.largest_child[0] >= item.width and
self.bottom.largest_child[1] >= item.height):
self.bottom.insert(item)
else:
return False
def calc_largest_child(self) -> None:
"""
Updates self.largest_child for each node recursively
back to the root node
"""
choices = []
if not self.occupied:
choices.append(self.dims)
else:
choices.append((0, 0))
if self.right:
choices.append(self.right.largest_child)
else:
choices.append((0, 0))
if self.bottom:
choices.append(self.bottom.largest_child)
else:
choices.append((0, 0))
self.largest_child = max(choices, key=lambda t: t[0]*t[1])
if self.parent:
self.parent.calc_largest_child()
def bin_stats(root: BinTree) -> dict:
"""
Returns a dictionary with compiled stats on the bin tree
"""
stats = {
'width': 0,
'height': 0,
'area': 0,
'efficiency': 1.0,
'items': [],
}
stack = deque([root])
while stack:
node = stack.popleft()
stats['width'] += node.dims[0]
if node.right:
stack.append(node.right)
stack = deque([root])
while stack:
node = stack.popleft()
stats['height'] += node.dims[1]
if node.bottom:
stack.append(node.bottom)
stats['area'] = stats['width'] * stats['height']
stack = deque([root])
occupied = 0
while stack:
node = stack.popleft()
if node.occupied:
stats['items'].append((node.corner, node.occupied))
occupied += node.dims[0]*node.dims[1]
if node.right:
stack.append(node.right)
if node.bottom:
stack.append(node.bottom)
stats['efficiency'] = occupied / stats['area']
return stats
Binpack:
from collections import deque
from operator import mul as product
from . import avl_tree
from . import bintree
class BinPack:
"""
Bin Ranking System. the product of BinTree.largest_child
is used as a key value in an AVL Tree for ranking the bins.
"""
def __init__(self, bin_size: tuple = (4, 8), sorting: bool = True):
self.bin_dict = {'oversized': []}
self.bin_size = bin_size
self.tree = avl_tree.AvlTree()
self.bin_count = 0
self.sorting = sorting
self._add_bin()
def _add_bin(self) -> None:
"""
private method. Creates a new BinTree and adds
it to bin_dict and the AVL tree.
"""
self.bin_dict[self.bin_count] = bintree.BinTree(list(self.bin_size))
bin_key = product(*self.bin_dict[self.bin_count].largest_child)
self.tree.insert(bin_key)
self.tree[bin_key].data.append(self.bin_count)
self.bin_count += 1
def insert(self, *items: tuple, heuristic: str = 'best_fit') -> None:
if self.sorting == True:
items = sorted(items, key=lambda a: product(*a), reverse=True)
for item in items:
if heuristic == 'best_fit':
self.best_fit(item)
elif heuristic == 'next_fit':
self.next_fit(item)
def _check_oversized(self, item: bintree.Item) -> bool:
"""
Catch oversized items
"""
if item[0] > self.bin_size[0] or item[1] > self.bin_size[1]:
self.bin_dict['oversized'].append(item)
return False
return True
def best_fit(self, item_dims: tuple) -> bool:
"""
Best Fit Bin Selection
Public method.
Selects optimal BinTree (or creates
a new BinTree) for item insertion using first fit.
Returns BinTree ID.
"""
item_area = product(*item_dims)
item = bintree.Item(*item_dims)
# Catch oversized items:
if self._check_oversized(item) == False:
return False
queue = deque([self.tree.root])
best_fit = None
best_fit_node = None
while queue:
current_node = queue.popleft()
current_bintree = self.bin_dict[current_node.data[-1]]
largest_child = current_bintree.largest_child
if (largest_child[0] >= item.width and
largest_child[1] >= item.height):
if not best_fit or largest_child < best_fit.largest_child:
best_fit = current_bintree
best_fit_node = current_node
if current_node.left:
queue.append(current_node.left)
else:
if current_node.right:
queue.append(current_node.right)
if best_fit:
best_fit.insert(item)
# delete and reinsert node to update position in tree
nodeid = best_fit_node.data[-1]
old_key = best_fit_node.key
new_key = product(*best_fit.largest_child)
self.tree.delete(key=old_key)
self.tree.insert(key=new_key)
self.tree[new_key].data.append(nodeid)
return True
else:
self._add_bin()
self.bin_dict[self.bin_count-1].insert(item)
return True
def next_fit(self, item_dims: tuple) -> bool:
"""
First Fit Bin Selection
Public method.
Selects optimal BinTree (or creates
a new BinTree) for item insertion using first fit.
Returns BinTree ID.
"""
item_area = product(*item_dims)
item = bintree.Item(*item_dims)
# Catch oversized items:
if self._check_oversized(item) == False:
return False
queue = deque([self.tree.root])
while queue:
current_node = queue.popleft()
current_bintree = self.bin_dict[current_node.data[-1]]
largest_child = current_bintree.largest_child
if (largest_child[0] >= item.width and
largest_child[1] >= item.height):
current_bintree.insert(item)
# delete and reinsert node to update position in tree
nodeid = current_node.data[-1]
old_key = current_node.key
new_key = product(*current_bintree.largest_child)
self.tree.delete(key=old_key)
self.tree.insert(key=new_key)
self.tree[new_key].data.append(nodeid)
return True
else:
if current_node.right:
queue.append(current_node.right)
else:
self._add_bin()
self.next_fit(item_dims)
return False
def print_stats(self) -> None:
"""
Returns layouts for all BinTrees
"""
result = {}
for key, bin in self.bin_dict.items():
if key != 'oversized':
result[key] = bintree.bin_stats(bin)
result['oversized'] = self.bin_dict['oversized']
return result
1 Answer 1
- Is my BinTree class a good way of representing 2D bins?
Yes, it seems a good way to me. But then you complain about prematurely committing to a horizontal guillotine cut when you introduce the mismatched (1,8) item. Fair enough. So don't commit. Either stick with current representation and bifurcate with each insertion, storing a pair of horiz. / vert. competing trees, or better, change the representation so it reports a set of maximal rectangles that could be stored. They overlap so the sum of those areas would exceed bin size, but that's OK as you'll only consume them one at a time. A representation that could be attractive for small item counts would be simply a set of shapes known to fit, and each new candidate insertion would be free to permute the insertion order. For large item counts summarization over a subset of items would be needed.
- Rather then ranking each bin, would it be better to rank all the unoccupied nodes of all the trees in the AVL tree?
Yes, I agree this would be a useful direction to explore. Your question boils down to "2D packing is hard", which is what makes it such an interesting problem. You're not limited to online packing, so your ranking could exploit the certain knowledge that a particular unoccupied shape will be forever wasted when it is known that none of your items are small enough to fit in it.
- smells
Looks good to me. A datastructure design relying on premature commitment seems the biggest sticking point.
Details:
Kudos on using type annotation.
In the BinTree ctor, I don't understand the constant assignment followed by a conditional:
self.occupied = False
...
if not self.occupied:
self.largest_child = tuple(self.dims)
else:
self.largest_child = None
Looks like some left over code after a bunch of edits.
from . import avl_tree
AVL is peripheral to the problem you're tackling. Pypi lists a couple of packages. You found them unsuitable? You could use a conda environment.yml to keep other folks' libraries out of your source repo.
You use docstrings, which is good! But your comments tend to be a bit redundant, telling us e.g. that a class declared to inherit from NamedTuple is a namedtuple, or that leading underscore _add_bin()
is a private method. Yes, we can see that. Use English comments to speak in broad generalities, and code to speak of details. I found the ctor comment too vague: "the product of BinTree.largest_child is used as a key value in an AVL Tree for ranking the bins." It leaves me with questions. What is product? What invariant does the AVL tree maintain? (Yes, I know it stays balanced, but it's not immediately obvious how a scalar sort order relates to 2D packing, so describe what we rank on. Eventually we see it is "area", but you named the identifier bin_key
.) The API for _add_bin()
is a bit odd - consider making the caller responsible for passing in a new BinTree. When you evaluate list(self.bin_size)
, I don't understand why you go back and forth between a 2-tuple, which is natural, and a 2-element list, which is not.
self.sorting = sorting
Consider naming this is_sorting
.
if self.sorting == True:
Leave off == True
. Do take care to run your code through flake8
, and follow its advice.
items = sorted(items, key=lambda a: product(*a), reverse=True)
You're evaluating for side effects, so this would more naturally be done in place:
sort(items, key=lambda a: product(*a), reverse=True)
In the signature, consider making heuristic: str = 'best_fit'
an Enum. Given that there's two heuristics, consider making it boolean.
The comment "Catch oversized items" is vague and doesn't describe the .append()
side effect.
In the best_fit()
comment, "Public method" is redundant, you told us that already in the signature. The "using first fit" remark is slightly surprising, as best is not first, plus it is vague. You had an opportunity to tell us whether height or width would be considered first. The comment "Returns BinTree ID." appears to validate the aphorism that "Comments lie!", given that the signature explains we're returning a boolean. The longer a comment rambles on and on, the greater a chance it gets out of sync with current code.
item_area = product(*item_dims)
This is a well-chosen identifier, much better than bin_key
.
# Catch oversized items:
if self._check_oversized(item) == False:
The comment is redundant, and probably "Reject" more accurately describes relationship with caller than "Catch" does. Prefer not c
over c == False
. Consider inverting the meaning of the API for that helper.
if (largest_child[0] >= item.width and
largest_child[1] >= item.height):
This is perfectly nice code. You've been doing a nice job of peeling off helpers. Consider putting this condition into a trivial predicate, so the code can become a self-documenting "if items fits in largest_child".
The while queue
loop is very nice. I wouldn't mind seeing a comment on it describing the invariant it preserves.
self.tree.delete(key=old_key)
I don't understand that line. It isn't obvious to me what happens when a tree contains a 2x3 and a 6x1 item. I'm willing to believe the Right Thing happens, but your comments have not yet pointed out that the tree maps from area to a sequence of items, nor what it means for an item to appear earlier or later in that sequence.
def next_fit(self, item_dims: tuple) -> bool:
"""
First Fit Bin Selection
Public method.
You're killing me! Which is it? Next? or First? Pick a word and stick with it. Also, same redundant "public" remark as above, and bool is not an ID.
Little bit of copy-n-paste suggests there might be an opportunity to Extract Helper, no biggie.
def print_stats(self) -> None:
...
return result
Hmmm, in other languages the compiler just emits a fatal diagnostic. What python tools would help us catch this?
Also, the name should be get_stats()
. Consider using sorted() and/or OrderedDict.
Overall, this is a nice piece of code, easy to read. You have clearly been paying attention to code quality.
-
\$\begingroup\$ Thank you so much for the extensive analysis. I have actually completely rewritten and expanded this project. It not includes maximal rectangle packing as well as a number of other packing methods and optimiations (everything described in A Thousand Ways To Pack The Bin - Jukka Jylänki). I got rid of the tree datastructure and replaced it with a list of all free sub areas rectangles of the bin. Its a lot simplier and allows me to easily search a bin based on a variety of heuristics. \$\endgroup\$Solomon Bothwell– Solomon Bothwell2018年01月08日 03:17:58 +00:00Commented Jan 8, 2018 at 3:17
-
\$\begingroup\$ Your post is still super helpful and you are pointing out a lot of subtle details that I wouldnt ordinarily think about. Thank you. \$\endgroup\$Solomon Bothwell– Solomon Bothwell2018年01月08日 03:22:38 +00:00Commented Jan 8, 2018 at 3:22
-
\$\begingroup\$ @SolomonBothwell:
It not includes
It now includes? (as well asIts ... simplier
-> It's simpler or More simple) \$\endgroup\$greybeard– greybeard2018年01月10日 05:47:01 +00:00Commented Jan 10, 2018 at 5:47
Explore related questions
See similar questions with these tags.
anything I could do
No.BinTree
is frequently used for binary tree - you're screwed for good trying to getbin
read as bin.) Start with a short&simple description of the task to accomplish - reference welcome.BinTree
is funny in documenting an attributeleft
going on to useright
. It bears a whiff of Guillotine - the examples near the bottom of Cutting stock problem may be an example where that limits result quality.) \$\endgroup\$