2D Bin Packing Algorithm Implementation

Question 1

I wrote a 2D greedy bin packing algorithm using Python 3.6

Heres a quick summary:

The algorithm consists of two classes (which I will attach at the end of this file along with a link to my github repo): BinPack and BinTree. The BinTree class is a tree that represents a single 2D bin. The BinPack class is the main interface, ranks BinTrees by available space, and selects BinTrees for item insertion.

Item insertion into a BinTree is done by traversing down from the root node until an unoccupied node is found which can fit the item. Then that node is resized to the size of the item, and two child nodes are created (if necessary) the size of the remainder space on the left and the bottom of the resized node.

BinPack picks which BinTree to insert into using an AVL Tree. Each BinTree Tree maintains a value for its largest unoccupied child. These values are used to rank all the BinTrees Trees in the AVL Tree. Then the best fit heuristic is used to pick the Tree where an insertion would result in the smallest leftover space. If all the Trees are full a new Tree is created.

Here is an example of usage:

In [15]: import binpack
In [34]: pack = binpack.BinPack(bin_size=(4, 8))
In [16]: pack.insert((2,2), (2,4), (4,5), (10,5))
In [17]: pack.print_stats()
Out[17]:
{0: {'area': 32,
 'efficiency': 0.875,
 'height': 8,
 'items': [(CornerPoint(x=0, y=0), Item(width=4, height=5)),
 (CornerPoint(x=0, y=5), Item(width=2, height=2)),
 (CornerPoint(x=2, y=5), Item(width=2, height=2))],
 'width': 4},
 1: {'area': 32,
 'efficiency': 0.25,
 'height': 8,
 'items': [(CornerPoint(x=0, y=0), Item(width=2, height=4))],
 'width': 4},
 2: {'area': 32,
 'efficiency': 0.625,
 'height': 8,
 'items': [(CornerPoint(x=0, y=0), Item(width=4, height=5))],
 'width': 4},
 3: {'area': 32,
 'efficiency': 0.25,
 'height': 8,
 'items': [(CornerPoint(x=0, y=0), Item(width=2, height=4))],
 'width': 4},
 'oversized': [Item(width=10, height=5)]}

My Questions:

Is my BinTree class a good way of representing 2D bins? It seemed really intuitive it to me but it does create complications such as:

Inserting a (3,4) item then a (1,8) item into a (4,8) Tree the (1,8) item wont fit. The (3,4) insertion will cause the BinTree to be resized to (3,4) with a right child sized (1,4) and bottom child sized (4,4). This leaves no single node with the space for the (1,8) item.
I know that AVL Trees are recommended for ranking bins in one dimensional greedy binpack algorithms. But with two dimensions the largest available space becomes less relevant. One Tree could have a bigger largest child area but then a second Tree but the second Tree could have a smaller empty node that would fit the item better. Rather then ranking each bin, would it be better to rank all the unoccupied nodes of all the trees in the AVL tree?
Lastly, are there any huge code smells or bad ideas in my implementation? I've largely come up with this algorithm based on 1D greedy algorithm examples i've seen online. Everything I find talking about 2+ dimensions tends to be highly technical and focus on mathematical proofs over implementation details.

My Code:

BinTree

from typing import NamedTuple
from collections import deque
class CornerPoint(NamedTuple):
 """
 namedtuple representing the top left corner of each
 BinTree object.
 """
 x: int
 y: int
class Item(NamedTuple):
 """
 namedtuple representing objects added to BinTree.
 """
 width: int
 height: int
class BinTree:
 """
 Each BinTree instance has two children (left and bottom)
 and width (int), height (int), and occupied (bool) properties.
 """
 def __init__(self, dims: tuple = (4, 8)) -> None:
 self.corner = CornerPoint(0, 0)
 self.dims = dims
 self.occupied = False
 self.parent = None
 self.right = None
 self.bottom = None
 if not self.occupied:
 self.largest_child = tuple(self.dims)
 else:
 self.largest_child = None
 def _check_dim(item_dim: int, bin_dim: int) -> bool:
 """
 Checks if the item will fit the bin in the specified
 dimension.
 """
 pass
 def insert(self, item: Item) -> bool:
 """
 Recursive item insertion
 Takes an Item namedtuple
 Inserts recursively as a side-effect
 Returns True or False if Item fit in bin
 """
 if not self.occupied and item.width <= self.dims[0] and item.height <= self.dims[1]:
 if self.dims[1] - item.height > 0:
 self.bottom = BinTree(dims=[self.dims[0], self.dims[1]-item.height])
 self.bottom.parent = self
 if self.dims[0] - item.width > 0:
 self.right = BinTree(dims=[self.dims[0]-item.width, item.height])
 self.right.parent = self
 self.dims = (item.width, item.height)
 self.occupied = item
 if self.right:
 self.right.corner = CornerPoint(self.dims[0] + self.corner.x, self.corner.y)
 if self.bottom:
 self.bottom.corner = CornerPoint(self.corner.x, self.dims[1] + self.corner.y)
 self.calc_largest_child()
 return True
 else:
 if (self.right and self.right.largest_child[0] >= item.width and
 self.right.largest_child[1] >= item.height):
 self.right.insert(item)
 elif (self.bottom and self.bottom.largest_child[0] >= item.width and
 self.bottom.largest_child[1] >= item.height):
 self.bottom.insert(item)
 else:
 return False
 def calc_largest_child(self) -> None:
 """
 Updates self.largest_child for each node recursively
 back to the root node
 """
 choices = []
 if not self.occupied:
 choices.append(self.dims)
 else:
 choices.append((0, 0))
 if self.right:
 choices.append(self.right.largest_child)
 else:
 choices.append((0, 0))
 if self.bottom:
 choices.append(self.bottom.largest_child)
 else:
 choices.append((0, 0))
 self.largest_child = max(choices, key=lambda t: t[0]*t[1])
 if self.parent:
 self.parent.calc_largest_child()
def bin_stats(root: BinTree) -> dict:
 """
 Returns a dictionary with compiled stats on the bin tree
 """
 stats = {
 'width': 0,
 'height': 0,
 'area': 0,
 'efficiency': 1.0,
 'items': [],
 }
 stack = deque([root])
 while stack:
 node = stack.popleft()
 stats['width'] += node.dims[0]
 if node.right:
 stack.append(node.right)
 stack = deque([root])
 while stack:
 node = stack.popleft()
 stats['height'] += node.dims[1]
 if node.bottom:
 stack.append(node.bottom)
 stats['area'] = stats['width'] * stats['height']
 stack = deque([root])
 occupied = 0
 while stack:
 node = stack.popleft()
 if node.occupied:
 stats['items'].append((node.corner, node.occupied))
 occupied += node.dims[0]*node.dims[1]
 if node.right:
 stack.append(node.right)
 if node.bottom:
 stack.append(node.bottom)
 stats['efficiency'] = occupied / stats['area']
 return stats

Binpack:

from collections import deque
from operator import mul as product
from . import avl_tree
from . import bintree
class BinPack:
 """
 Bin Ranking System. the product of BinTree.largest_child
 is used as a key value in an AVL Tree for ranking the bins.
 """
 def __init__(self, bin_size: tuple = (4, 8), sorting: bool = True):
 self.bin_dict = {'oversized': []}
 self.bin_size = bin_size
 self.tree = avl_tree.AvlTree()
 self.bin_count = 0
 self.sorting = sorting
 self._add_bin()
 def _add_bin(self) -> None:
 """
 private method. Creates a new BinTree and adds
 it to bin_dict and the AVL tree.
 """
 self.bin_dict[self.bin_count] = bintree.BinTree(list(self.bin_size))
 bin_key = product(*self.bin_dict[self.bin_count].largest_child)
 self.tree.insert(bin_key)
 self.tree[bin_key].data.append(self.bin_count)
 self.bin_count += 1
 def insert(self, *items: tuple, heuristic: str = 'best_fit') -> None:
 if self.sorting == True:
 items = sorted(items, key=lambda a: product(*a), reverse=True)
 for item in items:
 if heuristic == 'best_fit':
 self.best_fit(item)
 elif heuristic == 'next_fit':
 self.next_fit(item)
 def _check_oversized(self, item: bintree.Item) -> bool:
 """
 Catch oversized items
 """
 if item[0] > self.bin_size[0] or item[1] > self.bin_size[1]:
 self.bin_dict['oversized'].append(item)
 return False
 return True
 def best_fit(self, item_dims: tuple) -> bool:
 """
 Best Fit Bin Selection
 Public method.
 Selects optimal BinTree (or creates
 a new BinTree) for item insertion using first fit.
 Returns BinTree ID.
 """
 item_area = product(*item_dims)
 item = bintree.Item(*item_dims)
 # Catch oversized items:
 if self._check_oversized(item) == False:
 return False
 queue = deque([self.tree.root])
 best_fit = None
 best_fit_node = None
 while queue:
 current_node = queue.popleft()
 current_bintree = self.bin_dict[current_node.data[-1]]
 largest_child = current_bintree.largest_child
 if (largest_child[0] >= item.width and
 largest_child[1] >= item.height):
 if not best_fit or largest_child < best_fit.largest_child:
 best_fit = current_bintree
 best_fit_node = current_node
 if current_node.left:
 queue.append(current_node.left)
 else:
 if current_node.right:
 queue.append(current_node.right)
 if best_fit:
 best_fit.insert(item)
 # delete and reinsert node to update position in tree
 nodeid = best_fit_node.data[-1]
 old_key = best_fit_node.key
 new_key = product(*best_fit.largest_child)
 self.tree.delete(key=old_key)
 self.tree.insert(key=new_key)
 self.tree[new_key].data.append(nodeid)
 return True
 else:
 self._add_bin()
 self.bin_dict[self.bin_count-1].insert(item)
 return True
 def next_fit(self, item_dims: tuple) -> bool:
 """
 First Fit Bin Selection
 Public method.
 Selects optimal BinTree (or creates
 a new BinTree) for item insertion using first fit.
 Returns BinTree ID.
 """
 item_area = product(*item_dims)
 item = bintree.Item(*item_dims)
 # Catch oversized items:
 if self._check_oversized(item) == False:
 return False
 queue = deque([self.tree.root])
 while queue:
 current_node = queue.popleft()
 current_bintree = self.bin_dict[current_node.data[-1]]
 largest_child = current_bintree.largest_child
 if (largest_child[0] >= item.width and
 largest_child[1] >= item.height):
 current_bintree.insert(item)
 # delete and reinsert node to update position in tree
 nodeid = current_node.data[-1]
 old_key = current_node.key
 new_key = product(*current_bintree.largest_child)
 self.tree.delete(key=old_key)
 self.tree.insert(key=new_key)
 self.tree[new_key].data.append(nodeid)
 return True
 else:
 if current_node.right:
 queue.append(current_node.right)
 else:
 self._add_bin()
 self.next_fit(item_dims)
 return False
 def print_stats(self) -> None:
 """
 Returns layouts for all BinTrees
 """
 result = {}
 for key, bin in self.bin_dict.items():
 if key != 'oversized':
 result[key] = bintree.bin_stats(bin)
 result['oversized'] = self.bin_dict['oversized']
 return result

https://github.com/ssbothwell/BinPack

Question 2

META: I noticed im getting lots of views and no replies. Is there anything I could do to improve the post?

Question 3

((Usually, I just parenthesise META comment(part)s.) (anything I could do No. BinTree is frequently used for binary tree - you're screwed for good trying to get bin read as bin.) Start with a short&simple description of the task to accomplish - reference welcome. BinTree is funny in documenting an attribute left going on to use right. It bears a whiff of Guillotine - the examples near the bottom of Cutting stock problem may be an example where that limits result quality.)

Question 4

Is my BinTree class a good way of representing 2D bins?

Yes, it seems a good way to me. But then you complain about prematurely committing to a horizontal guillotine cut when you introduce the mismatched (1,8) item. Fair enough. So don't commit. Either stick with current representation and bifurcate with each insertion, storing a pair of horiz. / vert. competing trees, or better, change the representation so it reports a set of maximal rectangles that could be stored. They overlap so the sum of those areas would exceed bin size, but that's OK as you'll only consume them one at a time. A representation that could be attractive for small item counts would be simply a set of shapes known to fit, and each new candidate insertion would be free to permute the insertion order. For large item counts summarization over a subset of items would be needed.

Rather then ranking each bin, would it be better to rank all the unoccupied nodes of all the trees in the AVL tree?

Yes, I agree this would be a useful direction to explore. Your question boils down to "2D packing is hard", which is what makes it such an interesting problem. You're not limited to online packing, so your ranking could exploit the certain knowledge that a particular unoccupied shape will be forever wasted when it is known that none of your items are small enough to fit in it.

smells

Looks good to me. A datastructure design relying on premature commitment seems the biggest sticking point.

Details:

Kudos on using type annotation.

In the BinTree ctor, I don't understand the constant assignment followed by a conditional:

 self.occupied = False
 ...
 if not self.occupied:
 self.largest_child = tuple(self.dims)
 else:
 self.largest_child = None

Looks like some left over code after a bunch of edits.

from . import avl_tree

AVL is peripheral to the problem you're tackling. Pypi lists a couple of packages. You found them unsuitable? You could use a conda environment.yml to keep other folks' libraries out of your source repo.

You use docstrings, which is good! But your comments tend to be a bit redundant, telling us e.g. that a class declared to inherit from NamedTuple is a namedtuple, or that leading underscore _add_bin() is a private method. Yes, we can see that. Use English comments to speak in broad generalities, and code to speak of details. I found the ctor comment too vague: "the product of BinTree.largest_child is used as a key value in an AVL Tree for ranking the bins." It leaves me with questions. What is product? What invariant does the AVL tree maintain? (Yes, I know it stays balanced, but it's not immediately obvious how a scalar sort order relates to 2D packing, so describe what we rank on. Eventually we see it is "area", but you named the identifier bin_key.) The API for _add_bin() is a bit odd - consider making the caller responsible for passing in a new BinTree. When you evaluate list(self.bin_size), I don't understand why you go back and forth between a 2-tuple, which is natural, and a 2-element list, which is not.

 self.sorting = sorting

Consider naming this is_sorting.

 if self.sorting == True:

Leave off == True. Do take care to run your code through flake8, and follow its advice.

 items = sorted(items, key=lambda a: product(*a), reverse=True)

You're evaluating for side effects, so this would more naturally be done in place:

 sort(items, key=lambda a: product(*a), reverse=True)

In the signature, consider making heuristic: str = 'best_fit' an Enum. Given that there's two heuristics, consider making it boolean.

The comment "Catch oversized items" is vague and doesn't describe the .append() side effect.

In the best_fit() comment, "Public method" is redundant, you told us that already in the signature. The "using first fit" remark is slightly surprising, as best is not first, plus it is vague. You had an opportunity to tell us whether height or width would be considered first. The comment "Returns BinTree ID." appears to validate the aphorism that "Comments lie!", given that the signature explains we're returning a boolean. The longer a comment rambles on and on, the greater a chance it gets out of sync with current code.

 item_area = product(*item_dims)

This is a well-chosen identifier, much better than bin_key.

 # Catch oversized items:
 if self._check_oversized(item) == False:

The comment is redundant, and probably "Reject" more accurately describes relationship with caller than "Catch" does. Prefer not c over c == False. Consider inverting the meaning of the API for that helper.

 if (largest_child[0] >= item.width and
 largest_child[1] >= item.height):

This is perfectly nice code. You've been doing a nice job of peeling off helpers. Consider putting this condition into a trivial predicate, so the code can become a self-documenting "if items fits in largest_child".

The while queue loop is very nice. I wouldn't mind seeing a comment on it describing the invariant it preserves.

 self.tree.delete(key=old_key)

I don't understand that line. It isn't obvious to me what happens when a tree contains a 2x3 and a 6x1 item. I'm willing to believe the Right Thing happens, but your comments have not yet pointed out that the tree maps from area to a sequence of items, nor what it means for an item to appear earlier or later in that sequence.

def next_fit(self, item_dims: tuple) -> bool:
 """
 First Fit Bin Selection
 Public method.

You're killing me! Which is it? Next? or First? Pick a word and stick with it. Also, same redundant "public" remark as above, and bool is not an ID.

Little bit of copy-n-paste suggests there might be an opportunity to Extract Helper, no biggie.

def print_stats(self) -> None:
 ...
 return result

Hmmm, in other languages the compiler just emits a fatal diagnostic. What python tools would help us catch this?

Also, the name should be get_stats(). Consider using sorted() and/or OrderedDict.

Overall, this is a nice piece of code, easy to read. You have clearly been paying attention to code quality.

Question 5

Thank you so much for the extensive analysis. I have actually completely rewritten and expanded this project. It not includes maximal rectangle packing as well as a number of other packing methods and optimiations (everything described in A Thousand Ways To Pack The Bin - Jukka Jylänki). I got rid of the tree datastructure and replaced it with a list of all free sub areas rectangles of the bin. Its a lot simplier and allows me to easily search a bin based on a variety of heuristics.

Question 6

Your post is still super helpful and you are pointing out a lot of subtle details that I wouldnt ordinarily think about. Thank you.

Question 7

@SolomonBothwell: It not includes It now includes? (as well as Its ... simplier -> It's simpler or More simple)

J_H J_H 41.7k3 gold badges38 silver badges157 bronze badges · Accepted Answer · 2018-01-06 19:21:43Z

Is my BinTree class a good way of representing 2D bins?

Yes, it seems a good way to me. But then you complain about prematurely committing to a horizontal guillotine cut when you introduce the mismatched (1,8) item. Fair enough. So don't commit. Either stick with current representation and bifurcate with each insertion, storing a pair of horiz. / vert. competing trees, or better, change the representation so it reports a set of maximal rectangles that could be stored. They overlap so the sum of those areas would exceed bin size, but that's OK as you'll only consume them one at a time. A representation that could be attractive for small item counts would be simply a set of shapes known to fit, and each new candidate insertion would be free to permute the insertion order. For large item counts summarization over a subset of items would be needed.

Rather then ranking each bin, would it be better to rank all the unoccupied nodes of all the trees in the AVL tree?

Yes, I agree this would be a useful direction to explore. Your question boils down to "2D packing is hard", which is what makes it such an interesting problem. You're not limited to online packing, so your ranking could exploit the certain knowledge that a particular unoccupied shape will be forever wasted when it is known that none of your items are small enough to fit in it.

smells

Looks good to me. A datastructure design relying on premature commitment seems the biggest sticking point.

Details:

Kudos on using type annotation.

In the BinTree ctor, I don't understand the constant assignment followed by a conditional:

 self.occupied = False
 ...
 if not self.occupied:
 self.largest_child = tuple(self.dims)
 else:
 self.largest_child = None

Looks like some left over code after a bunch of edits.

from . import avl_tree

AVL is peripheral to the problem you're tackling. Pypi lists a couple of packages. You found them unsuitable? You could use a conda environment.yml to keep other folks' libraries out of your source repo.

You use docstrings, which is good! But your comments tend to be a bit redundant, telling us e.g. that a class declared to inherit from NamedTuple is a namedtuple, or that leading underscore _add_bin() is a private method. Yes, we can see that. Use English comments to speak in broad generalities, and code to speak of details. I found the ctor comment too vague: "the product of BinTree.largest_child is used as a key value in an AVL Tree for ranking the bins." It leaves me with questions. What is product? What invariant does the AVL tree maintain? (Yes, I know it stays balanced, but it's not immediately obvious how a scalar sort order relates to 2D packing, so describe what we rank on. Eventually we see it is "area", but you named the identifier bin_key.) The API for _add_bin() is a bit odd - consider making the caller responsible for passing in a new BinTree. When you evaluate list(self.bin_size), I don't understand why you go back and forth between a 2-tuple, which is natural, and a 2-element list, which is not.

 self.sorting = sorting

Consider naming this is_sorting.

 if self.sorting == True:

Leave off == True. Do take care to run your code through flake8, and follow its advice.

 items = sorted(items, key=lambda a: product(*a), reverse=True)

You're evaluating for side effects, so this would more naturally be done in place:

 sort(items, key=lambda a: product(*a), reverse=True)

In the signature, consider making heuristic: str = 'best_fit' an Enum. Given that there's two heuristics, consider making it boolean.

The comment "Catch oversized items" is vague and doesn't describe the .append() side effect.

In the best_fit() comment, "Public method" is redundant, you told us that already in the signature. The "using first fit" remark is slightly surprising, as best is not first, plus it is vague. You had an opportunity to tell us whether height or width would be considered first. The comment "Returns BinTree ID." appears to validate the aphorism that "Comments lie!", given that the signature explains we're returning a boolean. The longer a comment rambles on and on, the greater a chance it gets out of sync with current code.

 item_area = product(*item_dims)

This is a well-chosen identifier, much better than bin_key.

 # Catch oversized items:
 if self._check_oversized(item) == False:

The comment is redundant, and probably "Reject" more accurately describes relationship with caller than "Catch" does. Prefer not c over c == False. Consider inverting the meaning of the API for that helper.

 if (largest_child[0] >= item.width and
 largest_child[1] >= item.height):

This is perfectly nice code. You've been doing a nice job of peeling off helpers. Consider putting this condition into a trivial predicate, so the code can become a self-documenting "if items fits in largest_child".

The while queue loop is very nice. I wouldn't mind seeing a comment on it describing the invariant it preserves.

 self.tree.delete(key=old_key)

I don't understand that line. It isn't obvious to me what happens when a tree contains a 2x3 and a 6x1 item. I'm willing to believe the Right Thing happens, but your comments have not yet pointed out that the tree maps from area to a sequence of items, nor what it means for an item to appear earlier or later in that sequence.

def next_fit(self, item_dims: tuple) -> bool:
 """
 First Fit Bin Selection
 Public method.

You're killing me! Which is it? Next? or First? Pick a word and stick with it. Also, same redundant "public" remark as above, and bool is not an ID.

Little bit of copy-n-paste suggests there might be an opportunity to Extract Helper, no biggie.

def print_stats(self) -> None:
 ...
 return result

Hmmm, in other languages the compiler just emits a fatal diagnostic. What python tools would help us catch this?

Also, the name should be get_stats(). Consider using sorted() and/or OrderedDict.

Overall, this is a nice piece of code, easy to read. You have clearly been paying attention to code quality.

Thank you so much for the extensive analysis. I have actually completely rewritten and expanded this project. It not includes maximal rectangle packing as well as a number of other packing methods and optimiations (everything described in A Thousand Ways To Pack The Bin - Jukka Jylänki). I got rid of the tree datastructure and replaced it with a list of all free sub areas rectangles of the bin. Its a lot simplier and allows me to easily search a bin based on a variety of heuristics.
Your post is still super helpful and you are pointing out a lot of subtle details that I wouldnt ordinarily think about. Thank you.
@SolomonBothwell: It not includes It now includes? (as well as Its ... simplier -> It's simpler or More simple)

Stack Exchange Network

2D Bin Packing Algorithm Implementation

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

2D Bin Packing Algorithm Implementation

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions