Python Knapsack problem: greedy

Question 1

A tourist wants to make a good trip at the weekend with his friends. They will go to the mountains to see the wonders of nature, so he needs to pack well for the trip. He has a good knapsack for carrying things, but knows that he can carry a maximum of only 4kg in it and it will have to last the whole day. He creates a list of what he wants to bring for the trip but the total weight of all items is too much. He then decides to add columns to his initial list detailing their weights and a numerical value representing how important the item is for the trip.

I decided to solve the knapsack problem with a greedy algorithm. I am not sure whether the Item class is needed; maybe another data structure would be better?

from __future__ import division

class Item:
    def __init__(self, weight, value):
        self.weight = weight
        self.value = value
        self.convenience = value / weight

    def __repr__(self):
        return "Item({}, {})".format(self.weight, self.value)

class Knapsack:
    def __init__(self, capience, items_inside=[]):
        self.capience = capience
        self.items_inside = items_inside

    def current_weight(self):
        return sum(item.weight for item in self.items_inside)

    def is_full(self):
        return self.current_weight() == self.capience

    def can_incorporate(self, item):
        return self.current_weight() + item.weight <= self.capience

    def fill(self, items):
        while not self.is_full():
            self.items_inside.append(
                max((item for item in items if self.can_incorporate(item)),
                    key=lambda item: item.convenience))

    def __repr__(self):
        return "Knapsack({}, {})".format(self.capience, self.items_inside)

def main():
    bag = Knapsack(54)
    bag.fill([
        Item(12, 8),
        Item(20, 13),
        Item(4, 2),
        Item(1, 0.1)])
    print(bag)

if __name__ == "__main__":
    main()

Answer 1

A couple of possible bugs:

It results in this:

Knapsack(54, [Item(12, 8), Item(12, 8), Item(12, 8), Item(12, 8), Item(4, 2), Item(1, 0.1), Item(1, 0.1)])

How can you put one item in the knapsack four times over? And if you could, would you want four shovels just because one shovel is very useful? You never take anything out of the list of items once you have incorporated it.
is_full and can_incorporate do very similar things, and they interact badly in the fill function: what happens if the knapsack is not full, but it cannot incorporate any of the remaining items? max() is handed an empty sequence and raises a ValueError. Even if that didn't happen, it would loop forever, testing whether it could incorporate the next best thing but never being able to. You could loop on can_incorporate alone and get rid of is_full().
When you initialise the Knapsack() object, you can pass a long list of items, which go straight into the items_inside without any capience checks - you can bypass the capacity limit and overload it. Then when you fill() it, it will crash.
Nothing stops you from just adding things to the items_inside list regardless of whether the Knapsack can take it, and it can't test for that afterwards. I feel like adding something to a full knapsack should raise an exception, like removing something from an empty list raises an exception. (e.g. putting a vehicle into the knapsack is an exceptional condition - you don't do it deliberately).
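As a rough sketch of the first two points, here is one way the selection loop could pick each item only once and stop cleanly when nothing fits. It assumes Python 3 and plain (weight, value) tuples instead of the Item class; the name greedy_fill is hypothetical and not part of the original code.

```python
def greedy_fill(items, weight_limit):
    """Pick each (weight, value) item at most once, best value/weight first.

    Hypothetical helper for illustration; not part of the original code.
    """
    remaining = list(items)   # copy so the caller's list is left untouched
    chosen = []
    total_weight = 0
    while True:
        candidates = [it for it in remaining
                      if total_weight + it[0] <= weight_limit]
        if not candidates:    # nothing fits: stop instead of calling max() on nothing
            break
        best = max(candidates, key=lambda it: it[1] / it[0])
        chosen.append(best)
        remaining.remove(best)  # an item can only be packed once
        total_weight += best[0]
    return chosen


print(greedy_fill([(12, 8), (20, 13), (4, 2), (1, 0.1)], 54))
```

With the question's data all four items weigh 37 in total, so each one is packed exactly once and the loop ends when the candidate list is empty.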

I'm no design expert for sure, but I feel a bit like you're losing the heart of the problem by squishing the algorithm into almost a one-liner and overwhelming it with surrounding methods that are only there to support the algorithm.

I know that's what support methods are for, and in a way it's neat and tidy. But in another way, it's noisy and unclear.

I don't know what I'd change, exactly, and I'm left bikeshedding at things like:

capience isn't a word I can find on Google, what about capacity? Maybe capience is a fine word, but capacity seems like volume, so maybe weight_limit?
What about contents instead of items_inside?
Why current_weight instead of weight?
What about making current_weight a property so you have the same calling style as capience, instead of one being a plain attribute and the other a method call?
It seems a lot of effort to calculate the remaining list every time in the fill loop, and calculate the sum of weights every time. You could sort the list of items by convenience, pop() them from the list, or iterate through it only once.
You pass fill a list of items, but don't know which ones it fills and which ones it doesn't.
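The last two points could look something like the sketch below: sort once, walk the list once, keep a running weight total instead of re-summing, and hand back both the packed and the rejected items. It assumes Python 3 and (weight, value) tuples; pack_once is a made-up name for illustration only.

```python
def pack_once(items, weight_limit):
    """Sort once, walk the list once, and keep a running weight total.

    Hypothetical helper for illustration; items are (weight, value) tuples.
    Returns (packed, rejected) so the caller can see what was left behind.
    """
    ranked = sorted(items, key=lambda it: it[1] / it[0], reverse=True)
    packed, rejected = [], []
    total_weight = 0          # updated incrementally, no repeated sum()
    for item in ranked:
        if total_weight + item[0] <= weight_limit:
            packed.append(item)
            total_weight += item[0]
        else:
            rejected.append(item)
    return packed, rejected


packed, rejected = pack_once([(12, 8), (20, 13), (4, 2), (200, 1), (1, 0.1)], 54)
print(packed)    # items that fit, best value/weight ratio first
print(rejected)  # items that did not fit
```

Checking the limit with a plain comparison also avoids using exceptions for ordinary control flow, which is a design choice rather than a requirement.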

So I've ended up rewriting it myself, and it's no shorter, and probably no clearer. But it might address some of these points. The Knapsack manages its own contents as a "private" variable (_contents), so it can protect against overload, and that means loading an item has its own method. fill() only runs through the list once. Initialising the Knapsack can't overload it.

from __future__ import division

class Item:
    def __init__(self, weight, value):
        self.weight = weight
        self.value = value
        self.convenience = value / weight

    def __repr__(self):
        return "Item(weight={}, value={})".format(self.weight, self.value)

class Knapsack:
    def __init__(self, weight_limit, initial_items=()):
        self.weight_limit = weight_limit
        self._contents = []
        self.fill(initial_items)

    @property
    def weight(self):
        # Total weight of everything currently packed.
        return sum(item.weight for item in self._contents)

    def incorporate(self, item):
        # Refuse any item that would push the knapsack over its limit.
        if self.weight + item.weight > self.weight_limit:
            raise Exception("Knapsack Overload Exception: {} is too heavy".format(item))
        self._contents.append(item)

    def fill(self, items):
        # Try items from most to least convenient; return the ones that did not fit.
        items = sorted(items, key=lambda item: item.convenience, reverse=True)
        rejected_items = []
        for item in items:
            try:
                self.incorporate(item)
            except Exception:
                rejected_items.append(item)
        return rejected_items

    def __repr__(self):
        return "Knapsack({}, {})".format(self.weight_limit, self._contents)

def main():
    bag = Knapsack(54)
    # Prints the rejected items first, then the packed knapsack.
    print(bag.fill([
        Item(12, 8),
        Item(20, 13),
        Item(4, 2),
        Item(200, 1),
        Item(1, 0.1)]))
    print(bag)

if __name__ == "__main__":
    main()

(I notice that I haven't even considered your actual question about whether the Item class is necessary! I don't know, it seems useful. Although having an item name might be good).
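On the data structure question, one possible alternative is a namedtuple with a name field, sketched below. The name field and the sample item names are assumptions made for illustration; the original Item class only has weight and value.

```python
from collections import namedtuple


# Hypothetical variant of Item: the "name" field is an addition suggested in
# the text, not something the original class has.
class Item(namedtuple("Item", ["name", "weight", "value"])):
    __slots__ = ()

    @property
    def convenience(self):
        # Value per unit of weight; float() keeps true division on Python 2 too.
        return float(self.value) / self.weight


# Sample names are made up for the example.
items = [Item("map", 1, 0.1), Item("tent", 20, 13), Item("water", 12, 8)]
best = max(items, key=lambda item: item.convenience)
print(best.name)
```

A namedtuple gives a readable repr and immutability for free, while a regular class remains the easier choice if items ever need to change after creation.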

TessellatingHeckler · Answer 1 · 2015-06-24 03:10:47Z
