This is the knapsack problem from rosettacode.org:
A tourist wants to make a good trip at the weekend with his friends. They will go to the mountains to see the wonders of nature, so he needs to pack well for the trip. He has a good knapsack for carrying things, but knows that he can carry a maximum of only 4kg in it and it will have to last the whole day. He creates a list of what he wants to bring for the trip but the total weight of all items is too much. He then decides to add columns to his initial list detailing their weights and a numerical value representing how important the item is for the trip.
(The list of items, together with their weight in decagrams and their value, is given in the code below.)
Which items does the tourist carry in his knapsack so that their total weight does not exceed 400 decagrams [4 kg], and their total value is maximised?
My solution is below. This seems almost seems to be too little. Just sort by their efficiency (value/weight) and add the most efficient item until capacity is reached.
from itertools import takewhile
ITEMS = (
("map", 9, 150), ("compass", 13, 35), ("water", 153, 200), ("sandwich", 50, 160),
("glucose", 15, 60), ("tin", 68, 45), ("banana", 27, 60), ("apple", 39, 40),
("cheese", 23, 30), ("beer", 52, 10), ("suntan cream", 11, 70), ("camera", 32, 30),
("t-shirt", 24, 15), ("trousers", 48, 10), ("umbrella", 73, 40),
("waterproof trousers", 42, 70), ("waterproof overclothes", 43, 75),
("note-case", 22, 80), ("sunglasses", 7, 20), ("towel", 18, 12),
("socks", 4, 50), ("book", 30, 10),
)
def item_efficiency(item):
name, weight, value = item
return float(value)/float(weight)
def pack_bag(item):
name, weight, value = item
pack_bag.max_weight -= weight
return pack_bag.max_weight > 0
pack_bag.max_weight = 400 # static variable implementation
# pack the most efficient item until pack_bag.max_weight is reached.
pack = list(takewhile(pack_bag, reversed(sorted(ITEMS, key=item_efficiency))))
# print output
for item in pack:
print item[0]
table = zip(*pack)
print "Total Value: %i" % sum(table[2])
print "Total Weight: %i" % sum(table[1])
I understand that this solution doesn't scale if capacity is multidimensional. This gives the correct answer but I haven't seen any solution like it. It could be because I am exporting a lot of the workload to sorted
, reversed
, and takewhile
.
1 Answer 1
1. Code review
There's no documentation. What do the functions do, what arguments do they take, and what do they return?
The code is not portable to Python 3.
The program has code at the top level, which makes it inconvenient to test from the interactive interpreter, because as soon as it imports the module the code runs. Better to encapsulate the top-level code within a function.
The program uses a global variable,
pack_bag.max_weight
. Global variables make code inconvenient to test, because you have to get the globals into the right state before running each test case. It is more convenient (and more robust) to encapsulate all the state inside a function or object.item_efficiency
doesn't use thename
of the item. It's conventional to indicate unused values in tuple unpacking by using the name_
. Alternatively, you could use acollections.namedtuple
and then you'd be able to writeitem.weight
anditem.value
.Use
sorted(..., reverse=True)
rather thanreversed(sorted(...))
.As noted by Janne Karila in comments,
itertools.takewhile
stops as soon as it encounters an item for whichpack_bag
returnsFalse
, but there might be (less efficient, but lighter) items which can still be packed. This could be fixed by usingfilter
instead oftakewhile
.
2. Revised code
from collections import namedtuple
Item = namedtuple('Item', 'name weight value'.split())
ITEMS = [
Item("map", 9, 150),
Item("compass", 13, 35),
Item("water", 153, 200),
Item("sandwich", 50, 160),
Item("glucose", 15, 60),
Item("tin", 68, 45),
Item("banana", 27, 60),
Item("apple", 39, 40),
Item("cheese", 23, 30),
Item("beer", 52, 10),
Item("suntan cream", 11, 70),
Item("camera", 32, 30),
Item("t-shirt", 24, 15),
Item("trousers", 48, 10),
Item("umbrella", 73, 40),
Item("waterproof trousers", 42, 70),
Item("waterproof overclothes", 43, 75),
Item("note-case", 22, 80),
Item("sunglasses", 7, 20),
Item("towel", 18, 12),
Item("socks", 4, 50),
Item("book", 30, 10),
]
def efficiency(item):
"""Return efficiency of item (its value per unit weight)."""
return float(item.value) / float(item.weight)
def packing(items=ITEMS, max_weight=400):
"""Return a list of items with the maximum value, subject to the
constraint that their combined weight must not exceed max_weight.
"""
def pack(item):
# Attempt to pack item; return True if successful.
if item.weight <= pack.max_weight:
pack.max_weight -= item.weight
return True
else:
return False
pack.max_weight = max_weight
return list(filter(pack, sorted(items, key=efficiency, reverse=True)))
Note that in order to allow pack
to modify max_weight
each time it is called, I've stored the value as an attribute. This is a slightly ugly workaround that's necessary in Python 2. In Python 3 one would just use a nonlocal
statement to give the pack
access to the variable max_weight
.
3. Algorithm
Your program implements the well-known "greedy approximation algorithm" for the knapsack problem (first described by George Dantzig in 1957).
As the name says, this is an approximation algorithm: that is, it does not always return the best solution. For example, when max_weight=80
the greedy algorithm (as implemented above) picks the following items, with total value 430:
map 9 150
glucose 15 60
suntan cream 11 70
note-case 22 80
sunglasses 7 20
socks 4 50
TOTAL 68 430
But the following choice of items has total value 445:
map 9 150
compass 13 35
glucose 15 60
suntan cream 11 70
note-case 22 80
socks 4 50
TOTAL 74 445
So even in small examples like this, you really do need to use the tabular method (aka "dynamic programming") to get the best answer. See this answer for one way to implement it in Python.
Explore related questions
See similar questions with these tags.
>
in front of the paragraphs) \$\endgroup\$