I wrote a code in Python to solve Knapsack problem using branch and bound. I tested it with the case from Rosetta and it outputs correctly. But this is my first time to write this kind of code, I am feeling unconfident. Could you please review my code and give me some tips to improve it?
A tourist wants to make a good trip at the weekend with his friends. They will go to the mountains to see the wonders of nature, so he needs to pack well for the trip. He has a good knapsack for carrying things, but knows that he can carry a maximum of only 4kg in it and it will have to last the whole day. He creates a list of what he wants to bring for the trip but the total weight of all items is too much. He then decides to add columns to his initial list detailing their weights and a numerical value representing how important the item is for the trip.
from operator import truediv
data_item = ['map', 'compass', 'water', 'sandwich', 'glucose', 'tin', 'banana',\
'apple', 'cheese', 'beer', 'suntan', 'camera', 'T', 'trousers', 'umbrella', 'w t', 'w o', \
'note-case', 'sunglasses', 'towel', 'socks', 'book']
data_weight = [9, 13, 153, 50, 15, 68, 27, 39, 23, 52, 11, 32, 24, 48, 73, 42, 43, 22, 7, 18, 4, 30]
data_value = [150, 35, 200, 160, 60, 45, 60, 40, 30, 10, 70, 30, 15, 10, 40, 70, 75, 80, 20, 12, 50, 10]
data_eff = map(truediv, data_value, data_weight)
order = [i[0] for i in sorted(enumerate(data_eff), key=lambda x:x[1], reverse=True)]
#sort data based on their 'efficiency', i.e. value/weight
data_eff = [data_eff[i] for i in order]
data_weight = [data_weight[i] for i in order]
data_value = [data_value[i] for i in order]
data_item = [data_item[i] for i in order]
max_weight = 400
class State(object):
def __init__(self, level, benefit, weight, token):
#token = list marking if a task is token. ex. [1, 0, 0] means item0 token, item1 non-token, item2 non-token
#available = list marking all tasks available, i.e. not explored yet
self.level = level
self.benefit = benefit
self.weight = weight
self.token = token
self.available = self.token[:self.level]+[1]*(len(data_value)-level)
self.ub = self.upperbound()
def upperbound(self): #define upperbound using fractional knaksack
upperbound = 0 #initial upperbound
weight_accumulate = 0 #accumulated weight used to stop the upperbound summation
for i in range(len(data_weight)):
if data_weight[i] * self.available[i] <= max_weight - weight_accumulate:
weight_accumulate += data_weight[i] * self.available[i]
upperbound += data_value[i] * self.available[i]
else:
upperbound += data_value[i] * (max_weight - weight_accumulate) / data_weight[i] * self.available[i]
break
return upperbound
def develop(self):
level = self.level + 1
if self.weight + data_weight[self.level] <= max_weight: #if not overweighted, give left child
left_weight = self.weight + data_weight[self.level]
left_benefit = self.benefit + data_value[self.level]
left_token = self.token[:self.level]+[1]+self.token[self.level+1:]
left_child = State(level, left_benefit, left_weight, left_token)
else: left_child = None
#anyway, give right child
right_child = State(level, self.benefit, self.weight, self.token)
if left_child != None:
return [left_child, right_child]
else: return [right_child]
Root = State(0, 0, 0, [0]*len(data_value)) #start with nothing
waiting_States = [] #list of States waiting to be explored
current_state = Root
while current_state.level < len(data_value):
waiting_States.extend(current_state.develop())
waiting_States.sort(key=lambda x: x.ub) #sort the waiting list based on their upperbound
current_state = waiting_States.pop() #explor the one with largest upperbound
best_solution = current_state
best_item = []
for i in range(len(best_solution.token)):
if (best_solution.token[i] == 1):
best_item.append(data_item[i])
print "Total weight: ", best_solution.weight
print "Total Value: ", best_solution.benefit
print "Items:", best_item
-
1\$\begingroup\$ You are welcome! You're question is perfectly on topic, interesting and the title is meaningful. \$\endgroup\$Caridorc– Caridorc2015年06月23日 10:16:41 +00:00Commented Jun 23, 2015 at 10:16
-
\$\begingroup\$ I tried solving this problem in a much more naive way: codereview.stackexchange.com/questions/94478/… \$\endgroup\$Caridorc– Caridorc2015年06月23日 18:30:42 +00:00Commented Jun 23, 2015 at 18:30
-
1\$\begingroup\$ Thank you very much Caridorc! I have read your code, and I think it is doing a different thing than the mine. For example, my code computes the combination of items to maximize their total value while keeping the total weight <= max_weight, and each item has only one copy. Your code is looking for the combination whose total weight == max_weight (capience), and it can take the same item several times. Am I right? \$\endgroup\$Lilianna– Lilianna2015年06月24日 02:45:08 +00:00Commented Jun 24, 2015 at 2:45
-
\$\begingroup\$ Lilliana you are right about the differenze, my programma goes about the problem in a simpler but less effective way. \$\endgroup\$Caridorc– Caridorc2015年06月24日 07:59:48 +00:00Commented Jun 24, 2015 at 7:59
-
1\$\begingroup\$ Caridorc: I see. But it gave me other ideas to think about the problem. \$\endgroup\$Lilianna– Lilianna2015年06月24日 10:23:11 +00:00Commented Jun 24, 2015 at 10:23
1 Answer 1
Instead of sorting values, taking the corresponding indices and using them to have the values of a different array in the order of your choice, you could generate tuples containing all the relevant information (name, weight, value) and sort them with a function of your choice.
This reduces :
data_eff = map(truediv, data_value, data_weight)
order = [i[0] for i in sorted(enumerate(data_eff), key=lambda x:x[1], reverse=True)]
#sort data based on their 'efficiency', i.e. value/weight
data_eff = [data_eff[i] for i in order]
data_weight = [data_weight[i] for i in order]
data_value = [data_value[i] for i in order]
data_item = [data_item[i] for i in order]
to a more efficient and more concise :
data_sorted = sorted(zip(data_item, data_weight, data_value), key=lambda (i,w,v):v//w, reverse=True)
You could them split this into 3 arrays but I am not sure it is worth the pain.
Indeed, a few things can be done in a more concise way now like :
for i, (item, w, v) in enumerate(data_sorted):
if w * self.available[i] <= max_weight - weight_accumulate:
weight_accumulate += w * self.available[i]
upperbound += v * self.available[i]
else:
upperbound += v * (max_weight - weight_accumulate) / w * self.available[i]
break
and re-improved using zip :
for avail, (item, w, v) in zip(self.available, data_sorted):
if w * avail <= max_weight - weight_accumulate:
weight_accumulate += w * avail
upperbound += v * avail
else:
upperbound += v * (max_weight - weight_accumulate) / w * avail
break
Also,
best_solution = current_state
best_item = []
for i in range(len(best_solution.token)):
if (best_solution.token[i] == 1):
best_item.append(data_item[i])
can be rewritten using zip list comprehension and the data_sorted
array defined previously :
best_item = [item for tok, (item, _, _) in zip(current_state.token, data_sorted) if tok == 1]
Now, to comment on the style, you should use is
to compare to None as per the style guide PEP8.
After changing these and many other details (provided by the tool pep8 checking for various style issues such as spacing, line length, etc), I end up with :
data_item = ['map', 'compass', 'water', 'sandwich', 'glucose', 'tin', 'banana',
'apple', 'cheese', 'beer', 'suntan', 'camera', 'T', 'trousers',
'umbrella', 'w t', 'w o', 'note-case', 'sunglasses', 'towel',
'socks', 'book']
data_weight = [9, 13, 153, 50, 15, 68, 27, 39, 23, 52, 11, 32, 24, 48, 73, 42,
43, 22, 7, 18, 4, 30]
data_value = [150, 35, 200, 160, 60, 45, 60, 40, 30, 10, 70, 30, 15, 10, 40,
70, 75, 80, 20, 12, 50, 10]
data_sorted = sorted(zip(data_item, data_weight, data_value),
key=lambda (i, w, v): v//w, reverse=True)
max_weight = 400
class State(object):
def __init__(self, level, benefit, weight, token):
# token = list marking if a task is token. ex. [1, 0, 0] means
# item0 token, item1 non-token, item2 non-token
# available = list marking all tasks available, i.e. not explored yet
self.level = level
self.benefit = benefit
self.weight = weight
self.token = token
self.available = self.token[:self.level]+[1]*(len(data_sorted)-level)
self.ub = self.upperbound()
def upperbound(self): # define upperbound using fractional knaksack
upperbound = 0 # initial upperbound
# accumulated weight used to stop the upperbound summation
weight_accumulate = 0
for avail, (_, wei, val) in zip(self.available, data_sorted):
if wei * avail <= max_weight - weight_accumulate:
weight_accumulate += wei * avail
upperbound += val * avail
else:
upperbound += val * (max_weight - weight_accumulate) / wei * avail
break
return upperbound
def develop(self):
level = self.level + 1
_, weight, value = data_sorted[self.level]
left_weight = self.weight + weight
if left_weight <= max_weight: # if not overweighted, give left child
left_benefit = self.benefit + value
left_token = self.token[:self.level]+[1]+self.token[level:]
left_child = State(level, left_benefit, left_weight, left_token)
else:
left_child = None
# anyway, give right child
right_child = State(level, self.benefit, self.weight, self.token)
return ([] if left_child is None else [left_child]) + [right_child]
Root = State(0, 0, 0, [0] * len(data_sorted)) # start with nothing
waiting_States = [] # list of States waiting to be explored
current_state = Root
while current_state.level < len(data_sorted):
waiting_States.extend(current_state.develop())
# sort the waiting list based on their upperbound
waiting_States.sort(key=lambda x: x.ub)
# explore the one with largest upperbound
current_state = waiting_States.pop()
best_item = [item for tok, (item, _, _)
in zip(current_state.token, data_sorted) if tok == 1]
print "Total weight: ", current_state.weight
print "Total Value: ", current_state.benefit
print "Items:", best_item
There are still many things to be improved but I guess it gives you a better basis to start with.
As a take-away advice : almost every time you deal with indices in Python, you are doing it wrong. Using the proper data structure and the proper method can lead you back to the right way.
Edit : A few things I have forgotten :
maybe
available
does not need to be part of the instances and could be provided directly toupperbound
which does not need to be an instance method anymore.in
upperbound
,max_weight - weight_accumulated
is a more interesting value than each ofmax_weight
andweight_accumulated
individually. Maybe it could make sens to work directly on that value.
Then you'd get something like :
class State(object):
def __init__(self, level, benefit, weight, token):
# token = list marking if a task is token. ex. [1, 0, 0] means
# item0 token, item1 non-token, item2 non-token
# available = list marking all tasks available, i.e. not explored yet
self.level = level
self.benefit = benefit
self.weight = weight
self.token = token
self.ub = State.upperbound(self.token[:self.level]+[1]*(len(data_sorted)-level))
@staticmethod
def upperbound(available): # define upperbound using fractional knaksack
upperbound = 0 # initial upperbound
# accumulated weight used to stop the upperbound summation
remaining = max_weight
for avail, (_, wei, val) in zip(available, data_sorted):
wei2 = wei * avail # i could not find a better name
if wei2 <= remaining:
remaining -= wei2
upperbound += val * avail
else:
upperbound += val * remaining / wei2
break
return upperbound
-
\$\begingroup\$ Thank you very much Josay! It is exactly what I was looking for, to solve the problem very elegantly and in a pythonic way. \$\endgroup\$Lilianna– Lilianna2015年06月25日 09:38:40 +00:00Commented Jun 25, 2015 at 9:38