Python Knapsack problem using branch and bound algorithm

Question 1

I wrote a code in Python to solve Knapsack problem using branch and bound. I tested it with the case from Rosetta and it outputs correctly. But this is my first time to write this kind of code, I am feeling unconfident. Could you please review my code and give me some tips to improve it?

A tourist wants to make a good trip at the weekend with his friends. They will go to the mountains to see the wonders of nature, so he needs to pack well for the trip. He has a good knapsack for carrying things, but knows that he can carry a maximum of only 4kg in it and it will have to last the whole day. He creates a list of what he wants to bring for the trip but the total weight of all items is too much. He then decides to add columns to his initial list detailing their weights and a numerical value representing how important the item is for the trip.

from operator import truediv
data_item = ['map', 'compass', 'water', 'sandwich', 'glucose', 'tin', 'banana',\
'apple', 'cheese', 'beer', 'suntan', 'camera', 'T', 'trousers', 'umbrella', 'w t', 'w o', \
'note-case', 'sunglasses', 'towel', 'socks', 'book']
data_weight = [9, 13, 153, 50, 15, 68, 27, 39, 23, 52, 11, 32, 24, 48, 73, 42, 43, 22, 7, 18, 4, 30]
data_value = [150, 35, 200, 160, 60, 45, 60, 40, 30, 10, 70, 30, 15, 10, 40, 70, 75, 80, 20, 12, 50, 10]
data_eff = map(truediv, data_value, data_weight)
order = [i[0] for i in sorted(enumerate(data_eff), key=lambda x:x[1], reverse=True)]
#sort data based on their 'efficiency', i.e. value/weight
data_eff = [data_eff[i] for i in order]
data_weight = [data_weight[i] for i in order]
data_value = [data_value[i] for i in order]
data_item = [data_item[i] for i in order]
max_weight = 400
class State(object):
 def __init__(self, level, benefit, weight, token):
 #token = list marking if a task is token. ex. [1, 0, 0] means item0 token, item1 non-token, item2 non-token
 #available = list marking all tasks available, i.e. not explored yet
 self.level = level
 self.benefit = benefit
 self.weight = weight
 self.token = token
 self.available = self.token[:self.level]+[1]*(len(data_value)-level)
 self.ub = self.upperbound()
 def upperbound(self): #define upperbound using fractional knaksack
 upperbound = 0 #initial upperbound
 weight_accumulate = 0 #accumulated weight used to stop the upperbound summation
 for i in range(len(data_weight)):
 if data_weight[i] * self.available[i] <= max_weight - weight_accumulate:
 weight_accumulate += data_weight[i] * self.available[i]
 upperbound += data_value[i] * self.available[i]
 else:
 upperbound += data_value[i] * (max_weight - weight_accumulate) / data_weight[i] * self.available[i]
 break
 return upperbound
 def develop(self):
 level = self.level + 1
 if self.weight + data_weight[self.level] <= max_weight: #if not overweighted, give left child
 left_weight = self.weight + data_weight[self.level]
 left_benefit = self.benefit + data_value[self.level]
 left_token = self.token[:self.level]+[1]+self.token[self.level+1:]
 left_child = State(level, left_benefit, left_weight, left_token)
 else: left_child = None
 #anyway, give right child
 right_child = State(level, self.benefit, self.weight, self.token)
 if left_child != None: 
 return [left_child, right_child]
 else: return [right_child]
Root = State(0, 0, 0, [0]*len(data_value)) #start with nothing
waiting_States = [] #list of States waiting to be explored
current_state = Root
while current_state.level < len(data_value):
 waiting_States.extend(current_state.develop())
 waiting_States.sort(key=lambda x: x.ub) #sort the waiting list based on their upperbound
 current_state = waiting_States.pop() #explor the one with largest upperbound
best_solution = current_state
best_item = []
for i in range(len(best_solution.token)):
 if (best_solution.token[i] == 1):
 best_item.append(data_item[i])
print "Total weight: ", best_solution.weight
print "Total Value: ", best_solution.benefit
print "Items:", best_item

Question 2

You are welcome! You're question is perfectly on topic, interesting and the title is meaningful.

Question 3

I tried solving this problem in a much more naive way: codereview.stackexchange.com/questions/94478/…

Question 4

Thank you very much Caridorc! I have read your code, and I think it is doing a different thing than the mine. For example, my code computes the combination of items to maximize their total value while keeping the total weight <= max_weight, and each item has only one copy. Your code is looking for the combination whose total weight == max_weight (capience), and it can take the same item several times. Am I right?

Question 5

Lilliana you are right about the differenze, my programma goes about the problem in a simpler but less effective way.

Question 6

Caridorc: I see. But it gave me other ideas to think about the problem.

Question 7

Instead of sorting values, taking the corresponding indices and using them to have the values of a different array in the order of your choice, you could generate tuples containing all the relevant information (name, weight, value) and sort them with a function of your choice.

This reduces :

data_eff = map(truediv, data_value, data_weight)
order = [i[0] for i in sorted(enumerate(data_eff), key=lambda x:x[1], reverse=True)]
#sort data based on their 'efficiency', i.e. value/weight
data_eff = [data_eff[i] for i in order]
data_weight = [data_weight[i] for i in order]
data_value = [data_value[i] for i in order]
data_item = [data_item[i] for i in order]

to a more efficient and more concise :

data_sorted = sorted(zip(data_item, data_weight, data_value), key=lambda (i,w,v):v//w, reverse=True)

You could them split this into 3 arrays but I am not sure it is worth the pain.

Indeed, a few things can be done in a more concise way now like :

for i, (item, w, v) in enumerate(data_sorted):
 if w * self.available[i] <= max_weight - weight_accumulate:
 weight_accumulate += w * self.available[i]
 upperbound += v * self.available[i]
 else:
 upperbound += v * (max_weight - weight_accumulate) / w * self.available[i]
 break

and re-improved using zip :

for avail, (item, w, v) in zip(self.available, data_sorted):
 if w * avail <= max_weight - weight_accumulate:
 weight_accumulate += w * avail
 upperbound += v * avail
 else:
 upperbound += v * (max_weight - weight_accumulate) / w * avail
 break

Also,

best_solution = current_state
best_item = []
for i in range(len(best_solution.token)):
 if (best_solution.token[i] == 1):
 best_item.append(data_item[i])

can be rewritten using zip list comprehension and the data_sorted array defined previously :

best_item = [item for tok, (item, _, _) in zip(current_state.token, data_sorted) if tok == 1]

Now, to comment on the style, you should use is to compare to None as per the style guide PEP8.

After changing these and many other details (provided by the tool pep8 checking for various style issues such as spacing, line length, etc), I end up with :

data_item = ['map', 'compass', 'water', 'sandwich', 'glucose', 'tin', 'banana',
 'apple', 'cheese', 'beer', 'suntan', 'camera', 'T', 'trousers',
 'umbrella', 'w t', 'w o', 'note-case', 'sunglasses', 'towel',
 'socks', 'book']
data_weight = [9, 13, 153, 50, 15, 68, 27, 39, 23, 52, 11, 32, 24, 48, 73, 42,
 43, 22, 7, 18, 4, 30]
data_value = [150, 35, 200, 160, 60, 45, 60, 40, 30, 10, 70, 30, 15, 10, 40,
 70, 75, 80, 20, 12, 50, 10]
data_sorted = sorted(zip(data_item, data_weight, data_value),
 key=lambda (i, w, v): v//w, reverse=True)
max_weight = 400
class State(object):
 def __init__(self, level, benefit, weight, token):
 # token = list marking if a task is token. ex. [1, 0, 0] means
 # item0 token, item1 non-token, item2 non-token
 # available = list marking all tasks available, i.e. not explored yet
 self.level = level
 self.benefit = benefit
 self.weight = weight
 self.token = token
 self.available = self.token[:self.level]+[1]*(len(data_sorted)-level)
 self.ub = self.upperbound()
 def upperbound(self): # define upperbound using fractional knaksack
 upperbound = 0 # initial upperbound
 # accumulated weight used to stop the upperbound summation
 weight_accumulate = 0
 for avail, (_, wei, val) in zip(self.available, data_sorted):
 if wei * avail <= max_weight - weight_accumulate:
 weight_accumulate += wei * avail
 upperbound += val * avail
 else:
 upperbound += val * (max_weight - weight_accumulate) / wei * avail
 break
 return upperbound
 def develop(self):
 level = self.level + 1
 _, weight, value = data_sorted[self.level]
 left_weight = self.weight + weight
 if left_weight <= max_weight: # if not overweighted, give left child
 left_benefit = self.benefit + value
 left_token = self.token[:self.level]+[1]+self.token[level:]
 left_child = State(level, left_benefit, left_weight, left_token)
 else:
 left_child = None
 # anyway, give right child
 right_child = State(level, self.benefit, self.weight, self.token)
 return ([] if left_child is None else [left_child]) + [right_child]
Root = State(0, 0, 0, [0] * len(data_sorted)) # start with nothing
waiting_States = [] # list of States waiting to be explored
current_state = Root
while current_state.level < len(data_sorted):
 waiting_States.extend(current_state.develop())
 # sort the waiting list based on their upperbound
 waiting_States.sort(key=lambda x: x.ub)
 # explore the one with largest upperbound
 current_state = waiting_States.pop()
best_item = [item for tok, (item, _, _)
 in zip(current_state.token, data_sorted) if tok == 1]
print "Total weight: ", current_state.weight
print "Total Value: ", current_state.benefit
print "Items:", best_item

There are still many things to be improved but I guess it gives you a better basis to start with.

As a take-away advice : almost every time you deal with indices in Python, you are doing it wrong. Using the proper data structure and the proper method can lead you back to the right way.

Edit : A few things I have forgotten :

maybe available does not need to be part of the instances and could be provided directly to upperbound which does not need to be an instance method anymore.
in upperbound, max_weight - weight_accumulated is a more interesting value than each of max_weight and weight_accumulated individually. Maybe it could make sens to work directly on that value.

Then you'd get something like :

class State(object):
 def __init__(self, level, benefit, weight, token):
 # token = list marking if a task is token. ex. [1, 0, 0] means
 # item0 token, item1 non-token, item2 non-token
 # available = list marking all tasks available, i.e. not explored yet
 self.level = level
 self.benefit = benefit
 self.weight = weight
 self.token = token
 self.ub = State.upperbound(self.token[:self.level]+[1]*(len(data_sorted)-level))
 @staticmethod
 def upperbound(available): # define upperbound using fractional knaksack
 upperbound = 0 # initial upperbound
 # accumulated weight used to stop the upperbound summation
 remaining = max_weight
 for avail, (_, wei, val) in zip(available, data_sorted):
 wei2 = wei * avail # i could not find a better name
 if wei2 <= remaining:
 remaining -= wei2
 upperbound += val * avail
 else:
 upperbound += val * remaining / wei2
 break
 return upperbound

Question 8

Thank you very much Josay! It is exactly what I was looking for, to solve the problem very elegantly and in a pythonic way.

SylvainD SylvainDSylvainD 29.7k1 gold badge49 silver badges93 bronze badges · Accepted Answer · 2015-06-24 16:11:30Z

Instead of sorting values, taking the corresponding indices and using them to have the values of a different array in the order of your choice, you could generate tuples containing all the relevant information (name, weight, value) and sort them with a function of your choice.

This reduces :

data_eff = map(truediv, data_value, data_weight)
order = [i[0] for i in sorted(enumerate(data_eff), key=lambda x:x[1], reverse=True)]
#sort data based on their 'efficiency', i.e. value/weight
data_eff = [data_eff[i] for i in order]
data_weight = [data_weight[i] for i in order]
data_value = [data_value[i] for i in order]
data_item = [data_item[i] for i in order]

to a more efficient and more concise :

data_sorted = sorted(zip(data_item, data_weight, data_value), key=lambda (i,w,v):v//w, reverse=True)

You could them split this into 3 arrays but I am not sure it is worth the pain.

Indeed, a few things can be done in a more concise way now like :

for i, (item, w, v) in enumerate(data_sorted):
 if w * self.available[i] <= max_weight - weight_accumulate:
 weight_accumulate += w * self.available[i]
 upperbound += v * self.available[i]
 else:
 upperbound += v * (max_weight - weight_accumulate) / w * self.available[i]
 break

and re-improved using zip :

for avail, (item, w, v) in zip(self.available, data_sorted):
 if w * avail <= max_weight - weight_accumulate:
 weight_accumulate += w * avail
 upperbound += v * avail
 else:
 upperbound += v * (max_weight - weight_accumulate) / w * avail
 break

Also,

best_solution = current_state
best_item = []
for i in range(len(best_solution.token)):
 if (best_solution.token[i] == 1):
 best_item.append(data_item[i])

can be rewritten using zip list comprehension and the data_sorted array defined previously :

best_item = [item for tok, (item, _, _) in zip(current_state.token, data_sorted) if tok == 1]

Now, to comment on the style, you should use is to compare to None as per the style guide PEP8.

After changing these and many other details (provided by the tool pep8 checking for various style issues such as spacing, line length, etc), I end up with :

data_item = ['map', 'compass', 'water', 'sandwich', 'glucose', 'tin', 'banana',
 'apple', 'cheese', 'beer', 'suntan', 'camera', 'T', 'trousers',
 'umbrella', 'w t', 'w o', 'note-case', 'sunglasses', 'towel',
 'socks', 'book']
data_weight = [9, 13, 153, 50, 15, 68, 27, 39, 23, 52, 11, 32, 24, 48, 73, 42,
 43, 22, 7, 18, 4, 30]
data_value = [150, 35, 200, 160, 60, 45, 60, 40, 30, 10, 70, 30, 15, 10, 40,
 70, 75, 80, 20, 12, 50, 10]
data_sorted = sorted(zip(data_item, data_weight, data_value),
 key=lambda (i, w, v): v//w, reverse=True)
max_weight = 400
class State(object):
 def __init__(self, level, benefit, weight, token):
 # token = list marking if a task is token. ex. [1, 0, 0] means
 # item0 token, item1 non-token, item2 non-token
 # available = list marking all tasks available, i.e. not explored yet
 self.level = level
 self.benefit = benefit
 self.weight = weight
 self.token = token
 self.available = self.token[:self.level]+[1]*(len(data_sorted)-level)
 self.ub = self.upperbound()
 def upperbound(self): # define upperbound using fractional knaksack
 upperbound = 0 # initial upperbound
 # accumulated weight used to stop the upperbound summation
 weight_accumulate = 0
 for avail, (_, wei, val) in zip(self.available, data_sorted):
 if wei * avail <= max_weight - weight_accumulate:
 weight_accumulate += wei * avail
 upperbound += val * avail
 else:
 upperbound += val * (max_weight - weight_accumulate) / wei * avail
 break
 return upperbound
 def develop(self):
 level = self.level + 1
 _, weight, value = data_sorted[self.level]
 left_weight = self.weight + weight
 if left_weight <= max_weight: # if not overweighted, give left child
 left_benefit = self.benefit + value
 left_token = self.token[:self.level]+[1]+self.token[level:]
 left_child = State(level, left_benefit, left_weight, left_token)
 else:
 left_child = None
 # anyway, give right child
 right_child = State(level, self.benefit, self.weight, self.token)
 return ([] if left_child is None else [left_child]) + [right_child]
Root = State(0, 0, 0, [0] * len(data_sorted)) # start with nothing
waiting_States = [] # list of States waiting to be explored
current_state = Root
while current_state.level < len(data_sorted):
 waiting_States.extend(current_state.develop())
 # sort the waiting list based on their upperbound
 waiting_States.sort(key=lambda x: x.ub)
 # explore the one with largest upperbound
 current_state = waiting_States.pop()
best_item = [item for tok, (item, _, _)
 in zip(current_state.token, data_sorted) if tok == 1]
print "Total weight: ", current_state.weight
print "Total Value: ", current_state.benefit
print "Items:", best_item

There are still many things to be improved but I guess it gives you a better basis to start with.

As a take-away advice : almost every time you deal with indices in Python, you are doing it wrong. Using the proper data structure and the proper method can lead you back to the right way.

Edit : A few things I have forgotten :

maybe available does not need to be part of the instances and could be provided directly to upperbound which does not need to be an instance method anymore.
in upperbound, max_weight - weight_accumulated is a more interesting value than each of max_weight and weight_accumulated individually. Maybe it could make sens to work directly on that value.

Then you'd get something like :

class State(object):
 def __init__(self, level, benefit, weight, token):
 # token = list marking if a task is token. ex. [1, 0, 0] means
 # item0 token, item1 non-token, item2 non-token
 # available = list marking all tasks available, i.e. not explored yet
 self.level = level
 self.benefit = benefit
 self.weight = weight
 self.token = token
 self.ub = State.upperbound(self.token[:self.level]+[1]*(len(data_sorted)-level))
 @staticmethod
 def upperbound(available): # define upperbound using fractional knaksack
 upperbound = 0 # initial upperbound
 # accumulated weight used to stop the upperbound summation
 remaining = max_weight
 for avail, (_, wei, val) in zip(available, data_sorted):
 wei2 = wei * avail # i could not find a better name
 if wei2 <= remaining:
 remaining -= wei2
 upperbound += val * avail
 else:
 upperbound += val * remaining / wei2
 break
 return upperbound

Thank you very much Josay! It is exactly what I was looking for, to solve the problem very elegantly and in a pythonic way.

Stack Exchange Network

Python Knapsack problem using branch and bound algorithm

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Python Knapsack problem using branch and bound algorithm

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions