Monty Hall Optimization in Python 3.5

Question 1

This is for homework but the deadline is over and I know the implementation is correct because it concurs with the math. At this point I just want to check my implementation for potential optimizations in terms of speed, memory usage, clarity, modularity, and anything else you guys might think of. (I know there is generally a trade-off between computation speed and memory usage, so if you have comments on that I appreciate any insight you might have)

The specs of the function are that it should be flexible as to how many doors and trials we want, and whether or not we're switching doors. We also assume that the guest always chooses the first door. This is the only non-variable component we're allowed.

For example, would using np.array make the code faster than using lists? I didn't think it'd make a difference because they're very small, but that could be wrong.

In general, what could I do better?

def monty_hall (num_doors, trials, switch):
 #count number of successful trials
 successes = 0
 for i in range(trials):
 #determine behind which door the prize is
 doors = [i for i in range(1, num_doors+1)]
 prize_door = np.random.choice(doors)
 #determine which door the host opens if guest switches
 host_doors = [i for i in range(2, num_doors+1)]
 if prize_door != 1: 
 host_doors.remove(prize_door)
 host_choice = np.random.choice(host_doors)
 #determine which door the guest chooses if switching
 doors.remove(1); doors.remove(host_choice)
 guest_switch = np.random.choice(doors)
 #we assume the guest is always choosing the first door
 if prize_door == 1 and switch == 0:
 successes += 1
 elif prize_door != 1 and switch: 
 if guest_switch == prize_door:
 successes += 1
 probability = successes / trials
 return probability

Question 2

everything looks pretty pythonic and linear to me. But is there any reason you used numpy random choice rather than the builtin python random choice function?

Question 3

@Mr.goosberry, thank you; as far as np.random I just go with the assumption that the numpy library is better optimized than the builtin python math functions, mostly because I've been told so, and since I don't know enough about the low-level implementation it seems like the safe thing to do. That's why I was asking about np.array instead of lists also.

Question 4

Also a short description of the Monty Hall problem would help, because I got confused for a while when the host opens a door and when the user is allowed to switch.

Question 5

If you were using python 2.x, there would be no need to do doors = [i for i in range(1, num_doors+1)], because range just returns a list. So just do:

doors = range(1, num_doors+1)

(similar for host_doors).

Since you are using python 3.x, range will return a range object, but you can just call list on it:

doors = list(range(1, num_doors+1))

Minor point: there is no need for the intermediate variable probability, just return successes / trials.

Python has an official style-guide, PEP8, which recommends not having unnecessary blank lines (such as between your if and else block).

Also, you should add one space in front between # and the actual comment.

It is also customary to use _ as a loop variable that is unused.

You can make your code slightly more efficient by only drawing the user's switched choice when actually switching:

def monty_hall (num_doors, trials, switch):
 successes = 0
 for _ in range(trials):
 # determine behind which door the prize is
 doors = list(range(1, num_doors+1))
 prize_door = np.random.choice(doors)
 # door the guest chooses. (Fixed to 1)
 guest_choice = 1
 doors.remove(guest_choice)
 # determine which door the host opens before guest is allowed to switch
 # this will be faster for larger lists than copying the `doors` list
 host_choice = prize_door
 while host_choice == prize_door:
 host_choice = np.random.choice(doors)
 doors.remove(host_choice)
 # determine which door the guest chooses if switching
 if switch:
 guest_choice = np.random.choice(doors)
 # check win condition
 if guest_choice == prize_door:
 successes += 1
 return successes / trials

It might actually be faster (for larger number of doors it will be), not to copy the doors for the host choice, but just take a random door from doors and take another one if it is the prize_door:

host_choice = prize_door
while host_choice == prize_door:
 host_choice = np.random.choice(doors)
doors.remove(host_choice)

For small lists this will be better:

host_doors = doors[:]
host_doors.remove(prize_door)
host_choice = np.random.choice(host_doors)
doors.remove(host_choice)

Overall the most costly part of this code is probably the removes, which can be quite costly for large lists if removing near the beginning (like for guest_choice). It would be better to store the doors as a set (for which deleting is O(1)). But that makes choosing one door harder (you have to do np.random.choice(tuple(doors)) everytime you draw). You should time both approaches and see what works best for your data.

Question 6

Thank you for your answer. I wrote python 3.5 in the title but I could've mentioned it again in the body of the question. I didn't know python 2.x's range returned a list, I guess once i got comfortable with list comprehension I just started using it everywhere. Algorithmically speaking, is there anything I could optimize? For example, redudant logic or something long those lines. Or maybe something better than removing items from a list?

Question 7

@jeremyradcliff Oops, not very attentive today. Yeah, there is some minor changes I would propose. I'll edit them in in a sec.

Question 8

Thank you very much for adding those details, I'm going to time the different approaches you mention and see which one is better.

Graipher Graipher 41.6k7 gold badges70 silver badges134 bronze badges · Accepted Answer · 2016-09-13 07:51:10Z

If you were using python 2.x, there would be no need to do doors = [i for i in range(1, num_doors+1)], because range just returns a list. So just do:

doors = range(1, num_doors+1)

(similar for host_doors).

Since you are using python 3.x, range will return a range object, but you can just call list on it:

doors = list(range(1, num_doors+1))

Minor point: there is no need for the intermediate variable probability, just return successes / trials.

Python has an official style-guide, PEP8, which recommends not having unnecessary blank lines (such as between your if and else block).

Also, you should add one space in front between # and the actual comment.

It is also customary to use _ as a loop variable that is unused.

You can make your code slightly more efficient by only drawing the user's switched choice when actually switching:

def monty_hall (num_doors, trials, switch):
 successes = 0
 for _ in range(trials):
 # determine behind which door the prize is
 doors = list(range(1, num_doors+1))
 prize_door = np.random.choice(doors)
 # door the guest chooses. (Fixed to 1)
 guest_choice = 1
 doors.remove(guest_choice)
 # determine which door the host opens before guest is allowed to switch
 # this will be faster for larger lists than copying the `doors` list
 host_choice = prize_door
 while host_choice == prize_door:
 host_choice = np.random.choice(doors)
 doors.remove(host_choice)
 # determine which door the guest chooses if switching
 if switch:
 guest_choice = np.random.choice(doors)
 # check win condition
 if guest_choice == prize_door:
 successes += 1
 return successes / trials

It might actually be faster (for larger number of doors it will be), not to copy the doors for the host choice, but just take a random door from doors and take another one if it is the prize_door:

host_choice = prize_door
while host_choice == prize_door:
 host_choice = np.random.choice(doors)
doors.remove(host_choice)

For small lists this will be better:

host_doors = doors[:]
host_doors.remove(prize_door)
host_choice = np.random.choice(host_doors)
doors.remove(host_choice)

Overall the most costly part of this code is probably the removes, which can be quite costly for large lists if removing near the beginning (like for guest_choice). It would be better to store the doors as a set (for which deleting is O(1)). But that makes choosing one door harder (you have to do np.random.choice(tuple(doors)) everytime you draw). You should time both approaches and see what works best for your data.

Thank you for your answer. I wrote python 3.5 in the title but I could've mentioned it again in the body of the question. I didn't know python 2.x's range returned a list, I guess once i got comfortable with list comprehension I just started using it everywhere. Algorithmically speaking, is there anything I could optimize? For example, redudant logic or something long those lines. Or maybe something better than removing items from a list?
@jeremyradcliff Oops, not very attentive today. Yeah, there is some minor changes I would propose. I'll edit them in in a sec.
Thank you very much for adding those details, I'm going to time the different approaches you mention and see which one is better.

Stack Exchange Network

Monty Hall Optimization in Python 3.5

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Monty Hall Optimization in Python 3.5

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions