Oct. 25, 2020
Author - manisar
The Monty Hall problem is usually described with 3 doors, but the generalized version explained on this page makes it easier to understand why the answer works.
In the tool given here, we can modify the total number of doors (n), the number of doors selected by the host (y), the number of those selected doors that the host opens (x), and the total number of winning doors (t).
The Monty Hall problem is a very simple yet very interesting paradoxical question. I gave the wrong answer when I came across it for the first time.
I decided to dive deeper, and came up with this generalized tool that makes the computer play against itself thousands of times and visually shows the results for different numbers of doors and for the different strategies. By tweaking the number of total, opened and winning doors below and looking at the doors and the results, we get both the proof and a feel for why it must be true. If you want to look at the equations first, check The Monty Hall Paradox Probability Equations.
A quick recap from Wikipedia...
The Monty Hall problem is a brain teaser, in the form of a probability puzzle, loosely based on the American television game show Let's Make a Deal and named after its original host, Monty Hall.
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
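The classic three-door game quoted above is easy to simulate. The following is a minimal sketch, not the code behind this page: the function names `monty_hall_trial` and `estimate` are made up for illustration, and it assumes the host always opens a goat door other than the player's pick.

```python
import random

def monty_hall_trial(rng):
    """Play one classic 3-door game; return (stay_wins, switch_wins)."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # door hiding the car
    pick = rng.choice(doors)   # player's initial choice
    # Assumption: the host opens a goat door that is not the player's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    # Switching means taking the only other door that is still closed.
    switched = next(d for d in doors if d != pick and d != opened)
    return pick == car, switched == car

def estimate(trials=100_000, seed=1):
    """Return the win frequencies of staying and of switching."""
    rng = random.Random(seed)
    stay = switch = 0
    for _ in range(trials):
        stay_wins, switch_wins = monty_hall_trial(rng)
        stay += stay_wins
        switch += switch_wins
    return stay / trials, switch / trials

if __name__ == "__main__":
    print(estimate())
```

With enough trials, the first frequency should settle near 1/3 and the second near 2/3, since exactly one of the two choices wins in every game.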
So, basically there are three strategies for the player to select from (have a look at The General Case on this page).
If you hit the Play button below, the background Python service plays this game, applying all three strategies separately. It does 100 runs of 1000 iterations each, and shows the winning frequency for each of the three strategies. Try it!
Run it for different numbers of doors as well. After a few runs, comparing the results of the three strategies in each run, we start to see a pattern in their relative strengths (e.g. the chance that the y-group (brown doors) as a whole holds the prize is directly proportional to its size: the bigger the y-group, the greater its winning chances, and strategy 3 benefits because that whole chance ends up concentrated on the doors of the group that remain closed).
See if you can start predicting the relative ratio of wins in the three strategies. It's even possible to generalize the formulae to include multiple winning doors (t). Elsewhere on this page, you'll find the formulae for finding this ratio for any n, y, x and t.
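For readers who want to experiment offline, here is a rough sketch of what such a simulation could look like. It is not the actual service behind this page. The names `play_once` and `simulate` are hypothetical, and the sketch assumes that the host's group of y doors never contains the player's door, that the host opens only losing doors, and that the three strategies are: keep the original door, pick at random among all unopened doors, and pick at random among the unopened doors of the host's group.

```python
import random

def play_once(n, y, x, t, rng):
    """Play one game and report whether each of the three strategies wins."""
    doors = list(range(n))
    winners = set(rng.sample(doors, t))   # the t winning doors
    player = rng.choice(doors)            # player's initial door
    # Assumption: the host's group is y random doors, never the player's door.
    group = rng.sample([d for d in doors if d != player], y)
    # The host knows the winners and opens x losing doors from the group.
    opened = set(rng.sample([d for d in group if d not in winners], x))
    unopened_all = [d for d in doors if d not in opened]
    unopened_group = [d for d in group if d not in opened]
    return (
        player in winners,                      # strategy 1: keep the door
        rng.choice(unopened_all) in winners,    # strategy 2: any unopened door
        rng.choice(unopened_group) in winners,  # strategy 3: unopened door in the group
    )

def simulate(n=3, y=2, x=1, t=1, runs=100, iterations=1000, seed=0):
    """Return the win frequency of each strategy over runs * iterations games."""
    if not (t >= 1 and x >= 0 and x + t <= y <= n - 1):
        raise ValueError("need t >= 1, x >= 0 and x + t <= y <= n - 1")
    rng = random.Random(seed)
    total = runs * iterations
    wins = [0, 0, 0]
    for _ in range(total):
        for i, won in enumerate(play_once(n, y, x, t, rng)):
            wins[i] += won
    return [w / total for w in wins]

if __name__ == "__main__":
    print(simulate())
```

The condition x + t <= y keeps the host from running out of losing doors to open, and y <= n - 1 leaves room for the player's door outside the group. With the defaults (the classic 3-door game), the three frequencies should come out close to 1/3, 1/2 and 2/3.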
[Interactive tool. Description of doors: Door Opened by Host; Door Selected by Host; Door Selected by Player; Door not Selected by Host or Player. Results shown: Strategy 1 Wins; Strategy 2 Wins; Strategy 3 Wins; and the ratio Strategy 1 Wins : Strategy 2 Wins : Strategy 3 Wins.]
The key is to recognize that the non-winning door is not opened at random from all the doors; instead, it is opened (and thus removed) from a specific proper subset of two doors that excludes the player's initial choice. This gives that subset an unfair advantage, and that is what the paradox is all about.
For more insight, read the If You are still Confused section in The Monty Hall Paradox Probability Equations.
Generalizing the problem to any number of doors gives us a feel for how it works.
Let's say that, instead of 3 doors, we have n doors, and the player selects one of them. Now, before this door is opened, the host selects a subset of y doors from the n doors and then opens x of these y doors. The question for the player, in its general form, is to select one of the following three strategies:
1. Stick with the door selected originally.
2. Pick a door at random from all the n-x doors that are still unopened.
3. Pick a door at random from the y-x unopened doors that remain in the host's group.
Comparing 2 and 3 above, it becomes quite clear why 3 is better than 2. It's simply because y-x is a smaller number than n-x, while the y group keeps its whole share of the winning probability (as explained below), and hence a door selected from the y-x doors has a greater chance of winning.
But why is 1 the poorest choice?
The answer is that a group of doors (here, y) has a bigger probability of winning than a single door. The group y has a winning chance of y/n, as compared to 1/n for a single door. Now, if no door in y has been opened, a single door within y has a winning probability of 1/y, given that the prize is in y. So, the combined winning probability of a door in y becomes y/n * 1/y = 1/n. So far so good.
But if x doors have been opened from y doors with no prizes behind them, all the winning probability of the y group now comes to the remaining y-x doors in y. So, now, the winning probability of a single door in y becomes y/n * 1/(y-x) = (y/(y-x)) * 1/n. We see that this number is bigger than 1/n. Hence, strategy 3 is better than 1.
Finally, if the user just randomly bets on a door from all the remaining unopened doors (strategy 2), their chances are simply 1/(n-x). A little algebra between (y/(y-x)) * 1/n, 1/(n-x) and 1/n shows the relative magnitude of each, which is: Strategy 1 < Strategy 2 < Strategy 3. In fact, I'm leaving it as an exercise for the reader to verify that the actual winning ratio among the three strategies as shown by the tool is the same as 1/n : 1/(n-x) : (y/(y-x)) * 1/n (for number of winning doors t > 1, t replaces 1 in the numerator of each of these). Or, if you are in a hurry, have a quick look here.
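The three expressions above can also be evaluated exactly. The sketch below is only an illustration (the names `win_probabilities` and `ordering_holds` are made up here). It assumes the parameters satisfy x + t <= y <= n - 1, so that the host always has enough losing doors to open, and it checks the claimed ordering by brute force over small values rather than by algebra.

```python
from fractions import Fraction

def win_probabilities(n, y, x, t=1):
    """Exact winning chances of strategies 1, 2 and 3 from the formulas above."""
    if not (t >= 1 and x >= 0 and x + t <= y <= n - 1):
        raise ValueError("need t >= 1, x >= 0 and x + t <= y <= n - 1")
    stay = Fraction(t, n)                         # strategy 1: t/n
    any_unopened = Fraction(t, n - x)             # strategy 2: t/(n-x)
    host_group = Fraction(y, y - x) * Fraction(t, n)  # strategy 3: (y/(y-x)) * t/n
    return stay, any_unopened, host_group

def ordering_holds(max_n=12):
    """Check Strategy 1 < Strategy 2 < Strategy 3 for all valid small cases with x >= 1."""
    for n in range(3, max_n + 1):
        for y in range(2, n):
            for t in range(1, y):
                for x in range(1, y - t + 1):
                    s1, s2, s3 = win_probabilities(n, y, x, t)
                    if not (s1 < s2 < s3):
                        return False
    return True

if __name__ == "__main__":
    print(win_probabilities(3, 2, 1))  # classic game: 1/3, 1/2, 2/3
    print(ordering_holds())
```

Using `Fraction` keeps the ratios exact, which makes them easy to compare with the ratio of wins reported by the tool.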
If you set n = 10, y = 9 and x = 8 (with t = 1) in the tool on this page, you'll clearly see why selecting a door from the y subset is the smartest choice: only one door of the y group is left unopened, and it carries the group's whole winning chance of 9/10. And then hit Play to seal the deal.