Copyright 2017 Jason Ross, All Rights Reserved

Here's what you could have won!

The Problem

You are a contestant on a game show, and are allowed to select one of three doors. Behind one of the doors there is a car that you can win, and behind the other two doors are goats (you don’t get to keep the goats!). You choose a door and then the host, who knows what is behind every door, opens another door to reveal a goat. You are then given the chance to change your selection to the remaining door.

Should you stick with your original selection, or choose the other door instead?

A Little Background

The Monty Hall Problem is a classic probability problem based on the US TV programme “Let’s Make A Deal”, hosted by Monty Hall. It was originally posed in the “American Statistician” in 1975, and became famous when it appeared in Marilyn vos Savant’s “Ask Marilyn” column in the US “Parade” magazine in 1990. I first saw it as the “Three Prisoners Problem” in one of Martin Gardner’s many books, which I’ve been a fan of since I discovered one in a book store when I was about 12. I’m an engineer – this sort of thing is normal for me!

Looking A Little Closer

There are some other conditions that we need to clarify about the problem:

  • The host knows where the car is, and will always open a door that reveals a goat.
  • The host will never open the door that you’ve chosen.
  • The host always gives you the chance to change your selection to the door that remains closed.

Initially it doesn’t seem like changing will make any difference. After all, your chance of winning to start with was 1 in 3, and now that there are only two doors left it seems like the chance of winning the car is 1 in 2 whichever door you choose. Why would the host showing you a door that has a goat behind it affect what is behind the door you’ve already chosen? This is what I thought when I first read the puzzle, before school had taught me how probability can get very strange very quickly.

In her column, Marilyn vos Savant argued that you should choose the other door, and that by doing so you would improve your chance of winning the car to 2 in 3. This sounds somewhat counter-intuitive if not just plain wrong, and apparently she received thousands of letters saying she was wrong. If you look through the Wikipedia article on the puzzle at https://en.wikipedia.org/wiki/Monty_Hall_problem you’ll see a variety of ways of calculating the probability, ranging from simple solutions through to Bayesian analysis. The way I was taught was pretty much the simple solution:

With your original selection, you have a 1 in 3 chance of winning the car. The other two doors together have a 2 in 3 chance of concealing the car.

When one of those two doors is opened to reveal a goat, it doesn’t change the probability: the door you originally chose still has a 1 in 3 chance of being the door with the car. The other two (one of which is now open) still have a combined 2 in 3 chance of concealing the car. But of those two, one door is open and doesn’t have the car behind it, so there’s a 2 in 3 chance that the other door is the one concealing the car. Therefore changing your selection increases your chance of winning to 2 in 3.
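
The argument above can also be checked without any randomness by listing every possible game and adding up the probabilities. The following is a minimal sketch rather than part of this page's script: the function name exactWinProbability is made up for illustration, and it assumes the car and the contestant's first pick are each equally likely to be any door, and that the host picks evenly when he has two goats to choose from.

```javascript
// Illustrative sketch: exactWinProbability is a hypothetical name, not part of the page's script.
function exactWinProbability(changeSelection) {
    const doors = [0, 1, 2];
    let winProbability = 0;

    for (const carIndex of doors) {
        for (const userIndex of doors) {
            // Assumption: each (car, first pick) pair is equally likely, 1/3 * 1/3.
            const pairProbability = 1 / 9;

            // Doors the host may open: not the car, not the contestant's pick.
            const openable = doors.filter(d => d !== carIndex && d !== userIndex);

            for (const openedDoorIndex of openable) {
                // Assumption: the host picks evenly among the doors he may open.
                const branchProbability = pairProbability / openable.length;

                // Either move to the one remaining closed door, or stay put.
                const finalIndex = changeSelection
                    ? doors.find(d => d !== userIndex && d !== openedDoorIndex)
                    : userIndex;

                if (finalIndex === carIndex) {
                    winProbability += branchProbability;
                }
            }
        }
    }

    return winProbability;
}

console.log("stick: ", exactWinProbability(false).toFixed(4)); // prints 0.3333
console.log("switch:", exactWinProbability(true).toFixed(4));  // prints 0.6667
```

Sticking wins only in the three cases where the first pick was already the car, while switching wins in the six cases where it was not, which is where the 1 in 3 and 2 in 3 figures come from.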

Conclusion. Maybe.

So that’s that: you should always change your selection, and then the chance of winning the car is 2 in 3.

Or is it? Maybe you’re still not convinced – it took me a while – and logical arguments are all very well, but a practical demonstration is always nice if only to prove the theory correct.

The best demonstration might be to repeatedly simulate the whole process of selecting one door, opening another and then seeing how often the contestant wins the car when they keep the same selection. Then we can run the simulation again altering it slightly so that the contestant changes their selection.

There are a couple of other things to consider for the simulation. Firstly, it needs to be run somewhere more powerful than the web server you’re reading this on. Secondly, it should be run in a way that means this web site can’t influence the results. Luckily there’s a way to manage both: JavaScript. Running a script in the browser relieves the web server of having to do the job, and because the script executes on your machine rather than on the server, the server can’t affect the outcome.

The script below implements the simulation and, assuming you let this site run JavaScript, will run when you press the "Run Simulation" button further down the page:


function chooseRandomIndex() {
    // Choose a random index in the range 0 to 2 inclusive.
    return Math.floor(Math.random() * 3);
}

function runSingleSimulation(changeSelection) {
    // Run the simulation once. If "changeSelection" is true, then the contestant will
    // change their mind after the host opens a door. If it is false, the contestant
    // will stick with the same door.

    // Decide where the car is hidden.
    var carIndex = chooseRandomIndex();

    // Let the "contestant" choose a door.
    var userIndex = chooseRandomIndex();

    // Choose a door to open to the contestant. This can't be either:
    // 1. The door with the car behind it
    // or
    // 2. The door that the contestant has chosen
    //
    // To do this, create an array of the possible indexes, then filter
    // out the ones we can't use.
    var validOpenDoorIndexes = [0, 1, 2].filter(value => {
        return (value != carIndex) && (value != userIndex);
    });

    // Now choose a door at random to open to the contestant
    var openedDoorIndex = validOpenDoorIndexes[Math.floor(Math.random() * validOpenDoorIndexes.length)];

    // If the contestant is changing their mind, implement that here
    if (changeSelection) {
        // There is only one door left that the contestant can choose.
        // This is the door that is neither:
        // 1. The door that the contestant initially chose
        // nor
        // 2. The door that was opened
        var remainingDoorIndex = [0, 1, 2].filter(value => {
            return (value != userIndex) && (value != openedDoorIndex);
        })[0];

        userIndex = remainingDoorIndex;
    }

    // Return whether the contestant won the car (true) or lost (false).
    return (userIndex == carIndex);
}

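function runMontyHallSimulation(timesToRun, changeSelection) {
    // Run the simulation "timesToRun" times using the given strategy, and
    // return the number of games in which the contestant won the car.
    var wins = 0;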

    for (var count = 0; count < timesToRun; ++count) {
        if (runSingleSimulation(changeSelection)) {
            ++wins;
        }
    }

    return wins;
}

As you can see from the comments, most of the simulation is in the runSingleSimulation function, which follows these steps:

  1. Randomly choose the door that has the car behind it.
  2. Let the contestant choose a door. This is done using a random number generator.
  3. Choose the door to open to the contestant. This can’t be the door with the car, or the door that the contestant has chosen. If the contestant has already chosen the door with the car, then there will be two possible doors that can be opened. In this case choose a door at random from these two, otherwise open the only door that fits these criteria.
  4. Allow the contestant to change their selection. This is done by passing a parameter to indicate whether the contestant always changes their choice, or always sticks with their original choice. If the contestant is changing their selection, calculate the one remaining door that they can choose, and set their selection to that.

The runSingleSimulation function is called repeatedly from runMontyHallSimulation, which in turn is configured, called and timed from the start function. The start function is itself called when the page loads. The same number of simulations is run with the contestant changing their selection as with the contestant sticking with their original selection.
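
The start function itself isn't listed here, so the following is only a guess at what a driver of that kind might look like, not the code this page runs. The names playOnce, timeStrategy and startSketch are hypothetical, the simulation is repeated in compact form so the sketch runs on its own, and timing with Date.now() is an assumption.

```javascript
// Hypothetical compact version of a single game, so this sketch is self-contained.
function playOnce(changeSelection, random) {
    const doors = [0, 1, 2];
    const carIndex = Math.floor(random() * 3);
    let userIndex = Math.floor(random() * 3);

    // The host opens a door that is neither the car nor the contestant's pick.
    const openable = doors.filter(d => d !== carIndex && d !== userIndex);
    const openedDoorIndex = openable[Math.floor(random() * openable.length)];

    if (changeSelection) {
        // Move to the only door that is still closed and not already chosen.
        userIndex = doors.find(d => d !== userIndex && d !== openedDoorIndex);
    }

    return userIndex === carIndex;
}

// Run one strategy "timesToRun" times and record wins and elapsed time.
function timeStrategy(timesToRun, changeSelection, random = Math.random) {
    const startedAt = Date.now();
    let wins = 0;

    for (let count = 0; count < timesToRun; ++count) {
        if (playOnce(changeSelection, random)) {
            ++wins;
        }
    }

    return {
        timesToRun,
        wins,
        winPercentage: (100 * wins) / timesToRun,
        durationMs: Date.now() - startedAt
    };
}

// Assumed shape of a driver: the same number of games for each strategy.
function startSketch(timesToRun) {
    return {
        changed: timeStrategy(timesToRun, true),
        notChanged: timeStrategy(timesToRun, false)
    };
}

const results = startSketch(100000);
console.log("Changed:    ", results.changed);
console.log("Not changed:", results.notChanged);
```

Passing the random source in as a parameter is a design choice of this sketch: it lets the same code be run with a repeatable generator when checking it, while defaulting to Math.random in normal use.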

There are a few other functions in the script, but they’re utility functions geared around updating the page when the simulations are complete.

Remember this script is running in your browser: all of the selections are made by the random number generator specific to your operating system and browser. This site’s web server knows nothing about what happens on your machine (it’s not using cookies or anything else) and it’s certainly not controlling anything beyond supplying the regular JavaScript. Refresh the page and the counts will change; it’s all done on your machine.

The Simulation

                                    Contestant changed their choice    Contestant DID NOT change their choice
Total simulation count:
Games won:                          ( %)                               ( %)
Total duration to run simulation:   ms                                 ms

Summary

There are bound to be some who will look at the results and say “That’s just anecdotal evidence, and the plural of anecdote is not data!” However, I was taught that 10000 tests is enough for statistical significance, and this code runs 200000 tests. That quote is wrong anyway (Dan Nguyen’s blog shows the original).
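
To put a rough number on how many tests are "enough", one common approach is the normal approximation to the binomial distribution, which gives a margin of error for a measured win rate. This is a sketch under that assumption, with independent games and the usual 1.96 multiplier for a 95% interval; the function names are made up for illustration.

```javascript
// Standard error of a proportion p measured over n independent games.
function standardError(p, n) {
    return Math.sqrt((p * (1 - p)) / n);
}

// Approximate 95% margin of error (normal approximation, assumed here).
function marginOfError95(p, n) {
    return 1.96 * standardError(p, n);
}

// How the margin shrinks as the number of games grows, for a true win rate of 2 in 3.
for (const n of [100, 10000, 100000]) {
    const marginPercent = 100 * marginOfError95(2 / 3, n);
    console.log(n + " games: +/- " + marginPercent.toFixed(2) + " percentage points");
}
```

With 10000 games and a true win rate of 2 in 3, the margin comes out at a little under one percentage point, so a measured rate near 67% is easy to tell apart from 50%.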

The most important thing is that the evidence provided by the simulation seems to confirm the theory that changing the selected door increases the chance of winning the car to 2 in 3. If the results had been drastically different, that would have pointed to either the theory or the simulation being wrong, and would have needed further investigation. In either case the simulation would have provided value, and it saves a lot of time over the alternative of actually playing the game.

When you’re faced with a situation where you’re not sure of the potential outcomes, it’s often best to write a simulation in software. You can re-run the simulation easily and quickly, making changes to see what effect external factors have. Simulations can range in complexity from a coin toss to financial predictions, and they’re almost always easier and cheaper than just proceeding with no idea of the possibilities. Just don’t expect to win a car, or even a goat.