One of our most popular blog posts and YouTube videos (by far) is how to do Monte Carlo simulation in Excel. In the blog post we go through a lot of detail about what Monte Carlo simulation is. If you are interested in reading that first, you can find the blog post on Monte Carlo Simulation in Excel here. In that post we use a normal bell curve to determine the spread of different probabilities.
But that brings up lots of questions. Including this one.
Or this one...
Thanks for the question Dave & Andy. Here is your article and video.
What is a Poisson Distribution Anyways?
Thought you would never ask.
Confusing enough? Yeah... me too 🙂
Here is what you need to know.
x = 0, 1, 2, 3, ... means that our data points are "count" data. That means whole numbers, including zero (non-negative integers). This is also known as discrete data. The data points go from 0 to infinity.
lambda (λ) is simply the average: the mean number of events per period. (A handy Poisson quirk: the variance equals the mean, too.)
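If you like seeing the math behind the curve, here is a small Python sketch of the Poisson probability formula (just to illustrate; the Excel build comes next, and the average of 2 is a made-up example):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    # Probability of seeing exactly k events when the average is lam:
    # lam^k * e^(-lam) / k!
    return lam ** k * exp(-lam) / factorial(k)

# Example: with an average of 2 events per period, the chance of
# seeing exactly 0, 1, and 2 events in a period
probs = [poisson_pmf(k, 2) for k in range(3)]
```

All the probabilities across k = 0, 1, 2, ... sum to 100%, which is exactly why the cumulative version works as a lookup table later on.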
Poisson was a French mathematician, and amongst his many contributions he proposed the Poisson distribution. The classic example, modelling the number of Prussian soldiers accidentally killed by horse kicks, was actually worked out later by Ladislaus Bortkiewicz using Poisson's distribution. It became useful because it models counts of events, particularly uncommon events.
So... Let's Set Up the Model
Step 1 - Your Data
As I mentioned earlier, the biggest inputs that drive the Poisson distribution are the count data and the mean. Recognize that this is the implied mean that you want to use to build the hypothetical curve.
Step 2a - Poisson Distribution Curve
There are two steps here. First is setting up the cumulative Poisson distribution curve. Then adjusting for the boundaries: 0% and 100%.
So let's dive in. The Poisson distribution curve is set up with the POISSON.DIST() function, something like =POISSON.DIST(x, mean, TRUE). The TRUE argument is what makes it cumulative (FALSE would give you the probability of exactly x). Copy and paste down.
Step 2b - Adjust for Boundaries
Here is the thing... The only reason that we reached 100% on line 45 or so was because of rounding. If we had let this go to more and more decimal places, it would never quite hit 100%. Also, there is no 0%. We need bookends for our VLOOKUP: 0% to 100%. We are looking up probability values with no exact match, so the table has to cover the full range. So we adjust. Like a facebook meme by your ex... adjusting her crown. meh. 🙂
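If it helps to see the mechanics outside the sheet, here is a rough Python sketch of Steps 2a and 2b together. The mean of 3 is a made-up example, and I'm guessing at the exact sheet layout, but the shift-plus-bookend idea is the same:

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    # Cumulative probability P(X <= k) -- POISSON.DIST(k, lam, TRUE) in Excel terms
    return sum(lam ** i * exp(-lam) / factorial(i) for i in range(k + 1))

lam = 3  # hypothetical mean

# Step 2a: the raw cumulative column creeps toward 100% but never exactly hits it
curve = [poisson_cdf(k, lam) for k in range(15)]

# Step 2b: bookend adjustment. Shift the column down one row so the top
# lookup value is exactly 0%; anything above the last value falls through
# to the final row, which covers the rest of the way up to 100%.
lookup_column = [0.0] + curve[:-1]
```

Print out `curve` and you will see exactly the rounding issue described above: the values get closer and closer to 1 without ever equalling it.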
Step 3 - Monte Carlo Simulation - Finally
Next up we set up 1,000 iterations. Realize that this could have been 40 iterations or 1,000,000 iterations; I just chose 1,000. The Monte Carlo method actually takes its name from the Monte Carlo casino in Monaco, and casino math is the right way to think about it. Probabilities aren't a straight line. It's not that you lose 60% every time. You go cold for a bit, then hot for a bit. Casinos need to figure out risks hour by hour so they can figure out the risks for the long haul.
So you may ask. Yo Rick? (I reach back to my "yo" youth when I envision people reading my shit)... So yeah... "Yo Rick... why is this really useful when I have figured out the probabilities in everything you have shared so far"?
Wow. That was bold. Almost rude. It's like you were speaking to my soul.
Here is why. Various items in your model may include various distribution curves. Repeat visitors in a month may be a Poisson distribution. This may drive into other costs that may be a uniform distribution. Fixed costs are fixed. But Revenue may be a triangle distribution. You model all of these together and get a true feeling of the probability of you being profitable or not. Model this day over day, week over week, or month over month to get a clearer idea of the volatility of your model or business.
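To make that mix-of-distributions idea concrete, here is a hedged Python sketch of one possible monthly model. Every number in it (the average of 12 repeat visitors, the 2-to-4 dollar cost per visitor, the triangular revenue range, the 400 fixed cost) is a made-up assumption, and the Poisson draw uses Knuth's classic multiply-uniforms method rather than a lookup table:

```python
import random
from math import exp

def poisson_draw(lam):
    # Draw one Poisson-distributed count via Knuth's method:
    # multiply uniform randoms until the product drops below e^(-lam)
    threshold = exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p < threshold:
            return k
        k += 1

def one_month():
    repeat_visitors = poisson_draw(12)                          # Poisson: avg 12 repeat visitors
    variable_cost = random.uniform(2.0, 4.0) * repeat_visitors  # uniform cost per visitor
    revenue = random.triangular(500, 1500, 900)                 # triangular: low, high, most likely
    fixed_cost = 400                                            # fixed costs are fixed
    return revenue - variable_cost - fixed_cost

random.seed(11)  # fixed seed so the run is repeatable
profits = [one_month() for _ in range(1000)]
p_profitable = sum(p > 0 for p in profits) / len(profits)
```

Swap in your own distributions and assumptions, and `p_profitable` becomes exactly that "true feeling of the probability of being profitable."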
That's the power. Yahtzee !!!
So let's move on.
The VLOOKUP() function looks up the probabilities based on the RAND() function, which returns a random number between zero and one. With an approximate match, VLOOKUP returns the count whose cumulative probability band contains that random number.
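Strip away Excel and the RAND() plus VLOOKUP pairing is just inverse-transform sampling. Here is a hedged Python sketch of Step 3 (the mean of 3 and the 1,000 iterations are example choices, and the lookup table mirrors the bookended cumulative column from Step 2):

```python
import random
from math import exp, factorial

def poisson_cdf(k, lam):
    # Cumulative P(X <= k) -- the Excel POISSON.DIST(k, lam, TRUE)
    return sum(lam ** i * exp(-lam) / factorial(i) for i in range(k + 1))

def build_lookup(lam, max_k=30):
    # Each row: (lower edge of the probability band, count to return).
    # Row 0 starts at 0.0 -- the 0% bookend from Step 2b.
    return [(0.0 if k == 0 else poisson_cdf(k - 1, lam), k) for k in range(max_k + 1)]

def vlookup_approx(r, table):
    # Mimic VLOOKUP's approximate match: the largest lower edge <= r wins
    result = table[0][1]
    for bound, k in table:
        if bound <= r:
            result = k
    return result

random.seed(42)  # fixed seed so the run is repeatable
lam = 3          # hypothetical mean
table = build_lookup(lam)
iterations = [vlookup_approx(random.random(), table) for _ in range(1000)]
```

The approximate match is also why the 0% bookend matters: without it, a tiny random number would fall off the top of the table.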
Step 4 - Summary Statistics
So while I am typing this... I am watching some Netflix something or other. Girl on TV says... "there is a couch in my room. Stay there until you pass out." War dream quickly follows. Sounds like a good choice. So I will wrap this up.
So where do you go from here? You have a Poisson curve from the mean and count data. How do you summarize all of this in a way that is meaningful? First off, there are a few things that are obvious when they are stated, but they need to be stated in order for them to be obvious. Like.. if you are running 1k observations... then everything will end up close to the average, Rick... so why not use the average?
Because you may not have 1k customers a day, week, or month. Your customer count per day, week, or month may be nowhere near 1k. You will have variation that is explainable. Risk that is explainable. Financial and operational modelling that can be wiser. As a financial or Excel modeller, a person that slings data for a living, you know there are intricacies within every business problem that require nimbleness and readiness.
That's who most of us are. Slinging data for a living. So let's summarize.
So that's it. Run some summary statistics on the 1k or so iterations. Because RAND() is a volatile function, it recalculates every time the worksheet changes (or whenever you press F9), so your numbers will shuffle around. Also, think about how this fits your business case.
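For the summary itself, the usual set is mean, standard deviation, and a few percentiles. Here is a quick Python sketch over a hypothetical stand-in list of iteration results; in the sheet these would roughly be AVERAGE(), STDEV.S(), and PERCENTILE.INC():

```python
import random
import statistics

random.seed(7)
# Hypothetical stand-in for the 1,000 simulated counts from Step 3
iterations = [random.randint(0, 8) for _ in range(1000)]

mean = statistics.mean(iterations)           # AVERAGE() in the sheet
stdev = statistics.stdev(iterations)         # STDEV.S()
qs = statistics.quantiles(iterations, n=20)  # cut points at 5%, 10%, ..., 95%
p5, p95 = qs[0], qs[-1]                      # roughly PERCENTILE.INC at 0.05 and 0.95
```

The 5th-to-95th percentile band is usually the most useful takeaway: it tells you the range your count lands in 90% of the time, which is a lot more honest than quoting the average alone.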
Is this interesting? Like more of these long form text/video posts? Jordan and I are considering a monthly membership program where we share this info "behind the wall". Let me know if you like this stuff.