Barry L. Nelson Northwestern University


Presentation on theme: "Barry L. Nelson Northwestern University"— Presentation transcript:

1 Barry L. Nelson Northwestern University
…adapted from… The MORE Plot: Displaying Measures Of Risk & Error from Simulation Output Barry L. Nelson Northwestern University

2 The risk myth: No one will understand
The answer goes something like this: Remember statistics class? I didn’t think so. You can argue about whether statistics is just hard, or the way we teach it makes it hard, but either way the main messages get lost. READ DILBERT…. For simulation I think the most important message that gets lost is the difference between risk and error. But I can make you an expert in 5 minutes, then show you why it is so important.

3 Risk for the masses Likely Unlikely 273 199 378
Suppose we have run a simulation and one of the outputs is the number of barrels, in thousands, of a particular chemical that we need annually. This number depends on a complex host of things: demand for our product, yield loss, etc. We might be interested in how much to stock or in whether we should pay for an option to get more at a fixed price later in the year. We get a histogram because we simulated yearly need for this chemical, and simulated many years, just like we simulated many games of Jai Alai. Just like in Jai Alai, there are at least two questions: How many barrels should we expect to use, and have we done enough simulation to really answer that question? Humans love to average so let’s drop in the sample average. Clearly we could need much more or much less than the average, so let’s also mark off a big chunk of the possible need, and label it in an easy to understand way. Right away we get what I believe is the important insight: that the future is pretty variable and our needs can be within a wide range. In baseball, a player’s batting average last year is a meaningful historical statistic. But a simulation is not trying to create history; most often it is trying to say something about what will happen in the future and whether we can live with that, and the average doesn’t tell us. But have we done enough simulation to be confident in making any decision yet? As a final embellishment let’s put in a measure of error on each of those arrow heads. These say that we are highly confident the arrow head belongs SOMEWHERE in each interval, we just aren’t sure where. Do you think we have done enough simulation to make a decision yet? So just like in the Jai Alai simulation, let’s do some more runs.

4 Risk for the masses Likely Unlikely 273 199 378
5th percentile of the observed data; 95th percentile of the observed data

5 Risk for the masses Likely Unlikely 273 199 378
5th percentile of the observed data; 95th percentile of the observed data
Confidence interval (95%) for the 5th percentile; confidence interval (95%) for the 95th percentile
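The ingredients of the plot described on these slides (the sample average, a "likely" box between the observed 5th and 95th percentiles, and a measure of error on the average) can be sketched in a few lines. This is only a rough illustration, not the presenter's code: the function name `more_summary`, the normal-theory interval for the mean, the rank rule for the percentiles, and the synthetic "annual need" data are all assumptions made for the example.

```python
import math
import random
import statistics


def more_summary(data, lo=0.05, hi=0.95, z=1.96):
    """Return the pieces of a MORE-style summary for one simulation output."""
    xs = sorted(data)
    n = len(xs)
    mean = statistics.fmean(xs)
    # Measure of error on the average: approximate 95% confidence interval.
    half_width = z * statistics.stdev(xs) / math.sqrt(n)

    def pct(p):
        # Observed percentile as an order statistic (1-based rank, clamped to [1, n]).
        rank = min(max(math.ceil(p * n), 1), n)
        return xs[rank - 1]

    return {
        "mean": mean,
        "mean_ci": (mean - half_width, mean + half_width),
        "likely": (pct(lo), pct(hi)),  # measure of risk: the "likely" box
    }


# Hypothetical stand-in for the simulation: annual need in thousands of barrels.
rng = random.Random(1)
need = [rng.gauss(280, 55) for _ in range(200)]
summary = more_summary(need)
print(summary)
```

The "likely" box describes how variable the future is; the interval on the mean only describes how well the simulation has pinned down the average.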

6 Nelson’s Method: What does this really produce?
Build the confidence interval for the b-th percentile using the OBSERVED b1-th percentile as the lower endpoint and the OBSERVED b2-th percentile as the upper endpoint

7 Taking a Look
n       lower    sample   upper    sample
100     0.0071   1        0.0929   9
500     0.0309   15       0.0691   35
1000    0.0365   36       0.0635   64
2000    0.0404   81       0.0596   119
10000   0.0457   457      0.0543   543
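The slides do not state the formula behind these numbers. The values are, however, consistent with taking b = 0.05, setting b1 and b2 to b ∓ 1.96·sqrt(b(1 − b)/(n − 1)), and reading the "sample" columns as the ranks n·b1 and n·b2 rounded to the nearest integer. The sketch below reproduces the table under that inferred reading; the function name `percentile_ci_ranks` is made up for the illustration.

```python
import math


def percentile_ci_ranks(n, b=0.05, z=1.96):
    """Percentile levels (b1, b2) and order-statistic ranks bracketing the b-th percentile.

    Assumption inferred from the tabulated values, not stated in the slides:
    a normal approximation with n - 1 in the denominator and ranks rounded
    to the nearest integer.
    """
    half_width = z * math.sqrt(b * (1 - b) / (n - 1))
    b1, b2 = b - half_width, b + half_width
    lower_rank = max(1, round(n * b1))  # which sorted observation is the lower endpoint
    upper_rank = min(n, round(n * b2))  # which sorted observation is the upper endpoint
    return b1, lower_rank, b2, upper_rank


for n in (100, 500, 1000, 2000, 10000):
    b1, r1, b2, r2 = percentile_ci_ranks(n)
    print(f"{n:>6} {b1:.4f} {r1:>4} {b2:.4f} {r2:>4}")
```

With n = 100, for example, the interval for the 5th percentile runs from the smallest observation to the 9th smallest.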

8 As we simulate more…
Here is the result we get if we run the simulation for many more years. Notice that uncertainty about the future does not disappear; you can’t simulate away risk. But we do improve our estimate of future uncertainty by running the simulation longer. With this information we can balance the various costs associated with the decision and do something rational. “Use MOE to get MOR” means use measures of error to get measures of risk. The big box is a measure of future risk, and that is probably what you need to support your decision. The little intervals are measures of error; they tell us if we have done enough simulation. BTW, nothing beyond Stat 100 was used to build this plot. Now my research colleagues in the audience are very nervous. They have questions like: What did you assume about the data? What if I am only interested in upside risk? What if I want to change the definition of “likely”? All valid questions that entirely miss the point: If a plot like this were our default, our starting point in displaying simulation output, then it would encourage people to consider risk and help them decide when they have a good estimate of it.
Plot annotations: 280, 190, 380 don’t move much; interval widths shrink.
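A quick way to see both effects, percentile estimates that stay roughly put and error intervals that shrink, is to repeat the calculation at several run lengths. The sketch below is an illustration only: the Gaussian "annual need" model is invented, and the rank rule for the interval (normal approximation with n − 1, ranks rounded to the nearest integer) is an assumption, not something the slides spell out.

```python
import math
import random


def percentile_ci(data, b=0.05, z=1.96):
    """Approximate 95% CI for the b-th percentile, built from two order statistics."""
    xs = sorted(data)
    n = len(xs)
    half_width = z * math.sqrt(b * (1 - b) / (n - 1))
    lo_rank = max(1, round(n * (b - half_width)))
    hi_rank = min(n, round(n * (b + half_width)))
    return xs[lo_rank - 1], xs[hi_rank - 1]


rng = random.Random(7)
widths = {}
for n in (100, 1000, 10000):
    # Hypothetical simulation output: annual need in thousands of barrels.
    need = [rng.gauss(280, 55) for _ in range(n)]
    lo, hi = percentile_ci(need)
    widths[n] = hi - lo
    print(n, round(lo, 1), round(hi, 1), round(hi - lo, 1))
```

The interval for the 5th percentile tightens as n grows, while the percentile itself, the edge of the risk box, stays in about the same place: more simulation reduces error, not risk.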

9 What happens when you simulate more?
Let’s do a little bit more practice before looking at an example. Here are results from a simulation of an order fulfillment system looking at the time from order receipt to delivery, which I have called “cycle time.” We want to decide how long to promise when we take orders on our web site so we have very little chance of being late. And just like in the eye doctor’s office, I will ask you if each chart is better or worse. The first plot is the initial simulation we ran. Have we run long enough yet? What did you look at to decide a promise date? How about now? Now? What would you promise? I hope I have convinced you that the idea that no one can understand risk, and how it relates to how long we ran the simulation, is a myth. But I haven’t answered the original question: How much risk could an IE miss if an IE did miss risk?

10 What sells?
Here is one thing I learned about the magazine business: Who is on the cover matters for some titles. If, for instance, you get this cover [Aniston], then the magazines fly out of the pockets. But this cover [Fowler], not so much. And because it is the same cover at all stores, pretty much the same thing happens at all stores for these kinds of titles. Thus there can be big system-wide swings in sales. Where does that get reflected in the long-run average profit? Here are simulation results for two titles with the same weekly demand distribution, except that one of them does not have this common cover effect while the other does. You are now all experts at looking at risk, but just to help a bit more I’ll drop in a highlight at 0 profit. Both titles would end up with the same weekly stocking quantity, as they should, since their long-run average profits store by store are the same. But there is a lot more cash flow risk when there is a common cover effect. If I am unprepared for these big swings, if they are unexpected, then a few bad weeks might cause me to quickly abandon my “optimal” stocking policy thinking it must be wrong. And that could be a big mistake, particularly if I use an ad hoc fix that results in an unknown loss of potential profit over the long run. That is a perfectly good simulation gone bad because we did not also measure risk.

11 What sells?

12 What is our estimate of the likelihood that the next observation falls within the endpoints? (Called a prediction interval.) It may be worth noting that SLAM, when used to display the entire empirical distribution, has a default display sort of like this, except with no CIs on the mean and percentiles.
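One rough way to answer the question on this slide is empirical: fix the endpoints from one set of runs, then count how often fresh observations land between them. The sketch below does this for the observed 5th and 95th percentiles; the Gaussian stand-in model, the sample sizes, and the helper name `simulate_need` are assumptions chosen only for illustration.

```python
import math
import random

rng = random.Random(3)


def simulate_need():
    # Hypothetical stand-in for one simulated year (thousands of barrels).
    return rng.gauss(280, 55)


# Endpoints of the "likely" box from an initial set of runs.
sample = sorted(simulate_need() for _ in range(1000))
n = len(sample)
low = sample[math.ceil(0.05 * n) - 1]   # observed 5th percentile
high = sample[math.ceil(0.95 * n) - 1]  # observed 95th percentile

# Fraction of new, independent observations that fall inside the box.
fresh = [simulate_need() for _ in range(10000)]
coverage = sum(low <= x <= high for x in fresh) / len(fresh)
print(f"box = [{low:.1f}, {high:.1f}], estimated coverage = {coverage:.3f}")
```

The estimate should sit near 0.90, but it is itself subject to error because the endpoints were estimated, which is one reason precise probabilistic statements about the box are hard to make.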

13 CONCLUSIONS
MORE displays the difference between a CI on the mean and a prediction interval (for subsequent observations)
MORE shows the effects of simulation sample size on predictors
Precise probabilistic statements about the values calculated are elusive

