
Output Data Analysis

Introduction
The most serious disadvantage of simulation:
–With a stochastic simulation, you don't get exact "answers" from a run
–Two different runs of the same model give different numerical results
Thus, the output performance measures are really observations from their probability distributions. We then ask questions about these distributions:
–Mean (expected value): E(average WIP)
–Variance: Var(average WIP)
–Probabilities: P(average WIP > 250)
–Quantiles: What value of x is such that P(avg. WIP > x) ≤ 0.02?
Random input (random numbers, random variates) feeds the simulation model/code, which produces random output (the performance measures).

Introduction
Interpreting simulation output requires statistical analysis of the output data. Failure to recognize and deal with randomness in simulation output can lead to serious errors, misinterpretation, and bad decisions. You must take care to use appropriate statistical methods, since simulation output data are usually nonstationary, autocorrelated, and non-normal, contrary to the assumptions behind classical IID statistical methods. Enhanced computer power and speed are making it much easier to carry out appropriate simulation studies and analyze the output properly.

Statistical Nature of Simulation Output
Let Y1, Y2, ... be an output process from a single simulation run:
–Yi = delay in queue of the ith arriving customer
–Yi = production in the ith hour in a factory
The Yi's are random variables that are generally neither independent nor identically distributed (nor normally distributed), so classical IID normal-theory statistical methods don't apply directly to the Yi's.
Let y11, y12, ..., y1m be a realization of the random variables Y1, Y2, ..., Ym resulting from making a single simulation run of length m observations, using a particular stream of underlying U(0, 1) random numbers. If we use a separate stream of random numbers for another simulation run of this same length, we get a realization y21, y22, ..., y2m that is independent of, but identically distributed to, y11, y12, ..., y1m.

Statistical Nature of Simulation Output
Make n such independent runs, each using "fresh" random numbers, to get
y11, y12, ..., y1m
y21, y22, ..., y2m
...
yn1, yn2, ..., ynm
Within a row, the observations are not IID. Across the ith column, the observations are IID realizations of the random variable Yi (but still not necessarily normally distributed). You can compute a summary measure within a run, and then the summary measures across the runs are IID (but still not necessarily normally distributed).

Statistical Nature of Simulation Output
Example: Bank with 5 tellers, one FIFO queue, open 9am-5pm, flush out before stopping, interarrivals ~ expo(mean = 1 min.), service times ~ expo(mean = 4 min.). Summary measures from 10 runs (replications) – number served, finish time (hours), average delay in queue (min.), average queue length, and proportion of customers delayed < 5 min. – clearly show there's variation across runs; you need appropriate statistical analysis.

Types of Output Performance Measures
What do you want to know about the system?
–Average time in system
–Worst (longest) time in system
–Average, worst time in queue(s)
–Average, worst, best number of "good" pieces produced per day
–Variability (standard deviation, range) of number of parts produced per day
–Average, maximum number of parts in the system (WIP)
–Average, maximum length of queue(s)
–Proportion of time a machine is down, up and busy, up and idle
You should ask the same questions of the model/code. Think ahead in modeling: asking for additional output performance measures can change how the simulation code is written, or even how the system is modeled.

Simple Queueing Model
Customers arrive, possibly wait in queue, receive service from the server, and depart.
We want to determine:
–Average number of customers in queue
–Proportion of time the server is busy
We may also want to determine:
–Average time customers spend in queue
Question: If we also need the average time in queue, how does this affect the way the model is coded, the data structures, etc.?

Types of Output Performance Measures
As the simulation progresses through time, there are typically two kinds of output processes observed.
Discrete-time process: There is a natural "first" observation, "second" observation, etc. – but you can only observe them when they "happen".
–Si = time in system of the ith part produced, i ∈ {1, 2, ...}
–Suppose there are M parts produced during the simulation

Typical Discrete-Time Output Performance Measures
Typical measures are the average, maximum, or minimum of the observations, e.g., the average time in system (S1 + S2 + ... + SM)/M. Other examples of discrete-time processes are:
–Di = delay of the ith customer in queue
–Yi = throughput (production) during the ith hour
–Bi = 1 if caller i gets a busy signal, 0 otherwise

Continuous-Time Process
You can jump into the system at any point in real, continuous time to take a "snapshot"; there is no natural "first" or "second" observation.
–Q(t) = number of parts in a particular queue at time t, t ∈ [0, T]
–B(t) = 1 or 0 depending on whether the server is busy or not at time t
If you run the simulation for T units of simulated time, you can observe time averages such as (1/T) times the integral of Q(t) over [0, T].
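Since Q(t) and B(t) are piecewise constant between events, these time averages reduce to accumulating value × holding time. A minimal sketch, with a made-up trajectory purely for illustration:

```python
def time_average(times, values, T):
    """Time average of a piecewise-constant process over [0, T].

    times[i] is when the process jumps to values[i]; times[0] == 0.
    Accumulates value * (duration held at that value), then divides by T.
    """
    area = 0.0
    for i, v in enumerate(values):
        start = times[i]
        end = times[i + 1] if i + 1 < len(times) else T
        area += v * (end - start)
    return area / T

# Hypothetical queue-length trajectory: Q(t) jumps at each event time.
times = [0.0, 1.0, 3.0, 4.0, 7.0]   # event times
queue = [0,   1,   2,   1,   0]     # Q(t) just after each event
print(time_average(times, queue, 10.0))  # (0*1 + 1*2 + 2*1 + 1*3 + 0*3)/10 = 0.7
```

Applied to the 0/1 trajectory of B(t), the same function gives the proportion of time the server is busy (the utilization).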

Typical Continuous-Time Output Performance Measures
Time averages, maxima, and proportions can be formed similarly for other continuous-time processes, such as:
–N(t) = number of parts in the shop at time t (WIP)
–D(t) = number of machines down at time t

Transient and Steady-State Behavior of a Stochastic Process
Suppose Y1, Y2, ... is a discrete-time stochastic output process.
Let Fi(y | I) = P(Yi ≤ y | I) be the (cumulative) transient distribution of the process at (discrete) time i.
–In general, Fi depends on both the time i and the initial condition I
–The corresponding transient density functions shift and change shape as i increases

Transient and Steady-State Behavior of a Stochastic Process
If there is a CDF F(y) such that Fi(y | I) → F(y) as i → ∞ for all y and for all initial conditions I, then F(y) is the steady-state distribution of the output process.
–F(y) may or may not exist
–F(y) must be independent of the initial conditions (same for all I)
Roughly speaking, if there is a time index k such that for i > k, Fi(y | I) ≈ F(y) in some sense, then we say the process is "in steady state" after time k.
–Even though the distribution of the Yi's after time k is not appreciably changing, observations on the Yi's could still have large variance and thus "bounce around" a lot
–Even in steady state, the Yi's are generally not independent, and could be heavily (auto)correlated

Transient and Steady-State Behavior of a Stochastic Process
The steady-state distribution does not depend on the initial conditions, but the nature and rate of convergence of the transient distributions can depend heavily on them.
Example 9.1: M/M/1 queue, E[delay in queue] as a function of the number of customers s present initially (λ = 1, ω = 10/9, ρ = 0.9).

Transient and Steady-State Behavior of a Stochastic Process
Example 9.3: Inventory system, E[cost in month i] when the initial inventory is 57.

Types of Simulations with Regard to Output Analysis
Terminating: Parameters to be estimated are defined relative to specific initial and stopping conditions that are part of the model.
–There is a "natural" and realistic way to model both the initial and stopping conditions
–Output performance measures generally depend on both the initial and stopping conditions

Types of Simulations with Regard to Output Analysis
Nonterminating: There is no natural and realistic event that terminates the model.
–Interested in "long-run" behavior characteristic of "normal" operation
–If the performance measure of interest is a characteristic of a steady-state distribution of the process, it is a steady-state parameter of the model: theoretically, this does not depend on initial conditions; practically, you must ensure that the run is long enough that initial-condition effects have dissipated
–Not all nonterminating systems are steady-state: there could be a periodic "cycle" in the long run, giving rise to steady-state cycle parameters

Types of Simulations with Regard to Output Analysis
Terminating vs. steady-state simulations: which is appropriate?
–Depends on the goals of the study
–Statistical analysis for terminating simulations is a lot easier
–Is steady-state relevant at all? It may be, as in these examples: 24-hours/day manufacturing, global telecommunications networks, computer networks

Some Examples
–Single-server queue: terminating estimand – expected average delay in queue of the first 25 customers, given empty-and-idle initial conditions; steady-state estimand – long-run expected delay in queue of a customer
–Manufacturing system: terminating – expected daily production, given some number of workpieces in process initially; steady-state – expected long-run daily production
–Reliability system: terminating – expected lifetime, or probability that it lasts at least a year, given all components initially new and working; steady-state – probably not sensible
–Battlefield model: terminating – probability that the attacking force loses half its strength before the defending force loses half its strength; steady-state – probably not sensible

Statistical Analysis for Terminating Simulations
Make n IID replications of a terminating simulation:
–Same initial conditions for each replication
–Same terminating event for each replication
–But separate random numbers for each replication
Let Xj be a summary measure of interest from the jth replication, like Xj = the average delay in queue of all customers in the jth replication. Then X1, X2, ..., Xn are IID random variables, and you can apply classical statistical analysis to them.
–Rely on the central limit theorem to justify the normality assumption, even though it's generally not true
So the basic strategy is replication of the whole simulation some number n of times. One simulation run is a sample of size one (not worth much statistically).

What About Classical Statistics?
Classical statistical methods don't work directly within a simulation run, due to the autocorrelation that is usually present.
Example: Delays in a queue of individual jobs: D1, D2, D3, ..., Dm. We want to estimate μ = E[average delay of the first m jobs]. The problem is that Corr(Di, Di+1) ≠ 0 in general. Unbiasedness of the classical variance estimator follows from independence of the data, which does not hold within a simulation run. Therefore, we cannot directly construct confidence intervals or test hypotheses about μ.

What About Classical Statistics?
The usual situation in simulation:
–There is positive autocorrelation, Corr(Di, Di+l) > 0, which causes the classical estimator of the variance of the sample mean to be biased low
Intuition:
–Di+1 is pretty much the same as Di
–The Di's are more stable than if they were independent
–Their sample variance is understated
Thus, you must take care to estimate variances properly.
–Do not have too much faith in your point estimates
–Do not believe your simulation results too much without proper statistical analysis
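The understatement is easy to demonstrate on a positively autocorrelated stand-in process (an AR(1) series here, an illustrative assumption rather than a model from the text): the classical S²/m formula badly underestimates the true variance of the within-run mean.

```python
import random
import statistics

def ar1_run(m, phi=0.9, seed=None):
    # Positively autocorrelated stand-in for within-run delays D_1..D_m
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(m):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out

m, reps = 1000, 200
means, naive_se2 = [], []
for r in range(reps):
    d = ar1_run(m, seed=r)
    means.append(statistics.fmean(d))        # the run's sample mean
    naive_se2.append(statistics.variance(d) / m)  # classical IID formula S^2/m

true_var = statistics.variance(means)        # actual variance of the run mean
print(true_var, statistics.fmean(naive_se2)) # naive estimate is far too small
```

For this process the naive S²/m value is roughly an order of magnitude below the true variance of the run mean, which is exactly why within-run confidence intervals are invalid.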

Estimating Means
Suppose you want to estimate some parameter μ of the process. Often (not always), μ = E[something]:
–μ = E[average delay in queue of m customers]
–μ = E[time-average WIP]
–μ = E[machine utilization]
–μ = P(average delay > 5 minutes) = E[I(5,∞)(average delay)]
You can get a point estimate, say 12.3, but how close is it to the true unknown value of μ? Customarily, a useful method for assessing the precision of an estimator is to determine a confidence interval for μ.
–Pick a confidence level 1 – α (typically 0.90, 0.95, etc.)
–Use the simulation output to construct an interval [A, B] that covers μ with probability 1 – α
Interpretation: 100(1 – α)% of the time, the interval will cover μ.

Estimating Means
The common approach to statistical analysis of simulation output:
–You can't do "classical" (IID, unbiased) statistics within a simulation run
–Make modifications to get back to classical statistics
In terminating simulations, this is conceptually easy:
–Make n independent replications of the whole simulation
–Let Xj be the performance measure from the jth replication: Xj = average of the delays in queue, Xj = time-average WIP, Xj = utilization of a bottleneck machine
–Then X1, X2, ..., Xn are IID and unbiased for μ = E[Xj]
Apply classical statistics to the Xj's, not to observations within a run.

Confidence Intervals
Important points:
–The "basic ingredients" for the statistical analysis are the performance measures from the different, independent replications
–One whole simulation run is only a "sample" of size one (not worth much)

Estimating Means and CI's
Example: Given n = 10 replications of the single-server queue:
–Xj = average delay in queue from the jth replication
–Xj's: 2.02, 0.73, 3.20, 6.23, 1.76, 0.47, 3.89, 5.45, 1.44, 1.23
–Want a 90% confidence interval, i.e., α = 0.10
–Sample mean = 2.64, S²(10) = 3.96, t9,0.95 = 1.833
–The approximate 90% confidence interval is 2.64 ± 1.15, or [1.49, 3.79]
Other examples in the text: inventory model, estimation of proportions.
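The arithmetic of this example can be checked directly from the ten Xj values; this sketch hardcodes the t quantile t9,0.95 = 1.833 rather than pulling in a statistics library:

```python
import math
import statistics

xs = [2.02, 0.73, 3.20, 6.23, 1.76, 0.47, 3.89, 5.45, 1.44, 1.23]
n = len(xs)
xbar = statistics.fmean(xs)        # sample mean, about 2.64
s2 = statistics.variance(xs)       # sample variance S^2(10), about 3.96
t = 1.833                          # t quantile, 9 df, 0.95 level (90% CI)
hw = t * math.sqrt(s2 / n)         # half-width, about 1.15
print(f"{xbar:.2f} +/- {hw:.2f}")  # -> 2.64 +/- 1.15
```

Note that `statistics.variance` uses the n − 1 denominator, matching the unbiased S²(n) of the slides.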

Why an "Approximate" 90% Confidence Interval?
The CI formula assumes the Xj's are normally distributed; this is never exactly true, but by the central limit theorem it does not matter much: as n (the number of replications) grows, the coverage probability → 1 – α.
Robustness study with this model:
–Know μ = 2.12 = d(25 | s = 0) from queueing theory
–Pick an n (like n = 10)
–Make 500 "90%" confidence intervals (total # of runs = 10 × 500)
–Count the % of these 500 intervals covering μ = 2.12
For increasing n, the estimated coverages were 86%, 89%, and 92% – close to the nominal 90%.

Why an "Approximate" 90% Confidence Interval?
The bad news is that actual coverages can depend (a lot) on the model. Reliability model:
–Components fail independently
–Times Ti to failure of the components ~ Weibull(α = 0.5, β = 1.0)
–Time to system failure T = min(T1, max(T2, T3))
–μ = E[time to system failure] = 0.78
For increasing n, the estimated coverages were 75%, 80%, and 84% – well below the nominal 90%.
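As a sanity check (not from the text), μ = 0.78 can be reproduced by straight Monte Carlo, sampling the Weibull(0.5, 1.0) lifetimes by inverse transform:

```python
import math
import random

def system_lifetime(rng):
    # Component lifetimes ~ Weibull(shape 0.5, scale 1) via inverse transform:
    # survival S(t) = exp(-sqrt(t))  =>  T = (-ln U)^2 for U ~ U(0, 1]
    t1, t2, t3 = ((-math.log(1.0 - rng.random())) ** 2 for _ in range(3))
    # System fails when component 1 fails, or both 2 and 3 have failed
    return min(t1, max(t2, t3))

rng = random.Random(42)
n = 200_000
est = sum(system_lifetime(rng) for _ in range(n)) / n
print(est)  # close to 0.78
```

The heavy right tail of the shape-0.5 Weibull is also the intuition for the poor CI coverage: the Xj's are strongly skewed, so the normal approximation is slow to take hold.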

Obtaining a Specified Precision
If the number n of replications is chosen too small, the confidence intervals might be too wide to be useful.
M/M/1 example:
–The 90% confidence interval from n = 10 replications is 2.64 ± 1.15
–The half-width (1.15) is 44% of the point estimate (2.64)
–Equivalently, 2.64 ± 44% is not very precise

Obtaining a Specified Precision
Two versions of what "small enough" might mean (more details in the text):
–Absolute precision: specify β > 0, choose n big enough so that the half-width δ(α, n) < β; requires at least some knowledge of the context to set a meaningful β
–Relative precision: specify γ (0 < γ < 1), choose n big enough so that the half-width is at most γ times the magnitude of the point estimate; you don't need to know much about the context to set a meaningful γ
Some notes:
–The above description leaves out a few technical details (see the textbook)
–This "fixes" the robustness issue: as β or γ → 0, the coverage probability → 1 – α
–It can be expensive for small β or γ: the required n grows roughly quadratically as β or γ decreases
–It may be difficult to automate with a simulation language, depending on the modeling constructs available, what automated statistical capabilities are present, and what access the user has to internal software variables
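A sequential absolute-precision procedure can be sketched as: keep adding replications until the half-width δ(α, n) drops below β. The one-replication model below (sum of two exponentials, true mean 2) and the small t-quantile table are illustrative assumptions, not from the text:

```python
import math
import random
import statistics

# t_{df, 0.95} for small df; beyond the table, fall back to the normal 1.645
T95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
       6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812}

def run_once(rng):
    # Hypothetical one-replication performance measure X_j (stand-in model)
    return rng.expovariate(1.0) + rng.expovariate(1.0)

def replicate_until(beta, n0=10, n_max=10_000, seed=1):
    """Add replications until the 90% CI half-width is below beta."""
    rng = random.Random(seed)
    xs = [run_once(rng) for _ in range(n0)]
    while True:
        n = len(xs)
        t = T95.get(n - 1, 1.645)
        hw = t * math.sqrt(statistics.variance(xs) / n)
        if hw < beta or n >= n_max:
            return statistics.fmean(xs), hw, n
        xs.append(run_once(rng))

xbar, hw, n = replicate_until(beta=0.1)
print(xbar, hw, n)  # mean near 2, half-width below 0.1
```

The quadratic cost is visible here: halving β roughly quadruples the final n, since the half-width shrinks like 1/sqrt(n).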

Estimating Other Measures of Performance
Sometimes we can miss important system/model characteristics if we look only at averages. Other measures may also be important, like proportions, variances, and quantiles.
Proportions example: compare two operating policies for a queueing system with five servers – five separate queues vs. a single queue – on performance measures such as average delay in queue, average number in queue(s), and number of delays ≥ 20 minutes.

Variances (or Standard Deviations) & Quantiles
Variances (or standard deviations): interested in process variability.
–Xj = daily production of good items
–Want to estimate σ² = Var(Xj)
–Make n replications, compute S²(n) as before
–Confidence interval on σ²: use the chi-square distribution
Quantiles: inventory system, Xj = maximum inventory level during the horizon.
–Want to determine a storage capacity that is sufficient with probability 0.98, i.e., find x such that P(Xj ≤ x) = 0.98
One approach (more details in the text):
–Make n replications, observe X1, X2, ..., Xn
–Sort the Xj's into increasing order
–Estimate x to be a value below which 98% of the Xj's fall
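The order-statistic approach can be sketched as follows; the Uniform(0, 100) stand-in data (whose true 0.98-quantile is 98) is only there to sanity-check the estimator:

```python
import math
import random

def empirical_quantile(xs, p):
    # Sort the replication results and take the ceil(p * n)-th smallest value
    ys = sorted(xs)
    k = max(1, math.ceil(p * len(ys)))
    return ys[k - 1]

# Stand-in data: pretend each replication's max inventory ~ Uniform(0, 100)
rng = random.Random(7)
xs = [rng.uniform(0, 100) for _ in range(10_000)]
print(empirical_quantile(xs, 0.98))  # close to the true quantile, 98
```

Tail quantiles need many replications to estimate well: with p = 0.98, only about 2% of the Xj's land above the target, so the estimate is driven by a handful of extreme observations.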

Choosing Initial Conditions
For terminating simulations, the initial conditions can affect the output performance measure, so the simulation should be initialized appropriately.
Example: Suppose you want to estimate the expected average delay in queue of bank customers who arrive and complete their delay between noon and 1:00pm. The bank is likely to be crowded already at noon, so starting empty and idle at noon will probably bias the results low.
Two possible remedies:
–If the bank actually opens at 9:00am, start the simulation empty and idle, let it run for 3 simulated hours, clear the statistical accumulators, and observe statistics for the next simulated hour
–Take data in the field on the number of customers present at noon, fit a (discrete) distribution to it, and draw from this distribution to initialize the simulation at time 0 = noon; draw independently from this distribution to initialize each replication
–Note that this could be difficult in simulation software, depending on the modeling constructs available
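The second remedy (drawing the time-0 state from field data) might look like this sketch, which resamples the raw observations rather than fitting a parametric distribution; the noon counts are hypothetical:

```python
import random

# Hypothetical field data: number of customers present at noon on 20 days
noon_counts = [8, 12, 10, 9, 14, 11, 10, 13, 9, 12,
               10, 11, 15, 8, 12, 10, 9, 13, 11, 10]

def draw_initial_customers(rng):
    # Draw from the empirical distribution of the field data; each
    # replication gets an independent draw for its time-0 state
    return rng.choice(noon_counts)

rng = random.Random(0)
initial_states = [draw_initial_customers(rng) for _ in range(5)]
print(initial_states)  # one initial customer count per replication
```

Resampling the raw data keeps the empirical distribution exactly; fitting a discrete distribution (as the slide suggests) smooths it and can extrapolate beyond the observed range.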

Statistical Analysis for Steady-State Parameters
This is a much more difficult problem than the analysis for terminating simulations. You may want to estimate steady-state means of discrete- and continuous-time processes.
Basic design problem: (1) many short runs vs. (2) one long run.
–Many short runs – good: simple (like terminating), get IID data; bad: point-estimator bias (initial transient)
–Single long run – good: less point-estimator bias, no restarts; bad: a "sample" of size 1, hard to get a variance estimate

The Problem of the Initial Transient
If steady-state is the goal, the initial conditions will generally bias the results of the simulation for some initial period of time. The most common technique is to warm up the model, also called initial-data deletion:
–Identify an index l (for discrete-time processes) or a time t0 (for continuous-time processes) beyond which the output appears not to be drifting any more
–Clear the statistical accumulators at that time
–Start data collection over, and "count" only data observed past that point
After the warmup period, observations will still have variance, so they will bounce around; they are just not "trending" any more. There is a facility for doing this in most simulation software (but the user must specify the warmup period).

The Problem of the Initial Transient
The challenge is to identify a warmup period that is long enough, yet not so long as to be excessively wasteful of data (see the text for details and examples).
–Some statistical tests have been devised
–The most practical (and widely used) method is to make plots, perhaps averaged across and within replications to dampen the noise, and "eyeball" a cutoff
–If there are multiple output processes, and they disagree about the appropriate warmup period, a decision must be made whether to use a different warmup for each process or the same one for all (which would have to be the maximum of the individual warmups)
A different approach is to try to find "smarter" initialization states or distributions that are "closer" to steady state than something like "empty and idle".
–There has been some research on how to find such initial states/distributions
–"Priming" the model initially with entities may be tricky in simulation software
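The plot-and-eyeball method (averaging observation i across replications, then smoothing with a moving window, in the spirit of Welch's procedure) can be sketched as below; the toy output process, which drifts up to a steady level near 10, is an illustrative assumption:

```python
import random
import statistics

def welch_averages(runs, window):
    """Average observation i across replications, then smooth with a
    centered moving window of half-width `window`."""
    m = min(len(r) for r in runs)
    ybar = [statistics.fmean(r[i] for r in runs) for i in range(m)]
    smoothed = []
    for i in range(m):
        lo, hi = max(0, i - window), min(m, i + window + 1)
        smoothed.append(statistics.fmean(ybar[lo:hi]))
    return smoothed

# Toy output: each run starts low and drifts up to a steady level near 10
rng = random.Random(3)
runs = [[10 * (1 - 0.9 ** i) + rng.gauss(0, 0.5) for i in range(200)]
        for _ in range(10)]
curve = welch_averages(runs, window=5)
print(curve[0], curve[-1])  # low start, near 10 by the end of the run
```

In practice you would plot `curve` and eyeball the index where it stops trending; that index is the warmup period l to hand to the simulation software.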