Automating the Analysis of Simulation Output Data Stewart Robinson, Katy Hoad, Ruth Davies ORGS Meeting, 4th October 2007


Automating the Analysis of Simulation Output Data Stewart Robinson, Katy Hoad, Ruth Davies ORGS Meeting, 4th October 2007

The Problem Prevalence of simulation software: 'easy-to-develop' models and use by non-experts. Simulation software generally has very limited facilities for directing or advising on simulation experiments; the main exception is directing scenario selection through 'optimisers'. Lacking the necessary skills and support, it is highly likely that simulation users are using their models poorly.

The Problem Despite continued theoretical developments in simulation output analysis, little is being put into practical use. Three factors seem to inhibit the adoption of output analysis methods: limited testing of the methods; the requirement for detailed statistical knowledge; and the fact that methods are generally not implemented in simulation software (AutoMod/AutoStat is an exception). A solution would be to provide an automated output 'Analyser'.

An Automated Output Analyser [Diagram: output data flow from the simulation model into the Analyser, which performs warm-up, run-length and replications analyses, decides whether to use replications or a long run, obtains more output data where needed, and returns a recommendation where possible.] For this project the Analyser looks at: warm-up, run-length and number of replications. Scenario analysis could be added.

The AutoSimOA Project A 3-year, EPSRC-funded project in collaboration with SIMUL8 Corporation. Objectives: to determine the most appropriate methods for automating simulation output analysis; to determine the effectiveness of the analysis methods; to revise the methods where necessary in order to improve their effectiveness and capacity for automation; and to propose a procedure for automated output analysis of warm-up, replications and run-length. Only the analysis of a single scenario is considered.

The AutoSimOA Project WORK CARRIED OUT: 1. Literature review of warm-up, replications and run-length methods. 2. Creation of a representative and sufficient set of models / data output for testing chosen simulation output analysis methods. 3. Development of an automated algorithm for estimating the number of replications to run. 4. Selection and testing of warm-up methods from the literature.

Part 1. Creation of models and data sets

AIMS: Provide a representative and sufficient set of models / data output for use in discrete event simulation research. Use the models / data sets to test the chosen simulation output analysis methods in the AutoSimOA Project.

[Diagram: classification of models into groups (Group A, Group B, … Group N) by output characteristics: auto-correlation, normality, cycling/seasonality, terminating vs. non-terminating, steady state, in/out of control, transient.]

Model characteristics: deterministic or random; significant pre-determined model changes (by time); dynamic internal changes, i.e. 'feedback'; empty-to-empty pattern. Output data characteristics: initial transient (warm-up); out-of-control trend (ρ ≥ 1); cycle; auto-correlation; statistical distribution.

Artificial Data: Construct data that resemble real model output, with known values for some specific attribute. Example: known mean and variance. Example data: AR(1) with N(0,1) errors. Real Models: Collect a range of models created in 'real circumstances'. Examples: Swimming pool complex: average number in system. Production line manufacturing plant: throughput per hour. Fast food store: average queuing time.
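As an illustration, a minimal sketch of how such an AR(1) series with N(0,1) errors might be generated; the coefficient, mean and series length are illustrative choices, not the project's actual settings.

```python
import numpy as np

def ar1_series(n=2000, phi=0.5, mu=5.0, seed=1):
    """Generate an AR(1) series with N(0,1) errors around a known mean.

    X_t = mu + phi * (X_{t-1} - mu) + e_t,  e_t ~ N(0, 1).
    The true mean (mu) and variance (1 / (1 - phi**2)) are known,
    which is the point of using artificial data for method testing.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    # Draw the first value from the stationary distribution.
    x[0] = mu + rng.standard_normal() / np.sqrt(1 - phi**2)
    for t in range(1, n):
        x[t] = mu + phi * (x[t - 1] - mu) + rng.standard_normal()
    return x
```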

Part 2. Automating analysis of number of replications

Introduction Initial setup: any warm-up problems already dealt with; run length (m) decided upon; the modeller has decided to use multiple replications to obtain a better estimate of mean performance. Multiple replications are performed by changing the random number streams used by the model and re-running the simulation. [Diagram: output data from the model; the response measure of interest X_i is the summary statistic from replication i, for replications 1 to N.]

QUESTION IS… How many replications are needed? Limiting factors: computing time and expense. If performing N replications achieves a sufficient estimate of mean performance: more than N replications is an unnecessary use of computer time and money; fewer than N replications gives inaccurate results, and hence incorrect decisions.

4 main methods found in the literature for choosing N: 1. Rule of Thumb Run at least 3 to 5 replications. Advantage: very simple. Disadvantages: does not use the characteristics of the model output; no measured precision level.

2. Simple Graphical Method Plot the cumulative mean versus the number of replications and visually select the point where the cumulative mean line becomes 'flat'. Use this as N. Advantages: simple; uses the output of interest in the decision. Disadvantages: subjective; no measured precision level.
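A minimal sketch of this graphical method, assuming the per-replication summary statistics have already been collected in a sequence; matplotlib is used for the plot.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_cumulative_mean(results):
    """Plot the cumulative mean against the number of replications.

    The user visually picks the replication count at which the
    line becomes 'flat' and uses that as N.
    """
    results = np.asarray(results, dtype=float)
    reps = np.arange(1, len(results) + 1)
    cum_mean = np.cumsum(results) / reps
    plt.plot(reps, cum_mean)
    plt.xlabel("Number of replications")
    plt.ylabel("Cumulative mean of output")
    plt.show()
```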

3. Confidence Interval Method The user decides the size of error they can tolerate. Run increasing numbers of replications, constructing confidence intervals around the sequential cumulative mean of the output variable, until the desired precision is achieved. Advantages: relies upon statistical inference to determine the number of replications required; allows the user to tailor the accuracy of output results to their particular requirement or purpose for that model and result. Disadvantage: many simulation users do not have the skills to apply such an approach.

4. Prediction Formula Decide the size of error ε that can be tolerated. Run ≥ 2 replications and estimate the variance s². Solve $t_{n-1,1-\alpha/2}\, s / \sqrt{N} \le \varepsilon$ to predict N. Check whether the desired precision is achieved; if not, recalculate N with the new estimate of the variance. Advantages: simple; uses the output of interest in the decision; provides a specified precision. Disadvantage: can be very inaccurate, especially for a small number of replications. If the variance estimate is low, N is underestimated; if the variance estimate is high, N is overestimated.
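A sketch of the textbook form of this prediction formula, solving t·s/√N ≤ ε for N from a pilot set of replications; the function and variable names are illustrative.

```python
import math
from scipy import stats

def predict_replications(pilot, epsilon, alpha=0.05):
    """Predict N from a pilot set of >= 2 replications.

    Solves  t_{n-1, 1-alpha/2} * s / sqrt(N) <= epsilon  for N,
    using the standard deviation estimated from the pilot runs.
    Accuracy depends entirely on that variance estimate.
    """
    n = len(pilot)
    s = stats.tstd(pilot)                       # sample standard deviation
    t = stats.t.ppf(1 - alpha / 2, df=n - 1)    # Student t critical value
    return math.ceil((t * s / epsilon) ** 2)
```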

AUTOMATE the Confidence Interval Method: the algorithm interacts with the simulation model sequentially.

ALGORITHM DEFINITIONS Let n be the current number of replications carried out, with results $X_i$ ($i = 1$ to $n$). $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the cumulative mean and $s_n$ is the estimate of the standard deviation, both calculated from the n current replications; $t_{n-1,\alpha/2}$ is the Student t value for $n-1$ degrees of freedom and a significance of $1-\alpha$. We define the precision, $d_n$, as the half-width of the confidence interval expressed as a percentage of the cumulative mean:
$$d_n = \frac{100\, t_{n-1,\alpha/2}\, s_n / \sqrt{n}}{\bar{X}_n}$$
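These definitions translate directly into code; a minimal sketch, assuming the replication results are held in a sequence and the cumulative mean is non-zero.

```python
import numpy as np
from scipy import stats

def precision(results, alpha=0.05):
    """Half-width of the (1 - alpha) CI as a percentage of the cumulative mean.

    d_n = 100 * t_{n-1, alpha/2} * s_n / (sqrt(n) * Xbar_n)
    """
    x = np.asarray(results, dtype=float)
    n = len(x)
    xbar = x.mean()
    s_n = x.std(ddof=1)                          # sample standard deviation
    t = stats.t.ppf(1 - alpha / 2, df=n - 1)     # Student t critical value
    return 100.0 * t * s_n / (np.sqrt(n) * xbar)
```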

Stopping Criteria Simplest method: stop when $d_n$ is first found to be ≤ the desired precision, $d_{required}$, and recommend that number of replications, Nsol, to the user. Problem: the data series could prematurely converge, by chance, to an incorrect estimate of the mean with precision $d_{required}$, then diverge again. 'Look-ahead' procedure: when $d_n$ is first found to be ≤ $d_{required}$, the algorithm performs a set number of extra replications to check that the precision remains ≤ $d_{required}$.

'Look-ahead' procedure kLimit = the user-defined 'look ahead' value. The actual number of replications checked ahead, f(kLimit), is a function of this value that relates the length of the 'look ahead' period to the current value of n.
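A sketch of the sequential algorithm with the 'look-ahead' check. The form of f(kLimit) shown (kLimit for n ≤ 100, scaled by n/100 beyond that) is an assumption based on the published AutoSimOA replication algorithm, and run_replication is a hypothetical stand-in for running the model with a fresh random number stream.

```python
import math
import numpy as np
from scipy import stats

def f_klimit(klimit, n):
    # Look-ahead length grows with n; this form is an assumption based
    # on the published algorithm (kLimit for n <= 100, scaled beyond).
    return klimit if n <= 100 else math.ceil(klimit * n / 100)

def precision(x, alpha=0.05):
    # d_n: CI half-width as a percentage of the cumulative mean.
    n = len(x)
    t = stats.t.ppf(1 - alpha / 2, df=n - 1)
    return 100.0 * t * np.std(x, ddof=1) / (math.sqrt(n) * np.mean(x))

def replication_algorithm(run_replication, d_required=5.0, klimit=5, n0=3):
    """Add replications until d_n <= d_required holds at Nsol and
    throughout the look-ahead window."""
    results = [run_replication(i) for i in range(n0)]  # hypothetical model call
    n = n0
    while True:
        if precision(results) <= d_required:
            nsol = n
            stable = True
            for _ in range(f_klimit(klimit, nsol)):
                results.append(run_replication(len(results)))
                n += 1
                if precision(results) > d_required:
                    stable = False   # premature convergence; keep running
                    break
            if stable:
                return nsol, results
        else:
            results.append(run_replication(len(results)))
            n += 1
```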

[Plot: cumulative mean with 95% confidence limits against replication number. Precision first reaches ≤ 5% at Nsol; the algorithm then checks replications Nsol to Nsol + f(kLimit).]

[Plot: a case of premature convergence. Precision falls to ≤ 5% at Nsol₁, rises above 5% again, and finally settles at ≤ 5% from Nsol₂ through Nsol₂ + f(kLimit).]

TESTING METHODOLOGY 24 artificial data sets created: left-skewed, symmetric and right-skewed, with varying values of relative standard deviation (st. dev. / mean). Advantage: the true mean and variance are known. Each artificial data set: 100 sequences of 2000 data values. 8 real models selected. Different lengths of 'look ahead' period examined: kLimit values = 0 (i.e. no 'look ahead' period), 5, 10, 25. The d_required value was kept constant at 5%.

5 performance measures: 1. Coverage of the true mean. 2. Bias. 3. Absolute bias. 4. Average Nsol value. 5. Comparison of 4 with the theoretical Nsol value. For the real models, the 'true' mean and variance values were estimated from the whole sets of output data (3000 to … data points).

Results Nsol values for individual algorithm runs are highly variable. Average Nsol values over 100 runs per model are close to the theoretical values of Nsol. The normality assumption appears robust. Using a 'look ahead' period improves the performance of the algorithm.

Performance with and without a 'look ahead' period:

                                     Mean bias signif.    Failed in coverage   Mean est. Nsol signif. different
                                     different to zero    of true mean         to theoretical Nsol (>3)
No 'look-ahead' period
  Proportion of artificial models    4/24                 2/24                 9/18
  Proportion of real models          1/8                  1/8                  3/5
kLimit = 5
  Proportion of artificial models    1/24                 0                    1/18
  Proportion of real models          0                    0                    0

Impact of different look-ahead periods on the performance of the algorithm: % decrease in absolute mean bias:

                     kLimit = 0 to 5   kLimit = 5 to 10   kLimit = 10 to 25
Artificial models    8.76%             0.07%              0.26%
Real models          10.45%            0.14%              0.33%

Number of times the Nsol value changes (out of 100 runs of the algorithm per model) because of the lengthening of the 'look ahead' period:

Model ID   kLimit = 0 to 5   kLimit = 5 to 10   kLimit = 10 to 25
R1         0                 0                  0
R3         2                 0                  0
R5         24                0                  1
R8         24                4                  1
A5         30                1                  3
A6         26                6                  3
A15        1                 0                  0
A…         …                 …                  …
A…         …                 …                  …
A24        37                0                  0

Examples of changes in Nsol and improvement in the estimate of the true mean:

Model ID   kLimit   Nsol   Theoretical Nsol (approx.)   Mean estimate significantly different to the true mean?
A9         0        4      112                          Yes
A9         5        120    112                          No
A…         0        …      …                            Yes
A…         5        718    …                            No
R7         0        3      10                           Yes
R7         5        8      10                           No
R4         0        3      6                            Yes
R4         5        7      6                            No
R8         0        3      45                           Yes
R8         5        46     45                           No

INCORPORATING A FAIL SAFE INTO THE ALGORITHM Problem: if the model runs 'slowly', the algorithm could take an unacceptable amount of time to reach the set precision. "Fail safe": warn the user when a model may require a 'long time' to reach $d_{required}$. At each iteration of the algorithm, estimate Nsol by rearranging the precision formula for n:
$$Nsol^{*} = \left\lceil \left( \frac{100\, t_{n-1,\alpha/2}\, s_n}{d_{required}\, \bar{X}_n} \right)^{2} \right\rceil$$
This is only as accurate as the current estimates of the standard deviation and mean, and can be very inaccurate for small n.
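A minimal sketch of this fail-safe estimate, rearranging the precision formula above for n; the function name is illustrative.

```python
import math
import numpy as np
from scipy import stats

def estimate_nsol(results, d_required=5.0, alpha=0.05):
    """Fail-safe estimate Nsol* from the current replications.

    Rearranges d_n <= d_required for n, holding the current estimates
    of the mean and standard deviation fixed:
        Nsol* = ceil((100 * t * s_n / (d_required * Xbar_n))**2)
    Only as accurate as those estimates, so it can be far off for small n.
    """
    x = np.asarray(results, dtype=float)
    n = len(x)
    t = stats.t.ppf(1 - alpha / 2, df=n - 1)
    return math.ceil((100.0 * t * x.std(ddof=1) / (d_required * x.mean())) ** 2)
```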

Aid to the user: report the approximate time to algorithm termination. User judgment: let the algorithm progress naturally or terminate it prematurely. [Plot: a range of typical behaviour of Nsol* values as replications accumulate.]

Extended Algorithm Proposal: the cumulative mean line should be reasonably 'flat'. Extra stability criteria were added into the algorithm: the algorithm 'draws' two parallel lines, Inner Precision Limits (IPLs), around the cumulative mean line; the IPLs are defined as a percentage of the d_required value; the stability criteria are violated if the cumulative mean crosses either IPL within the 'look ahead' period. Tested on the real and artificial models.

Stability Criteria Results The stability criteria cause the final Nsol recommendation to be associated with a much smaller precision than the user requested, but do not significantly reduce bias. They make the algorithm unnecessarily complicated and can confuse the user. Equivalent results are produced by simply setting a smaller precision (d_required), which is much more easily understood by the user. Hence the extra stability criteria were dropped from the replication algorithm.

Replication Work Discussion kLimit default value set to 5. Initial number of replications set to 3. 'Fail safe': an aid for the user in deciding whether to end the algorithm prematurely. Stability criteria did not significantly enhance algorithm performance, so they were dropped. Multiple response variables: run the algorithm with each response and use the maximum estimated value for Nsol. Different scenarios: it is advisable to repeat the algorithm every few scenarios to check that the precision has not degraded significantly. Inclusion into the SIMUL8 package: full explanations of the algorithm and results.

Summary Of Replications Work Selection and automation of the Confidence Interval Method for estimating the number of replications to be run in a simulation. The algorithm created, with its 'look ahead' period, is efficient and performs well on a wide selection of artificial and real model output. It is a 'black box': fully automated, requiring no user intervention.

Part 3. Work in progress: Automating estimation of warm-up length

The Initial Bias Problem The model may not start in a "typical" state. This may cause initial bias in the output. Many methods have been proposed for dealing with initial bias, e.g. setting initial steady-state conditions, or running the model for a 'long' time. This project uses deletion of the initial transient data by specifying a warm-up period.

Question is: How do you estimate the length of the warm-up period required?

5 main types of methods: 1. Graphical Methods. 2. Heuristic Approaches. 3. Statistical Methods. 4. Initialisation Bias Tests. 5. Hybrid Methods.

Literature search: 42 methods found. A summary of the methods and literature references is available on the project web site.

Creation of artificial data sets with initial bias. Aim: controllable and comparable data for testing warm-up methods. [Diagram: an initial-bias function is created and superposed on a created steady-state function.]

1. Artificial Initial Bias Functions: three criteria. i) LENGTH OF BIAS FUNCTION n = total data length. Truncation point = L = initial bias proportion (%) × n / 100. The initial bias proportion is set to 10%, 20% or 40% of the total data size, n.

ii) SEVERITY OF BIAS FUNCTION Set the maximum value of the bias function a(t) so that $\max_{t \le L} |a(t)| = M \times Q$, where Q is the difference between the steady-state mean and the 1st (if the bias function is positive) or 99th (if negative) percentile of the steady-state data, and M is the relative maximum bias, user-set to 1, 2 or 5. M ≥ 1 → bias significantly separate from the steady-state data → easier to detect. M ≤ 1 → bias absorbed into the steady-state data variance → harder to detect.

iii) SHAPE OF BIAS FUNCTION Five shapes were used: mean shift, linear, quadratic, exponential, and oscillating (decreasing).

2. Artificial Steady State Functions i) Constant steady-state variance. ii) Error terms: Normal or Exponential distribution, using the L'Ecuyer RNG. iii) Auto-correlation: no autocorrelation; AR(1); AR(2); AR(4); MA(2); ARMA(5,5). iv) Superposition: the bias function is added onto the steady-state function.
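A sketch of how a bias function might be superposed on an AR(1) steady-state series, following the three criteria above; the exponential shape and its decay rate are illustrative, since the slides' exact bias formulas are not reproduced here.

```python
import numpy as np

def biased_series(n=1000, bias_prop=10.0, M=2.0, phi=0.5, seed=1):
    """Superpose a decaying positive bias function on an AR(1) series.

    L (truncation point) = bias_prop% of n; the bias peak is scaled to
    M * Q, with Q taken from the steady-state data as described above.
    The exponential shape is illustrative; the project tested several.
    """
    rng = np.random.default_rng(seed)
    # Steady-state AR(1) series with N(0,1) errors, mean zero.
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    L = int(bias_prop * n / 100)                # length criterion
    q = np.mean(x) - np.percentile(x, 1)        # Q for a positive bias fn
    t_idx = np.arange(L)
    a = M * q * np.exp(-5.0 * t_idx / L)        # severity: max a(t) = M*Q
    x[:L] += a                                  # superposition
    return x
```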

PROJECT OVERVIEW Created a set of artificial and "real" model data, including warm-up bias functions. Created the replication algorithm. Currently testing warm-up methods.

ACKNOWLEDGMENTS This work is part of the Automating Simulation Output Analysis (AutoSimOA) project, funded by the UK Engineering and Physical Sciences Research Council (EPSRC, grant EP/D033640/1). The work is being carried out in collaboration with SIMUL8 Corporation, who are also providing sponsorship for the project. Stewart Robinson, Katy Hoad, Ruth Davies ORGS Meeting, 4th October 2007