Confidence Interval Estimation in System Dynamics Models


Confidence Interval Estimation in System Dynamics Models Gokhan Dogan* MIT Sloan School of Management System Dynamics Group *Special thanks to John Sterman for his support

Motivation: calibration can be done manually or automatically; automated calibration is supported by software packages such as Vensim and Powersim.

Motivation: once model parameters have been estimated with automated calibration, the next step is to estimate confidence intervals. Questions:
- Are tools available in the software packages?
- Do these methods have any limitations?
- Are there alternative methods?

Why are confidence intervals important? If the 95% confidence interval around the parameter estimate θ does not contain 0, we reject the claim that the parameter value is equal to 0 (at the 95% confidence level). If the 95% confidence interval does contain 0, we cannot reject the claim that the parameter value is equal to 0.

How can we estimate confidence intervals? Two methods are considered: the Likelihood Ratio Method, which is used in system dynamics software (Vensim) and in the literature, and Bootstrapping, the method we suggest for system dynamics models. Both methods yield approximate confidence intervals.

Likelihood Ratio Method The likelihood ratio method is used in system dynamics software packages (Vensim) and in the literature (Oliva and Sterman, 2001). It relies on asymptotic theory (a large-sample assumption).
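To make the mechanics concrete, here is a minimal sketch of a likelihood-ratio 95% confidence interval for a toy one-parameter model with known error variance; the model, data, sample size, and grid search are illustrative assumptions, not the Vensim implementation.

```python
# Sketch: likelihood-ratio 95% CI for theta in a toy model y = theta * x,
# assuming i.i.d. normal errors with known sigma (all inputs are made up).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 48)                     # 48 points, like one beer-game series
sigma = 1.0
y = 0.95 * x + rng.normal(0.0, sigma, x.size)      # fabricated data, "true" theta = 0.95

grid = np.linspace(0.5, 1.5, 2001)
sse = np.array([np.sum((y - t * x) ** 2) for t in grid])
theta_hat = grid[sse.argmin()]

# With known sigma, 2*(logL(theta_hat) - logL(theta)) = (SSE(theta) - SSE(theta_hat)) / sigma^2,
# which is compared against the chi-square(1) 95% quantile (about 3.84).
lr_stat = (sse - sse.min()) / sigma**2
inside = grid[lr_stat <= chi2.ppf(0.95, df=1)]
print(f"theta_hat = {theta_hat:.3f}, LR 95% CI = [{inside.min():.3f}, {inside.max():.3f}]")
```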

However, the Likelihood Ratio Method (as it is used in software packages) assumes:
- a large sample, whereas in system dynamics models it is not always possible to have a large sample;
- no feedback (no autocorrelation), whereas system dynamics models contain many feedback loops;
- normally distributed error terms, whereas error terms are not always normally distributed.

Bootstrapping Introduced by Efron (1979) and based on resampling; an extensive survey is given in Li and Maddala (1996). It seems more appropriate for system dynamics models because:
- it doesn't require a large sample;
- it is applicable when there is feedback (autocorrelation);
- it doesn't assume normally distributed error terms.

Drawbacks of bootstrapping: the software packages do not implement it, and it is time-consuming.

Bootstrapping, step by step: first fit the model and estimate the parameters, then compute the error terms (see the sketch below).
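A minimal sketch of these two steps, assuming a toy stand-in for the simulation model and a scipy least-squares calibration; the model, parameter names, and data are illustrative, not the actual beer-game estimation.

```python
# Steps 1-2: calibrate a (toy) model to data, then compute the error terms as residuals.
import numpy as np
from scipy.optimize import least_squares

def simulate(params, x):
    """Toy stand-in for a full system dynamics simulation run."""
    theta, alpha = params
    return theta * x + alpha

def calibrate(y, x, start=(0.5, 0.0)):
    """Least-squares fit of the toy model; returns estimates and residuals."""
    fit = least_squares(lambda p: y - simulate(p, x), x0=start)
    residuals = y - simulate(fit.x, x)   # the estimated error terms
    return fit.x, residuals

rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 48)
y = simulate((0.95, 2.0), x) + rng.normal(0.0, 1.0, x.size)   # fabricated "data"

params_hat, residuals = calibrate(y, x)
print("parameter estimates:", params_hat)
```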

Bootstrapping uses resampling of the error terms. Nonparametric: reshuffle them and generate many new error-term sets, using the autocorrelation information. Parametric: fit a distribution and generate many new error-term sets, using the autocorrelation and distribution information.
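A minimal sketch of the two flavours, operating on an array of estimated residuals. The residuals here are placeholder draws, the normal fit in the parametric branch is just one possible distributional assumption, and the plain reshuffle ignores autocorrelation; a block-resampling variant that preserves it is sketched after the next slide.

```python
# Sketch: generate many new error-term sets from a vector of estimated residuals.
import numpy as np

rng = np.random.default_rng(2)
residuals = rng.normal(0.0, 1.0, 48)    # placeholder for the estimated error terms
n_sets = 500

# Nonparametric: reshuffle, i.e. resample the observed residuals with replacement.
nonparametric_sets = rng.choice(residuals, size=(n_sets, residuals.size), replace=True)

# Parametric: fit a distribution (here a normal) and draw fresh residuals from it.
mu, sd = residuals.mean(), residuals.std(ddof=1)
parametric_sets = rng.normal(mu, sd, size=(n_sets, residuals.size))

print(nonparametric_sets.shape, parametric_sets.shape)   # (500, 48) each
```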

Resampling the Error Terms If we know that
- the error terms are autocorrelated,
- their variance is not constant (heteroskedasticity), or
- they are not normally distributed,
then we can use this information while resampling the error terms. The flexibility of bootstrapping stems from this stage.
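One common way to carry autocorrelation information into the resampling step is a moving-block bootstrap. The sketch below is an illustrative example of that general idea, with an arbitrary block length, and is not claimed to be the exact scheme used in this study.

```python
# Sketch: moving-block resampling of residuals, preserving short-range autocorrelation
# within each block. The block length (6) is an arbitrary illustrative choice.
import numpy as np

def moving_block_resample(residuals, block_len, rng):
    n = residuals.size
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)   # random block start points
    blocks = [residuals[s:s + block_len] for s in starts]
    return np.concatenate(blocks)[:n]                            # trim to the original length

rng = np.random.default_rng(3)
residuals = rng.normal(0.0, 1.0, 48)     # placeholder for the estimated error terms
new_errors = moving_block_resample(residuals, block_len=6, rng=rng)
print(new_errors.shape)                  # (48,)
```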

[Figure: each set of fabricated error terms is added to the fitted model output to produce a fabricated "historical" data series.]

[Figure: the model is fitted to each fabricated "historical" data series and its parameters are re-estimated, yielding 500 parameter estimates.]
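An end-to-end sketch of this refitting loop, reusing the kind of toy simulate/calibrate helpers sketched earlier; everything here (model, data, fitting routine) is an illustrative assumption, whereas a real run would re-calibrate the full system dynamics model 500 times.

```python
# Sketch: 500 bootstrap refits. Fabricated "historical" data = fitted trajectory
# + resampled error terms; each fabricated series is re-calibrated.
import numpy as np
from scipy.optimize import least_squares

def simulate(params, x):
    theta, alpha = params
    return theta * x + alpha

def calibrate(y, x, start=(0.5, 0.0)):
    return least_squares(lambda p: y - simulate(p, x), x0=start).x

rng = np.random.default_rng(4)
x = np.linspace(1.0, 10.0, 48)
y = simulate((0.95, 2.0), x) + rng.normal(0.0, 1.0, x.size)   # pretend this is the data

params_hat = calibrate(y, x)
fitted = simulate(params_hat, x)
residuals = y - fitted

estimates = []
for _ in range(500):
    new_errors = rng.choice(residuals, size=residuals.size, replace=True)
    y_fab = fitted + new_errors              # fabricated "historical" data
    estimates.append(calibrate(y_fab, x))
estimates = np.array(estimates)              # shape (500, 2): one parameter vector per refit
print(estimates.shape)
```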

Distribution of a model parameter
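One common way to turn this distribution into an interval is the percentile method; a minimal sketch, where the estimates array stands in for 500 bootstrap estimates of a single parameter.

```python
# Sketch: percentile 95% CI from bootstrap estimates of one parameter.
import numpy as np

estimates = np.random.default_rng(5).normal(0.95, 0.1, 500)   # placeholder bootstrap estimates
lower, upper = np.percentile(estimates, [2.5, 97.5])
print(f"bootstrap 95% CI: [{lower:.3f}, {upper:.3f}]")
```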

Experiments We had experimental time series data from 240 subjects, all of them beer game players. For each subject we had 48 data points, so parameters and confidence intervals were estimated from 48 observations per subject.

Model (same as Sterman 1989):
O_t = Max[0, θ·LR_t + (1 − θ)·EL_t + α·(S′ − S_t − β·SL_t) + ε_t]
where ε_t is the error term. Parameters to be estimated: θ, α, β, S′.
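A minimal sketch of this decision rule as code; the argument names follow the slide's notation and all numerical values in the example call are made up.

```python
# Sketch of the ordering rule O_t = Max[0, theta*LR_t + (1-theta)*EL_t
# + alpha*(S' - S_t - beta*SL_t) + error_t]. All inputs below are illustrative.
def order(theta, alpha, beta, s_prime, LR_t, EL_t, S_t, SL_t, error_t=0.0):
    forecast_term = theta * LR_t + (1.0 - theta) * EL_t        # theta-weighted combination
    stock_adjustment = alpha * (s_prime - S_t - beta * SL_t)   # correction toward S'
    return max(0.0, forecast_term + stock_adjustment + error_t)

# Example call with made-up values (theta = 0.95 and beta = 0.01 echo the later slides).
print(order(0.95, 0.3, 0.01, 15.0, LR_t=4.0, EL_t=4.0, S_t=12.0, SL_t=8.0))
```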

Individual Results: 95% confidence intervals for θ for one subject with estimate θ = 0.95. Likelihood Ratio Method 95% CI: [0.77, 1]. Bootstrapping 95% CI: [0.01, 1].

Individual Results: 95% confidence intervals for β (estimate β = 0.01, shown on a 0 to 0.2 scale). The Likelihood Ratio Method 95% CI is so tight that β appears significantly different from 0, while the bootstrapping 95% CI is much wider and does not support that conclusion.

Overall Results

Average 95% Confidence Interval Length
                           Theta    Alpha    Beta     S-Prime
Likelihood Ratio Method    0.19     0.11     -        13.20
Bootstrapping              0.67     0.30     0.52     973.59

Median of 95% Confidence Interval Length
                           Theta    Alpha    Beta     S-Prime
Likelihood Ratio Method    0.10     0.08     0.06     2.32
Bootstrapping              0.84     0.24     0.48     10.10

Overall Results: percentage of subjects for whom the bootstrapping confidence interval is wider than the likelihood ratio method confidence interval:
                                                        Theta     Alpha     Beta    S-Prime
Bootstrapping CI wider than Likelihood Ratio Method CI  97.76%    98.81%    100%    98.56%

Likelihood Ratio Method vs Bootstrapping
The Likelihood Ratio Method is easy to compute and very fast, BUT it depends on assumptions that are usually violated by system dynamics models and yields very tight confidence intervals.
Bootstrapping is NOT easy to compute and takes longer, but it does NOT depend on assumptions that are usually violated by system dynamics models, and it yields larger confidence intervals that are usually more conservative.