Variance reduction techniques

2 Introduction Simulation models should be coded so that they run efficiently. Efficiency in terms of programming means fast execution and low storage requirements. “Statistical” efficiency, by contrast, relates to the variance of the output random variables of a simulation. If we can somehow reduce the variance of an output random variable of interest without disturbing its expectation, we gain precision: smaller confidence intervals for the same amount of simulating, or, alternatively, a desired precision with less simulating.
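To make “greater precision for the same amount of simulating” concrete, the standard confidence-interval half-width for a mean estimated from n IID replications (a textbook relation, not spelled out on the slide) ties variance to precision:

```latex
% Approximate 100(1 - alpha)% confidence-interval half-width for the mean,
% based on n IID replications with sample variance S^2:
\delta(n) = t_{n-1,\,1-\alpha/2}\,\sqrt{\frac{S^2}{n}}
% Cutting the output variance by a factor k shrinks \delta(n) by \sqrt{k};
% equivalently, the same \delta(n) is reached with about n/k replications.
```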

3 Introduction How a Variance Reduction Technique (VRT) is applied usually depends on the particular model(s) of interest, so a complete understanding of the model is required for proper use of VRTs. Typically, it is impossible to know beforehand how large a variance reduction will be realized, or worse, whether the variance will be reduced at all in comparison with straightforward simulation. If affordable, preliminary runs should be made to compare the results of applying a VRT with those of straightforward simulation.

4 Introduction Some VRTs themselves increase computing costs, and this decrease in computational efficiency must be traded off against the potential gain in statistical efficiency. Almost all VRTs also require some extra effort on the part of the analyst, though this may be little more than understanding the technique itself.

5 Common random numbers (CRN) Probably the most popular and widely used VRT. It is more commonly used for comparing two or more systems than for analyzing a single system. Basic principle: we should compare alternative configurations under similar experimental conditions. We can then be more confident that observed differences are due to differences in the configurations rather than to fluctuations in the “experimental conditions.” In a simulation experiment, these experimental conditions are the generated random variates used to drive the model through simulated time.

6 Common random numbers (CRN) The technique takes its name from the possibility, in many situations, of using the same basic U(0,1) random numbers to drive each of the alternative configurations through time. To see the rationale for CRN, consider two systems to be compared, with output performance measures X_1j and X_2j, respectively, for replication j. Let µ_i = E[X_ij] be the expected output measure for system i. We are interested in ζ = µ_1 − µ_2. Let Z_j = X_1j − X_2j for j = 1, 2, …, n. Then E[Z_j] = ζ, so the sample mean Z̄(n) = (Z_1 + … + Z_n)/n is an unbiased estimator of ζ.

7 Common random numbers (CRN) Since the Z_j's are IID, Var[Z̄(n)] = Var(Z_j)/n = [Var(X_1j) + Var(X_2j) − 2 Cov(X_1j, X_2j)]/n. If the simulations of the two configurations are done independently, with different random numbers, X_1j and X_2j are independent, so the covariance term is zero. If we could instead simulate the two configurations so that X_1j and X_2j are positively correlated, then Cov(X_1j, X_2j) > 0 and the variance of the sample mean is reduced, so its value tends to fall closer to the true difference ζ.
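The following minimal sketch illustrates this effect on a deliberately simple, hypothetical pair of “configurations”: exponential samples with means 1.0 and 1.1 generated by inversion. All names and numeric values are illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(42)
n_reps, m = 1000, 50        # replications, observations per replication
mean1, mean2 = 1.0, 1.1     # two hypothetical configurations; zeta = -0.1

def output(u, mean):
    """Average of exponential variates generated from uniforms u by inversion."""
    return np.mean(-mean * np.log(1.0 - u))

# CRN: both configurations driven by the SAME uniforms on each replication
z_crn = np.empty(n_reps)
for j in range(n_reps):
    u = rng.random(m)
    z_crn[j] = output(u, mean1) - output(u, mean2)

# Independent sampling: each configuration gets its own fresh uniforms
z_ind = np.empty(n_reps)
for j in range(n_reps):
    z_ind[j] = output(rng.random(m), mean1) - output(rng.random(m), mean2)

print("CRN:         mean", z_crn.mean(), "var", z_crn.var(ddof=1))
print("independent: mean", z_ind.mean(), "var", z_ind.var(ddof=1))
# Both estimate zeta = mean1 - mean2 = -0.1, but the CRN variance is far
# smaller because Cov(X_1j, X_2j) > 0 when both react monotonically to u.
```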

8 Common random numbers (CRN) CRN is a technique in which we try to induce this positive correlation by using the same random numbers to simulate all configurations. Success, however, is not guaranteed: CRN works as long as the output performance measures X_1j and X_2j react monotonically, and in the same direction, to the common random numbers. If X_1j and X_2j react in opposite directions, CRN backfires and increases the variance. Another drawback of CRN is that the induced correlation can complicate formal statistical analyses.

9 Common random numbers (CRN) Synchronization To implement CRN, we must match up, or synchronize, the random numbers across the different configurations on a particular replication. Ideally, a random number used for a specific purpose in one configuration should be used for exactly the same purpose in all other configurations. For example, suppose we are comparing different configurations of a queuing system. If a random number is used to generate a service time in one system, the same random number should be used to generate the corresponding service time in the other systems.
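One common way to achieve this synchronization in practice is to give each source of randomness its own dedicated, seeded stream. The sketch below assumes a toy single-server FIFO queue driven by a simplified Lindley recursion; the model, seeds, and parameter values are all hypothetical.

```python
import numpy as np

def avg_wait(service_mean, replication, n_customers=100):
    # One stream per purpose; the seeds depend only on the purpose and the
    # replication index, never on the configuration, so every configuration
    # sees the same underlying arrival and service randomness.
    arrival_rng = np.random.default_rng(1_000_000 + replication)
    service_rng = np.random.default_rng(2_000_000 + replication)
    interarrival = arrival_rng.exponential(1.0, n_customers)
    service = service_rng.exponential(service_mean, n_customers)
    # Lindley recursion: W_{k+1} = max(0, W_k + S_k - A_{k+1})
    w, total = 0.0, 0.0
    for a, s in zip(interarrival, service):
        total += w
        w = max(0.0, w + s - a)
    return total / n_customers

# Same replication index => same random numbers for both configurations
diffs = [avg_wait(0.8, r) - avg_wait(0.9, r) for r in range(200)]
print("difference in mean wait:", np.mean(diffs), "var:", np.var(diffs, ddof=1))
```

Note that the service stream stays synchronized across configurations even though the means differ, because the same underlying draws are merely rescaled by each configuration's service mean.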

10 Antithetic variates Antithetic variates (AV) is a VRT that is more applicable to simulating a single system. As with CRN, we try to induce correlation between separate runs, but now we seek negative correlation. Basic idea: make pairs of runs of the model such that a “small” observation on one run of the pair tends to be offset by a “large” observation on the other, so the two observations are negatively correlated. If we then use the average of the two observations in a pair as a basic data point for analysis, it will tend to be closer to the common expectation µ than it would be if the two observations in the pair were independent.

11 Antithetic variates AV induces negative correlation by using complementary random numbers to drive the two runs of a pair. If U is a random number used for a particular purpose in the first run, we use 1 − U for the same purpose in the second run. This is valid because if U ~ U(0,1), then 1 − U ~ U(0,1) as well. Note that synchronization is required in AV too: the complementary random numbers must be used for the same purpose in both runs of a pair.

12 Antithetic variates Suppose we make n pairs of runs of the simulation model, resulting in observations (X_1(1), X_1(2)), …, (X_n(1), X_n(2)), where X_j(1) is from the first run of the jth pair and X_j(2) is from the antithetic run of the jth pair. Both X_j(1) and X_j(2) are legitimate observations, that is, E[X_j(1)] = E[X_j(2)] = µ, and each pair is independent of every other pair. The total number of replications is thus 2n. For j = 1, 2, …, n, let X_j = (X_j(1) + X_j(2))/2, and let the average X̄(n) of the X_j's be the (unbiased) point estimator of the population mean µ.

13 Antithetic variates Since the X_j's are IID, Var[X̄(n)] = Var(X_j)/n = [Var(X_j(1)) + Var(X_j(2)) + 2 Cov(X_j(1), X_j(2))]/(4n). If the two runs within a pair were made independently, then Cov(X_j(1), X_j(2)) = 0. If, on the other hand, we can induce negative correlation between X_j(1) and X_j(2), then Cov(X_j(1), X_j(2)) < 0, which reduces Var[X̄(n)]. This is the goal of AV.
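A minimal sketch of the whole scheme, assuming a toy output X = e^U with U ~ U(0,1) (so µ = e − 1); everything here is illustrative rather than taken from the slides. Because e^u is monotone in u, the pair (e^U, e^(1−U)) is negatively correlated.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000  # number of pairs, i.e. 2n runs in total

u = rng.random(n)
x_av = (np.exp(u) + np.exp(1.0 - u)) / 2.0                     # X_j from antithetic pair
x_ind = (np.exp(rng.random(n)) + np.exp(rng.random(n))) / 2.0  # independent pair

print("AV estimate:   ", x_av.mean(), " var of X_j:", x_av.var(ddof=1))
print("indep estimate:", x_ind.mean(), " var of X_j:", x_ind.var(ddof=1))
# Both estimate mu = e - 1 ~= 1.71828; the AV variance is smaller because
# Cov(X_j(1), X_j(2)) < 0 for this monotone response.
```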

14 Antithetic variates Like CRN, AV is not guaranteed to work every time. For AV to work, the model's response to a random number used for a particular purpose must be monotonic, in either direction. How about combining CRN and AV? We can drive each configuration using AV and then use CRN to simulate the multiple configurations under similar conditions.

15 Control variates The method of control variates (CV) attempts to exploit correlation between certain random variables to obtain a variance reduction. Suppose we are interested in an output random variable X and, in particular, in µ = E[X]. Let Y be another random variable involved in the same simulation that is thought to be correlated with X, either positively or negatively, and suppose we know its expectation ν = E[Y].

16 Control variates If X and Y are positively correlated, then in a run where we observe Y > ν, it is likely that X is above its expectation µ as well, so we adjust X downward by some amount. Conversely, if we find Y < ν, we suspect X < µ and adjust X upward accordingly. In this way, we use our knowledge of Y's expectation to pull X up or down toward its own expected value µ, thereby reducing its variability about µ from one run to the next. We call Y a control variate for X.

17 Control variates Unlike CRN or AV, the success of CV does not depend on the correlation having a particular sign. If Y and X are negatively correlated, CV still works: we would simply adjust X upward when Y > ν and downward when Y < ν. To implement this idea, we need to quantify the amount of the upward or downward adjustment of X. We express this quantification in terms of the deviation Y − ν of Y from its expectation.

18 Control variates Let a be a constant with the same sign as the correlation between Y and X. We use a to scale the deviation Y − ν into an adjustment to X, defining the “controlled” estimator X_c = X − a(Y − ν). Since µ = E[X] and ν = E[Y], we have E[X_c] = µ for any real number a, so X_c is an unbiased estimator of µ and may have lower variance: Var(X_c) = Var(X) + a² Var(Y) − 2a Cov(X, Y). Hence X_c has smaller variance than X if and only if a² Var(Y) < 2a Cov(X, Y).
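Minimizing this quadratic in a is a standard calculus step, shown here to fill in the algebra behind the formula quoted on the next slide (ρ_XY denotes the correlation between X and Y):

```latex
\frac{d}{da}\operatorname{Var}(X_c) = 2a\operatorname{Var}(Y) - 2\operatorname{Cov}(X,Y) = 0
\;\Longrightarrow\;
a^{*} = \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(Y)},
\qquad
\operatorname{Var}(X_c)\big|_{a=a^{*}} = \bigl(1 - \rho_{XY}^{2}\bigr)\operatorname{Var}(X)
```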

19 Control variates We need to choose the value of a carefully so that this condition is satisfied. The optimal value, minimizing Var(X_c), is a* = Cov(X, Y)/Var(Y). In practice, though, this is a bit more difficult than it appears: depending on the source and nature of the control variate Y, we may not know Var(Y), and we will almost certainly not know Cov(X, Y). Hence obtaining the optimal value of a might be difficult.
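A common workaround, sketched below, is to estimate a* from the simulation data themselves. The example assumes a toy setting where X = e^U and the control is Y = U with known ν = 0.5; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 10_000

u = rng.random(n)
x = np.exp(u)   # output of interest; true mean mu = e - 1
y = u           # control variate with exactly known expectation
nu = 0.5        # nu = E[Y], known analytically here

# Estimate a* = Cov(X, Y) / Var(Y) from the same runs
a_hat = np.cov(x, y, ddof=1)[0, 1] / y.var(ddof=1)
x_c = x - a_hat * (y - nu)   # controlled observations

print("plain:      mean", x.mean(), "var", x.var(ddof=1))
print("controlled: mean", x_c.mean(), "var", x_c.var(ddof=1))
# Estimating a from the same data introduces a small bias in principle, but
# for this strongly correlated (X, Y) pair the variance drops sharply.
```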