1 Rare Event Simulation Estimation of rare event probabilities with naive Monte Carlo techniques requires a prohibitively large number of trials in most interesting cases. For rare event simulation (RES), two stochastic methods, called importance splitting/RESTART and importance sampling (IS), have been investigated extensively by the simulation research community over the last decade.

2 The basic idea of splitting is to partition the state space of the system into a series of nested subsets and to consider the rare event as the intersection of a nested sequence of events. When a sample trajectory enters a given subset during the simulation, numerous random retrials are generated, with the initial state of each retrial being the state of the system at the entry point. In this way the system trajectory is split into a number of new sub-trajectories, hence the name splitting. A similar idea has been developed into a refined simulation technique under the name RESTART.

3 Importance sampling (IS) is also based on the idea of making rare events occur more frequently, or in other words, of speeding up the simulation. Technically, IS aims to select a probability distribution (change of measure) that minimizes the variance of the IS estimate. The efficiency of IS depends (strongly) on obtaining a good change of measure.

4 IMPORTANCE SAMPLING Importance sampling changes the dynamics of the simulation model; the dynamics must be changed so as to increase the number of rare events of interest. Let f(x) be the original sampling distribution and X the stochastic variable. If g(x) is the property of interest, the original problem is that the probability of observing g(x) > 0 is very small when taking samples from f(x). To solve this problem, importance sampling changes the sampling distribution to a new distribution f*(x). Taking samples from f*(x) should increase the probability of observing g(x) > 0 significantly. To retain an unbiased estimate, the observations must be corrected, because the samples x come from f*(x) and not from f(x).

5 Let the property of interest be γ = E_f[g(X)]. This can now be rewritten as
\[
\gamma = E_f[g(X)] = \int g(x)\,f(x)\,dx = \int g(x)\,\frac{f(x)}{f^*(x)}\,f^*(x)\,dx = E_{f^*}[g(X)\,L(X)],
\]
where L(x) = f(x)/f*(x) is called the likelihood ratio. Observe that the expected value of the observations under f is equal to the expected value of the observations under f*, corrected for bias by the likelihood ratio.

6 An unbiased estimator for γ, taking N samples x_i from f*(x), is
\[
\hat{\gamma} = \frac{1}{N}\sum_{i=1}^{N} g(x_i)\,L(x_i),
\]
with variance
\[
\mathrm{Var}(\hat{\gamma}) = \frac{1}{N}\,\mathrm{Var}_{f^*}\!\big(g(X)\,L(X)\big).
\]

7 The main challenge in making importance sampling efficient and robust is to obtain the optimal, or at least a good, change of measure from the original sampling distribution f(x) to a new distribution f*(x). The optimal f*(x) is the one that minimizes the variance. Two general guidelines can be given: if g(x)f(x) > 0 then f*(x) > 0 must hold, otherwise L(x) becomes infinite; and (g(X)L(X) − γ) must be small, i.e. f*(x) ≈ g(x)f(x)/γ.
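As a concrete illustration, the following Python sketch (a minimal example, not from the slides) estimates the small tail probability γ = P(X > 4) for a standard normal X. It samples from a shifted normal f*(x) with its mean moved to the threshold, a simple change of measure chosen here for illustration, and weights each observation by the likelihood ratio L(x) = f(x)/f*(x) to keep the estimate unbiased.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

threshold = 4.0          # rare event: X > 4 under f = N(0, 1)
n = 100_000

# Naive Monte Carlo: almost no samples hit the rare event.
x_naive = rng.standard_normal(n)
gamma_naive = np.mean(x_naive > threshold)

# Importance sampling: sample from f* = N(threshold, 1), a change of
# measure under which the rare event happens about half the time.
x_is = rng.normal(loc=threshold, scale=1.0, size=n)
# Likelihood ratio L(x) = f(x)/f*(x): correct each observation for bias.
L = norm.pdf(x_is) / norm.pdf(x_is, loc=threshold)
gamma_is = np.mean((x_is > threshold) * L)

print(f"exact      : {norm.sf(threshold):.3e}")
print(f"naive MC   : {gamma_naive:.3e}")
print(f"importance : {gamma_is:.3e}")
```

With the same number of samples, the naive estimate rests on a handful of hits while the IS estimate is built from tens of thousands of weighted hits, which is exactly the variance reduction the change of measure is meant to buy.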

8 The RESTART Method The RESTART (REpetitive Simulation Trials After Reaching Thresholds) method is a simple simulation method for the estimation of small probabilities. It was introduced in Villén-Altamirano (1991) and enhanced in Villén-Altamirano (1994), but it is similar to an older technique called splitting, proposed in Kahn et al. (1951). The basic idea of the RESTART method is to consider the rare event as the intersection of a nested sequence of events. The probability of the rare event is thus the product of conditional probabilities, each of which can usually be estimated much more accurately than the rare event probability itself, for a given simulation effort.
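For example (an illustrative calculation, not from the slides): a rare event probability of γ = 10⁻⁶ can be written as the product of six conditional probabilities, γ = p_1 p_2 ⋯ p_6 with each p_k = 0.1. Each factor is then an ordinary event that a few hundred trials can estimate reliably, whereas observing the rare event directly would require on the order of 10⁶ trials per hit.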

9 The method is based on restarting the simulation in valid states of the system that provoke the events of interest more often than a straightforward simulation does, thus leading to a shorter simulation run time. (Figure: Step-by-Step RESTART)

10 One of the main questions for any RESTART implementation is how and when to restart the simulation in order to achieve the most accurate results for a fixed simulation effort. RESTART for overflow probabilities: The basic setting is the following. Consider a Markov process X := (X_t; t ≥ 0) with state space E, and let f be a real-valued function on E. Define Z_t := f(X_t) for all t ≥ 0, and assume for definiteness that Z_0 ≥ 0. For any threshold or level L > 0, let T_L denote the first time that the process Z := (Z_t; t ≥ 0) hits the set [L, ∞), and let T_0 denote the first time, after 0, that Z hits the set (−∞, 0]. We assume that T_L and T_0 are well-defined (possibly infinite) stopping times with respect to the history of X.

11 We are interested in the probability, γ say, of the event D_L := {T_L < T_0}, i.e., the probability that Z up-crosses level L before it down-crosses level 0. Note that γ depends on the initial distribution of X. An exact analysis of γ is often not possible. A standard way to estimate γ by simulation is the following. Generate independently r realizations (sample paths) of the Markov process X. Each path x^(i) := (x_t^(i)) defines a realization z^(i) := (z_t^(i)) of Z. Let I_i be the indicator that z^(i) up-crosses level L before it down-crosses level 0. An unbiased estimate for γ is given by
\[
\hat{\gamma} = \frac{1}{r}\sum_{i=1}^{r} I_i .
\]
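The following Python sketch makes this concrete for a simple biased random walk; the process, its parameters, and the starting state z0 = 1 are illustrative assumptions, since the slides do not fix a particular Markov process. Each replication runs Z until it either reaches L (success) or drops to 0 (failure), and the indicator average gives the standard estimate of γ.

```python
import numpy as np

rng = np.random.default_rng(0)

def up_crossing_indicator(L, p=0.4, z0=1):
    """Run one path of a biased random walk Z (up with prob. p, down with
    prob. 1-p) until it hits level L or returns to 0.
    Returns 1 if L is reached first (event D_L), else 0."""
    z = z0
    while 0 < z < L:
        z += 1 if rng.random() < p else -1
    return int(z >= L)

def crude_mc(L, r):
    """Standard (crude) Monte Carlo estimate of gamma = P(D_L)
    from r independent replications."""
    hits = sum(up_crossing_indicator(L) for _ in range(r))
    return hits / r

print(crude_mc(L=15, r=100_000))
```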

12 For small values of γ this standard method is not very efficient. We can see this by examining the relative error (RE) of the corresponding estimator, which is defined as
\[
\mathrm{RE} := \frac{\sqrt{\mathrm{Var}(\hat{\gamma})}}{\gamma} = \sqrt{\frac{1-\gamma}{r\,\gamma}} .
\]
Note that the relative error tends to infinity as γ tends to 0. An alternative way to estimate γ is based on the following observation: if L > K then D_L ⊂ D_K, where D_K denotes the event that Z up-crosses level K before it down-crosses level 0. Therefore, we have by basic conditional probability γ = p_1 p_2, with p_1 := P(D_K) and p_2 := P(D_L | D_K).
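To see the practical impact (an illustrative calculation): for γ = 10⁻⁹ and a target relative error of 10%, the formula above requires roughly r ≈ (1 − γ)/(γ · RE²) ≈ 10¹¹ replications, which is why naive Monte Carlo is prohibitive for rare events.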

13 Hence, if we estimate both p_1 and p_2 and multiply the results, we obtain an estimate for γ. When p_1 and p_2 are considerably larger than γ, this estimation procedure is likely to be more efficient than the standard method above. Moreover, the same argument may be used when we divide the interval [0, L] into multiple subintervals instead of just two.

14 Fixed Effort RESTART We partition the interval [0, L) into m subintervals [L_0, L_1), [L_1, L_2), ..., [L_{m-1}, L_m), with 0 =: L_0 < L_1 < ... < L_m := L. Let D_i denote the event that the process Z reaches level L_i before returning to 0. It is assumed that Z actually hits all thresholds L_1, ..., L_m if event D_m occurs. Then D_1, D_2, ..., D_m is a nested sequence of events, decreasing to D_m, and with p_1 := P(D_1), p_2 := P(D_2 | D_1), ..., we have γ = p_1 p_2 ⋯ p_m. At the k-th stage (k = 1, ..., m) we wish to estimate the conditional probability p_k. We do this by generating a fixed number of samples I_1^(k), ..., I_{r_k}^(k) of the indicator that the process Z reaches level L_k before returning to 0, starting from level L_{k-1}. We call r_k the simulation effort at stage k, and refer to this RESTART implementation as the Fixed Effort (FE) method.

15 Estimators of p_1, ..., p_m are given by
\[
\hat{p}_k = \frac{1}{r_k}\sum_{i=1}^{r_k} I_i^{(k)} = \frac{S_k}{r_k},
\]
where S_k := Σ_i I_i^(k) is the total number of successes at the k-th stage. Obviously these are unbiased estimators. Moreover, the natural estimator of γ is
\[
\hat{\gamma} = \hat{p}_1\,\hat{p}_2 \cdots \hat{p}_m .
\]
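A minimal Python sketch of the Fixed Effort method for the same illustrative biased random walk as above (the process, level spacing, and per-stage efforts are assumptions for illustration): stage k restarts r_k paths from entry states recorded when level L_{k-1} was first reached, counts how many reach L_k before 0, and multiplies the stage estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_until(z, lo, hi, p=0.4):
    """Run the walk from state z until it hits hi (success) or lo (failure).
    Returns (reached_hi, final_state)."""
    while lo < z < hi:
        z += 1 if rng.random() < p else -1
    return z >= hi, z

def fixed_effort_restart(levels, efforts, p=0.4, z0=1):
    """Fixed Effort RESTART estimate of gamma = P(reach levels[-1] before 0).
    levels = [L_1, ..., L_m]; efforts[k] = r_k, the fixed number of paths
    simulated at stage k."""
    entry_states = [z0]                      # stage 1 starts from the initial state
    gamma_hat = 1.0
    for L_k, r_k in zip(levels, efforts):
        new_states = []
        for _ in range(r_k):
            # restart from a randomly chosen recorded entry state
            z = entry_states[rng.integers(len(entry_states))]
            hit, z_final = run_until(z, lo=0, hi=L_k, p=p)
            if hit:
                new_states.append(z_final)   # entry state at level L_k
        if not new_states:
            return 0.0                       # no path survived this stage
        gamma_hat *= len(new_states) / r_k   # estimate of p_k
        entry_states = new_states
    return gamma_hat

print(fixed_effort_restart(levels=[5, 10, 15], efforts=[2000, 2000, 2000]))
```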

16 Fixed Splitting RESTART (more standard) Every time level L_k is reached from L_{k-1}, a fixed number n_k of samples is started at level L_k (with n_1 = r_1). Hence the effort at level k, r_k = n_k R_{k-1}, is random, where R_{k-1} is the number of successes at stage k−1. The estimator for γ is
\[
\hat{\gamma} = \frac{R_m}{n_1 n_2 \cdots n_m} .
\]
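A matching Python sketch of the fixed splitting variant, again for the illustrative random walk (the splitting factors n_k are assumptions chosen for the example): every path that reaches a threshold is split into a fixed number of copies, so the per-stage effort is random while the splitting factors stay fixed.

```python
import numpy as np
from math import prod

rng = np.random.default_rng(2)

def run_until(z, lo, hi, p=0.4):
    """Walk from z until hitting hi (success) or lo (failure); as in the FE sketch."""
    while lo < z < hi:
        z += 1 if rng.random() < p else -1
    return z >= hi, z

def fixed_splitting_restart(levels, n, p=0.4, z0=1):
    """Fixed Splitting RESTART estimate of gamma.
    levels = [L_1, ..., L_m]; n = [n_1, ..., n_m], where n_1 is the number
    of initial paths and each success at level L_k spawns n_{k+1} retrials."""
    states = [z0] * n[0]                     # n_1 initial paths
    survivors = []
    for L_k, n_next in zip(levels, n[1:] + [1]):
        survivors = []
        for z in states:
            hit, z_final = run_until(z, lo=0, hi=L_k, p=p)
            if hit:
                survivors.append(z_final)
        # split each surviving path into n_{k+1} retrials for the next stage
        states = [z for z in survivors for _ in range(n_next)]
    R_m = len(survivors)                     # paths that reached L_m
    return R_m / prod(n)                     # gamma_hat = R_m / (n_1 * ... * n_m)

print(fixed_splitting_restart(levels=[5, 10, 15], n=[1000, 10, 10]))
```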

17 Single Step vs. Global Step In any simulation experiment involving RESTART we have two choices: either we simulate "stage-by-stage" or "root-by-root". In the first case we complete all the paths starting from a certain stage before we move on to the next one. This is called the Single Step approach.

18 In the second case, we generate all the offspring originating from a single root (i.e. a state in the upper stage which is reached directly from the lower stage) before we move on to the next root. We call this the Global Step approach.

19 Literature: C. Görg, E. Lamers, O. Fuß, P. Heegaard: Rare Event Simulation, in: Modelling and Simulation Environment for Satellite and Terrestrial Communication Networks, Proceedings of the European COST Telecommunications Symposium (COST 256), Chapter 21, pp., Kluwer, Boston, Dordrecht, New York. M.J.J. Garvels and D.P. Kroese: A comparison of RESTART implementations. Proceedings of the 1998 Winter Simulation Conference, D.J. Medeiros, E.F. Watson, J.S. Carson, and M.S. Manivannan, eds.