Generating Random Variates
Introduction There are various procedures to produce observations (variates) from some desired input distribution (like exponential, gamma, etc.) The algorithm usually depends on desired distribution But all algorithms have the same general form Note the critical importance of a good random-number generator There may be several different algorithms for the desired distribution
Desired Properties Exact: X should have exactly (not approximately) the desired distribution Example of approximate algorithm: Treat Z = U1 + U2 + ... + U12 – 6 as N(0, 1) Mean and variance are correct; it relies on CLT for approximate normality Range is clearly incorrect Efficient: Low storage Fast Efficient regardless of parameter values (robust) Simple: Easy to understand and implement (often tradeoff against efficiency) Requires only U(0, 1) input If possible, one U ® one X (for synchronization in variance reduction)
Generating Random Variates There are five general approaches to generating a univariate random variables from a distribution Inverse transform method Composition method Convolution method Acceptance-rejection method Special properties
Inverse Transform Method The simplest (in principle) and “best” method in some ways Continuous Case Suppose X is continuous with cumulative distribution function (CDF) F(x) = P(X £ x) for all real numbers x that is strictly increasing over all x Step 2 involves solving the equation F(X) = U for X; the solution is written as X = F–1(U) and we must invert the CDF F Inverting F might be easy (exponential), or difficult (normal) in which case numerical methods might be necessary Algorithm: 1. Generate U ~ U(0, 1) using a random number generator 2. Find X such that F(X) = U and return this value of X
Inverse Transform Method Mathematical proof: Assume that F is strictly increasing for all x. For a fixed value x0 P(X £ x0) = P(F–1(U) £ x0) = P(F(F-1(U)) £ F(x0)) = P(U £ F(x0)) = F(x0) Graphical proof: X1 £ x0 if and only if U1 £ F(x0), so P(X1 £ x0) = P(U1 £ F(x0)) = F(x0)
Examples Weibull (α, β) distribution with parameters α > 0 and β > 0 Density function is CDF is Solving U = F(X) for X, we get Since 1 – U ~ U(0, 1) as well, we can replace 1 – U by U to get
Intuition Behind the Inverse Transform Method Weibull (α = 1.5, β = 6)
Intuition Behind the Inverse Transform Method
The Algorithm in Action
Inverse Transform Method Discrete case Suppose X is discrete with cumulative distribution function F(x) = P(X £ x) for all real numbers x and probability mass function p(xi) = P(X = xi), where x1, x2, ... are the possible values X can take Unlike the continuous case, the discrete inverse transform method can always be used for any discrete distribution Graphical proof: From the above picture, P(X = xi) = p(xi) in every case Algorithm: 1. Generate random number U ~ U(0,1) 2. Find the smallest positive integer i such that U ≤ F(xi) 3. Return X = xi
Example Discrete uniform distribution on 1, 2, ..., 100 xi = i, and p(xi) = p(i) = P(X = i) = 0.01 for i = 1, 2, ..., 100
Example “Literal” inverse transform search 1. Generate U ~ U(0,1) 2. If U ≤ 0.01 return X = 1 and stop; else go on 3. If U ≤ 0.02 return X = 2 and stop; else go on 4. If U ≤ 0.03 return X = 3 and stop; else go on . 100. If U ≤ 0.99 return X = 99 and stop; else go on 101. Return X = 100 Equivalently (on a U-for-U basis) 1. Generate U ~ U(0,1) 2. Return X = ë100 Uû + 1
Generalized Inverse Transform Method X = min{x: F(x) ³ U} This is valid for any CDF F(x) Continuous, possibly with flat spots (i.e., not strictly increasing) Discrete Mixed continuous-discrete Problems with the inverse transform method Inverting the CDF may be difficult (numerical methods) It may not be the fastest or simplest approach for a given distribution
Other Implementations of Inverse Transform Approach Sometimes the inverse function is not analytically available but can be computed numerically. In this case, we can still ply the inverse transform approach by using the numerical inverse. Consider the normal distribution, the CDF cannot be written in closed form so the inverse function does not exist in closed form. However, most software enables numerical calculation of the inverse function. Example: NORMINV(p,m,s) in Excel returns F-1X(p) where X is Normal(m,s) and 0<p<1. Therefore, we can generate realizations from X by: NORMINV(RAND() ,m,s) which is equivalent to: F-1X(U) where U is uniform (0,1).
Acceptance-Rejection Method This is used when inverse transform is not directly applicable or is inefficient (like for gamma and beta distributions) It has continuous and discrete versions; we’ll just do the continuous case, discrete case is similar Goal is to generate X with some density function f Specify a function t(x) that majorizes f(x), (t(x) ≥ f(x) for all x)
Acceptance-Rejection Method Then t(x) ≥ 0 for all x, but , so t(x) is not a density Set Define a new density r(x) = t(x)/c for all x Algorithm Generate Y having density r Generate U ~ U(0,1) (Independent of Y in Step 1) If U ≤ f(Y)/t(Y), return X = Y and stop; else, go back to step 1 and try again (Repeat 1-2-3 until acceptance finally occurs in Step 3) Since t majorizes f, f(Y)/t(Y) ≤ 1 so “U ≤ f(Y) / t(Y)” may or may not occur We must be able to generate Y with density r, hopefully this is easy with our choice of t On each pass, P(acceptance) = 1/c, so we want small c = area under t(x), to “fit” down on top of f closely (we want t and thus r to resemble f closely) Tradeoff between ease of generation from r, and closeness of fit to f
Acceptance-Rejection Method Proof: The key point is that we get an X only conditional on acceptance in step 3
Example of Acceptance-Rejection Method Beta (4,3) distribution, density is f(x) = 60 x3(1-x)2 for 0 ≤ x ≤ 1 The maximum occurs at x=0.6, f(0.6)= 2.0736 (exactly), so let t(x) = 2.0736 for 0 ≤ x ≤ 1 Thus, c = 2.0736, and r is the U(0,1) density function Algorithm: Generate Y ~ U(0,1) Generate U ~ U(0,1) independent of Y If U ≤ 60Y3(1-Y)2 /2.0736, return X=Y; else go back to step 1 and try again P(acceptance) in step 3 is 1/2.0736 = 0.48
Intuition
Intuition A different way to look at it Accept Y if U ≤ f(Y)/ t(Y), so plot the pairs (Y, U t(Y)) and accept the Y’s for which the pair is under the f curve
A Closer Fit Higher acceptance probability on a given pass Harder to generate Y with density shaped like t, can perhaps use the composition method
Example fX(x)=3/4(-x2+1) if -1<x<1 and fX(x)=0 otherwise. Majorizing function? 3/4 x -1 1
Composition Method Suppose that the inverse transform method is difficult or too slow Suppose we can find other CDFs F1, F2, ... (finite or infinite list) and weights p1, p2, ... (pj ≥ 0 and p1 + p2 + ... = 1) such that for all x F(x) = p1F1(x) + p2F2(x) + ... Equivalently, we can decompose the density or mass function f(x) into convex combination of other density or mass functions f(x) = p1f1(x) + p2f2(x) + ... Algorithm: 1. Generate a positive random integer J such that P(J = j) = pj 2. Return X with CDF FJ (given J = j, X is generated independent of J) The trick is to find Fj’s from which generation is easy and fast Sometimes we can use geometry of distribution to suggest a decomposition
Example of Composition Method Symmetric triangular distribution on [–1, +1] Inverse transform method
Example of Composition Method Composition of the density f1(x) p1 f2(x) p2
Example of Composition Method Composition algorithm 1. Generate U1, U2 ~ U(0,1) independently 2. If U1< 1/2, return ;otherwise return Comparison of algorithms (expectations) So composition needs one more U, one less multiplication—faster if RNG is fast Method U’s Compares Adds Multiplies Sq. Roots Inverse transform 1 1.5 Composition 2
Convolution Suppose X has the same distribution as Y1 + Y2 + ... + Ym, where the Yj’s are IID and m is fixed and finite Then, X ~ Y1 + Y2 + ... + Ym has the m-fold convolution of the distribution of Yj Contrast with the composition method Composition method expresses the distribution function (or density or mass) as a (weighted) sum of other distribution functions (or densities or masses) Convolution method expresses the random variable itself as the sum of other random variables Algorithm (obvious) 1. Generate Y1, Y2, ..., Ym independently from their distribution 2. Return X = Y1 + Y2 + ... + Ym
Examples of Convolution Method Symmetric triangular distribution on [–1, +1] (again) By simple conditional probability, if U1,U2 ~ IID U(0,1), then U1 + U2 ~ symmetric triangular distribution on [0, 2], so just shift it left by 1 2 Us, 2 adds; no compares, multiplies, or square roots and it clearly beats both inverse transform and composition methods X ~ Erlang m with mean b > 0 Then, X = Y1 + Y2 + ... + Ym where Yj’s ~ IID exponential with mean b/m
Special Properties Simply “tricks” that rely completely on a given distribution’s form It often involves combining several “component” variates algebraically (like convolution) Must be able to figure out distributions of functions of random variables There is no coherent general form, only examples