Presentation is loading. Please wait.

Presentation is loading. Please wait.

Further Statistical Issues

Similar presentations


Presentation on theme: "Further Statistical Issues"— Presentation transcript:

1 Further Statistical Issues
Chapter 12 Last revision August 26, 2006, 2003 Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

2 Chapter 12 – Further Statistical Issues
What We’ll Do ... Random-number generation Generating random variates Nonstationary Poisson processes Variance reduction Sequential sampling Designing and executing simulation experiments Backup material: Appendix C: A Refresher on Probability and Statistics Appendix D: Arena’s Probability Distributions Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

3 Random-Number Generators (RNGs)
Algorithm to generate independent, identically distributed draws from the continuous UNIF (0, 1) distribution These are called random numbers in simulation Basis for generating observations from all other distributions and random processes Transform random numbers in a way that depends on the desired distribution or process (later in this chapter) It’s essential to have a good RNG There are a lot of bad RNGs — this is very tricky Methods, coding are both tricky x 1 f(x) Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

4 Chapter 12 – Further Statistical Issues
The Nature of RNGs Recursive formula (algorithm) Start with a seed (or seed vector) Do something weird to the seed to get the next one Repeat … generate the same sequence Will eventually repeat (cycle) … want long cycle length Not really “random” as in unpredictable Does this matter? Philosophically? Practically? Want to “design” RNGs Long cycle length Good statistical properties (uniform, independent) -- tests Fast Streams (subsegments) – many and long (for variance reduction … later) This is hard! Doing something weird isn’t enough Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

5 Linear Congruential Generators (LCGs)
The most common of several different methods But not the one in Arena (though it’s related) … later Generate a sequence of integers Z1, Z2, Z3, … via the recursion Zi = (a Zi–1 + c) (mod m) a, c, and m are carefully chosen constants Specify a seed Z0 to start off “mod m” means take the remainder of dividing by m as the next Zi All the Zi’s are between 0 and m – 1 Return the ith “random number” as Ui = Zi / m Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

6 Zi = (22 Zi–1 + 4) (mod 63), seed with Z0 = 19
Example of a “Toy” LCG Parameters m = 63, a = 22, c = 4, Z0 = 19: Zi = (22 Zi–1 + 4) (mod 63), seed with Z0 = 19 i 22 Zi–1+4 Zi Ui 0 19 : : : : : : : : Cycling — will repeat forever Cycle length £ m (could be << m depending on parameters) Pick m BIG But that might not be enough for good statistical properties Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

7 Chapter 12 – Further Statistical Issues
Issues with LCGs Cycle length: < m Typically, m = 2.1 billion (= 231 – 1) or more Which used to be a lot … more later Other parameters chosen so that cycle length = m or m – 1 Statistical properties Uniformity, independence There are many tests of RNGs Empirical tests Theoretical tests — “lattice” structure (next slide …) Speed, storage — both are usually fine Must be carefully, cleverly coded — BIG integers Reproducibility — streams (long internal subsequences) with fixed seeds Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

8 Issues with LCGs (cont’d.)
“Regularity” of LCGs (and other kinds of RNGs): For the earlier “toy” LCG … “Design” RNGs: dense lattice in high dimensions Other kinds of RNGs — longer memory in recursion, combination of several RNGs Plot of Ui vs. i Plot of Ui+1 vs. Ui “Random Numbers Fall Mainly in the Planes” — George Marsaglia Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

9 Chapter 12 – Further Statistical Issues
Original (1983) Arena RNG LCG with: m = 231 – 1 = 2,147,483,647 a = 75 = 16,807 c = 0 Cycle length = m – 1 = 2.1  109 Ten different automatic streams with fixed seeds In its day, this was a good, well-tested generator in an efficient code But current computer speed make this cycle length inadequate Exhaust in 10 minutes on 2 GHz PC if we just generate Can get this generator (not recommended) Place a Seeds module (Elements panel) in your model Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

10 Chapter 12 – Further Statistical Issues
Current Arena RNG Uses some of the same ideas as LCG Modulo division, recursive on earlier values But is not an LCG Combines two separate component generators Recursion involves more than just the preceding value Combined multiple recursive generator (CMRG) An = ( An-2 – An-3) mod Bn = ( Bn-1 – Bn-3) mod Zn = (An – Bn) mod Seed = a six-vector of first three An’s, Bn’s Two simultaneous recursions Combine the two Zn / if Zn > 0 / if Zn = 0 Un = The next random number Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

11 The Current (2000) Arena RNG – Properties
Extremely good statistical properties Good uniformity in up to 45-dimensional hypercube Cycle length = 3.1  1057 To cycle, all six seeds must match up On 2 GHz PC, would take 2.8  1040 millennia to exhaust Under Moore’s law, it will be 216 years until this generator can be exhausted in a year of nonstop computing Only slightly slower than old LCG And RNG is usually a minor part of overall computing time Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

12 The Current (2000) Arena RNG – Streams and Substreams
Automatic streams and substreams 1.8  1019 streams of length 1.7  1038 each Each stream further divided into 2.3  1015 substreams of length 7.6  1022 each 2 GHz PC would take 669 million years to exhaust a substream Default stream is 10 (historical reasons) Also used for Chance-type Decide module To use a different stream, append its number after a distribution’s parameters For example, EXPO(6.7, 4) to use stream 4 When using multiple replications, Arena automatically advances to next substream in each stream for the next replication Helps synchronize for variance reduction ... later Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

13 Generating Random Variates
Have: Desired input distribution for model (fitted or specified in some way), and RNG (UNIF (0, 1)) Want: Transform UNIF (0, 1) random numbers into “draws” from the desired input distribution Method: Mathematical transformations of random numbers to “deform” them to the desired distribution Specific transform depends on desired distribution Details in online Help about methods for all distributions Do discrete, continuous distributions separately Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

14 Generating from Discrete Distributions
Example: probability mass function Divide [0, 1] Into subintervals of length 0.1, 0.5, 0.4 Generate U ~ UNIF (0, 1) See which subinterval it’s in Return X = corresponding value –2 3 Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

15 Discrete Generation: Another View
Plot cumulative distribution function; generate U and plot on vertical axis; read “across and down” Inverting the CDF Equivalent to earlier method Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

16 Generating from Continuous Distributions
Example: EXPO (5) distribution Density (PDF) Distribution (CDF) General algorithm (can be rigorously justified): 1. Generate a random number U ~ UNIF(0, 1) 2. Set U = F(X) and solve for X = F–1(U) Solving analytically for X may or may not be simple (or possible) Sometimes use numerical approximation to “solve” Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

17 Generating from Continuous Distributions (cont’d.)
Solution for EXPO (5) case: Set U = F(X) = 1 – e–X/5 e–X/5 = 1 – U –X/5 = ln (1 – U) X = – 5 ln (1 – U) Picture (inverting the CDF, as in discrete case): Intuition (garden hose): More U’s will hit F(x) where it’s steep This is where the density f(x) is tallest, and we want a denser distribution of X’s Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

18 Nonstationary Poisson Processes
Many systems have externally originating events affecting them — e.g., arrivals of customers If process is stationary over time, usually specify a fixed interevent-time distribution But process could vary markedly in its rate Fast-food lunch rush Freeway rush hours Ignoring nonstationarity can lead to serious model and output errors Already seen this — enhanced call center, Models 5-2, 5-3 Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

19 Nonstationary Poisson Processes – Definition
Usual model: nonstationary Poisson process: Have a rate function l(t) Number of events in [t1, t2] ~ Poisson with mean Issues: How to estimate rate function? Given an estimate, how to generate during simulation? l(t) t Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

20 Nonstationary Poisson Processes – Estimating the Rate Function
Estimation of the rate function Probably the most practical method is piecewise constant Decide on a time interval within which rate is fixed Estimate from data the (constant) rate during each interval Be careful to get the units right Other (more complicated) methods exist in the literature Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

21 Nonstationary Poisson Processes – Generation
Arena has a built-in method to generate, assuming a piecewise-constant rate function Arrival Schedule in Create module – Models 5-2, 5-3 Method is to invert a rate-one stationary Poisson process against the cumulate rate function L Similar to inverting CDF for continuous random-variable generation Exploits some speed-up possibilities Details in Help topic “Non-Stationary Exponential Distribution” Alternative method: “thinning” of a stationary Poisson process at the peak rate Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

22 Chapter 12 – Further Statistical Issues
Variance Reduction Random input Þ random output (RIRO) In other words, output has variance Higher output variance means less precise results Would like to eliminate or reduce output variance One (bad) way to eliminate: replace all input random variables by constants (like their mean) Will get rid of random output, but will also invalidate model Thus, best hope is to reduce output variance Easy (brute-force) variance reduction: just simulate some more Terminating: additional replications Steady-state: additional replications or a longer run Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

23 Variance Reduction (cont’d.)
But sometimes can reduce variance without more runs — free lunch (?) Key: unlike physical experiments, can control randomness in computer-simulation experiments via manipulating the RNG Re-use the same “random” numbers either as they were, in some opposite sense, or for a similar but simpler model Several different variance-reduction techniques Classified into categories — common random numbers, antithetic variates, control variates, indirect estimation, … Usually requires thorough understanding of model, “code” Will look only at common random numbers in detail Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

24 Common Random Numbers (CRN)
Applies when objective is to compare two (or more) alternative configurations or models Interest is in difference(s) of performance measure(s) across alternatives Model 7-2 (small mfg. system), total avg. WIP output — two alternatives A. Base case (as is) B. 3.5% increase in business (interarrival-time mean falls from 13 to minutes) Same run conditions, but change model into Model 12-1: Remove Output File Total WIP History.dat Add entry to Statistic module to compute and save to a .dat file the total avg. WIP on each replication Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

25 The “Natural” Comparison
Run case A, make the change to get to case B and run it, then Compare Means via Output Analyzer: Difference is not statistically significant Were the runs of A and B statistically independent? Did we use the same random numbers running A and B? Did we use the same random numbers intelligently? Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

26 Chapter 12 – Further Statistical Issues
CRN Intuition Get sharper comparison if you subject all alternatives to the same “conditions” Then observed differences are due to model differences rather than random differences in the “conditions” Small mfg. system: For both A and B runs, cause: The “same” parts arrive at the same times Be assigned same attributes (job type) Have the same process times at each step Then observed differences will be attributable to system differences, not random bounce There isn’t any random bounce Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

27 Synchronization of Random Numbers in CRN
Generally, get CRN by using the same RNG, seed, stream(s) for all alternatives Already are using the same stream, default = stream 10 But its usage generally gets mixed up across alternatives Must use the same random numbers for the same purposes across the alternatives — synchronization of random-number usage Usually requires some work, understanding of model Usually use different streams in the RNG Usually different ways to do this in a given model Sometimes can’t synchronize completely for complex models — settle for partial synchronization Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

28 Synchronization of Random Numbers in CRN (cont’d.)
Synchronize by source of randomness (we’ll do) Assign stream to each point of variate generation Separate random-number “faucets” … extra parameter in r.v. calls Model 12-1: 14 sources of randomness, separate stream for each (see book for details), modify into Model 12-2 Fairly simple but might not ensure complete synchronization; still usually get some benefit from this Synchronize by entity (won’t do — see Exercises) Pre-generate every possible random variate an entity might need when it arrives, assign to attributes, used downstream Better synchronization insurance but uses more memory Across replications, RNG automatically goes to next substream within each stream Maintains synchronization if alternatives disagree on number of random numbers used per stream per replication Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

29 Chapter 12 – Further Statistical Issues
Effect of CRN “Natural” Comparison Synchronized CRN CRN was no more computational effort Effect here is fairly dramatic, but it will not always be this strong Depends on “how similar” A and B are Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

30 CRN Statistical Issues
In Output Analyzer, Analyze > Compare Means option, have choice of Paired-t or Two-Sample-t for c.i. test on difference between means Paired-t subtracts results replication by replication — must use this if doing CRN Two-Sample-t treats the samples independently — can use this if doing independent sampling, often better than Paired-t Mathematical justification for CRN Let X = output r.v. from alternative A, Y = output from B Var(X – Y) = Var(X) + Var(Y) – 2 Cov(X, Y) = 0 if indep > 0 with CRN Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

31 Other Variance-Reduction Techniques
For single-system simulation, not comparisons Antithetic variates: make pairs of runs Use “U’s” on first run of pair, “1 – U’s” on second run of pair Take average of two runs: negatively correlated, reducing variance of average Like CRN, must take care to synchronize Control variates Use internal variate generation to “control” results up, down Indirect estimation Simulation estimates something other than what you want, but related to what you want by a fixed formula Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

32 Chapter 12 – Further Statistical Issues
Sequential Sampling Always try to quantify imprecision in results If imprecision is “small enough,” you’re done If not, need to do something to increase precision Just saw one way: variance-reduction techniques Obvious way to increase precision: keep simulating one more “step” at a time, quit when you achieve desired precision Terminating models: “step” = another replication Cannot extend length of replications — that’s part of the model Steady-state models: “step” = another replication if using truncated replications, or “step” = some extension of the run if using batch means Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

33 Sequential Sampling — Terminating Models
Modify Model 12-2 (small mfg., base case, with random-number streams) into Model 12-3 Made 25 replications, get 95% c.i. on expected average Total WIP as ± 1.28 Suppose the ±1.28 is too big — want to reduce to ±0.5 Approximate formulas from Sec. 6.3: need 124 or 164 total replications (depending on which formula) rather than 25 Instead, just make one more at a time, re-compute c.i., stop as soon as half-width is less than 0.5 “Trick” Arena to keep making more replications until c.i. half-width < tolerance = 0.5 Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

34 Sequential Sampling — Terminating Models (cont’d.)
Recall: With > 1 replication, automatically get cross-replication 95% c.i.’s for expected values in Category Overview report Related internal Arena variables: ORUNHALF(Output ID) = half-width of 95% c.i. using completed replications (Output ID = Avg Total WIP) MREP = total number of replications asked for (initially, MREP = Number of Replications in Run > Setup > Replication Parameters) NREP = replication number now in progress (= 1, 2, 3, …) Use, manipulate these variables Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

35 Sequential Sampling — Terminating Models (cont’d.)
Initially set MREP to huge value in Number of Replications in Run > Setup > Replication Parameters Keep replicating until we cut off when half-width £ 0.5 Add logic (in a submodel) to sense when done Create one “control” entity at beginning of each replication Control entity immediately checks to see if: NREP £ 2: This is beginning of 1st or 2nd replication ORUNHALF(Avg Total WIP) > 0.5: c.i. on completed replications is still too big In either case, keep going with this replication (and the next one too); control entity is Disposed and takes no action If both conditions are false, Control entity Assigns MREP = NREP to stop after this replication, and is Disposed Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

36 Sequential Sampling — Terminating Models (cont’d.)
Details This overshoots required number of replications by one In Assign module setting MREP to NREP, have to select Type = Other since MREP is a built-in Arena variable Results: Stopped with 232 total replications, yielding half width = (barely less than 0.5) Different from earlier number-of-replications approximations (they’re just that) Generalizations Precision demands on several outputs Relative-width stopping: (half-width) / (pt. estimate) small Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

37 Sequential Sampling — Steady-State Models
If doing truncated-replications approach to steady-state statistical analysis Same strategy as above for terminating models Warm-up Period specified in Run > Setup > Replication Parameters Err on the side of too much warmup Point-estimator bias is especially dangerous in sequential sampling Getting tight c.i. centered in the wrong place The tighter the c.i. demanded, the worse the coverage probability Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

38 Sequential Sampling — Steady-State Models (cont’d.)
Batch-means approach Model 12-4: modification of Model 7-4 (small mfg. system) – want half-width on E(average WIP) to be < 1 Keep extending the run to reduce c.i. half-width Use automatic run-time batch-means 95% c.i.’s Stopping criterion: Terminating Condition field of Run > Setup > Replication Parameters Half-width variables are THALF(Tally ID) or DHALF(Dstat ID) For us, condition is DHALF(Total WIP) < 1 Remove all other stopping devices from model If “Insuf” or “Corr” would be returned because of too little data, half-width variables set to huge value — keep going Could demand multiple smallness criteria, relative precision (TAVG, DAVG variables for point-estimate denominator) Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues

39 Designing and Executing Simulation Experiments
Think of a simulation model as a convenient “testbed” or laboratory for experimentation Look at different output responses Look at effects, interaction of different input factors Apply classical experimental-design techniques Factorial experiments — main effects, interactions Fractional-factorial experiments Factor-screening designs Response-surface methods, “metamodels” CRN is “blocking” in experimental-design terminology Process Analyzer (PAN) provides a convenient way to carry out a designed experiment See Chapt. 6 for an example of using PAN for a factorial experiment Simulation with Arena, 4th ed. Chapter 12 – Further Statistical Issues


Download ppt "Further Statistical Issues"

Similar presentations


Ads by Google