Presentation is loading. Please wait.

Presentation is loading. Please wait.

Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 1 of 39 Further Statistical Issues Chapter 12 Last revision June 9, 2003.

Similar presentations


Presentation on theme: "Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 1 of 39 Further Statistical Issues Chapter 12 Last revision June 9, 2003."— Presentation transcript:

1 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 1 of 39 Further Statistical Issues Chapter 12 Last revision June 9, 2003

2 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 2 of 39 What We’ll Do... Random-number generation Generating random variates Nonstationary Poisson processes Variance reduction Sequential sampling Designing and executing simulation experiments Backup material: Appendix C: A Refresher on Probability and Statistics Appendix D: Arena’s Probability Distributions

3 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 3 of 39 Random-Number Generators (RNGs) Algorithm to generate independent, identically distributed draws from the continuous UNIF (0, 1) distribution  These are called random numbers in simulation Basis for generating observations from all other distributions and random processes  Transform random numbers in a way that depends on the desired distribution or process (later in this chapter) It’s essential to have a good RNG There are a lot of bad RNGs — this is very tricky  Methods, coding are both tricky x 0 1 f(x)f(x) 1

4 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 4 of 39 The Nature of RNGs Recursive formula (algorithm)  Start with a seed (or seed vector)  Do something weird to the seed to get the next one  Repeat … generate the same sequence  Will eventually repeat (cycle) … want long cycle length Not really “random” as in unpredictable  Does this matter? Philosophically? Practically? Want to “design” RNGs  Long cycle length  Good statistical properties (uniform, independent) -- tests  Fast  Streams (subsegments) – many and long (for variance reduction … later) This is not easy!  Doing something weird isn’t enough

5 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 5 of 39 Linear Congruential Generators (LCGs) The most common of several different methods  But not the one in Arena (though it’s related) … more later Generate a sequence of integers Z 1, Z 2, Z 3, … via the recursion Z i = (a Z i–1 + c) (mod m) a, c, and m are carefully chosen constants Specify a seed Z 0 to start off “mod m” means take the remainder of dividing by m as the next Z i All the Z i ’s are between 0 and m – 1 Return the ith “random number” as U i = Z i / m

6 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 6 of 39 Example of a “Toy” LCG Parameters m = 63, a = 22, c = 4, Z 0 = 19: Z i = (22 Z i–1 + 4) (mod 63), seed with Z 0 = 19 i 22 Z i–1 +4Z i U i 019 1 422440.6984 2 972270.4286 3 598310.4921 4 686560.8889 : ::: 61 158320.5079 62 708150.2381 63 334190.3016 64 422440.6984 65 972270.4286 66 598310.4921 : ::: Cycling — will repeat forever Cycle length  m (could be << m depending on parameters) Pick m BIG But that might not be enough for good statistical properties

7 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 7 of 39 Issues with LCGs Cycle length: < m  Typically, m = 2.1 billion (= 2 31 – 1) or more – Which used to be a lot … more later  Other parameters chosen so that cycle length = m or m – 1 Statistical properties  Uniformity, independence  There are many tests of RNGs – Empirical tests – Theoretical tests — “lattice” structure (next slide …) Speed, storage — both are usually fine Must be carefully, cleverly coded — BIG integers Reproducibility — streams (long internal subsequences) with fixed seeds

8 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 8 of 39 Issues with LCGs (cont’d.) “Regularity” of LCGs (and other kinds of RNGs): For the earlier “toy” LCG … “Design” RNGs: dense lattice in high dimensions Other kinds of RNGs — longer memory in recursion, combination of several RNGs “Random Numbers Fall Mainly in the Planes” — George Marsaglia Plot of U i vs. iPlot of U i+1 vs. U i

9 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 9 of 39 The Original (1983) Arena RNG LCG with:m = 2 31 – 1 = 2,147,483,647 a = 7 5 = 16,807 c = 0 Cycle length = m – 1 = 2.1  10 9 Ten different automatic streams with fixed seeds In its day, this was a good, well-tested generator in an efficient code But current computer speed make this cycle length inadequate  Exhaust in 10 minutes on 2 GHz PC if we just generate Can get this generator (not recommended)  Place a Seeds module (Elements panel) in your model

10 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 10 of 39 The Current (2000) Arena RNG Uses some of the same ideas as LCG  Modulo division, recursive on earlier values But is not an LCG  Combines two separate component generators  Recursion involves more than just the preceding value Combined multiple recursive generator (CMRG) A n = (1403580 A n-2 – 810728 A n-3 ) mod 4294967087 B n = (527612 B n-1 – 1370589 B n-3 ) mod 4294944443 Z n = (A n – B n ) mod 4294967087 Seed = a six-vector of first three A n ’s, B n ’s Two simultaneous recursions Combine the two The next random number Z n / 4294967088 if Z n > 0 4294967087 / 4294967088 if Z n = 0 U n =

11 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 11 of 39 The Current (2000) Arena RNG – Properties Extremely good statistical properties  Good uniformity in up to 45-dimensional hypercube Cycle length = 3.1  10 57  To cycle, all six seeds must match up  On 2 GHz PC, would take 2.8  10 40 millennia to exhaust  Under Moore’s law, it will be 216 years until this generator can be exhausted in a year of nonstop computing Only slightly slower than old LCG  And RNG is usually a minor part of overall computing time

12 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 12 of 39 The Current (2000) Arena RNG – Streams and Substreams Automatic streams and substreams  1.8  10 19 streams of length 1.7  10 38 each  Each stream further divided into 2.3  10 15 substreams of length 7.6  10 22 each – 2 GHz PC would take 669 million years to exhaust a substream Default stream is 10 (historical reasons)  Also used for Chance-type Decide module To use a different stream, append its number after a distribution’s parameters  For example, EXPO(6.7, 4) to use stream 4 When using multiple replications, Arena automatically advances to next substream in each stream for the next replication  Helps synchronize for variance reduction

13 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 13 of 39 Generating Random Variates Have: Desired input distribution for model (fitted or specified in some way), and RNG (UNIF (0, 1)) Want: Transform UNIF (0, 1) random numbers into “draws” from the desired input distribution Method: Mathematical transformations of random numbers to “deform” them to the desired distribution  Specific transform depends on desired distribution  Details in online Help about methods for all distributions Do discrete, continuous distributions separately

14 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 14 of 39 Generating from Discrete Distributions Example: probability mass function Divide [0, 1]  Into subintervals of length 0.1, 0.5, 0.4  Generate U ~ UNIF (0, 1)  See which subinterval it’s in  Return X = corresponding value –203

15 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 15 of 39 Discrete Generation: Another View Plot cumulative distribution function; generate U and plot on vertical axis; read “across and down” Inverting the CDF Equivalent to earlier method

16 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 16 of 39 Generating from Continuous Distributions Example: EXPO (5) distribution Density (PDF) Distribution (CDF) General algorithm (can be rigorously justified): 1.Generate a random number U ~ UNIF(0, 1) 2.Set U = F(X) and solve for X = F –1 (U)  Solving analytically for X may or may not be simple (or possible)  Sometimes use numerical approximation to “solve”

17 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 17 of 39 Generating from Continuous Distributions (cont’d.) Solution for EXPO (5) case: SetU = F(X) = 1 – e –X/5 e –X/5 = 1 – U –X/5 = ln (1 – U) X = – 5 ln (1 – U) Picture (inverting the CDF, as in discrete case): Intuition (garden hose): More U’s will hit F(x) where it’s steep This is where the density f(x) is tallest, and we want a denser distribution of X’s

18 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 18 of 39 Nonstationary Poisson Processes Many systems have externally originating events affecting them — e.g., arrivals of customers If process is stationary over time, usually specify a fixed interevent-time distribution But process could vary markedly in its rate  Fast-food lunch rush  Freeway rush hours Ignoring nonstationarity can lead to serious model and output errors Already seen this — automotive repair shop, Chapter 5

19 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 19 of 39 Nonstationary Poisson Processes – Definition Usual model: nonstationary Poisson process:  Have a rate function (t)  Number of events in [t 1, t 2 ] ~ Poisson with mean Issues:  How to estimate rate function?  Given an estimate, how to generate during simulation? (t) t

20 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 20 of 39 Nonstationary Poisson Processes – Estimating the Rate Function Estimation of the rate function  Probably the most practical method is piecewise constant – Decide on a time interval within which rate is fixed – Estimate from data the (constant) rate during each interval – Be careful to get the units right  Other (more complicated) methods exist in the literature

21 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 21 of 39 Nonstationary Poisson Processes – Generation Arena has a built-in method to generate, assuming a piecewise-constant rate function  Arrival Schedule in Create module – auto repair (Model 5-2) Method is to invert a rate-one stationary Poisson process against the cumulate rate function   Similar to inverting CDF for continuous random-variable generation  Exploits some speed-up possibilities  Details in Help topic “Non-Stationary Exponential Distribution” Alternative method: “thinning” of a stationary Poisson process at the peak rate

22 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 22 of 39 Variance Reduction Random input  random output (RIRO) In other words, output has variance  Higher output variance means less precise results  Would like to eliminate or reduce output variance – One (bad) way to eliminate: replace all input random variables by constants (like their mean) – Will get rid of random output, but will also invalidate model – Thus, best hope is to reduce output variance Easy (brute-force) variance reduction: just simulate some more  Terminating: additional replications  Steady-state: additional replications or a longer run

23 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 23 of 39 Variance Reduction (cont’d.) But sometimes can reduce variance without more runs — free lunch (?) Key: unlike physical experiments, can control randomness in computer-simulation experiments via manipulating the RNG  Re-use the same “random” numbers either as they were, in some opposite sense, or for a similar but simpler model Several different variance-reduction techniques  Classified into categories — common random numbers, antithetic variates, control variates, indirect estimation, …  Usually requires thorough understanding of model, “code”  Will look only at common random numbers in detail

24 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 24 of 39 Common Random Numbers (CRN) Applies when objective is to compare two (or more) alternative configurations or models  Interest is in difference(s) of performance measure(s) across alternatives  Model 7-2 (small mfg. system), total avg. WIP output — two alternatives A. Base case (as is) B. 3.5% increase in business (interarrival-time mean falls from 13 to 12.56 minutes)  Same run conditions, but change model into Model 12-1: – Remove Output File Total WIP History.dat – Add entry to Statistic module to compute and save to a.dat file the total avg. WIP on each replication

25 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 25 of 39 The “Natural” Comparison Run case A, make the change to get to case B and run it, then Compare Means via Output Analyzer: Difference is not statistically significant  Were the runs of A and B statistically independent?  Did we use the same random numbers running A and B?  Did we use the same random numbers intelligently?

26 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 26 of 39 CRN Intuition Get sharper comparison if you subject all alternatives to the same “conditions”  Then observed differences are due to model differences rather than random differences in the “conditions”  Small mfg. system: For both A and B runs, cause: – The “same” parts arrive at the same times – Be assigned same attributes (job type) – Have the same process times at each step Then observed differences will be attributable to system differences, not random bounce  There isn’t any random bounce

27 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 27 of 39 Synchronization of Random Numbers in CRN Generally, get CRN by using the same RNG, seed, stream(s) for all alternatives  Already are using the same stream, default = stream 10  But its usage generally gets mixed up across alternatives Must use the same random numbers for the same purposes across the alternatives — synchronization of random-number usage  Usually requires some work, understanding of model  Usually use different streams in the RNG  Usually different ways to do this in a given model  Sometimes can’t synchronize completely for complex models — settle for partial synchronization

28 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 28 of 39 Synchronization of Random Numbers in CRN (cont’d.) Synchronize by source of randomness (we’ll do)  Assign stream to each point of variate generation – Separate random-number “faucets” … extra parameter in r.v. calls – Model 12-1: 14 sources of randomness, separate stream for each (see book for details), modify into Model 12-2  Fairly simple but might not ensure complete synchronization; still usually get some benefit from this Synchronize by entity (won’t do — see Exercises)  Pre-generate every possible random variate an entity might need when it arrives, assign to attributes, used downstream  Better synchronization insurance but uses more memory Across replications, RNG automatically goes to next substream within each stream  Maintains synchronization if alternatives disagree on number of random numbers used per stream per replication

29 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 29 of 39 Effect of CRN CRN was no more computational effort Effect here is fairly dramatic, but it will not always be this strong  Depends on “how similar” A and B are “Natural” Comparison Synchronized CRN

30 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 30 of 39 CRN Statistical Issues In Output Analyzer, Analyze > Compare Means option, have choice of Paired-t or Two-Sample-t for c.i. test on difference between means  Paired-t subtracts results replication by replication — must use this if doing CRN  Two-Sample-t treats the samples independently — can use this if doing independent sampling, often better than Paired-t Mathematical justification for CRN  Let X = output r.v. from alternative A, Y = output from B Var(X – Y) = Var(X) + Var(Y) – 2 Cov(X, Y) = 0 if indep > 0 with CRN

31 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 31 of 39 Other Variance-Reduction Techniques For single-system simulation, not comparisons Antithetic variates: make pairs of runs  Use “U’s” on first run of pair, “1 – U’s” on second run of pair  Take average of two runs: negatively correlated, reducing variance of average  Like CRN, must take care to synchronize Control variates  Use internal variate generation to “control” results up, down Indirect estimation  Simulation estimates something other than what you want, but related to what you want by a fixed formula

32 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 32 of 39 Sequential Sampling Always try to quantify imprecision in results  If imprecision is “small enough,” you’re done  If not, need to do something to increase precision  Just saw one way: variance-reduction techniques Obvious way to increase precision: keep simulating one more “step” at a time, quit when you achieve desired precision  Terminating models: “step” = another replication – Cannot extend length of replications — that’s part of the model  Steady-state models: – “step” = another replication if using truncated replications, or – “step” = some extension of the run if using batch means

33 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 33 of 39 Sequential Sampling — Terminating Models Modify Model 12-2 (small mfg., base case, with random-number streams) into Model 12-3 Made 25 replications, get 95% c.i. on expected average Total WIP as 13.03 ± 1.28  Suppose the ±1.28 is too big — want to reduce to ±0.5  Approximate formulas from Sec. 6.3: need 124 or 164 total replications (depending on which formula) rather than 25  Instead, just make one more at a time, re-compute c.i., stop as soon as half-width is less than 0.5  “Trick” Arena to keep making more replications until c.i. half-width < tolerance = 0.5

34 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 34 of 39 Sequential Sampling — Terminating Models (cont’d.) Recall: With > 1 replication, automatically get cross-replication 95% c.i.’s for expected values in Category Overview report Related internal Arena variables:  ORUNHALF(Output ID) = half-width of 95% c.i. using completed replications (Output ID = Avg Total WIP )  MREP = total number of replications asked for (initially, MREP = Number of Replications in Run > Setup > Replication Parameters)  NREP = replication number now in progress (= 1, 2, 3, …) Use, manipulate these variables

35 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 35 of 39 Sequential Sampling — Terminating Models (cont’d.) Initially set MREP to huge value in Number of Replications in Run > Setup > Replication Parameters  Keep replicating until we cut off when half-width  0.5 Add a logic (in a submodel) to sense when done  Create one “control” entity at beginning of each replication  Control entity immediately checks to see if: – NREP  2: This is beginning of 1st or 2nd replication – ORUNHALF(Avg Total WIP) > 0.5: c.i. on completed replications is still too big In either case, keep going with this replication (and the next one too); control entity is Disposed and takes no action  If both conditions are false, Control entity Assigns MREP = NREP to stop after this replication, and is Disposed

36 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 36 of 39 Sequential Sampling — Terminating Models (cont’d.) Details  This overshoots required number of replications by one  In Assign module setting MREP to NREP, have to select Type = Other since MREP is a built-in Arena variable  Results: Stopped with 232 total replications, yielding half width = 0.49699 (barely less than 0.5)  Different from earlier number-of-replications approximations (they’re just that) Generalizations  Precision demands on several outputs  Relative-width stopping: (half-width) / (pt. estimate) small

37 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 37 of 39 Sequential Sampling — Steady-State Models If doing truncated-replications approach to steady-state statistical analysis  Same strategy as above for terminating models  Warm-up Period specified in Run > Setup > Replication Parameters  Err on the side of too much warmup – Point-estimator bias is especially dangerous in sequential sampling – Getting tight c.i. centered in the wrong place – The tighter the c.i. demand, the worse the coverage probability

38 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 38 of 39 Sequential Sampling — Steady-State Models (cont’d.) Batch-means approach  Model 12-4: modification of Model 7-4 (small mfg. system) – want half-width on E(average WIP) to be < 1  Keep extending the run to reduce c.i. half-width  Use automatic run-time batch-means 95% c.i.’s  Stopping criterion: Terminating Condition field of Run > Setup > Replication Parameters – Half-width variables are THALF(Tally ID) or DHALF(Dstat ID) – For us, condition is DHALF(Total WIP) < 1  Remove all other stopping devices from model  If “Insuf” or “Corr” would be returned because of too little data, half-width variables set to huge value — keep going  Could demand multiple smallness criteria, relative precision (use TAVG, DAVG variables for point-estimate denominator)

39 Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 39 of 39 Designing and Executing Simulation Experiments Think of a simulation model as a convenient “testbed” or laboratory for experimentation  Look at different output responses  Look at effects, interaction of different input factors Apply classical experimental-design techniques  Factorial experiments — main effects, interactions  Fractional-factorial experiments  Factor-screening designs  Response-surface methods, “metamodels”  CRN is “blocking” in experimental-design terminology  Process Analyzer (PAN) provides a convenient way to carry out a designed experiment – See Chapt. 6 for an example of using PAN for a factorial experiment


Download ppt "Simulation with Arena, 3 rd ed.Chapter 12 – Further Statistical IssuesSlide 1 of 39 Further Statistical Issues Chapter 12 Last revision June 9, 2003."

Similar presentations


Ads by Google