Testing Random-Number Generators Andy Wang CIS 5930-03 Computer Systems Performance Analysis.

Slides:



Advertisements
Similar presentations
Hypothesis: It is an assumption of population parameter ( mean, proportion, variance) There are two types of hypothesis : 1) Simple hypothesis :A statistical.
Advertisements

© 2010 Pearson Prentice Hall. All rights reserved Least Squares Regression Models.
Chapter 8 Estimation: Additional Topics
Multiple regression analysis
Probability Densities
Point estimation, interval estimation
Simulation Modeling and Analysis
1 Random Number Generation H Plan: –Introduce basics of RN generation –Define concepts and terminology –Introduce RNG methods u Linear Congruential Generator.
Statistical inference form observational data Parameter estimation: Method of moments Use the data you have to calculate first and second moment To fit.
BHS Methods in Behavioral Sciences I
Random Number Generation
IEEM 3201 Two-Sample Estimation: Paired Observation, Difference.
Distinguishing Features of Simulation Time (CLK)  DYNAMIC focused on this aspect during the modeling section of the course Pseudorandom variables (RND)
Chap 9-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 9 Estimation: Additional Topics Statistics for Business and Economics.
- 1 - Summary of P-box Probability bound analysis (PBA) PBA can be implemented by nested Monte Carlo simulation. –Generate CDF for different instances.
Quantitative Business Analysis for Decision Making Simple Linear Regression.
8-5 Testing a Claim About a Standard Deviation or Variance This section introduces methods for testing a claim made about a population standard deviation.
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 4 Continuous Random Variables and Probability Distributions.
BCOR 1020 Business Statistics
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 7 Statistical Intervals Based on a Single Sample.
1/49 EF 507 QUANTITATIVE METHODS FOR ECONOMICS AND FINANCE FALL 2008 Chapter 9 Estimation: Additional Topics.
Fall 2011 CSC 446/546 Part 6: Random Number Generation.
+ Quantitative Statistics: Chi-Square ScWk 242 – Session 7 Slides.
ETM 607 – Random Number and Random Variates
Linear Regression Inference
Regression Analysis (2)
One-Factor Experiments Andy Wang CIS 5930 Computer Systems Performance Analysis.
Copyright © 2013, 2010 and 2007 Pearson Education, Inc. Chapter Inference on the Least-Squares Regression Model and Multiple Regression 14.
SECTION 6.4 Confidence Intervals for Variance and Standard Deviation Larson/Farber 4th ed 1.
STAT 497 LECTURE NOTES 2.
Correlation.
Topics: Statistics & Experimental Design The Human Visual System Color Science Light Sources: Radiometry/Photometry Geometric Optics Tone-transfer Function.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Random-Number Generation Andy Wang CIS Computer Systems Performance Analysis.
1 Statistical Distribution Fitting Dr. Jason Merrick.
Chapter 7 Random-Number Generation
Tests for Random Numbers Dr. Akram Ibrahim Aly Lecture (9)
CS433: Modeling and Simulation Dr. Anis Koubâa Al-Imam Mohammad bin Saud University 15 October 2010 Lecture 05: Statistical Analysis Tools.
The Examination of Residuals. Examination of Residuals The fitting of models to data is done using an iterative approach. The first step is to fit a simple.
1 Chapter 2: Simple Comparative Experiments (SCE) Simple comparative experiments: experiments that compare two conditions (treatments) –The hypothesis.
MEGN 537 – Probabilistic Biomechanics Ch.5 – Determining Distributions and Parameters from Observed Data Anthony J Petrella, PhD.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
Statistics in Biology. Histogram Shows continuous data – Data within a particular range.
Testing Hypothesis That Data Fit a Given Probability Distribution Problem: We have a sample of size n. Determine if the data fits a probability distribution.
Randomness Test Fall 2012 By Yaohang Li, Ph.D.. Review Last Class –Random Number Generation –Uniform Distribution This Class –Test of Randomness –Chi.
Tests of Random Number Generators
Copyright © 2010 Pearson Education, Inc. Slide
B AD 6243: Applied Univariate Statistics Data Distributions and Sampling Professor Laku Chidambaram Price College of Business University of Oklahoma.
Confidence Intervals for Variance and Standard Deviation.
CY1B2 Statistics1 (ii) Poisson distribution The Poisson distribution resembles the binomial distribution if the probability of an accident is very small.
Review of Probability. Important Topics 1 Random Variables and Probability Distributions 2 Expected Values, Mean, and Variance 3 Two Random Variables.
1 G Lect 7a G Lecture 7a Comparing proportions from independent samples Analysis of matched samples Small samples and 2  2 Tables Strength.
ANOVA, Regression and Multiple Regression March
Review of Probability Concepts Prepared by Vera Tabakova, East Carolina University.
Ch4: 4.3The Normal distribution 4.4The Exponential Distribution.
Chapter 20 Statistical Considerations Lecture Slides The McGraw-Hill Companies © 2012.
Section 6.4 Inferences for Variances. Chi-square probability densities.
Chapter 9 Inferences Based on Two Samples: Confidence Intervals and Tests of Hypothesis.
2/15/2016ENGM 720: Statistical Process Control1 ENGM Lecture 03 Describing & Using Distributions, SPC Process.
Chi Square Test for Goodness of Fit Determining if our sample fits the way it should be.
Objectives (BPS chapter 12) General rules of probability 1. Independence : Two events A and B are independent if the probability that one event occurs.
Richard Kass/F02P416 Lecture 6 1 Lecture 6 Chi Square Distribution (  2 ) and Least Squares Fitting Chi Square Distribution (  2 ) (See Taylor Ch 8,
MONTE CARLO METHOD DISCRETE SIMULATION RANDOM NUMBER GENERATION Chapter 3 : Random Number Generation.
MEGN 537 – Probabilistic Biomechanics Ch.5 – Determining Distributions and Parameters from Observed Data Anthony J Petrella, PhD.
Chapter 9 Roadmap Where are we going?.
Sample Mean Distributions
Chapter 7 Random Number Generation
Statistical Process Control
Chapter 6 Confidence Intervals.
Chapter 9 Estimation: Additional Topics
Presentation transcript:

Testing Random-Number Generators Andy Wang CIS Computer Systems Performance Analysis

Testing Random-Number Generators How do you know if your random number streams are good? –Plot histogram, CDF, QQ-plot –Run tests Necessary but not sufficient to prove the quality of a generator –If it fails a test, the generator is bad –Passing a test is not a guarantee that a generator is good –Tests can also be used to verify distribution matching 2

3 Chi-Square Test Most commonly used test –Tests if a data set matches a distribution Steps –Prepare a histogram with k bins –Compare the observed and expected frequencies via a formula –If the computed value <  2 [1-α; k-1] (Table A.5) We have a match

Chi-Square Example N = 1000 k = 10  2 [1-0.1; 10-1] = –A match 4 iobservedexpected(observed – expected) 2 / expected Total

Fine Prints For nonuniform distributions –Watch out for small expected values, which can affect the outcome Remedy –Use variable bin sizes, so that the expected outcomes are equal Designed for discrete distribution with large sample sizes 5

6 Kolmogorov-Smirnov Test Tests if a sample of n observations is from a continuous distribution Observation –If two distributions match, | observed CDF F o (x) - expected CDF F e (x) | should be small

K-S Test Differences are measured by the maximum observed deviations above and below the expected CDF K + =  n max x [F o (x i ) - F e (x i )] K - =  n max x [F e (x i + 1 ) - F o (x i )] If both K + and K - < K [1-α; n] (Table A.9) –We have a match 7

K-S Example x n = 3x n-1 % 31 n = 30 x 0 = 15 K [0.9, 30] = A match 8 ejej ojoj o j /30sorted o j /30o j – e j e j+1 –o j … maximum0.033 sqrt(30)*max0.183

K-S Test vs. Chi-square Test K-S test –Designed for small samples and continuous distributions –Uses differences between CDFs Similar to QQ-plots –Tests each sample without grouping –Exact Chi-square test –Designed for large samples and discrete distributions –Uses differences between pdfs/pmfs –Requires grouping –Approximate 9

10 Serial-Correlation Test Covariance: tests dependence of two random variables –If the covariance is nonzero, variables are dependent –If the covariance is zero, variables can still be dependent

Autocovariance Autocovariance at lag k (>1) R k : covariance between numbers that are k values apart, 0 < random number U < 1 For large n –R k is normally distributed, with a mean of 0 Variance of 1/[144(n – k)] 11

Autocovariance Confidence interval If confidence interval does not include zero, the sequence has a significant correlation Example from slide 8 –x n = 3x n-1 % 31 –x 0 = 15 –Not so good 12

Autocovariance Another example x n = 7 5 x n-1 % (2 31 – 1) x 0 = 1 A better random-number generator 13

14 Two-Level Tests If the sample test is too small, test results only apply locally If the sample test is large, test results only apply globally

15 Solution –Apply a chi-square test on n bins of size k –Then apply a chi-square test on a set of n- bin statistics –Can locate a nonrandom segment of a random sequence Two-Level Tests

16 k-Dimensional Uniformity or k-Distributivity So far, tests ensure that numbers are uniformly distributed in one dimension For 0 < a < u < b < 1 P(a < u n < b) = b – a It is known as the 1-distributivity of u n For 2-distributivity P(a 1 < u n-1 < b 1 and a 2 < u n < b 2 ) = (b 1 – a 1 )(b 2 – a 2 )

Visual Check Plot successive overlapping pairs of numbers in a 2-D space –E.g., (u 0, u 1 ), (u 1, u 2 ), … Example of 15-bit random numbers: –Tausworthe generator –x 15 + x + 1 –Period: –b n+15  b n+1  b n = 0 –b n = b n-14  b n-15 17

Visual Check Another example of 15-bit random numbers: –Tausworthe generator –x 15 + x –b n+15  b n+4  b n = 0 –b n = b n-11  b n-15 18

19 Serial Test Test for uniformity > 2D –Plot nonoverlapping pairs E.g., (u 0, u 1 ), (u 2, u 3 ), … –For 2D, divide the space into K 2 cells Each cell is expected to have n/2K 2 points Use the chi-square test With K 2 – 1 degrees of freedom –Can generalize it into k-dimension –Dependency likely to fail high-dimension chi-square tests

Serial Test vs. Visual Check Serial Test –Uses nonoverlapping points Assumed by the chi- square test –Has n/2 pairs Visual Check –Uses overlapping points –Has n – 1 pairs 20

21 Spectral Test Plot successive overlapping pairs of numbers K-dimension tuples from an LCG fall on a finite number of parallel hyperplanes

22 x n = 3x n-1 % 31 –Can see three lines with positive slopes –x n = 3x n-1 –x n = 3x n –x n = 3x n Spectral Test Example

Observations k-tuples from an LCG fall on at most (k!m) 1/k parallel hyperplanes, where m is the modulus –M = 232 –3-tuples will have < 2953 hyperplanes –4-tuples, < 566 hyperplanes –10-tuples, < 41 23

Another Spectral Test Example x n = 3x n-1 % 31 –Period: 30 –Pass chi-square test –3 lines x n = 13x n-1 % 31 –Period: 30 –Pass chi-square test –10 closer lines –Better 2-distributivity 24

25 White Slide