Computer Simulation Techniques Generating Pseudo-Random Numbers Lebanese American University Rayana H. Jaafar
Random Numbers Applications
  Fields of application
    Numerical analysis
    Computer programming
    Cryptography
    Simulation (to model randomness)
Random Numbers in Simulation
  Machine Interface Model
    Machine operation time
    Machine repair time
  Realistic simulation models generate random numbers that follow a given theoretical or empirical distribution
Pseudo-Random Numbers
  Pseudo-random numbers (RN): uniformly distributed over the range $[0, 1]$
  Stochastic or random variates: follow some other theoretical or empirical distribution
Random Numbers Generation: True Random Numbers
  Source: physical phenomena that are unpredictable and non-reproducible
    Thermal noise from a semiconductor diode
    Particle emission in radioactive decay
  Disadvantages
    High cost of production and storage
    Non-reproducible
    Not useful for computer simulation
Random Numbers Generation: Pseudo-Random Numbers
  Mathematical algorithms
    Generate numbers deterministically
    The numbers appear random
    Randomness is checked with statistical tests
  Advantages
    Reproducible
    Useful for computer simulation
    Generated on demand
Random Numbers Properties
  Sequences of random numbers or bits should be
    Uniformly distributed
    Statistically independent
    Reproducible
    Non-repeating for the desired length
  “The list of widely used generators that should be discarded is long... Check the default PRNG of your favorite software and be ready to replace it if needed. This last recommendation has been made over and over again over the past 40 years. Perhaps amazingly, it remains as relevant today as it was 40 years ago.”
    International Encyclopedia of Statistical Science
Von Neumann’s Mid-Square Method
  First method used to generate RNs by computer
  Steps
    Square the previous random number
    Extract the middle digits
  Example
    $5772156649^2 = 33317792380594909201$
    Middle ten digits: 7923805949
  How can this method generate a sequence of RNs?
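The two steps above can be sketched in Python as follows. This is a minimal illustration, not a recommended generator; the function name `mid_square_next` and the `digits` parameter are assumptions introduced here. Feeding each output back in as the next input is what turns the method into a sequence.

```python
def mid_square_next(x, digits=10):
    """Return the next mid-square value for a `digits`-digit number x.

    Hypothetical helper: squares x, left-pads the square to 2*digits
    digits, and keeps the middle `digits` digits.
    """
    squared = str(x * x).zfill(2 * digits)
    start = digits // 2
    return int(squared[start:start + digits])


def mid_square_sequence(seed, count, digits=10):
    """Generate `count` values by repeatedly applying mid_square_next."""
    values = []
    x = seed
    for _ in range(count):
        x = mid_square_next(x, digits)
        values.append(x)
    return values


if __name__ == "__main__":
    print(mid_square_next(5772156649))  # 7923805949, as on the slide
    print(mid_square_sequence(5772156649, 5))
```

Note that the sequence can degenerate (for example, once a value of 0 is reached every later value is 0), which is one reason the method is of historical interest only.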
The Congruential Method
  Based on the following recursive algorithm
    $x_{i+1} = (a x_i + c) \bmod m$, with $0 < a < m$ and $0 \le c < m$
  Example
    Seed $x_0 = 0$, $a = 7$, $c = 7$, $m = 10$
    Sequence: 7, 6, 9, 0, 7, 6, 9, 0, ...
    Period $T = 4$
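The recursion can be written as a short Python generator. This is a sketch using the slide's toy parameters; the names `lcg` and `lcg_period` are illustrative assumptions, and real applications would use a far larger modulus.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield x_1, x_2, ... from x_{i+1} = (a * x_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def lcg_period(seed, a, c, m):
    """Count steps until the first generated value reappears.

    Assumes the first output lies on the cycle, which holds for the
    slide's example; a general cycle finder would track all states.
    """
    gen = lcg(seed, a, c, m)
    first = next(gen)
    steps = 1
    while next(gen) != first:
        steps += 1
    return steps


if __name__ == "__main__":
    print(list(islice(lcg(0, 7, 7, 10), 8)))  # [7, 6, 9, 0, 7, 6, 9, 0]
    print(lcg_period(0, 7, 7, 10))            # 4
```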
Tausworthe Generators (TG)
  Additive congruential generators where $m = 2$
    $x_i = (a_1 x_{i-1} + \dots + a_n x_{i-n}) \bmod 2$
    $x_i = a_1 x_{i-1} \oplus \dots \oplus a_n x_{i-n}$
    $x_i \in \{0, 1\}$, $a_i \in \{0, 1\}$
  Characteristics
    Very long cycles
    Relatively slow
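A bit-level sketch of this recurrence in Python is shown below. The function name `tausworthe_bits`, the list-based state, and the example coefficients are assumptions chosen for illustration; practical implementations pack bits into machine words instead.

```python
def tausworthe_bits(coeffs, seed, count):
    """Generate `count` bits from x_i = a_1 x_{i-1} XOR ... XOR a_n x_{i-n}.

    coeffs: [a_1, ..., a_n], each 0 or 1.
    seed:   the n most recent bits, oldest first (seed[-1] is x_{i-1}).
    """
    if len(coeffs) != len(seed):
        raise ValueError("seed must hold exactly n bits")
    state = list(seed)
    out = []
    for _ in range(count):
        bit = 0
        for lag, a in enumerate(coeffs, start=1):
            bit ^= a & state[-lag]      # a_lag * x_{i-lag}, summed mod 2
        out.append(bit)
        state = state[1:] + [bit]       # slide the window forward
    return out


if __name__ == "__main__":
    # x_i = x_{i-1} XOR x_{i-2}, seeded with 1, 1
    print(tausworthe_bits([1, 1], [1, 1], 6))  # [0, 1, 1, 0, 1, 1]
    # x_i = x_{i-3} XOR x_{i-4}, an illustrative 4-bit recurrence
    print(tausworthe_bits([0, 0, 1, 1], [1, 0, 0, 0], 15))
```

With $n$ state bits the cycle can be at most $2^n - 1$ long, so long cycles come from choosing a large $n$ with suitable coefficients.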
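Lagged Fibonacci Generators (LFG)
  General form
    $x_n = (x_{n-j} \circ x_{n-k}) \bmod m$, with $0 < j < k$
    where $\circ$ is a binary operation such as addition, subtraction, multiplication, or XOR
  Characteristics
    Produce RNs with very good statistical properties
    Execution can be parallelized
    Highly sensitive to the seed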
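The general form can be sketched as a Python generator. The name `lagged_fibonacci`, the `op` parameter, and the tiny lags in the usage example are assumptions made for illustration; they are not parameters given in the slides.

```python
import operator
from collections import deque
from itertools import islice


def lagged_fibonacci(seed, j, k, m, op=operator.add):
    """Yield x_n = op(x_{n-j}, x_{n-k}) mod m.

    seed holds the k initial values, oldest first, so seed[-1] is x_{n-1}.
    """
    if not 0 < j < k:
        raise ValueError("lags must satisfy 0 < j < k")
    if len(seed) != k:
        raise ValueError("seed must contain exactly k values")
    state = deque(seed, maxlen=k)       # keeps only the last k values
    while True:
        x = op(state[-j], state[-k]) % m
        state.append(x)
        yield x


if __name__ == "__main__":
    # With j = 1, k = 2 and addition this is the ordinary Fibonacci
    # recurrence, reduced mod 10 here to keep the numbers small.
    print(list(islice(lagged_fibonacci([1, 1], 1, 2, 10), 8)))
    # [2, 3, 5, 8, 3, 1, 4, 5]
```

Because the first $k$ values are all supplied by the caller, a poor seed (for example, all even values with addition) directly degrades the output, which illustrates the seed sensitivity noted above.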
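Mersenne Twister (MT)
  Generates a sequence of bits grouped as blocks
    $x_{k+n} = x_{k+m} \oplus (x_k^{u} \mid x_{k+1}^{l})\, A, \quad k \ge 0$
  $A$ is a $w \times w$ matrix
    $A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_{w-1} & a_{w-2} & \cdots & \cdots & a_0 \end{pmatrix}$
  $x$ is a block of $w$ bits
  $n$ is the degree of the recurrence relation
  $x_k^{u}$ is the upper $w - r$ bits of $x_k$
  $x_{k+1}^{l}$ is the lower $r$ bits of $x_{k+1}$
  $r$ is the separation point
  $1 \le m < n$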
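Mersenne Twister
  The multiplication by $A$ can be done as follows
    $xA = \begin{cases} x \gg 1 & \text{if } x_0 = 0 \\ (x \gg 1) \oplus a & \text{if } x_0 = 1 \end{cases}$
    where $a = (a_{w-1}, \dots, a_0)$ and $x = (x_{w-1}, \dots, x_0)$
  To improve the statistical properties, multiply $x$ by $T$, which is equivalent to $x \to z = xT$
    $y = x \oplus (x \gg q)$
    $y = y \oplus ((y \ll s) \wedge b)$
    $y = y \oplus ((y \ll t) \wedge c)$
    $z = y \oplus (y \gg p)$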
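The two operations can be sketched in Python with bitwise operators. The numeric constants below are the standard 32-bit MT19937 parameters, which the slides do not state, so treat them as an assumption of this example; the function names `times_a` and `temper` are also illustrative. Only the shift-based multiplication and the output transformation are shown, not the full generator with its state array and seeding.

```python
MASK32 = 0xFFFFFFFF          # keep results within w = 32 bits
A_COEFFS = 0x9908B0DF        # bottom row of A (MT19937 value, assumed here)


def times_a(x, a=A_COEFFS):
    """Compute xA: shift right by one, XOR with a if the lowest bit x_0 is 1."""
    shifted = x >> 1
    return shifted ^ a if x & 1 else shifted


def temper(x, q=11, s=7, b=0x9D2C5680, t=15, c=0xEFC60000, p=18):
    """Apply the transformation z = xT using the four shift-and-mask steps.

    q, s, t, p are shift amounts and b, c are bit masks, named as on the
    slide; the default values are the MT19937 ones.
    """
    y = x ^ (x >> q)
    y ^= (y << s) & b
    y ^= (y << t) & c
    return (y ^ (y >> p)) & MASK32


if __name__ == "__main__":
    print(hex(times_a(2)))            # 0x1: lowest bit clear, plain shift
    print(hex(times_a(1)))            # 0x9908b0df: lowest bit set, XOR with a
    print(hex(temper(0x12345678)))
```

Replacing a matrix product by one shift and a conditional XOR is what makes the recurrence cheap to evaluate.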
Hypothesis Testing
  Null hypothesis $H_0$: “The produced sequence is random”
  Alternative hypothesis $H_a$: “The produced sequence is not random”
  For PRNGs, we test the randomness of an output sequence
  The result of the statistical test is itself probabilistic
    Real situation     Decision: H_0 accepted     Decision: H_0 rejected
    H_0 true           Valid                      Type I error
    H_0 not true       Type II error              Valid
  $\Pr(\text{Type I error}) = \alpha$, the level of significance
Frequency Test
  Tests whether the number of 0’s and 1’s in the sequence is about the same
  Steps
    Generate $n$ pseudo-random bits to form a sequence
    Convert the 0’s to -1’s and add the bits: $S_n = X_1 + \dots + X_n$
    Compute $S_0 = |S_n| / \sqrt{n}$
    Compute $p = \operatorname{erfc}(S_0 / \sqrt{2})$
    If $p < 0.01$, reject $H_0$; otherwise the sequence is accepted as random
Frequency Test Example
  Sequence: 1 0 1 1 0 1 0 1 0 1
  Converted: 1 -1 1 1 -1 1 -1 1 -1 1
  $S_n = 6 \cdot (1) + 4 \cdot (-1) = 2$
  $S_0 = |S_n| / \sqrt{n} = 2 / \sqrt{10} = 0.6325$
  $p = \operatorname{erfc}(0.6325 / \sqrt{2}) = 0.527 > 0.01$
  Sequence is accepted as random
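The frequency test steps translate almost line by line into Python using only the standard library. This is a minimal sketch; the function name `frequency_test` and the `alpha` parameter are assumptions of this example.

```python
import math


def frequency_test(bits, alpha=0.01):
    """Return (p_value, accepted) for the frequency (monobit) test.

    bits: sequence of 0/1 integers. Each 0 is counted as -1 and each 1 as +1.
    """
    n = len(bits)
    s_n = sum(1 if b else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    p_value = math.erfc(s_obs / math.sqrt(2))
    return p_value, p_value >= alpha


if __name__ == "__main__":
    sequence = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
    p, accepted = frequency_test(sequence)
    print(round(p, 3), accepted)  # 0.527 True
```

A sequence of ten bits is far too short for a meaningful test; it is used here only so the numbers match the worked example.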
Serial Test
  Checks the randomness of overlapping blocks of $k$, $k-1$, and $k-2$ bits found in a sequence
  Let $e$ be a sequence of $n$ bits and $k < \lfloor \log_2 n \rfloor - 2$
  Extend $e$ by appending its own leading bits ($\|$ denotes concatenation)
    $e'_1 = e \,\|\, (\text{first } k-1 \text{ bits of } e)$
    $e'_2 = e \,\|\, (\text{first } k-2 \text{ bits of } e)$
    $e'_3 = e \,\|\, (\text{first } k-3 \text{ bits of } e)$
  Let $f_i$, $f'_i$ and $f''_i$ be the frequency of occurrence of the $i$-th $k$, $k-1$, and $k-2$ overlapping bit combinations in $e'_1$, $e'_2$ and $e'_3$ respectively
Serial Test
  Compute the following
    $S_k^2 = \frac{2^k}{n} \sum_i f_i^2 - n$
    $\Delta S_k^2 = S_k^2 - S_{k-1}^2$
    $\Delta^2 S_k^2 = S_k^2 - 2 S_{k-1}^2 + S_{k-2}^2$
    $P_1 = \text{IncompleteGamma}(2^{k-2}, \Delta S_k^2 / 2)$
    $P_2 = \text{IncompleteGamma}(2^{k-3}, \Delta^2 S_k^2 / 2)$
  If $P_1$ or $P_2 < 0.01$, reject $H_0$; otherwise the sequence is accepted as random
Serial Test Example
  Let $e = 0011011101$, $n = 10$ and $k = 3$
    $e'_1 = 001101110100$
    $e'_2 = 00110111010$
    $e'_3 = 0011011101$
  For $e'_1$ ($k = 3$ bits)
    $C_1 = 000$, $C_2 = 001$, $C_3 = 010$, $C_4 = 011$, $C_5 = 100$, $C_6 = 101$, $C_7 = 110$, $C_8 = 111$
    $f_1 = 0$, $f_2 = 1$, $f_3 = 1$, $f_4 = 2$, $f_5 = 1$, $f_6 = 2$, $f_7 = 2$, $f_8 = 1$
  For $e'_2$ ($k - 1 = 2$ bits)
    $f'_1 = 1$, $f'_2 = 3$, $f'_3 = 3$, $f'_4 = 3$
  For $e'_3$ ($k - 2 = 1$ bit)
    $f''_1 = 4$, $f''_2 = 6$
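Serial Test Example
  $S_3^2 = 2.8$, $S_2^2 = 1.2$ and $S_1^2 = 0.4$
  $\Delta S_3^2 = 1.6$ and $\Delta^2 S_3^2 = 0.8$
  $P_1 = IG(2, 0.8) = 0.8088 > 0.01$
  $P_2 = IG(1, 0.4) = 0.6703 > 0.01$
  (with $IG$ taken as the regularized upper incomplete gamma function, $IG(2, x) = e^{-x}(1 + x)$ and $IG(1, x) = e^{-x}$)
  Sequence is accepted as random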
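The serial test computation can be sketched in Python with the standard library alone. Two assumptions are made here: the incomplete gamma function is taken to be the regularized upper form $Q(a, x)$, and the helper `igamc_int` only handles positive integer $a$, so the sketch requires $k \ge 3$. All function names are illustrative.

```python
import math
from collections import Counter


def s_squared(bits, k):
    """Compute S_k^2 = (2^k / n) * sum(f_i^2) - n over overlapping k-bit blocks."""
    n = len(bits)
    extended = list(bits) + list(bits[:k - 1])   # append the first k-1 bits
    counts = Counter(tuple(extended[i:i + k]) for i in range(n))
    return (2 ** k / n) * sum(f * f for f in counts.values()) - n


def igamc_int(a, x):
    """Regularized upper incomplete gamma Q(a, x) for positive integer a.

    Uses the closed form Q(a, x) = exp(-x) * sum_{j<a} x^j / j!.
    """
    term, total = 1.0, 1.0
    for j in range(1, a):
        term *= x / j
        total += term
    return math.exp(-x) * total


def serial_test(bits, k, alpha=0.01):
    """Return (P1, P2, accepted) for block length k >= 3."""
    if k < 3:
        raise ValueError("this sketch needs k >= 3 so 2^(k-3) is an integer")
    s_k, s_k1, s_k2 = (s_squared(bits, m) for m in (k, k - 1, k - 2))
    delta1 = s_k - s_k1
    delta2 = s_k - 2 * s_k1 + s_k2
    p1 = igamc_int(2 ** (k - 2), delta1 / 2)
    p2 = igamc_int(2 ** (k - 3), delta2 / 2)
    return p1, p2, (p1 >= alpha and p2 >= alpha)


if __name__ == "__main__":
    e = [int(ch) for ch in "0011011101"]
    print([round(s_squared(e, m), 4) for m in (3, 2, 1)])  # [2.8, 1.2, 0.4]
    print(serial_test(e, 3))
```

For non-integer first arguments, a library routine such as a general regularized incomplete gamma implementation would be needed instead of `igamc_int`.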
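Autocorrelation Test
  Let $e$ be a sequence of $n$ bits and let $e_i$ be its $i$-th bit; the test compares $e$ with a copy of itself shifted by $d$ positions
  Consider $1 \le d \le \lfloor n/2 \rfloor$
  Find the number of different bits between $e_i$ and $e_{i+d}$ as follows
    $A(d) = \sum_{i=0}^{n-d-1} e_i \oplus e_{i+d}$
  Compute
    $S = 2 \left( A(d) - \frac{n-d}{2} \right) / \sqrt{n-d}$
  For a 5% level of significance, we accept $H_0$ if $|S| \le 1.96$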
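A direct Python transcription of the autocorrelation statistic is sketched below. The function name `autocorrelation_test` and the `z_crit` parameter are assumptions of this example, and the acceptance rule is applied two-sided, on the absolute value of $S$.

```python
import math


def autocorrelation_test(bits, d, z_crit=1.96):
    """Return (S, accepted) for shift d, with 1 <= d <= floor(n / 2).

    A(d) counts positions where bit i and bit i + d differ.
    """
    n = len(bits)
    if not 1 <= d <= n // 2:
        raise ValueError("d must satisfy 1 <= d <= floor(n / 2)")
    a_d = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    s = 2 * (a_d - (n - d) / 2) / math.sqrt(n - d)
    return s, abs(s) <= z_crit


if __name__ == "__main__":
    # A strictly alternating sequence differs from its 1-shift everywhere,
    # so it is rejected despite having balanced 0's and 1's.
    alternating = [i % 2 for i in range(101)]
    print(autocorrelation_test(alternating, 1))  # (10.0, False)
```

The alternating sequence shows why this test complements the frequency test: the bit counts are nearly equal, yet the bits are perfectly predictable.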
Runs Test
  Let $e$ be a sequence of $n$ numbers and let $r_i$ be the number of run-ups of length $i$
  Run-ups longer than 6 are counted together with those of length 6
    $R = \frac{1}{n} \sum_{1 \le i,j \le 6} (r_i - n p_i)(r_j - n p_j)\, a_{ij}$
    where $p_i$ and $a_{ij}$ are fixed, tabulated coefficients
  For $n \ge 4000$, $R$ has a Chi-Square distribution under the assumption that the random numbers are independent and identically distributed (i.i.d.)
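The counting step of the runs test, which produces the $r_i$ values, can be sketched in Python as follows. The coefficients $p_i$ and $a_{ij}$ are not given in the slides, so this example deliberately stops at the counts and does not compute $R$; the function name `run_up_counts` is an assumption.

```python
def run_up_counts(values, max_len=6):
    """Count run-ups (maximal strictly increasing stretches) by length.

    Returns [r_1, ..., r_max_len]; runs longer than max_len are added
    to the last bucket.
    """
    counts = [0] * max_len
    if not values:
        return counts
    length = 1
    for prev, cur in zip(values, values[1:]):
        if cur > prev:
            length += 1                              # run continues
        else:
            counts[min(length, max_len) - 1] += 1    # run ends here
            length = 1
    counts[min(length, max_len) - 1] += 1            # close the final run
    return counts


if __name__ == "__main__":
    # Runs: (1 2 9) (8) (5) (3 6 7) (0 4)
    print(run_up_counts([1, 2, 9, 8, 5, 3, 6, 7, 0, 4]))  # [2, 1, 2, 0, 0, 0]
```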
Chi-Square Test for Goodness of Fit
  Let $e$ be a sequence of $n$ random numbers in $[0, 1]$
  Divide the interval $[0, 1]$ into $k$ equal sub-intervals
  Let $f_i$ be the number of random numbers in sub-interval $i$
  If the numbers are truly uniformly distributed, the expected number of random numbers in each sub-interval is $n/k$
    $\chi^2 = \frac{k}{n} \sum_{i=1}^{k} \left( f_i - \frac{n}{k} \right)^2$
  Based on the level of significance and the degrees of freedom ($k - 1$), $H_0$ is rejected if $\chi^2$ is greater than the value obtained from the Chi-Square table
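The statistic can be computed with a few lines of Python. This is a sketch: the function name `chi_square_uniform` is an assumption, and the critical value is passed in by the caller (read from a Chi-Square table) rather than computed.

```python
import random


def chi_square_uniform(numbers, k):
    """Return chi^2 = (k / n) * sum((f_i - n / k)^2) for numbers in [0, 1]."""
    n = len(numbers)
    counts = [0] * k
    for u in numbers:
        index = min(int(u * k), k - 1)   # put u == 1.0 in the last sub-interval
        counts[index] += 1
    expected = n / k
    return (k / n) * sum((f - expected) ** 2 for f in counts)


if __name__ == "__main__":
    rng = random.Random(12345)           # fixed seed for reproducibility
    sample = [rng.random() for _ in range(1000)]
    chi2 = chi_square_uniform(sample, k=10)
    # 16.919 is the tabulated critical value for 9 degrees of freedom
    # at the 5% level of significance.
    print(chi2, "reject H0" if chi2 > 16.919 else "accept H0")
```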