1
Parallel White Noise Generation on a GPU via Cryptographic Hash
Stanley Tzeng, Li-Yi Wei
Microsoft Research Asia
2
What is White Noise?
Spatial domain: uniform random number
Frequency domain: white noise
[Figure: the same noise shown in the spatial domain and in the frequency domain]
3
Importance
Mother of all random numbers
Commonly used, e.g. rand() in C/C++
Major algorithms sequential
  e.g. x_n = (a x_{n-1} + b) mod c
Processors are becoming parallel
  GPU, multi-core CPU, Cell
  sequential algorithms cannot leverage that
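To make the sequential dependency concrete, here is a minimal sketch of the linear congruential recurrence from the slide. The constants a, b, c below are illustrative assumptions (values commonly quoted for textbook C rand() examples), not something the slides specify.

```python
from itertools import islice

def lcg(seed, a=1103515245, b=12345, c=2**31):
    """Yield x_n = (a * x_{n-1} + b) mod c, starting from the given seed.

    The constants are illustrative assumptions, not taken from the slides.
    """
    x = seed
    while True:
        x = (a * x + b) % c
        yield x

def nth_sample(seed, n):
    """Return sample number n (1-based); this needs n sequential steps."""
    return next(islice(lcg(seed), n - 1, None))

first_three = list(islice(lcg(1), 3))
```

Every sample depends on the one before it, so reaching sample n costs n steps and the work cannot be split across threads without extra machinery.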
4
Contribution
☺ Parallel algorithm for white noise
  independent evaluation for every sample
  easy implementation as a GPU pixel shader
  speed: faster than sequential algorithms
  quality: same or better
  usage: similar to texture mapping
5
PRNG (Pseudo Random Number Generator)
The main source of randomness in programs
Desirable properties
  white noise statistics
  repeatable
  fast computation
  low memory usage
6
Core Idea
input trivially prepared in parallel, e.g. linear ramp
feed input value into hash, independently and in parallel
output white noise
key idea: borrow cryptographic hash!
[Diagram: input → hash → output]
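The input → hash → output pipeline can be sketched on the CPU as follows. This is a rough illustration, not the authors' shader: the 8-byte little-endian encoding of the ramp value and the use of only the first 32 output bits are assumptions made for the example.

```python
import hashlib
import struct

def white_noise(index):
    """Hash one ramp value and map the first 32 output bits to [0, 1).

    The input encoding (8-byte little-endian integer) is an assumption.
    """
    digest = hashlib.md5(struct.pack("<Q", index)).digest()
    (word,) = struct.unpack_from("<I", digest)
    return word / 2**32

ramp = range(16)                          # linear ramp, trivially prepared
samples = [white_noise(i) for i in ramp]  # each call is independent of the others
```

Because no call reads state left behind by another call, the list comprehension could be replaced by any parallel map without changing the result.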
7
Hash
(however nice) input → (unrecognizable) mess
8
Cryptographic Hash
A subclass of hash
Commonly used for security applications
  e.g. password, digital signature
Properties
  irreversible – cannot find input from hash output
  decorrelating – similar inputs, dissimilar outputs
  uniform probability – all outputs equally likely to occur
9
Cryptographic Hash - Example
irreversible, decorrelating, uniform probability
CHash("The quick brown fox jumps over the lazy dog") = 9e107d9d372bb6826bd81d3542a419d6
CHash("The quick brown fox jumps over the lazy eog") = ffd93f […] fbaef4da268dd0e
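The decorrelating property can be checked with Python's standard hashlib. The first digest on the slide matches MD5 of that string, so this sketch assumes CHash is MD5 here; it counts how many hex digits change when a single input character changes.

```python
import hashlib

def chash(text):
    """MD5 digest of an ASCII string, as 32 hex digits."""
    return hashlib.md5(text.encode("ascii")).hexdigest()

dog = chash("The quick brown fox jumps over the lazy dog")
eog = chash("The quick brown fox jumps over the lazy eog")

# Number of hex positions (out of 32) where the two digests differ.
differing_digits = sum(a != b for a, b in zip(dog, eog))
print(dog)
print(eog)
print(differing_digits, "of 32 hex digits differ")
```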
10
Cryptographic Hash as a PRNG
White noise statistics CHash is cryptographically secure Repeatable CHash is invariant with same input Fast computation CHash is parallel + constant cost Low memory usage CHash maintains no state Order-independent i.e. Random accessible important for parallel GPU applications hash
11
Which Cryptographic Hash?
Many options MD5, SHA, RIPEMD, Tiger, block cipher, etc Desirable properties white noise quality fast computation power-of-2 aligned (output & operations) pure pixel shader, no state maintenance
12
Our Hash of Choice: MD5 [Rivest 1992]
128-bit outputs and 32-bit operation Small number of constants fit entirely in shader Fastest among those satisfying quality criteria Not 100% secure [Wang and Yu 2005] but good enough for our goal
13
MD5 Algorithm Overview
[Diagram: Input → Scrambling (bit op, table, arithmetic), 64 rounds, using a shift table and a sin table → Output]
14
Performance Bottlenecks for Pixel Shader
Scrambling (bit op, table, arithmetic) Input Output shift table sin table 64 rounds
15
Our Optimization
[Diagram: Input → Scrambling (bit op, table, arithmetic) → Output]
shift table → reduced shift table
sin table → sin function
64 rounds → loop unrolling
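The slides do not spell out how the tables are reduced, so the following is only a sketch of one plausible reading. In the MD5 specification the 64 "sin table" constants are floor(2^32 * |sin(i + 1)|), so they can be computed instead of stored, and the 64 shift amounts repeat in groups of four, so 16 values are enough. The names sin_constant, REDUCED_SHIFTS and shift_amount are hypothetical.

```python
import math

def sin_constant(i):
    """MD5 additive constant for step i (0-based), computed rather than looked up."""
    return int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF

# 16 distinct rotation amounts: one row per group of 16 steps, cycled every 4 steps.
REDUCED_SHIFTS = [
    [7, 12, 17, 22],
    [5, 9, 14, 20],
    [4, 11, 16, 23],
    [6, 10, 15, 21],
]

def shift_amount(i):
    """Rotation amount for step i, read from the 4 x 4 table instead of 64 entries."""
    return REDUCED_SHIFTS[i // 16][i % 4]

full_sin_table = [sin_constant(i) for i in range(64)]
full_shift_table = [shift_amount(i) for i in range(64)]
```

On a GPU the trade-off is between table lookups and arithmetic; whether computing the sine per step is actually cheaper depends on the hardware, which is not something this CPU sketch can show.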
16
Previous PRNG
GPU
  BBS [Blum et al. 1986, Olano 2005]
    O extremely fast
    X not good quality
  CEICG [Entacher et al. 1998, Sussman et al. 2006]
    O decent quality
    X processing time varies
  AES [NIST 2001, Yamanouchi ]
    O invertible (not hash)
CPU
  rand
    O commonly used
    X not good quality
  drand48
    O better quality
    X slower
  Mersenne Twister [Matsumoto and Nishimura 1998]
    O high quality and fast
    X not random accessible
17
Assessing Quality: DIEHARD [Marsaglia 1995]
De facto standard on measuring PRNG quality Runs 15 different tests on the bits generated Outputs p-val. If p == 0 || p == 1, fail. BIRTHDAY SPACINGS TEST, M= 512 N=2**24 LAMBDA= Results for aes.bin For a sample of size 500: mean aes.bin using bits 1 to duplicate number number spacings observed expected 6 to INF Chisquare with 6 d.o.f. = p-value=
18
Cumulative Distribution Function
Shows how data is distributed within set Given x in data, what % of data values are ≤ x 100% 100 % 0 % 0 % X=0 1 X=0 1 Normal Distribution Uniform Distribution
19
Kolmogorov-Smirnov Test
Determines how two sets of data are alike Looks at max difference D between distribution functions 100 % not alike 100 % alike D D 0 % 0 % X=0 1 X=0 1
20
Assessing Quality: DIEHARD
Run the results of the DIEHARD test (p-value) through a KS-test. Look at D-value. 100 Uniform Distribution Curve P-value Curve D-Value D Smaller D is better quality! Cumulative Distribution Function
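As a hedged illustration of the D-value, the sketch below computes the one-sample Kolmogorov-Smirnov statistic of a list of p-values against the uniform distribution on [0, 1]. The function name ks_d_uniform and the example inputs are made up for this sketch; the slides do not say which KS implementation was used.

```python
def ks_d_uniform(p_values):
    """Max distance between the empirical CDF of p_values and the uniform CDF F(x) = x."""
    xs = sorted(p_values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d

evenly_spread = ks_d_uniform([0.25, 0.5, 0.75])   # close to uniform, small D
all_failing = ks_d_uniform([0.0] * 10)            # every p-value is 0, D is maximal
```

P-values that spread evenly over [0, 1] give a small D, while p-values piled up at 0 or 1 (the failing cases on the DIEHARD slide) push D toward 1.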
21
Assessing Quality: Power Spectrum
Radial mean: should be uniform Radial variance: should be low & uniform Power spectrum density Radial mean Radial variance (Anisotropy)
22
Assessing Speed: Batch Rendering
Clock time to generate random bits n2 x 128 bits image, n = 512, 1024, 2048 and 4096 n2 n2
23
Assessing Speed: Texture Subset (For random accessibility)
A huge virtual texture clock time for access A B measure difference (smaller is better) 220 220 B
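Random access into a huge virtual texture can be sketched like this: a texel is computed directly from its coordinates, so a small window costs the same wherever it sits and nothing outside it is ever generated. Hashing the packed (x, y) pair with MD5 is an assumption of this example, as are the names texel and window and the 2^20 side length.

```python
import hashlib
import struct

SIDE = 2**20  # assumed side length of the virtual texture

def texel(x, y):
    """128-bit noise value at (x, y), returned as four 32-bit words."""
    if not (0 <= x < SIDE and 0 <= y < SIDE):
        raise ValueError("coordinate outside the virtual texture")
    digest = hashlib.md5(struct.pack("<II", x, y)).digest()
    return struct.unpack("<4I", digest)

def window(x0, y0, width, height):
    """Evaluate only the requested sub-rectangle, row by row."""
    return [[texel(x, y) for x in range(x0, x0 + width)]
            for y in range(y0, y0 + height)]

near_origin = window(0, 0, 4, 4)
far_corner = window(SIDE - 4, SIDE - 4, 4, 4)
```

A sequential generator would have to step through every earlier sample to reach the far corner, whereas here both windows take the same number of hash evaluations.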
24
Test Results: DIEHARD Results
the higher the better the lower the better
25
Test Results: Power Spectrum Tests
MD5 M. Twister GPU BBS
26
Test Results: Batch Render Speed
27
Test Results: Texture Subset Speed
28
Trading Quality for Speed
Reducing # of rounds O faster speed X lower quality Rounds Time(ms) DIEHARD tests passed KS D-Val 64 6.3 15/15 0.2029 48 4.7 14/15 0.2042 32 3.1 13/15 0.2295 16 1.6 0.253
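To show what a reduced round count means in practice, here is a sketch of single-block MD5 with the main loop cut short. How the authors shorten the hash is not stated in the slides; this example simply assumes the first `rounds` steps are executed and the rest are skipped. With rounds=64 it reproduces standard MD5, which the test below checks against hashlib. The name md5_reduced is hypothetical.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
SINES = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

def rotl(x, c):
    """Rotate a 32-bit value left by c bits."""
    return ((x << c) | (x >> (32 - c))) & MASK

def pad_single_block(message):
    """MD5 padding for messages of at most 55 bytes (exactly one 64-byte block)."""
    if len(message) > 55:
        raise ValueError("this sketch handles single-block messages only")
    padded = message + b"\x80" + b"\x00" * (55 - len(message))
    return padded + struct.pack("<Q", 8 * len(message))

def md5_reduced(message, rounds=64):
    """MD5 of a short message, running only the first `rounds` of the 64 steps."""
    m = struct.unpack("<16I", pad_single_block(message))
    init = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
    a, b, c, d = init
    for i in range(rounds):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + SINES[i] + m[g]) & MASK
        a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK
    words = [(x + y) & MASK for x, y in zip(init, (a, b, c, d))]
    return struct.pack("<4I", *words)

full = md5_reduced(b"abc", rounds=64)
quick = md5_reduced(b"abc", rounds=16)
```

The loop body has constant cost per step, so the running time scales roughly linearly with the round count, which is consistent with the timings in the table above.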
29
Applications
Fractal terrain (vertex shader)
Texture tiling (fragment shader)
30
Future Work
Implement our method in hardware
  very similar to texture unit but much smaller (no need for cache)
Alternative hashes
  ride with advances in cryptographic hash
31
Thank You!