Parallel Poisson Disk Sampling Li-Yi Wei Microsoft
Parallelism Processors are becoming parallel Intel Larrabee, NVIDIA, AMD/ATI, IBM/Sony Cell, etc. So are programming interfaces BSGP, CUDA, CAL, Ct, DX, OpenGL, etc. As well as applications To take advantage of parallel environment
Parallelization Traditional parallelization methods Sequential consistency [Lamport] - sorting, FFT, matrix, etc. Not all algorithms need to be seq-consistent Graphics, computer vision, image/video, statistics Approximate solutions might suffice Opportunities for new parallelization methods
First pick: Poisson disk sampling A set of samples that are as random as possible yet remain a minimum distance r away from each other Why pick this problem? important algorithm seemingly non-parallelizable
Importance of Poisson disk sampling Best quality for N samples [Cook 1986] Natural object distribution (retina cells, ecology) Blue noise spectrum void in low freq noise in high freq Applications in Rendering, imaging, geometry processing, etc.
Optimal spectrum (given # samples) Blue noise: aliasing → noise [Figure: samples and spectrum for regular grid, jittered grid, and Poisson disk, all with 1600 samples]
Spatial sampling (zone plate) [Figure: regular grid (aliasing), jittered grid (noisy), Poisson disk]
Methods
Dart throwing [Cook 1986] Loop: Random sample from the entire domain Accept sample if not in conflict with existing ones O High quality (ground truth) X Slow speed (inherently sequential)
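The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the unit-hypercube domain, the function name `dart_throwing`, and the `max_failures` stopping rule are all assumptions, since the slide does not say when the loop ends.

```python
import math
import random


def dart_throwing(r, n_dims=2, max_failures=2000, seed=0):
    """Brute-force dart throwing in the unit hypercube [0, 1)^n_dims."""
    rng = random.Random(seed)
    samples = []
    failures = 0
    # Hypothetical stopping rule: give up after many consecutive rejections.
    while failures < max_failures:
        # Random sample from the entire domain.
        candidate = tuple(rng.random() for _ in range(n_dims))
        # Accept only if it is at least r away from every existing sample.
        if all(math.dist(candidate, s) >= r for s in samples):
            samples.append(candidate)
            failures = 0
        else:
            failures += 1
    return samples
```

Every candidate is tested against all samples accepted so far, which is why each iteration depends on the previous ones and the loop is hard to parallelize directly.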
Speed improvement Computation on the fly (sequential) Scalloped regions [Dunbar & Humphreys 2006] Onion layers [Bridson 2007] Hierarchical dart throwing [White et al. 2007] Pre-computed data set (parallel access) Penrose tiling [Ostromoukhov et al 2004] Wang tiles [Cohen et al. 2003; Lagae & Dutre 2005; Kopf et al. 2006] Polyominoes [Ostromoukhov 2007] X Potential large data set + quality issue
Features of our approach Parallel computation Entirely on the fly (no pre-computed data) Good spectrum quality Like dart throwing + Adaptive sampling + Any dimension Parallel GPU run time (in slow motion) Multi-resolution synthesis
Our basic idea Samples from a grid 1 sample per grid cell Sample grid cells far apart in parallel Watch out for bias! Tricks to avoid bias
Algorithm in gradual steps Uniform sampling, sequential Uniform sampling, parallel Adaptive sampling
Sequential sampling Basic data structure Choose grid cell size d so that each cell has at most one sample r = minimum spacing n = dimension Inspired by [Bridson 2007] Texture synthesis
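The cell-size formula did not survive on this slide. A common choice, and the one assumed in this sketch, is d = r / sqrt(n): the diagonal of an n-dimensional cell then equals r, so two samples in the same cell would always be closer than r. The helper names `cell_size` and `cell_index` are illustrative only.

```python
import math


def cell_size(r, n_dims):
    """Largest cell edge whose diagonal (d * sqrt(n_dims)) does not exceed r."""
    return r / math.sqrt(n_dims)


def cell_index(point, d):
    """Integer grid coordinates of the cell that contains `point`."""
    return tuple(int(math.floor(x / d)) for x in point)
```

With this layout a conflict check only needs to look at a bounded neighborhood of cells around the candidate, instead of at every existing sample.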
Sequential sampling scan-line order + single resolution Bias! Scanline order Grid sampling
Sequential sampling random order + single resolution Removes scanline bias But still grid-cell biased [Figure: scanline vs. random order]
Sequential sampling random order + multi-resolution Removes both biases (scanline, grid) [Figure: 1, 3, and 5 levels; scanline vs. random order]
Sequential sampling Summary for bias removal Two sources of bias: grid sampling, fixed by multi-resolution; traversal order, fixed by random order [Figure: scanline vs. random order; 1, 3, and 5 levels]
Sequential sampling Summary
for each level, low to high
    visit cells in a random order
        if cell contains no sample
            draw one sample randomly from the cell domain
            add the sample if it does not conflict with existing ones
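The sequential summary can be sketched in Python as follows. This is a sketch under stated assumptions: a 2D unit-square domain, levels that double the grid resolution until the cell size is at most r / sqrt(2), a single trial per cell per level, and a brute-force conflict check. The name `multires_poisson` is hypothetical.

```python
import math
import random


def multires_poisson(r, seed=0):
    """Sequential multi-resolution sampling of the unit square [0, 1)^2."""
    rng = random.Random(seed)
    samples = []
    res = 1  # cells per side; level 0 is a single cell
    while True:
        d = 1.0 / res
        # Cells that already hold a sample from a coarser level.
        occupied = {(min(int(x / d), res - 1), min(int(y / d), res - 1))
                    for x, y in samples}
        cells = [(i, j) for i in range(res) for j in range(res)]
        rng.shuffle(cells)  # random visiting order avoids scanline bias
        for i, j in cells:
            if (i, j) in occupied:
                continue
            # Draw one sample uniformly from the cell domain.
            p = ((i + rng.random()) * d, (j + rng.random()) * d)
            # Brute-force conflict check, kept simple for clarity.
            if all(math.dist(p, s) >= r for s in samples):
                samples.append(p)
                occupied.add((i, j))
        if d <= r / math.sqrt(2):  # finest level reached
            break
        res *= 2
    return samples
```

A practical version would replace the brute-force check with a lookup of neighboring grid cells and might try more than once per cell.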
Parallel sampling Key insight
for each level, low to high
    visit cells in a random order
        if cell contains no sample
            draw one sample randomly from the cell domain
            add the sample if it does not conflict with existing ones
Key insight: visit cells in parallel!
Parallel sampling Key insight Sample cells sufficiently far away in parallel 2D example: Cells apart cannot conflict with each other split cells → phase groups
Phase group partition
grid partition, scanline order: O easy to compute, X bias! (scanline)
random partition: O good quality, X hard to compute (sequential)
grid partition, random order: O easy to compute, O good quality
[Figure: grid cells labeled with phase group numbers]
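A grid partition with random phase order might look like the sketch below. It assumes cells of size d = r / sqrt(n) and derives the stride k from the requirement that the gap between two cells of the same group, (k - 1) * d, is at least r. In 2D this gives k = 3 and therefore 9 phase groups. The function names and the non-periodic domain are assumptions for illustration.

```python
import itertools
import math
import random


def phase_period(n_dims):
    """Smallest stride k with (k - 1) * d >= r, assuming d = r / sqrt(n_dims)."""
    return math.ceil(math.sqrt(n_dims)) + 1


def phase_groups(res, n_dims=2, seed=0):
    """Split a res^n_dims grid into phase groups, returned in random order."""
    k = phase_period(n_dims)
    groups = {}
    for cell in itertools.product(range(res), repeat=n_dims):
        # Cells with equal index remainders modulo k share a phase group.
        key = tuple(c % k for c in cell)
        groups.setdefault(key, []).append(cell)
    order = list(groups.values())
    random.Random(seed).shuffle(order)  # random order avoids scanline bias
    return order
```

Computing a cell's group only needs its own indices, so the partition is cheap to evaluate on the fly, and shuffling the group order keeps the traversal from following a scanline pattern.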
Parallel sampling Summary
for each level, low to high
    for each phase group p
        parallel: for each cell in p
            if cell contains no sample
                draw one sample randomly from the cell domain
                add the sample if it does not conflict with existing ones
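The parallel summary can be emulated on a CPU to check its key property. In this sketch every cell of a phase group tests its candidate only against a snapshot taken before the phase started, as concurrent threads would, and the accepted samples are committed together. The 2D unit-square domain, the per-level stride `k = ceil(r / d) + 1`, and the name `parallel_poisson` are assumptions, not details taken from the slides.

```python
import math
import random


def parallel_poisson(r, seed=0):
    """Multi-resolution sampling of [0, 1)^2 with emulated phase groups."""
    rng = random.Random(seed)
    samples = []
    res = 1
    while True:
        d = 1.0 / res
        # Same-group cells are separated by a gap of (k - 1) * d >= r.
        k = math.ceil(r / d) + 1
        occupied = {(min(int(x / d), res - 1), min(int(y / d), res - 1))
                    for x, y in samples}
        phases = [(a, b) for a in range(k) for b in range(k)]
        rng.shuffle(phases)  # random phase order
        for a, b in phases:
            snapshot = list(samples)  # what every "thread" of this phase sees
            accepted = []
            for i in range(a, res, k):       # these iterations are independent
                for j in range(b, res, k):   # and could run concurrently
                    if (i, j) in occupied:
                        continue
                    p = ((i + rng.random()) * d, (j + rng.random()) * d)
                    if all(math.dist(p, s) >= r for s in snapshot):
                        accepted.append(p)
                        occupied.add((i, j))
            samples.extend(accepted)  # commit the whole phase at once
        if d <= r / math.sqrt(2):
            break
        res *= 2
    return samples
```

Samples accepted in the same phase are never compared with each other, yet the minimum distance still holds because cells of one group are too far apart to conflict.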
Adaptive sampling Slightly more involved than uniform sampling Parallelizable as well
Adaptive sampling Multi-resolution as in uniform sampling But uses adaptive tree instead of uniform grid Subdivide only if possible to add more samples
Results
Spectrum comparison - 2D [Figure: dart throwing vs. our method; power spectrum (10 runs), radial mean, radial variance]
Sampling in higher dimensions Algorithm applicable to 2+ dimensions [Figure: 3D samples; power spectrum, radial mean, radial variance]
Performance (# samples per second) O: on the fly P: pre-computed dataset
O Our method (NVIDIA 8800 GTX): 4.06 M, 555 K, 42.9 K, 2.43 K, 179
O Boundary sampling [Dunbar & Humphreys 2006]: 0.20 M, X
O Hierarchical dart throwing [White et al. 2007]: 0.21 M
P Wang tiling [Kopf et al. 2006]: 1 ~ 3 M
P Polyominoes [Ostromoukhov 2007]: >1 M
[Figure: sample sets from Wang tiling [Kopf et al. 2006], corner tiling [Lagae & Dutre 2006], P-pentominoes [Ostromoukhov 2007], and our method]
Limitations Only empirical, but no theoretical proof yet Slow in high dimensions, adaptive sampling Hard to control exact # of samples No fine-grain sample ranking e.g. progressive zoom-in [Kopf et al. 2006] Euclidean space only (no manifold surface)
Future work for parallel algorithm Sequential consistency [Lamport] too strict for some applications A looser sense of consistency? parallel texture synthesis [Lefebvre & Hoppe 2005] random number generation [Tzeng & Wei 2008]
Acknowledgements Ares Lagae Johannes Kopf Victor Ostromoukhov Eric Andres Zhouchen Lin Ting Zhang Kun Zhou Xin Tong Jian Sun Stanley Tzeng Eric Stollnitz Brandon Lloyd Dwight Daniels Jianwei Han Baining Guo Harry Shum Reviewers
Jittered grid One sample per grid cell Uniform sample O Very fast X Quality not so good
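For comparison, a jittered grid takes only a few lines. This sketch assumes a 2D unit square and a hypothetical `jittered_grid` helper; `res = 40` yields 1600 samples.

```python
import random


def jittered_grid(res, seed=0):
    """One uniformly random sample inside each cell of a res x res grid."""
    rng = random.Random(seed)
    d = 1.0 / res
    return [((i + rng.random()) * d, (j + rng.random()) * d)
            for i in range(res) for j in range(res)]
```

Each cell is independent of the others, which is why this is very fast, but nothing prevents two samples in adjacent cells from landing arbitrarily close together.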
Jittered grid vs. Poisson disk (Recap) samples spectrum regular grid jittered grid Poisson disk
Combining strengths Quality of dart throwing Speed of jittered grid
Relaxation vs Dart throwing G-hexominoes [Ostromoukhov 2007] ☺ spatial uniform Our method ☺ spectrum radial mean radial variance sample layout
Future work Parallelizing important algorithms Sequential consistency [Lamport] Dwarfs [Landscape of Parallel Computing 2006] - linear algebra, spectral, N-body, grids, Monte-Carlo New dwarfs that do not need to be seq-con?
How to start a bar fight I saw people fighting in a bar over the dart board. I asked them why. They told me that when they play together, their darts land too close to each other.
How to stop the bar fight Parallel GPU run time (slow motion) 4M Poisson disk samples / sec in parallel! I told them that if they play by my rule, I can ensure their darts land at least a certain distance away from each other. The rule also runs on a GPU, producing more than 4 million darts per second in parallel.
Poisson Disk Sampling Popular pattern Prior methods sequential, e.g. dart throwing [Cook 1986] A Poisson disk set contains samples that are randomly distributed, but remain a minimum distance away from each other. It is a popular sampling pattern, used in many graphics applications. However, it is also hard to generate. Existing methods are sequential, such as dart throwing, where samples are drawn randomly, but rejected if too close to existing samples, and accepted otherwise.
Dart Throwing in Parallel Fast – 4 million samples/sec on GPU Entirely on the fly, no pre-compute Good quality – like (sequential) dart throwing Texture synthesis … To address this, I have figured out a method to throw darts in parallel. It is faster than existing methods, and produces more than 4 million samples per second. The quality is also good, just like traditional dart throwing. The algorithm is actually derived from texture synthesis (old dog with new tricks).