Semantics Consistent Parallelism Li-Yi Wei Microsoft Research.

Presentation transcript:

Semantics Consistent Parallelism Li-Yi Wei Microsoft Research

Parallelism
Algorithm: parallel?
API: CUDA, BSGP (parallel)
Hardware: CPU, GPU (parallel)

Traditional parallelization
Sequential consistency [Lamport]: parallel algorithms do NOT change the result.
Examples: sorting, FFT, matrix operations, etc.

New parallelization (cheating)
Not all algorithms need to be sequentially consistent, e.g. perception (graphics/vision/image/video) and statistics.
Approximate solutions might suffice, which allows more parallelism.
Semantic consistency

Examples

Pseudo Random Number Generator
The main source of randomness in programs, e.g. rand() in C/C++.
Desirable properties: white noise statistics; repeatable, fast computation.
Traditional sequential method, e.g. $x_n = (a\,x_{n-1} + b) \bmod c$
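A minimal sketch of the traditional sequential method, to show why it resists parallelization: each value depends on the previous one. The constants a, b, c below are an assumption (a commonly quoted 32-bit parameter set), not values taken from the slides.

```python
def lcg(seed, n, a=1664525, b=1013904223, c=2**32):
    """Return n values of x_n = (a * x_{n-1} + b) mod c.

    The default constants are an illustrative assumption.
    """
    x = seed
    out = []
    for _ in range(n):
        # x_n cannot be computed before x_{n-1}: the loop is inherently serial.
        x = (a * x + b) % c
        out.append(x)
    return out


values = lcg(seed=0, n=4)
```

Getting the millionth value this way means stepping through all earlier ones, which is the dependency chain the parallel method avoids.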

Parallel PRNG
1. input trivially prepared in parallel, e.g. linear ramp
2. feed input value into hash, independently & in parallel
3. output white noise
Key idea: borrow cryptographic hash! [Tzeng and Wei I3D 2008]
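The three steps above can be sketched as follows. This is an illustration under assumptions, not the implementation from the cited paper: MD5 from Python's hashlib stands in for the cryptographic hash, and the names hash_random, hash_noise and key are hypothetical.

```python
import hashlib
import struct


def hash_random(index, key=0):
    """Map one input value to a float in [0, 1) using a cryptographic hash.

    MD5 is an assumption made for this sketch; `key` plays the role of a seed.
    """
    data = struct.pack("<QQ", key, index)
    digest = hashlib.md5(data).digest()
    (word,) = struct.unpack("<I", digest[:4])  # keep 32 bits of the digest
    return word / 2**32


def hash_noise(n, key=0):
    # Step 1: the input is a linear ramp 0, 1, ..., n-1.
    # Step 2: every element is hashed on its own, with no shared state,
    #         so the calls could be distributed over any number of threads.
    # Step 3: the outputs together form the noise sequence.
    return [hash_random(i, key) for i in range(n)]


noise = hash_noise(8)
```

Because hash_random(i) depends only on i and the key, any element can be computed in isolation, which keeps the output repeatable regardless of evaluation order.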

How to start a bar fight

How to stop the bar fight
Parallel GPU run time (slow motion): 4M Poisson disk samples / sec in parallel!

Poisson disk sampling
A set of samples that are as random as possible, yet remain a minimum distance r away from each other.
Why pick this problem? It is an important algorithm (sampling, graphics, statistics) and seemingly non-parallelizable.

Dart throwing [Cook 1986]
Loop: draw a random sample from the entire domain; accept the sample if it is not in conflict with existing ones.
Pro: high quality, ground truth.
Con: slow, inherently sequential.
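A minimal sketch of the dart throwing loop on the unit square. The stopping rule (give up after max_failures consecutive rejections) and all names are assumptions made for illustration; the slide does not specify them.

```python
import math
import random


def dart_throwing(r, max_failures=1000, seed=0):
    """Sequential dart throwing on [0, 1) x [0, 1)."""
    rng = random.Random(seed)
    samples = []
    failures = 0
    while failures < max_failures:
        # Draw a candidate from the entire domain.
        p = (rng.random(), rng.random())
        # Accept only if it keeps distance r from every existing sample.
        if all(math.dist(p, q) >= r for q in samples):
            samples.append(p)
            failures = 0
        else:
            failures += 1
    return samples


points = dart_throwing(0.1)
```

Every acceptance test reads the full list of earlier samples, so each iteration depends on all previous ones.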

Parallel Poisson disk sampling [Wei SIGGRAPH 2008]
Samples are drawn from a grid, 1 sample per grid cell.
Grid cells far apart are sampled in parallel.
Watch out for bias! Tricks to avoid bias.
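The grid idea can be sketched as below. This is a simplified, serially simulated illustration under assumptions, not the published algorithm: it uses a fixed 3x3 phase grouping over cells of size r/sqrt(2), a fixed group order, and none of the bias-avoidance tricks the slide warns about. All names are hypothetical.

```python
import math
import random


def parallel_poisson_disk(r, width=1.0, height=1.0, trials=5, seed=0):
    rng = random.Random(seed)
    cell = r / math.sqrt(2)  # cell diagonal equals r: at most one sample per cell
    nx, ny = math.ceil(width / cell), math.ceil(height / cell)
    grid = {}  # (ix, iy) -> accepted sample

    def conflicts(p, ix, iy):
        # Any sample closer than r lies at most two cells away.
        for jx in range(ix - 2, ix + 3):
            for jy in range(iy - 2, iy + 3):
                q = grid.get((jx, jy))
                if q is not None and math.dist(p, q) < r:
                    return True
        return False

    for _ in range(trials):
        for phase in range(9):
            px, py = phase % 3, phase // 3
            accepted = {}
            # Cells of one phase group are 3 cells apart, so their samples are
            # more than r apart and can never conflict with each other. Each
            # cell only reads `grid` as it was before this group started, so
            # this double loop could run with one thread per cell.
            for ix in range(px, nx, 3):
                for iy in range(py, ny, 3):
                    if (ix, iy) in grid:
                        continue
                    p = ((ix + rng.random()) * cell, (iy + rng.random()) * cell)
                    if p[0] >= width or p[1] >= height:
                        continue  # border cell sticking out of the domain
                    if not conflicts(p, ix, iy):
                        accepted[(ix, iy)] = p
            grid.update(accepted)  # commit the whole group at once
    return list(grid.values())


samples = parallel_poisson_disk(0.05)
```

Committing each phase group only after all its cells have been processed mimics a parallel pass: no cell sees the results of another cell in the same group.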

Image cloning

Poisson image editing [Perez et al. SIGGRAPH 2003]
State of the art for image cloning, plus other applications.
Solves a Poisson equation: interior detail comes from the source, the boundary condition from the target.
Heavy computation.

[Farbman et al. SIGGRAPH 2009]
Coordinate interpolation: $p = \sum_i w_i b_i$
Easy computation; runs in parallel on a GPU.
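A minimal sketch of coordinate interpolation, where an interior value is a weighted sum of boundary values. The weights are assumed here to be mean-value coordinates of a polygonal boundary; the function names and the square test boundary are hypothetical and chosen only for illustration.

```python
import math


def mean_value_weights(x, boundary):
    """Normalized mean-value weights w_i of point x for a closed polygon."""
    m = len(boundary)
    vecs = [(bx - x[0], by - x[1]) for bx, by in boundary]
    dists = [math.hypot(vx, vy) for vx, vy in vecs]
    half_tans = []
    for i in range(m):
        ax, ay = vecs[i]
        bx, by = vecs[(i + 1) % m]
        # Signed angle at x between boundary vertices i and i+1.
        alpha = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        half_tans.append(math.tan(alpha / 2))
    # half_tans[i - 1] wraps around to the last edge when i == 0.
    raw = [(half_tans[i - 1] + half_tans[i]) / dists[i] for i in range(m)]
    total = sum(raw)
    return [w / total for w in raw]


def interpolate(x, boundary, values):
    """p = sum_i w_i * b_i for a single interior point."""
    weights = mean_value_weights(x, boundary)
    return sum(w * b for w, b in zip(weights, values))


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
boundary_values = [1.0, 3.0, 6.0, 4.0]  # samples of f(x, y) = 2x + 3y + 1
center_value = interpolate((0.5, 0.5), square, boundary_values)
```

Each interior point is evaluated independently from the boundary alone, so one thread per pixel is a natural mapping.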

Summary
Conclusion: sequential consistency is too strict; use semantic consistency (perceptual, statistical, etc.) instead.
Future work: parallelism via semantic consistency for individual algorithms / applications; a general methodology?

Parallelism
Processors are becoming parallel: Intel Larrabee, NVIDIA, AMD/ATI, IBM/Sony Cell, etc.
So are programming interfaces: BSGP, CUDA, CAL, Ct, DX, OpenGL, etc.
As well as applications, to take advantage of the parallel environment.

Parallel Algorithm
Traditional parallelization methods: sequential consistency [Lamport], e.g. sorting, FFT, matrix, etc.
Not all algorithms need to be sequentially consistent: perception (graphics/vision/image/video), statistics.
Approximate solutions might suffice.
Opportunities for new parallelization methods.
