Computer graphics & visualization Jens Schneider Martin Kraus Rüdiger Westermann.

Slides:



Advertisements
Similar presentations
GPU-accelerated fractal imaging Jeremy Ehrhardt CS 81 - Spring 2009.
Advertisements

COMPUTER GRAPHICS CS 482 – FALL 2014 NOVEMBER 10, 2014 GRAPHICS HARDWARE GRAPHICS PROCESSING UNITS PARALLELISM.
Lecture 38: Chapter 7: Multiprocessors Today’s topic –Vector processors –GPUs –An example 1.
Utilization of GPU’s for General Computing Presenter: Charlene DiMeglio Paper: Aspects of GPU for General Purpose High Performance Computing Suda, Reiji,
Fast Circuit Simulation on Graphics Processing Units Kanupriya Gulati † John F. Croix ‡ Sunil P. Khatri † Rahm Shastry ‡ † Texas A&M University, College.
2009/04/07 Yun-Yang Ma.  Overview  What is CUDA ◦ Architecture ◦ Programming Model ◦ Memory Model  H.264 Motion Estimation on CUDA ◦ Method ◦ Experimental.
Optimizing and Auto-Tuning Belief Propagation on the GPU Scott Grauer-Gray and Dr. John Cavazos Computer and Information Sciences, University of Delaware.
Control Flow Virtualization for General-Purpose Computation on Graphics Hardware Ghulam Lashari Ondrej Lhotak University of Waterloo.
Numerical geometry of non-rigid shapes
Back-Projection on GPU: Improving the Performance Wenlay “Esther” Wei Advisor: Jeff Fessler Mentor: Yong Long April 29, 2010.
GPU-Based Frequency Domain Volume Rendering Ivan Viola, Armin Kanitsar, and Meister Eduard Gröller Institute of Computer Graphics and Algorithms Vienna.
1 ITCS 6/8010 CUDA Programming, UNC-Charlotte, B. Wilkinson, Jan 19, 2011 Emergence of GPU systems and clusters for general purpose High Performance Computing.
Final Gathering on GPU Toshiya Hachisuka University of Tokyo Introduction Producing global illumination image without any noise.
Sorting and Searching Timothy J. PurcellStanford / NVIDIA Updated Gary J. Katz based on GPUTeraSort (MSR TR )U. of Pennsylvania.
The FFT on a GPU Graphics Hardware 2003 July 27, 2003 Kenneth MorelandEdward Angel Sandia National LabsU. of New Mexico Sandia is a multiprogram laboratory.
Memory Efficient Acceleration Structures and Techniques for CPU-based Volume Raycasting of Large Data S. Grimm, S. Bruckner, A. Kanitsar and E. Gröller.
All-Pairs-Shortest-Paths for Large Graphs on the GPU Gary J Katz 1,2, Joe Kider 1 1 University of Pennsylvania 2 Lockheed Martin IS&GS.
Accelerating Machine Learning Applications on Graphics Processors Narayanan Sundaram and Bryan Catanzaro Presented by Narayanan Sundaram.
HPEC_GPU_DECODE-1 ADC 8/6/2015 MIT Lincoln Laboratory GPU Accelerated Decoding of High Performance Error Correcting Codes Andrew D. Copeland, Nicholas.
GPU Graphics Processing Unit. Graphics Pipeline Scene Transformations Lighting & Shading ViewingTransformations Rasterization GPUs evolved as hardware.
Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU Presented by: Ahmad Lashgar ECE Department, University of Tehran.
GPGPU overview. Graphics Processing Unit (GPU) GPU is the chip in computer video cards, PS3, Xbox, etc – Designed to realize the 3D graphics pipeline.
Ioannis Karamouzas, Roland Geraerts, Mark Overmars Indicative Routes for Path Planning and Crowd Simulation.
GPU-accelerated Evaluation Platform for High Fidelity Networking Modeling 11 December 2007 Alex Donkers Joost Schutte.
Computationally Efficient Histopathological Image Analysis: Use of GPUs for Classification of Stromal Development Olcay Sertel 1,2, Antonio Ruiz 3, Umit.
Computer Graphics Graphics Hardware
Parallel Applications Parallel Hardware Parallel Software IT industry (Silicon Valley) Users Efficient Parallel CKY Parsing on GPUs Youngmin Yi (University.
Interactive Time-Dependent Tone Mapping Using Programmable Graphics Hardware Nolan GoodnightGreg HumphreysCliff WoolleyRui Wang University of Virginia.
Cg Programming Mapping Computational Concepts to GPUs.
General Purpose Computing on Graphics Processing Units: Optimization Strategy Henry Au Space and Naval Warfare Center Pacific 09/12/12.
Gregory Fotiades.  Global illumination techniques are highly desirable for realistic interaction due to their high level of accuracy and photorealism.
YOU LI SUPERVISOR: DR. CHU XIAOWEN CO-SUPERVISOR: PROF. LIU JIMING THURSDAY, MARCH 11, 2010 Speeding up k-Means by GPUs 1.
Programming Concepts in GPU Computing Dušan Gajić, University of Niš Programming Concepts in GPU Computing Dušan B. Gajić CIITLab, Dept. of Computer Science.
On a Few Ray Tracing like Algorithms and Structures. -Ravi Prakash Kammaje -Swansea University.
Diane Marinkas CDA 6938 April 30, Outline Motivation Algorithm CPU Implementation GPU Implementation Performance Lessons Learned Future Work.
GPU-Accelerated Surface Denoising and Morphing with LBM Scheme Ye Zhao Kent State University, Ohio.
Accelerating Statistical Static Timing Analysis Using Graphics Processing Units Kanupriya Gulati and Sunil P. Khatri Department of ECE, Texas A&M University,
Fast Support Vector Machine Training and Classification on Graphics Processors Bryan Catanzaro Narayanan Sundaram Kurt Keutzer Parallel Computing Laboratory,
CSE 690: GPGPU Lecture 7: Matrix Multiplications Klaus Mueller Computer Science, Stony Brook University.
Dense Image Over-segmentation on a GPU Alex Rodionov 4/24/2009.
Tone Mapping on GPUs Cliff Woolley University of Virginia Slides courtesy Nolan Goodnight.
IIIT Hyderabad Scalable Clustering using Multiple GPUs K Wasif Mohiuddin P J Narayanan Center for Visual Information Technology International Institute.
Accelerated Stereoscopic Rendering using GPU François de Sorbier - Université Paris-Est France February 2008 WSCG'2008.
Introduction What is GPU? It is a processor optimized for 2D/3D graphics, video, visual computing, and display. It is highly parallel, highly multithreaded.
Based on paper by: Rahul Khardekar, Sara McMains Mechanical Engineering University of California, Berkeley ASME 2006 International Design Engineering Technical.
A Neural Network Implementation on the GPU By Sean M. O’Connell CSC 7333 Spring 2008.
1)Leverage raw computational power of GPU  Magnitude performance gains possible.
Havok FX Physics on NVIDIA GPUs. Copyright © NVIDIA Corporation 2004 What is Effects Physics? Physics-based effects on a massive scale 10,000s of objects.
Computer Graphics 3 Lecture 6: Other Hardware-Based Extensions Benjamin Mora 1 University of Wales Swansea Dr. Benjamin Mora.
Debunking the 100X GPU vs. CPU Myth An Evaluation of Throughput Computing on CPU and GPU Present by Chunyi Victor W Lee, Changkyu Kim, Jatin Chhugani,
From Turing Machine to Global Illumination Chun-Fa Chang National Taiwan Normal University.
3/12/2013Computer Engg, IIT(BHU)1 CUDA-3. GPGPU ● General Purpose computation using GPU in applications other than 3D graphics – GPU accelerates critical.
An Introduction to the Cg Shading Language Marco Leon Brandeis University Computer Science Department.
COMP 175 | COMPUTER GRAPHICS Remco Chang1/XX13 – GLSL Lecture 13: OpenGL Shading Language (GLSL) COMP 175: Computer Graphics April 12, 2016.
Image Fusion In Real-time, on a PC. Goals Interactive display of volume data in 3D –Allow more than one data set –Allow fusion of different modalities.
NVIDIA® TESLA™ GPU Based Super Computer By : Adam Powell Student # For COSC 3P93.
GPU Acceleration of Particle-In-Cell Methods B. M. Cowan, J. R. Cary, S. W. Sides Tech-X Corporation.
Computer Graphics Graphics Hardware
GPU Architecture and Its Application
COMPUTER GRAPHICS CHAPTER 38 CS 482 – Fall 2017 GRAPHICS HARDWARE
GPU-based iterative CT reconstruction
Graphics Processing Unit
From Turing Machine to Global Illumination
Efficient Image Classification on Vertically Decomposed Data
Localizing the Delaunay Triangulation and its Parallel Implementation
GPGPU: Distance Fields
Sorting and Searching Tim Purcell NVIDIA.
Computer Graphics Graphics Hardware
Kenneth Moreland Edward Angel Sandia National Labs U. of New Mexico
CIS 6930: Chip Multiprocessor: GPU Architecture and Programming
Presentation transcript:

computer graphics & visualization Jens Schneider Martin Kraus Rüdiger Westermann

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Given a set of N points,

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Given a set of N points, and a set of k features,,

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Given a set of N points, and a set of k features,, compute

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Given a set of N points, and a set of k features,, compute i.e., measures the closest Euclidean Distance between and any feature point.

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Variations: - i.e., stores the feature‘s position closest to.

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Variations: - i.e., stores the feature‘s position closest to. - i.e., stores feature IDs (aka. Voronoi Diagram)

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Variations: - i.e., stores the feature‘s position closest to. - i.e., stores feature IDs (aka. Voronoi Diagram) Generalization: -Features may consist of curves

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group „Generalized“ Voronoi Diagram

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Continuous Runtime ( e.g. Fortune 1986 ): BUT: No efficient algorithm for generalizations!

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Euclidean Distance Transforms Continuous Runtime ( e.g. Fortune 1986 ): BUT: No efficient algorithm for generalizations! Discrete Runtime ( e.g. Maurer 2003, Danielsson 1980 ) number of image pixels Generalizations trivial – curves are rasterized!

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Why GPUs? Far Cry 2, © Crytek 2008 Doom 3, © id Software 2004 Specs (NVIDIA GeForce 285) parallel processors GB/s bandwidth + TFlops + cheap (<600 €) -Data Parallelism req‘d -1GB RAM (only) -Programming: „Think CG“…

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Why GPUs ? Existence of entirely GPU-based pipes -Lots of algorithms exist! -Adobe Photoshop CS 4 completely GPU-based -„Classic“ Image processing easy to port Good speed-ups possible -GPGPU -Distance Transforms

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Limitations of current GPUs Programming: -Graphics APIs -OpenGL 2.1, DirectX 10 -GPGPU APIs -CUDA, CTM, OpenCL, DirectX 11 Compute Shaders Caveats -Data (SIMD) Parallelism -Concurrent Read/Write access expensive

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Vector Propagation ( Danielsson 1980 ) Closest Feature is stored by nD Vector, i.e.

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Vector Propagation ( Danielsson 1980 ) Closest Feature is stored by nD Vector, i.e. Initialize each feature with pointer to itself, each other pixel with pointer to infinity

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Vector Propagation ( Danielsson 1980 ) Closest Feature is stored by nD Vector, i.e. Initialize each feature with pointer to itself, each other pixel with pointer to infinity Perform „Sweeps“ to update closest features

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Vector Propagation ( Danielsson 1980 ) Closest Feature is stored by nD Vector, i.e. Initialize each feature with pointer to itself, each other pixel with pointer to infinity Perform „Sweeps“ to update closest features For each sweep and each pixel – Check candidates (neighboring pixels) for closest feature so far

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Vector Propagation ( Danielsson 1980 ) Closest Feature is stored by nD Vector, i.e. Initialize each feature with pointer to itself, each other pixel with pointer to infinity Perform „Sweeps“ to update closest features For each sweep and each pixel – Check candidates (neighboring pixels) for closest feature so far – If candidate closer than best guess so far, update

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Vector Propagation ( Danielsson 1980 ) 8-SED (2D images) – 8 neighbors Signed Distance Transform – 4 sweeps, gives rise to „vector templates“

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Vector Propagation ( Danielsson 1980 ) Fast (even on CPUs) Easy to implement Very low errors – Analysis shows that VP offers best speed vs. Error rate trade-off of all approximative schemes ( Jones et al ) BUT: Data parallelism problematic – Reason: L-shaped template result in complicated dependency graphs

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Modified Vector Propagation Modified Templates Advantages: – Sweeps perfectly separated along X- and Y-axes – SIMD Parallel – Uses only mutual exclusive Read/Write operations

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Store all data in textures / buffers (int16-tuples) Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Swap Read and Write Buffers and advance Write Buffer Read Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Repeat procedure for each column Alternating between R/W Swaps yields:

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Merging Sweep Pairs Correct values after „+X“ Sweep, prior to „-X“ Read Buffer Write Buffer Copy Column

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Merging Sweep Pairs Correct values after „+X“ Sweep, prior to „-X“ Read Buffer Write Buffer Correct values of „-X“ Sweep

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Merging Sweep Pairs Perform „-X“ Sweep Buffer A Buffer B

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Merge interleaved results Write Buffer Read Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Implementation Copy Buffer and perform sweep in Y-direction Read Buffer Write Buffer

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Performance 2D CPU: single thread on Core2Duo 2.4GHz GPU: NVIDIA GeForce 8800GTX ResolutionCPU timeGPU timePixel rateSpeed-up ms2.50 ms6.23 M/s0.42 x ms4.21 ms14.84 M/s1.09 x ms7.42 ms33.68 M/s2.70 x ms15.14 ms66.02 M/s6.06 x ms41.68 ms95.96M/s16.72 x ms186.5 ms85.79 M/s14.74 x CPU: single thread on Core2Duo 2.4 GHz GPU: NVIDIA GeForce 8800 GTX

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Errors Systematic error – Due to rasterization, half pixel diagonal Additionally: classification error – < Pixels – Relative error: about 0.3% – Negligible in practice Error likelihood (classification): – 0.56 per Million pixels (randomly seeded) – Decreases with increasing number of seeds

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Errors

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Generalization to 3D Canonic Extension -straight-forward (in principle) -Rendering to 3D Textures limited to Z-slices -Memory could be a problem (Quadro 5800: 4GB) Different Error bounds! -Not discussed in the paper -Still sub-voxel

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Generalization to 3D voxel grid of a right femural bone Spacing 0.373mm x 0.373mm x 1.0mm Iso-surfaces 15mm (green) and 30mm (red) away 1860ms (65.2MVoxels/s), Quadro5600, 1.5GB

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Performance 3D CPU: single thread on Core2Duo 2.4GHz GPU: NVIDIA GeForce 8800GTX ResolutionCPU timeGPU timePixel rateSpeed-up ms3.71 ms8.42 M/s2.61 x ms8.21 ms30.45 M/s10.25 x ms30.85 ms64.83 M/s33.08x ms213.0 ms75.12 M/s43.17 x CPU: single thread on Core2Duo 2.4 GHz GPU: NVIDIA GeForce 8800 GTX

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Conclusions Good speed-up for higher 2D resolutions – Sweet spot for 2048x2048 – Lower resolutions can be „packed“ to larger image Amazing speed-up for 3D – Caching issues become overwhelming on CPU – Limited VRAM could be a problem (up to 4GB)

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Future Work More error sometimes tolerable -LOD computation possible -Coarse computation using this method -One additional pass of JFA ( Rong et al ) -Less memory footprint, faster Subpixel accuracy for iso-contours Bricked computation

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group

computer graphics & visualization INSTICC VISAPP 09 – Lisboa, Portugal Jens Schneider – Computer Graphics and Visualization Group Thank you for your attention! Questions ?