Interactive Modeling and Simulation using Graphics Processors. Dinesh Manocha, University of North Carolina at Chapel Hill. 02/22/2006.



UNC Collaborators
- Co-PI: Ming C. Lin
- Research Staff: Naga Govindaraju, Dave Tuft
- Graduate Students: Russ Gayle, Brandon Lloyd, Brian Salomon, Avneesh Sud, Sungeui Yoon, Talha Zaman

Collaborative Effort
- RDECOM: Maria Bauer, Angel Rodriguez
- SAIC: Eric Root, Marlo Verdesca, Jaeson Munro

Current Desktop System (diagram)
- CPU (3 GHz)
- 2 x 1 MB cache
- System memory (2 GB)
- AGP memory (512 MB)
- 6.4 GB/s bandwidth
- PCI-E bus (4 GB/s)
- 35.2 GB/s bandwidth
- Video memory (512 MB)
- GPU (500 MHz)

GeForce 7800: 302M transistors

CPU vs. GPU

CPU vs. GPU (Henry Moreton, NVIDIA, Aug. 2005)
Table comparing PEE and GTX, with a GPU/CPU ratio column; the numeric values are missing from the transcript.
Rows: Graphics GFLOPs, Shader GFLOPs, Die Area (mm2), Die Area normalized, Transistors (M), Power (W), GFLOPS/mm, GFLOPS/tr, GFLOPS/W

GPUs: Growing Faster than Moore's Law
This graph highlights the growth rate of GPUs relative to CPUs. GPUs have been growing faster than Moore's law, and this trend is expected to continue for at least 5 more years.
Goal: exploit GPUs for CGF computations

Quad SLI: 1.3 billion transistors (Jan. 2006)

GPGP: General Purpose computation using GPUs
- Scientific applications
- Geometric computations
- Scientific visualization
- Physical simulation
- Robotics & navigation
- Database computation
- Financial applications
- Cryptography
- Modeling and simulation

Graphics Pipeline (diagram)
Stages and labels: vertex, polygon, setup, rasterizer, pixel, texture, image, memory
- Programmable vertex processing (fp32)
- Polygon setup, culling, rasterization
- Programmable per-pixel math (fp32)
- Per-pixel texture, fp16 blending
- Z-buf, fp16 blending, anti-alias (MRT)

NON-Graphics Pipeline (diagram)
Stages and labels: data, lists, setup, rasterizer, data, memory
- Programmable MIMD processing (fp32)
- SIMD “rasterization”
- Programmable SIMD processing (fp32)
- Data fetch, fp16 blending
- Predicated write, fp16 blend, multiple output
Courtesy: David Kirk, Chief Scientist, NVIDIA

Issues in using GPUs
- Programmability
- Precision
- Handling large data

GPU-based Computations
- Accelerating OneSAF using GPUs
- Interactive collision detection
- Simulations
- Database and data streaming
- Sorting and scientific computations


Real-time Computational Challenges for Computer Generated Forces (CGF)
- Atmospheric transport models
- Vehicle dynamics
- Wide area sensors
- Petabyte urban terrain databases

Real-time Terrain Reasoning for Computer Generated Forces
- Best algorithms are O(N^2), where N = number of objects/entities in the CGF database (e.g., sensors, platforms, buildings, people)
- Currently over 40% of CGF CPU time for battalion-level scenarios is spent in collision detection, line-of-sight computation, and terrain placement
- The current system can barely handle 300 entities on a 300K-polygon terrain model at 10 m x 10 m resolution
- An improvement factor (value missing from the transcript) is needed to handle a sub-meter resolution terrain model
- CPUs are progressing at Moore's law (1.7x per year), so they would need more than 7-8 years to catch up
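
The scaling argument on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, the 1.7x yearly growth rate and the 7-8 year horizon come from the slide, and the 5000-entity count is the scenario size quoted elsewhere in the talk.

```python
import math


def pair_count(n: int) -> int:
    """Number of unordered entity pairs an all-pairs O(N^2) test visits."""
    return n * (n - 1) // 2


def speedup_after(years: float, growth_per_year: float = 1.7) -> float:
    """Cumulative speedup after compounding yearly growth."""
    return growth_per_year ** years


def years_to_reach(speedup: float, growth_per_year: float = 1.7) -> float:
    """Years of compounding growth needed to reach a target speedup."""
    return math.log(speedup) / math.log(growth_per_year)


if __name__ == "__main__":
    for n in (300, 5000):
        print(n, "entities:", pair_count(n), "pairs")
    for years in (7, 8):
        print(years, "years at 1.7x/year:", round(speedup_after(years), 1), "x")
```

Under these assumptions, 7 to 8 years of 1.7x yearly growth corresponds to a cumulative speedup of roughly 41x to 70x, while going from 300 to 5000 entities multiplies the number of entity pairs by a factor of about 279.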

Project Accomplishments
- GPU-based LOS algorithm
  - x improvement in LOS query (the factor is missing from the transcript)
  - Integration into OneSAF: 15-20x simulation speed improvement (5000 entities)
- Region-based visibility algorithms to accelerate LOS (supported by ATO)
  - 4-10x further improvement in LOS query
  - Integration into OneSAF: 10x simulation speed improvement in urban environments (3000 entities)
- GPU-based route planning
  - 10-30x improvement in route computation
  - 10x simulation speed improvement (3000 entities)
- GPU-based collision detection
  - 10x estimated improvement in collision query
  - 10x simulation speed improvement (150 entities)

LOS Integration Process
- OneSAF/GPU Requirements (SAIC/UNC)
- OneSAF Technical Report (SAIC)
- GPU Algorithm Creation (UNC)
- Execute Unit Test (SAIC/UNC)
- OneSAF Scenario Creation (SAIC)
- OneSAF Benchmark Results (SAIC)
Integration into OOS (SAIC)
- Add several OpenGL DLLs to the ERC libraries
- Place C++ header files for OpenGL among the ERC code
- Create a new directory among the ERC code
  - Set up a new makefile/buildfile so that the GPU code builds as its own library
- Add calls to ERC initialization to:
  - Gather all the triangles in the entire database
  - Gather all features in the database
  - Pass all triangles and features into the initialization for the GPU
- Replace all original LOS calls with the GPU counterpart

OneSAF with GPU-based LOS Algorithm: Demonstration
- LOS computation on 5K entities
- Route planning on 5K entities

OneSAF with GPU-based LOS Algorithm: Demonstration
- Average time for a standard LOS service call (without the GPU-based algorithm): 1-2 milliseconds
- Average time for a GPU LOS service call: 8-12 microseconds
- Almost 200x speedup for a single LOS query
- 15-20x improvement in OneSAF simulation speed on JRTC terrain with 5000 entities
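
As a sanity check on the per-query figure, the quoted timing ranges can be turned into a speedup range. This is a small arithmetic sketch, not part of the original talk; `speedup_range` is a hypothetical helper and the inputs are the 1-2 ms and 8-12 microsecond ranges from the slide.

```python
def speedup_range(cpu_ms, gpu_us):
    """Return (lowest, highest) speedup implied by two timing ranges.

    cpu_ms: (low, high) time per call without the GPU, in milliseconds.
    gpu_us: (low, high) time per call with the GPU, in microseconds.
    """
    cpu_low_us = cpu_ms[0] * 1000.0
    cpu_high_us = cpu_ms[1] * 1000.0
    # Lowest speedup: fastest CPU call against slowest GPU call.
    # Highest speedup: slowest CPU call against fastest GPU call.
    return cpu_low_us / gpu_us[1], cpu_high_us / gpu_us[0]


if __name__ == "__main__":
    low, high = speedup_range((1, 2), (8, 12))
    print(f"speedup between {low:.0f}x and {high:.0f}x")
```

The quoted ranges bracket the slide's "almost 200x" figure: the implied speedup lies between about 83x and 250x.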

Project Accomplishments
- Successful demonstrations at DARPATech 2005, I/ITSEC'04, and I/ITSEC'05

GPU-based Computations
- Accelerating OneSAF using GPUs
- Interactive collision detection
- Simulations
- Database and data streaming
- Sorting and scientific computations

Proximity Queries
Geometric reasoning of spatial relationships among objects (in a dynamic environment)
- Closest points & separation distance
- Penetration depth
- Collision detection
- Contact points & normals
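
To make the query types concrete, here is a minimal sketch for the simplest possible shapes, two spheres. It is an illustration only and is not taken from any of the UNC libraries; `Sphere`, `ProximityResult`, and `proximity` are names invented for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sphere:
    center: tuple  # (x, y, z)
    radius: float


@dataclass(frozen=True)
class ProximityResult:
    colliding: bool
    separation: float   # gap between the surfaces, 0 if overlapping
    penetration: float  # overlap depth along the normal, 0 if separated
    normal: tuple       # unit direction from sphere a towards sphere b
    point_on_a: tuple   # surface point of a nearest to b along the normal
    point_on_b: tuple   # surface point of b nearest to a along the normal


def proximity(a: Sphere, b: Sphere) -> ProximityResult:
    d = math.dist(a.center, b.center)
    gap = d - (a.radius + b.radius)
    if d > 0.0:
        normal = tuple((q - p) / d for p, q in zip(a.center, b.center))
    else:
        normal = (1.0, 0.0, 0.0)  # coincident centers: any direction works
    point_on_a = tuple(p + a.radius * n for p, n in zip(a.center, normal))
    point_on_b = tuple(q - b.radius * n for q, n in zip(b.center, normal))
    return ProximityResult(
        colliding=gap <= 0.0,
        separation=max(gap, 0.0),
        penetration=max(-gap, 0.0),
        normal=normal,
        point_on_a=point_on_a,
        point_on_b=point_on_b,
    )


if __name__ == "__main__":
    print(proximity(Sphere((0, 0, 0), 1.0), Sphere((3, 0, 0), 1.0)))
    print(proximity(Sphere((0, 0, 0), 1.0), Sphere((1.5, 0, 0), 1.0)))
```

For general non-convex models each of these quantities is much harder to compute, which is the motivation for the systems listed on the next slide.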

Collision Detection Systems
I-COLLIDE (1995), RAPID (1996), V-COLLIDE (1997), H-COLLIDE (1998), PQP (1999), SWIFT (2000), PIVOT (2001), SWIFT++ (2001), DEEP (2002), CULLIDE (2003)

Distance Fields
- Voronoi diagram computation using GPUs
- Render polygonal mesh approximations of primitive distance fields
- Color buffer and depth buffer: result after compositing distance fields using the minimum depth test
[Hoff et al., SIGGRAPH 1999]
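
The compositing idea on this slide can be emulated on the CPU. The sketch below is an assumption-laden stand-in, not the GPU implementation: it handles point sites only, evaluates each distance function exactly at every pixel instead of rendering a mesh approximation, and uses plain lists for the color and depth buffers. The function name `discrete_voronoi` is invented for this example.

```python
import math


def discrete_voronoi(sites, width, height):
    """Composite per-site distance fields with a minimum depth test.

    sites: list of (x, y) point sites in pixel coordinates.
    Returns (color, depth): color[y][x] is the index of the nearest site,
    depth[y][x] is the distance to it.
    """
    depth = [[math.inf] * width for _ in range(height)]
    color = [[-1] * width for _ in range(height)]
    for site_id, (sx, sy) in enumerate(sites):  # one "render pass" per site
        for y in range(height):
            for x in range(width):
                d = math.hypot(x - sx, y - sy)  # distance field sample
                if d < depth[y][x]:             # depth test: keep the nearer value
                    depth[y][x] = d
                    color[y][x] = site_id
    return color, depth


if __name__ == "__main__":
    color, depth = discrete_voronoi([(0, 0), (4, 0)], width=5, height=1)
    print(color[0])
    print(depth[0])
```

After all sites are processed, the color buffer is a discrete Voronoi diagram and the depth buffer is the distance field, which mirrors the two buffers named on the slide.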

Our Hybrid Approach
- Coarse object-space geometric localization (CPU)
- Image-space proximity queries (GPU)
- Balance load by varying localization coarseness and error bound
[Hoff, Zaferakis, Lin & Manocha; I3D01]

Gears
- Non-convex, rigid objects
- Frequent interlocking contacts
- Unconstrained, penalty-based

Algorithm
- Object-level pruning, then sub-object-level pruning: GPU-based PCS computation
- Exact tests: using CPU
[Govindaraju et al., GH 2003]
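
The three-stage structure (prune objects, prune sub-objects, then run exact tests on the survivors) can be sketched on the CPU. Note the substitution: the slide attributes the pruning stages to a GPU-based computation of the potentially colliding set (PCS), whereas this sketch swaps in axis-aligned bounding-box overlap tests as a stand-in. Objects are modeled as lists of spheres purely to keep the exact test short; all names are hypothetical.

```python
import math
from itertools import combinations

# An object is a list of spheres; a sphere is a tuple (cx, cy, cz, r).


def sphere_aabb(s):
    cx, cy, cz, r = s
    return (cx - r, cy - r, cz - r), (cx + r, cy + r, cz + r)


def object_aabb(obj):
    boxes = [sphere_aabb(s) for s in obj]
    lo = tuple(min(b[0][i] for b in boxes) for i in range(3))
    hi = tuple(max(b[1][i] for b in boxes) for i in range(3))
    return lo, hi


def aabb_overlap(a, b):
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))


def spheres_intersect(s, t):
    return math.dist(s[:3], t[:3]) <= s[3] + t[3]


def find_collisions(objects):
    """Return (obj_i, prim_a, obj_j, prim_b) for every colliding primitive pair."""
    # Stage 1: object-level pruning (stand-in: whole-object bounding boxes).
    boxes = [object_aabb(o) for o in objects]
    object_pcs = [
        (i, j)
        for i, j in combinations(range(len(objects)), 2)
        if aabb_overlap(boxes[i], boxes[j])
    ]
    # Stage 2: sub-object-level pruning (stand-in: per-primitive boxes).
    primitive_pcs = [
        (i, a, j, b)
        for i, j in object_pcs
        for a, s in enumerate(objects[i])
        for b, t in enumerate(objects[j])
        if aabb_overlap(sphere_aabb(s), sphere_aabb(t))
    ]
    # Stage 3: exact tests on the small set that survived pruning.
    return [
        (i, a, j, b)
        for i, a, j, b in primitive_pcs
        if spheres_intersect(objects[i][a], objects[j][b])
    ]


if __name__ == "__main__":
    scene = [
        [(0, 0, 0, 1), (5, 0, 0, 1)],
        [(1.5, 1.5, 0, 1)],
        [(20, 0, 0, 1)],
        [(0, 1.5, 0, 1)],
    ]
    print(find_collisions(scene))
```

The design point carried over from the slide is that the pruning stages are conservative: they may keep pairs that do not collide, but they never discard a pair that does, so the exact tests only have to confirm or reject a short candidate list.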

Reliable Collision Culling using GPUs

Interactive Self-Collision Detection
- Order of magnitude improvement

Interactive Proximity Query
- Breaking objects & changing topologies

GPU-based Computations
- Accelerating OneSAF using GPUs
- Interactive collision detection
- Simulations
- Database and data streaming
- Sorting and scientific computations

Interactive Smoke Simulation using GPUs
Interactive fluid simulation: Demonstration 1, Demonstration 2, Demonstration 3

Interactive Ice Simulation using GPUs
Interactive phase field method simulation

Interactive Fluid Simulation using GPUs
Interactive paint mixing with a human in the loop

Interactive Lightning using GPUs
Interactive lightning: Demonstration 1, Demonstration 2

GPU-based Computations
- Accelerating OneSAF using GPUs
- Interactive collision detection
- Interactive simulations
- Interactive shadows
- Database and data streaming
- Sorting and scientific computations

Shadows
Shadows occur on surfaces seen by the eye, but not seen by the light.
Diagram labels: light, object, shadow, eye
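
That definition translates directly into a visibility test: a surface point that the eye can see is in shadow when the segment from the point to the light is blocked. The following sketch assumes a point light and sphere occluders, and it assumes the point is already known to be visible from the eye; the function names are invented for this illustration and this is not the shadow algorithm presented in the talk.

```python
import math


def segment_hits_sphere(p, q, center, radius):
    """True if the straight segment from p to q touches the sphere."""
    d = tuple(b - a for a, b in zip(p, q))       # segment direction
    f = tuple(a - c for a, c in zip(p, center))  # center-to-start offset
    dd = sum(x * x for x in d)
    if dd == 0.0:
        return math.dist(p, center) <= radius
    # Parameter of the point on the segment closest to the sphere center.
    t = -sum(x * y for x, y in zip(f, d)) / dd
    t = max(0.0, min(1.0, t))
    closest = tuple(a + t * x for a, x in zip(p, d))
    return math.dist(closest, center) <= radius


def in_shadow(point, light, occluders):
    """occluders: iterable of (center, radius) spheres."""
    return any(segment_hits_sphere(point, light, c, r) for c, r in occluders)


if __name__ == "__main__":
    light = (0.0, 10.0, 0.0)
    occluders = [((0.0, 5.0, 0.0), 1.0)]
    print(in_shadow((0.0, 0.0, 0.0), light, occluders))  # directly below the occluder
    print(in_shadow((5.0, 0.0, 0.0), light, occluders))  # off to the side
```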

Shadow Generation
- Shadows improve depth perception
- Shadows provide additional information about an object's shape
- Aesthetics: shadows are more visually interesting & realistic


Interactive Shadow Generation using GPUs

GPU-based Computations
- Accelerating OneSAF using GPUs
- Interactive collision detection
- Interactive simulations
- Interactive shadows
- Database and data streaming
- Sorting and scientific computations

Databases: Predicate Evaluation
- CPU implementation: Intel compiler 7.1 with SSE optimizations
- CPU + GPU is ~20 times faster than the CPU alone
[SIGMOD 2004]
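
Predicate evaluation suits data-parallel hardware because the same comparison is applied independently to every record. The slide does not describe how the GPU version works, so the sketch below only shows the pattern on the CPU: one pass per predicate produces a 0/1 mask, masks are combined element-wise, and the selected records are counted. All names and the sample columns are made up for this example.

```python
import operator

_COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def evaluate_predicate(column, op, constant):
    """Apply `value op constant` to every record; return a 0/1 mask."""
    compare = _COMPARATORS[op]
    return [1 if compare(value, constant) else 0 for value in column]


def combine_and(mask_a, mask_b):
    """Element-wise AND of two masks (a conjunction of predicates)."""
    return [a & b for a, b in zip(mask_a, mask_b)]


def count_selected(mask):
    return sum(mask)


if __name__ == "__main__":
    ages = [25, 40, 31, 58]
    salaries = [30, 80, 55, 20]
    mask = combine_and(
        evaluate_predicate(ages, ">", 30),
        evaluate_predicate(salaries, ">=", 50),
    )
    print(mask, count_selected(mask))
```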

Comparison on Different GPUs: Super-Moore's Law

GPUSort: 32-bit floating point inputs
GPUSORT: slashdot.org & Tom's Hardware Guide (750 downloads in 6 weeks)
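
The slide does not say which algorithm GPUSort uses. As a general illustration of why sorting maps well to data-parallel hardware, the sketch below implements a bitonic sorting network, chosen here only as an example: every pass applies a fixed pattern of independent compare-exchange operations, so each pass could in principle run in parallel. The function name is hypothetical and the input length is assumed to be a power of two.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network."""
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(values)
    k = 2
    while k <= n:            # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:        # compare-exchange distance within this merge
            # All iterations of this loop are independent of each other.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


if __name__ == "__main__":
    data = [3.5, -1.0, 2.25, 0.0, 7.0, 7.0, -4.5, 1.0]
    print(bitonic_sort(data))
```

The network performs O(n log^2 n) comparisons, more than a comparison sort needs, but the comparison pattern never depends on the data, which is the property that makes it attractive for parallel execution.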

LU-Decomposition with Partial Pivoting (32-bit inputs)
[IEEE/ACM SuperComputing 2005]
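
For readers unfamiliar with the operation being benchmarked, here is a minimal CPU reference sketch of LU decomposition with partial pivoting. It is the textbook algorithm, not the GPU implementation from the talk; it uses Python's 64-bit floats rather than the 32-bit inputs mentioned on the slide, and the function name and return convention are choices made for this example.

```python
def lu_partial_pivot(matrix):
    """Factor a square matrix so that P*A = L*U.

    Returns (lu, perm): `lu` stores U on and above the diagonal and the
    multipliers of L (unit diagonal implied) below it; row i of P*A is
    row perm[i] of the input.
    """
    n = len(matrix)
    lu = [[float(x) for x in row] for row in matrix]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the largest-magnitude entry in column k.
        pivot = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != k:
            lu[k], lu[pivot] = lu[pivot], lu[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        # Eliminate column k below the diagonal.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


if __name__ == "__main__":
    lu, perm = lu_partial_pivot([[2, 1, 1], [4, 3, 3], [8, 7, 9]])
    print(perm)
    for row in lu:
        print(row)
```

The inner elimination loop updates every entry of the trailing submatrix independently, which is the part of the computation that lends itself to a data-parallel implementation.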

GPU-based Algorithms
- 1-2 orders of magnitude improvement
- The performance gap is expected to increase in the future
- OneSAF scalability (using GPU clusters)

Future Work
- Develop other GPU-based algorithms for OOS
  - Other LOS computations: attenuation, handling smoke
  - Force and atmospheric simulations
  - Combine with multi-resolution representations
- Handle very large and complex terrains
- GPU clusters for modeling and simulation
- Extension to multiple simulation environments: WARSIM, JMTK, GIG
- Use GPUs with various RDEC models