1
Interactive Modeling and Simulation using Graphics Processors
Dinesh Manocha
University of North Carolina at Chapel Hill
dm@cs.unc.edu
http://gamma.cs.unc.edu/hardware
02/22/2006
2
UNC Collaborators
- Co-PI: Ming C. Lin
- Research Staff: Naga Govindaraju, Dave Tuft
- Graduate Students: Russ Gayle, Brandon Lloyd, Brian Salomon, Avneesh Sud, Sungeui Yoon, Talha Zaman
3
Collaborative Effort
- RDECOM: Maria Bauer, Angel Rodriguez
- SAIC: Eric Root, Marlo Verdesca, Jaeson Munro
4
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL
Current Desktop System
- CPU (3 GHz)
- System memory (2 GB)
- AGP memory (512 MB)
- 6.4 GB/s bandwidth
- PCI-E bus (4 GB/s)
- 35.2 GB/s bandwidth
- Video memory (512 MB)
- GPU (500 MHz)
- 2 x 1 MB cache
5
GeForce 7800 – 302M Transistors
6
CPU vs. GPU
7
CPU vs. GPU (Henry Moreton, NVIDIA, Aug. 2005)

                      PEE 840   7800GTX   GPU/CPU
Graphics GFLOPs          25.6      1300      50.8
Shader GFLOPs            25.6       313      12.2
Die Area (mm2)            206       326       1.6
Die Area normalized       206       218       1.1
Transistors (M)           230       302       1.3
Power (W)                 130        65       0.5
GFLOPS/mm                 0.1       6.0      47.9
GFLOPS/tr                 0.1       4.3      38.7
GFLOPS/W                  0.2      20.0     101.6
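The GPU/CPU ratios in the three efficiency rows can be rechecked from the raw figures on this slide. The sketch below is only an illustration: it assumes the efficiency rows divide graphics GFLOPs by the normalized die area, the transistor count, and the power draw. The slide does not state this, but the assumption reproduces the printed ratios to within about 0.1. The names `specs` and `efficiency_ratio` are made up for the example.

```python
# Raw figures from this slide, as (PEE 840, 7800GTX) pairs.
specs = {
    "graphics_gflops": (25.6, 1300.0),
    "die_area_norm_mm2": (206.0, 218.0),
    "transistors_m": (230.0, 302.0),
    "power_w": (130.0, 65.0),
}


def efficiency_ratio(resource_key):
    """GPU/CPU ratio of graphics GFLOPs per unit of the given resource.

    Assumption: the slide's efficiency rows use graphics GFLOPs (not shader
    GFLOPs) in the numerator.
    """
    cpu_gflops, gpu_gflops = specs["graphics_gflops"]
    cpu_res, gpu_res = specs[resource_key]
    return (gpu_gflops / gpu_res) / (cpu_gflops / cpu_res)


for key in ("die_area_norm_mm2", "transistors_m", "power_w"):
    print(key, round(efficiency_ratio(key), 1))
```

The per-area result comes out slightly above the printed 47.9, which suggests the slide's value was truncated or computed from unrounded inputs.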
8
GPUs: Growing Faster than Moore’s Law
This graph highlights the relative growth rate of GPUs vs. CPUs. GPUs have been growing at a rate faster than Moore’s law, and this trend is expected to continue for at least 5 more years.
Goal: Exploit GPUs for CGF computations
9
Quad SLI: 1.3 billion transistors (Jan. 2006)
10
GPGP: General Purpose computation using GPUs
- Scientific applications
- Geometric computations
- Scientific visualization
- Physical simulation
- Robotics & navigation
- Database computation
- Financial applications
- Cryptography
- Modeling and simulation
http://www.cs.unc.edu/GP2
http://www.gpgpu.org
11
Graphics Pipeline
Stages: vertex, polygon setup and rasterizer, pixel, texture, image, memory
- programmable vertex processing (fp32)
- polygon setup, culling, rasterization
- programmable per-pixel math (fp32)
- per-pixel texture, fp16 blending
- Z-buf, fp16 blending, anti-alias (MRT)
12
NON-Graphics Pipeline
Stages: data, setup and rasterizer (lists), data, data, memory
- programmable MIMD processing (fp32)
- SIMD “rasterization”
- programmable SIMD processing (fp32)
- data fetch, fp16 blending
- predicated write, fp16 blend, multiple output
Courtesy: David Kirk, Chief Scientist, NVIDIA
13
Issues in using GPUs
- Programmability
- Precision
- Handling large data
14
GPU-based Computations
- Accelerating OneSAF using GPUs
- Interactive collision detection
- Simulations
- Database and data streaming
- Sorting and scientific computations
16
Real-time Computational Challenges for Computer Generated Forces (CGF)
- Atmospheric transport models
- Vehicle dynamics
- Wide area sensors
- Petabyte urban terrain databases
17
Real-time Terrain Reasoning for Computer Generated Forces
- Best algorithms are O(N^2), where N = number of objects/entities in the CGF database (e.g., sensors, platforms, buildings, people)
- Currently over 40% of CGF CPU time for battalion-level scenarios is spent in:
  - Collision detection
  - Line-of-sight computation
  - Terrain placement
- Current system can barely handle 300 entities on a 300K-polygon terrain model at 10 m x 10 m resolution
- Need a 200-500x improvement to handle a sub-meter-resolution terrain model
- CPUs progressing at Moore’s law (1.7x per year) would need more than 7-8 years to catch up
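Two of the figures on this slide can be made concrete with a little arithmetic. The sketch below is illustrative only: `pair_count` assumes the O(N^2) cost comes from testing every unordered pair of entities, and `years_to_reach` assumes steady compounding at the quoted 1.7x per year. Neither function name comes from the presentation.

```python
import math


def pair_count(n):
    """Number of unordered entity pairs an all-pairs O(N^2) test must consider."""
    return n * (n - 1) // 2


def years_to_reach(speedup, annual_growth=1.7):
    """Years of compounding at annual_growth needed to reach the given speedup."""
    return math.log(speedup) / math.log(annual_growth)


print(pair_count(300))  # pairs for the 300-entity case on the slide
for target in (200, 500):
    print(target, round(years_to_reach(target), 1))
```

Under these assumptions the 200x and 500x targets take roughly 10 and 12 years, which is consistent with the slide's "more than 7-8 years".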
21
Project Accomplishments
- GPU-based LOS algorithm
  - 150-200x improvement in LOS query
  - Integration into OneSAF: 15-20x simulation speed improvement (5000 entities)
- Region-based visibility algorithms to accelerate LOS (supported by ATO)
  - 4-10x further improvement in LOS query
  - Integration into OneSAF: 10x simulation speed improvement in urban environments (3000 entities)
- GPU-based route planning
  - 10-30x improvement in route computation
  - 10x simulation speed improvement (3000 entities)
- GPU-based collision detection
  - 10x estimated improvement in collision query
  - 10x simulation speed improvement (150 entities)
22
LOS Integration Process (slide dated 11/28/2005)
- OneSAF/GPU Requirements (SAIC/UNC)
- OneSAF Technical Report (SAIC)
- GPU Algorithm Creation (UNC)
- Execute Unit Test (SAIC/UNC)
- OneSAF Scenario Creation (SAIC)
- OneSAF Benchmark Results (SAIC)
Integration into OOS (SAIC)
- Add several OpenGL DLLs to the ERC libraries
- Place C++ header files for OpenGL among the ERC code
- Create a new directory among the ERC code
  - Set up a new makefile/buildfile to allow the GPU code to build as its own library
- Add calls to ERC initialization to:
  - Gather all the triangles in the entire database
  - Gather all features in the database
  - Pass all triangles and features into the initialization for the GPU
- Replace all original LOS calls with the GPU counterpart
23
OneSAF with GPU-based LOS Algorithm: Demonstration
- LOS computation on 5K entities
- Route planning on 5K entities
24
OneSAF with GPU-based LOS Algorithm: Demonstration
- Average time for a standard LOS service call (without the GPU-based algorithm): 1-2 milliseconds
- Average time for a GPU LOS service call: 8-12 microseconds
- Almost 200x speedup for a single LOS query
- 15-20x improvement in OneSAF simulation speed on JRTC terrain with 5000 entities
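The per-query speedup follows directly from the two timing ranges quoted on this slide. A small sketch of that arithmetic follows; the function name `speedup` is illustrative, and pairing the range endpoints this way is an assumption about how the extremes combine.

```python
def speedup(baseline_seconds, accelerated_seconds):
    """Ratio of baseline time to accelerated time for one query."""
    return baseline_seconds / accelerated_seconds


# Endpoints of the quoted ranges: 1-2 ms without the GPU, 8-12 microseconds with it.
worst_case = speedup(1e-3, 12e-6)  # fastest CPU call vs. slowest GPU call
best_case = speedup(2e-3, 8e-6)    # slowest CPU call vs. fastest GPU call
print(round(worst_case), round(best_case))
```

The quoted "almost 200x" falls inside the range these endpoints span.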
25
Project Accomplishments
Successful demonstrations at DARPATech 2005, I/ITSEC ’04, and I/ITSEC ’05
27
Proximity Queries
Geometric reasoning of spatial relationships among objects (in a dynamic environment)
- Closest points & separation distance
- Penetration depth
- Collision detection
- Contact points & normals
28
Collision Detection Systems
- I-COLLIDE (1995)
- RAPID (1996)
- V-COLLIDE (1997)
- H-COLLIDE (1998)
- PQP (1999)
- SWIFT (2000)
- PIVOT (2001)
- SWIFT++ (2001)
- DEEP (2002)
- CULLIDE (2003)
29
Distance Fields
Voronoi diagram computation using GPUs
- Render polygonal mesh approximations of primitive distance fields
- Color buffer and depth buffer: result after compositing distance fields using minimum depth test
[Hoff, et al; SIGGRAPH 1999]
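The compositing step on this slide can be imitated on the CPU. The sketch below is a simplified stand-in, not the published method: it evaluates exact distances for point sites on a small grid instead of rendering polygonal mesh approximations, and the function name `composite_distance_fields` is invented for the example. The "depth buffer" keeps the minimum distance seen so far and the "color buffer" records which site produced it, so the color buffer ends up holding a discrete Voronoi diagram.

```python
import math


def composite_distance_fields(sites, width, height):
    """Return (color, depth) buffers for point sites on a width x height grid.

    depth[y][x] is the distance to the nearest site; color[y][x] is that
    site's index, i.e. the Voronoi region the pixel belongs to.
    """
    depth = [[math.inf] * width for _ in range(height)]
    color = [[-1] * width for _ in range(height)]
    for site_id, (sx, sy) in enumerate(sites):
        # "Render" this site's distance field over the whole grid.
        for y in range(height):
            for x in range(width):
                d = math.hypot(x - sx, y - sy)
                if d < depth[y][x]:  # the minimum depth test
                    depth[y][x] = d
                    color[y][x] = site_id
    return color, depth


color, depth = composite_distance_fields([(0, 0), (3, 0)], width=4, height=1)
print(color, depth)
```

On graphics hardware the inner comparison is what the depth test performs per pixel, which is why rendering one distance mesh per primitive is enough.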
30
Our Hybrid Approach
- CPU: coarse object-space geometric localization
- GPU: image-space proximity queries
- Balance load by varying localization coarseness and error bound
[Hoff, Zaferakis, Lin & Manocha; I3D01]
31
Gears
- Non-convex, rigid objects
- Frequent interlocking contacts
- Unconstrained, penalty-based
32
Algorithm
- Object-level pruning (GPU-based PCS computation)
- Sub-object-level pruning (GPU-based PCS computation)
- Exact tests (using CPU)
http://gamma.cs.unc.edu/CULLIDE
[Govindaraju, et al; GH 2003]
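The three stages named on this slide form a filter pipeline: each stage discards candidates cheaply so that the expensive exact test runs on few pairs. The sketch below shows only that structure, with loud caveats. It swaps in axis-aligned bounding-box overlap for the GPU-based pruning, uses 2D circles as the primitives, and runs entirely on the CPU. All function names are hypothetical, and this is not the CULLIDE implementation. PCS here means the potentially colliding set that survives pruning.

```python
def aabb_of_circle(circle):
    (x, y), r = circle
    return (x - r, y - r, x + r, y + r)


def aabb_union(boxes):
    x0, y0, x1, y1 = zip(*boxes)
    return (min(x0), min(y0), max(x1), max(y1))


def aabb_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def circles_collide(c1, c2):
    (x1, y1), r1 = c1
    (x2, y2), r2 = c2
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 <= (r1 + r2) ** 2


def find_collisions(objects):
    """objects maps a name to a list of circles ((x, y), r)."""
    names = sorted(objects)
    bounds = {n: aabb_union([aabb_of_circle(c) for c in objects[n]]) for n in names}
    # Stage 1: object-level pruning yields the object-level PCS.
    pcs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
           if aabb_overlap(bounds[a], bounds[b])]
    hits = []
    for a, b in pcs:
        for i, ca in enumerate(objects[a]):
            for j, cb in enumerate(objects[b]):
                # Stage 2: sub-object-level pruning on primitive bounds.
                if not aabb_overlap(aabb_of_circle(ca), aabb_of_circle(cb)):
                    continue
                # Stage 3: exact test on the surviving primitive pair.
                if circles_collide(ca, cb):
                    hits.append((a, i, b, j))
    return pcs, hits


scene = {
    "A": [((0, 0), 1), ((3, 0), 1)],
    "B": [((4.5, 0), 1), ((4.8, 1.8), 1)],
    "C": [((10, 10), 1)],
}
print(find_collisions(scene))
```

In the example scene, object C is removed at stage 1, and one pair of primitives in A and B passes the bounding-box check but fails the exact test, which is the case the final stage exists to catch.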
33
Reliable Collision Culling using GPUs
34
Interactive Self-Collision Detection
1-1.5 orders of magnitude improvement
35
Interactive Proximity Query
Breaking objects & changing topologies
37
Interactive Smoke Simulation using GPUs
Interactive Fluid Simulation
- Demonstration 1
- Demonstration 2
- Demonstration 3
38
Interactive Ice Simulation using GPUs
Interactive Phase Field Method Simulation
39
Interactive Fluid Simulation using GPUs
Interactive Paint Mixing with a human in the loop
40
Interactive Lightning using GPUs
- Demonstration 1
- Demonstration 2
41
GPU-based Computations
- Accelerating OneSAF using GPUs
- Interactive collision detection
- Interactive simulations
- Interactive shadows
- Database and data streaming
- Sorting and scientific computations
42
Shadows
Shadows occur on surfaces seen by the eye, but not seen by the light
Diagram labels: Light, Object, Shadow, Eye
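The definition on this slide translates directly into two visibility tests. The sketch below is a geometric illustration only: it assumes spherical occluders and tests straight segments against them on the CPU, which is not how GPU shadow algorithms such as shadow maps or shadow volumes work, and the slides do not say which algorithm was used. All function names are invented for the example.

```python
def segment_hits_sphere(p, q, center, radius):
    """True if the segment from p to q passes through the sphere's interior."""
    d = [qi - pi for pi, qi in zip(p, q)]
    f = [pi - ci for pi, ci in zip(p, center)]
    dd = sum(x * x for x in d)
    if dd == 0.0:
        return sum(x * x for x in f) < radius * radius
    # Parameter of the point on the segment closest to the sphere center.
    t = max(0.0, min(1.0, -sum(fi * di for fi, di in zip(f, d)) / dd))
    closest = [fi + t * di for fi, di in zip(f, d)]
    return sum(x * x for x in closest) < radius * radius


def visible(viewer, point, occluders):
    """True if no occluder blocks the segment from viewer to point."""
    return not any(segment_hits_sphere(viewer, point, c, r) for c, r in occluders)


def in_shadow(point, eye, light, occluders):
    """Shadowed means seen by the eye but not seen by the light."""
    return visible(eye, point, occluders) and not visible(light, point, occluders)


occluders = [((0.0, 5.0, 0.0), 1.0)]
eye = (10.0, 5.0, 0.0)
light = (0.0, 10.0, 0.0)
print(in_shadow((0.0, 0.0, 0.0), eye, light, occluders))
print(in_shadow((5.0, 0.0, 0.0), eye, light, occluders))
```

The first query point lies directly beneath the occluder as seen from the light, while the second is off to the side, so only the first satisfies both halves of the definition.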
43
Shadow Generation
- Shadows improve depth perception
- Shadows provide additional information about an object’s shape
- Aesthetics: shadows are more visually interesting & realistic
47
Interactive Shadow Generation using GPUs
49
Databases: Predicate Evaluation
- CPU implementation: Intel compiler 7.1 with SSE optimizations
- CPU + GPU is ~20 times faster than CPU only
- SIGMOD 2004
50
Comparison on Different GPUs: Super-Moore’s Law
51
GPUSort: 32-bit floating point inputs
GPUSORT: slashdot.org & Tom’s Hardware Guide (750 downloads in 6 weeks)
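Sorting maps onto a GPU when it is expressed as a sorting network, where every pass applies the same compare-exchange to many independent pairs. The sketch below shows one such network, a bitonic sort, on the CPU. This is an assumption made for illustration: the slide does not say which network GPUSort uses, and the function name `bitonic_sort` is not taken from that library.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # distance between compared elements
            # All compare-exchanges in this pass touch disjoint pairs, so a
            # GPU could run them as one data-parallel step.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


data = [3.5, -1.0, 2.0, 8.0, 0.5, 7.25, -4.0, 1.0]
print(bitonic_sort(data))
```

The sequence of comparisons depends only on the input length, never on the data, which is the property that makes networks like this a fit for fixed-function parallel hardware.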
52
LU-Decomposition with Partial Pivoting (32-bit inputs)
IEEE/ACM SuperComputing 2005
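For readers unfamiliar with the algorithm named on this slide, here is a minimal CPU reference sketch of LU decomposition with partial pivoting. It is the textbook row-by-row form, not the GPU formulation presented at SuperComputing 2005. It uses Python's 64-bit floats rather than the 32-bit inputs the slide mentions, and the function name `lu_partial_pivot` is invented for the example.

```python
def lu_partial_pivot(matrix):
    """Return (perm, lu) for a square matrix.

    lu packs L (unit lower triangular, below the diagonal) and U (on and
    above the diagonal). Row perm[i] of the input equals row i of L times U.
    """
    n = len(matrix)
    lu = [[float(v) for v in row] for row in matrix]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the largest magnitude entry in column k.
        pivot = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != k:
            lu[k], lu[pivot] = lu[pivot], lu[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        # Eliminate column k below the pivot, storing multipliers in place.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return perm, lu


A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
perm, lu = lu_partial_pivot(A)
print(perm, lu)
```

The inner update of each row below the pivot is independent of the other rows, which is the parallelism a GPU version can exploit.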
53
GPU-based Algorithms
- 1-2 orders of magnitude improvement
- Performance gap would increase in the future
- OneSAF scalability (using GPU clusters)
54
Future Work
- Develop other GPU-based algorithms for OOS
- Other LOS computations: attenuation, handling smoke
- Force and atmospheric simulations
- Combine with multi-resolution representations
- Handle very large and complex terrains
- GPU clusters for modeling and simulation
- Extension to multiple simulation environments: WARSIM, JMTK, GIG
- Use GPUs with various RDEC models