Published by Barbra Watkins. Modified over 9 years ago.
1
David Angulo Rubio, FAMU CIS Graduate Student
2
Introduction
- The GPU (Graphics Processing Unit) on video cards has evolved over the last few years.
- GPUs have become extremely powerful and flexible:
  - Programmability
  - Precision
  - Power
- GPGPU computing is an emerging field whose objective is to harness GPUs for general-purpose computation.
3
GPU Performance Trends
4
Motivation: Flexible and Precise
- Modern GPUs are deeply programmable
  - Programmable pixel, vertex, and video engines
  - Solidifying high-level language support
- Modern GPUs support high precision
  - 32-bit floating point throughout the pipeline
  - High enough for many (not all) applications
  - Newest GPUs have 64-bit support
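To make "high enough for many (not all) applications" concrete, here is a minimal sketch of what 32-bit precision loses relative to 64-bit. It is not from the slides: the helper name `to_float32` is hypothetical, and the round trip through Python's standard `struct` module only emulates on the CPU how a value would be stored as a 32-bit float.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back.

    Hypothetical helper for illustration; it emulates 32-bit storage on the CPU.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


# A 32-bit float carries 24 significant bits, so 2**24 + 1 cannot be stored exactly.
big = float(2**24 + 1)
print(to_float32(big) == big)    # False: the value was rounded

# Values that fit in 24 significant bits survive unchanged.
print(to_float32(0.5) == 0.5)    # True

# 0.1 is already approximate in 64-bit; in 32-bit the approximation is coarser.
print(to_float32(0.1) == 0.1)    # False
```

Applications that count past 2**24 in floating point, or that accumulate many small increments, are the kind that may need the 64-bit support mentioned on the slide.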
6
Stream Programming Abstraction
- Streams
  - Collection of data records
  - All data is expressed in streams
- Kernels
  - Inputs/outputs are streams
  - Perform computation on streams
  - Can be chained together
[Diagram labels: kernel, stream]
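The stream and kernel vocabulary above can be sketched on the CPU as follows. This is an illustration only, not GPU code and not the presenter's implementation: the names `make_kernel`, `chain`, `scale`, and `offset` are hypothetical, and a Python iterable stands in for a stream.

```python
from typing import Callable, Iterable, Iterator

Record = float
Kernel = Callable[[Iterable[Record]], Iterator[Record]]


def make_kernel(fn: Callable[[Record], Record]) -> Kernel:
    """Lift a per-record function into a kernel: stream in, stream out."""
    def kernel(stream: Iterable[Record]) -> Iterator[Record]:
        for record in stream:
            yield fn(record)
    return kernel


def chain(*kernels: Kernel) -> Kernel:
    """Chain kernels so that each one consumes the previous one's output stream."""
    def chained(stream: Iterable[Record]) -> Iterable[Record]:
        for k in kernels:
            stream = k(stream)
        return stream
    return chained


# Two hypothetical kernels, chained into one pipeline.
scale = make_kernel(lambda x: 2.0 * x)
offset = make_kernel(lambda x: x + 1.0)
pipeline = chain(scale, offset)

result = list(pipeline([1.0, 2.0, 3.0]))
print(result)  # [3.0, 5.0, 7.0]
```

Because each kernel touches only the stream it is handed, kernels can be reordered or recombined without knowing where the data came from.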
7
Stream Programming Abstraction
[Figure: dolphin triangle mesh]
8
Stream Programming Abstraction
Benchmark Funnel: In this simulation, a cloth falls into a funnel and passes through it under the pressure of a ball. The model has 47K vertices, 92K triangles, and many self-collisions. Our novel GPU-based CCD algorithm takes 4.4 ms and 10 ms per frame to compute all the collisions on an NVIDIA GeForce GTX 480 and an NVIDIA GeForce GTX 285, respectively.
9
Stream Programming Abstraction
10
Why Streams
- Ample computation by exposing parallelism
  - Streams expose data parallelism
    - Multiple stream elements can be processed in parallel
  - Pipeline (task) parallelism
    - Multiple tasks can be processed in parallel
  - Kernels yield high arithmetic intensity
- Efficient communication
  - Producer-consumer locality
  - Predictable memory access pattern
  - Optimize for throughput of all elements, not latency of one
  - Processing many elements at once allows latency hiding
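The data-parallelism point can be checked with a small CPU-side sketch: when a kernel depends only on its own input element, the elements can be handed to several workers and the result matches the sequential run. The function names here are hypothetical. Python threads are used only to show the independence property; they are not expected to speed up this CPU-bound work, so this says nothing about GPU performance.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(x: float) -> float:
    """Per-element work that depends only on its own input element."""
    return x * x + 1.0


def run_sequential(stream):
    """Process the stream one element at a time."""
    return [kernel(x) for x in stream]


def run_parallel(stream, workers: int = 4):
    """Process elements on several workers; Executor.map keeps input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(kernel, stream))


data = [float(i) for i in range(1000)]
same = run_parallel(data) == run_sequential(data)
print(same)  # True: no element depends on another, so scheduling does not matter
```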
11
CPU-GPU Analogies
- Stream / data array = Texture
- Memory read = Texture sample
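The analogy can be sketched by treating a 2D array as a texture and a memory read as a sample. This is a simplified CPU illustration: `texture_sample` is a hypothetical name, and the nearest-neighbour lookup with edge clamping is an assumption chosen for brevity. Graphics APIs typically also offer filtering and other addressing modes.

```python
def texture_sample(texture, u: float, v: float) -> float:
    """Read a 2D array with normalized coordinates in [0, 1].

    Nearest-neighbour lookup, clamped to the array edges (both are assumptions).
    """
    height = len(texture)
    width = len(texture[0])
    # Map normalized coordinates to integer texel indices, then clamp.
    x = min(max(int(u * width), 0), width - 1)
    y = min(max(int(v * height), 0), height - 1)
    return texture[y][x]


# A 2x2 "texture": the data array from the CPU side of the analogy.
texture = [[0.0, 1.0],
           [2.0, 3.0]]

print(texture_sample(texture, 0.75, 0.25))  # 1.0 (top-right texel)
print(texture_sample(texture, 1.0, 1.0))    # 3.0 (clamped to the last texel)
```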
12
Structuring a GPU Program
- CPU assembles input data
- CPU transfers data to the GPU (GPU "main memory" or "device memory")
- CPU calls the GPU program (or set of kernels); the GPU runs out of GPU main memory
- When the GPU finishes, the CPU copies the results back into CPU memory
- Recent interfaces allow overlap
- What lessons can we draw from this sequence of operations?
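The sequence above can be walked through with a toy model. `FakeDevice` is a hypothetical stand-in written for this sketch, not a real GPU API. It keeps its own copy of the data to mimic separate device memory, and it counts transferred bytes under the assumption of 4-byte floats.

```python
class FakeDevice:
    """Hypothetical stand-in for a GPU with its own memory (not a real API)."""

    def __init__(self):
        self.memory = {}
        self.bytes_transferred = 0

    def upload(self, name, host_data):
        """CPU -> device copy."""
        self.memory[name] = list(host_data)            # separate copy, like device memory
        self.bytes_transferred += 4 * len(host_data)   # assumes 4-byte floats

    def launch(self, kernel, src, dst):
        """Run a kernel over device-resident data; nothing crosses to the CPU."""
        self.memory[dst] = [kernel(x) for x in self.memory[src]]

    def download(self, name):
        """Device -> CPU copy."""
        data = list(self.memory[name])
        self.bytes_transferred += 4 * len(data)
        return data


host_input = [float(i) for i in range(8)]          # 1. CPU assembles input data
device = FakeDevice()
device.upload("in", host_input)                    # 2. CPU transfers data to the device
device.launch(lambda x: x * 0.5, "in", "tmp")      # 3. kernels run out of device memory
device.launch(lambda x: x + 1.0, "tmp", "out")
host_output = device.download("out")               # 4. CPU copies the results back

print(host_output)               # [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
print(device.bytes_transferred)  # 64: one upload and one download of 8 floats
```

One lesson this toy model suggests, offered as an observation rather than the presenter's answer: the intermediate result `tmp` never leaves the device, so chaining kernels adds computation without adding transfers.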
16
Kernels
- CPU-GPU analogy: Kernel / loop body / algorithm step = Fragment program
- You write one program. It runs on every vertex/fragment.
[Figure label: Advect]
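The "loop body = fragment program" idea can be sketched by separating the per-cell body from the loop that drives it. The names `advect_cell` and `run_kernel` are hypothetical. The whole-cell shift with wrap-around is a deliberate simplification chosen for this sketch, not the advection step from the slides, which would normally use a velocity field and interpolation.

```python
def advect_cell(field, x: int, y: int, dx: int, dy: int) -> float:
    """Loop body / kernel: output cell (x, y) reads the input cell upstream of it.

    Simplified: constant integer shift with wrap-around at the edges (assumptions).
    """
    height, width = len(field), len(field[0])
    src_x = (x - dx) % width
    src_y = (y - dy) % height
    return field[src_y][src_x]


def run_kernel(kernel, field, *args):
    """Driver: invoke the same kernel once for every cell (fragment) of the grid."""
    height, width = len(field), len(field[0])
    return [[kernel(field, x, y, *args) for x in range(width)]
            for y in range(height)]


field = [[1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]

moved = run_kernel(advect_cell, field, 1, 0)
print(moved)  # [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
```

On a CPU the driver is an explicit loop. In the analogy, the GPU plays the role of `run_kernel` and the programmer writes only `advect_cell`.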
17
Conclusion
- Can we apply these techniques to more general problems?
- GPUs should excel at tasks that require:
  - Ample computation
  - Regular computation
  - Efficient communication