Published by Dawid Górski. Modified over 6 years ago.
1
GP2: General Purpose Computation using Graphics Processors
Dinesh Manocha & Avneesh Sud
Spring 2007
Department of Computer Science, UNC Chapel Hill
2
Instructors
Dinesh Manocha: dm@cs.unc.edu, 962-1749
Avneesh Sud:
3
Class Schedule
Current time slot: 2:00 – 3:15pm, Mon/Wed, SN011
Office hours: TBD
Class mailing list: (??)
4
GPGP: What kind of course is it?
Is it a graphics course?
5
GPGP: What kind of course is it?
Is it a graphics course?
Is it a systems course?
6
GPGP: What kind of course is it?
Is it a graphics course?
Is it a systems course?
Is it an applications course?
7
GPGP: What kind of course is it?
Is it a graphics course?
Is it a systems course?
Is it an applications course?
It is all of them!
8
Is this the right course for me?
No strict prerequisites
The course borrows concepts from:
  Computer graphics
  Linear algebra
  Numerical computations
  Architectures: CPUs & GPUs
  Parallel programming (data-parallel programming)
Applications:
  Geometric computations
  Database computations
  Scientific computing and physical simulation
  Computer vision
  …
9
Modern Commodity Processors
Diagram components: CPU (2 x 3 GHz) with 2 x 4 MB cache; system memory (4 GB); GPU (1.3 GHz) with video memory (768 MB); PCI-E bus (4 GB/s); HyperTransport (20 GB/s).
Modern computer architectures consist of two kinds of processors, CPUs and GPUs, to handle these datasets. We quickly glance over the issues with CPUs and later explain the advantages of GPUs.
10
GPUs of Today!
The GPU on commodity video cards has evolved into an extremely flexible and powerful processor
  Programmability
  Precision
  Power
11
GPGP
The GPU on commodity video cards has evolved into an extremely flexible and powerful processor
  Programmability
  Precision
  Power
This course will address how to harness that power for general-purpose computation (non-rasterization)
  Algorithmic issues
  Programming and systems
  Applications
12
GeForce 7900 – 302M Transistors (2005)
13
GeForce 7900 – 302M Transistors (OUT OF DATE)
14
GeForce 8800 – 600M Transistors (2006)
15
Graphics Processing Units (GPUs)
Commodity processor for graphics applications
Massively parallel vector processors
High memory bandwidth
Low memory latency pipeline
Programmable
High growth rate
Power-efficient
16
GPU: Commodity Processor
Laptops, consoles, cell phones, PSP, desktops
17
GPU: Commodity Processor
Laptops, consoles, cell phones, PSP, desktops
????: Supercomputers
18
GPU: Commodity Processor
Laptops, consoles, cell phones, PSP, desktops
????: iPhone
19
Graphics Processing Units (GPUs)
Commodity processor for graphics applications
Massively parallel vector processors
  10-20x more operations per second than CPUs
High memory bandwidth
Pipeline better hides memory latency
Programmable
High growth rate
Power-efficient
20
Parallelism on GPUs
Graphics FLOPS
  GPU: 1.3 TFLOPS
  CPU: 25.6 GFLOPS
21
Quad SLI: 1.3 billion transistors
January 2006
22
Graphics Processing Units (GPUs)
Commodity processor for graphics applications
Massively parallel vector processors
High memory bandwidth
  10x more memory bandwidth than CPUs
Pipeline better hides latency
Programmable
High growth rate
Power-efficient
23
CPU vs. GPU Memory Hierarchy
Diagram: CPU side: Core 1 and Core 2 (each with FP unit, registers, L1 Dcache), L2 cache, DDR2 RAM. GPU side: FP units, registers, L1 cache, L2 cache, GDDR4 RAM.
24
CPU vs. GPU Memory Hierarchy: Broad Level Comparison
Diagram: CPU side: Core 1 and Core 2 (each with FP unit, registers, L1 Dcache), L2 cache, DDR2 RAM. GPU side: FP units, registers, L1 cache, L2 cache, GDDR4 RAM.
Cache annotations: write back (CPU), write through (GPU).
25
CPU vs. GPU Memory Hierarchy
Diagram: CPU side: Core 1 and Core 2 (each with FP unit, registers, L1 Dcache), L2 cache, DDR2 RAM. GPU side: FP units, registers, L1 cache, L2 cache, GDDR4 RAM.
Cache size annotations: small, 4 MB (CPU); very small (GPU).
26
CPU vs. GPU Memory Hierarchy
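Diagram: CPU side: Core 1 and Core 2 (each with FP unit, registers, L1 Dcache), L2 cache, DDR2 RAM. GPU side: FP units, registers, L1 cache, L2 cache, GDDR4 RAM.
Memory bandwidth annotations: low bandwidth, 8 GB/s (CPU, DDR2 RAM); high bandwidth, 86 GB/s (GPU, GDDR4 RAM).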
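As a rough, back-of-envelope sketch of what this bandwidth gap means: the two bandwidth figures are the ones on the slide (assuming 86 GB/s belongs to the GPU's GDDR4 and 8 GB/s to the CPU's DDR2), while the 512 MB working set and the helper name `stream_time_ms` are made up for illustration. The estimate uses peak bandwidth only and ignores caches and latency.

```python
# Back-of-envelope time to stream a working set once through memory.
GPU_BW_GB_PER_S = 86.0   # GDDR4 figure from the slide
CPU_BW_GB_PER_S = 8.0    # DDR2 figure from the slide

def stream_time_ms(num_bytes, bandwidth_gb_per_s):
    """Milliseconds to read num_bytes once at the given peak bandwidth."""
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1e3

working_set = 512 * 1024 * 1024  # hypothetical 512 MB data set
gpu_ms = stream_time_ms(working_set, GPU_BW_GB_PER_S)
cpu_ms = stream_time_ms(working_set, CPU_BW_GB_PER_S)
ratio = cpu_ms / gpu_ms          # how many times longer the CPU-side pass takes
```

Under these assumptions the ratio is simply 86 / 8, which is in line with the "10x more memory bandwidth" claim made elsewhere in the deck.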
27
Graphics Processing Units (GPUs)
Commodity processor for graphics applications
Massively parallel vector processors
High memory bandwidth
Pipeline better hides latency
Programmable
High growth rate
Power-efficient
28
GFLOPS for GPUs & CPUs
[Chart; labels: Graphics-Flops, Giga-Flops]
29
Graphics Processing Units (GPUs)
Commodity processor for graphics applications
Massively parallel vector processors
High memory bandwidth
Pipeline better hides latency
Programmable
High growth rate
Power-efficient (high throughput per watt)
30
Computational Power of GPUs
Why are GPUs getting faster so fast?
Arithmetic intensity: the specialized nature of GPUs makes it easier to use additional transistors for computation rather than cache
Economics: the multi-billion dollar video game market is the killer application that pays for innovation
31
GPUs and Computer Architecture
Current research in computer architecture is looking at:
  Streaming computation
  Flexible polymorphous computing systems
  Multi-core architecture
  Heterogeneous architecture
More on these topics in the future
32
GPUs and Computer Architecture
Current research in computer architecture is looking at:
  Streaming computation
  Flexible polymorphous computing systems
  Multi-core architecture
  Heterogeneous architecture
GPU-like architectures have a lot in common with all these research trends!
33
GPUs and Computer Architecture
Current research in computer architecture is looking at:
  Streaming computation
  Flexible polymorphous computing systems
  Multi-core architecture
  Heterogeneous architecture
GPU-like architectures have a lot in common with all these research trends!
We plan to touch on many of these topics as part of the course!
34
Is There a Future of GPGPU?
One of the Five Disruptive Technologies for 2007
SuperComputing’s Next Revolution
35
Capabilities of Current GPUs
Modern GPUs are deeply programmable
  Programmable pixel, vertex, video engines
  Solidifying high-level language support
Modern GPUs support 32-bit floating point precision
  Great development in the last few years
  64-bit arithmetic may be coming soon
  Almost IEEE FP compliant
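To make the 32-bit versus 64-bit precision point concrete, here is a small sketch that emulates single-precision rounding on the CPU with Python's standard `struct` module. The helper name `to_float32` is an illustrative choice, and the behavior shown is that of IEEE single precision on the host, not of any particular GPU.

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

# A 32-bit float has a 24-bit significand, so not every integer
# above 2**24 is representable.
big = 2 ** 24                        # 16777216
exact = to_float32(float(big))       # representable exactly
lost = to_float32(float(big + 1))    # rounds back to 16777216.0

# Decimal fractions pick up a larger rounding error than in 64-bit.
tenth_error = abs(to_float32(0.1) - 0.1)
```

Algorithms that accumulate many values or rely on large integer indices stored as floats are the ones most likely to notice this limit.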
36
The Potential of GPGP
The power and flexibility of GPUs make them an attractive platform for general-purpose computation
Example applications range from in-game physics simulation and geometric applications to conventional computational science
Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor
Check out
37
GPGP: Challenges
GPUs are designed for and driven by video games
  Programming model is unusual & tied to computer graphics
  Programming environment is tightly constrained
Underlying architectures are:
  Inherently parallel
  Rapidly evolving (even in basic feature set!)
  Largely secret
No clear standards (besides DirectX imposed by MSFT)
Can’t simply “port” code written for the CPU!
Is there a formal class of problems that can be solved using current GPUs?
38
Importance of Data Parallelism
GPUs are designed for the graphics and gaming industry
  Highly parallel tasks
GPUs process independent vertices & fragments
  Temporary registers are zeroed
  No shared or static data
  No read-modify-write buffers
Data-parallel processing
  GPU architecture is ALU-heavy
  Multiple vertex & pixel pipelines, multiple ALUs per pipe
  Hide memory latency (with more computation)
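The constraints on this slide (independent elements, no shared or static data, no read-modify-write) can be sketched on the CPU as a pure function applied at every grid position. This is only an illustration of the programming model: `run_kernel` and `saxpy_kernel` are hypothetical names, and the sequential loops stand in for what a GPU would execute in parallel.

```python
def run_kernel(kernel, width, height, *inputs):
    """Apply `kernel` independently at every (x, y); results go to a new buffer."""
    return [[kernel(x, y, *inputs) for x in range(width)] for y in range(height)]

def saxpy_kernel(x, y, a, xs, ys):
    # Reads only its inputs and returns one output value;
    # it keeps no shared or static state and never updates a buffer in place.
    return a * xs[y][x] + ys[y][x]

xs = [[1.0, 2.0], [3.0, 4.0]]
ys = [[10.0, 20.0], [30.0, 40.0]]
out = run_kernel(saxpy_kernel, 2, 2, 2.0, xs, ys)
```

Because no invocation depends on another, the evaluation order is irrelevant, which is what allows the work to be spread across many pipelines.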
39
GPGPU Applications
Geometric computations
Database computations
Scientific computing and physical simulation
Signal processing
Computer vision
Efficient when the computation domain is a uniform grid
40
Geometric Computations
Distance computations: data-parallel computation
Demo (2D)
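One way to see why distance computation is data-parallel is a brute-force 2D distance field, sketched below. This is a plain CPU illustration, not the method used in the demo; the function name `distance_field` and the point sites are assumptions.

```python
import math

def distance_field(width, height, sites):
    """For every grid cell, the Euclidean distance to the nearest site."""
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            # Each cell depends only on its own coordinates and the site list,
            # so all cells could be evaluated in parallel.
            row.append(min(math.hypot(x - sx, y - sy) for sx, sy in sites))
        field.append(row)
    return field

field = distance_field(4, 3, [(0, 0), (3, 2)])
```

Here `field[y][x]` holds the distance for cell `(x, y)`; cells that coincide with a site get distance 0.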
41
Geometric Computations
Distance computations
42
Geometric Computations
Collision Detection and Proximity Computations
GPU: a culling co-processor
[Pipeline diagram labels: N objects -> Stage 1 culling (GPU-based culling) -> potential colliding set -> exact overlap tests (CPU) -> collision; distance-based culling (GPU) -> potential neighbor set -> exact tests (CPU) -> distance]
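The two-stage structure (conservative culling to a potential colliding set, then exact tests) can be sketched as follows. This sketch swaps in axis-aligned bounding boxes for the culling stage and uses circles as the objects; both are assumptions for illustration, everything runs on the CPU, and it is not the GPU-based culling method the slide refers to.

```python
from itertools import combinations

def aabb(circle):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y) of (cx, cy, r)."""
    cx, cy, r = circle
    return (cx - r, cy - r, cx + r, cy + r)

def aabb_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cull(circles):
    """Stage 1: conservative culling, yielding the potentially colliding pairs."""
    boxes = [aabb(c) for c in circles]
    return [(i, j) for i, j in combinations(range(len(circles)), 2)
            if aabb_overlap(boxes[i], boxes[j])]

def exact_test(a, b):
    """Stage 2: exact circle-circle overlap test."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    return dx * dx + dy * dy <= (a[2] + b[2]) ** 2

def collisions(circles):
    return [(i, j) for i, j in cull(circles) if exact_test(circles[i], circles[j])]

circles = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.8, 1.8, 1.0), (10.0, 10.0, 1.0)]
pcs = cull(circles)         # potential colliding set
hits = collisions(circles)  # pairs that actually overlap
```

The cheap stage may keep pairs that do not collide (here the pair `(0, 2)`), but it must never discard a pair that does, so the expensive exact test only runs on a small candidate set.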
43
Geometric Computations
Collision Detection
44
Geometric Computations
Proximity Computations
45
Database Computations
46
Physical Simulation
Solving PDEs
Numerical methods
Linear algebra
Reaction-diffusion demo
Fluid demo
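As a minimal sketch of the kind of grid-based PDE solver that maps well to this model, the code below runs Jacobi relaxation for Laplace's equation. The choice of Jacobi, the 5x5 grid, and the boundary values are assumptions for illustration and are not taken from the demos.

```python
def jacobi_step(u):
    """One Jacobi sweep for Laplace's equation; boundary cells stay fixed."""
    h, w = len(u), len(u[0])
    new = [row[:] for row in u]   # write into a separate buffer, never in place
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Each interior cell becomes the average of its four neighbors.
            new[y][x] = 0.25 * (u[y - 1][x] + u[y + 1][x] + u[y][x - 1] + u[y][x + 1])
    return new

# 5x5 grid: top edge held at 1.0, everything else starts at 0.0
grid = [[1.0] * 5] + [[0.0] * 5 for _ in range(4)]
for _ in range(200):
    grid = jacobi_step(grid)
```

Every sweep reads only the previous buffer and writes a fresh one, which matches the restriction that outputs cannot be read and modified in the same pass.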
47
Signal Processing
FFT, DCT, video processing
DCT demo
Video filtering demo
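For reference, a direct (quadratic-time) evaluation of the DCT-II is sketched below. The unnormalized convention and the function name `dct2` are assumptions; the demos presumably use a faster formulation.

```python
import math

def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
                for n, x in enumerate(signal))
            for k in range(n_samples)]

# A constant signal has all of its energy in the k = 0 coefficient.
coeffs = dct2([1.0, 1.0, 1.0, 1.0])
```

Each output coefficient is computed independently from the same input, so the transform fits the data-parallel pattern discussed earlier in the deck.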
48
Computer Vision Realtime feature tracker (KLT)
50
Goals of this Course
A detailed introduction to general-purpose computing on graphics hardware
Emphasis includes:
  Core computational building blocks
  Strategies and tools for programming GPUs
  Cover many applications and explore new applications
  Highlight major research issues
51
Course Organization
Survey lectures
  Instructors, other faculty, senior graduate students
  Breadth and depth coverage
Student presentations
52
Course Contents
Overview of GPUs: architecture and features
Models of computation for GPU-based algorithms
System issues: cache and data management; languages and compilers
Numerical and scientific computations: linear algebra computations, optimization, FFT, rigid body simulation, fluid dynamics
Geometric computations: proximity computations; distance fields; motion planning and navigation
Database computations: database queries (predicates, booleans, aggregates); streaming databases and data mining; sorting & searching
GPU clusters: parallel computing environments for GPUs
Rendering: ray tracing, photon mapping; shadows
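To give a flavor of the database topics listed above, the sketch below evaluates predicates per record, combines them with a boolean AND, and aggregates by pairwise reduction. The function names, the `ages` column, and the query are hypothetical, and the code runs sequentially on the CPU purely to show the structure.

```python
def evaluate_predicate(column, predicate):
    """Data-parallel step: test every record independently, producing a 0/1 mask."""
    return [1 if predicate(v) else 0 for v in column]

def tree_reduce(values, op, identity):
    """Aggregate by pairwise reduction; each pass halves the number of elements."""
    vals = list(values) or [identity]
    while len(vals) > 1:
        if len(vals) % 2:
            vals.append(identity)   # pad so every element has a partner
        vals = [op(vals[i], vals[i + 1]) for i in range(0, len(vals), 2)]
    return vals[0]

ages = [23, 35, 41, 19, 52, 30]
mask_a = evaluate_predicate(ages, lambda a: a >= 30)
mask_b = evaluate_predicate(ages, lambda a: a < 50)
both = [a & b for a, b in zip(mask_a, mask_b)]     # boolean AND of two predicates
count = tree_reduce(both, lambda a, b: a + b, 0)   # COUNT where 30 <= age < 50
oldest = tree_reduce(ages, max, float("-inf"))     # MAX(age)
```

The predicate pass is a per-record map and the aggregate is a logarithmic-depth reduction, two building blocks that recur across the other application areas as well.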
53
Student Load
Stay awake in classes!
One class lecture
Read a lot of papers
2-3 small assignments
54
Student Load
Stay awake in classes!
One class lecture
Read a lot of papers
2-3 small assignments
A MAJOR COURSE PROJECT WITH RESEARCH COMPONENT
55
Course Projects
Work by yourself or as part of a small team
Develop new algorithms for simulation, geometric problems, database computations
Formal model for GPU algorithms or GPU hacking
Issues in developing GPU clusters for scientific computation
Look into new architecture and parallel programming trends
56
Possible Course Projects
Check the WWW site