Published by Ashlyn Kelly. Modified over 9 years ago.
SIC / CoC / Georgia Tech MAGIC Lab http://www.gvu.gatech.edu/~jarek Jarek Rossignac
GPU Precision, Power, Programmability
– CPU: x60/decade, 6 GFLOPS, 6 GB/sec
– GPU: x1000/decade, 20 GFLOPS, 25 GB/sec
– Arithmetic heavy (read OR write): faster hardware
– Parallelization
– Multi-billion $ entertainment market drives innovation
– 32-bit floating point
– Programmable (graphics, physics, general purpose data-flow)
– Can’t simply “port” CPU code to GPU
David Luebke et al. GPGPU, SIGGRAPH 2004
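The per-decade factors are easier to compare as yearly rates. The following is a minimal sketch of that arithmetic, assuming steady compound growth over the decade; the function names are illustrative and not from the slides.

```python
import math

def annual_factor(decade_factor: float) -> float:
    """Yearly growth factor that compounds to decade_factor over 10 years."""
    return decade_factor ** (1.0 / 10.0)

def doubling_time_years(decade_factor: float) -> float:
    """Years needed to double at that steady yearly rate."""
    return math.log(2.0) / math.log(annual_factor(decade_factor))

cpu_rate = annual_factor(60.0)     # CPU: x60 per decade (figure from the slide)
gpu_rate = annual_factor(1000.0)   # GPU: x1000 per decade (figure from the slide)

print(f"CPU: x{cpu_rate:.2f} per year, doubles every {doubling_time_years(60.0):.2f} years")
print(f"GPU: x{gpu_rate:.2f} per year, doubles every {doubling_time_years(1000.0):.2f} years")
```

Under that assumption, x60/decade is roughly 1.5x per year and x1000/decade is roughly 2x per year.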
History of the 3D graphics industry
60s:
– Line drawings, hidden lines, parametric surfaces (B-splines…)
– Automated drafting & machining for car, airplane, and ship manufacturers
70s:
– Mainframes, vector tubes (HP…)
– Software: Solids (CSG), Ray Tracing, Z-buffer for hidden lines
80s:
– Graphics workstations ($50K-$1M): Frame buffers, rasterizers, GL, PHIGS
– VR: CAVEs and head-mounted displays
– CAD/CAM & GIS: CATIA, SDRC, PTC
– Sun, HP, IBM, SGI, E&S, DEC
90s:
– PCs ($2K): Graphics boards, OpenGL, Java3D
– CAD + Videogames + Animations: AutoCAD, SolidWorks…, Alias-Wavefront
– Intel, many board vendors
00s:
– Laptops, PDAs, Cell Phones: Parallel graphics chips
– Everything will be graphics, 3D, animated, interactive
– Nvidia, Sony, Nokia
History of GPU
Pre-GPU Graphics Acceleration
– SGI, Evans & Sutherland. Introduced concepts like vertex transformation and texture mapping. Very expensive!
First-Generation GPU (up to 1998)
– Nvidia TNT2, ATI Rage, Voodoo3. Vertex transformation on CPU, limited set of math operations.
Second-Generation GPU (1999-2000)
– GeForce 256, GeForce2, Radeon 7500, Savage3D. Transformation & Lighting. More configurable, still not programmable.
Third-Generation GPU (2001)
– GeForce3, GeForce4 Ti, Xbox, Radeon 8500. Vertex programmability, pixel-level configurability.
Fourth-Generation GPU (2002-)
– GeForce FX series, Radeon 9700 and on. Vertex-level and pixel-level programmability.
Architecture
Pipeline stages (from the slide diagram): Application > Vertex Shader > Geometry Shader > Rasterizer > Fragment Shader > Compositor > Display
Data labels in the diagram:
– transformed vertices, normals, colors
– fragments (surfels per pixel)
– pixel color, depth, stencil
– texture
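To make the hand-off between stages concrete, here is a toy software version of the pipeline. It is a sketch under stated assumptions, not how any real GPU or driver works: the geometry shader is omitted, the vertex shader is only a 2D translation, each triangle has a constant color, and smaller depth is treated as nearer. All function names are illustrative.

```python
def vertex_shader(vertex, offset):
    """Transform one vertex (here just a 2D translation; depth is unchanged)."""
    x, y, z = vertex
    return (x + offset[0], y + offset[1], z)

def edge(a, b, p):
    """Signed-area test: tells which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(triangle, width, height):
    """Yield one fragment (x, y, depth) per pixel center covered by the triangle."""
    a, b, c = triangle
    area = edge(a, b, c)
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)
            # Barycentric weights of p with respect to a, b, c.
            wa = edge(b, c, p) / area
            wb = edge(c, a, p) / area
            wc = edge(a, b, p) / area
            if wa >= 0 and wb >= 0 and wc >= 0:
                yield (x, y, wa * a[2] + wb * b[2] + wc * c[2])

def fragment_shader(fragment, color):
    """Compute the color of one fragment (here a constant per triangle)."""
    x, y, depth = fragment
    return (x, y, depth, color)

def composite(framebuffer, shaded):
    """Depth test (assumption: smaller depth is nearer), then write the pixel."""
    x, y, depth, color = shaded
    if (x, y) not in framebuffer or depth < framebuffer[(x, y)][0]:
        framebuffer[(x, y)] = (depth, color)

def draw(framebuffer, triangle, color, offset=(0.0, 0.0), size=(4, 4)):
    """Application side: push one triangle through all stages."""
    transformed = [vertex_shader(v, offset) for v in triangle]
    for fragment in rasterize(transformed, *size):
        composite(framebuffer, fragment_shader(fragment, color))

framebuffer = {}
draw(framebuffer, [(0, 0, 0.8), (4, 0, 0.8), (0, 4, 0.8)], "red")   # far, large
draw(framebuffer, [(0, 0, 0.2), (2, 0, 0.2), (0, 2, 0.2)], "blue")  # near, small
```

The small blue triangle overwrites red only where it covers a pixel center, because its fragments pass the depth test in the compositor.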
Buffers
Color: 8-bit index to color table, float/16-bit true color…
Depth: 24-bit or float (0 at back plane)
Back and front: display front, update back, swap
Stereo: Shutter glasses, HMD. Alternate frames
Auxiliary: off-screen working space. Helps reduce passes.
Stencil: 8 bits (left-over of depth buffer). … mask, ++
Accumulation: sum, scale (supersampling, blur)
P-buffer, superbuffers: Render to texture
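The "display front, update back, swap" cycle can be sketched in a few lines. This is an illustration only: the class and method names are made up, and the buffers are plain Python lists rather than video memory.

```python
class DoubleBuffer:
    """Two color buffers: the front one is displayed, the back one is drawn into."""

    def __init__(self, width, height, clear_color=0):
        self.front = [[clear_color] * width for _ in range(height)]
        self.back = [[clear_color] * width for _ in range(height)]

    def draw_pixel(self, x, y, color):
        # All rendering goes to the back buffer, never to the displayed one.
        self.back[y][x] = color

    def swap(self):
        # Exchange roles, so the finished frame becomes visible in one step.
        self.front, self.back = self.back, self.front

    def displayed(self, x, y):
        return self.front[y][x]

buffers = DoubleBuffer(2, 2)
buffers.draw_pixel(0, 0, 255)
before_swap = buffers.displayed(0, 0)   # viewer still sees the old frame
buffers.swap()
after_swap = buffers.displayed(0, 0)    # viewer now sees the new frame
```

Because drawing never touches the front buffer, the viewer does not see a partially drawn frame.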
Fragment operations
Depth tests: <, <=, ==, Z depth-interval
Stencil test: mask?, counter, parity.
Alpha test: compare to reference alpha
Alpha blending: +, max, min, replace, blend
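A per-fragment sketch of these tests, run in the order alpha, stencil, depth, then blending. Several details are assumptions made for illustration: the alpha test passes when alpha exceeds the reference, the stencil test is a plain equality check, smaller depth is nearer, and "add" clamps to 1.0. The dictionary layout and function names are hypothetical.

```python
import operator

# Depth comparison functions named on the slide.
DEPTH_TESTS = {"<": operator.lt, "<=": operator.le, "==": operator.eq}

def blend(src, dst, mode):
    """Combine incoming (src) and stored (dst) RGBA colors."""
    if mode == "replace":
        return src
    if mode == "add":
        return tuple(min(1.0, s + d) for s, d in zip(src, dst))
    if mode == "max":
        return tuple(max(s, d) for s, d in zip(src, dst))
    if mode == "min":
        return tuple(min(s, d) for s, d in zip(src, dst))
    if mode == "blend":
        alpha = src[3]
        return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))
    raise ValueError(f"unknown blend mode: {mode}")

def process_fragment(pixel, color, depth, *, depth_test="<", alpha_ref=None,
                     stencil_ref=None, blend_mode="replace"):
    """Run the tests in order; write the pixel and return True only if all pass."""
    if alpha_ref is not None and color[3] <= alpha_ref:
        return False                      # alpha test failed
    if stencil_ref is not None and pixel["stencil"] != stencil_ref:
        return False                      # stencil test failed
    if not DEPTH_TESTS[depth_test](depth, pixel["depth"]):
        return False                      # depth test failed
    pixel["color"] = blend(color, pixel["color"], blend_mode)
    pixel["depth"] = depth
    return True

pixel = {"color": (0.0, 0.0, 0.0, 1.0), "depth": 1.0, "stencil": 1}
wrote_near = process_fragment(pixel, (1.0, 0.0, 0.0, 0.5), 0.5, blend_mode="blend")
wrote_far = process_fragment(pixel, (0.0, 1.0, 0.0, 1.0), 0.9)
wrote_masked = process_fragment(pixel, (0.0, 0.0, 1.0, 1.0), 0.1, stencil_ref=0)
```

Only the first fragment is written: the second is behind the stored depth, and the third is rejected by the stencil comparison before the depth test runs.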
Data Parallelism in GPUs
Data flow: vertices > fragments > pixels
Parallelism at each stage
No shared or static data (except textures)
ALU-heavy (multiple ALUs per stage in pipe)
Fight memory latency with more computation
GPGPU
Stream: collection of records (pixels, vertices…)
– Stored in textures (a computational grid)
Kernel: function applied to each element in stream
– Transform, evolve (no dependency between records)
Matrix algebra
Image/volume processing
Physical simulation
Global illumination
– Ray tracing
– Photon mapping
– Radiosity
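The stream/kernel model can be sketched on the CPU as a function mapped over a 2D grid of records. This is only an illustration of the programming model: the records (position, velocity), the time step `DT`, and the function names are assumptions, and a real GPU would run the kernel invocations in parallel.

```python
DT = 0.1  # hypothetical time step for the "evolve" kernel

def run_kernel(kernel, stream):
    """Apply the kernel to every record independently; evaluation order is irrelevant."""
    return [[kernel(record) for record in row] for row in stream]

def evolve(record):
    """Kernel: advance one (position, velocity) record; it sees no other record."""
    position, velocity = record
    return (position + velocity * DT, velocity)

# The stream is stored as a 2x2 "texture" of records.
texture = [[(0.0, 1.0), (1.0, -1.0)],
           [(2.0, 0.0), (3.0, 2.0)]]
next_texture = run_kernel(evolve, texture)
```

Since each output depends only on its own input record, the grid can be processed in any order or all at once, which is the property the GPU exploits.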
Computational Resources
Programmable parallel processors
– Vertex & Fragment pipelines
Rasterizer
– Mostly useful for interpolating addresses (texture coordinates) and per-vertex constants
Texture unit
– Read-only memory interface
Render to texture (or Copy to texture)
– Write-only memory interface
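Because a pass can only read from one texture and only write to another, multi-pass computations commonly alternate two buffers ("ping-pong"). The sketch below assumes a 1D texture and a wrap-around addressing kernel chosen purely for illustration; the names are hypothetical.

```python
def iterate(kernel, texture, passes):
    """Run several passes, swapping the read-only source and write-only target."""
    source = list(texture)
    target = [None] * len(texture)
    for _ in range(passes):
        for i in range(len(source)):
            target[i] = kernel(source, i)   # reads only source, writes only target
        source, target = target, source     # last output becomes next input
    return source

def shift_left(source, i):
    """Kernel: fetch the right-hand neighbor, wrapping around at the end."""
    return source[(i + 1) % len(source)]

result = iterate(shift_left, [1, 2, 3, 4], passes=2)
```

Updating a single buffer in place would let later elements read values already overwritten in the same pass; keeping source and target separate avoids that.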
Vertex Processor
Fully programmable (SIMD / MIMD)
Processes 4-vectors (RGBA / XYZW)
Capable of scatter but not gather (A[i,j]=x;)
– Can change the location of current vertex
– Cannot read info from other vertices
– Can only read a small constant memory
Vertex Texture Fetch
– Random access memory for vertices
– Arguably still not gather
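Scatter means the program chooses where its result lands. A minimal sketch, assuming each vertex carries one data sample and that overlapping writes are summed (additive blending, which is an assumption of this example); the histogram use case and all names are illustrative rather than taken from the slides.

```python
NUM_BINS = 4

def scatter_pass(vertices, vertex_program, width, height):
    """Each vertex computes its own output address; the value is written there."""
    grid = [[0.0] * width for _ in range(height)]
    for vertex in vertices:
        (x, y), value = vertex_program(vertex)
        if 0 <= x < width and 0 <= y < height:   # off-grid vertices are clipped
            grid[y][x] += value                  # assumed additive blending
    return grid

def to_bin(sample):
    """Vertex program: move the vertex to the bin that its sample falls in."""
    column = min(int(sample * NUM_BINS), NUM_BINS - 1)
    return ((column, 0), 1.0)

samples = [0.05, 0.1, 0.3, 0.6, 0.95, 0.99]
histogram = scatter_pass(samples, to_bin, NUM_BINS, 1)[0]
```

Note that `to_bin` sees only its own sample, matching the restriction that a vertex cannot read other vertices.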
Fragment Processor
May be invoked at each pixel by drawing a full-screen quad
Fully programmable (SIMD)
Processes 4-vectors (RGBA / XYZW)
Random access memory read (textures)
Capable of gather (x=A[i+1,j];) and some scatter
– RAM read (texture), but no RAM write
– Output address fixed to a specific pixel
– But can change that address
Typically more useful than vertex processor
– More fragment pipelines than vertex pipelines
– Gather
– Direct output (fragment processor is at end of pipeline)
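Gather is the mirror image of scatter: the output address is fixed to the current pixel, but the program may read from any texture location. The sketch below runs a 5-point averaging filter once per pixel, as a full-screen quad would; the clamp-to-edge addressing, the filter choice, and the names are assumptions for illustration.

```python
def fetch(texture, x, y):
    """Texture read with clamp-to-edge addressing (an assumption of this sketch)."""
    height, width = len(texture), len(texture[0])
    return texture[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

def box_blur(texture, x, y):
    """Fragment program: gather the pixel and its 4 neighbors, return their mean."""
    total = (fetch(texture, x, y)
             + fetch(texture, x - 1, y) + fetch(texture, x + 1, y)
             + fetch(texture, x, y - 1) + fetch(texture, x, y + 1))
    return total / 5.0

def full_screen_pass(fragment_program, texture):
    """Invoke the program once per pixel; each call writes only its own pixel."""
    return [[fragment_program(texture, x, y) for x in range(len(texture[0]))]
            for y in range(len(texture))]

image = [[0.0, 0.0, 0.0],
         [0.0, 5.0, 0.0],
         [0.0, 0.0, 0.0]]
blurred = full_screen_pass(box_blur, image)
```

The bright center value spreads to its four neighbors, while the corners stay at zero because none of their five reads touch the center.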
Branching
Not supported or expensive
Avoid, replace by math
Depth test
Stencil test
Occlusion query (conditional execution)
Pre-computation (region of interest, use to set stencil mask)
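"Replace by math" usually means computing both outcomes and selecting between them with arithmetic. A small sketch, where `step` and `mix` are Python stand-ins that mirror the shading-language built-ins of the same names, and the shading functions are hypothetical examples:

```python
def step(edge, x):
    """1.0 where x >= edge, else 0.0 (same convention as the GLSL built-in)."""
    return float(x >= edge)

def mix(a, b, t):
    """Linear blend: returns a when t == 0 and b when t == 1."""
    return a * (1.0 - t) + b * t

def shade_branching(x, threshold, dark, bright):
    """Reference version with an explicit branch."""
    if x >= threshold:
        return bright
    return dark

def shade_branch_free(x, threshold, dark, bright):
    """Same result: the comparison becomes a 0/1 weight used in a blend."""
    return mix(dark, bright, step(threshold, x))

values = [0.0, 0.25, 0.5, 0.75, 1.0]
same = all(shade_branching(v, 0.5, 0.2, 0.9) == shade_branch_free(v, 0.5, 0.2, 0.9)
           for v in values)
```

The trade-off is that both alternatives are always evaluated, so this pays off mainly when each side is cheap.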