Published by Tyler Mooney. Modified over 10 years ago.
1
computer graphics & visualization GP-GPU
2
computer graphics & visualization Image Synthesis – WS 07/08 Dr. Jens Krüger – Computer Graphics and Visualization Group
Graphics hardware
Current performance – PlayStation 3
CPU: Cell processor (3.2 GHz)
– 512 kB L2 cache
– ~200 GFLOP/s
GPU (Graphics Processing Unit)
– Nvidia RSX Reality Synthesizer (550 MHz, ~300 M transistors)
– ~1.8 TFLOP/s
– ~20 GPixels/s
– ~2 GTriangles/s
3
Graphics hardware – history
1980: simple rasterization
– Windows, lines, polygons, text fonts
1990-95: geometry engines only on high-end workstations
– (e.g. SGI O2 vs. Indigo2)
1995: new rasterization functionality
– Realism by texturing, e.g. SGI Infinite Reality
1998: geometry processor (T&L) on PC graphics
2000: PC graphics achieves performance similar to high-end workstations
– 3D is becoming standard in Aldi PCs
2001: PC graphics offers new functionality
– Multitexturing, vertex and pixel shaders
2002: DirectX Level 9.0 hardware
– High-level shader languages
2006: DirectX Level 10.0 hardware
– Geometry shader
4
Trends in graphics hardware
Number of transistors doubles every 6 months
Advances in performance and functionality
[Figure: transistor count (millions) over time (month/year), from 9/97 onward. Labelled chips: Riva 128 (3M), GeForce3 (57M), R200 (60M), GeForceFX / ATI Radeon 9800 (9/02, 150), ATI R520 (300)]
5
Trends in graphics hardware
Grows faster than Moore's law predicts
[Figure: performance over time, with curves for network, graphics and CPU]
6
Parallel graphics hardware
Graphics hardware has always been parallel
– Internal, on chip or board
  Multiple rasterizers serve one frame buffer
– Multi-pipe
  Multiple graphics cards in one system for one or multiple displays
  Multiple geometry engines
– Distributed graphics
  Multiple nodes in a connected cluster, each with one or multiple cards, serve one or multiple displays driven by one application
7
Graphics architectures
State-of-the-art GPUs
– Highly parallel stream architecture
  A stream of vertices/fragments is processed
  Pipelined and SIMD parallel processing
  – SIMD: a single set of instructions is applied to multiple stream elements
– Specifies a new rendering pipeline
  Additional stages that a vertex or a fragment passes through
– Specifies new (vendor-specific) OpenGL extensions
– Allows for new classes of algorithms
– May make programs platform dependent
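The stream model above can be sketched on the CPU. This is a minimal illustration, not GPU code: `run_kernel` and `scale_vertex` are hypothetical names introduced here, and the loop runs sequentially where a GPU would process many elements at once.

```python
def run_kernel(kernel, stream):
    """Apply one kernel (a single set of instructions) to every stream element.

    On a GPU the elements are processed in parallel; this CPU sketch
    processes them one after another, which gives the same result because
    the kernel has no side effects and sees only its own element.
    """
    return [kernel(element) for element in stream]


def scale_vertex(vertex):
    # Hypothetical per-vertex kernel: uniform scaling by a factor of 2.
    x, y, z = vertex
    return (2.0 * x, 2.0 * y, 2.0 * z)


vertices = [(0.0, 0.0, 0.0), (1.0, 0.5, -1.0), (0.25, 2.0, 4.0)]
transformed = run_kernel(scale_vertex, vertices)
```

Because the kernel cannot see any element other than its own, the order of evaluation does not matter, which is what makes the SIMD execution possible.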
8
Graphics architectures
State-of-the-Art GPUs (G80)
9
Graphics architectures
State-of-the-art GPUs
– Multiple (texture) render targets
– Up to 2 GB video memory
– Floating point textures (4 x 32 bit)
– Internal computations in float/double precision
– Z-cull: discards fragments that will fail the depth test before they enter the pixel pipelines
– Dynamic flow control: per-vertex/geometry/fragment specific operations (if-then-else)
– PCIe: serial, point-to-point protocol with dual channels that provide bandwidth in both directions (upload/download)
– Fixed fragment-to-pixel binding, i.e. a fragment at (x, y) cannot be written to a different pixel (x', y')
  No scattering (at least not in DX/GL), only gathering
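The gather-only restriction can be made concrete with a small CPU sketch. The function names are hypothetical and the 1D arrays stand in for textures; the second function shows one common way to emulate a scatter when each output position may only write to itself.

```python
def gather_blur(src):
    """Gather: output element i reads inputs i-1, i, i+1 and writes only to i."""
    n = len(src)
    out = [0.0] * n
    for i in range(n):  # one "fragment" per output position
        left = src[max(i - 1, 0)]       # clamp at the borders
        right = src[min(i + 1, n - 1)]
        out[i] = (left + src[i] + right) / 3.0
    return out


def scatter_as_gather(values, targets, size):
    """Emulate a scatter (value k goes to position targets[k]) with gathers.

    Each output position searches the inputs for values addressed to it,
    so every write still lands on the fragment's own position.
    """
    out = [0.0] * size
    for i in range(size):
        out[i] = sum((v for v, t in zip(values, targets) if t == i), 0.0)
    return out
```

The emulation costs one pass over all inputs per output position, which is why algorithms are usually reformulated as gathers rather than scatters on this kind of hardware.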
10
Graphics architectures
State-of-the-Art programmable GPUs
11
Graphics architectures
State-of-the-Art programmable GPUs
12
GP-GPU Water
13
Programmable graphics hardware
Displacement mapping
[Figure: diagram with the labels Simulation, generates height field texture, Displacer, static grid, Rendering, water surface]
14
Programmable graphics hardware
GPU memory objects
– Semantics can be specified for a chunk of memory
– A memory object can be a texture, a vertex array or a frame buffer object
  What was a texture render target in the current pass becomes a vertex array in the upcoming pass
– Texture elements can be interpreted as vertex attributes without any copy operations (not in OpenGL)
– The same effect can be achieved with a vertex texture fetch, but this fetch slows down performance
15
Programmable graphics hardware
Example
– Computation of height values u at the vertices of a 2D grid
– Starting with an initial distribution, compute the evolution over time t
[Figure: 2D grid in the x-y plane with spacing h, showing vertex P_ij and its eight neighbours P_i±1,j±1]
16
Programmable graphics hardware
Algorithm:
– Load the initial height values (N_x × N_y) as 2D textures (sGridPrev, sGrid)
– Upload the fragment shader (render to sGridNew):
    void PerPixelSim(float2 fragpos : TEXCOORD0, out float4 height : COLOR0)
    {
        // height of this grid point at the previous time step
        float4 centerPrev = tex2D(sGridPrev, fragpos);
        // offset of one texel to the left in texture coordinates
        float2 leftIndex = float2(-1.0 / TexSize, 0.0);
        float4 left = tex2D(sGrid, fragpos + leftIndex);
        // same for right, upper, lower, center
        height = f(left, right, upper, lower, center, centerPrev);
    }
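The per-fragment update can be mirrored on the CPU as follows. The slide leaves the update function f unspecified, so this sketch assumes the explicit finite-difference scheme for the 2D wave equation; `sim_step`, `sample` and the parameter `c2` are names introduced here for illustration only.

```python
def sim_step(grid_prev, grid, c2=0.25):
    """One time step of the height field, evaluated once per grid point.

    ASSUMPTION: the slide leaves f unspecified. Here f is the explicit
    finite-difference update of the 2D wave equation, with
    c2 = (wave speed * time step / grid spacing)^2; keep c2 <= 0.5 for stability.
    """
    ny, nx = len(grid), len(grid[0])

    def sample(tex, i, j):
        # Mimic clamp-to-edge texture addressing at the borders.
        return tex[min(max(i, 0), ny - 1)][min(max(j, 0), nx - 1)]

    grid_new = [[0.0] * nx for _ in range(ny)]
    for i in range(ny):
        for j in range(nx):  # each (i, j) corresponds to one fragment
            center = sample(grid, i, j)
            center_prev = sample(grid_prev, i, j)
            left, right = sample(grid, i, j - 1), sample(grid, i, j + 1)
            upper, lower = sample(grid, i - 1, j), sample(grid, i + 1, j)
            laplacian = left + right + upper + lower - 4.0 * center
            grid_new[i][j] = 2.0 * center - center_prev + c2 * laplacian
    return grid_new
```

Every output value depends only on reads from the two input grids, so all grid points can be computed independently, which matches the one-fragment-per-grid-point execution on the GPU.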
17
Programmable graphics hardware
Algorithm (continued):
– Simulation:
  Render a quad that covers N_x × N_y pixels with appropriate texture coordinates (corners at texCoord = (0,0), (1,0), (1,1), (0,1))
  – N_x × N_y fragments are generated
  – The fragments are executed data-parallel
  – Rotate the texture identifiers:
    tmp = sGridPrev; sGridPrev = sGrid; sGrid = sGridNew; sGridNew = tmp
  – Display the height field in texture sGrid
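The rotation of the three texture identifiers after each pass can be sketched as below. `rotate_buffers` is a hypothetical helper; the point it illustrates is that the three roles must be exchanged simultaneously (or through a temporary), because assigning them one after another would overwrite the oldest buffer's identifier before it is reused as the next render target.

```python
def rotate_buffers(grid_prev, grid, grid_new):
    """Rotate the three buffer roles after a simulation pass.

    The oldest buffer (grid_prev) is recycled as the next render target.
    Only references are exchanged; no height data is copied.
    """
    return grid, grid_new, grid_prev


# Placeholder buffers labelled by the time step they hold.
prev_buf, cur_buf, new_buf = ["t-1"], ["t"], ["t+1"]
prev_buf, cur_buf, new_buf = rotate_buffers(prev_buf, cur_buf, new_buf)
```

After the call, the buffer that held step t is the previous state, the freshly rendered step t+1 is the current state, and the old t-1 buffer is free to be overwritten in the next pass.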
18
Programmable graphics hardware
Algorithm (continued):
– Display:
  Upload the fragment shader (render to color buffer):
    void PerPixelRefract(float2 fragpos : TEXCOORD0, out float4 color : COLOR0)
    {
        // rightIndex and upperIndex are one-texel offsets, analogous to leftIndex
        float3 tangent = float3(1.0, 0.0, tex2D(sGrid, fragpos + rightIndex).r - tex2D(sGrid, fragpos).r);
        float3 binormal = float3(0.0, 1.0, tex2D(sGrid, fragpos + upperIndex).r - tex2D(sGrid, fragpos).r);
        float3 normal = normalize(cross(tangent, binormal));
        // renamed from "refract" to avoid clashing with the intrinsic of the same name
        float2 refractOffset = f(normal, refractionIndex);
        color = tex2D(sBackground, fragpos + refractOffset);
    }
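The normal reconstruction in the display pass can be checked with a CPU sketch. The forward differences and the cross product follow the shader; `refraction_offset` is only a stand-in, since the slide does not specify f(normal, refractionIndex), and the `strength` parameter is an assumption made for illustration.

```python
import math


def surface_normal(height, i, j):
    """Unit normal of the height field at (i, j) from forward differences."""
    ny, nx = len(height), len(height[0])
    h = height[i][j]
    dx = height[i][min(j + 1, nx - 1)] - h   # height change towards the right texel
    dy = height[min(i + 1, ny - 1)][j] - h   # height change towards the next row
    # tangent = (1, 0, dx), binormal = (0, 1, dy); their cross product is (-dx, -dy, 1).
    nxv, nyv, nzv = -dx, -dy, 1.0
    length = math.sqrt(nxv * nxv + nyv * nyv + nzv * nzv)
    return (nxv / length, nyv / length, nzv / length)


def refraction_offset(normal, strength=0.1):
    # ASSUMPTION: stand-in for the unspecified f(normal, refractionIndex);
    # shifts the background lookup along the normal's horizontal part.
    return (-strength * normal[0], -strength * normal[1])
```

On a flat surface the normal points straight up and the offset vanishes, so the background is sampled undistorted; the steeper the local slope, the further the lookup is displaced.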
19
GPGPU Particle Tracing
20
GPU particle tracing
21
GPU particle tracing
[Figure: pipeline diagram with the stages Input Assembler, Rasterizer and Output Merger]
22
Programmable graphics hardware
Demonstration