Published by Raymond Flowers. Modified over 8 years ago.
Martin Kruliš 26. 11. 2015 by Martin Kruliš (v1.0)
Interoperability
◦ Allows CUDA code to read/write graphical buffers; works with the OpenGL and Direct3D libraries
◦ Motivation: direct visualization of complex simulations; augmenting 3D rendering with visualization routines that are difficult to implement in shaders
◦ How it works: the graphics resource is registered and represented by struct cudaGraphicsResource; the resource may then be mapped to the CUDA memory space with cudaGraphicsMapResources(), …
Initialization
◦ Device must be selected by cudaGLSetGLDevice()
Resources
◦ cudaGraphicsGLRegisterBuffer() for buffers: the mapped buffers can be accessed in the same way as CUDA-allocated memory
◦ cudaGraphicsGLRegisterImage() for images and render buffers: the image buffers can also be accessed through the texture and surface mechanisms
Direct3D Support
◦ Versions 9, 10, and 11 are supported; each version has its own API
◦ A CUDA context may operate with one Direct3D device at a time, and a special HW mode must be set on the device
◦ Initialization is similar to OpenGL: cudaD3D[9|10|11]SetDirect3DDevice()
◦ Available Direct3D resources: buffers, textures, and surfaces, all registered using cudaGraphicsD3DXXRegisterResource()
GPU SLI Mode
◦ Multiple GPUs are physically interconnected and cooperate in rendering the scene; in AFR mode, different GPUs render subsequent frames
◦ CUDA interoperability issues: any CUDA allocation on one GPU is automatically performed on all SLI-connected GPUs; CUDA has to use a separate context for each GPU
◦ cudaGLGetDevices() identifies which devices are in SLI; the device list is selected by cudaGLDeviceListAll, cudaGLDeviceListCurrentFrame, or cudaGLDeviceListNextFrame
CUDA Basic Linear Algebra Subroutines
◦ CUDA implementation of the standard BLAS library
◦ Complete support of all 152 functions on vectors/matrices: copy, move, rotate, swap; maximum, minimum, multiply by scalar; sum, dot products, Euclidean norms; matrix multiplications, inverses, linear combinations
◦ Some operations have batch versions
◦ Supports floats, doubles, and complex numbers
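To make the listed vector operations concrete, here is a minimal host-only C++ sketch of what three of them compute: a linear combination (the BLAS "axpy" family), a dot product, and a Euclidean norm (the "nrm2" family). It is a CPU reference of the semantics only, not a call into the CUDA library, and the function names ref_axpy, ref_dot, and ref_nrm2 are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference helpers; they only illustrate the math.

// Linear combination: y <- alpha * x + y.
void ref_axpy(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += alpha * x[i];
    }
}

// Dot product of two equally sized vectors.
float ref_dot(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Euclidean norm: sqrt(x . x).
float ref_nrm2(const std::vector<float>& x) {
    return std::sqrt(ref_dot(x, x));
}
```

For x = (1, 2, 3) and y = (4, 5, 6), ref_dot(x, y) gives 32, and ref_axpy(2, x, y) leaves y = (6, 9, 12). A GPU library performs the same arithmetic on device memory, typically in parallel across elements.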
CUDA Sparse Linear Algebra
◦ Open source C++ library for sparse linear structures (matrices, linear systems, …)
◦ Key features: sparse matrix operations (addition, subtraction, max independent set, polynomial relaxation, …); support for various matrix formats (COO, CSR, DIA, ELL, and HYB)
◦ Requires CUDA CC 2.0 or higher
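Two of the listed formats, COO and CSR, can be illustrated with a short host-only C++ sketch: it converts a coordinate-list matrix to compressed sparse rows and multiplies it by a dense vector. This is a sequential CPU illustration under assumed type and function names (CooMatrix, CsrMatrix, coo_to_csr, csr_spmv are hypothetical), not the library's own API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// COO: one (row, col, value) triple per nonzero, in any order.
struct CooMatrix {
    std::size_t rows, cols;
    std::vector<std::size_t> row, col;
    std::vector<double> val;
};

// CSR: nonzeros grouped by row; row_ptr[r]..row_ptr[r+1] spans row r.
struct CsrMatrix {
    std::size_t rows, cols;
    std::vector<std::size_t> row_ptr, col;
    std::vector<double> val;
};

CsrMatrix coo_to_csr(const CooMatrix& a) {
    CsrMatrix m;
    m.rows = a.rows;
    m.cols = a.cols;
    m.row_ptr.assign(a.rows + 1, 0);
    m.col.resize(a.val.size());
    m.val.resize(a.val.size());
    // Count the nonzeros of each row, shifted by one slot.
    for (std::size_t r : a.row) ++m.row_ptr[r + 1];
    // A prefix sum turns the counts into row start offsets.
    for (std::size_t r = 0; r < a.rows; ++r) m.row_ptr[r + 1] += m.row_ptr[r];
    // Scatter each triple into the next free slot of its row.
    std::vector<std::size_t> next(m.row_ptr.begin(), m.row_ptr.end() - 1);
    for (std::size_t k = 0; k < a.val.size(); ++k) {
        const std::size_t dst = next[a.row[k]]++;
        m.col[dst] = a.col[k];
        m.val[dst] = a.val[k];
    }
    return m;
}

// y = M * x for a CSR matrix M and a dense vector x.
std::vector<double> csr_spmv(const CsrMatrix& m, const std::vector<double>& x) {
    assert(x.size() == m.cols);
    std::vector<double> y(m.rows, 0.0);
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t k = m.row_ptr[r]; k < m.row_ptr[r + 1]; ++k) {
            y[r] += m.val[k] * x[m.col[k]];
        }
    }
    return y;
}
```

CSR suits row-wise traversal such as this product, while COO is simpler to build incrementally; that trade-off is one reason a library offers several formats.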
CUDA Fast Fourier Transform
◦ Decomposes a signal into its frequency spectrum
◦ 1-3D transforms (up to 128M elements)
◦ Many variations (precision, complex/real types, …)
◦ API similar to the FFTW library: create a plan (cufftHandle) which holds the configuration; associate/allocate work space (buffers); cufftExecC2C() (or R2C, C2R) starts execution
◦ An FFT plan can be associated with a CUDA stream for synchronization and overlapping
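As an illustration of what "decomposing a signal into its frequency spectrum" means for a one-dimensional complex-to-complex forward transform, the following host-only C++ sketch evaluates the discrete Fourier transform directly from its definition. It is a quadratic-time reference, not a fast algorithm and not the GPU library; the name naive_dft is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Direct O(n^2) evaluation of X[k] = sum_t x[t] * exp(-2*pi*i*k*t/n).
std::vector<std::complex<double>> naive_dft(
        const std::vector<std::complex<double>>& in) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * static_cast<double>(k) *
                                 static_cast<double>(t) / static_cast<double>(n);
            // std::polar(1, angle) is the unit complex number exp(i*angle).
            sum += in[t] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}
```

Feeding it eight samples of one cosine period concentrates the energy in bins 1 and 7 (each with magnitude 4), and the remaining bins are numerically zero. An FFT produces the same values in O(n log n) time.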
CUDA Thrust
◦ C++ template library based on the STL API
◦ Basic idea is to develop C++ parallel applications with minimal overhead
◦ STL-like vectors (for devices) and vector operations: copy, fill, create sequences, reordering, sorting, …
◦ Algorithms: transformations, reductions, prefix-sums
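Because the library follows the STL API, the three algorithm families can be sketched with the standard C++ algorithms they mirror. The host-only example below uses std::iota, std::transform, std::partial_sum, and std::accumulate; the corresponding Thrust calls are thrust::sequence, thrust::transform, thrust::inclusive_scan, and thrust::reduce, which would operate on a thrust::device_vector instead. The Summary struct and the function square_scan_reduce are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Summary {
    std::vector<int> squares;  // result of the transformation
    std::vector<int> prefix;   // inclusive prefix sums of the squares
    int total;                 // reduction (sum) of the squares
};

Summary square_scan_reduce(int n) {
    assert(n >= 0);
    std::vector<int> v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), 1);  // create the sequence 1, 2, ..., n

    Summary s;
    s.squares.resize(v.size());
    // Transformation: apply a function to every element.
    std::transform(v.begin(), v.end(), s.squares.begin(),
                   [](int x) { return x * x; });

    s.prefix.resize(v.size());
    // Prefix sum: prefix[i] = squares[0] + ... + squares[i].
    std::partial_sum(s.squares.begin(), s.squares.end(), s.prefix.begin());

    // Reduction: fold all elements into one value.
    s.total = std::accumulate(s.squares.begin(), s.squares.end(), 0);
    return s;
}
```

With n = 4 the squares are 1, 4, 9, 16, the prefix sums are 1, 5, 14, 30, and the total is 30.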
GPU AI for Board Games
◦ Specific AI library designed for games with a large but well-defined configuration space
◦ Requires CUDA CC 2.0
◦ Currently supports: Game Tree Split (alpha/beta pruning); single and multiple recursion (with large depths); zero-sum games (3D Tic-Tac-Toe, Reversi, …); Sudoku backtracking generator and solver; statistical simulations (Monte Carlo for Go)
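The alpha/beta pruning mentioned above can be sketched sequentially on an explicit game tree. The host-only C++ example below is a textbook minimax with alpha/beta cut-offs; it does not show how the library splits the tree across GPU threads, and the names Node, leaf, inner, and alphabeta are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

struct Node {
    int value = 0;               // score, meaningful only for leaves
    std::vector<Node> children;  // empty for leaves
};

Node leaf(int v) {
    Node n;
    n.value = v;
    return n;
}

Node inner(std::vector<Node> kids) {
    Node n;
    n.children = std::move(kids);
    return n;
}

// Returns the minimax value of `node`; `visited_leaves` counts the leaves
// that were actually evaluated, so the effect of pruning can be observed.
int alphabeta(const Node& node, int alpha, int beta, bool maximizing,
              int& visited_leaves) {
    if (node.children.empty()) {
        ++visited_leaves;
        return node.value;
    }
    if (maximizing) {
        int best = std::numeric_limits<int>::min();
        for (const Node& child : node.children) {
            best = std::max(best, alphabeta(child, alpha, beta, false, visited_leaves));
            alpha = std::max(alpha, best);
            if (alpha >= beta) break;  // the minimizer will avoid this branch
        }
        return best;
    }
    int best = std::numeric_limits<int>::max();
    for (const Node& child : node.children) {
        best = std::min(best, alphabeta(child, alpha, beta, true, visited_leaves));
        beta = std::min(beta, best);
        if (alpha >= beta) break;  // the maximizer will avoid this branch
    }
    return best;
}
```

For a maximizing root with three minimizing children holding the leaves (3, 5), (2, 9), and (0, 1), the search returns 3 and evaluates only four of the six leaves: once the first child guarantees 3, the leaves 9 and 1 cannot change the result.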
PhysX
◦ Real-time physics engine
◦ Originally developed by Ageia for its PPU card; NVIDIA bought it and re-implemented it for CUDA
◦ Most important features: simulation of rigid bodies (collisions, destruction); cloths and fluid particle systems
APEX
◦ Framework built on top of PhysX
◦ Designed for easy usage (artists, games, …)