Published by Antony Henderson. Modified over 9 years ago.
1
OpenCL
Peter Holvenstot
2
OpenCL
- Designed as an API and language specification
- Standards maintained by the Khronos Group
- Current versions: 1.0, 1.1, and 1.2
- Manufacturers release their own SDKs and drivers
- Major backers: Apple, AMD/ATI, Intel
3
OpenCL
- Alternative to CUDA
- Not limited to ATI GPUs
- Designed for “heterogeneous computing”
- Executable on many devices, including CPUs, GPUs, DSPs, and FPGAs
4
OpenCL
- Similar structure to CUDA: host programs and kernels
- Set of compute devices is called a 'context'
- Kernels are executed by 'processing elements'
- Kernels can be compiled at run-time or build-time
5
OpenCL
- Task parallelism: many kernels running at once
- OpenCL 1.2: a device can be partitioned down to a single Compute Unit
- Built-in kernels for device-specific functionality
6
Advantages
- Same code can be run on different devices
- Can also be run on NVIDIA GPUs!
- AMD/ATI attempting to integrate compute elements into other platforms (Accelerated Processing Units)
- Limited library of portable math routines: the most common BLAS and FFT routines
7
Performance
10
Disadvantages
- No “official” implementation
- Vendors may meet specs or add restrictions
- Apple adds restrictions on work-group size
- Devices need appropriate settings to perform well
- Different capabilities → different performance
- Solution: tuning/load-balancing framework
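One concrete instance of "appropriate settings" is the choice of work-group size at launch. The following is a minimal sketch in plain Python (not OpenCL host code) of the sizing arithmetic, assuming a one-dimensional launch. The names `round_up` and `launch_sizes` are hypothetical. `device_max_local` stands in for the limit a host program would query as CL_DEVICE_MAX_WORK_GROUP_SIZE. The rounding reflects the OpenCL 1.x rule that an explicitly given local size must evenly divide the global size.

```python
def round_up(n, multiple):
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def launch_sizes(problem_size, preferred_local, device_max_local):
    """Return (global_size, local_size) for a 1-D launch.

    The local size is capped at the device limit, and the global size is
    padded up so that the local size divides it evenly.
    """
    if problem_size <= 0 or preferred_local <= 0 or device_max_local <= 0:
        raise ValueError("all sizes must be positive")
    local_size = min(preferred_local, device_max_local)
    global_size = round_up(problem_size, local_size)
    return global_size, local_size


if __name__ == "__main__":
    # Same problem, three hypothetical devices with different limits.
    for device_max in (1024, 64, 1):
        print(device_max, launch_sizes(1000, 256, device_max))
```

Because the global size may be padded past the problem size, a kernel launched this way would need to skip work-items whose global ID is not below the real problem size.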
11
Non-Optimized Performance
13
Restrictions
- No recursion, variadics, or function pointers
- Cannot dynamically allocate memory from the device
- No native variable-length arrays or double precision
- Some can be worked around by extensions
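Separately from extensions, the recursion and dynamic-allocation restrictions are often handled by restructuring the algorithm. The sketch below is plain Python, not kernel code, and is not taken from the slides: it shows one common pattern, replacing a recursive tree traversal with a loop over a preallocated, fixed-capacity stack. The flat-array tree layout, the name `tree_sum`, and the `STACK_CAPACITY` value are all hypothetical choices for illustration.

```python
STACK_CAPACITY = 32  # hypothetical fixed bound, chosen ahead of time


def tree_sum(values, left, right, root=0):
    """Sum all node values of a binary tree stored in flat arrays.

    left[i] and right[i] hold child indices, or -1 when there is no child.
    No recursion is used, and the stack never grows past its fixed size.
    """
    if not values:
        return 0
    stack = [0] * STACK_CAPACITY  # allocated once, never resized
    top = 0
    stack[top] = root
    top += 1
    total = 0
    while top > 0:
        top -= 1
        node = stack[top]
        total += values[node]
        for child in (left[node], right[node]):
            if child != -1:
                if top == STACK_CAPACITY:
                    raise OverflowError("tree too deep for the fixed stack")
                stack[top] = child
                top += 1
    return total


if __name__ == "__main__":
    # Node 0 has children 1 and 2; node 1 has children 3 and 4.
    print(tree_sum([1, 2, 3, 4, 5], [1, 3, -1, -1, -1], [2, 4, -1, -1, -1]))
```

The fixed capacity is the trade-off: the bound must be picked before launch, and inputs that exceed it have to be rejected or handled on the host.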
14
Terminology
CUDA                        OpenCL
Scalar Core                 Stream Core
Streaming Multiprocessor    Compute Unit
Warp                        Wavefront
PTX                         Intermediate Language
15
Terminology
CUDA                        OpenCL
Host Memory                 Host Memory
Global/Device Memory        Global Memory
Constant Memory             Constant Memory
Shared Memory               Local Memory
Local Memory, Registers     Private Memory
16
Terminology
CUDA                        OpenCL
Grid                        NDRange
Block                       Work-group
Thread                      Work-item
Thread ID                   Global ID
Block Index                 Group ID
Thread Index                Local ID
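The three ID terms are related by simple arithmetic. The following is a minimal sketch in plain Python (not OpenCL code), assuming a one-dimensional NDRange with no global offset. The function names `global_id` and `enumerate_ndrange` are hypothetical; the values mirror what the OpenCL built-ins get_group_id, get_local_id, get_local_size, and get_global_id return inside a kernel.

```python
def global_id(group_id, local_id, local_size):
    """Global ID of a work-item in a 1-D NDRange with zero global offset."""
    if not 0 <= local_id < local_size:
        raise ValueError("local_id must lie in [0, local_size)")
    return group_id * local_size + local_id


def enumerate_ndrange(global_size, local_size):
    """List (group_id, local_id, global_id) for every work-item."""
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    num_groups = global_size // local_size
    return [(g, l, global_id(g, l, local_size))
            for g in range(num_groups)
            for l in range(local_size)]


if __name__ == "__main__":
    # 8 work-items split into 2 work-groups of 4.
    for g, l, gid in enumerate_ndrange(8, 4):
        print(f"group {g}  local {l}  global {gid}")
```

The CUDA counterpart of the same calculation is blockIdx.x * blockDim.x + threadIdx.x.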
17
References
http://blog.accelereyes.com/blog/wp-content/uploads/2012/02/CUDAvsOpenCL.pdf
https://wiki.aalto.fi/download/attachments/40025977/Cuda+and+OpenCL+API+comparison_presented.pdf
http://www.hpcwire.com/hpcwire/2012-02-28/opencl_gains_ground_on_cuda.html
http://www.netlib.org/utk/people/JackDongarra/PAPERS/parcocudaopencl.pdf
http://www.netlib.org/lapack/lawnspdf/lawn228.pdf