OpenCL, Peter Holvenstot: presentation transcript

1 OpenCL Peter Holvenstot

2 OpenCL
- Designed as an API and language specification
- Standards maintained by the Khronos Group
  - Currently 1.0, 1.1, and 1.2
- Manufacturers release their own SDKs and drivers
- Major backers: Apple, AMD/ATI, Intel

3 OpenCL
- Alternative to CUDA
- Not limited to ATI GPUs
- Designed for “heterogeneous computing”
- Executable on many devices, including CPUs, GPUs, DSPs, and FPGAs

4 OpenCL
- Structure similar to CUDA's: host programs and kernels
- A set of compute devices is called a 'context'
- Kernels are executed by 'processing elements'
- Kernels can be compiled at run time or build time
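The host/kernel split on this slide can be illustrated without an OpenCL runtime. The sketch below is a serial emulation in plain C, not the OpenCL API: `vec_add_kernel` stands in for a kernel body that one processing element would run for one index, and `host_enqueue` is a hypothetical helper playing the role of the host launching that kernel over every index. In real OpenCL C the index would come from `get_global_id(0)`, and the runtime, rather than a loop, would spread the invocations across the device.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a kernel body: computes exactly one output element.
   In OpenCL C the index would come from get_global_id(0); here it is
   passed in explicitly (an assumption of this emulation). */
static void vec_add_kernel(size_t gid, const float *a, const float *b, float *out)
{
    out[gid] = a[gid] + b[gid];
}

/* Hypothetical host-side helper: runs the kernel once per index.
   A real device would execute these invocations in parallel. */
static void host_enqueue(size_t global_size, const float *a, const float *b, float *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t gid = 0; gid < global_size; ++gid) {
        vec_add_kernel(gid, a, b, out);
    }
}
```

Calling `host_enqueue(n, a, b, out)` on three arrays of length `n` fills `out` with the element-wise sums. The point of the split is that the kernel body never loops over the data itself; it only handles the one index it was given.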

5 OpenCL
- Task parallelism: many kernels running at once
- OpenCL 1.2: a device can be partitioned down to a single compute unit
- Built-in kernels for device-specific functionality

6 Advantages
- Same code can be run on different devices
  - Can also be run on NVIDIA GPUs!
- AMD/ATI is attempting to integrate compute elements into other platforms (Accelerated Processing Units)
- Limited library of portable math routines
  - Most common BLAS and FFT routines

7 Performance

10 Disadvantages
- No “official” implementation
- Vendors may meet specs or add restrictions
  - Apple adds restrictions on group size
- Devices need appropriate settings to perform well
  - Different capabilities → different performance
  - Solution: tuning/load-balancing framework
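One setting that usually has to be adjusted per device is the work-group size. As a rough sketch of what that adjustment involves: in OpenCL 1.x, when a local work-group size is given to `clEnqueueNDRangeKernel`, the global work size must be evenly divisible by it, and the local size may not exceed the device's limit (queryable as `CL_DEVICE_MAX_WORK_GROUP_SIZE`). The two helpers below, `clamp_local_size` and `round_up_global`, are hypothetical names introduced here for illustration; they are plain C and make no OpenCL calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: limits a preferred work-group size to what the
   device (or vendor implementation) allows. device_max is assumed to
   have been obtained elsewhere, e.g. from a device query. */
static size_t clamp_local_size(size_t preferred, size_t device_max)
{
    assert(preferred > 0 && device_max > 0);
    return preferred < device_max ? preferred : device_max;
}

/* Hypothetical helper: rounds the problem size n up to the next
   multiple of local_size, so the global size divides evenly. */
static size_t round_up_global(size_t n, size_t local_size)
{
    assert(local_size > 0);
    size_t remainder = n % local_size;
    return remainder == 0 ? n : n + (local_size - remainder);
}
```

Because the rounded-up global size can exceed the real problem size, a kernel launched this way is normally written to return early when its global ID is at or beyond `n`.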

11 Non-Optimized Performance

13 Restrictions
- No recursion, variadics, or function pointers
- Cannot dynamically allocate memory from the device
- No native variable-length arrays or double precision
- Some can be worked around by extensions
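To give a feel for how code is typically restructured under these restrictions, here is a minimal sketch in plain C (not OpenCL C) of a tree traversal written without recursion, without dynamic allocation, and without variable-length arrays. The function name `tree_sum_iterative`, the heap-order tree layout, and the capacity constants are assumptions chosen for the example.

```c
#include <assert.h>

#define MAX_STACK 64         /* fixed capacity: no dynamic allocation, no VLA */
#define MAX_NODES (1 << 20)  /* keeps the index arithmetic far from int overflow */

/* Sums the nodes of a binary tree stored in heap order (children of
   node i are 2i+1 and 2i+2), using an explicit stack instead of
   recursive calls. */
static int tree_sum_iterative(const int *nodes, int n)
{
    int stack[MAX_STACK];
    int top = 0;
    int sum = 0;

    assert(n >= 0 && n <= MAX_NODES);
    if (n > 0) {
        stack[top++] = 0;  /* start at the root */
    }
    while (top > 0) {
        int i = stack[--top];
        int left = 2 * i + 1;
        int right = 2 * i + 2;

        sum += nodes[i];
        if (right < n) {
            assert(top < MAX_STACK);
            stack[top++] = right;
        }
        if (left < n) {
            assert(top < MAX_STACK);
            stack[top++] = left;
        }
    }
    return sum;
}
```

The explicit stack does the job the call stack would do in a recursive version; its size is a compile-time constant, which is the kind of bound device code generally needs.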

14 Terminology
CUDA                      OpenCL
Scalar Core               Stream Core
Streaming Multiprocessor  Compute Unit
Warp                      Wavefront
PTX                       Intermediate Language

15 Terminology
CUDA                      OpenCL
Host Memory               Host Memory
Global/Device Memory      Global Memory
Constant Memory           Constant Memory
Shared Memory             Local Memory
Local Memory, Registers   Private Memory

16 Terminology
CUDA          OpenCL
Grid          NDRange
Block         Work group
Thread        Work item
Thread ID     Global ID
Block Index   Group ID
Thread Index  Local ID
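The three kinds of ID in this table are related by simple arithmetic. As a sketch, assuming a one-dimensional NDRange with no global offset: the global ID equals the work-group index times the work-group size plus the local ID, which mirrors the familiar CUDA expression `blockIdx.x * blockDim.x + threadIdx.x`. The helper names and the `work_item_pos` struct below are hypothetical and written in plain C; in a kernel these values would come from `get_global_id`, `get_group_id`, `get_local_id`, and `get_local_size`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of where a work item sits inside the NDRange. */
typedef struct {
    size_t group_id;  /* index of the work group (CUDA: block index) */
    size_t local_id;  /* index within the work group (CUDA: thread index) */
} work_item_pos;

/* Global ID from group and local IDs (1-D, zero global offset assumed). */
static size_t global_id_from(size_t group_id, size_t local_id, size_t local_size)
{
    assert(local_id < local_size);
    return group_id * local_size + local_id;
}

/* Inverse mapping: recover group and local IDs from a global ID. */
static work_item_pos split_global_id(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    work_item_pos pos = { global_id / local_size, global_id % local_size };
    return pos;
}
```

For a work-group size of 64, the work item with local ID 5 in work group 2 has global ID 2 * 64 + 5 = 133, and splitting 133 gives back group 2 and local ID 5.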

17 References
- http://blog.accelereyes.com/blog/wp-content/uploads/2012/02/CUDAvsOpenCL.pdf
- https://wiki.aalto.fi/download/attachments/40025977/Cuda+and+OpenCL+API+comparison_presented.pdf
- http://www.hpcwire.com/hpcwire/2012-02-28/opencl_gains_ground_on_cuda.html
- http://www.netlib.org/utk/people/JackDongarra/PAPERS/parcocudaopencl.pdf
- http://www.netlib.org/lapack/lawnspdf/lawn228.pdf

