OpenCL
Peter Holvenstot

OpenCL
- Designed as an API and language specification
- Standards maintained by the Khronos Group
  - Currently 1.0, 1.1, and 1.2
- Manufacturers release their own SDKs and drivers
- Major backers: Apple, AMD/ATI, Intel

OpenCL
- Alternative to CUDA
- Not limited to ATI GPUs
- Designed for "heterogeneous computing"
- Executable on many devices, including CPUs, GPUs, DSPs, and FPGAs

OpenCL
- Structure of host programs and kernels is similar to CUDA's
- A set of compute devices is called a 'context'
- Kernels are executed by 'processing elements'
- Kernels can be compiled at run time or at build time

OpenCL
- Task parallelism: many kernels running at once
- OpenCL 1.2: a device can be partitioned down to a single compute unit
- Built-in kernels for device-specific functionality

Advantages
- Same code can be run on different devices
  - Can also be run on NVIDIA GPUs!
- AMD/ATI is attempting to integrate compute elements into other platforms (Accelerated Processing Units)
- Limited library of portable math routines
  - Most common BLAS and FFT routines

Performance

Disadvantages
- No "official" implementation
- Vendors may meet the spec or add restrictions
  - Apple adds restrictions on group size
- Devices need appropriate settings to perform well
  - Different capabilities → different performance
  - Solution: tuning/load-balancing framework
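One small piece of the per-device tuning mentioned above is picking launch sizes that a given device will accept. The sketch below is plain C, not OpenCL API code, and the helper names `clamp_local_size` and `padded_global_size` are hypothetical. It assumes the device limit has already been queried (in a real host program, for example via `clGetDeviceInfo` with `CL_DEVICE_MAX_WORK_GROUP_SIZE`) and that, as in OpenCL 1.x, the global size must be a multiple of the work-group size.

```c
#include <stddef.h>

/* Hypothetical helper: limit a preferred work-group size to what the
   device reports as its maximum. A size of 0 is treated as 1. */
size_t clamp_local_size(size_t preferred, size_t device_max)
{
    if (preferred == 0) {
        return 1;
    }
    return preferred < device_max ? preferred : device_max;
}

/* Hypothetical helper: round the problem size n up to the next multiple
   of local_size, so every work-group is full. The kernel must then
   ignore work-items whose global ID is >= n. */
size_t padded_global_size(size_t n, size_t local_size)
{
    return ((n + local_size - 1) / local_size) * local_size;
}
```

For 1000 elements and a preferred group size of 64, this yields a global size of 1024; the extra 24 work-items would need a bounds check in the kernel.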

Non-Optimized Performance

Restrictions
- No recursion, variadics, or function pointers
- Cannot dynamically allocate memory from the device
- No native variable-length arrays or double precision
- Some can be worked around by extensions
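A common way to live with these restrictions is to rewrite recursive code as a loop over an explicit, fixed-size stack. The sketch below is plain C that limits itself to features kernel code also allows (no recursion, no dynamic allocation, no variable-length arrays). The function name, the `MAX_STACK` capacity, and the array-based tree layout are assumptions chosen for illustration.

```c
#include <stddef.h>

#define MAX_STACK 64  /* fixed capacity: no dynamic allocation, no VLA */

/* Sums every node of an implicit binary tree stored in an array, where
   the children of node i are 2*i+1 and 2*i+2. A recursive depth-first
   walk is replaced by a loop over an explicit stack of node indices.
   The stack never holds more entries than the tree has levels, so 64
   slots are enough for any array that fits in memory. */
float tree_sum_iterative(const float *nodes, size_t count)
{
    size_t stack[MAX_STACK];
    int top = 0;
    float sum = 0.0f;

    if (count == 0) {
        return 0.0f;
    }
    stack[top++] = 0;  /* start at the root */

    while (top > 0) {
        size_t i = stack[--top];
        size_t left = 2 * i + 1;
        size_t right = 2 * i + 2;

        sum += nodes[i];
        if (right < count) {
            stack[top++] = right;
        }
        if (left < count) {
            stack[top++] = left;
        }
    }
    return sum;
}
```

Inside an actual kernel the same loop would operate on a `__global const float *` argument, with the stack living in private memory.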

Terminology
CUDA                       OpenCL
Scalar Core                Stream Core
Streaming Multiprocessor   Compute Unit
Warp                       Wavefront
PTX                        Intermediate Language

Terminology
CUDA                   OpenCL
Host Memory            Host Memory
Global/Device Memory   Global Memory
Local Memory           Global Memory
Constant Memory        Constant Memory
Shared Memory          Local Memory
Registers              Private Memory

Terminology
CUDA           OpenCL
Grid           NDRange
Block          Work group
Thread         Work item
Thread ID      Global ID
Block Index    Block ID (work-group ID)
Thread Index   Local ID
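The relationship between the three IDs can be sketched in plain C. This is a serial emulation, not OpenCL API code: `global_id_1d`, `scale_item`, and `run_ndrange_1d` are hypothetical names, and the sketch assumes one dimension, a zero global offset, and a global size that divides evenly into work groups. The formula mirrors the familiar CUDA expression `blockIdx.x * blockDim.x + threadIdx.x`.

```c
#include <stddef.h>

/* Global ID of a work item in one dimension, assuming a zero global
   offset: work-group ID times work-group size, plus the local ID. */
size_t global_id_1d(size_t group_id, size_t local_size, size_t local_id)
{
    return group_id * local_size + local_id;
}

/* Hypothetical stand-in for a kernel body: each work item doubles
   exactly one element, selected by its global ID. */
static void scale_item(size_t gid, const float *in, float *out)
{
    out[gid] = 2.0f * in[gid];
}

/* Emulates a 1-D NDRange serially. The outer loop walks the work groups
   and the inner loop walks the work items inside each group. A real
   device would run these in parallel and in no guaranteed order. */
void run_ndrange_1d(size_t global_size, size_t local_size,
                    const float *in, float *out)
{
    size_t num_groups = global_size / local_size;  /* assumes exact division */

    for (size_t g = 0; g < num_groups; ++g) {
        for (size_t l = 0; l < local_size; ++l) {
            scale_item(global_id_1d(g, local_size, l), in, out);
        }
    }
}
```

In real kernel code these values come from the built-in functions `get_global_id(0)`, `get_group_id(0)`, and `get_local_id(0)` rather than from loop counters.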

References
- http://blog.accelereyes.com/blog/wp-content/uploads/2012/02/CUDAvsOpenCL.pdf
- https://wiki.aalto.fi/download/attachments/40025977/Cuda+and+OpenCL+API+comparison_presented.pdf
- http://www.hpcwire.com/hpcwire/2012-02-28/opencl_gains_ground_on_cuda.html
- http://www.netlib.org/utk/people/JackDongarra/PAPERS/parcocudaopencl.pdf
- http://www.netlib.org/lapack/lawnspdf/lawn228.pdf
