OpenCL 소개 류관희 충북대학교 소프트웨어학과.

Slides:



Advertisements
Similar presentations
MicroKernel Pattern Presented by Sahibzada Sami ud din Kashif Khurshid.
Advertisements

APARAPI Java™ platform’s ‘Write Once Run Anywhere’ ® now includes the GPU Gary Frost AMD PMTS Java Runtime Team.
 Open standard for parallel programming across heterogenous devices  Devices can consist of CPUs, GPUs, embedded processors etc – uses all the processing.
GPU Processing for Distributed Live Video Database Jun Ye Data Systems Group.
OpenCL Peter Holvenstot. OpenCL Designed as an API and language specification Standards maintained by the Khronos group  Currently 1.0, 1.1, and 1.2.
Where Do the 7 layers “fit”? Or, where is the dividing line between hdw & s/w? ? ?
DEPARTMENT OF COMPUTER ENGINEERING
Programming with CUDA, WS09 Waqar Saleem, Jens Müller Programming with CUDA and Parallel Algorithms Waqar Saleem Jens Müller.
Mobile Application Development
Contemporary Languages in Parallel Computing Raymond Hummel.
Panel Discussion: The Future of I/O From a CPU Architecture Perspective #OFADevWorkshop Brad Benton AMD, Inc.
OpenCL Introduction A TECHNICAL REVIEW LU OCT
© Copyright Khronos Group, Page 1 The State of the Union Update from the Working Group Chair Tom Olson, Texas Instruments Inc.
The Open Standard for Parallel Programming of Heterogeneous systems James Xu.
Shared memory systems. What is a shared memory system Single memory space accessible to the programmer Processor communicate through the network to the.
Copyright© Jeffrey Jongko, Ateneo de Manila University Android.
BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.
Tim Madden ODG/XSD.  Graphics Processing Unit  Graphics card on your PC.  “Hardware accelerated graphics”  Video game industry is main driver.  More.
Eric Keller, Evan Green Princeton University PRESTO /22/08 Virtualizing the Data Plane Through Source Code Merging.
© 2009 Nokia V1-OpenCLEnbeddedProfilePresentation.ppt / / JyrkiLeskelä 1 OpenCL Embedded Profile Presentation for Multicore Expo 16 March 2009.
Introduction to OpenCL* Ohad Shacham Intel Software and Services Group Thanks to Elior Malul, Arik Narkis, and Doron Singer 1.
GPU Architecture and Programming
OpenCL Sathish Vadhiyar Sources: OpenCL quick overview from AMD OpenCL learning kit from AMD.
1 The Portland Group, Inc. Brent Leback HPC User Forum, Broomfield, CO September 2009.
1 Latest Generations of Multi Core Processors
OpenCL Programming James Perry EPCC The University of Edinburgh.
Tim Madden ODG/XSD.  Graphics Processing Unit  Graphics card on your PC.  “Hardware accelerated graphics”  Video game industry is main driver.  More.
© Copyright Khronos Group, Page 1 OpenGL ES SIGGRAPH 2006 Neil Trevett Vice President Embedded Content, NVIDIA President, Khronos.
Application Software System Software.
OpenCL Joseph Kider University of Pennsylvania CIS Fall 2011.
© Copyright Khronos Group, Page 1 Khronos and OpenGL ES Status Neil Trevett Vice President Embedded Content, NVIDIA President, Khronos.
Implementation and Optimization of SIFT on a OpenCL GPU Final Project 5/5/2010 Guy-Richard Kayombya.
Chapter 1 Basic Concepts of Operating Systems Introduction Software A program is a sequence of instructions that enables the computer to carry.
December 13, G raphical A symmetric P rocessing Prototype Presentation December 13, 2004.
My Coordinates Office EM G.27 contact time:
GPU Computing for GIS James Mower Department of Geography and Planning University at Albany.
Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.
Computer System Structures
Computer Engg, IIT(BHU)
Software Hardware refers to the physical devices of a computer system.
CUDA Introduction Martin Kruliš by Martin Kruliš (v1.1)
COMPUTER GRAPHICS CHAPTER 38 CS 482 – Fall 2017 GRAPHICS HARDWARE
Our Graphics Environment
Mobile App Development
Enabling machine learning in embedded systems
FIRMWARE PPT By:Hanh Nguyen.
Introduction to .NET Framework Ch2 – Deitel’s Book
Basic CUDA Programming
Texas Instruments TDA2x and Vision SDK
Introduction to OpenGL
CMPE419 Mobile Application Development
Processing Framework Sytse van Geldermalsen
CS 179: GPU Programming Lecture 1: Introduction 1
Antonio R. Miele Marco D. Santambrogio Politecnico di Milano
GPU Programming using OpenCL
SOC Runtime Gregory Stoner.
Introduction to OpenCL 2.0
Brook GLES Pi: Democratising Accelerator Programming
Konstantis Daloukas Nikolaos Bellas Christos D. Antonopoulos
Compiler Back End Panel
CIS16 Application Development – Programming with Visual Basic
Compiler Back End Panel
Chapter 7 –Implementation Issues
Windows Virtual PC / Hyper-V
WebGL: Latest Techniques
Graphics Processing Unit
Chap 1. Getting Started Objectives
Introduction to OpenGL
CMPE419 Mobile Application Development
6- General Purpose GPU Programming
Presentation transcript:

OpenCL 소개 류관희 충북대학교 소프트웨어학과

Processor Parallelism WMP Overview 6/23/2018 Processor Parallelism CPUs Multiple cores driving performance increases GPUs Increasingly general purpose data-parallel computing Emerging Intersection Multi-processor programming – e.g. OpenMP Heterogeneous Computing Graphics APIs and Shading Languages OpenCL is a programming framework for heterogeneous compute resources Copyright 2005, All rights reserved 2 2

The BIG Idea behind OpenCL OpenCL execution model … Define N-dimensional computation domain Execute a kernel at each point in computation domain C Derivative to write kernels – based on ISO C99 APIs to discover devices in a system and distribute work to them Targeting many types of device GPUs, CPUs, DSPs, embedded systems, mobile phones.. Even FPGAs 3

The BIG Idea behind OpenCL Traditional loops Data Parallel OpenCL void trad_mul(int n, const float *a, const float *b, float *c) { int i; for (i=0; i<n; i++) c[i] = a[i] * b[i]; } kernel void dp_mul(global const float *a, global const float *b, global float *c) { int id = get_global_id(0); c[id] = a[id] * b[id]; } // execute over “n” work-items 4

OpenCL Platform Model One Host + one or more Compute Devices WMP Overview 6/23/2018 OpenCL Platform Model One Host + one or more Compute Devices Each Compute Device is composed of one or more Compute Units Each Compute Unit is further divided into one or more Processing Elements RELATION TO CPU OR GPU Quad core is one compute device, 4 x (compute unit = processing element) Copyright 2005, All rights reserved 5

OpenCL Platform Model WMP Overview 6/23/2018 RELATION TO CPU OR GPU Quad core is one compute device, 4 x (compute unit = processing element) Copyright 2005, All rights reserved 6

OpenCL–Heterogeneous Computing Framework for programming diverse parallel computing resources in a system Platform Layer API Query, select and initialize compute devices Kernel Language Specification Subset of ISO C99 with language extensions Runtime API Execute compute kernels – gather results OpenCL has Embedded profile No need for a separate “ES” spec Copyright Khronos 2009

Announcing OpenCL 1.2! OpenCL 1.2 Specification publicly available today! Significant updates - Khronos being responsive to developer requests Updated OpenCL 1.2 conformance tests available Multiple implementations underway Backward compatible upgrade to OpenCL 1.1 OpenCL 1.2 will run any OpenCL 1.0 and OpenCL 1.1 programs OpenCL 1.2 platform can contain 1.0, 1.1 and 1.2 devices Maintains embedded profile for mobile and embedded devices

OpenCL Working Group Members Diverse industry participation – many industry experts Processor vendors, system OEMs, middleware vendors, application developers Academia and research labs, FPGA vendors NVIDIA is chair, Apple is specification editor Apple

OpenCL Milestones Six months from proposal to released OpenCL 1.0 specification Due to a strong initial proposal and a shared commercial incentive Multiple conformant implementations shipping For CPUs and GPUs on multiple OS 18 month cadence between OpenCL 1.0, OpenCL 1.1 and now OpenCL 1.2 Backwards compatibility protect software investment OpenCL 1.2 Specification and conformance tests released! OpenCL 1.0 conformance tests released OpenCL working group formed Dec08 Jun10 Jun08 May09 Nov11 OpenCL 1.0 released OpenCL 1.1 Specification and conformance tests released

Looking Forward OpenCL-HLM (High Level Model) Exploring high-level programming model, unifying host and device execution environments through language syntax for increased usability and broader optimization opportunities Long-term Core Roadmap Exploring enhanced memory and execution model flexibility to catalyze and expose emerging hardware capabilities WebCL Bring parallel computation to the Web through a JavaScript binding to OpenCL OpenCL-SPIR (Standard Parallel Intermediate Representation) Exploring low-level Intermediate Representation for code obfuscation/security and to provide target back-end for alternative high-level languages

Major New Features in OpenCL 1.2 Partitioning Devices Applications can partition a device into sub-devices Enables computation to be assigned to specific compute units Reserve a part of the device for use for high priority/latency-sensitive tasks or effectively use shared hardware resources such as a cache Separate compilation and linking of objects Provides the capabilities and flexibility of traditional compilers Create a library of OpenCL programs that other programs can link to Enhanced Image Support Added support for 1D images, 1D & 2D image arrays OpenGL sharing extension now enables an OpenCL image to be created from an OpenGL 1D texture, 1D and 2D texture arrays

More Major New Features in OpenCL 1.2 Custom devices and built-in kernels Drive specialized custom devices from OpenCL – even if not programmable Can enqueue built-in kernels to custom devices alongside OpenCL kernels DX9 Media Surface Sharing Efficient sharing between OpenCL and DirectX 9 or DXVA media surfaces DX11 surface sharing Efficient sharing between OpenCL and DirectX 11 surfaces Installable Client Drivers (optional) Portably handling multiple installed implementations from multiple vendors And many other updates and additions..

Real-time processing tasks Mainline processing tasks Partitioning Devices Host Devices can be partitioned into sub-devices More control over how computation is assigned to compute units Sub-devices may be used just like a normal device Create contexts, building programs, further partitioning and creating command-queues Three ways to partition a device Split into equal-size groups Provide list of group sizes Group devices sharing a part of a cache hierarchy Compute Device Compute Device Compute Device Compute Unit Compute Unit Compute Unit Compute Unit Compute Unit Compute Unit Sub-device #1 Real-time processing tasks Sub-device #2 Mainline processing tasks

Custom Devices and Built-in Kernels Embedded platforms often contain specialized hardware and firmware That cannot support OpenCL C Built-in kernels can represent these hardware and firmware capabilities Such as video encode/decode Hardware can be integrated and controlled from the OpenCL framework Can enqueue built-in kernels to custom devices alongside OpenCL kernels FPGAs are one example of device that can expose built-in kernels Latest FPGAs can support full OpenCL C as well OpenCL becomes a powerful coordinating framework for diverse resources Programmable and non-programmable devices controlled by one run-time Built-in kernels enable control of specialized processors and hardware from OpenCL run-time

Installable Client Driver Analogous to OpenGL ICDs in use for many years Used to handle multiple OpenGL implementations installed on a system Optional extension Platform vendor will choose whether to use ICD mechanisms Khronos OpenCL installable client driver loader Exposes multiple separate vendor installable client drivers (Vendor ICDs) Application can access all vendor implementations The ICD Loader acts as a de-multiplexor

Installable Client Driver Vendor #1 OpenCL ICD Loader ensures multiple implementations are installed cleanly Vendor #2 OpenCL Application ICD Loader ICD Loader enables application to use any of the installed implementations Vendor #3 OpenCL

OpenCL Desktop Implementations http://developer.amd.com/zones/OpenCLZone/ http://software.intel.com/en-us/articles/opencl-sdk/ http://developer.nvidia.com/opencl

OpenCL Books – Available Now! OpenCL Programming Guide - The “Red Book” of OpenCL http://www.amazon.com/OpenCL-Programming-Guide-Aaftab-Munshi/dp/0321749642 OpenCL in Action http://www.amazon.com/OpenCL-Action-Accelerate-Graphics-Computations/dp/1617290173/ Heterogeneous Computing with OpenCL http://www.amazon.com/Heterogeneous-Computing-with-OpenCL-ebook/dp/B005JRHYUS The OpenCL Programming Book http://www.fixstars.com/en/opencl/book/

OpenCL Books – Available Now!

Spec Translations Japanese OpenCL 1.1 spec translation available today http://www.cutt.co.jp/book/978-4-87783-256-8.html Valued partnership between Khronos and CUTT in Japan Working on OpenCL 1.2 specification translations Japanese, Korean and Chinese

Khronos OpenCL Resources OpenCL is 100% free for developers Download drivers from your silicon vendor OpenCL Registry www.khronos.org/registry/cl/ OpenCL 1.2 Reference Card PDF version http://www.khronos.org/files/opencl-1-2-quick-reference-card.pdf Online Man pages http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/ OpenCL Developer Forums Give us your feedback! www.khronos.org/message_boards/