Multi-Core Development Kyle Anderson
Overview History Pollack’s Law Moore’s Law CPU GPU OpenCL CUDA Parallelism
History First 4-bit microprocessor – ,000 instructions per second, 2,300 transistors. First 8-bit microprocessor – ,000 instructions per second, 4,500 transistors; Altair 8800. First 32-bit microprocessor – ,000 transistors.
History First Pentium processor released – MHz. Pentium 4 released – GHz, 42,000,000 transistors; clock speeds approached 4 GHz. Core 2 Duo released – ,000,000 transistors.
History
Pollack’s Law Processor Performance grows with square root of area
Pollack’s Law
Moore’s Law “The Number of transistors incorporated in a chip will approximately double every 24 months.” – Gordon Moore, Intel co-founder Smaller and smaller transistors
Moore’s Law
CPU Sequential execution Fully functioning cores Currently a maximum of 16 cores Hyperthreading Low latency
GPU Higher latency Thousands of cores Simple calculations Used for research
OpenCL Runs on a multitude of devices Run-time compilation ensures the most up-to-date features on each device Lock-step execution
OpenCL Data Structures Host Device Compute Units Work-Group Work-Item Command Queue Kernel Context
OpenCL Types of Memory Global Constant Local Private
OpenCL
OpenCL Example
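A minimal OpenCL C kernel sketch of the concepts above (kernel and parameter names are illustrative assumptions): each work-item processes one element, and the `__global` qualifier places the buffers in global device memory.

```c
/* OpenCL C device code (compiled at run time by the OpenCL driver).
 * Each work-item adds one element; get_global_id(0) returns this
 * work-item's index across all work-groups. */
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *c)
{
    size_t i = get_global_id(0);
    c[i] = a[i] + b[i];
}
```

On the host side this source would be built with the OpenCL API at run time, which is what lets the driver target whatever device sits in the context.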
CUDA NVIDIA's proprietary API for its GPUs Stands for “Compute Unified Device Architecture” Compiles directly to the hardware Used by Adobe, Autodesk, National Instruments, Microsoft, and Wolfram Mathematica Often faster than OpenCL because it is compiled directly for the hardware and focuses on a single architecture.
CUDA Indexing
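A sketch of the standard CUDA indexing idiom (the kernel name and parameters are illustrative assumptions): each thread combines its block index and its index within the block to find a unique global position.

```c
/* CUDA device code: computing a global thread index.
 * threadIdx.x = index within the block, blockIdx.x = block index,
 * blockDim.x  = threads per block. */
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)            /* guard: the last block may have extra threads */
        data[i] *= factor;
}
```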
CUDA Example
CUDA Function Call cudaMemcpy( dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice ); cudaMemcpy( dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice ); add<<<N,1>>>( dev_a, dev_b, dev_c );
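The calls above can be sketched into a complete program (the kernel body, N, and the printed element are assumptions for illustration): allocate device buffers, copy the inputs up, launch one block per element, and copy the result back.

```c
#include <cuda_runtime.h>
#include <stdio.h>

#define N 512

/* One block of one thread per element, launched as add<<<N,1>>>,
 * so the block index doubles as the array index. */
__global__ void add(const int *a, const int *b, int *c)
{
    int i = blockIdx.x;
    if (i < N) c[i] = a[i] + b[i];
}

int main(void)
{
    int a[N], b[N], c[N];
    int *dev_a, *dev_b, *dev_c;
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = i * i; }

    cudaMalloc(&dev_a, N * sizeof(int));
    cudaMalloc(&dev_b, N * sizeof(int));
    cudaMalloc(&dev_c, N * sizeof(int));
    cudaMemcpy(dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice);

    add<<<N, 1>>>(dev_a, dev_b, dev_c);

    cudaMemcpy(c, dev_c, N * sizeof(int), cudaMemcpyDeviceToHost);
    printf("c[2] = %d\n", c[2]);   /* 2 + 2*2 = 6 */

    cudaFree(dev_a); cudaFree(dev_b); cudaFree(dev_c);
    return 0;
}
```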
Types of Parallelism SIMD MISD MIMD Instruction parallelism Task parallelism Data parallelism
SISD Stands for Single Instruction, Single Data Does not use multiple cores
SIMD Stands for “Single Instruction, Multiple Data Streams” Can process multiple data streams concurrently
MISD Stands for “Multiple Instruction, Single Data” Risky because several instructions are processing the same data
MIMD Stands for “Multiple Instruction, Multiple Data” Each processor works through its own instruction stream on its own data
Instruction Parallelism Runs mutually exclusive instructions simultaneously Often used by MIMD and MISD Instructions are treated as operations Not done programmatically, but by the hardware and the compiler
Task Parallelism Divides up the main tasks or controls Runs multiple threads concurrently
Data Parallelism Used by SIMD and MIMD A list of instructions can work concurrently on several data sets