The Free Lunch Ended 7 Years Ago

The Programmer's Delusion: More transistors on chip will solve all my problems!
We Need Concurrent Programming to Satisfy Demand
CHIPS CANNOT WORK ANY FASTER!

Two Distinct Solutions:
- High Performance Computers
- Heterogeneous Computers
The Concurrency Paradigm

- Structured programming, 1965-1985: the turn towards structure
- Object-oriented programming, 1985-2005: the turn towards objects
- Concurrent programming, 2005-: the turn towards concurrency
The Concurrency Solutions

High Performance Computing (HPC)
- large collections of computers
- high-speed communication channels
- exclusive and expensive: Sci-net consortium plus others

Heterogeneous Computers (HC)
- single computers
- execute independent tasks on different computational units
- are inexpensive and readily accessible
Heterogeneous Computers

CPU: Central Processing Unit
- Single Instruction, Single Data
- General Purpose and Complex
- Flexible and Sophisticated

GPU: Graphics Processing Unit
- Single Program, Multiple Data
- Specialized and Simple
- Focused and Fast

Chip Designs
- Multi-Core: several cores (CPU + GPU) on a chip, far fewer than ~100
- Many-Core: many cores (GPU) on a chip, far more than ~100
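The "Single Program, Multiple Data" idea can be sketched in plain C++. This is only a CPU-side illustration under stated assumptions: the names `saxpy_kernel` and `launch_saxpy` are hypothetical, and the loop runs the kernel sequentially, whereas a GPU would run the same invocations in parallel across its many cores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "Kernel": the single program that every logical thread runs.
// Each invocation uses its own index i to pick the one element it owns.
// (Hypothetical name, chosen for illustration only.)
void saxpy_kernel(std::size_t i, float a,
                  const std::vector<float>& x, std::vector<float>& y) {
    y[i] = a * x[i] + y[i];
}

// Hypothetical launcher: calls the kernel once per index.
// Here the calls happen one after another on the CPU; on a GPU each
// index would be handled by a separate hardware thread.
void launch_saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());  // both arrays must have the same length
    for (std::size_t i = 0; i < y.size(); ++i) {
        saxpy_kernel(i, a, x, y);
    }
}
```

Because no invocation reads or writes another invocation's element, the calls are independent of one another, which is what allows a many-core chip to execute them concurrently.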
Multi-Core or Many-Core
Programming Languages

CUDA
- Nvidia's extension to C/C++
- ~ 350,000,000 GPUs with CUDA
- ~ 1,000,000 toolkit downloads
- ~ 120,000 active developers
- ~ 475 university teaching centres
- simple for novice students with C/C++ skills

OpenCL
- platform-agnostic extension to C/C++
- Nvidia (chair), AMD, Apple, ARM, IBM, Intel, ...
Ontario's Landscape

- Seneca College University of Toronto CUDA Teaching Centre, PI: Dr. Chris Szalwinski
- University of Toronto CUDA Teaching Centre, PI: Dr. Daniel Gruner
- McMaster University CUDA Teaching Centre, PI: Dr. Alexandru Patriciu
GPU610 and DPS915

Working on this ...