FSOSS. Dr. Chris Szalwinski, Professor, School of Information and Communications Technology, Seneca College, Toronto, Canada. GPU Research Capabilities at Seneca
2 A Fresh Initiative From Some Personal History To Heterogeneous Computing
3 A Fresh Initiative The 80287
4 A Fresh Initiative Floating-Point Co-Processor (1985)
5 A Fresh Initiative ATI 3D Rage II Co-Processor (1996)
6 A Fresh Initiative A Paradigm Shift In Programming
7 Paradigm Shift The Turn Towards Concurrency
8 Paradigm Shift
12 Paradigm Shift
- Can still increase transistor density, but it's getting more expensive
- Can't increase processor frequencies: chips stay below 10 GHz
- Power consumption: can't melt chips
- The Free Lunch is Over: we can't just wait for improvement like we did before; we need new routes to improvement
13 Paradigm Shift Use Different Computational Units For Distinctly Different Tasks
14 Heterogeneous Computing Intel Core i7 (2008), NVIDIA GeForce GTX580 (2010)
15 Heterogeneous Computing
16 Heterogeneous Computing
17 Heterogeneous Computing Serial processing Parallel processing +
18 Heterogeneous Computing NVIDIA many-core GPUs vs Intel multi-core CPUs Floating point operations per sec (GFLOP/s) Memory bandwidth (GB/s)
24 Industry Momentum
- STI (Sony + Toshiba + IBM) Cell Processor: CPU + GPU on one chip
- Intel Xeon Phi: MIC (Many Integrated Core)
- AMD APUs (Fusion): CPU + GPU on a single chip
- HSA Foundation (2012): AMD + ARM + TI + Imagination + MediaTek + Samsung + Arteris + MulticoreWare + Apical + Sonics + Symbio + Vivante
- Radeon: Discrete GPUs
- NVIDIA: Discrete GPUs
  - GeForce (digital gaming)
  - Quadro (engineering workstations, graphics)
  - Tesla (scientific computations, double precision)
25 Industry Momentum Discrete GPUs - Add-in board shipments
26 Industry Momentum Predictions
28 Industry Predictions: Computer Graphics Market. Traditional processors + low-cost graphics processors enable combinations of science and entertainment
29 Industry Predictions Embedded Graphics Processors (EGPs) are killing off Integrated Graphics Processors (IGPs)
30 Industry Predictions Embedded Graphics Processors (EGPs) are no threat to Discrete Graphics
32 Programming Heterogeneous Computers: Concurrency-Oriented Programming (COP)
Core Languages:
- Fortran
- C
- C++
Extensions for COP:
- Cilk Plus (Intel)
- OpenCL (Khronos Group, AMD and HSA)
- CUDA C/C++ (NVIDIA)
- Fortran 2008, C-x86 (PGI)
- DirectCompute (Microsoft)
33 Programming Heterogeneous Computers: CUDA Teaching Centers in Ontario
- McMaster University (2010): High Performance Parallel Computing on Graphical Processing Units, ECE709, part of Master's Degree
- University of Toronto (2011): Special Topics in Software Engineering: Programming Massively Parallel Graphics Processors, ECE1724H, part of Master's Degree
- Seneca College (2012): Introduction to Parallel Programming, Professional Option, GPU610/DPS915, CPA Diploma and BSD Degree
34 Programming Heterogeneous Computers School of Information and Communications Technology (ICT) Our Capabilities and Plans
35 ICT Facilities
- Fully Equipped Teaching Classroom and Lab: 40 seats, 38 CUDA-enabled desktops with GTX480s (480 cores)
- Maximus Workstation: Quadro 600 for visualization, Tesla C2075 for computation
- SciNet Research: Accelerator Research Cluster (research testbed), 8 x [2 Intel Xeon X NVIDIA Tesla M2070]
36 ICT Facilities The 80287
37 ICT Courses
Introductory Course, Student Skill Set:
- Solid tested background in both C and C++
- Profile for computationally intensive code
- Move critical code to the GPU using CUDA
- Optimize to hide memory latency with computations
Programmer Training Workshops: on demand
Advanced Course (in the planning stage):
- Interactive Real-Time Computations + Visualization
- Parallelizing Fortran Applications
- OpenGL, DirectX Graphics Interoperability
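The introductory skill set (profile for the hot spot, then move it to a parallel device) can be illustrated with a small sketch. This is not course material: it is a minimal C++ example that assumes the hot spot is a SAXPY-style loop, and it uses CPU threads as a stand-in for the CUDA offload the course teaches. The names `saxpy_serial`, `saxpy_parallel`, and `time_ms` are hypothetical, chosen only for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical hot spot: y[i] = a * x[i] + y[i], computed serially.
void saxpy_serial(float a, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Data-parallel version: each worker updates its own contiguous slice.
// CPU threads stand in here for GPU threads; this is an assumption of
// the sketch, not the CUDA code used in the course.
void saxpy_parallel(float a, const std::vector<float>& x, std::vector<float>& y,
                    unsigned workers) {
    if (workers == 0) workers = 1;
    const std::size_t n = x.size();
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = n * w / workers;
        const std::size_t end = n * (w + 1) / workers;
        pool.emplace_back([a, &x, &y, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                y[i] = a * x[i] + y[i];
            }
        });
    }
    for (auto& t : pool) t.join();
}

// Crude profiling helper: wall-clock time of a callable, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A typical use would time both versions on the same input, for example `time_ms([&] { saxpy_serial(a, x, y); })`, and check that they produce identical results before any tuning. In a CUDA port, each slice element would map to one GPU thread, and the host-to-device copies would be included in the timing.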
38 ICT Faculty: Areas of Interest or Domain Expertise
- Big Data: Geocomputation
- Cognition: Cognitive Tutors
- Intrusion Detection: Information Security
- Finite Element Analysis: Soft Matter
39 ICT Scope: Areas of Application (source: NVIDIA)
Image Processing, Big Data Mining, Gaming, Advertising, Genetics, Quantum Chemistry, Mathematics, Product Design, Scientific Computing, Computational Finance
41 Science and Entertainment: Science, Art, Computation, Visualization