Published by Ginger Fletcher. Modified over 9 years ago.
1
SCHOOL OF ELECTRICAL AND COMPUTER ENGINEERING | SCHOOL OF COMPUTER SCIENCE | GEORGIA INSTITUTE OF TECHNOLOGY
Ocelot and the SST-MacSim Simulator
Genie Hsieh§, Andrew Kerr, Hyesoon Kim, Jaekyu Lee, Nagesh Lakshminarayana, Arun Rodrigues§, Sudhakar Yalamanchili
School of Computer Science and School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332
§ Scalable Computer Architecture Department, Sandia National Laboratories, Albuquerque, NM 87185
2
System Diversity
Keeneland System
Tianhe-1A
Amazon EC2 GPU Instances
Mobile Platforms
Heterogeneity is Mainstream
3
Heterogeneity On-Chip
Sandy Bridge: Vector Extensions, AES Instructions, Programmable Pipeline (GEN6)
Programmable Accelerator
PowerEN: 16 PowerPC cores; Accelerators: Crypto Engine, RegEx Engine, XML Engine
ARM Style Memory
Denver
Multiple models of Computation
Multi-ISA
4
Heterogeneous Systems: Keeneland
201 TFLOPS in 7 racks (90 sq ft incl. service area)
677 MFLOPS per watt on HPL (#9 on Green500, Nov 2010)
Final delivery system planned for early 2012
Integrated with NICS Datacenter GPFS and TG
Full PCIe X16 bandwidth to all GPUs
[Figure: Keeneland system hierarchy. Components: M2070, Xeon 5660. Node: ProLiant SL390s G7 (2 CPUs, 3 GPUs). S6500 Chassis (4 Nodes). Rack (6 Chassis). Keeneland System (7 Racks). 12000-Series Director Switch. Figures shown: 67 GFLOPS, 515 GFLOPS, 1679 GFLOPS, 24/18 GB, 6718 GFLOPS, 40306 GFLOPS, 201528 GFLOPS.]
Courtesy J. Vetter (GT/ORNL)
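The GFLOPS figures on the Keeneland slide appear to be a bottom-up roll-up of component peaks through the node, chassis, and rack levels. The sketch below reproduces that arithmetic under stated assumptions: the pairing of 67 GFLOPS with the Xeon 5660 and 515 GFLOPS with the M2070 is inferred from the numbers (the flattened slide does not say which is which), and the function and constant names are hypothetical, chosen for illustration only.

```python
# Component peak figures as listed on the slide (GFLOPS).
# Assumption: 67 belongs to the Xeon 5660 and 515 to the M2070;
# the slide text lists the values without pairing them.
XEON_5660_GFLOPS = 67
M2070_GFLOPS = 515

# Hierarchy counts as listed on the slide.
CPUS_PER_NODE = 2
GPUS_PER_NODE = 3
NODES_PER_CHASSIS = 4
CHASSIS_PER_RACK = 6

# Aggregate figures printed on the slide, for comparison.
SLIDE = {"node": 1679, "chassis": 6718, "rack": 40306, "system": 201528}


def rollup():
    """Multiply the rounded component peaks up through the hierarchy."""
    node = CPUS_PER_NODE * XEON_5660_GFLOPS + GPUS_PER_NODE * M2070_GFLOPS
    chassis = node * NODES_PER_CHASSIS
    rack = chassis * CHASSIS_PER_RACK
    return {"node": node, "chassis": chassis, "rack": rack}


if __name__ == "__main__":
    for level, value in rollup().items():
        print(f"{level}: computed {value} GFLOPS, slide shows {SLIDE[level]} GFLOPS")
    # Ratio of the slide's system total to its per-rack figure.
    print(f"system / rack = {SLIDE['system'] / SLIDE['rack']:.2f}")
```

With the rounded inputs, the roll-up gives 1679, 6716, and 40296 GFLOPS for node, chassis, and rack, against 1679, 6718, and 40306 on the slide. The small gaps are consistent with the slide using unrounded component peaks, though the slide does not confirm that. The slide's system total is about five times its per-rack figure. The slide does not say how the seven racks are divided, so that ratio is an observation only.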
5
Heterogeneous Architecture & Systems Research
Lexical Analyzer, Parser, Semantic analysis, Optimization, Code generation, Post pass optimization
VLIW (Cayman), SIMT (Fermi), New Designs
Microarchitecture, Memory systems, Network on Chip, Power Management, + Many more
Memory Optimizations, Program Transformations, Control Flow Optimizations, + Many more
Common Research Themes
Instruction set architecture
Focus on explicitly data parallel languages – bulk synchronous models
6
Research Infrastructure Challenges
[Diagram labels: Microarch Simulator, Power & Thermal Models, Compiler, Tile]
Open source compiler infrastructures for GPU computing
Microarchitecture cycle-level timing simulators for heterogeneous architectures
Integration between compiler, simulators, and models
Scalable simulation infrastructures
Simulation wall!
Ability to integrate point tools
7
Tutorial Overview
Low level Compiler Infrastructure for GPU Computing: Ocelot Dynamic Execution Infrastructure (Andrew Kerr, Sudhakar Yalamanchili)
Heterogeneous Cycle-level Architecture Models: MacSim Heterogeneous Architecture Simulator (J. Lee, N. Lakshminarayana, H. Kim)
Parallel Simulation Infrastructure: SST: Structural Simulation Toolkit (G. Hsieh, A. Rodrigues)
8
Tutorial Schedule
Topical Description
Part I (90 min.)
  Ocelot Overview: Architecture
  Ocelot: Supported Devices
Part II (90 min.)
  Structural Simulation Toolkit
  MacSim: Overview
Lunch
Part III (90 min.)
  MacSim: Simulator Architecture
  MacSim: Configuration
Part IV (90 min.)
  Case Studies using Ocelot and SST-MacSim