IBM Power system – HPC solution

Slides:



Advertisements
Similar presentations
Issues of HPC software From the experience of TH-1A Lu Yutong NUDT.
Advertisements

The Development of Mellanox - NVIDIA GPUDirect over InfiniBand A New Model for GPU to GPU Communications Gilad Shainer.
Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.
1 Agenda … HPC Technology & Trends HPC Platforms & Roadmaps HP Supercomputing Vision HP Today.
GPU System Architecture Alan Gray EPCC The University of Edinburgh.
Performance Analysis of Virtualization for High Performance Computing A Practical Evaluation of Hypervisor Overheads Matthew Cawood University of Cape.
Enabling Coherent FPGA Acceleration Allan Cantle, President & Founder Nallatech Join the conversation at #OpenPOWERSummit1 #OpenPOWERSummit.
IBM RS6000/SP Overview Advanced IBM Unix computers series Multiple different configurations Available from entry level to high-end machines. POWER (1,2,3,4)
Processor history / DX/SX SX/DX Pentium 1997 Pentium MMX
Some Thoughts on Technology and Strategies for Petaflops.
Chapter Hardwired vs Microprogrammed Control Multithreading
1 AppliedMicro X-Gene ® ARM Processors Optimized Scale-Out Solutions for Supercomputing.
1 petaFLOPS+ in 10 racks TB2–TL system announcement Rev 1A.
Intel® 64-bit Platforms Platform Features. Agenda Introduction and Positioning of Intel® 64-bit Platforms Intel® 64-Bit Xeon™ Platforms Intel® Itanium®
Scheduling Strategies for HPC as a Service (HPCaaS) for Bio-Science Applications Sep 2009 High Performance Interconnects for Distributed Computing (HPI-DC)
Lecture 2 : Introduction to Multicore Computing Bong-Soo Sohn Associate Professor School of Computer Science and Engineering Chung-Ang University.
Department of Computer and Information Science, School of Science, IUPUI Dale Roberts, Lecturer Computer Science, IUPUI CSCI.
1 Lecture 7: Part 2: Message Passing Multicomputers (Distributed Memory Machines)
Motivation “Every three minutes a woman is diagnosed with Breast cancer” (American Cancer Society, “Detailed Guide: Breast Cancer,” 2006) Explore the use.
GPU Programming with CUDA – Accelerated Architectures Mike Griffiths
Computer System Architectures Computer System Software
1 Hardware Support for Collective Memory Transfers in Stencil Computations George Michelogiannakis, John Shalf Computer Architecture Laboratory Lawrence.
Implementation of Parallel Processing Techniques on Graphical Processing Units Brad Baker, Wayne Haney, Dr. Charles Choi.
1b.1 Types of Parallel Computers Two principal approaches: Shared memory multiprocessor Distributed memory multicomputer ITCS 4/5145 Parallel Programming,
© 2012 MELLANOX TECHNOLOGIES 1 The Exascale Interconnect Technology Rich Graham – Sr. Solutions Architect.
© David Kirk/NVIDIA and Wen-mei W. Hwu, 1 Programming Massively Parallel Processors Lecture Slides for Chapter 1: Introduction.
Gilad Shainer, VP of Marketing Dec 2013 Interconnect Your Future.
Hardware Trends. Contents Memory Hard Disks Processors Network Accessories Future.
High Performance Computing Processors Felix Noble Mirayma V. Rodriguez Agnes Velez Electric and Computer Engineer Department August 25, 2004.
Taking the Complexity out of Cluster Computing Vendor Update HPC User Forum Arend Dittmer Director Product Management HPC April,
Looking Ahead: A New PSU Research Cloud Architecture Chuck Gilbert - Systems Architect and Systems Team Lead Research CI Coordinating Committee Meeting.
March 9, 2015 San Jose Compute Engineering Workshop.
GPU Architecture and Programming
1 Latest Generations of Multi Core Processors
Copyright © 2011 Curt Hill MIMD Multiple Instructions Multiple Data.
Computer Organization & Assembly Language © by DR. M. Amer.
Next Generation Operating Systems Zeljko Susnjar, Cisco CTG June 2015.
Brent Gorda LBNL – SOS7 3/5/03 1 Planned Machines: BluePlanet SOS7 March 5, 2003 Brent Gorda Future Technologies Group Lawrence Berkeley.
Alpha 21364: A Scalable Single-chip SMP Peter Bannon Senior Consulting Engineer Compaq Computer Corporation Shrewsbury, MA.
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. HP Update IDC HPC Forum.
Yang Yu, Tianyang Lei, Haibo Chen, Binyu Zang Fudan University, China Shanghai Jiao Tong University, China Institute of Parallel and Distributed Systems.
Mellanox Connectivity Solutions for Scalable HPC Highest Performing, Most Efficient End-to-End Connectivity for Servers and Storage April 2010.
Revision - 01 Intel Confidential Page 1 Intel HPC Update Norfolk, VA April 2008.
AIX and PowerVM Workshop © 2013 IBM Corporation 1 POWER5POWER5+POWER6POWER7POWER7+ Technology130nm90nm65nm45nm32nm Size389 mm mm mm mm.
Mellanox Connectivity Solutions for Scalable HPC Highest Performing, Most Efficient End-to-End Connectivity for Servers and Storage September 2010 Brandon.
Understanding Parallel Computers Parallel Processing EE 613.
Cray XD1 Reconfigurable Computing for Application Acceleration.
Tackling I/O Issues 1 David Race 16 March 2010.
Moore’s Law Electronics 19 April Moore’s Original Data Gordon Moore Electronics 19 April 1965.
CMSC 611: Advanced Computer Architecture Shared Memory Most slides adapted from David Patterson. Some from Mohomed Younis.
Jeff Stuecheli Hardware Architect POWER8 Technology
HP Network and Service Provider Business Unit Sebastiano Tevarotto February 2003.
Let’s get wild POWER8 Technology: Jørgen Johansen
Hardware Architecture
Benefits. CAAR Project Phases Each of the CAAR projects will consist of a: 1.three-year Application Readiness phase ( ) in which the code refactoring.
Introduction to Data Analysis with R on HPC Texas Advanced Computing Center Feb
Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.
APE group Many-core platforms and HEP experiments computing XVII SuperB Workshop and Kick-off Meeting Elba, May 29-June 1,
Jun Doi IBM Research – Tokyo Early Performance Evaluation of Lattice QCD on POWER+GPU Cluster 17 July 2015.
Feeding Parallel Machines – Any Silver Bullets? Novica Nosović ETF Sarajevo 8th Workshop “Software Engineering Education and Reverse Engineering” Durres,
What’s ? Laurent Vanel High Performance Infrastructure Specialist.
F1-17: Architecture Studies for New-Gen HPC Systems
IBM Data Centric Systems Directions
OpenPOWER Innovation: Redefining HPC
M. Bellato INFN Padova and U. Marconi INFN Bologna
Lynn Choi School of Electrical Engineering
Welcome: Intel Multicore Research Conference
IBM Power Systems.
Presentation transcript:

IBM Power system – HPC solution Real-time control for adaptive optics workshop 26-27 January 2016 Guillaume Villette (guillaume.villette@fr.ibm.com)

IBM Power system HPC ecosystem Applications AGENDA: IBM Power system HPC ecosystem Applications

Power system – An OpenPOWER initiative Moore’s law no longer satisfies performance gain Growing workload demands Numerous IT consumption models Mature Open software ecosystem Open Development open oftware, open hardware Rich software ecosystem Spectrum of power servers Multiple hardware options Derivative POWER chips Collaboration of thought leaders simultaneous innovation, multiple disciplines Performance of POWER architecture amplified capability Feeds back … resulting in client choice OpenPOWER is an open development community, using the POWER Architecture to serve the evolving needs of customers.

Power system – An OpenPOWER initiative GPU Memory NVLINK DMI CAPI/PCIe FPGA

L3 Cache & Chip Interconnect Core L2 L3 Cache & Chip Interconnect 8M L3 Region Mem. Ctrl. Accelerators SMP Links PCIe Power8 chip Cores 10-12 cores (SMT8) 8 dispatch, 10 issue, 16 exec pipe 2X internal data flows/queues Enhanced prefetching Caches 64K Data cache (L1) 512 KB SRAM L2 / core 96 MB eDRAM shared L3 Up to 128 MB eDRAM L4 (off-chip) Bus Interfaces Integrated PCIe Gen3 SMP Interconnect CAPI Accelerators Crypto & Random number generator Transactional Memory & Active memory expansion Data Move / VM Mobility Memory Dual memory Controllers 230 GB/sec Sustained bandwidth Technology 22nm SOI, eDRAM, 15 ML 650mm2

GPU - NVLink Future NVLink GPU Attachment Current GPU Attach NVIDIA GPU NVIDIA GPU with NVLink Power Chip PCIe x16 Graphics Memory System Memory 16+16 GB/s Power Chip with NVLink 80 GB/s Peak* Graphics Memory System Memory 40+40 GB/s Current GPU Attach Future NVLink GPU Attachment “NVLink will help improve GPU-GPU peer-to-peer communications, eliminating the need to transfer data via the PCIe bus. It would also allow one or more GPUs to access the system RAM much quicker. The new protocol will debut in 2016 with Pascal, which was revealed to be NVidia's next GPU architecture.”

CAPI – Coherent Accelerator Processor Interface POWER8 Core CAPP PCIe POWER8 Processor OS App Memory (Coherent) FPGA AFU IBM Supplied PSL Typical I/O Model Flow: Flow with a Coherent Model: Shared Mem. Notify Accelerator Acceleration Shared Memory Completion DD Call Copy or Pin Source Data MMIO Notify Accelerator Poll / Interrupt Copy or Unpin Result Data Ret. From DD Application Dependent, but Equal to below Equal to above 300 Instructions 10,000 Instructions 3,000 Instructions 1,000 Instructions 7.9µs 4.9µs Total ~13µs for data prep 400 Instructions 100 Instructions 0.3µs 0.06µs Total 0.36µs

IBM OpenPOWER – Based HPC Roadmap ESSL/PESSL MASS PE ESSL/PESSL + GPU MASS Plateform MPI ESSL/PESSL + GPU MASS Plateform MPI Libraries XL LLVM – OpenMP 4.1 GCC – OpenACC 2 PGI – OpenACC 2 XL – OpenMP 5 LLVM – OpenMP 5 GCC – OpenACC PGI – OpenACC 3 XL LLVM – OpenMP4 (C) GCC – OpenACC 1+ Compilers ConnectX-5 HDR (200Gb/s) Enhanced CAPI over PCIe Gen4 Mellanox Interconnect Technology Connect-IB FDR (50Gb/s) PCIe Gen3 ConnectX-4 EDR (100Gb/s) CAPI over PCIe Gen3 Tesla PCIe Gen3 Volta Enhanced NVLink NVIDIA GPUs Pascal NVLink POWER8 POWER8 POWER9 OpenPower CAPI Interface Enhanced CAPI & NVLink IBM CPUs NVLink 2015 2016 2017

IBM HPC ecosystem - Platform Computing RTM LSF PAC IBM Platform Computing Resource Management Spectrum Scale/ ESS Storage Platform Clustering Manager Provisioning xcat

Applications considered Application Category Ported GPU Enabled Performance Metrics Avail In-process Planned to be ported Under Consideration Astrophysics 2 3 Biological Sciences 55 9 4 14 15 Chemistry and Physics 7 12 16 37 Computer-aided Engineering (CAE) 8 5 Deep Learning 1 Defense & Intelligence Digital Content Creation and Distribution (DCC / DCD) Electronic Design Automation (EDA) Financial Modeling Geosciences 13 Weather and Climate 10 Benchmarks 11 6 Utilities

Applications considered POWER8: Application Performance Advantages over Intel Haswell

Questions? guillaume.villette@fr.ibm.com