Presentation transcript:

Benefits

CAAR Project Phases Each of the CAAR projects will consist of:
1. A three-year Application Readiness phase (2015-17), in which the code refactoring and porting work will take place, and
2. An Early Science phase (2018), for tuning of the code to the Summit architecture and demonstration of the application through a scientific grand-challenge project.

Partnership Teams The partnership teams consist of the core developers of the application and staff from the OLCF who will be assigned to the project.

Partnership team responsibilities
1. Develop a technical application porting and performance improvement plan, with reviewable milestones, for the Application Readiness phase of the project
2. Develop a management plan with a clear description of the responsibilities of the CAAR team (the core application developers, the Scientific Computing staff member assigned by the OLCF, and the OLCF postdoctoral fellow) that will carry out the code optimization, refactoring, testing, and profiling of the application
3. Develop a compelling scientific grand-challenge campaign for the Early Science phase of the project
4. Assign an application scientist who, together with the CAAR team, will carry out the Early Science campaign
5. Prepare the necessary documentation for semi-annual reviews of achieved milestones and for intermediate and final reports

Partnership resources
1. The core development team of the application, with a stated level of effort dedicated to the partnership
2. An ORNL Scientific Computing staff member, who will partner with the core application development team to jointly carry out the code profiling and optimization tasks; the OLCF commits a minimum of one-third of an FTE per year to the partnership
3. A full-time postdoctoral fellow, located and mentored at the OLCF, who will engage with the CAAR team for code profiling, optimization, and execution of the science challenge
4. Allocation of compute resources on Titan
5. Allocation of compute resources at the ALCF and at NERSC to enable performance portability to multiple architectures
6. Support from the IBM/NVIDIA Center of Excellence staff at ORNL as needed
7. Access to early delivery systems and the Summit system as they become available
8. Allocation of compute resources on the full Summit system for the Early Science campaign

Support Provided CAAR teams receive support from the IBM/NVIDIA Center of Excellence at Oak Ridge National Laboratory and have access to computational resources including Titan at the OLCF, Mira at the ALCF, Edison and Cori at NERSC, and early delivery systems and Summit as they become available.

CAAR Partnership Activities
1. Common training of all Application Readiness teams
   a. Architecture and performance portability
   b. Avoidance of duplicate efforts
2. Application Readiness technical plan development and execution
   a. Code analysis and benchmarking to understand application characteristics: code structure, code suitability for an architecture port, algorithm structure, data structures and data movement patterns, and code execution characteristics ("hot spots" or a "flat" execution profile)
   b. Develop a parallelization and optimization approach to determine the algorithms and code components to port, how to map algorithmic parallelism to architectural features, and how to manage data locality and motion
   c. Decide on a programming model, such as compiler directives, libraries, or explicit coding models (a minimal sketch follows this list)
   d. Execute the technical plan: benchmarking, code rewrite or refactoring, porting and testing, managing portability, and managing inclusion in the main code repository
3. Development and execution of an Early Science project, i.e., a challenging science problem that demonstrates the performance and scientific impact of the developed application port
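
To make the "compiler directives" programming-model option concrete, here is a minimal sketch of one loop expressed once with OpenACC and once with OpenMP 4.5 target directives; the function and variable names are hypothetical and are not taken from any CAAR application.

/* Hypothetical sketch: the same loop offloaded via OpenACC and via
   OpenMP 4.5 target directives. Compile as C99; names are illustrative. */

void scale_add_acc(long n, double a, const double *x, double *y)
{
    /* OpenACC: the compiler generates a GPU kernel and handles data movement */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (long i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

void scale_add_omp(long n, double a, const double *x, double *y)
{
    /* OpenMP 4.5 target offload: the same loop with a different directive set */
    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
    for (long i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

Either variant can also be compiled for host-only execution by disabling the directives, which is one reason directive-based models are attractive for maintaining a single portable source.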

Selection criteria CAAR projects will be selected on the basis of:
1. The anticipated impact on the science and engineering fields
2. The importance to the user programs of the OLCF
3. The feasibility of achieving scalable performance on Summit
4. The anticipated opportunity to achieve performance portability for other architectures
5. The algorithmic and scientific diversity of the suite of CAAR applications
Decisions will be made by the OLCF Scientific Computing staff, in consultation with the IBM/NVIDIA Center of Excellence at Oak Ridge National Laboratory and the DOE Office of Advanced Scientific Computing Research.

Selection criteria
1. Anticipated impact on the science and engineering fields
2. Importance to the user programs of the OLCF
3. Feasibility of achieving scalable performance on Summit
4. Anticipated opportunity to achieve performance portability for other architectures
5. Algorithmic and scientific diversity of the suite of CAAR applications
6. Optimizations incorporated into the master repository
7. Size of the application's user base

Portability

Performance portability to other architectures is an important consideration, and the CAAR is collaborating with the Argonne Leadership Computing Facility (ALCF) and the National Energy Research Scientific Computing Center (NERSC) to enhance application portability across their respective architectures.

Portability
Application portability among NERSC, ALCF, and OLCF architectures is a critical concern of ASCR
Application developers target a wide range of architectures
Maintaining multiple code versions is difficult
Porting to different architectures is time-consuming
Many Principal Investigators have allocations on multiple resources
Applications far outlive any computer system
The primary task is exposing parallelism and data locality (see the sketch below)
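
As a small illustration of the "exposing parallelism and data locality" point, the hedged sketch below uses an OpenACC structured data region (one possible approach, not a prescribed CAAR method) so that the arrays stay resident on the device across repeated kernels and data moves between host and device only at the region boundaries; all names are illustrative.

/* Hypothetical sketch: u and unew remain on the device for all nsteps
   iterations; only u is copied in at entry and back out at exit. */
void relax(long n, int nsteps, double *u, double *unew)
{
    #pragma acc data copy(u[0:n]) create(unew[0:n])
    for (int step = 0; step < nsteps; step++) {
        #pragma acc parallel loop present(u[0:n], unew[0:n])
        for (long i = 1; i < n - 1; i++)
            unew[i] = 0.5 * (u[i - 1] + u[i + 1]);   /* simple averaging stencil */

        #pragma acc parallel loop present(u[0:n], unew[0:n])
        for (long i = 1; i < n - 1; i++)
            u[i] = unew[i];                          /* update stays on the device */
    }
}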

Summit System

Summit Architecture The architecture of Summit will consist of nodes with multiple IBM POWER9 CPUs and NVIDIA Volta GPU accelerators, using a coherent memory space that includes high-bandwidth memory (HBM) on the GPUs and a high-speed NVLink interconnect between the POWER9 CPUs and Volta GPUs. Internode communication will be through a Mellanox InfiniBand EDR interconnect. The peak performance of this system is expected to be five to ten times that of Titan.

Summit Architecture Approximately 3,400 nodes, each with:
–Multiple IBM POWER9 CPUs and multiple NVIDIA Tesla® GPUs using the NVIDIA Volta™ architecture
–CPUs and GPUs completely connected with high-speed NVLink™
–Large coherent memory: over 512 GB (HBM + DDR4), all directly addressable from the CPUs and GPUs (see the sketch after this list)
–An additional 800 GB of NVRAM, which can be configured as either a burst buffer or as extended memory
–Over 40 TF peak performance
System-wide:
–Dual-rail Mellanox® EDR InfiniBand full, non-blocking fat-tree interconnect
–IBM Elastic Storage (GPFS™): 1 TB/s I/O and 120 PB disk capacity
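
The coherent, directly addressable memory described above can be approximated today with CUDA managed memory; the hedged sketch below allocates arrays visible to both the CPU and the GPU and hands them to a cuBLAS routine without any explicit copies. This is an illustration under that assumption, not Summit-specific code; compile with nvcc and link with -lcublas.

/* Hypothetical sketch: a single allocation addressable from host and device. */
#include <stdio.h>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main(void)
{
    const int n = 1 << 20;
    const double a = 2.0;
    double *x, *y;

    /* One allocation each, usable by host code and GPU kernels alike */
    cudaMallocManaged((void **)&x, n * sizeof(double), cudaMemAttachGlobal);
    cudaMallocManaged((void **)&y, n * sizeof(double), cudaMemAttachGlobal);
    for (int i = 0; i < n; i++) { x[i] = 1.0; y[i] = 2.0; }   /* host initialization */

    cublasHandle_t handle;
    cublasCreate(&handle);
    cublasDaxpy(handle, n, &a, x, 1, y, 1);   /* y = a*x + y, executed on the GPU */
    cudaDeviceSynchronize();                  /* ensure results are visible to the host */

    printf("y[0] = %f\n", y[0]);              /* host reads through the same pointer */
    cublasDestroy(handle);
    cudaFree(x);
    cudaFree(y);
    return 0;
}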

Summit System Software
System
–Linux®
–IBM Elastic Storage (GPFS™)
–IBM Platform Computing™ (LSF)
–IBM Platform Cluster Manager™ (xCAT)

Programming Environment
–Compilers supporting OpenMP, OpenACC, and CUDA: IBM XL, PGI, LLVM, GNU, NVIDIA
–Libraries (see the sketch following this list): IBM Engineering and Scientific Subroutine Library (ESSL); FFTW, ScaLAPACK, PETSc, Trilinos, BLAS-1, -2, -3, NVBLAS; cuFFT, cuSPARSE, cuRAND, NPP, Thrust
–Debugging: Allinea DDT, IBM Parallel Environment Runtime Edition (pdb), cuda-gdb, cuda-memcheck, valgrind, memcheck, helgrind, stacktrace
–Profiling: IBM Parallel Environment Developer Edition (HPC Toolkit), VAMPIR, TAU, Open|SpeedShop, nvprof, gprof, Rice HPCToolkit
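
As one hedged example of calling the GPU libraries listed above, the sketch below runs a 1-D double-complex forward FFT with cuFFT; the wrapper function, array handling, and omission of error checking are illustrative assumptions, not taken from the Summit programming environment. Link with -lcufft.

/* Hypothetical sketch: in-place 1-D Z2Z forward transform on the device. */
#include <cuda_runtime.h>
#include <cufft.h>

void forward_fft(cufftDoubleComplex *host_data, int n)
{
    cufftDoubleComplex *dev_data;
    cudaMalloc((void **)&dev_data, n * sizeof(cufftDoubleComplex));
    cudaMemcpy(dev_data, host_data, n * sizeof(cufftDoubleComplex),
               cudaMemcpyHostToDevice);

    cufftHandle plan;
    cufftPlan1d(&plan, n, CUFFT_Z2Z, 1);                   /* one 1-D transform */
    cufftExecZ2Z(plan, dev_data, dev_data, CUFFT_FORWARD); /* forward FFT in place */

    cudaMemcpy(host_data, dev_data, n * sizeof(cufftDoubleComplex),
               cudaMemcpyDeviceToHost);
    cufftDestroy(plan);
    cudaFree(dev_data);
}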