Comprehensive Scientific Support Of Large Scale Parallel Computation David Skinner, NERSC.

Presentation transcript:

Comprehensive Scientific Support Of Large Scale Parallel Computation David Skinner, NERSC

Overview
– What is NERSC? An insider's view.
– What is Comprehensive Scientific Support?
– Summary of INCITE 2004

Focus on service to science projects
Research (Computational Research Division) and the facility (Production Computing) together provide:
– Identification of research compute needs
– Experience with real scientific applications
– New tools and solutions
– Rapid introduction of new technology
– Perspectives on emerging architectures

National Energy Research Scientific Computing Center (NERSC)
– ~2000 users in ~400 projects
– Serves all disciplines of the DOE Office of Science
– Focus on large-scale computing

NERSC: Mission and Customers
NERSC provides reliable computing infrastructure, HPC consultancy, and accurate resource accounting to a wide spectrum of science areas.

Comprehensive Scientific Support
Here's the machine. What more do you need?

Many Time Scales, Many Choices
seaborg.nersc.gov: an IBM SP with 380 x 16-way SMP NHII nodes (Colony switch CSS0/CSS1, GPFS, HPSS)

Resource         Speed     Bytes
Registers        3 ns      2560 B
L1 Cache         5 ns      32 KB
L2 Cache         45 ns     8 MB
Main Memory      300 ns    16 GB
Remote Memory    19 us     7 TB
GPFS             10 ms     50 TB
HPSS             5 s       9 PB

The NERSC Center is focused on scientific productivity: removing bottlenecks speeds research (see the illustrative sketch below).
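The table above spans nine orders of magnitude in latency. As an illustration only (not taken from the slides), the following C sketch is a pointer-chasing micro-benchmark that makes the cache and memory gaps visible on a node; the working-set sizes, iteration count, and use of clock_gettime() are assumptions for a generic POSIX system, not seaborg-specific code.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + 1e-9 * ts.tv_nsec;
}

int main(void) {
    /* Working sets sized to land roughly in L1, L2, and main memory. */
    size_t sizes_kb[] = {16, 4096, 262144};
    for (int s = 0; s < 3; s++) {
        size_t n = sizes_kb[s] * 1024 / sizeof(size_t);
        size_t *next = malloc(n * sizeof(size_t));
        if (!next) return 1;
        /* Sattolo shuffle: one long random cycle, so prefetching cannot help. */
        for (size_t i = 0; i < n; i++) next[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t t = next[i]; next[i] = next[j]; next[j] = t;
        }
        size_t idx = 0, steps = 20 * 1000 * 1000;
        double t0 = seconds();
        for (size_t k = 0; k < steps; k++) idx = next[idx];   /* dependent loads */
        double dt = seconds() - t0;
        /* Printing idx keeps the compiler from discarding the chase. */
        printf("%9zu KB working set: %6.1f ns per load (end=%zu)\n",
               sizes_kb[s], 1e9 * dt / steps, idx);
        free(next);
    }
    return 0;
}

Because each load depends on the previous one, the measured time per step approximates the latency of whichever level of the hierarchy holds the working set.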

Focus on Achieving Scientific Goals
Hardware and software are only the start:
– Queue policies, allocation, environment
– User training, code analysis and tuning
– HW and SW testing, reliability
An integrated approach enables rapid scientific impact.
Feedback from researchers is valued:
– NUG (NERSC Users' Group)
– Collaborative contact with NERSC
Not just optimizing codes: NERSC optimizes the HPC process.

Comprehensive Support
Why: so researchers can spend more time researching; quick answers = highly productive computing.
How:
– HPC consultancy and visualization resources: a breadth of scientific and HPC expertise
– Systems administration and operation: high availability, RAS expertise
– Networking and security: secure access without jumping through hoops
For many projects NERSC is a hub for HPC services: data storage, visualization, CVS, web.

Examples of Comprehensive Support: Performance Tuning
Analysis of parallel performance (see the sketch below):
– Detailed view of how the code performs
– Little effort by the researcher
Code performance tuning:
– Optimally tuned math libraries
– MPI tuning
– Scaling and job structuring
Parallel I/O strategies:
– Tuning for seaborg
– Using optimal I/O libraries
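As a hedged sketch of what "detailed view of how the code performs, little effort by the researcher" can mean in practice (illustrative C/MPI, not NERSC's actual profiling tooling), the fragment below times compute and MPI phases per rank and reduces the min/max spread, a quick load-balance and MPI-overhead metric:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double t_compute = 0.0, t_mpi = 0.0;

    for (int step = 0; step < 100; step++) {
        double t0 = MPI_Wtime();
        /* ... the application's compute kernel would run here ... */
        t_compute += MPI_Wtime() - t0;

        t0 = MPI_Wtime();
        /* ... halo exchange / collectives would run here ... */
        MPI_Barrier(MPI_COMM_WORLD);   /* stand-in for real communication */
        t_mpi += MPI_Wtime() - t0;
    }

    /* Reduce to rank 0: the max/min spread is a quick load-balance metric. */
    double mpi_max, mpi_min, comp_max, comp_min;
    MPI_Reduce(&t_mpi, &mpi_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&t_mpi, &mpi_min, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);
    MPI_Reduce(&t_compute, &comp_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&t_compute, &comp_min, 1, MPI_DOUBLE, MPI_MIN, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d ranks: compute %.3f-%.3f s, MPI %.3f-%.3f s\n",
               size, comp_min, comp_max, mpi_min, mpi_max);

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and run inside a batch job, a large gap between the minimum and maximum MPI time across ranks is often the first sign that scaling or job structuring needs attention.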

Examples of Comprehensive Support: Information Services
An increasing number of information services are being offered online:
– Online documentation and tutorials
– Machine status / MOTD
– Queue look, history, job performance records
– Project-level summaries via NIM
Send us your ideas for improvement. Tell us what works for you.

Examples of Comprehensive Support: Information Services
Batch job history and performance data are web accessible after login.

The HPC Center: A Chemist's View
(Diagram: jobs flow through a queued workload of reg_16 / reg_64 / reg_128 queues onto the computer; code performance metrics and center performance feedback return to the user, producing science impact.)

INCITE 2004: Science Impact
Expanding scientific understanding:
– A quantum mechanical understanding of how cells protect themselves from the photosynthetic engine.
– Cosmological understanding of the distribution of atomic species generated by supernovae.
– Fundamental understanding of turbulence and mixing processes from astrophysical to microscopic scales.

INCITE 2004: How We Got There
MPI tuning (15-40% overall improvements):
– Scalability and load balance
– Advanced math and MPI libraries
Parallel I/O:
– Optimizing concurrent writes
– Implementing parallel HDF (see the sketch below)
Misc. programming:
– Batch "time remaining" routines
– Threading and math library tests
Networked data transfer:
– From scp to hsi (0.5 to 70 MB/s)
HPC support yields science productivity.
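The "implementing parallel HDF" item refers to moving concurrent writes into a parallel HDF library. The sketch below shows the general shape of a collective parallel HDF5 write in C, assuming an MPI-enabled HDF5 build and the 1.8+ API; the file name, dataset layout, and sizes are illustrative assumptions, not the INCITE codes' actual I/O:

#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* File access property list: open one shared file via MPI-IO. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One global 1-D dataset; each rank owns a contiguous slab. */
    hsize_t local_n = 1024, global_n = local_n * size, offset = local_n * rank;
    hid_t filespace = H5Screate_simple(1, &global_n, NULL);
    hid_t dset = H5Dcreate(file, "data", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    hid_t memspace = H5Screate_simple(1, &local_n, NULL);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset, NULL, &local_n, NULL);

    double buf[1024];
    for (hsize_t i = 0; i < local_n; i++) buf[i] = rank + i * 1e-6;

    /* Collective transfer: all ranks participate in a single write. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    H5Pclose(dxpl); H5Sclose(memspace); H5Dclose(dset);
    H5Sclose(filespace); H5Fclose(file); H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}

Each rank selects its own hyperslab of one shared dataset, so a single collective H5Dwrite replaces many uncoordinated per-task writes.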

NERSC: Ready for New Challenges
Emerging software and algorithms:
– Early testing of new HPC software
– Performance analysis of new HPC applications
Emerging HPC systems technology:
– Parallel file systems, parallel I/O profiling
Emerging architectures:
– "Performance Characteristics of a Cosmology Package on Leading HPC Architectures", HiPC 2004
– "Scientific Computations on Modern Parallel Vector Systems", Supercomputing 2004
– "A Performance Evaluation of the Cray X1 for Scientific Applications", VECPAR 2004
Partnerships to improve HPC:
– BluePlanet / ViVA / Workload Analysis / PERC
INCITE 2005!