CCSM Portability and Performance, Software Engineering Challenges, and Future Targets
Tony Craig, National Center for Atmospheric Research, Boulder, Colorado, USA
CAS Meeting, September 7-11, 2003, Annecy, France

Topics
CCSM SE and design overview
Coupler design and performance
Production and performance
–portability
–scaling
SE Challenges
The Future

CCSM Overview
CCSM = Community Climate System Model (NCAR)
Designed to evaluate and understand Earth's global climate, both historical and future.
Multiple executables (5):
–Atmosphere (CAM), MPI/OpenMP
–Ocean (POP), MPI
–Land (CLM), MPI/OpenMP
–Sea Ice (CSIM), MPI
–Coupler (cpl6), MPI

CCSM SE Overview
Good science is the top priority
Fortran 90 (mostly), 500k lines of code
Community project, dozens of developers
Collaborations are critical
–University Community
–DOE - SciDAC
–NASA - ESMF
Regular code releases
NetCDF history files
Binary restart files
Many levels of parallelism: multiple executables, MPI, OpenMP

CCSM "Hub and Spoke" System
(Diagram: cpl at the hub, with atm, ocn, ice, and lnd as spokes)
Each component is a separate executable
Each component runs on a unique set of hardware processors
All communication goes through the coupler
The coupler:
–communicates with all components
–maps (interpolates) data
–merges fields
–computes some fluxes
–has diagnostic, history, and restart capability
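The hub-and-spoke layout can be sketched in a few lines of MPI. The following is a minimal, hypothetical illustration in Python with mpi4py, not CCSM's actual Fortran/MPH startup code; the 16-rank layout and the processor ranges per component are assumptions chosen only for the example.

```python
# Minimal sketch (not CCSM code): disjoint processor sets per component,
# with all exchanges routed through the coupler's ranks.
from mpi4py import MPI

world = MPI.COMM_WORLD
rank = world.Get_rank()

# Illustrative layout for a job of up to 16 ranks; the real model reads
# its processor layout from its run configuration.
layout = {"cpl": range(0, 2), "atm": range(2, 8),
          "ocn": range(8, 12), "ice": range(12, 14), "lnd": range(14, 16)}

# Assign this rank to exactly one component ("hub and spoke": unique sets).
component = next(name for name, ranks in layout.items() if rank in ranks)
color = list(layout).index(component)

# Each component gets its own intra-communicator for its internal MPI work.
comp_comm = world.Split(color, key=rank)
comp_rank = comp_comm.Get_rank()

if component == "cpl":
    # Only the coupler talks to every component (maps, merges, diagnostics).
    pass  # e.g. world.Recv(...) from atm/ocn/ice/lnd, then world.Send(...)
else:
    # Components never talk to each other directly, only to the coupler.
    pass  # e.g. world.Send(fields, ...) / world.Recv(fluxes, ...)
```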

The CCSM coupler
Recent redesign (cpl6), with the goals:
Create a fully parallel, distributed-memory coupler
Implement M-to-N communication between components
Improve communication performance to minimize bottlenecks at higher resolutions in the future
Improve coupling interfaces; abstract the communication method away from components
Improve usability, flexibility, and extensibility of the coupled system
Improve overall performance

The Solution: cpl6
Build a new coupler framework with abstracted, parallel communication software in the foundation:
–MCT, the Model Coupling Toolkit (DOE Argonne National Lab)
–MPH, the Multi-Component Handshaking library (DOE Lawrence Berkeley National Lab)
Create a coupler application instantiation, called cpl6, which reproduces the functionality of cpl5.

cpl6 Design: Another view of CCSM
In cpl5, MPI was the coupling interface
In cpl6, the "coupler" is now attached to each component:
–components are unaware of the coupling method
–coupling work can be carried out on component processors
–a separate coupler is no longer absolutely required
(Diagram: atm, lnd, ice, ocn, and cpl all sitting on a common coupling interface layer above the hardware processors)
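As a rough illustration of the "coupling interface layer" idea, here is a minimal Python sketch. The class and method names are illustrative only, not the cpl6 API: the point is that the component calls a small put/get interface and never sees the transport, so coupling work can run on the component's own processors and the transport can be swapped without touching component code.

```python
# Minimal sketch (illustrative names, not the actual cpl6 API): a coupling
# interface layer attached to the component, hiding the transport behind it.
class CouplingInterface:
    """The component calls put/get; it never sees MPI (or any transport) directly."""

    def __init__(self, transport):
        self._transport = transport  # could be M-to-N MPI, shared memory, ...

    def put(self, fields):
        # Coupling work (e.g. regridding, merging) can happen here, on the
        # component's own processors, before anything leaves the component.
        self._transport.send(fields)

    def get(self):
        return self._transport.recv()


class LoopbackTransport:
    """Stand-in transport so the sketch runs without MPI."""
    def __init__(self):
        self._buf = None
    def send(self, fields):
        self._buf = fields
    def recv(self):
        return self._buf


if __name__ == "__main__":
    cpl = CouplingInterface(LoopbackTransport())
    cpl.put({"t_surface": 288.0})   # component exports a (made-up) field
    print(cpl.get())                # the coupler side would consume it
```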

CCSM Communication: cpl5 vs cpl6
Benchmark: coupler on 8 PEs, ice component on 16 PEs, 240 transfers of 21 fields, production configuration
cpl5: gather/scatter within each component plus root-to-root communication, with an extra copy
cpl6: M-to-N communication directly between the distributed components, with no copy
cpl5 communication = 61.5 s
cpl6 communication = 18.5 s
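The pattern difference can be illustrated with a small mpi4py sketch. This is not CCSM or MCT code (cpl6 builds on MCT's parallel data-movement routines), and it simplifies both paths into a single communicator; it only shows why funneling a field through a root serializes and copies, while an M-to-N exchange moves each piece directly between the processors that own it.

```python
# Minimal sketch (not CCSM code) of the two communication patterns.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# This rank's slice of a coupling field (values are just the rank id).
local = np.full(4, rank, dtype="d")

# cpl5-style: gather the whole field onto a root, communicate root to root,
# then scatter; everything serializes through one processor with an extra copy.
whole = comm.gather(local, root=0)      # root now holds every slice
back = comm.scatter(whole, root=0)      # and redistributes it

# cpl6-style: each rank exchanges pieces directly with every rank that needs
# them (M to N); no root, no global-size copy.
pieces = np.array_split(local, size)    # illustrative decomposition
received = comm.alltoall(pieces)
```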

CCSM Production
Forward integration of relatively coarse models:
–atm/land - T42 (128x64, L26)
–ocn/ice - 1 degree (320x384, L40)
Finite difference and spectral, explicit and implicit methods, vertical physics, global sums, nearest-neighbor communication
I/O not a bottleneck (5 GB / simulated year)
Restart capability (750 MB)
Separate harvesting to local mass storage system
Auto resubmit
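The last three items describe the production workflow: each batch job advances the model, writes a restart, has its output harvested to mass storage, and resubmits itself until the run is complete. The sketch below is purely hypothetical (CCSM's actual run scripts, restart format, and batch commands are not shown here); it only illustrates the shape of such a segment-and-resubmit loop.

```python
# Hypothetical sketch of a segmented production run with restart + resubmit.
import pickle
import subprocess

SEGMENT_YEARS = 5                    # simulated years per batch job (made up)
RESTART_FILE = "run.restart.pkl"     # stand-in for the model's restart files
TARGET_YEARS = 100                   # total length of the experiment (made up)

def run_segment(state):
    for _ in range(SEGMENT_YEARS):
        state["year"] += 1           # stand-in for advancing one model year
    with open(RESTART_FILE, "wb") as f:
        pickle.dump(state, f)        # restart capability
    # history/restart files would then be harvested to mass storage

if __name__ == "__main__":
    try:
        with open(RESTART_FILE, "rb") as f:
            state = pickle.load(f)   # continuation run from the last restart
    except FileNotFoundError:
        state = {"year": 0}          # initial run
    run_segment(state)
    if state["year"] < TARGET_YEARS:
        # auto resubmit: a real script would call the batch system here
        subprocess.run(["echo", "resubmit next segment"])
```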

CCSM Throughput vs Resolution
(Table: processor counts and throughput* in simulated years per day for six configurations:
T31 (3.7 deg) atm / 3 deg ocn; T42 (2.8 deg) / 1 deg; T42 / 1 deg; T42 / 1 deg; T85 (1.4 deg) / 1 deg; T170 (0.7 deg) / 1 deg (estimate))
*IBM Power4 system, bluesky, as of 9/1/2003

CCSM Throughput vs Platform
(Table: processor counts and throughput* in simulated years per day for: IBM power3 wh (NCAR), IBM power3 nh (NERSC), SGI O3K (LANL), Linux cluster** (ANL, estimate), HP/CPQ Alpha (PSC, estimate), IBM power4 (NCAR, two configurations), NEC Earth Simulator (estimate), Cray X1 (ORNL, estimate))
*T42/1 degree atm/ocn resolution
**ANL jazz machine, 2.4 GHz Pentium

Ocean Model Performance and Scaling
(Figure: ocean model performance and scaling results; courtesy of P. W. Jones, P. H. Worley, Y. Yoshida, J. B. White III, J. Levesque)

CCSM Component Scaling
(Figure: per-component scaling for CCSM2_2_beta08, T42_gx1v3, on the IBM Power4 system bluesky)

CCSM Load Balance Example
Processor layout: 64 ocn, 48 atm, 16 ice, 8 lnd, 16 cpl (152 total)
(Figure: seconds per simulated day vs processors, CCSM2_2_beta08 on the IBM Power4 system bluesky)
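The load-balance reasoning behind this layout can be made explicit. Because the components run concurrently on disjoint processor sets, the coupled model can go no faster than its slowest component, and a cost in seconds per simulated day converts directly to simulated years per wallclock day. The per-component costs below are made-up placeholders (the real values come from the timing figure); only the arithmetic is the point.

```python
# Load-balance arithmetic sketch; the cost numbers are hypothetical.
costs = {"atm": 9.0, "ocn": 8.5, "ice": 3.0, "lnd": 1.0}  # s per simulated day

# Concurrent components on disjoint processors: the slowest one sets the pace
# (plus coupling overhead, ignored here).
slowest = max(costs, key=costs.get)
sec_per_sim_day = costs[slowest]

# 86400 wallclock seconds per day / (365 simulated days per simulated year * cost).
years_per_day = 86400.0 / (365.0 * sec_per_sim_day)
print(f"limited by {slowest}: {years_per_day:.1f} simulated years per wallclock day")
```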

Challenges in the Environment (1)
Machines often not well balanced:
–chip speed
–interconnect
–memory access
–cache
–I/O
–vector capabilities
Each machine is "balanced" differently
Optimum coding strategy often depends largely on platform
Need to develop "flexible" software strategies

RISC vs Vector
Data layout: index order, data structure layout
Floating-point operation count (if tests) versus pipelining (masking); see the sketch below
Loop ordering and loop structure
Vectorization impacts parallelization
Memory access, cache blocking, array layouts, array usage
Bottom line (in my opinion):
–Truly effective cache reuse is very hard to achieve on real codes
–Sustained performance on some RISC machines is disappointing
–Poor vectorization costs an order of magnitude in performance on vector machines
–We are now (re-)vectorizing and expect to pay little or no performance penalty on RISC machines
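The "if versus masking" trade-off can be shown in a few lines. The sketch below uses NumPy rather than Fortran, and the field, threshold, and flux formula are invented for illustration: the branching loop does the minimum floating-point work but tests a condition at every point, while the masked form computes everywhere and keeps results only where the mask holds, which is the shape a vectorizing compiler (or vector hardware) wants.

```python
# NumPy sketch of branch-per-point vs whole-array masking (illustrative data).
import numpy as np

sst = np.random.uniform(260.0, 305.0, size=100_000)  # hypothetical field
frozen = 271.35                                        # illustrative threshold

# Branching form: one "if" per grid point, hostile to vectorization.
flux_branch = np.empty_like(sst)
for i in range(sst.size):
    flux_branch[i] = 0.0 if sst[i] < frozen else 1.5 * (sst[i] - frozen)

# Masked form: compute everywhere, then select with a mask; more flops,
# no per-element branch, and it vectorizes.
flux_masked = np.where(sst < frozen, 0.0, 1.5 * (sst - frozen))

assert np.allclose(flux_branch, flux_masked)
```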

Challenges in the Environment (2)
Startup and control of multiple executables
Compilers and libraries
Tools
–Debuggers inadequate for multiple executables and MPI/OpenMP parallel models
–Timing and profiling tools generally inadequate (IBM HPM getting better, Jumpshot works well, Cray performance tools look promising)
–Have avoided instrumenting code (risk, robustness, #if)
–Use print statements and calls to the system clock (see the sketch below)
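The "print statements plus system clock" approach amounts to a tiny wall-clock timer around a region of code. The helper below is a hypothetical Python illustration of that idea, not anything from CCSM.

```python
# Minimal timing sketch: wall-clock around a code region, reported by print.
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    t0 = time.perf_counter()           # the "system clock" call
    yield
    dt = time.perf_counter() - t0
    print(f"{label}: {dt:.3f} s")      # the "print statement"

if __name__ == "__main__":
    with timed("ocean step"):
        sum(i * i for i in range(1_000_000))   # stand-in for model work
```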

Summary
Science is the top priority; large community project, regular model releases
SE improvements are continuous; cpl6 is a success
Machines change rapidly and are highly variable in architecture
Component scaling and CCSM load balance are acceptable
(Re-)vectorization is underway
Tools and machine software can present significant challenges

Future
Increased coupling flexibility (see the sketch below):
–single executable
–mixed concurrent/serial design
Continue to work on scalar and parallel performance in all models
Take advantage of libraries and collaborations for performance-portability software, more layering, and leveraging of external efforts:
–NASA/ESMF
–DOE/SciDAC
–University community
–Others
IBM is still an important production platform for CCSM
CCSM is moving onto vector platforms and Linux clusters; production capability on these platforms is still to be determined
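A single-executable design with a mixed concurrent/serial option could look roughly like the driver sketched below. This is a hypothetical mpi4py illustration, not the eventual CCSM design: with the concurrent option the world communicator is split and each component advances on its own ranks; with the serial option every component runs in turn on all ranks of the one executable.

```python
# Hypothetical single-executable driver: concurrent vs serial components.
from mpi4py import MPI

world = MPI.COMM_WORLD
rank, size = world.Get_rank(), world.Get_size()
CONCURRENT = size >= 2            # illustrative run-time switch

def atm_run(comm):
    print(f"atm advancing on {comm.Get_size()} ranks")

def ocn_run(comm):
    print(f"ocn advancing on {comm.Get_size()} ranks")

if CONCURRENT:
    # Concurrent: split the world; each component owns a disjoint set of ranks.
    color = 0 if rank < size // 2 else 1
    comm = world.Split(color, key=rank)
    (atm_run if color == 0 else ocn_run)(comm)
else:
    # Serial: components run one after another on all ranks.
    atm_run(world)
    ocn_run(world)
```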

THE END