Parallelizing ROMS for Distributed Memory Machines using the Scalable Modeling System (SMS)
Dan Schaffer, NOAA Forecast Systems Laboratory (FSL)
August 2001


Outline
– Who we are
– Intro to SMS
– Application of SMS to ROMS
– Ongoing Work
– Conclusion

Who we are
– Mark Govett, Leslie Hart, Tom Henderson, Jacques Middlecoff, Dan Schaffer
– Developing SMS for 20+ man-years

Intro to SMS
Overview
– Directive-based: FORTRAN comments enable single-source parallelization (example below)
– Runs on distributed- or shared-memory machines
– Performance portability
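Because an SMS directive is a legal Fortran comment, the same source compiles serially with any compiler, while PPP acts on the directives for the parallel build. A minimal sketch of the idea (variable names hypothetical; the exchange directive is one of those listed on the directives slide later):

C To a serial compiler the next line is an ordinary comment; PPP
C translates it into a halo update for the parallel build.
CSMS$EXCHANGE(h)
      do i = istr, iend
        hnew(i) = h(i) + dt * (h(i+1) - 2.0 * h(i) + h(i-1))
      end do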

Distributed Memory Parallelism

Code Parallelization using SMS
Original Serial Code → add SMS directives → SMS Serial Code
SMS Serial Code → serial compiler → Serial Executable
SMS Serial Code → PPP (Parallel Pre-Processor) → SMS Parallel Code → Parallel Executable

Low-Level SMS
SMS parallel code sits on top of the low-level SMS libraries (FDA library, spectral library, parallel I/O), which are in turn built over MPI, SHMEM, etc.

Intro to SMS (contd)
– Support for all of F77 plus much of F90, including:
  – Dynamic memory allocation
  – Modules (partially supported)
  – User-defined types
– Supported machines:
  – COMPAQ Alpha-Linux Cluster (FSL “Jet”)
  – PC-Linux Cluster
  – SUN Sparcstation
  – SGI Origin 2000
  – IBM SP-2

Intro to SMS (contd)
Models parallelized:
– Ocean: ROMS, HYCOM, POM
– Mesoscale weather: FSL RUC, FSL QNH, NWS Eta, Taiwan TFS (nested)
– Global weather: Taiwan GFS (spectral)
– Atmospheric chemistry: NOAA Aeronomy Lab

Key SMS Directives
– Data decomposition: csms$declare_decomp, csms$create_decomp, csms$distribute (combined example below)
– Communication: csms$exchange, csms$reduce
– Index translation: csms$parallel
– Incremental parallelization: csms$serial
– Performance tuning: csms$flush_output
– Debugging support: csms$reduce (bitwise exact), csms$compare_var, csms$check_halo
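Putting the decomposition, communication, and index-translation directives together, a 1-D decomposition might look like the following sketch. The directive names are those listed above; the argument syntax (angle brackets, BEGIN/END regions) is an approximation recalled from the SMS user's guide, and the handle and array names are hypothetical:

      program demo
      implicit none
      integer im, i
      parameter (im = 128)
C Declare a decomposition handle ("dh" is a hypothetical name)
CSMS$DECLARE_DECOMP(dh)
C Arrays declared inside a DISTRIBUTE region are decomposed over dh
CSMS$DISTRIBUTE(dh, <im>) BEGIN
      real x(im), y(im)
CSMS$DISTRIBUTE END
C Split the im points across the processes, with a halo width of 1
CSMS$CREATE_DECOMP(dh, <im>, <1>)
C Loop bounds inside a PARALLEL region are translated to each
C process's local index range
CSMS$PARALLEL(dh, <i>) BEGIN
      do i = 1, im
        y(i) = real(i)
      end do
C Refresh the halo points of y before the stencil reads neighbors
CSMS$EXCHANGE(y)
      do i = 2, im - 1
        x(i) = 0.5 * (y(i-1) + y(i+1))
      end do
CSMS$PARALLEL END
      end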

SMS Serial Code

Advanced Features
– Nesting
– Incremental parallelization
– Debugging support (run-time configurable; sketch below):
  – CSMS$REDUCE: enables bit-wise exact reductions
  – CSMS$CHECK_HALO: verifies a halo region is up to date
  – CSMS$COMPARE_VAR: compares variables between simultaneous runs with different numbers of processors
– HYCOM 1-D decomposition parallelized in 9 days
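The debugging directives might bracket a suspect computation roughly as follows (argument syntax approximate; the routine and variables are hypothetical):

C Verify that the halo region of u is up to date at this point
CSMS$CHECK_HALO(u)
      call advect(t, u, dt)
C Under compare mode, check t against the same point of execution
C in a simultaneous run on a different number of processors
CSMS$COMPARE_VAR(t)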

Incremental Parallelization
The CSMS$SERIAL directive lets unconverted code such as CALL NOT_PARALLEL(...) stay serial: “local” (decomposed) data is gathered into “global” arrays, the enclosed code runs serially, and results are scattered back to “local” arrays (sketch below).
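In code, the pattern in the diagram reduces to wrapping the routine in a SERIAL region (BEGIN/END region syntax assumed here; NOT_PARALLEL is the placeholder name from the slide):

C Gather the decomposed ("local") arrays into "global" arrays,
C then run the enclosed code on a single process
CSMS$SERIAL BEGIN
      call not_parallel(a, b, n)
CSMS$SERIAL END
C On exit the updated global arrays are scattered back to the
C local decomposed arrays, and parallel execution resumes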

Advanced Features (contd)
– Overlapping output with computations (FORTRAN-style I/O only)
– Run-time process configuration
  – Specify the number of processors per decomposed dimension, or the number of grid points per processor
  – 15% performance boost for HYCOM
  – Support for irregular grids coming soon

SMS Performance (Eta)
– Eta model run in production at NCEP for use in National Weather Service forecasts
– Lines of code (excluding comments)
– 198 SMS directives added to the code

Eta Performance
– Performance measured on the NCEP SP-2, I/O excluded
– Resolution: 223x365x45
– 88-PE run-time beats NCEP hand-coded MPI by 1%
– 88-PE exchange time beats hand-coded MPI by 17%

SMS Performance (HYCOM)
– 4500 lines of code (excluding comments)
– 108 OpenMP directives included in the code
– 143 SMS directives added to the code

HYCOM Performance
– Performance measured on an O2K
– Resolution: 135x256x14
– Serial code runs in 136 seconds

Intro to SMS (contd)
– Extensive documentation available on the web
– New development aided by:
  – Regression test suite
  – Web-based bug tracking system

Outline
– Who we are
– Intro to SMS
– Application of SMS to ROMS
– Ongoing Work
– Conclusion

SMS ROMS Implementation
– Used awk and cpp to convert ROMS to dynamic memory, simplifying SMS parallelization
– Leveraged the existing shared-memory parallelism: loops already written as do I = ISTR, IEND (see the loop sketch below)
– Directives added to handle the NEP scenario
– Lines of code, 132 SMS directives
– Handled netCDF I/O with CSMS$SERIAL
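Because the ROMS loops already ran over tile bounds, the parallel-loop translation was largely mechanical; a hedged sketch of the pattern (decomposition handle and array names hypothetical):

C ISTR/IEND and JSTR/JEND are translated to each process's local
C subdomain by the enclosing PARALLEL region
CSMS$PARALLEL(dh, <i>, <j>) BEGIN
      do j = jstr, jend
        do i = istr, iend
          zeta(i,j) = zeta(i,j) - dtfast * div(i,j)
        end do
      end do
CSMS$PARALLEL END

The netCDF reads and writes stayed serial: each call site was wrapped in a CSMS$SERIAL region like the one sketched earlier, so the unmodified netCDF code runs on a single process against gathered global arrays.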

Results and Performance
– Runs and produces correct answers on all supported SMS machines
– Low resolution (128x128x30):
  – “Jet”, O2K, T3E scaling
  – Run-times for the main loop (21 time steps), excluding I/O
– High resolution (210x550x30):
  – PMEL is using it in production
  – 97% efficiency between 8 and 16 processors on “Jet”

SMS Low Res ROMS “Jet” Performance

SMS Low Res ROMS O2K Performance

SMS Low Res ROMS T3E Performance

Outline
– Who we are
– Intro to SMS
– Application of SMS to ROMS
– Ongoing Work
– Conclusion

Ongoing Work (funding dependent)
– Full F90 support
– Support for parallel netCDF
– T3E port
– SHMEM implementation on T3E, O2K
– Parallelize other ROMS scenarios
– Implement SMS nested ROMS
– Implement SMS coupled ROMS/COAMPS

Conclusion
– SMS is a high-level, directive-based tool
– Simple single-source parallelization
– Performance optimizations provided
– Strong debugging support included
– Performance beats hand-coded MPI
– SMS is performance portable

Web Site
www-ad.fsl.noaa.gov/ac/sms.html