1
CESM Infrastructure Update
Mariana Vertenstein
CESM Software Engineering Group
National Center for Atmospheric Research
CESM is primarily sponsored by the National Science Foundation and the Department of Energy
2
Outline
- New approach to infrastructure development: Common Infrastructure for Modeling the Earth (CIME)
- New coupling complexity
  - New components, routing complexity, grids
  - Challenges of data assimilation
- ESMF collaboration
  - NUOPC, online regridding
- New infrastructure capabilities
  - (1) Statistical Ensemble Test
  - (2) Creation of parallel workflow capabilities
  - (3) PIO2
3
Common Infrastructure for Modeling the Earth
CIME: A New Approach for Earth System Modeling (the CESM example)
4
In the past, infrastructure (no IP) was tied to science development (has IP)
Diagram: COUPLER linking Ocean (POP, Data), Sea Ice (CICE, Data), Atmosphere (CAM, Data), Land (CLM, Data), Wave (WW3, Data), River (RTM, Data) and Land Ice (CISM).
5
Why CIME?
- Facilitate infrastructure modernization as a collaborative project (e.g. CESM infrastructure)
- Respond to the February summit of the US Global Change Research Program (USGCRP) / Interagency Group on Integrative Modeling (IGIM); CIME is a positive outcome of that summit
  - IGIM is charged with coordinating global change-related modeling activities across the Federal Government and providing guidance to USGCRP on modeling priorities
- Enable separation of infrastructure (no intellectual property) from scientific development codes (intellectual property must be protected)
- Eliminate duplication of effort
6
CIME: current steps forward
- Move ALL CESM infrastructure to a PUBLIC GitHub repository
- This will facilitate AND encourage
  - outside collaboration
  - frequent feedback on infrastructure development
  - quick problem resolution
  - rapid improvement in the productivity, reliability and extensibility of the CIME infrastructure
- CIME can be developed and tested as a stand-alone system, independent of prognostic components
7
Old paradigm: everything in restricted developer repository
- Infrastructure (Restricted Subversion Repository)
  - Driver-Coupler Code
  - Share Code
  - Scripts
  - System and Unit Testing
  - Mapping Utilities
- All model components (Restricted Subversion Repository)
  - ATM Models: CAM (prognostic), DATM (data), SATM (stub), XATM (cpl test)
8
New paradigm: all infrastructure is Open Source; IP still in place for prognostic components
- PUBLIC Open Source GitHub Repository
  - Driver-Coupler
  - Share Code
  - Scripts
  - Mapping Utilities
  - System/Unit Testing
  - All Data Models
  - All Stub Models
  - All cpl-test Models
- Only prognostic components (Restricted Subversion Repository)
  - CAM, CLM, CICE, POP (MPAS), RTM (MOSART), CISM
9
CIME infrastructure can be used to facilitate releases and external collaborations
- Infrastructure (PUBLIC Open Source GitHub Repository)
  - Driver-Coupler (ESMF Collaboration)
  - Share Code
  - Scripts
  - System/Unit Testing
  - Mapping Utilities
  - All Data Models
  - All Stub Models
  - All cpl-test Models
- Prognostic components, e.g. CESM, ESMF/NUOPC, HYCOM
10
CIME implementation
- Stand-alone capability
  - CIME can be run and tested “stand-alone” with all data models, all stub models, or all “test-cpl” models
- New unit testing framework
  - New unit tests in the coupler
  - Framework can also be applied to prognostic components
- Consolidation of separate externals
  - Each current part of CIME was previously developed independently, which often led to inconsistencies
  - Now CIME is a SINGLE entity, which ensures consistency among its various parts
  - This simplifies the development process and makes it more robust
11
Coupling Challenges
New component, routing and grid complexity; DART data assimilation
12
Coupling Complexity: currently 7 components
Diagram: COUPLER linking Ocean (POP, Data), Sea Ice (CICE, Data), Atmosphere (CAM, Data), Land (CLM, Data), Wave (WW3, Data), River (RTM, Data) and Land Ice (CISM).
13
Routing and Regridding Complexity
- Each component can run on its own grid; the only assumption is that ocn/ice are on the same grid
- Multiple grids supported
  - Regular lat/lon
  - Dipole, tripole (ocn-pop, ice-cice)
  - Hexagonal (regular and unstructured Voronoi meshes)
  - MPAS grids (atm dycore, ocn, ice, land-ice)
- Coupler is responsible for regridding, currently using mapping files that are generated offline with the ESMF parallel regridding tool
  - Fluxes mapped conservatively
  - States mapped with either bilinear or higher-order non-conservative methods
- Components communicate with the coupler at potentially different frequencies and with unique routing patterns
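The offline mapping files can be thought of as sparse weight matrices that the coupler applies at run time. A minimal sketch in Python, assuming a hypothetical list of (destination, source, weight) triplets rather than the actual ESMF weight-file layout, and a toy 1-D grid invented purely for illustration:

```python
def apply_mapping(weights, src_field, n_dst):
    """Apply sparse remapping weights: dst[i] = sum over j of w[i, j] * src[j]."""
    dst_field = [0.0] * n_dst
    for dst_idx, src_idx, w in weights:
        dst_field[dst_idx] += w * src_field[src_idx]
    return dst_field


# Toy grids (assumption): four source cells of area 1.0, two destination cells of area 2.0.
# Each conservative weight is the overlap area divided by the destination cell area.
conservative = [(0, 0, 0.5), (0, 1, 0.5), (1, 2, 0.5), (1, 3, 0.5)]
src_area = [1.0, 1.0, 1.0, 1.0]
dst_area = [2.0, 2.0]

src_flux = [10.0, 20.0, 30.0, 40.0]
dst_flux = apply_mapping(conservative, src_flux, n_dst=2)

# Conservation check: the area-integrated flux should match on both grids.
src_total = sum(f * a for f, a in zip(src_flux, src_area))
dst_total = sum(f * a for f, a in zip(dst_flux, dst_area))
print(dst_flux, src_total, dst_total)
```

With conservative weights the area-integrated flux is the same on both grids. A bilinear state mapping would reuse `apply_mapping` with a different weight list and carries no such guarantee.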
14
Coupling Complexity (Routing)
Diagram: COUPLER routing among Ocean (POP, Data), Sea Ice (CICE, Data), Atmosphere (CAM, Data) and Land Ice (CISM).
15
Coupling Complexity (Routing)
Diagram: COUPLER routing among Ocean (POP, Data), Sea Ice (CICE, Data), Land (CLM, Data) and River (RTM, Data).
16
Coupling Complexity (Routing)
Diagram: COUPLER routing among Atmosphere (CAM, Data), Land (CLM, Data), River (RTM, Data) and Land Ice (CISM).
19
Multiple instance capability: DART data assimilation
Diagram: Atmosphere (CAM), Land (CLM), Ocean (POP), Sea Ice (CICE) and River-Runoff coupled through the COUPLER, with DART taking in atm obs and ocn obs.
20
Next steps in coupling with data assimilation
- All data assimilation is currently done via files; DART and CESM are separate executables
  - CESM must be stopped and restarted every data assimilation interval (6 hours), an extremely inefficient and expensive use of system resources
  - Limited to only 1 coupler, but multiple component instances
- New project: enable DART to be a component of the CESM coupled system
  - Each CESM component will have pause/resume capability and will be able to start up from a restart file during the model run
  - Extend coupler capability to permit multiple couplers within a single executable
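The pause/resume cycle can be sketched as a loop that keeps every ensemble instance alive between assimilation steps. This is only an illustration in Python: `ToyComponent`, `nudge_toward` and `cycle` are hypothetical names, the "model" is a trivial drift, the update is a simple nudge rather than anything DART does, and state is handed over in memory for brevity where the real design refers to restart files.

```python
class ToyComponent:
    """Hypothetical stand-in for one model instance; not a CESM class."""

    def __init__(self, state):
        self.state = state
        self.hours = 0
        self.paused = False

    def run(self, hours):
        if self.paused:
            raise RuntimeError("resume() must be called before run()")
        for _ in range(hours):
            self.state += 1.0  # toy dynamics: drift by one unit per hour
            self.hours += 1

    def pause(self):
        """Stop advancing and expose the current state."""
        self.paused = True
        return self.state

    def resume(self, new_state):
        """Continue from an updated state without restarting the executable."""
        self.state = new_state
        self.paused = False


def nudge_toward(states, observation, gain=0.5):
    """Placeholder for the assimilation update (assumption, not a real filter)."""
    return [s + gain * (observation - s) for s in states]


def cycle(instances, observations, interval_hours=6):
    """Advance all instances, then assimilate, once per observation."""
    for obs in observations:
        for inst in instances:
            inst.run(interval_hours)
        priors = [inst.pause() for inst in instances]
        posteriors = nudge_toward(priors, obs)
        for inst, post in zip(instances, posteriors):
            inst.resume(post)


ensemble = [ToyComponent(0.0), ToyComponent(2.0)]
cycle(ensemble, observations=[10.0, 10.0])
print([inst.state for inst in ensemble], [inst.hours for inst in ensemble])
```

The point of the sketch is the control flow: the instances are created once and only paused at each 6-hour interval, instead of the whole executable being stopped and restarted.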
21
ESMF Collaboration: NUOPC and Online Regridding
22
Two ESMF/CESM Collaborations
- Online regridding
  - ESMF is collaborating with CSEG, and this will be brought into CIME
- Introduction of “ESMF/NUOPC” capability in CESM/CIME
  - Vertenstein is co-PI on the ESPC proposal “An Integration and Evaluation Framework for ESPC Coupled Models”
  - Import the standalone version of HYCOM into the validated NUOPC version of the CESM coupled system and test using the CORE forcing
  - Run reference CESM configurations (500 years present-day climate + IPCC scenarios) for both POP and HYCOM
23
NUOPC and CESM
- What is NUOPC?
  - NUOPC layer “generic components” are templates that encode rules for drivers, models, mediators (custom coupling code) and connectors (for data transfer)
- Goal for driver
  - Maintain a single CESM driver, but restructure it to accommodate both MCT and NUOPC component interfaces
- Goal for components
  - Implement ESMF-based NUOPC components as a CESM option, but build on existing ESMF component interfaces
24
Redesign of cpl7 as first step
- Why?
  - Driver code was one large routine (6K LOC), hard-coded to contain MCT data types
  - Difficult to understand and modify, and difficult to extend with an alternative coupling architecture
- What was the redesign?
  - Introduced a new abstraction layer between driver and components; the driver has no reference to MCT or ESMF types
  - Much easier to incorporate new ESMF/NUOPC driver components
  - Permits backwards compatibility and memory sharing between MCT and ESMF data structures
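The idea of a driver that never touches MCT or ESMF types can be illustrated with a small interface-plus-adapters sketch. This is Python for brevity, not the Fortran of cpl7, and every class, field name and coefficient below is hypothetical:

```python
from abc import ABC, abstractmethod


class ComponentInterface(ABC):
    """What the driver sees: plain dicts in and out, no backend-specific types."""

    @abstractmethod
    def init(self): ...

    @abstractmethod
    def run(self, imports): ...

    @abstractmethod
    def final(self): ...


class MctStyleComponent(ComponentInterface):
    """Toy adapter storing data as parallel name/value lists (assumption)."""

    def init(self):
        self.names, self.values = [], []

    def run(self, imports):
        self.names = sorted(imports)
        self.values = [imports[n] for n in self.names]
        sst = self.values[self.names.index("sst")]
        return {"lhflx": 0.5 * sst}  # made-up physics for illustration

    def final(self):
        self.names, self.values = [], []


class EsmfStyleComponent(ComponentInterface):
    """Toy adapter storing data as a name -> field mapping (assumption)."""

    def init(self):
        self.fields = {}

    def run(self, imports):
        self.fields = dict(imports)
        return {"lhflx": 0.5 * self.fields["sst"]}

    def final(self):
        self.fields = {}


def drive(components, imports):
    """Driver loop written only against ComponentInterface."""
    exports = {}
    for name, comp in components.items():
        comp.init()
        exports[name] = comp.run(imports)
        comp.final()
    return exports


exports = drive(
    {"atm_mct": MctStyleComponent(), "atm_esmf": EsmfStyleComponent()},
    {"sst": 290.0, "ifrac": 0.1},
)
print(exports)
```

Because `drive` depends only on the interface, adding another backend means writing one more adapter rather than editing the driver.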
25
Diagram: original vs. redesigned driver
- Original: MCT-based DRIVER; MCT Hub <-> CAM Exchange; MCT CAM and ESMF CAM
- Redesigned: component-type based DRIVER; MCT Hub <-> CAM Exchange or ESMF Hub <-> CAM Exchange; Component_type <-> MCT and Component_type <-> ESMF; MCT CAM and ESMF CAM
26
Current and Future Work
- Current status
  - NUOPC implementation complete for all CESM components AND HYCOM
  - Modified the data exchange (between coupler/mediator and components) to use NUOPC Fields
- Future work
  - Clean up and prepare NUOPC version for wider use
  - Merge code back to CIME and reconcile with other developmental changes as well as the MCT implementation
  - Performance evaluation
27
ESMF online regridding
- As more regionally refined grids are introduced (e.g. MPAS, SE), there is a need to
  - minimize the number of mapping files that are needed
  - simplify and streamline the workflow for generating new user grid configurations
- Will be a requirement for run-time adaptive mesh refinement
- ESMF is the only tool that currently delivers this capability
- Status: prototype implementation has been done and is being updated for the newest coupler
28
New Infrastructure Capabilities (1)
Statistical Ensemble Test
29
CESM Ensemble Consistency Test
- Motivation: ensure that changes during the CESM development cycle (code modifications, compiler changes, new machine architectures) do not adversely affect the code
- Question: is the new data statistically distinguishable from the old?
- Old method: compare multiple long simulations; time-consuming and subjective
- New method: evaluate new data in the context of an ensemble of CESM runs
30
CESM Ensemble Consistency Test
- Part 1: Create “truth” ensemble of 1-year CESM runs (151)
  - Use “accepted” machine and “accepted” software stack
  - Members differ by O(10^-14) perturbations in initial atmospheric temperature
- Part 2: Create “accepted” statistical distribution
  - Statistics based on ensemble
  - Summary file included with CESM release
- Part 3: Evaluate “new” runs (new platform, code base, …)
  - Create 3 “new” runs (initial conditions randomly selected from the ensemble)
  - Principal Component Analysis (PCA)-based testing
  - Provides false positive rates

- Part 1: Done for CESM release
  - 1-deg atmosphere model (F-case): 120 variables
  - Looking at annual averages
- Part 2: Statistics summary file
  - Included in CESM 1.3.x release; parallel Python code, ~20 min
- Part 3: Python tool included in release that a user can run
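The three parts can be sketched end to end on synthetic data. This is a deliberately simplified stand-in, not the released tool: it swaps the PCA-based test for a per-variable z-score check, and the variable names, base values, thresholds and the "2 of 3 runs must pass" rule are all assumptions made for illustration.

```python
import statistics

# Made-up variables: name -> (base value, ensemble spread).
BASES = {"TS": (288.0, 1e-3), "PRECT": (3.0, 1e-4)}


def synthetic_run(member, shift=0.0):
    """Deterministic fake annual means; `shift` is an offset in units of spread."""
    k = (67 * member) % 151 - 75  # integer in [-75, 75], distinct for each member
    return {v: base + spread * (k / 75 + shift) for v, (base, spread) in BASES.items()}


def summarize(ensemble):
    """Part 2: per-variable mean and standard deviation of the ensemble."""
    summary = {}
    for var in ensemble[0]:
        values = [run[var] for run in ensemble]
        summary[var] = (statistics.mean(values), statistics.stdev(values))
    return summary


def failing_variables(run, summary, z_max=3.0):
    """Variables whose value lies more than z_max standard deviations from the mean."""
    return [
        var
        for var, (mean, std) in summary.items()
        if abs(run[var] - mean) > z_max * std
    ]


def consistent(new_runs, summary, max_failed_vars=0, min_passing_runs=2):
    """Part 3: overall verdict from a small set of new runs."""
    passing = sum(
        len(failing_variables(run, summary)) <= max_failed_vars for run in new_runs
    )
    return passing >= min_passing_runs


ensemble = [synthetic_run(m) for m in range(151)]       # Part 1: "truth" ensemble
summary = summarize(ensemble)                           # Part 2: summary statistics
good_runs = [synthetic_run(m) for m in (3, 50, 120)]    # drawn from the ensemble
bad_runs = [synthetic_run(m, shift=10.0) for m in (3, 50, 120)]  # biased runs
print(consistent(good_runs, summary), consistent(bad_runs, summary))
```

A PCA-based test additionally accounts for correlations between variables, which a per-variable check like this one ignores.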
31
CESM Ensemble Consistency Test
Many uses:
- Port verification (new CESM-supported architectures, heterogeneous computing platforms)
- Sanity check on climate similarity for new CESM system snapshots (caught a recent bug that would not have been caught before!)
- Exploration of new algorithms, solvers, compiler options, …
- Evaluation of data compression on CESM data
- Heterogeneous computing (GPU/CPU)
32
New Infrastructure Capabilities (2)
New end-to-end workflow: parallel post-processing as part of the model run
34
New Infrastructure Capabilities (3)
New Parallel IO Library (PIO2)
35
New Parallel IO Library: PIO2
- PIO2 rewrite has
  - new C language API added (F90 API retained)
  - decomposition option that improves scalability
  - data aggregation, which improves performance
  - new testing framework
- Provides higher performance through data aggregation
- New subset rearranger provides higher scalability
- Provides more options for IO performance tuning, and also tools to make tuning easier
36
Subset rearranger gives better scaling
Figure: example decomposition of CLM data on 2048 tasks, rearranged to 32 IO tasks.
- Box rearranger gives optimal data layout
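The trade-off between the two rearrangers can be illustrated by counting how many compute tasks send data to each IO task. This is a toy Python model under stated assumptions (a tiny cyclic decomposition, and simplified ownership rules written for this example), not the PIO2 implementation or its API:

```python
def box_owner(global_index, n_global, n_io):
    """Box-style rule: each IO task owns one contiguous block of the global array."""
    block = -(-n_global // n_io)  # ceiling division
    return global_index // block


def subset_owner(compute_task, n_compute, n_io):
    """Subset-style rule: each IO task serves a fixed group of compute tasks."""
    group = -(-n_compute // n_io)  # ceiling division
    return compute_task // group


def count_senders(decomp, n_io, owner_of):
    """Number of distinct compute tasks that send data to each IO task."""
    senders = [set() for _ in range(n_io)]
    for task, indices in enumerate(decomp):
        for idx in indices:
            senders[owner_of(task, idx)].add(task)
    return [len(s) for s in senders]


# Assumed toy setup: 8 compute tasks, 2 IO tasks, 16 elements dealt out cyclically,
# so compute task t holds global indices t and t + 8.
N_COMPUTE, N_IO, N_GLOBAL = 8, 2, 16
decomp = [[t, t + N_COMPUTE] for t in range(N_COMPUTE)]

box = count_senders(decomp, N_IO, lambda task, idx: box_owner(idx, N_GLOBAL, N_IO))
subset = count_senders(decomp, N_IO, lambda task, idx: subset_owner(task, N_COMPUTE, N_IO))
print("box:", box, "subset:", subset)
```

In this toy case every compute task talks to every IO task under the box rule, which buys a contiguous layout on the IO side, while the subset rule limits each IO task to its own group of senders.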
37
Acknowledgements (plus many more)
- ESMF Collaboration: Cecelia DeLuca, Fei Liu, Gerhard Theurich, Peggy Li, Robert Oehmke, Mathew Rothstein, Tony Craig, Jim Edwards
- Statistical Ensemble Test: Allison Baker, Dorrit Hammerling, Mike Levy, Doug Nychka, John Dennis, Joe Tribbia, Dave Williamson, Jim Edwards
- Parallel IO: Jim Edwards, John Dennis, Jayesh Krishna
- Data Assimilation: Alicia Karspeck, Nancy Collins, Tim Hoar, Kevin Raeder, Jeff Anderson, Tony Craig
- Parallel Workflow: Alice Bertini, Sheri Mickelson, Jay Schollenberger
- CSEG: Ben Andre, David Bailey, Alice Bertini, Cheryl Craig, Tony Craig, Brian Eaton, Jim Edwards, Erik Kluzek, Mike Levy, Bill Sacks, Sean Santos, Jay Schollenberger