Earth System Modeling Infrastructure
Cecelia DeLuca/ESMF-NCAR
March 31-April 1, 2009
CHyMP Meeting
Outline
- Elements of interoperability platforms
- Integrating across elements
- Summary
Elements of interoperability platforms
1. Tight coupling tools and interfaces
  - hierarchical and peer component relationships
  - frequent, high volume transfers on high performance computers
2. Loose coupling tools and interfaces
  - generally peer-peer component relationships
  - lower volume and infrequent transfers on desktop and distributed systems
3. Science gateways
  - browse, search, and distribution of model components, models, and datasets
  - visualization and analysis services
  - workspaces and management tools for collaboration
4. Metadata conventions and ontologies
  - ideally, with automated production of metadata from models
5. Governance
  - coordinated and controlled evolution of systems
Tight coupling tools and interfaces
Examples:
- Earth System Modeling Framework (ESMF) - NASA, NOAA, Department of Defense, community weather and climate models, U.S. operational numerical weather prediction centers (HPC focus)
- Flexible Modeling System (FMS) - NOAA precursor to ESMF, still used at the Geophysical Fluid Dynamics Laboratory for climate modeling
- Space Weather Modeling Framework (SWMF) - NASA-funded, used at the University of Michigan for space weather prediction
How coupling tools work:
- Users wrap their native data in framework data structures
- Users adopt standard calling interfaces for a set of methods that enable data exchange between components
- Development toolkits help users with routine functions (regridding, time management, etc.)
ESMF: Standard interfaces
- Three ESMF component methods: Initialize, Run, and Finalize (I/R/F)
- Each can have multiple phases
- Users register their native I/R/F methods with an ESMF Component
- Small set of arguments:
    call ESMF_GridCompRun (myComp, importState, exportState, clock, phase, blockingFlag, rc)
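The registration pattern above can be sketched in a few lines. This is a toy illustration in Python, not ESMF's API: the `Component` class, its `register` and `call` methods, and the `ocean_*` routines are all hypothetical names chosen for the example. It only shows the idea that user code is attached to a fixed set of entry points (Initialize, Run, Finalize), each optionally split into phases, and is then invoked through one uniform argument list.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for import/export states and a clock (not ESMF types).
State = Dict[str, list]
Method = Callable[[State, State, dict], None]


class Component:
    """Toy component holding user-registered Initialize/Run/Finalize routines."""

    METHODS = ("initialize", "run", "finalize")

    def __init__(self, name: str) -> None:
        self.name = name
        self._entry_points: Dict[Tuple[str, int], Method] = {}

    def register(self, method: str, user_routine: Method, phase: int = 1) -> None:
        """Attach a native user routine to one method and phase."""
        if method not in self.METHODS:
            raise ValueError(f"unknown method: {method}")
        self._entry_points[(method, phase)] = user_routine

    def call(self, method: str, import_state: State, export_state: State,
             clock: dict, phase: int = 1) -> int:
        """Invoke a registered routine; return 0 on success, 1 if none is registered."""
        routine = self._entry_points.get((method, phase))
        if routine is None:
            return 1
        routine(import_state, export_state, clock)
        return 0


# Native user routines, all sharing the same small argument list.
def ocean_init(imp: State, exp: State, clock: dict) -> None:
    exp["sst"] = [280.0, 281.0]


def ocean_run(imp: State, exp: State, clock: dict) -> None:
    exp["sst"] = [t + 0.5 for t in exp["sst"]]  # placeholder "physics"
    clock["step"] += 1


def ocean_final(imp: State, exp: State, clock: dict) -> None:
    exp.clear()


ocean = Component("ocean")
ocean.register("initialize", ocean_init)
ocean.register("run", ocean_run)
ocean.register("finalize", ocean_final)

imp, exp, clock = {}, {}, {"step": 0}
rc_init = ocean.call("initialize", imp, exp, clock)
rc_run = ocean.call("run", imp, exp, clock)
```

Because every component exposes the same entry points, a driver can step any mix of components without knowing their internals, which is what makes the small argument list useful.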
ESMF: Distributed data representation
1. Representation in index space (Arrays)
  - Simple, flexible multi-dimensional array structure
  - Regridding via sparse matrix multiply with user-supplied interpolation weights
  - Scalable to 10K+ processors - no global information held locally
2. Representation in physical space (Fields)
  - Built on Arrays + some form of Grid
  - Grids are: logically rectangular, unstructured mesh, or observational data streams
  - Regridding via parallel on-line interpolation weight generation, bilinear or higher order options
  - Intrinsically holds significant amounts of metadata - dynamic, usable for multiple purposes, limited annotation required
Figure: Supported Array distributions
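Regridding as a sparse matrix multiply can be illustrated with a short serial sketch. The Python below is an assumption-laden toy, not ESMF code: the `regrid` function and the `(destination, source, weight)` triple layout are hypothetical, and the weights are hand-written for 1-D linear interpolation rather than produced by any weight generator. In a distributed setting each process would hold only the triples for its own destination points, which is one way to avoid keeping global information locally.

```python
from typing import List, Tuple

# One nonzero entry of the sparse interpolation matrix:
# (destination index, source index, weight). Layout is an assumption.
Weight = Tuple[int, int, float]


def regrid(src: List[float], weights: List[Weight], n_dst: int) -> List[float]:
    """Apply precomputed interpolation weights: dst = W * src, with W stored sparsely."""
    dst = [0.0] * n_dst
    for i_dst, i_src, w in weights:
        dst[i_dst] += w * src[i_src]
    return dst


# Hand-written weights for 1-D linear interpolation from source points at
# x = 0, 1, 2 to destination points at x = 0, 0.5, 1, 1.5, 2.
weights = [
    (0, 0, 1.0),
    (1, 0, 0.5), (1, 1, 0.5),
    (2, 1, 1.0),
    (3, 1, 0.5), (3, 2, 0.5),
    (4, 2, 1.0),
]

src_field = [10.0, 20.0, 40.0]
dst_field = regrid(src_field, weights, n_dst=5)
print(dst_field)  # [10.0, 15.0, 20.0, 30.0, 40.0]
```

Separating weight generation from weight application means the same multiply serves bilinear, higher order, or user-supplied weights without change.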
ESMF: Coupling options
- Generally single executable for simpler deployment
- Push mode of data communication is very efficient
- Coupling communications can be set up and called in a coupler, or called directly from within components (for I/O, data assimilation)
- Hierarchical components for organization into sub-processes
- Recursive components for nesting higher resolution regions
- Coupling across C/C++ and Fortran
- Ensemble management
Figure: ESMF-based hierarchical structure of GEOS-5 atmospheric GCM
ESMF: Performance portability
- ESMF is highly performance portable, with low (<5%) overhead
- Regression tests run nightly on 30+ platform/compiler combinations
- Newer ports include native Windows, Solaris
- Using TeraGrid Build and Test Service to simplify regression testing
- Performance at the petascale…
Figure: ASMM Run-Time Comparison (run time in msec). Scaling of the ESMF sparse matrix multiply, used in regridding transformations, out to 16K processors (ESMF v3.1.0rp2). Tested on ORNL XT4; -N1 means 1 core per node. Plot from Peggy Li, NASA/JPL.
ESMF: Higher order interpolation techniques in CCSM
- ESMF higher order interpolation weights were used to map from a 2-degree Community Atmosphere Model (CAM) grid to a POP ocean grid (384x320, irregularly spaced)
- 33% reduction in noise globally, in a quantity critical for ocean circulation, compared to the previous bilinear interpolation approach
- ESMF weights are now the CCSM default
Figure: Interpolation noise in the derivative of the zonal wind stress, plotted against grid index in the latitudinal direction. Black = bilinear; red = higher order, ESMF v3.1.1; green = higher order, ESMF v4.0.0.
ESMF: Model map
Figure: Ovals show ESMF components and models that are at the working prototype level or beyond.
Legend: NOAA; Department of Defense; University; NASA; Department of Energy; National Science Foundation; ESMF coupling complete; ESMF coupling in progress; Component (thin lines); Model (thick lines)
Figure labels: POP UCLA AGCM WRF NCOM HYCOM CICE pWASH123 ADCIRC ROMS CICE ice POP Ocean CCSM4 COAMPS SWAN NMM-B Atm Phys NMM-B Atm Dynamics NEMS NMM History SWMF MITgcm Atm MITgcm Ocean MITgcm GFS Atm Phys GFS Atm Dynamics GFS GFS I/O Land Info System FV Cub Sph Dycore GEOS-5 GWD GEOS-5 FV Dycore GEOS-5 Atm Dynamics GEOS-5 GSI MOM4 GEOS-5 Moist Proc GEOS-5 Turbulence GEOS-5 LW Rad GEOS-5 Solar Rad GEOS-5 Radiation GEOS-5 Aeros Chem GOCART Strat Chem Param Chem GEOS-5 Atm Chem GEOS-5 Ocean Biogeo GEOS-5 Salt Water Poseidon GEOS-5 Data Ocean GEOS-5 OGCM GEOS-5 Topology GEOS-5 Land Ice GEOS-5 Lake GEOS-5 Veg Dyn GEOS-5 Catchment GEOS-5 Land GEOS-5 Surface GEOS-5 Atm Physics GEOS-5 History Tracer Advection HAF GAIM CLM Ice sheet Dead ocean Dead ice Data ocean Data ice Dead land Data land Stub ocean Stub ice Stub land FIM Dead atm Data atm
Loose coupling tools and interfaces
Examples:
- OpenMI
- Web service approaches
Coupling options:
- Generally multiple executable
- Pull mode of data communication simple but not efficient (ask for a data point based on coordinates)
- Generally peer-peer component relationships
- Coupling across multiple computer languages (Python, Java, C++, etc.)
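The efficiency difference between pull mode and the push mode used in tight coupling can be made concrete by counting exchanges. The sketch below is a hypothetical Python illustration; `PushCoupler`, `PullProvider`, and `get_value` are invented names and do not correspond to the OpenMI or ESMF interfaces. It assumes a tiny four-point field keyed by (lon, lat) coordinates.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (lon, lat); an assumed coordinate convention


class PushCoupler:
    """Push mode: the producer sends a whole field in one transfer."""

    def __init__(self) -> None:
        self.received: Dict[Point, float] = {}
        self.transfers = 0

    def push(self, field: Dict[Point, float]) -> None:
        self.received.update(field)
        self.transfers += 1


class PullProvider:
    """Pull mode: the consumer asks for one value per coordinate."""

    def __init__(self, field: Dict[Point, float]) -> None:
        self._field = field
        self.requests = 0

    def get_value(self, lon: float, lat: float) -> float:
        self.requests += 1
        return self._field[(lon, lat)]


field = {(0.0, 0.0): 1.0, (1.0, 0.0): 2.0, (0.0, 1.0): 3.0, (1.0, 1.0): 4.0}

# Push: one bulk exchange moves every point.
coupler = PushCoupler()
coupler.push(field)

# Pull: one request per point, so exchanges grow with the number of points.
provider = PullProvider(field)
pulled = {pt: provider.get_value(*pt) for pt in field}

print(coupler.transfers, provider.requests)  # 1 4
```

Both paths deliver the same data; the pull interface is simpler for the consumer, but its per-point requests are why it suits lower volume, infrequent transfers.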
Science gateways - access centers
Examples:
- Earth System Grid (ESG) - DOE, NCAR, NOAA support, used to distribute Intergovernmental Panel on Climate Change data and for climate research
- Hydrologic Information System (HIS) - NSF funded, used to enhance access to data for hydrologic analysis
- Object Modeling System (OMS) - USDA effort, used for agricultural modeling and analysis
Metadata conventions and ontologies
Examples:
- Climate and Forecast (CF) conventions - spatial and temporal properties of fields used in weather and climate
- METAFOR Common Information Model (CIM) - large EU-funded project, climate model component structure and properties (including technical and scientific properties)
- WaterML - schema for hydrologic data developed by the Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI)
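As a small illustration of producing convention-style metadata automatically from a model, the sketch below derives CF-style variable attributes from a field description the model already holds. The `ModelField` class and `cf_variable_attributes` function are hypothetical names for this example; only the attribute names `standard_name`, `units`, and `long_name` come from the CF conventions, and the sketch makes no attempt to validate against the CF standard name table.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ModelField:
    """Hypothetical in-model field descriptor (not an ESMF or CF library type)."""
    short_name: str
    standard_name: str
    units: str
    description: str


def cf_variable_attributes(field: ModelField) -> Dict[str, str]:
    """Build CF-style variable attributes from the model's own field description."""
    return {
        "standard_name": field.standard_name,
        "units": field.units,
        "long_name": field.description,
    }


sst = ModelField(
    short_name="sst",
    standard_name="sea_surface_temperature",
    units="K",
    description="Sea surface temperature",
)
attrs = cf_variable_attributes(sst)
print(attrs)
```

Generating the attributes from the same object the model computes with keeps the metadata from drifting away from the code that produced the data.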
Governance
- Pervasive issue in community modeling
- Divergent effects of:
  - Multiple institutions
  - Geographic dispersion
  - Multiple domains of interest (working groups)
- Must be balanced by a strong integration body. Strategies:
  - Meets frequently enough to affect routine development (quarterly)
  - Meets virtually to get sufficient representation
  - Includes user and other stakeholder representatives
  - Authorized to prioritize and set development schedule
  - Supported by web-based management tools
Integrating across interoperability elements
Examples from the Curator project (NSF and NASA):
- Automated output of CF and CIM XML schema from ESMF (tight coupling + ontology)
- Ingest of ESMF-generated schema into ESG, propagation into tools for search, browse, inter-comparison and distribution of model components and models (tight coupling + ontology + science gateway)
- Implementation of dataset “trackback” in ESG that connects datasets with detailed information about the models used to create the data (tight coupling + ontology + science gateway)
- Implementation of personal and group workspaces in ESG (science gateway + governance)
Integrating across interoperability elements (cont.)
- Translation of ESMF interfaces into web services to enable invocation of ESMF applications from a science gateway, and to enable data and metadata from the run to be stored back to the gateway (tight coupling + loose coupling + science gateway + ontology, new TeraGrid funding)
- Issue of switch from push to pull data interactions…
Figure labels: ESMF interface; Web service interface; Tightly coupled HPC components; Loosely coupled components
Screenshot: Component trackback
Screenshot: Faceted search
Summary
- Cross-domain interoperability platforms have multiple elements
- Many of these elements already exist
- Integration activities (such as Earth System Curator) are the next focus
Image courtesy of Rocky Dunlap, Georgia Institute of Technology