Published by Abbigail Peyser. Modified over 10 years ago.
1
NCAR Cyberinfrastructure for Earth System Modeling
Don Middleton, NCAR Scientific Computing Division
APAN eScience Workshop, Honolulu, January 28, 2004
2
Cyberinfrastructure for Earth System Modeling
• Supercomputers
• High-bandwidth networks
• Models
• Data centers and Grids
• Collaboratories
• Analysis and Visualization
3
“Atkins Report”
• “A new age has dawned…”

“The Panel’s overarching recommendation is that the National Science Foundation should establish and lead a large-scale, interagency, and internationally coordinated Advanced Cyberinfrastructure Program (ACP) to create, deploy, and apply cyberinfrastructure in ways that radically empower all scientific and engineering research and allied education. We estimate that sustained new NSF funding of $1 billion per year is needed to achieve critical mass and to leverage the coordinated co-investment from other federal agencies, universities, industry, and international sources necessary to empower a revolution. The cost of not acting quickly or at a subcritical level could be high, both in opportunities lost and in increased fragmentation and balkanization of the research.”
Atkins Report, Executive Summary
4
Characteristics of Infrastructure (from Kim Mish workshop presentation)
• Essential
  – So important that it becomes ubiquitous
• Reliable
  – Example: the built environment of the Roman Empire
• Expensive
  – Nothing succeeds like excess (e.g. Interstate system)
  – Inherently one-off (often, few economies of scale)
• Clear factorization between research and practice
  – Generally deploy what provably works
5
A Global Coupled Climate Model
6
Climate Model Data Production
• T42 CCSM (current, 280km)
  – 7.5 GB/yr, 100 years -> 0.75 TB
• T85 CCSM (140km)
  – 29 GB/yr, 100 years -> 2.9 TB
• T170 CCSM (70km)
  – 110 GB/yr, 100 years -> 11 TB
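The totals on this slide follow from multiplying the yearly output by the length of the run. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB) and using the per-year figures quoted on the slide; the function and dictionary names are illustrative only:

```python
# Yearly output per CCSM resolution in GB/yr, as quoted on the slide.
GB_PER_YEAR = {"T42": 7.5, "T85": 29.0, "T170": 110.0}


def run_volume_tb(gb_per_year: float, years: int = 100) -> float:
    """Total output of one run in TB, assuming 1 TB = 1000 GB."""
    return gb_per_year * years / 1000.0


volumes = {res: run_volume_tb(rate) for res, rate in GB_PER_YEAR.items()}
for res, tb in volumes.items():
    print(f"{res}: {tb:.2f} TB per 100-year run")
```

Each halving of the grid spacing roughly quadruples the output, which is what one would expect when the number of horizontal grid points scales with the square of the resolution.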
7
Capacity-related Improvements
Increased turnaround, model development, ensemble of runs
  Increase by a factor of 10, linear data
• Current T42 CCSM
  – 7.5 GB/yr, 100 years -> 0.75 TB * 10 = 7.5 TB
8
CCM at T170 Resolution
9
Capability-related Improvements
Spatial Resolution: T42 -> T85 -> T170
  Increase by factor of ~10-20, linear data
Temporal Resolution: Study diurnal cycle, 3 hour data
  Increase by factor of ~4, linear data
CCM3 at T170 (70km)
10
Capability-related Improvements
Quality: Improved boundary layer, clouds, convection, ocean physics, land model, river runoff, sea ice
  Increase by another factor of 2-3, data flat
Scope: Atmospheric chemistry (sulfates, ozone…), biogeochemistry (carbon cycle, ecosystem dynamics), Middle Atmosphere Model…
  Increase by another factor of 10+, linear data
11
Model Improvement Wishlist
Grand Total: Increase compute by a Factor O(1000-10000)
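The grand total can be checked by multiplying the individual factors from the preceding slides. This is only a sketch under assumptions the slides do not state: that the factors combine multiplicatively, that "10+" for scope is taken as 10, and that the lower end of the total excludes the capacity factor while the upper end includes it.

```python
from math import prod

# (low, high) compute multipliers quoted on the preceding slides.
CAPABILITY = {
    "spatial resolution": (10, 20),
    "temporal resolution": (4, 4),
    "quality": (2, 3),
    "scope": (10, 10),  # the slide says "10+"; 10 is used as a floor
}
CAPACITY = (10, 10)  # turnaround, model development, ensembles of runs


def combined(factors):
    """Multiply the low ends and the high ends independently."""
    lows, highs = zip(*factors)
    return prod(lows), prod(highs)


capability_only = combined(CAPABILITY.values())
with_capacity = combined(list(CAPABILITY.values()) + [CAPACITY])
print(capability_only)  # (800, 2400)
print(with_capacity)    # (8000, 24000)
```

Under these assumptions the capability factors alone land near 10^3 and adding the capacity factor brings the total to about 10^4, consistent with the O(1000-10000) range on the slide.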
12
Advances at the Earth Simulator
ESC Climate Model at T1279 (approx. 10km)
13
We Will Examine Practically Every Aspect of the Earth System from Space in This Decade
Longer-term Missions - Observation of Key Earth System Interactions: Terra, Aura, Aqua, Landsat 7
Exploratory - Explore Specific Earth System Processes and Parameters and Demonstrate Technologies: GRACE, PICASSO, Cloudsat, QuikScat, EO-1, ICEsat, Jason-1, SRTM, VCL, Triana
Courtesy of Tim Killeen, NCAR
14
The Earth System Grid
• U.S. DOE SciDAC funded R&D effort - a “Collaboratory Pilot Project”
• Build an “Earth System Grid” that enables management, discovery, distributed access, processing, & analysis of distributed terascale climate research data
• Build upon Globus Toolkit and DataGrid technologies and deploy
• Potential broad application to other areas
http://www.earthsystemgrid.org
15
ESG Team
• ANL
  – Ian Foster (PI)
  – Veronika Nefedova
  – (John Bresnahan)
  – (Bill Allcock)
• LBNL
  – Arie Shoshani
  – Alex Sim
• ORNL
  – David Bernholdt
  – Kasidit Chanchio
  – Line Pouchard
• LLNL/PCMDI
  – Bob Drach
  – Dean Williams (PI)
• USC/ISI
  – Anne Chervenak
  – Carl Kesselman
  – (Laura Pearlman)
• NCAR
  – David Brown
  – Luca Cinquini
  – Peter Fox
  – Jose Garcia
  – Don Middleton (PI)
  – Gary Strand
17
ESG Scenario
• End 2002: 1.2 million files comprising ~75 TB of data at NCAR, ORNL, LANL, NERSC, and PCMDI
• End 2007: As much as 3 PB (3,000 TB) of data (!)
• Current practice is already broken; the future will be even worse if something isn’t done…
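The two figures in this scenario imply a steep growth rate. A small sketch of the arithmetic, assuming steady compound growth between the end of 2002 and the end of 2007 (the slide gives only the two endpoints, so the yearly rate is an inference rather than a stated number):

```python
start_tb, end_tb = 75.0, 3000.0  # end of 2002 and end of 2007, from the slide
years = 2007 - 2002

total_growth = end_tb / start_tb
# Constant yearly multiplier that turns 75 TB into 3,000 TB in five years.
annual_growth = total_growth ** (1.0 / years)

print(f"total growth: {total_growth:.0f}x")
print(f"implied annual growth: {annual_growth:.2f}x per year")
```

In other words, the holdings would have to roughly double every year to reach the 3 PB figure.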
18
ESG: Challenges
• Enabling the simulation and data management team
• Enabling the core research community in analyzing and visualizing results
• Enabling broad multidisciplinary communities to access simulation results

We need integrated scientific work environments that enable smooth WORKFLOW for knowledge development: computation, collaboration & collaboratories, data management, access, distribution, analysis, and visualization.
19
ESG: Strategies
• Harness a federation of sites, web portals
  – Globus Toolkit -> The Earth System Grid -> The UltraDataGrid
• Move data a minimal amount, keep it close to computational point of origin when possible
  – Data access protocols, distributed analysis
• When we must move data, do it fast and with a minimum amount of human intervention
  – Storage Resource Management, fast networks
• Keep track of what we have, particularly what’s on deep storage
  – Metadata and Replica Catalogs
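The "keep track of what we have" and "move data a minimal amount" strategies can be illustrated with a toy replica catalog. This is a hypothetical in-memory sketch, not the Globus replica services ESG actually used: the class names, the site labels, the example URLs, and the selection rule (local copy first, then disk before deep storage) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    site: str                      # hypothetical site label, e.g. "NCAR"
    url: str                       # hypothetical physical location
    on_deep_storage: bool = False  # True if the copy must be staged from tape


@dataclass
class ReplicaCatalog:
    """Toy catalog mapping a logical file name to its known physical copies."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name: str, replica: Replica) -> None:
        self.entries.setdefault(logical_name, []).append(replica)

    def best_replica(self, logical_name: str, local_site: str) -> Replica:
        """Prefer a copy at the local site, then disk copies over tape copies."""
        replicas = self.entries[logical_name]
        return min(
            replicas,
            key=lambda r: (r.site != local_site, r.on_deep_storage),
        )


catalog = ReplicaCatalog()
name = "ccsm/run1/atm.nc"  # hypothetical logical file name
catalog.register(name, Replica("NCAR", "mss://ncar.example/run1/atm.nc", True))
catalog.register(name, Replica("ORNL", "gsiftp://ornl.example/run1/atm.nc"))

remote_choice = catalog.best_replica(name, local_site="LBNL")
local_choice = catalog.best_replica(name, local_site="NCAR")
print(remote_choice.site, local_choice.site)
```

A user at a third site is pointed at the disk copy, while a user at NCAR keeps the data local even though it has to be staged from deep storage; a real system would weigh staging time against network cost rather than apply a fixed rule.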
21
HRM Storage/Data Management
Tools for reliable staging, transport, and replication
[Diagram: a Client (Selection, Control, Monitoring) with an HRM, connected to two Servers, each with an HRM in front of a Tera/Peta-scale Archive]
22
OPeNDAP
An Open Source Project for a Network Data Access Protocol
(originally DODS, the Distributed Oceanographic Data System)
23
OPeNDAP-g
- Transparency
- Performance
- Security
- Authorization
- (Processing)
[Diagram: Distributed Data Access Services, three access paths]
Typical Application: Application, netCDF lib, Data (local)
OPeNDAP via http: Application, OPeNDAP Client, OPeNDAP Server, Data (remote)
OPeNDAP via Grid: Distributed Application, ESG client, ESG Server (ESG + DODS), Big Data (remote)
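For the "OPeNDAP via http" path, a client asks the server for a description or a subset of a remote dataset by appending a response suffix and a constraint expression to the dataset URL. The sketch below builds such URLs following the DAP2 conventions as I understand them (suffixes `dds`, `das`, `dods`, `ascii`; index ranges written `[start:stride:stop]`). The server address, file path, and variable name are hypothetical, and nothing here is specific to the grid-enabled OPeNDAP-g variant.

```python
def hyperslab(start: int, stop: int, stride: int = 1) -> str:
    """One DAP2 index range, written [start:stride:stop] with an inclusive stop."""
    return f"[{start}:{stride}:{stop}]"


def dap_url(dataset_url: str, response: str, variable: str = "", ranges=()) -> str:
    """Build a DAP2 request URL.

    response is a DAP2 suffix: "dds" (structure), "das" (attributes),
    "dods" (binary data) or "ascii" (text data).
    """
    url = f"{dataset_url}.{response}"
    if variable:
        # Constraint expression: variable name followed by one range per dimension.
        url += "?" + variable + "".join(hyperslab(*r) for r in ranges)
    return url


# Hypothetical server, file and variable names.
base = "http://server.example/dods/ccsm/atm.nc"
structure_url = dap_url(base, "dds")
subset_url = dap_url(base, "ascii", "TS", [(0, 11), (0, 63), (0, 127)])
print(structure_url)
print(subset_url)
```

Because the subset is selected on the server, only the requested slab crosses the network, which fits the strategy of moving as little data as possible.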
24
ESG: NcML Core Schema
• For XML encoding of metadata (and data) of any generic netCDF file
• Objects: netCDF, dimension, variable, attribute
• Beta version reference implementation as Java Library (http://www.scd.ucar.edu/vets/luca/netcdf/extract_metadata.htm)
[Schema diagram: netCDF (nc:netCDFType) containing nc:dimension, nc:variable (nc:VariableType), nc:attribute, nc:values]
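To make the idea of an XML encoding of netCDF metadata concrete, here is a minimal sketch that emits a small document using the four object types named on the slide. It is not the reference Java library and not a validated NcML document: the XML attribute names (`name`, `length`, `type`, `shape`, `value`), the omission of the `nc:` namespace, and the example file contents are all assumptions.

```python
import xml.etree.ElementTree as ET


def describe(dimensions, variables, attributes):
    """Encode netCDF-style metadata as a small XML tree."""
    root = ET.Element("netCDF")
    for name, length in dimensions.items():
        ET.SubElement(root, "dimension", name=name, length=str(length))
    for name, (dtype, shape) in variables.items():
        # shape lists the dimension names the variable is defined on
        ET.SubElement(root, "variable", name=name, type=dtype, shape=" ".join(shape))
    for name, value in attributes.items():
        ET.SubElement(root, "attribute", name=name, value=str(value))
    return root


# Hypothetical file contents: one surface-temperature variable on a T42-sized grid.
root = describe(
    dimensions={"time": 12, "lat": 64, "lon": 128},
    variables={"TS": ("float", ("time", "lat", "lon"))},
    attributes={"title": "example monthly output"},
)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

A catalog or portal can index a description like this without opening the underlying binary file, which is the main point of the encoding.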
25
[Diagram: ESG metadata object model]
Object: [1] id
Activity: [0,1] name, [0,1] description, [0,1] rights, [0,n] date type=, [0,n] note, [0,n] participant role=, [0,n] reference uri=
Investigation
Project: [0,n] topic type=, [0,1] funding
Ensemble
Campaign
Simulation: [0,n] simulationInput type=, [0,n] simulationHardware
Observation
Experiment
Analysis
Dataset: [0,1] type, [0,1] conventions, [0,n] date type=, [0,n] format type= uri=, [0,1] timeCoverage, [0,1] spaceCoverage
Person: [0,1] firstName, [0,1] lastName, [0,1] contact
Institution: [0,1] name, [0,1] type, [0,1] contact
Service: [0,1] name, [0,1] description
Relationship labels: isA, isPartOf, hasParent, hasChild, hasSibling, generatedBy, worksFor, participant role=, serviceId
LEGEND: Class, AbstractClass, inheritance, association
26
ESG Current Topology
[Diagram]
Sites: LBNL, ISI, LLNL, NCAR, ORNL, ANL
Components: ESG WEB PORTAL (Tomcat/Struts), OGSA-DAI with MySQL RDBMS, MyProxy, CAS, GRAM GATEKEEPER, gridFTP SERVER, LAS SERVER, HRM in front of MSS, HPSS, and DISK storage, RLI, LRC
Labeled interactions: query, authenticate, submit, execute, visualize, gridFTP, cross-update
27
Data -> Knowledge
Mass Storage System (1.3PB)
Petascale Knowledge Repository
Establish new paradigms for managing and accessing scientific data based on semantic organization.
28
Collaborations & Relationships
• CCSM Data Management Group
• The Globus Project
• Other SciDAC Projects: Climate, Security & Policy for Group Collaboration, Scientific Data Management ISIC, & High-performance DataGrid Toolkit
• OPeNDAP/DODS (multi-agency)
• NSF National Science Digital Libraries Program (UCAR & Unidata THREDDS Project)
• U.K. e-Science and British Atmospheric Data Center
• NOAA NOMADS and CEOS-grid
• Earth Science Portal group (multi-agency, international)
• ESMF (emerging)
29
NCAR Command Language (NCL)
36
NCL: Core
• Approx. 500 built-in functions and procedures
  – File I/O & data model for Earth sciences
  – Unique grids, Climate-modeling routines
  – Spherical harmonics, Regridding and interpolation
  – Graphics (wind barbs, simple 3D plots)
• 36 NCL core visual representations
  – Contours, XY plots, vectors, streamlines, maps, histograms, text, markers, polygons
• Supported on Unix, Linux, Mac, and PC

10 years, 20 people involved with development, 50 person-years of effort, about 1.5 million lines of source, 500K lines of documentation
37
NCL as CI for a Community
• CAM & CCSM Processor: 100 functions, 200 examples, 20K lines of NCL code (CGD)
• WGNE Climate Diagnostics Processor: 10K lines of NCL code (CGD)
• Award-winning Aviation Weather Site (RAP)
• MM5 Analysis Package (RIP)
• Weather Research & Forecast Model: Initial community analysis software and RIP
• Community Data Portal (SCD)
38
NCL
http://ngwww.ucar.edu/ncl
39
Collaborative Environments and the AccessGrid
Science Portals + AccessGrid:
University of Michigan (Knoop, Hardin)
Vegetation & Ecosystem Mapping Program (VEMAP)
NCAR/SCD VETS/KEG
Argonne National Labs
40
END