1 Earth System Grid Center For Enabling Technologies (ESG-CET) Introduction and Overview Dean N. Williams, Don E. Middleton, Ian T. Foster, and David E. Bernholdt On behalf of the ESG-CET Team Project Web Site: Mid-Term Project Review Rockville, MD May 11, 2009

2 Agenda
1. Introduction and Overview
2. Overall Architecture Design
3. Gateway
4. Data Node
(Break)
5. Accomplishments
6. Collaborations and Partnerships
7. Recap of Morning Presentations
(Lunch)
8. Research and Development
(Break)
9. Demonstration
10. Future Work
11. Summary

3 A Brief History: ESG-I
 The emerging challenge of climate data
 Proposal to DOE's Next Generation Internet (NGI) program in March 1999
 ANL, LANL, LBNL, LLNL, NCAR, USC/ISI
 Data movement and replication
 Prototype climate "data browser"
 "Hottest Infrastructure" award at SC2000
 NGI cut short; follow-on funding from OBER and MICS
 Ideas on the table, partnerships, experience
 Minimal end-user deployment or use
 Began development of SciDAC proposal

4 A Brief History: ESG-II
 SciDAC Program announced; began proposal in 2000
 ANL, LANL, LBNL, LLNL, NCAR, ORNL, USC/ISI
 "Turning Climate Datasets into Community Resources"
 New focus on web-based portals, metadata, seamless access to archival storage, security, operational service
 Uncertain about size of audience, hoping for
 Very positive mid-term assessment in 2003
 PCMDI accepted WGCM/CMIP role in 2004
 Operational CCSM portal in 2004
 Operational IPCC/CMIP portal later in 2004
 In 2006: 200 TB of data holdings, 4,000 users, 130 TB served

5 Purpose and Scope
Purpose
 Provide climate researchers worldwide with access to the data, information, models, analysis tools, and computational resources required to make sense of enormous climate simulation datasets
Scope
 Petabyte-scale data volumes
 Gateway to climate change data products, model outputs, and informational sites (i.e., globally federated sites)
 Comprehensive registry of climate change Earth science research results and components
 Support for climate change scientists and their partners: analysts, data managers, educators, and decision makers
 Resource for national and international science and societal-benefit initiatives
 Access to climate change data products through interoperable web services and climate analysis tools

6 Objectives
 Meet specific distributed database, data access, and data movement needs of national and international climate projects
 Provide a universal and secure web-based data access portal for broad multi-model data collections
 Provide a wide range of Grid-enabled climate data analysis tools and diagnostic methods to international climate centers and U.S. government agencies
 Develop Grid technology that enhances data accessibility and usability
 Make newly developed tools and technologies available for use in other domains

7 Project Participants and Focus Areas

8 Project Team
 ANL: Rachana Ananthakrishnan, Ian Foster, Neill Miller, Frank Siebenlist
 LBNL: Junmin Gu, Vijaya Natarajan, Arie Shoshani, Alex Sim
 LLNL: Robert Drach, Dean N. Williams
 LANL: Phil Jones
 NCAR: David Brown, Julien Chastang, Luca Cinquini, Peter Fox, Danielle Harper, Nathan Hook, Don Middleton, Eric Nienhouse, Gary Strand, Patrick West, Hannah Wilcox, Nathaniel Wilhelmi, Stephan Zednik
 PMEL: Steve Hankin, Roland Schweitzer
 ORNL: David Bernholdt, Meili Chen, Jens Schwidder, Sudharshan Vazhkudai
 USC/ISI: S. Bharathi, Ann Chervenak, Robert Schuler, Mei-Hui Su
Key (roles indicated in the original slide legend): Institutional PI, Project Co-PI, Project Lead PI, Executive Committee

9 Project Organization

10 Concept Overview (architecture diagram: workstation applications and thick clients; standard browser and web services)

11 Capabilities, Usage, and Impact
Capabilities
 "Virtual datasets" created through subsetting and aggregation
 Metadata-based search and discovery
 Bulk data access
 Web-based access
Usage (archive facts)
 NCAR Gateway: data holdings 198 TB; registered users 13,000+; data downloaded 100 TB
 PCMDI/LLNL CMIP3 Gateway: data holdings 35 TB; registered users 3,000+; data downloaded 600+ TB
Impact
 Over 500 sites worldwide
 Over 500 scientific papers published based on CMIP3 data
 Average downloads: 400 to 600 GB/day
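The "virtual dataset" capability above can be sketched as lazy subsetting over an aggregation of files: many physical files are presented as one logical dataset, and only the requested slice is materialized. This is a minimal stand-in for illustration; the class and field names here are hypothetical, not the actual ESG/TDS implementation.

```python
class FilePart:
    """One file in the aggregation: a starting time index plus its values."""
    def __init__(self, t0, values):
        self.t0 = t0            # first time index covered by this file
        self.values = values    # one value per time step

class VirtualDataset:
    """Presents many FileParts as a single logical time series."""
    def __init__(self, parts):
        self.parts = sorted(parts, key=lambda p: p.t0)

    def subset(self, start, stop):
        """Return values for time indices [start, stop), crossing file boundaries."""
        out = []
        for p in self.parts:
            for i, v in enumerate(p.values):
                t = p.t0 + i
                if start <= t < stop:
                    out.append(v)
        return out

# Two "files" aggregated into one logical dataset; the subset spans the boundary
ds = VirtualDataset([FilePart(0, [10, 11, 12]), FilePart(3, [13, 14, 15])])
print(ds.subset(2, 5))   # [12, 13, 14]
```

The point of the design is that users never deal with file boundaries: the aggregation layer maps a logical index range onto whichever files hold it.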

12 Data Integration Challenges Facing Climate Science
 Modeling groups will generate more data in the near future than exist today
 A large part of research consists of writing programs to analyze data
 How best to collect, distribute, and find data on a much larger scale? At each stage, tools could be developed to improve efficiency
 Substantially more ambitious community modeling projects, at petabyte (PB) and exabyte (EB) scale, will require a distributed database
 Metadata describing extended modeling simulations (e.g., atmospheric aerosols and chemistry, carbon cycle, dynamic vegetation), and beyond that: economy, public health, energy, etc.
 How to make information understandable to end users so that they can interpret the data correctly
 More users than just Working Group 1 (WG1) science: WG2 (impacts) and WG3 (mitigation), plus policy makers, economists, health officials, etc.
 Integration of multiple analysis tools, formats, and data from unknown sources
 Trust and security on a global scale (not just an agency or country, but worldwide)

13 Complexity of Data Distribution
 Future coupled runs will produce much larger data sets
 Storage and retrieval need new thinking
 Additional quality assurance data and software
 Tools to facilitate publication and cataloging of output
  - Publication: the act of putting data in the database and making it visible to others
  - Cataloging: describing where a data set, file, or database entity is located
 Automated updating of output availability/status pages
 Automated notification to users, with updates tailored to their interests (new, withdrawn, or replaced data)
 Sophisticated discovery capabilities
 Common data transfer tasks can be automated

14 It’s All About the Data  Data publication  Data access  Data viewing  Data sharing  Data versioning  Data replication  Data products  Data delivery  Standards and interoperability

15 Strategic Challenges for ESG-CET
 Sustain and build upon the existing ESG archives
 Address future scientific needs for data management and analysis by extending support for sharing and diagnosing climate simulation data:
  - Coupled Model Intercomparison Project, Phase 5 (CMIP5) for scientists contributing to the IPCC Fifth Assessment Report (AR5) in 2010
  - SciDAC II: A Scalable and Extensible Earth System Model for Climate Change Science
  - The Climate Science Computational End Station (CCES)
  - The North American Regional Climate Change Assessment Program (NARCCAP)
  - Other wide-ranging climate model evaluation activities
 How to make information understandable to end users so that they can interpret the data correctly
 Local and remote analysis and visualization tools in a distributed environment (i.e., subsetting, concatenating, regridding, filtering, …):
  - Integrating analysis into a distributed environment
  - Providing climate diagnostics
  - Delivering climate component software to the community
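One of the server-side operations listed above, regridding, can be illustrated with a nearest-neighbour sketch in one dimension: each point on the destination grid takes the value of the closest source point. This is only to show the operation; the real ESG analysis stack uses proper climate tools, and this function name is hypothetical.

```python
def regrid_nearest(src_coords, src_values, dst_coords):
    """Map each destination coordinate to the value of the nearest
    source coordinate (ties resolve to the lower-index source point)."""
    out = []
    for x in dst_coords:
        nearest = min(range(len(src_coords)),
                      key=lambda i: abs(src_coords[i] - x))
        out.append(src_values[nearest])
    return out

# A coarse 3-point field regridded onto a finer 5-point axis
print(regrid_nearest([0.0, 1.0, 2.0], [5, 6, 7],
                     [0.0, 0.5, 1.0, 1.5, 2.0]))   # [5, 5, 6, 6, 7]
```

Doing this on the server rather than the client is the point of the slide: only the regridded (usually much smaller) result crosses the network.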

16 CMIP5 (IPCC AR5) Is a Major Driver for ESG Development
 CMIP5 multi-model archive expected to include three suites of experiments ("near-term" decadal prediction, "long-term" century and longer, and "atmosphere-only"):
  - 40+ models
  - 600+ TB of "core" data, 6+ PB of total data
  - Contributed by 25+ modeling centers in 17+ countries
 Driver for the scale of data and its global distribution
 Timeline fixed by the IPCC
 Already working with key international partners to establish a testbed:
  - Program for Climate Model Diagnosis and Intercomparison (PCMDI, U.S.)
  - National Center for Atmospheric Research (NCAR, U.S.)
  - Oak Ridge National Laboratory (ORNL, U.S.)
  - Geophysical Fluid Dynamics Laboratory (GFDL, U.S.)
  - British Atmospheric Data Centre (BADC, U.K.)
  - Max Planck Institute for Meteorology (MPIM, Germany)
  - JAMSTEC and the University of Tokyo Center for Climate System Research (Japan)

17 ESG-CET AR5 Timeline
 2008: Design and implement core functionality: browse and search; registration; single sign-on / security; publication; distributed metadata; server-side processing
 Early 2009: Testbed; plan to include at least seven centers in the US, Europe, and Japan
 2009: Address system integration issues; develop the production system
 2010: Modeling centers publish data
  : Research and journal article submissions
 2013: IPCC AR5 Assessment Report

18 ESG-CET Collaborates Extensively
 Leverage best-in-class tools and capabilities developed elsewhere
 Increase outreach, the ability to serve the scientific community, and impact
 Joint development of new ideas and technologies of common interest
Key: collaborators relying on ESG to reach their goals are highlighted in italic blue; those relying on ESG to develop tools and technologies, in italic red; those relying on ESG to deliver their products to the climate science community, in italic green.

19 Accomplishments: Development
 Gateway web application (new)
 Data Node components integration (new publishing client integrated with existing TDS and LAS servers, and with the Gateway)
 Security architecture for federation across Gateways and partner data centers:
  - OpenID for web single sign-on (SSO)
  - MyProxy integration for rich-client access
  - Web services for user-attribute retrieval
 Architecture for metadata exchange among Gateways and partner data centers (based on OAI-PMH)
 BeStMan middleware for deep-storage file retrieval (new)
 Handling and access of detailed model metadata (in collaboration with the Earth System Curator)
Two major accomplishments are the Gateway and the Data Node, which form the main components of the ESG-CET architecture.
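The metadata exchange mentioned above uses OAI-PMH, a simple harvesting protocol in which one site pulls metadata records from another over HTTP GET requests with XML responses. As a minimal sketch, the snippet below only constructs a `ListRecords` request URL; the gateway endpoint shown is hypothetical, and no network call is made.

```python
from urllib.parse import urlencode

def oai_list_records(base_url, metadata_prefix, oai_set=None):
    """Build an OAI-PMH ListRecords request URL.

    verb and metadataPrefix are required by the protocol; 'set'
    optionally restricts the harvest to one collection.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if oai_set:
        params["set"] = oai_set
    return base_url + "?" + urlencode(params)

# Hypothetical gateway endpoint, harvesting Dublin Core records
# for a "cmip3" set
url = oai_list_records("https://gateway.example.org/oai",
                       "oai_dc", oai_set="cmip3")
print(url)
```

A harvester would issue this request periodically, page through results using the protocol's `resumptionToken`, and ingest the returned records into its own catalog, which is how federated Gateways keep their metadata in sync without sharing a database.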

20 Accomplishments: Operational
 Sustained data delivery from 2004 to the present from three ESG data portals
 Registered over 16,000 users worldwide
 Over 700 TB downloaded (approaching the 1 PB milestone)
 Reached the milestone of 500 scientific research papers published based on CMIP3 data
 Added C-LAMP, NARCCAP, and CFMIP to the distributed archive

21 Future Plans Short-term: Packaging and documentation of Gateway software Packaging and documentation of the Data Node software Integration with Data Mover Lite (DML) Federation with partner data centers  Longer-term: Gateway customization Expanded visualization services Gateway and Data Node invoking more of the LAS functionality GIS services Google Earth services Remote query services for rich client access User and Group workspaces Server-side processing and analysis services