Fox 2 AISRP April 4-6, 2005 • Earth System Grid • Grid-enabled OPeNDAP • Architecture - Server and Application access • Framework experience

Presentation transcript:

Fox 2 AISRP April 4-6, 2005 Overview
• Earth System Grid
• Grid-enabled OPeNDAP
• Architecture - Server and Application access
• Framework experience
• Summary
• Plans for the coming year

Fox 3 AISRP April 4-6, 2005 Earth System Grid Overview
• The goal of ESG is to make climate data, particularly climate model data, an easily accessible community resource. The project is funded by the SciDAC program: Scientific Discovery through Advanced Computing.
• Enabling researchers to understand and make effective use of very large, distributed climate datasets is critical. The broad strategy is to develop a collection of server-side capabilities that minimize the amount of data movement.
• Multiple interfaces to ESG will allow researchers to focus on science rather than on issues of data transfer, format, and dataset manipulation.
• The foundation is Globus Grid technology.

Fox 4 AISRP April 4-6, 2005 Earth System Grid Portal

Fox 5 AISRP April 4-6, 2005 ESG: U.S. Collaborations & Development
• ORNL: Climate storage & computational resources
• LANL: Next generation coupled models & computing
• ANL: Computational grids & grid-based applications
• USC/ISI: Computational grids & grid-based applications
• NCAR: Climate change prediction and scenarios
• LBNL: Climate storage facility
• LLNL: Model diagnostics & inter-comparison

Fox 6 AISRP April 4-6, 2005 ESG areas of development
• Authentication and Authorization services: application of Globus technologies for secure data management and access (PKI certificates, proxy delegation, Community Authorization Services, web interfaces)
• Data Transport Services: based on the GridFTP protocol and implementation (high speed, tunable, multi-stream, reliable), with extensions for multi-file management and connection to offline storage systems (Hierarchical Storage Management), and for transparent data access and operations (grid-enabled OPeNDAP)
• Metadata services (for data management, access, search & discovery, annotation, analysis, etc.)
• Other services: Data Analysis and Visualization, Task Management, Monitoring and Control, Replication of data, etc.

Fox 7 AISRP April 4-6, 2005 ESG: ESG-II Architecture

Fox 8 AISRP April 4-6, 2005 [Architecture diagram]
Sites: LBNL, LLNL, ISI, NCAR, ORNL, ANL
Components: TOMCAT servlet engine; MCS (Metadata Cataloguing Services) and MCS client; RLS (Replica Location Services) and RLS client; MyProxy server and MyProxy client; CAS (Community Authorization Services) and CAS client; GRAM gatekeeper; SRM (Storage Resource Management); MSS (Mass Storage System); HPSS (High Performance Storage System); disk; gridFTP server; CAS-enabled striped gridFTP server; striped gridFTP client; openDAPg server; LAS (Live Access Server)
Connection labels: SOAP, RMI, gridFTP

Fox 9 AISRP April 4-6, 2005 The Earth System Grid [Services diagram]
Sites: NCAR, LBNL, LLNL, ISI, ANL, ORNL
Service groups: SECURITY services; METADATA services; FRAMEWORK services; DATA storage; TRANSPORT services; ANALYSIS & VIZ services; MONITORING services
Labels shown: GSI; CAS server; CAS client; MyProxy client; MyProxy server; TOMCAT; GRAM; Auth metadata; RLS (MySQL); NERSC HPSS; NCAR MSS; DISK; ORNL HPSS; THREDDS catalogs; Xindice; MySQL; OGSA-DAIS; MCS; gridFTP server/client; HRM; openDAPg server; NCL openDAPg client; LAS server; CDAT openDAPg client; SLAMON daemon; AXIS

Fox 10 AISRP April 4-6, 2005 Metadata-centric view of ESG services [Diagram]
METADATA SERVICES
Service and metadata labels, in the order shown:
• USER AUTHENTICATION AND AUTHORIZATION: ACCESS AND AUTHORIZATION METADATA
• DATA TRANSPORT: LOCATION METADATA
• SYSTEM MONITORING AND CONTROL: LOGGING METADATA
• DATA SEARCH & DISCOVERY: CONTENT METADATA
• ANNOTATION & HISTORY METADATA
• DATA ANALYSIS & VISUALIZATION: AGGREGATION METADATA
• DATA BROWSING: CATALOGUING METADATA

Fox 11 AISRP April 4-6, 2005 OPeNDAP and Grid systems
• DODS, since ~1995, was based on HTTP and a CGI-style architecture
• Two concerns:
  - Application support and performance of HTTP
  - Housekeeping abilities of the CGI architecture
• Solution: evolve OPeNDAP, the discipline-neutral aspect of DODS

Fox 12 AISRP April 4-6, 2005 OPeNDAP ctd.
• Data transport protocol and access protocol separated
• Revised server architecture
• Address Grid-style authentication
• Memory management
• Exception handling
• Make all these changes while retaining interoperation with HTTP and CGI
• Advanced requirement: a URL should support more than one dataset or object, i.e. aggregation
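The separation of transport protocol from access protocol listed on this slide can be illustrated with a minimal Python sketch. It is not OPeNDAP code: `Transport`, `InMemoryTransport`, `AccessLayer` and the comma-separated payload are all hypothetical names and formats chosen for illustration. The point is only that the access layer subsets data without knowing how the bytes were moved.

```python
from typing import Dict, List, Protocol


class Transport(Protocol):
    """Hypothetical transport interface: moves bytes, knows nothing about data."""

    def fetch(self, location: str) -> bytes:
        ...


class InMemoryTransport:
    """Stand-in transport for the sketch; a real one might speak HTTP or GridFTP."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        self._store = store

    def fetch(self, location: str) -> bytes:
        return self._store[location]


class AccessLayer:
    """Hypothetical access layer: interprets and subsets data from any transport."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def read_values(self, location: str, start: int, stop: int) -> List[int]:
        # The payload format (comma-separated integers) is an assumption.
        raw = self._transport.fetch(location)
        values = [int(token) for token in raw.decode("ascii").split(",")]
        return values[start:stop]


transport = InMemoryTransport({"dataset_a": b"1,2,3,4,5"})
subset = AccessLayer(transport).read_values("dataset_a", 1, 3)
```

Swapping `InMemoryTransport` for another class with the same `fetch` method would leave `AccessLayer` unchanged, which is the property the slide asks for.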

Fox 13 AISRP April 4-6, 2005 OPeNDAP 3.x vs OPeNDAP-g Architecture
OPeNDAP 3.x:
• Simple and easy to install
• One CGI process per URL request
• Limited memory management (external)
• Limited scalability
• Limited status reporting to web server
• Returns data stream from one format
OPeNDAP-g:
• Standalone server or httpd module
• Can manage multiple daemon processes
• Strong memory management (internal)
• Reuse processes, scales
• Coupled to OPeNDAP server for status
• Returns multiple formats in a single stream, multiple protocols
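The contrast between "one CGI process per URL request" and "reuse processes" can be sketched with a fixed worker pool. This is only an analogy under stated assumptions: it uses Python threads from the standard library rather than daemon processes, and `handle_request` and `serve_with_pool` are hypothetical names, not part of any OPeNDAP server.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def handle_request(url: str) -> Tuple[str, str]:
    """Pretend to serve a request; report which worker handled it."""
    return (url, threading.current_thread().name)


def serve_with_pool(urls: List[str], workers: int = 2) -> List[Tuple[str, str]]:
    """Serve every request from a fixed set of long-lived workers.

    A CGI-style server would instead start a fresh process for each URL.
    """
    with ThreadPoolExecutor(max_workers=workers, thread_name_prefix="worker") as pool:
        # map() preserves the order of the input requests.
        return list(pool.map(handle_request, urls))


results = serve_with_pool([f"request-{i}" for i in range(10)], workers=2)
distinct_workers = {name for _, name in results}
```

Ten requests are handled by at most two workers here, so start-up cost is paid per worker rather than per request.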

Fox 14 AISRP April 4-6, 2005

Fox 15 AISRP April 4-6, 2005 Application development

Fox 16 AISRP April 4-6, 2005 Status
• Operational/production release of standalone OPeNDAP server (no dependence on web server) for ESG
• Run OPeNDAP server as a client to GridFTP or HTTP server
• Multi-protocol support: file, http, GridFTP, ftp, etc.
• File format support: netCDF, CDF, FITS, CEDAR, …
• Re-architected for aggregation support and performance
• Portal application client in production, netCDF client operational
• Authentication is handled outside OPeNDAP server framework
• URL syntax is more complex but more expressive
• Will become part of community OPeNDAP release very soon
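Multi-protocol support of the kind listed in the status slide is commonly implemented by dispatching on the URL scheme. The following is a minimal sketch under assumptions, not the server's actual logic: the `PROTOCOLS` table and `select_protocol` are hypothetical, the example host is made up, and `gsiftp` is used as the scheme for GridFTP URLs.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from URL scheme to the protocol named on the slide.
PROTOCOLS = {
    "file": "file",
    "http": "http",
    "ftp": "ftp",
    "gsiftp": "GridFTP",
}


def select_protocol(url: str) -> str:
    """Return the protocol label for a data URL, or raise for unknown schemes."""
    scheme = urlsplit(url).scheme.lower()
    try:
        return PROTOCOLS[scheme]
    except KeyError:
        raise ValueError(f"unsupported protocol: {scheme!r}") from None


chosen = select_protocol("gsiftp://host.example/data/model_output.nc")
```

A real server would map each scheme to a reader object rather than a label; the table lookup is the part that stays the same.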

Fox 17 AISRP April 4-6, 2005 ESG: Framework experience
• ESG is a highly collaborative effort allowing users to quickly access data (petabytes of raw or processed data in an application-independent manner).
• Payoffs of this distributed collaborative infrastructure have included:
  - Distributed data-sharing: RLS works! SRM/HRM work! OPeNDAP-g works!
  - Simplified data discovery of climate data: the work on metadata paid off! Scalability?
  - Large-scale climate data processing and analysis via a highly integrated portal
  - Increased collaboration among climate research scientists: people use it!
  - Aid in climate assessments and estimates of future climate variability and trends: IPCC!

Fox 18 AISRP April 4-6, 2005 ESG: Framework experience
• Transport - GridFTP versus HTTP
  ✓ Server to server
  ✓ Very good performance
  ✗ Depends on a very specific version of the GridFTP server (striped)
  ✗ Clients are not as capable due to the ‘weight’ of Globus, so they revert to HTTP
• Scalability and response times (data AND metadata)
  ✓ Framework architecture supports re-layering for tuning
• Service monitoring
  ✓ To support the distributed collaborative infrastructure
  ✗ Need many or all of the services to really make a production environment work
• Try out ESG by visiting the website at:
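The observation that clients revert to HTTP when a GridFTP client is not practical can be sketched as an ordered fallback. Everything here is hypothetical: `TransportUnavailable`, `fetch_with_fallback` and the two stub transports are illustrative stand-ins and do not call Globus or any HTTP library.

```python
from typing import Callable, List, Sequence, Tuple


class TransportUnavailable(Exception):
    """Raised by a transport that cannot be used on this client."""


def fetch_with_fallback(
    location: str,
    transports: Sequence[Tuple[str, Callable[[str], bytes]]],
) -> Tuple[str, bytes]:
    """Try each transport in order and return (name, data) from the first that works."""
    errors: List[str] = []
    for name, fetch in transports:
        try:
            return name, fetch(location)
        except TransportUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise TransportUnavailable("; ".join(errors))


def gridftp_stub(location: str) -> bytes:
    # Stand-in for a client without the Globus stack installed.
    raise TransportUnavailable("no GridFTP client available")


def http_stub(location: str) -> bytes:
    # Stand-in for a plain HTTP download.
    return b"data for " + location.encode("ascii")


used, payload = fetch_with_fallback(
    "/data/model_output.nc",
    [("gridftp", gridftp_stub), ("http", http_stub)],
)
```

Ordering the list by preference keeps the faster transport first while still letting lightweight clients succeed.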

Fox 19 AISRP April 4-6, 2005 Success?
• Users are generally happy, developers are very happy
• Exploited new technology components
• Integration: when and how does it work and scale?
  ✓ XML -> SQL
  ✓ DODS -> OPeNDAP and OPeNDAP-g
• Globus provides a suite of framework components; some are easier to integrate than others, and some just don’t fit our use-cases and architecture
• Data framework: e.g. OPeNDAP has been extremely successful
• Carrying this to space science (solar-terrestrial)

Fox 20 AISRP April 4-6, 2005 Vision for building science cyberinfrastructure
• Use-case first, then requirements
• Then derive architecture and choose technology components
• Build a working system for users from the start
• Get your funding source and community to commit to an evolving architecture
• If you choose a major framework technology, e.g. Globus, OPeNDAP, THREDDS, partner with them

Fox 21 AISRP April 4-6, 2005 One paradigm
Goal: find the right balance of data/model holdings, portals and client software that a researcher can use without effort or interference, as if all the materials were available on his/her local computer.
E.g. the Virtual Solar-Terrestrial Observatory (VSTO) is proposed to be a distributed, scalable education and research environment for searching, integrating, and analyzing observational, experimental and model databases in the fields of solar, solar-terrestrial and space physics.
It comprises a system-like framework which provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use.

Fox 22 AISRP April 4-6, 2005 Virtual Observatory? Need better glue
• Basic problem: schemas are categorized rather than developed from an object model/class hierarchy, which significantly limits non-human use. However, they all form the basis to organize catalog interfaces for all types of data, images, etc.
• This limits data systems utilizing frameworks and prevents frameworks from truly interoperating (SOAP, WSDL only a start)
• Directories, e.g. NASA GCMD, CEDAR catalog, FITS (flat) keyword/value pairs, are being turned into ontologies (SWEET, VSTO)
• Markup languages, e.g. ESML, SPDML, ESG/ncML, are excellent bases
• Evolve, recast, merge (where appropriate) using formal processes and tools with intended use in mind: for interface specifications, reasoning, validation, etc. beyond the usual search and access
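The difference between flat keyword/value metadata and a class hierarchy can be shown with a small Python sketch. The classes, field names and values below are hypothetical illustrations and are not taken from SWEET, VSTO or any real catalog. With a hierarchy, software can answer "which of these are optical instruments?" by type, which a flat record cannot support without a human reading the keywords.

```python
from dataclasses import dataclass
from typing import List, Type

# Flat keyword/value record: software sees only strings, with no relationships.
flat_record = {"INSTRUME": "coronagraph", "DATE-OBS": "2005-04-04"}


@dataclass
class Instrument:
    name: str


@dataclass
class OpticalInstrument(Instrument):
    wavelength_nm: float


@dataclass
class Coronagraph(OpticalInstrument):
    # Hypothetical attribute, included only to give the subclass a field.
    occulter_radius: float


def instruments_of_type(items: List[Instrument], kind: Type[Instrument]) -> List[Instrument]:
    """Select items by class, including subclasses of the requested kind."""
    return [item for item in items if isinstance(item, kind)]


catalog = [
    Instrument("magnetometer"),
    Coronagraph("example coronagraph", 500.0, 1.1),
]
optical = instruments_of_type(catalog, OpticalInstrument)
```

The query for `OpticalInstrument` returns the coronagraph because the relationship is encoded in the class hierarchy, not in the keyword text.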

Fox 23 AISRP April 4-6, 2005 Summary
• Basic success in both data systems and data frameworks
• Satisfying user and sponsor needs (from ‘just’ to ‘outstanding’)
• Experience with Globus ranges from very good to not ready for our needs
• Experience with OPeNDAP is very good, esp. with core services
• Scalability and performance require an adaptable architecture, which is something system-level interfaces can still hide from the user
• Challenge: to bring these attributes to a framework, i.e. one in which the user is more exposed

Fox 24 AISRP April 4-6, 2005 Plans
• IDL application level access to new OPeNDAP server framework
• Outreach to NASA communities/data centers to install and test new capabilities (server and client)
• Joint development of accompanying semantic catalogs for Sun-Earth Connection datasets within the OPeNDAP framework
• SPDML-enabled OPeNDAP server