The Earth System Grid Discovery and Semantic Web Technologies. Line Pouchard, Oak Ridge National Laboratory; Luca Cinquini, Gary Strand, National Center for Atmospheric Research.


The Earth System Grid Discovery and Semantic Web Technologies
Line Pouchard, Oak Ridge National Laboratory
Luca Cinquini, Gary Strand, National Center for Atmospheric Research
Scientific Web Technologies for Searching and Retrieving Scientific Data
ISWCII, Sanibel Island, FL, October 20, 2003

Line Pouchard, Oak Ridge National Laboratory, October 20, 2003

A geographically distributed team of climate and computer scientists:
  – Climate scientists are our target users – simultaneous users
  – Scientists providing expertise and leadership to the Intergovernmental Panel on Climate Change (IPCC)
A computing and data Grid collaboratory sponsored by the US Department of Energy.
A distributed system for storage, access, and discovery of post-processing data resulting from climate simulations on supercomputers.

ESG: Collaboration Network
[Figure: data producers and data consumers linked through ESG services (information, replica, metadata, community authorization) on top of a Grid and network infrastructure with online storage systems and computational resources.]

Current Status of Climate Data
Data sizes (estimated to be produced in the next 3-4 years for IPCC), types of storage, location of storage
  – NCAR (Boulder, CO): Terabytes, NERSC (Berkeley, CA): TB, ORNL (Oak Ridge, TN): TB. Total: TB.
  – Stored on mass storage archives, disk caches and tapes.
  – Data replicated at 3 locations in the US.
Data format conventions and simulation output formats
  – Minimal metadata produced or associated by current simulations.
  – Multiple output formats.
  – Many complex standards.
Discovery and retrieval
  – Datasets are not described in detail.
  – Metadata resides in the data manager's head.
  – Largely manual access.
  – Different access mechanisms at different sites.
Far from seamless automated data discovery and access.

ESG goals for search and retrieval
Enable searches and downloads through a seamless process
  – Data search across multiple sites and storage locations.
  – Access to all ESG functionality from the desktop through a single point of entry (a Web data portal).
  – Some degree of access control (authentication, certificates).
Keep track of datasets, particularly on deep storage (archives, caches, tapes)
  – Data formats
  – Find related datasets: "campaign," "ensembles"
  – Simulation model descriptions and configurations
  – Related simulations: "parent," "child," "sibling"
  – Browsable, searchable, and extensible metadata
Several levels of users
  – Easy-to-use, integrated tools (otherwise, no one will use them)
Collaborate with other groups: CCLRC e-Science Centre and the British Atmospheric Data Centre.

Discovery: Ontology and Metadata Services
[Figure: layered architecture, listed from top to bottom]
ESG CLIENTS API & USER INTERFACES
  – Publishing
  – Search & discovery
  – Browsing & display
HIGH LEVEL METADATA SERVICES
  – Metadata display
  – Metadata browsing
  – Metadata query
  – Metadata discovery
  – Metadata registration
CORE METADATA SERVICES
  – Metadata access (update, insert, delete, query)
METADATA HOLDINGS
  – Metadata catalogs
  – Legacy data catalogs
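The core "metadata access" layer names four operations: update, insert, delete, and query. A minimal sketch of that interface follows, assuming an in-memory dictionary as the catalog. The class `MetadataCatalog`, its method signatures, and the sample record values are hypothetical illustrations, not the ESG implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataCatalog:
    """Hypothetical in-memory stand-in for a metadata catalog."""
    records: dict = field(default_factory=dict)

    def insert(self, dataset_id, metadata):
        # Refuse to overwrite an existing registration.
        if dataset_id in self.records:
            raise KeyError(f"{dataset_id} already registered")
        self.records[dataset_id] = dict(metadata)

    def update(self, dataset_id, **changes):
        self.records[dataset_id].update(changes)

    def delete(self, dataset_id):
        del self.records[dataset_id]

    def query(self, **criteria):
        """Return the sorted ids of records matching every criterion exactly."""
        return sorted(
            dataset_id
            for dataset_id, metadata in self.records.items()
            if all(metadata.get(key) == value for key, value in criteria.items())
        )


# Illustrative values only; none of these identifiers come from ESG.
catalog = MetadataCatalog()
catalog.insert("run-a", {"format": "netCDF", "campaign": "demo"})
catalog.insert("run-b", {"format": "netCDF", "campaign": "other"})
catalog.update("run-b", campaign="demo")
matches = catalog.query(format="netCDF", campaign="demo")
catalog.delete("run-a")
remaining = catalog.query(format="netCDF")
```

The higher-level services in the figure (display, browsing, discovery, registration) could then be written against these four calls without knowing how the holdings are stored.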

Motivations for a prototype ontology
Development of an ESG metadata schema
  – Help structure and guide the development efforts
  – Provide a context
Trust
  – Provenance and logistic information
  – Data quality and curation
Prepare for a federation of data sources and interoperability between metadata schemas
  – The ability to perform searches across these sources from a single point of entry.

ESG ontology concepts and relationships
Datasets
  – File names (these tell a lot)
  – Formats and conventions
  – Coverage (space, time, multi-dimensional physical grids)
  – Calendar years
  – Parameters
  – Related datasets
  – Campaigns
ESG Service
  – Used_by
Pedigree
  – Participants, roles in ESG
  – Provenance: traces origins and transformations
  – Is_generated_by
  – Storage location
Scientific Use: Simulations
  – has_parent, has_child, has_sibling
  – Input_type
  – hardware_type
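The simulation relationships has_parent, has_child, and has_sibling can be illustrated with a small tree structure. The sketch below assumes that siblings are simply the other simulations sharing the same parent; the slides do not define the relation formally. The class `Simulation` and the run names are hypothetical.

```python
class Simulation:
    """Hypothetical node in a tree of related simulations."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # has_parent
        self.children = []     # has_child
        if parent is not None:
            parent.children.append(self)

    def siblings(self):
        """has_sibling: the other simulations that share this one's parent."""
        if self.parent is None:
            return []
        return [sim for sim in self.parent.children if sim is not self]


# Illustrative names only.
control = Simulation("control")
branch_a = Simulation("branch_a", parent=control)
branch_b = Simulation("branch_b", parent=control)
sibling_names = [sim.name for sim in branch_a.siblings()]
```

Storing only the parent link and deriving siblings on demand keeps the three relations consistent with one another, which is one reason to make such relationships explicit in an ontology rather than recording them independently.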

Guiding principles for the development of an ESG ontology
  – Separate entities describing "things" from entities describing processes.
  – Decouple concepts specific to a domain area from those common to other (Grid) projects.
  – Keep terminology intuitive to users.
  – Make explicit relationships between XML elements.
  – Ontology tools were used to analyze current ESG schemas at every stage of development.

[Figure: class diagram of the ESG metadata schema. Attributes are listed with their cardinalities; the endpoints of individual associations are not recoverable from the transcript.]
Object: [1] id
Activity: [0,1] name, [0,1] description, [0,1] rights, [0,n] date type=, [0,n] note, [0,n] participant role=, [0,n] reference uri=
Investigation
Project: [0,n] topic type=, [0,1] funding
Ensemble
Campaign
Simulation: [0,n] simulationInput type=, [0,n] simulationHardware
Observation
Experiment
Analysis
Dataset: [0,1] type, [0,1] conventions, [0,n] date type=, [0,n] format type= uri=, [0,1] timeCoverage, [0,1] spaceCoverage
Person: [0,1] firstName, [0,1] lastName, [0,1] contact
Institution: [0,1] name, [0,1] type, [0,1] contact
Service: [0,1] name, [0,1] description
ParameterList
Parameter: [1] name, [0,1] mapping authority=
Relationship labels: isA, isPartOf, hasParent, hasChild, hasSibling, generatedBy, worksFor, participant role=, serviceId, hasParameters, hasParameter
Legend: Class, AbstractClass, inheritance, association
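The bracketed cardinalities in the diagram, such as [1], [0,1], and [0,n], lend themselves to a mechanical check. The sketch below transcribes the Dataset attributes into a table and counts occurrences in a record. It assumes that Dataset inherits the required id from Object, which the diagram suggests but the transcript does not confirm. The function `cardinality_errors` and the sample records are hypothetical.

```python
import math

# (min, max) occurrences per attribute, transcribed from the diagram.
# Assumption: Dataset inherits the required "id" from Object.
DATASET_SCHEMA = {
    "id": (1, 1),
    "type": (0, 1),
    "conventions": (0, 1),
    "date": (0, math.inf),
    "format": (0, math.inf),
    "timeCoverage": (0, 1),
    "spaceCoverage": (0, 1),
}


def cardinality_errors(record, schema):
    """Return messages for attributes that occur too rarely, too often, or are unknown.

    `record` maps each attribute name to a list of its values.
    """
    errors = []
    for name, (low, high) in schema.items():
        count = len(record.get(name, []))
        if not low <= count <= high:
            errors.append(f"{name}: found {count}, expected [{low},{high}]")
    for name in record:
        if name not in schema:
            errors.append(f"{name}: not in schema")
    return errors


# Illustrative records only.
good = {"id": ["ds-1"], "date": ["created", "modified"]}
bad = {"type": ["a", "b"], "owner": ["x"]}
good_errors = cardinality_errors(good, DATASET_SCHEMA)
bad_errors = cardinality_errors(bad, DATASET_SCHEMA)
```

Here `good` passes, while `bad` is reported three times: it lacks the required id, repeats type, and carries an attribute the schema does not define.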

Discovery Services Architecture
[Figure: components are the ESG Portal, Discovery Service, Metadata Catalog Service, Replica Location Services, and Storage. Arrow labels: Searches, Metadata, Logical File Names, Physical File Names, Storage Location, Download.]
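One plausible reading of the architecture figure is a two-step lookup: a search against the metadata catalog yields logical file names, and a replica location service maps each logical name to the physical copies available for download. The sketch below assumes that flow. The dictionaries, the `lfn:` naming scheme, the site names, and the functions `search`, `locate`, and `discover` are all made up for illustration.

```python
# Hypothetical catalog contents: logical file name -> descriptive metadata.
METADATA_CATALOG = {
    "lfn:demo/run1/temperature": {"parameter": "temperature", "campaign": "demo"},
    "lfn:demo/run1/pressure": {"parameter": "pressure", "campaign": "demo"},
    "lfn:other/run9/temperature": {"parameter": "temperature", "campaign": "other"},
}

# Hypothetical replica table: logical file name -> physical copies.
REPLICA_LOCATIONS = {
    "lfn:demo/run1/temperature": ["site-a:/archive/t1", "site-b:/cache/t1"],
    "lfn:demo/run1/pressure": ["site-a:/archive/p1"],
    "lfn:other/run9/temperature": ["site-c:/tape/t9"],
}


def search(**criteria):
    """Metadata catalog step: return logical file names matching every criterion."""
    return sorted(
        lfn
        for lfn, metadata in METADATA_CATALOG.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


def locate(logical_name):
    """Replica location step: return the physical copies of one logical file."""
    return REPLICA_LOCATIONS.get(logical_name, [])


def discover(**criteria):
    """Portal-level step: chain the search and the replica lookup."""
    return {lfn: locate(lfn) for lfn in search(**criteria)}


result = discover(parameter="temperature", campaign="demo")
```

Keeping logical and physical names in separate services lets data be replicated or moved between sites without touching the descriptive metadata that users search.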

Leveraging Semantic Web efforts in Grid projects
The Semantic Web
  – Highlighted the need for sharing information based on content.
  – Provided web-based languages for knowledge acquisition and reasoning.
  – Offers directions for ontology reconciliation.
  – Ontologies already exist in the Earth Sciences.
Challenges presented by ESG
  – Real-life complexity.
  – Scientists, as both beginners and expert users, demand usability …
  – Measures of success.
  – Changing a scientist's work habits requires an immediate and visible payoff.
  – Data sizes: scalability of the approach.