1 Using Semantically-Enabled Data Frameworks for Data Integration in Virtual Observatories Peter Fox * * HAO/ESSL/NCAR Deborah McGuinness $#, Luca Cinquini.

Slides:



Advertisements
Similar presentations
Geoinformatics 2008 Fox Semantic Provenance 1 Semantic Provenance for Image Data Processing Peter Fox (HAO/ESSL/NCAR) Deborah McGuinness (RPI) Jose Garcia,
Advertisements

1 Senn, Information Technology, 3 rd Edition © 2004 Pearson Prentice Hall James A. Senns Information Technology, 3 rd Edition Chapter 7 Enterprise Databases.
Chapter 1: The Database Environment
Chapter 7 System Models.
Page 1 CSISS LCenter for Spatial Information Science and Systems 03/19/2008 GeoBrain BPELPower Workflow Engine Liping Di, Genong Yu Center.
OMV Ontology Metadata Vocabulary April 10, 2008 Peter Haase.
1 Development of a Community Ontology for Earth System Science Rob Raskin NASA/Jet Propulsion Laboratory Pasadena, CA March 20, 2008.
1 Community Ontologies for Earth System Science Rob Raskin NASA/Jet Propulsion Laboratory Pasadena, CA April 3, 2008.
18 Copyright © 2005, Oracle. All rights reserved. Distributing Modular Applications: Introduction to Web Services.
Designing Services for Grid-based Knowledge Discovery A. Congiusta, A. Pugliese, Domenico Talia, P. Trunfio DEIS University of Calabria ITALY
1 Preliminary results of the Environmental Data Exchange Network for Inland Waters (EDEN-IW) project Practical lessons. P. Haastrup.
Copyright 2006 Digital Enterprise Research Institute. All rights reserved. MarcOnt Initiative Tools for collaborative ontology development.
Layering in Provenance Systems Margo Seltzer May 13, 2009 Provenance in Secure and Advanced Computer Systems.
Server Access The REST of the Story David Cleary
Solve Multi-step Equations
REVIEW: Arthropod ID. 1. Name the subphylum. 2. Name the subphylum. 3. Name the order.
Week 2 The Object-Oriented Approach to Requirements
Computer Literacy BASICS
Fact-finding Techniques Transparencies
Chapter 11: Models of Computation
1 The phone in the cloud Utilizing resources hosted anywhere Claes Nilsson.
© 2011 TIBCO Software Inc. All Rights Reserved. Confidential and Proprietary. Towards a Model-Based Characterization of Data and Services Integration Paul.
26/10/2008 SWESE'08 1 Enhanced Semantic Access to Software Artefacts Danica Damljanović and Kalina Bontcheva.
XML and Databases Exercise Session 3 (courtesy of Ghislain Fourny/ETH)
1 IC GS J. Broome, Mar Introduction to the Informatics and Data Aspects John Broome (Canada)
1 A Case Study in E- Science: Building Ecological Informatics Solutions for Multi-Decadal Research ARL/CNI 2008 Conference Washington, DC 16 October 2008.
31242/32549 Advanced Internet Programming Advanced Java Programming
Science as a Process Chapter 1 Section 2.
Executional Architecture
Who are the Experts?Simon KampaSlide 1 Who are the Experts? Simon Kampa IAM Group University of Southampton
1 Using one or more of your senses to gather information.
Chapter 10: The Traditional Approach to Design
Systems Analysis and Design in a Changing World, Fifth Edition
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
PSSA Preparation.
Mani Srivastava UCLA - EE Department Room: 6731-H Boelter Hall Tel: WWW: Copyright 2003.
Chapter 13 The Data Warehouse
1 Chapter 13 Nuclear Magnetic Resonance Spectroscopy.
RefWorks: The Basics October 12, What is RefWorks? A personal bibliographic software manager –Manages citations –Creates bibliogaphies Accessible.
Management Information Systems, 10/e
Introduction Peter Dolog dolog [at] cs [dot] aau [dot] dk Intelligent Web and Information Systems September 9, 2010.
1 Distributed Agents for User-Friendly Access of Digital Libraries DAFFODIL Effective Support for Using Digital Libraries Norbert Fuhr University of Duisburg-Essen,
From Model-based to Model-driven Design of User Interfaces.
Technology Exploration – Semantics Karen Moe NASA Earth Science Technology Office WGISS-37 Meeting April 14-18, 2014.
McGuinness – Microsoft eScience – December 8, Semantically-Enabled Science Informatics: With Supporting Knowledge Provenance and Evolution Infrastructure.
1 Shifting the Burden from the User to the Data Provider Peter Fox High Altitude Observatory, NCAR (***) With thanks to eGY and various NSF, DoE and NASA.
1 Foundations V: Infrastructure and Architecture, Middleware Deborah McGuinness and Peter Fox CSCI Week 9, October 27, 2008.
The Case for Data Stewardship: Preserving the Scientific Record Matthew Mayernik National Center for Atmospheric Research Version 2.0 [Review Date]
Configurable User Interface Framework for Cross-Disciplinary and Citizen Science Presented by: Peter Fox Authors: Eric Rozell, Han Wang, Patrick West,
Semantic Web Cyberinfrastructure for Virtual Observatories Deborah L. McGuinness Acting Director and Senior Research Scientist Knowledge Systems, AI Laboratory.
Introduction to Apache OODT Yang Li Mar 9, What is OODT Object Oriented Data Technology Science data management Archiving Systems that span scientific.
McGuinness Geon 5/5/2005 SOLAR-TERRESTRIAL ONTOLOGIES (for VSTO and Beyond) Peter Fox 1, Deborah McGuinness 3, Don Middleton 2, Stan Solomon 1, Jose Garcia.
1 Foundations V: Infrastructure and Architecture, Middleware Deborah McGuinness TA Weijing Chen Semantic eScience Week 10, November 7, 2011.
1 Foundations V: Infrastructure and Architecture, Middleware Deborah McGuinness and Joanne Luciano With Peter Fox and Li Ding CSCI Week 10, November.
1 The Virtual Observatory Exposed Peter Fox * * HAO/ESSL/NCAR Thanks to Deborah McGuinness $#, Luca Cinquini %, Patrick West *, Jose Garcia *, Tony Darnell.
Semantically-Enabled Science Data Integration (SESDI) and The Virtual Solar-Terrestrial Observatory (VSTO) Semantically-enabled (large-scale) Scientific.
NEON non-specialist use case; Science data reuse in a classroom Peter Fox Brian Wee Patrick West 1
NEON non-specialist use case; Science data reuse in a classroom Peter Fox Brian Wee Patrick West 1
1 Semantic Provenance and Integration Peter Fox and Deborah L. McGuinness Joint work with Stephan Zednick, Patrick West, Li Ding, Cynthia Chang, … Tetherless.
Semantically-Enabled Virtual Observatories: VSTO Highlights for Observational Data Deborah McGuinness Acting Director and Senior Research Scientist Knowledge.
Solar Terrestrial Ontologies – in Support of Virtual Observatories and Large Scale Semantic Scientific Data Integration Deborah McGuinness Co-Director.
The VIRTUAL SOLAR-TERRESTRIAL OBSERVATORY - Exploring paradigms for interdisciplinary data-driven science Peter Fox 1 Don Middleton 2,
Deepcarbon.net Xiaogang Ma, Patrick West, John Erickson, Stephan Zednik, Yu Chen, Han Wang, Hao Zhong, Peter Fox Tetherless World Constellation Rensselaer.
1 Class exercise II: Use Case Implementation Deborah McGuinness and Peter Fox CSCI Week 8, October 20, 2008.
Marine Metadata Interoperability Acknowledgements Ongoing funding for this project is provided by the National Science Foundation.
OOI Cyberinfrastructure and Semantics OOI CI Architecture & Design Team UCSD/Calit2 Ocean Observing Systems Semantic Interoperability Workshop, November.
Social and Personal Factors in Semantic Infusion Projects Patrick West 1 Peter Fox 1 Deborah McGuinness 1,2
The Role of Virtual Observatories and Data Frameworks in an Era of Big Data NIST bIG dATA June 14, 2012, Gaithersburg, MD Peter Fox (RPI and WHOI)
improve the efficiency, collaborative potential, and
HAO/SCD: VO, metadata, catalogs, ontologies, querying
Presentation transcript:

1 Using Semantically-Enabled Data Frameworks for Data Integration in Virtual Observatories Peter Fox * * HAO/ESSL/NCAR Deborah McGuinness $#, Luca Cinquini %, Rob Raskin !, Krishna Sinha ^ Patrick West *, Jose Garcia *, Tony Darnell *, James Benedict $, Don Middleton %, Stan Solomon * $ McGuinness Associates # Knowledge Systems and AI Lab, Stanford Univ. % SCD/CISL/NCAR ! JPL/NASA ^ Virginia Tech Work funded by NSF and NASA

2 Paradigm shift for NASA From: Instrument-based To: Measurement-based (to find and use data irrespective of which instrument obtained it) Requires: bridging the discipline data divide Overall vision: To integrate information technology in support of advancing measurement-based processing systems for NASA by integrating existing diverse science discipline and mission-specific data sources. Science data IntegrationSemantics SWEET++ ontology Process-oriented semantic content represented in SWSL Articulation axioms E.g. Statistical Analysis Application

3 Compilation of distribution of volcanic ash associated with large eruptions. Note the continental scale ash fall associated with Yellowstone eruption ~600,000 years ago. Geologic databases provide the information about the magnitude of the eruption, and its impact on atmospheric chemistry and reflectance associated with particulate matter requires integration of concepts that bridge terrestrial and atmospheric ontologies.

4 Why we were led to semantics When we integrate, we integrate concepts, terms, etc. In the past we would: ask, guess, research a lot, or give up Its pretty much about meaning Semantics can really help find, access, integrate, use, explain, trust… What if you… -could not only use your data and tools but remote colleagues data and tools? -understood their assumptions, constraints, etc and could evaluate applicability? -knew whose research currently (or in the future) would benefit from your results? -knew whose results were consistent (or inconsistent) with yours?…

5

6

7 Excerpt from Plate Tectonics Ont.

8 High Level Ontology Packages : representing relationships between geologic concepts These capabilities are used to register details of databases A.K.Sinha, Virginia Tech, 2005

9 Virtual Observatory schematics a suite of software applications on a set of computers that allows users to uniformly find, access, and use resources (data, software, etc.) from a collection of distributed product repositories and service providers. a service that unites services and/or multiple repositories Conceptual examples: In-situ: Virtual measurements –Sensors, etc. everywhere –Related measurements Remote sensing: Virtual, integrative measurements –Data integration Both usage patterns lead to additional data management challenges at the source and for users; now managing virtual datasets

10 What should a VO do? Make standard scientific research much more efficient. –Even the principal investigator (PI) teams should want to use them. –Must improve on existing services (mission and PI sites, etc.). VOs will not replace these, but will use them in new ways. –Access for young researchers, non-experts, other disciplines, Enable new, global problems to be solved. –Rapidly gain integrated views, e.g. from the solar origin to the terrestrial effects of an event. –Find meaningful data related to any particular observation or model. –(Ultimately) answer higher-order queries such as Show me the data from cases where a large coronal mass ejection observed by the Solar-Orbiting Heliospheric Observatory was also observed in situ. (science-speak) or What happens when the Sun disrupts the Earths environment (general public)

11 The Astronomy approach … … VO App 1 VO App 2 VO App 3 DB 2 DB 3 DB n DB 1 VOTable Simple Image Access Protocol Simple Spectrum Access Protocol Simple Time Access Protocol VO layer Limited interoperability Lightweight semantics Limited meaning, hard coded Limited extensibility Under review

12 … … VO 1 VO 2 VO 3 DB 2 DB 3 DB n DB 1 Semantic mediation layer - VSTO - low level Semantic mediation layer - mid-upper-level Education, clearinghouses, other services, disciplines, etc. Metadata, schema, data Query, access and use of data Semantic query, hypothesis and inference Semantic interoperability Added value Mediation Layer Ontology - capturing concepts of Parameters, Instruments, Date/Time, Data Product (and associated classes, properties) and Service Classes Maps queries to underlying data Generates access requests for metadata, data Allows queries, reasoning, analysis, new hypothesis generation, testing, explanation, etc.

13 Integrative use-case: Find data which represents the state of the neutral atmosphere anywhere above 100km and toward the arctic circle (above 45N) at any time of high geomagnetic activity. Translate this into a complete query for data. What information do we have to extract from the use- case? What information we can infer (and integrate)? Returns data from instruments, indices and models!!

14 Translating the Use-Case - non-monotonic? Input Physical properties: State of neutral atmosphere Spatial: Above 100km Toward arctic circle (above 45N) Conditions: High geomagnetic activity Action: Return Data Specification needed for query to CEDARWEB Instrument Parameter(s) Operating Mode Observatory Date/time Return-type: data GeoMagneticActivity has ProxyRepresentation GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere) Kp is a GeophysicalIndex hasTemporalDomain: daily hasHighThreshold: xsd_number = 8 Date/time when KP => 8

15 Translating the Use-Case - ctd. Input Physical properties: State of neutral atmosphere Spatial: Above 100km Toward arctic circle (above 45N) Conditions: High geomagnetic activity Action: Return Data Specification needed for query to CEDARWEB Instrument Parameter(s) Operating Mode Observatory Date/time Return-type: data NeutralAtmosphere is a subRealm of TerrestrialAtmosphere hasPhysicalProperties: NeutralTemperature, Neutral Wind, etc. hasSpatialDomain: [0,360],[0,180],[100,150] hasTemporalDomain: NeutralTemperature is a Temperature (which) is a Parameter FabryPerotInterferometer is a Interferometer, (which) is a Optical Instrument (which) is a Instrument hasFilterCentralWavelength: Wavelength hasLowerBoundFormationHeight: Height ArcticCircle is a GeographicRegion hasLatitudeBoundary: hasLatitudeUpperBoundary: GeoMagneticActivity has ProxyRepresentation GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere) Kp is a GeophysicalIndex hasTemporalDomain: daily hasHighThreshold: xsd_number = 8 Date/time when KP => 8

16 Partial exposure of Instrument class hierarchy - users seem to LIKE THIS Semantic filtering by domain or instrument hierarchy

17

18 Inferred plot type and return formats for data products

19 Inferred plot type and return required axes data

VSTO - semantics and ontologies in an operational environment: vsto.hao.ucar.edu, Web Service

21

22 VSTO Notable progress Conceptual model and architecture developed by combined team; KR experts, domain experts, and software engineers Semantic framework developed and built with a small, cohesive, carefully chosen team in a relatively short time (deployments in 1st year) Production portal released, includes security, etc. with community migration (and so far endorsement) VSTO ontology version 1.0, (vsto.owl) Web Services encapsulation of semantic interfaces More Solar Terrestrial use-cases to drive the completion of the ontologies - filling out the instrument ontology Using ontologies in other applications (volcanoes, climate, …)

23 Developing ontologies Use cases and small team (7-8; 2-3 domain experts, 2 knowledge experts, 1 software engineer, 1 facilitator, 1 scribe) Identify classes and properties (leverage controlled vocab.) –VSTO - narrower terms, generalized easily –Data integration - required broader terms –Adopted conceptual decomposition (SWEET) –Imported modules when concepts were orthogonal Minimal properties to start, add only when needed Mid-level to depth - i.e. neither TD nor BU Review, review, review, vet, vet, vet, publish - (experiences, results, lessons learned, AND your ontologies AND discussions) Only code them (in RDF or OWL) when needed (CMAP, …) Ontologies: small and modular

24 What has KR done for us? In addition to valued added noted previously - some of which is transparent Allowed scientists to get data from instruments they never knew of before Reduced the need for 8 steps to query to 3 and reduced choices at each stage Allowed augmentation and validation of data Useful and related data provided without having to be an expert to ask for it Integration and use (e.g. plotting) based on inference Ask and answer questions not possible before

25 Scaling to large numbers of data providers Crossing disciplines Security, access to resources, policies Branding and attribution (where did this data come from and who gets the credit, is it the correct version, is this an authoritative source?) Provenance/derivation (propagating key information as it passes through a variety of services, copies of processing algorithms, …) Data quality, preservation, stewardship, rescue Interoperability at a variety of levels (~3) Issues for Virtual Observatories Semantics can help with many of these

26 Final remarks Many geoscience VOs are in production Unified workflow around Instrument/Parameter/ Data-Time Working on event/phenomenon To date, the ontology creation and evolution process is working well Informatics efforts in Geosciences are exploding –GeoInformatics Town Hall at EGU meeting Thu lunchtime, Apr in Vienna and 3 geoinformatics sessions (US10, GI10/GI11 and NH12) –VO conference - June in Denver, CO –e-monograph to document state of VOs –NEW Journal of Earth Science Informatics –Special issue of Computers and Geosciences: Knowledge Representation in Earth and Space Science Cyberinfrastructure Ongoing activities for VOs through 2008 under the auspices of the Electronic Geophysical Year (eGY; Contact

27 Garage

28

29

30 [SO 2 ] t,X WOVODATCDML Data constr. WOVODAT MD CMDL MD Spectrometer Mass Spec MultiCollect. Mass Spec

31 What about Earth Science? SWEET (Semantic Web for Earth and Environmental Terminology) – –based on GCMD terms –modular using faceted and integrative concepts VSTO (Virtual Solar-Terrestrial Observatory) – –captures observational data (from instruments) –modular, using application domains GEON – –Planetary material, rocks, minerals, elements –modular, in packages SESDI – –broad discipline coverage, diverse data –highly modular MMI – –captures aspects of marine data, ocean observing systems –partly modular, mostly by developed project GeoSciML – –is a GML (Geography ML) application language for Geoscience –modular, in packages More… Visit swoogle.umbc.edu and planetont.org

32

33

34

35

36 Data Types Volcanic System Climate PhenomenonMaterial Instruments Import NASA: Semantic Web for Earth Science Numerics Ontology Import NASA: Semantic Web for Earth Science Units Ontology Import NASA: Semantic Web for Earth Science Physical Property OntologyPhysical Property Import NASA: Semantic Web for Earth Science Physical Phenomena Ontology Planetary Material Planetary Structure Physical Properties PlanetaryLocation Geologic Time GeoImage Planetary Phenomenon IMPORT EXISTING ONTOLOGIES SWEETGEON