In Search of What Some of It Means RDA Semantics and Metadata Workshop Feb 23, 2015 Peter Fox (RPI) Tetherless World Constellation.

Presentation transcript:


Metadata and documentation

Not more code!

Spectral synthesis components and flow

Getting the metadata?

What I wanted
Scientists should be able to access a global, distributed knowledge base of scientific data that:
–appears to be integrated
–appears to be locally available
But… data is obtained by multiple means (instruments, models, analysis) using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) metadata. It may be inconsistent, incomplete, evolving, and distributed. And, it is almost always created in a manner to facilitate its generation not its use.
And… there exist(ed) significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology…

What I was doing…
pro read_spec, spectra_name, description, auxiliary_info, model_size, mu_size, wave_size, model, smodel, mu, wave0, wavelength, intensity, brightness_temperature, index1, index2, percent
ncopts = 0;
description_start=0
description_edges=80
i=0
j=0
k=0
; Construct the DB filename
ncid=ncdf_open(string(getenv("SPECTRA ")))
inq_struct=ncdf_inquire(ncid)
; /* get dimension info */
tmp_id = ncdf_dimid(ncid, "comment_dim")
ncdf_diminq,ncid, tmp_id, dummy, comment_dim
tmp_id=ncdf_dimid(ncid, "mu_dim")
ncdf_diminq,ncid, tmp_id, dummy, mu_dim
tmp_id=ncdf_dimid(ncid, "wave_dim")
ncdf_diminq,ncid, tmp_id, dummy, wave_dim
tmp_id=ncdf_dimid(ncid, "model_dim")
ncdf_diminq,ncid, tmp_id, dummy, model_dim
tmp_id=ncdf_dimid(ncid, "smodel_dim")
ncdf_diminq,ncid, tmp_id, dummy, smodel_dim
tmp_id=ncdf_dimid(ncid, "item_dim")
ncdf_diminq,ncid, tmp_id, dummy, item_dim

What I was doing… etc.
tmp_id = ncdf_varid (ncid, "description")
ncdf_varget,ncid, tmp_id, OFFSET=0,COUNT=comment_dim, description
; Id's for variables
tmp_id=ncdf_varid(ncid, "spectra_name")
ncdf_varget,ncid, tmp_id, OFFSET=0,COUNT=comment_dim, spectra_name
tmp_id=ncdf_varid(ncid, "auxiliary_info")
ncdf_varget,ncid, tmp_id, OFFSET=0,COUNT=comment_dim, auxiliary_info
tmp_id=ncdf_varid(ncid, "model_size")
ncdf_varget,ncid, tmp_id, OFFSET=0,COUNT=item_dim, model_size
start=intarr(1)
edges=intarr(1)
start(0)=0
edges(0)=model_size
tmp_id=ncdf_varid(ncid, "mu_size")
ncdf_varget,ncid, tmp_id, mu_size, OFFSET=start, COUNT=edges
tmp_id=ncdf_varid(ncid, "model")
ncdf_varget,ncid, tmp_id, model, OFFSET=start, COUNT=edges
start=intarr(2)
edges=intarr(2)
start(0)=0
edges(0)=smodel_dim
start(1)=0
edges(1)=model_size
tmp_id=ncdf_varid(ncid, "smodel")
ncdf_varget,ncid, tmp_id, smodel, OFFSET=start, COUNT=edges

What does It all Mean?

Some version of this…
[Diagram: Data → Information → Knowledge, with labels Context, Presentation, Organization, Integration, Conversation, Creation, Gathering, Experience; annotated "~Metadata?"]

It and Meaning
It = things that matter
–Context
Meaning = duh -> semantics
Relations!! Real ones!
But it was more than that, though that often comes later…
–Syntax (structure/form)
–Semantics (meaning)
–Pragmatics (use)

Metadata-Information-Knowledge Ecosystem
[Diagram: Metadata → Information → Knowledge, with labels Context, Formalization, Organization, Integration, Shared Conceptualization, Creation, Gathering, Experience]

Provenance
Origin or source from which something comes, intention for use, who/what generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility
Provenance: metadata in a given context! Swallow that.
Knowledge provenance; meaning and relations in multiple contexts!

Perfect is the enemy of the good… (thanks Voltaire)

Origins …
In the need for capturing and preserving knowledge in science data became very clear but the barriers were high
In 2004 we started a virtual observatory project based on semantic technologies
Use case driven – in solar and solar-terrestrial physics with an emphasis on instrument-based measurements and real data pipelines; we needed implementations
We knew we also needed integration and provenance (but that came later)
We aimed to push semantics into our systems to build new ‘prototypes’ but we ‘failed’ ;-)

In – OWL was a W3C recommendation!!
Protégé 2.x and the Protégé-Java-OWL API
SWOOP was a viable editor
Jena and the Jena API were in good shape
Pellet worked
SPARQL was still a twinkle in the RDF working group’s eye
Semantics were still the realm of computer scientists

Design and Development
We made a conscious decision only to develop ontologies that were required to answer specific use cases and migrate metadata
–Both Classes AND Properties (uh-oh…)
We made a conscious effort to use whatever ontologies were available (cf. trends in metadata… nuff said)
We were pretty sure that rules would be needed (complex logic or late semantic binding)
We ignored query (see implementation)

Use Case example
Plot the neutral temperature from the Millstone-Hill Fabry Perot, operating in the non-vertical mode during January 2000 as a time series.
–Meanings and relations
Objects=Things!
–Neutral temperature is a (temperature is a) parameter
–Millstone Hill is a (ground-based observatory is a) observatory
–Fabry-Perot is a interferometer is a optical instrument is a instrument
–Non-vertical mode is a instrument operating mode
–January 2000 is a date-time range
–Time is a independent variable/coordinate
–Time series is a data plot is a data product
Metadata just appeared everywhere…

Knowledge representation
Statements as triples: {subject-predicate-object}
interferometer is-a optical instrument
Fabry-Perot is-a interferometer
Optical instrument has focal length
Optical instrument is-a instrument
Instrument has instrument operating mode
Instrument has measured parameter
Instrument operating mode has measured parameter
NeutralTemperature is-a temperature
Temperature is-a parameter
A query*: select all optical instruments which have operating mode vertical
An inference: infer operating modes for a Fabry-Perot Interferometer which measures neutral temperature
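As a rough illustration of the triples on this slide, the following Python sketch stores them as plain tuples and follows the is-a links to answer a simple query and a simple inference. It is a stand-in, not the project's implementation (the talk mentions OWL, Jena and Pellet). The CamelCase term names and the helper functions `ancestors`, `inherited_properties` and `optical_instruments` are assumptions made for this example. The slide gives no instance data for operating modes, so the "operating mode vertical" filter is left out.

```python
# Triples transcribed from the slide; CamelCase names are an assumption of this sketch.
TRIPLES = {
    ("Interferometer", "is-a", "OpticalInstrument"),
    ("FabryPerot", "is-a", "Interferometer"),
    ("OpticalInstrument", "has", "FocalLength"),
    ("OpticalInstrument", "is-a", "Instrument"),
    ("Instrument", "has", "InstrumentOperatingMode"),
    ("Instrument", "has", "MeasuredParameter"),
    ("InstrumentOperatingMode", "has", "MeasuredParameter"),
    ("NeutralTemperature", "is-a", "Temperature"),
    ("Temperature", "is-a", "Parameter"),
}


def ancestors(term, triples):
    """Return every class reachable from term by following is-a links."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "is-a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def inherited_properties(term, triples):
    """Collect 'has' objects stated on term or on any of its ancestors."""
    classes = {term} | ancestors(term, triples)
    return {obj for subject, predicate, obj in triples
            if predicate == "has" and subject in classes}


def optical_instruments(triples):
    """Query: every subject that is, transitively, an OpticalInstrument."""
    subjects = {subject for subject, _, _ in triples}
    return sorted(s for s in subjects
                  if "OpticalInstrument" in ancestors(s, triples))


if __name__ == "__main__":
    print(optical_instruments(TRIPLES))
    print(sorted(inherited_properties("FabryPerot", TRIPLES)))
```

Nothing in the triples says directly that a Fabry-Perot has an operating mode; that follows only from walking the is-a chain up to Instrument, which is the kind of inference the slide refers to.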

Semantics - Modern informatics enables a new scale-free** framework approach
Use cases
Stakeholders
Distributed authority
Access control
Ontologies
Maintaining Identity

Developing ontologies (c. 2005)
Use cases and small team (7-8; 2-3 domain/data experts, 2 knowledge experts, 1 software engineer, 1 facilitator, 1 scribe)
Identify classes and minimal properties (leverage controlled vocab.)
–Start with narrower terms, generalize when needed or possible
–Adopt a suitable conceptual decomposition (e.g. SWEET)
–Import modules when concepts are orthogonal
–Add service classes and properties where needed
Review, vet, publish
Only code them (in RDF or OWL) when needed (CMAP, …)
Ontologies: small and modular

Semantics between 2004 and 2009
Ontologies were needed for data integration and provenance and mediation for data mining
Protégé 3.x and then 4.0 came out
SWOOP development was interrupted
Cmap added OWL predicate support*
SPARQL became a recommendation
Triple stores exploded in use and capability
Linked Open Data started to take off
Pellet 2.0 came out
I used the “M” word less frequently!

Working with knowledge
Expressivity
Maintainability/Extensibility
Implementability

Working with semantics
Query
Rule execution
Inference

Semantics between 2009 and now
Semantic data framework (SeSF)
Substantial knowledge provenance work
Data quality, uncertainty and bias representations and applications (oh, these are in production at NASA)
Multi-sensor data synergy advisor
Applications:
–Sea Ice, Carbon Observatory, Integrated Ecosystem Assessments, globalchange.gov, ocean.data.gov, energy.data.gov ….

Respect and Mediation … how

Discovering new data

NCA links to GCIS entities

Information model
Ontology

Core and Framework Semantics - Multi-tiered interoperability used by

Closing thoughts
Go ahead, create all the metadata you want, we’ll “materialize” some of it into triples based on semantics for use!
Go ahead, create all the schema and encodings you want but remember – semantics now lives in an open-world (some of it).
You are not the only source of metadata.
Not all formal.
Link over map.
Semantics make metadata useful but we do not need all of your metadata
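One way to picture "materializing" only some metadata into triples is the minimal Python sketch below. It is an assumption-laden illustration, not the workflow used in the projects named in the talk. The field names, the predicate names in `FIELD_TO_PREDICATE`, the record identifier and the `materialize` helper are all hypothetical.

```python
# Hypothetical mapping: only metadata fields with an agreed meaning get a predicate.
FIELD_TO_PREDICATE = {
    "instrument": "measuredBy",
    "parameter": "hasMeasuredParameter",
    "mode": "hasOperatingMode",
}


def materialize(record_id, record, mapping=FIELD_TO_PREDICATE):
    """Turn the mapped fields of one metadata record into (s, p, o) triples.

    Fields without a mapping are skipped, echoing the point that not all
    of the metadata is needed.
    """
    return [(record_id, mapping[field], value)
            for field, value in record.items()
            if field in mapping]


if __name__ == "__main__":
    # Hypothetical metadata record; values echo the use case earlier in the talk.
    record = {
        "instrument": "FabryPerot",
        "parameter": "NeutralTemperature",
        "mode": "NonVertical",
        "file_checksum": "abc123",  # made-up value, has no mapping, so not materialized
    }
    for triple in materialize("dataset-1", record):
        print(triple)
```

The mapping table is where the semantics live in this sketch: adding a new field-to-predicate entry widens what gets materialized without touching the metadata records themselves.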

Contact