References: [1] Lebo, T., Sahoo, S., McGuinness, D. L. (eds.), 2013. PROV-O: The PROV Ontology. Available via: [2]

Slides:



Advertisements
Similar presentations
Towards a Common Provenance Model for Research Publications Linyun Fu Xiaogang Ma Patrick West Stace Beaulieu.
Advertisements

TWC Why Data Science Matters Xiaogang (Marshall) Ma Tetherless World Constellation Rensselaer Polytechnic Institute
DCO-VIVO: A Collaborative Data Platform for the Deep Carbon Science Communities Han Wang 1 ( ), Yu Chen 1 Patrick West.
Semantic Representation of Temporal Metadata in a Virtual Observatory Han Wang 1 Eric Rozell 1
Semantic Representation of Temporal Metadata in a Virtual Observatory Han Wang 1 Eric Rozell 1
Experiences Developing a User- centric Presentation of A Domain- enhanced Provenance Data Model Cynthia Chang 1, Stephan Zednik 1, Chris Lynnes 2, Peter.
Applying Semantics in Dataset Summarization for Solar Data Ingest Pipelines James Michaelis ( ), Deborah L. McGuinness
Citation and Recognition of contributions using Semantic Provenance Knowledge Captured in the OPeNDAP Software Framework Patrick West 1
Semantic Similarity Computation and Concept Mapping in Earth and Environmental Science Jin Guang Zheng Xiaogang Ma Stephan.
ToolMatch: Discovering What Tools can be used to Access, Manipulate, Transform, and Visualize Data Patrick West 1 Nancy Hoebelheinrich.
Key integrating concepts Groups Formal Community Groups Ad-hoc special purpose/ interest groups Fine-grained access control and membership Linked All content.
Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator Session: Managing Ecological Data for Effective Use and Reuse Patrice Seyed.
Beyond a Data Portal: A Collaborative Environment for the Deep Carbon Science Communities Han Wang, Yu Chen, Patrick West, John Erickson, Xiaogang Ma,
Publishing and Visualizing Large-Scale Semantically-enabled Earth Science Resources on the Web Benno Lee 1 Sumit Purohit 2
Global Change Information System: Information Model and Semantic Application Prototypes (GCIS-IMSAP) Status 01/08/2013 Stephan Zednik 1, Curt Tilmes 2,
Provenance Capture in Data Access And Data Manipulation Software Patrick West 1 Peter Fox
An Example in The DCO Data Portal Formal Specification of Data Types in the Deep Carbon Observatory Data Portal Xiaogang (Marshall) Ma
References: [1] [2] [3] Acknowledgments:
What has been lacking, until recently, is a successful method to develop, implement and sustain informatics solutions to modern application problems, such.
Catalog/ ID Selected Logical Constraints (disjointness, inverse, …) Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (properties) Informal.
Persistent Identification of Agents and Objects of Global Change: Progress in the Global Change Information System Peter Fox, RPI Curt Tilmes, NASA Xiaogang.
References: [1] Branch, B.D., Fosmire, M., The role of interdisciplinary GIS and data curation librarians in enhancing authentic scientific research.
Discovering accessibility, display, and manipulation of data in a data portal Nancy Hoebelheinrich Patrick West 2
TWC Adoption of RDA DTR and PID in Deep Carbon Observatory Data Portal Stephan Zednik, Xiaogang Ma, John Erickson, Patrick West, Peter Fox, & DCO-Data.
Motivations and Challenges: Proper data management hinges on recording and maintaining “steps” applied to create data. Consumers require methods to assess.
Traceability, reproducibility, and scalability in Integrated Ecosystem Assessments: July 2013 ECO-OP is supported by NSF Grant # PIs: Peter Fox.
Local global disambiguation of terms and concepts The BCO-DMO metadata database uses controlled vocabularies to record many of the important pieces of.
Modeling and Representing National Climate Assessment Information using Linked Data Jin Guang Zheng 1 Curt Tilmes 2
DOAP – Description of a Project Ontology DOAP provides us with the ability to represent software, software projects, releases of software, licensing information,
TWC Experience in ontology engineering with the Global Change Information System Xiaogang (Marshall) Ma Tetherless World Constellation Rensselaer Polytechnic.
Citation and Recognition of contributions using Semantic Provenance Knowledge Captured in the OPeNDAP Software Framework Patrick West 1
TWC Deep Earth Computer: A Platform for Linked Science of the Deep Carbon Observatory Community Xiaogang (Marshall) Ma, Yu Chen, Han Wang, Patrick West,
1 Semantic Provenance and Integration Peter Fox and Deborah L. McGuinness Joint work with Stephan Zednick, Patrick West, Li Ding, Cynthia Chang, … Tetherless.
Applying Provenance Extensions to OPeNDAP Framework Patrick West, James Michaelis, Tim Lebo, Deborah L. McGuinness Rensselaer Polytechnic Institute Tetherless.
Deepcarbon.net Xiaogang (Marshall) Ma, Yu Chen, Han Wang, John Erickson, Patrick West, Peter Fox Tetherless World Constellation Rensselaer Polytechnic.
TWC Adoption of RDA DTR and PID in Deep Carbon Observatory Data Portal Stephan Zednik, Xiaogang Ma, John Erickson, Patrick West, Peter Fox, & DCO-Data.
Beginning with an NSF INTEROP project whose goal is to facilitate the deployment of an Integrated Ecosystem Approach (IEA) to management in the Northeast.
ToolMatch Discovering What Tools can be used to Access, Manipulate, Transform, and Visualize Data Products Patrick West 1 Nancy Hoebelheinrich.
TWC Ontology Development for Provenance Tracing in National Climate Assessment of the US Global Change Research Program Xiaogang Ma a, Jin Guang Zheng.
TWC-SWQP: A Semantically-Enabled Provenance-Aware Water Quality Portal Ping Wang, Jin Guang Zheng, Linyun Fu, Evan W. Patton, Timothy Lebo, Li Ding, Joanne.
DCO-VIVO: A Collaborative Data Platform for the Deep Carbon Science Communities Han Wang 1 ( ), Yu Chen 1 Patrick West.
VIVO Conference 2013 Panel on VIVO Use-Cases for Collaborative Science: From Researcher Networks to Semantic User Interfaces for Data Patrick West – Tetherless.
Information Modeling and Semantic Web Application For National Climate Assessment Jin Guang Zheng 1 Curt Tilmes 2
Deepcarbon.net Xiaogang Ma, Patrick West, John Erickson, Stephan Zednik, Yu Chen, Han Wang, Hao Zhong, Peter Fox Tetherless World Constellation Rensselaer.
Semantic Similarity Computation and Concept Mapping in Earth and Environmental Science Jin Guang Zheng Xiaogang Ma Stephan.
Facilitating Next Generation Science Collaboration: Marine Ecosystems Status Reports and Assessments June 24, 2014 IMBER – D2 Peter Fox (RPI/ Tetherless.
Toward verifiable science: iPython meets PROV-O (Semantics in Ecosystems Assessments). April 16, 2014 ERRT Peter Fox (RPI/ Tetherless World Constellation.
Determining Fitness-For-Use of Ontologies through Change Management, Versioning and Publication Best Practices Patrick West 1 Stephan.
 Key integrating concepts  Groups  Formal Community Groups  Ad-hoc special purpose/ interest groups  Fine-grained access control and membership 
TWC Illuminate Knowledge Elements in Geoscience Literature Xiaogang (Marshall) Ma, Jin Guang Zheng, Han Wang, Peter Fox Tetherless World Constellation.
Determining Fitness-For-Use of Ontologies through Change Management, Versioning and Publication Best Practices Patrick West 1 Stephan.
TWC A use case-driven iterative method for building a provenance-aware GCIS ontology Xiaogang Ma a, Jin Guang Zheng a, Justin Goldstein b,c, Linyun Fu.
Supported by ESIP Semantic Web Cluster A service based on community-built semantic web applications Provide users with the means to match their datasets.
Catalog/ ID Selected Logical Constraints (disjointness, inverse, …) Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (properties) Informal.
How Environmental Informatics is Preparing Us for the Era of Big Data AGU FM 2013 GC11F-01 December 09, 2013, MW 3001 Peter
NOAA's Northeast Shelf Ecosystem Status Report: collaborating with IPython Notebooks for reproducibility July 2013 ECO-OP is supported by NSF Grant #
Social and Personal Factors in Semantic Infusion Projects Patrick West 1 Peter Fox 1 Deborah McGuinness 1,2
Annotating and Embedding Provenance in Science Data Repositories to Enable Next Generation Science Applications Deborah L. McGuinness.
Poster: EGU Glossary: USGCRP – United States Global Change Research Program NCA – National Climate Assessment GCIS – Global Change Information.
Get the poster at Semantic Visualization Provenance Records:
Provenance Capture in Data Access And Data Manipulation Software
Xiaogang Ma, John Erickson, Patrick West, Stephan Zednik, Peter Fox,
Ontology Evolution: A Methodological Overview
Deep Carbon Observatory Data Science Platform
Data types and persistent identifiers in
Modeling Data Set Versioning Operations
Ecosystem Status Report: collaborating with IPython Notebooks
ToolMatch Discovering What Tools can be used to Access, Manipulate, Transform, and Visualize Data Products Patrick West1 Nancy
Towards Executable Provenance Graphs for Reported Results in Research Publications Linyun Fu Xiaogang Ma Patrick West
Modeling Data Set Versioning Operations
Presentation transcript:

References: [1] Lebo, T., Sahoo, S., McGuinness, D. L. (eds.), PROV-O: The PROV Ontology. Available via: [2] Acknowledgments : Deliberations of Reusing the PROV Ontology for Provenance Capture in Earth and Environmental Sciences Xiaogang (Marshall) Ma 1, Peter Fox 1, Curt Tilmes 2,3, Stace Beaulieu 4, Patrick West 1, Linyun Fu 1, Jin Guang Zheng 1 Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY, USA; 2 NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA; 3 U.S. Global Change Research Program, Suite 250, 1717 Pennsylvania Avenue, NW, Washington, DC 20006, USA; 4 Woods Hole Oceanographic Institution, Woods Hole, MA 02543, USA Background The W3C PROV Ontology (PROV-O) [1] provides an excellent model for representing, capturing and sharing provenance information on the Web. Meanwhile, the Earth and environmental science domain faces increasing demand for documenting and publishing provenance information of scientific data, methods and workflows. Currently we are still short of experience and best practices of deploying the PROV-O for Earth and environmental sciences. This presentation will discuss our experience of reusing the PROV-O for two recent research projects. The PROV-O provides three starting-points classes, Entity, Agent and Activity, and properties that relate them, as well as a number other classes and properties enriching the provenance representation (Fig. 1). In practices one can design domain ontologies including specific classes and properties and map those classes and properties into the PROV-O. Deliberations happen in the domain ontology design and the mapping. In our first project we faced the provenance reconstruction of a report so we took definition of entities and relationships among them as the main stream of provenance capture. For the second project we shifted the main stream to provenance capture of activities because that work was designed to be associated with the workflow of data processing and information of activities is available. We hope this discussion will benefit colleagues who are also working on provenance capture and presentation. Concluding remarks From the deliberations of reusing the PROV Ontology we see three core elements in the development of domain ontologies: expressivity, implementability and maintainability. Expressivity is about level of details and precision in semantic representation of ontologies; implementability is about that the developed ontologies can be used efficiently in practices; and maintainability is about to keep the ontologies updating and evolving in a long-term perspective. The balance is application-dependent, but the three elements are often encountered together in domain ontology works. It is recommended that future domain ontology works can seek approaches to address the challenge of balancing these three elements. Deliberations may cause confusion in practices of ontology development, but those approaches can provide some guidelines. Provenance capture with a report and associated documents The first example is with an ontology (the GCIS ontology) [2] developed for the Global Change Information System (GCIS), which is a part of the U.S. Global Change Research Program (USGCRP). The initial focus of GCIS is to support the third National Climate Assessment (NCA3) in the United States. It will present the NCA3 report and also incorporate integrated access to inter-linked resources supporting that report. When we started to develop the GCIS ontology in 2012 summer, the NCA3 draft report was about to finish (version for public comment released in 2013 January). The first-hand materials we had are the draft report and the figures and tables inside it, as well as the references cited by the report and the internal structure of it (i.e., chapters, sections, etc.). We developed the GCIS ontology accordingly (Fig. 2). The Instances of those classes and the relationships among them can be captured from the documents we have, but without details about the activities that generated those documents. Fig. 1. The three starting-point classes Entity, Agent and Activity in PROV-O and the properties that relate them. Figure is from [1]. Get poster at: Provenance capture in a workflow The second example is with an ontology for provenance capture in the project Employing Cyber Infrastructure Data Technologies to Facilitate IEA for Climate Impacts in NE & CA LME's (#3 & #7) (ECO-OP). In the ECO-OP project the iPython Notebook is applied as an environment for scientific data processing, figure and table generation and report writing. This allows provenance information to be captured in a workflow. We are currently developing an ontology serving that purpose (Fig. 3). In Fig.3 two classes, ecoop:IPythonNotebookRun and ecoop:CodeCellRun, constitute the trunk of the ontology. They both are subclasses of prov:Activity. The resources used and generated in the activities, such as program libraries and datasets, are subclasses of prov:Entity, and information about them can be captured when iPython Notebook runs. In this way the second example is different from the first example. Fig. 2. Basic classes and properties in the current GCIS ontology (v. 1.1). Classes and properties represent a brief structure of the draft NCA3. There is no subclass of prov:Activity in this figure. Fig. 3. A part of the ECO-OP provenance ontology under development. In this ontology, subclasses of prov:Activity are the trunk.