A Semantic Web Approach for the Third Provenance Challenge Tetherless World Rensselaer Polytechnic Institute James Michaelis, Li Ding,

Presentation transcript:

A Semantic Web Approach for the Third Provenance Challenge
Tetherless World, Rensselaer Polytechnic Institute
James Michaelis, Li Ding, Rui Huang, Zhenning Shangguan, Deborah L. McGuinness
1/3/2016

Introduction
Our approach to the Third Provenance Challenge (called TetherlessPC3) is designed to leverage Semantic Web technologies.
These technologies support two things that are useful for answering the provided queries:
  Declarative inference: SPARQL + OWL syntax
  Augmenting provenance data derived from the workflow execution with supplementary information: SPARQL

TetherlessPC3 Approach
Three components: (1) Provenance Generator, (2) Query Front-End, (3) Import/Export Component

TetherlessPC3 Approach
[Architecture diagram of the three components: (1) Provenance Generator, (2) Query Front-End, (3) Import/Export Component. Diagram labels: Run TW's Workflow code; Run other team's Workflow code; Trace (OPM'); Normalization (OPM'-OPM); Trace (OPM); Translation (OPM-PC3OPM); Trace (OWL); PC3OPM (OWL); Translation (PC3OPM-PML); Trace (PML); Query (English); Translation (English-SPARQL); Query (SPARQL); Run Query (Pellet/Jena); Results (Text)]

TetherlessPC3 Approach: Provenance Generator
Produces provenance traces in Web Ontology Language (OWL) format, using Jena, a Java-based Semantic Web framework.
These traces are structured according to the PC3OPM ontology.
PC3OPM is designed to be compatible with the OPM Specification v1.01.

TetherlessPC3 Approach: Provenance Generator
To obtain the provenance, a workflow execution service is used.
The service is designed to run a modified version of the workflow emulation code provided by Yogesh Simmhan (Microsoft Research).
The modified version contains injected code (in the section that executes the high-level workflow) to record provenance information based on PC3OPM.

Three properties of PC3OPM
Provide direct mappings to OPM concepts
  Example: PC3OPM:Artifact maps to the OPM concept "Artifact"
Reification of OPM relations
  Example: for the relationship (Process1, WasTriggeredBy, Process2), declare an instance of the class PC3OPM:WasTriggeredBy
Extend the definitions in OPM through new concepts
  Domain dependent: some terminology specific to the Third Provenance Challenge workflow
    Example: CSVFileEntry (subclass of OPM Artifact)
  Domain independent: terminology from the Proof Markup Language (PML)
    We added a new concept "Function" based on pmlp:inferenceRule, where an OPM process is an execution of a "Function"
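The reification property can be illustrated with a minimal sketch in plain Python (not the Jena code the team used). Triples are modeled as tuples. The helper name `reify_relation`, the `ex:` identifiers, and the property names `wtbSource` and `wtbTarget` are assumptions made for illustration, chosen by analogy with the `wgbSource`/`wgbTarget` properties that appear in the queries later in the talk.

```python
def reify_relation(node, rel_class, src_prop, tgt_prop, source, target):
    """Describe one OPM relation as its own node with three triples.

    `node` is the identifier of the relation instance; all names passed in
    here are hypothetical and only mirror the PC3OPM naming style.
    """
    return [
        (node, "rdf:type", f"PC3OPM:{rel_class}"),   # instance of the relation class
        (node, f"PC3OPM:{src_prop}", source),        # the effect side of the relation
        (node, f"PC3OPM:{tgt_prop}", target),        # the cause side of the relation
    ]


# (Process1, WasTriggeredBy, Process2) becomes an instance of PC3OPM:WasTriggeredBy.
triples = reify_relation(
    "ex:wtb1", "WasTriggeredBy", "wtbSource", "wtbTarget",
    "ex:Process1", "ex:Process2",
)
for triple in triples:
    print(triple)
```

Because the relation is a node of its own, further statements (for example a timestamp or an account) could be attached to `ex:wtb1` without changing the two processes.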

Proof Markup Language (PML)
WHAT IS IT?
  A provenance interlingua designed for representing and sharing explanations generated by various intelligent systems.
  Originally designed to explain the activity of theorem proof generators.
  Part of the Inference Web framework (which includes tools for browsing and validating PML).
THREE PARTS
  Justification: provides structure for describing how a conclusion was derived
  Provenance: metadata on information referenced in Justification
  Trust: metadata on trust for information referenced in Justification

TetherlessPC3 Approach: Query Front-End
What we have done:
1. Review the given English-based queries and form corresponding SPARQL queries
2. Update the PC3OPM ontology to assist with (1) and regenerate the provenance trace
3. Run the queries and get back results

TetherlessPC3 Approach: Query Front-End
Technologies used:
  SPARQL: RDF query language
  Pellet: an open-source OWL reasoner

Query Answering Example
Provenance Challenge core question 3: "Which operation executions were strictly necessary for the Image table to contain a particular (non-computed) value?"
Our interpretation:
  Find the process X which generated the Image table.
  Look for the processes X_T that (directly or indirectly) triggered X.
  Return X and X_T as the query results.
Handling this query: rather than write a recursive program, we use OWL-based transitive properties in the answer.
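For comparison, the traversal that a transitive property makes unnecessary can be sketched in plain Python. This is not the team's code: with a transitive "was triggered by" property, the reasoner derives the same set declaratively. The function name `all_triggers` and the process names are invented for the example.

```python
from collections import deque


def all_triggers(process, triggered_by):
    """Return every process that directly or indirectly triggered `process`.

    `triggered_by` maps a process to the processes that directly triggered it.
    """
    seen = set()
    queue = deque([process])
    while queue:
        current = queue.popleft()
        for parent in triggered_by.get(current, ()):
            if parent not in seen:  # guards against revisiting and cycles
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical trigger chain: CheckManifest -> ReadCSV -> LoadImageTable.
triggered_by = {
    "LoadImageTable": ["ReadCSV"],
    "ReadCSV": ["CheckManifest"],
    "CheckManifest": [],
}

# X is the process that generated the table; X_T are its (in)direct triggers.
x = "LoadImageTable"
result = {x} | all_triggers(x, triggered_by)
print(sorted(result))
```

The answer to the query is X together with X_T, which is why the sketch unions the starting process with the traversal result.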

Enhancing Provenance Trace
To keep our provenance trace simple and concise, we do not include transitive properties in it, since most of the queries do not need them.
To insert them when necessary, we create additional RDF data through a SPARQL CONSTRUCT query.

SPARQL SELECT Query
PREFIX rdf:
PREFIX PC3:
PREFIX PC3OPM:
SELECT ?fxn1 ?fxn2
FROM
FROM
FROM
WHERE {
  ?wgb PC3OPM:wgbSource PC3:provVarDbEntryP2ImageMeta_0 .
  ?wgb PC3OPM:wgbTarget ?fxn1 .
  OPTIONAL { ?fxn1 PC3OPM:opWasTriggeredBy ?fxn2 . }
}
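What the SELECT pattern matches can be mimicked in a few lines of plain Python, as a rough sketch rather than a SPARQL engine. Triples are tuples, the `ex:` identifiers are invented test data, and `select_generators` is a hypothetical helper. The point of interest is the OPTIONAL clause: a row is kept even when no trigger is recorded, with the second column left empty.

```python
def select_generators(triples, artifact):
    """Mimic the SELECT query: rows of (fxn1, fxn2) for one artifact."""
    rows = []
    for wgb, pred, obj in triples:
        # ?wgb PC3OPM:wgbSource <artifact>
        if pred != "PC3OPM:wgbSource" or obj != artifact:
            continue
        # ?wgb PC3OPM:wgbTarget ?fxn1
        for subj, pred2, fxn1 in triples:
            if subj != wgb or pred2 != "PC3OPM:wgbTarget":
                continue
            triggers = [o for s, p, o in triples
                        if s == fxn1 and p == "PC3OPM:opWasTriggeredBy"]
            # OPTIONAL: emit one row with None when there is no trigger.
            for fxn2 in triggers or [None]:
                rows.append((fxn1, fxn2))
    return rows


# Invented sample data in the shape the query expects.
triples = [
    ("ex:wgb1", "PC3OPM:wgbSource", "PC3:provVarDbEntryP2ImageMeta_0"),
    ("ex:wgb1", "PC3OPM:wgbTarget", "ex:fxnA"),
    ("ex:fxnA", "PC3OPM:opWasTriggeredBy", "ex:fxnB"),
    ("ex:wgb2", "PC3OPM:wgbSource", "ex:otherVar"),
    ("ex:wgb2", "PC3OPM:wgbTarget", "ex:fxnC"),
]

rows = select_generators(triples, "PC3:provVarDbEntryP2ImageMeta_0")
print(rows)
```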

SPARQL CONSTRUCT Query
PREFIX rdf:
PREFIX PC3:
PREFIX PC3OPM:
CONSTRUCT { ?FXN PC3OPM:opWasTriggeredBy ?FXN2 }
FROM
WHERE {
  ?USD PC3OPM:usdSource ?FXN .
  ?USD PC3OPM:usdTarget ?VAR .
  ?WGB PC3OPM:wgbSource ?VAR .
  ?WGB PC3OPM:wgbTarget ?FXN2
}
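The join performed by the CONSTRUCT query can be sketched in plain Python to show which triples it adds: if a function used a variable, and that variable was generated by a second function, then the first function was triggered by the second. This is an illustrative stand-in, not the Jena/Pellet pipeline; `construct_triggered_by` and the `ex:` identifiers are invented.

```python
def construct_triggered_by(triples):
    """Derive opWasTriggeredBy triples from reified 'used' and 'wasGeneratedBy'."""

    def objects(subject, predicate):
        return [o for s, p, o in triples if s == subject and p == predicate]

    derived = set()
    for usd, pred, fxn in triples:
        if pred != "PC3OPM:usdSource":          # ?USD usdSource ?FXN
            continue
        for var in objects(usd, "PC3OPM:usdTarget"):      # ?USD usdTarget ?VAR
            for wgb, pred2, var2 in triples:
                if pred2 == "PC3OPM:wgbSource" and var2 == var:   # ?WGB wgbSource ?VAR
                    for fxn2 in objects(wgb, "PC3OPM:wgbTarget"):  # ?WGB wgbTarget ?FXN2
                        derived.add((fxn, "PC3OPM:opWasTriggeredBy", fxn2))
    return derived


# Invented sample: fxnLoad used varCsv, which was generated by fxnRead.
triples = [
    ("ex:usd1", "PC3OPM:usdSource", "ex:fxnLoad"),
    ("ex:usd1", "PC3OPM:usdTarget", "ex:varCsv"),
    ("ex:wgb1", "PC3OPM:wgbSource", "ex:varCsv"),
    ("ex:wgb1", "PC3OPM:wgbTarget", "ex:fxnRead"),
]

derived = construct_triggered_by(triples)
print(derived)
```

The derived triples are kept separate from the input, which matches the idea of leaving the base trace concise and adding the extra data only when a query needs it.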

TetherlessPC3 Approach: Import/Export Component
Can import: OPM graphs
Can export: OPM graphs, PML proofs
The import/export protocols for OPM are handled through the OPM API. Likewise, the import/export protocols for PML are handled through a PML API developed by our lab.

Discussion: Importing From Other Teams
Some OPM graphs generated by other teams could not be parsed by the OPM API, so normalization was needed.
Our SPARQL queries (used on our own provenance trace) needed only slight modification to handle imported provenance (changing the URIs of artifacts).
Some information loss was observed for many teams that dumped their provenance traces to OPM.
  Control flow traces were not captured by some teams.

Comparing with Other Teams: Answering Core Query 3
[Figure panels: Blue Team, Our Team, Green Team]

Conclusions
Semantic Web technologies are useful for handling provenance data:
  Provenance generation: RDF/OWL helps clarify the domain-specific concepts/entities in provenance metadata.
  Provenance query: supported by SPARQL + OWL inference.
    We can capture control flow and data flow.
    Using transitive inference rules, we don't need to write a program to implement a recursive query.
  Provenance integration: an RDF/OWL syntax of OPM (with references to domain terminology) will help avoid information loss when exporting OPM data.

References
OWL, SPARQL, Pellet, Jena, PML API, OPM API

BACK

PC3 OPM Ontology