Ontology-driven Provenance Management in eScience: An Application in Parasite Research Ontology-driven Provenance Management in eScience: An Application.

Slides:



Advertisements
Similar presentations
Ontology-Based Computing Kenneth Baclawski Northeastern University and Jarg.
Advertisements

Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
1 Ontolog OOR Use Case Review Todd Schneider 1 April 2010 (v 1.2)
August 6, 2009 Joint Ontolog-OOR Panel 1 Ontology Repository Research Issues Joint Ontolog-OOR Panel Discussion Ken Baclawski August 6, 2009.
…to Ontology Repositories Mathieu dAquin Knowledge Media Institute, The Open University From…
18 Copyright © 2005, Oracle. All rights reserved. Distributing Modular Applications: Introduction to Web Services.
Intelligent Technologies Module: Ontologies and their use in Information Systems Revision lecture Alex Poulovassilis November/December 2009.
1 ICS-FORTH & Univ. of Crete SeLene November 15, 2002 A View Definition Language for the Semantic Web Maganaraki Aimilia.
Kino : Making Semantic Annotations Easier Ajith Ranabahu #, Priti Parikh #, Maryam Panahiazar #, Amit Sheth # and Flora Logan- Klumpler* # Ohio Center.
Schema Matching and Query Rewriting in Ontology-based Data Integration Zdeňka Linková ICS AS CR Advisor: Július Štuller.
IPAW'08 – Salt Lake City, Utah, June 2008 Data lineage model for Taverna workflows with lightweight annotation requirements Paolo Missier, Khalid Belhajjame,
Open Provenance Model Tutorial Session 2: OPM Overview and Semantics Luc Moreau University of Southampton.
RDB2RDF: Incorporating Domain Semantics in Structured Data Satya S. Sahoo Kno.e.sis CenterKno.e.sis Center, Computer Science and Engineering Department,
Semantic Web Thanks to folks at LAIT lab Sources include :
Logics for Data and Knowledge Representation Projects and thesis introduction.
1 UIM with DAML-S Service Description Team Members: Jean-Yves Ouellet Kevin Lam Yun Xu.
Provenance Management Framework Satya S. Sahoo Kno.e.sis Center, Wright State University In Collaboration with Tarleton Lab, University of Georgia and.
Building and Analyzing Social Networks Web Data and Semantics in Social Network Applications Dr. Bhavani Thuraisingham February 15, 2013.
Data Intensive Techniques to Boost the Real-time Performance of Global Agricultural Data Infrastructures SEMAGROW U SING A POWDER T RIPLE S TORE FOR BOOSTING.
Knowledge Enabled Information and Services Science What can SW do for HCLS today? Panel at HCSL Workshop, WWW2007 Amit Sheth Kno.e.sis Center Wright State.
Linked Sensor Data Harshal Patni, Cory Henson, Amit P. Sheth Ohio Center of Excellence in Knowledge enabled Computing (Kno.e.sis) Wright State University,
Ontologies and the Semantic Web by Ian Horrocks presented by Thomas Packer 1.
Ontology Classifications Acknowledgement Abstract Content from simulation systems is useful in defining domain ontologies. We describe a digital library.
COMP 6703 eScience Project Semantic Web for Museums Student : Lei Junran Client/Technical Supervisor : Tom Worthington Academic Supervisor : Peter Strazdins.
1 Where do spatial context-models end and where do ontologies start? A proposal of a combined approach Christian Becker Distributed Systems Daniela Nicklas.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Amarnath Gupta Univ. of California San Diego. An Abstract Question There is no concrete answer …but …
ONTOLOGY SUPPORT For the Semantic Web. THE BIG PICTURE  Diagram, page 9  html5  xml can be used as a syntactic model for RDF and DAML/OIL  RDF, RDF.
Trykipedia: Collaborative Bio-Ontology Development using Wiki Environment Introduction: Biomedical ontology development is an intensely collaborative process.
The SADI plug-in to the IO Informatics’ Knowledge Explorer...a quick explanation of how we “boot-strap” semantics...
Krishnaprasad Thirunarayan, Pramod Anantharam, Cory A. Henson, and Amit P. Sheth Kno.e.sis Center, Ohio Center of Excellence on Knowledge-enabled Computing,
EXCS Sept Knowledge Engineering Meets Software Engineering Hele-Mai Haav Institute of Cybernetics at TUT Software department.
Knowledge based Learning Experience Management on the Semantic Web Feng (Barry) TAO, Hugh Davis Learning Society Lab University of Southampton.
Knowledge Enabled Information and Services Science Extending SPARQL to Support Spatially and Temporally Related Information Prateek Jain, Amit Sheth Peter.
DBrev: Dreaming of a Database Revolution Gjergji Kasneci, Jurgen Van Gael, Thore Graepel Microsoft Research Cambridge, UK.
Knowledge Enabled Information and Services Science SPARQL Query Re-writing for Spatial Datasets Using Partonomy Based Transformation Rules Prateek Jain,
Linked-data and the Internet of Things Payam Barnaghi Centre for Communication Systems Research University of Surrey March 2012.
DDI-RDF Leveraging the DDI Model for the Linked Data Web.
EU Project proposal. Andrei S. Lopatenko 1 EU Project Proposal CERIF-SW Andrei S. Lopatenko Vienna University of Technology
1 Lessons from the TSIMMIS Project Yannis Papakonstantinou Department of Computer Science & Engineering University of California, San Diego.
Requirements for RDF Validation Harold Solbrig Mayo Clinic.
An Ontological Framework for Web Service Processes By Claus Pahl and Ronan Barrett.
SPARQL Query Graph Model (How to improve query evaluation?) Ralf Heese and Olaf Hartig Humboldt-Universität zu Berlin.
©Ferenc Vajda 1 Semantic Grid Ferenc Vajda Computer and Automation Research Institute Hungarian Academy of Sciences.
Q2Semantic: A Lightweight Keyword Interface to Semantic Search Haofen Wang 1, Kang Zhang 1, Qiaoling Liu 1, Thanh Tran 2, and Yong Yu 1 1 Apex Lab, Shanghai.
Oracle Database 11g Semantics Overview Xavier Lopez, Ph.D., Dir. Of Product Mgt., Spatial & Semantic Technologies Souripriya Das, Ph.D., Consultant Member.
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
A Systemic Approach for Effective Semantic Access to Cultural Content Ilianna Kollia, Vassilis Tzouvaras, Nasos Drosopoulos and George Stamou Presenter:
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Strategies for subject navigation of linked Web sites using RDF topic maps Carol Jean Godby Devon Smith OCLC Online Computer Library Center Knowledge Technologies.
Semantic Publishing Benchmark Task Force Fourth TUC Meeting, Amsterdam, 03 April 2014.
Mining the Biomedical Research Literature Ken Baclawski.
Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
Proposed Research Problem Solving Environment for T. cruzi Intuitive querying of multiple sets of heterogeneous databases Formulate scientific workflows.
A facilitator to discover and compose services Oussama Kassem Zein Yvon Kermarrec ENST Bretagne.
Of 38 lecture 6: rdf – axiomatic semantics and query.
Semantic sewer pipe failure detection: Linked data approaches for discovering events Jonathan Yu | Research software engineer Environmental Information.
1 Integration of data sources Patrick Lambrix Department of Computer and Information Science Linköpings universitet.
Semantic Interoperability in GIS N. L. Sarda Suman Somavarapu.
Ontology Technology applied to Catalogues Paul Kopp.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
26/02/ WSMO – UDDI Semantics Review Taxonomies and Value Sets Discussion Paper Max Voskob – February 2004 UDDI Spec TC V4 Requirements.
Semantic Graph Mining for Biomedical Network Analysis: A Case Study in Traditional Chinese Medicine Tong Yu HCLS
Query Rewriting Framework for Spatial Data
Xiaogang Ma, John Erickson, Patrick West, Stephan Zednik, Peter Fox,
Data challenges in the pharmaceutical industry
Web Ontology Language for Service (OWL-S)
Analyzing and Securing Social Networks
Collaborative RO1 with NCBO
Web archives as a research subject
Presentation transcript:

Ontology-driven Provenance Management in eScience: An Application in Parasite Research Ontology-driven Provenance Management in eScience: An Application in Parasite Research Satya S. Sahoo 1, D. Brent Weatherly 2, Raghava Mutharaju 1, Pramod Anantharam 1, Amit Sheth 1, Rick L. Tarleton 2 ODBASE2009 Vilamoura, Algarve-Portugal November 05, Kno.e.sis Center, Wright State University; 2 Center for Tropical and Emerging Diseases, University of Georgia

Parasite Research: Creation of Parasite Strains New Parasite Strains

Provenance in Parasite Research *T.cruzi Semantic Problem Solving Environment Project, Courtesy of D.B. Weatherly and Flora Logan, Tarleton Lab, University of GeorgiaT.cruzi Semantic Problem Solving Environment Project Sequence Extraction Plasmid Construction Transfection Drug Selection Cell Cloning Gene Name 3 & 5 Region Knockout Construct Plasmid Drug Resistant Plasmid Transfected Sample Selected Sample Cloned Sample T.Cruzi sample Cloned Sample Gene Name ? Gene Knockout and Strain Creation * Related Queries from Biologists Q2: List all groups in the lab that used a Target Region Plasmid? Q3: Which researcher created a new strain of the parasite (with ID = 66)? An experiment was not successful – has this experiment been conducted earlier? What were the results?

Provenance Management in Science Provenance from the French word provenir describes the lineage or history of a data entity For Verification and Validation of Data Integrity, Process Quality, and Trust Issues in Provenance Management Provenance Modeling A Dedicated Query Infrastructure Practical Provenance Management Systems

Outline Provenance Modeling: Provenir Parasite Experiment ontology Provenance Query Infrastructure Provenance Query Engine Evaluation Results Query Optimization: Materialized Provenance Views

Ontologies for Provenance Modeling Advantages of using Ontologies Formal Description: Machine Readability, Consistent Interpretation Use Reasoning: Knowledge Discovery over Large Datasets Problem: A gigantic, monolithic Provenance Ontology! – not feasible Solution: Modular Approach using a Foundational Ontology GLYCOPROTEIN EXPERIMENT GLYCOPROTEIN EXPERIMENT OCEANOGRAPHY PARASITE EXPERIMENT PARASITE EXPERIMENT FOUNDATIONAL ONTOLOGY FOUNDATIONAL ONTOLOGY

Provenir Ontology PROCESS AGENT DATA has_agent participates_in Transfection Machine Sequence Extraction Plasmid Construction Transfection Drug Selection Cell Cloning Gene Name 3 & 5 Region Knockout Construct Plasmid Drug Resistant Plasmid Transfected Sample Selected Sample Cloned Sample T.Cruzi sample

Provenir Ontology Schema located_in PROCESS AGENT DATA DATA COLLECTION PARAMETER SPATIAL THEMATIC TEMPORAL has_agent participates_in preceded_by has_temporal_value is_a

Domain-specific Provenance: Parasite Experiment ontology agent process data_collection data parameter spatial_parameter domain_parameter temporal_parameter sample Time:DateTime Descritption transfection_buffercell_cloning strain_creation_ protocol transfection_machine transfection drug_selection Tcruzi_sample location has_agent is_a has_participant has_parameter has_participant PROVENIR ONTOLOGY PROVENIR ONTOLOGY PARASITE EXPERIMENT ONTOLOGY PARASITE EXPERIMENT ONTOLOGY *Parasite Experiment ontology available at:

Outline Provenance Modeling: Provenir Parasite Experiment ontology Provenance Query Infrastructure Provenance Query Engine Evaluation Results Query Optimization: Materialized Provenance Views

Provenance Query Classification Classified Provenance Queries into Three Categories Type 1: Querying for Provenance Metadata o Example: Which gene was used create the cloned sample with ID = 66? Type 2: Querying for Specific Data Set o Example: Find all knockout construct plasmids created by researcher Michelle using Hygromycin drug resistant plasmid between April 25, 2008 and August 15, 2008 Type 3: Operations on Provenance Metadata o Example: Were the two cloned samples 65 and 46 prepared under similar conditions – compare the associated provenance information

Provenance Query Operators Four Query Operators – based on Query Classification provenance () – Closure operation, returns the complete set of provenance metadata for input data entity provenance_context() - Given set of constraints defined on provenance, retrieves datasets that satisfy constraints provenance_compare () - adapt the RDF graph equivalence definition provenance_merge () - Two sets of provenance information are combined using the RDF graph merge

Answering Provenance Queries using provenance () Operator

Outline Provenance Modeling: Provenir Parasite Experiment ontology Provenance Query Infrastructure Provenance Query Engine Evaluation Results Query Optimization: Materialized Provenance Views

Provenance Query Engine Available as API for integration with provenance management systems Layer on top of a RDF Data Store (Oracle 10g), requires support for: o Rule-based reasoning o SPARQL query execution Input: o Type of provenance query operator : provenance () o Input value to query operator: cloned sample 66 o User details to connect to underlying RDF store

Outline Provenance Modeling: Provenir Parasite Experiment ontology Provenance Query Infrastructure Provenance Query Engine Evaluation Results Query Optimization: Materialized Provenance Views

Evaluation Results Queries expressed in SPARQL Datasets using real experiment data Query IDNumber of Variables Total Number of Triples Nesting Levels using OPTIONAL Query 1: Target plasmid Query 2: Plasmid_ Query 3: Transfection attempts Query 4: cloned_sample Dataset IDNumber of RDF Inferred Triples Total Number of RDF Triples DS 12,6733,553 DS 23,4704,490 DS 34,9886,288 DS 447,13360,912

Evaluation Results

Outline Provenance Modeling: Provenir Parasite Experiment ontology Provenance Query Infrastructure Provenance Query Engine Evaluation Results Query Optimization: Materialized Provenance Views

Materializes a single logical unit of provenance Does not require query- rewriting View updates: addressed by characteristics of provenance Created using a memoization approach

Provenance Query Engine Architecture TRANSITIVE CLOSURE QUERY OPTIMIZER

Evaluation Results using Materialized Provenance Views

Provenance Management System for Parasite Research

Acknowledgement Flora Logan – The Wellcome Trust Sanger Institute, Cambridge, UK Priti Parikh – Kno.e.sis Center, Wright State University Roger Barga – Microsoft Research, Redmond Jonathan Goldstein – Microsoft Research, Redmond

Thank you Contact Web: