Ewa Deelman, www.isi.edu/~deelman Virtual Metadata Catalogs: Augmenting Existing Metadata Catalogs with Semantic Representations Yolanda Gil, Varun Ratnakar,

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids for Collection Federation Reagan W. Moore University.
1 Building scientific Virtual Research Environments in D4Science Paul Polydoras University of Athens, Greece.
OASIS OData Technical Committee. AGENDA Introduction OASIS OData Technical Committee OData Overview Work of the Technical Committee Q&A.
Schema Matching and Query Rewriting in Ontology-based Data Integration Zdeňka Linková ICS AS CR Advisor: Július Štuller.
Chronos: A Tool for Handling Temporal Ontologies in Protégé
Chapter 51 Scripting With JSP Elements JavaServer Pages By Xue Bai.
Ewa Deelman, Integrating Existing Scientific Workflow Systems: The Kepler/Pegasus Example Nandita Mangal,
Advanced Topics COMP163: Database Management Systems University of the Pacific December 9, 2008.
ReQuest (Validating Semantic Searches) Norman Piedade de Noronha 16 th July, 2004.
Semantic Representation of Temporal Metadata in a Virtual Observatory Han Wang 1 Eric Rozell 1
11/8/20051 Ontology Translation on the Semantic Web D. Dou, D. McDermott, P. Qi Computer Science, Yale University Presented by Z. Chen CIS 607 SII, Week.
Semantic Representation of Temporal Metadata in a Virtual Observatory Han Wang 1 Eric Rozell 1
CIS607, Fall 2005 Semantic Information Integration Article Name: Clio Grows Up: From Research Prototype to Industrial Tool Name: DH(Dong Hwi) kwak Date:
Data Integration in Service Oriented Architectures Rahul Patel Sr. Director R & D, BEA Systems Liquid Data – XML-based data access and integration for.
Database Design - Lecture 1
1 Yolanda Gil Information Sciences InstituteJanuary 10, 2010 Requirements for caBIG Infrastructure to Support Semantic Workflows Yolanda.
Workshop on Integrated Application of Formal Languages, Geneva J.Fischer Mappings, Use of MOF for Language Families Joachim Fischer Workshop on.
 Copyright 2005 Digital Enterprise Research Institute. All rights reserved. Towards Translating between XML and WSML based on mappings between.
Metadata: An Overview Katie Dunn Technology & Metadata Librarian
Data on the Web Life Cycle Bernadette Farias Lóscio March, 2014.
San Diego Supercomputer CenterUniversity of California, San Diego Preservation Research Roadmap Reagan W. Moore San Diego Supercomputer Center
Knowledge based Learning Experience Management on the Semantic Web Feng (Barry) TAO, Hugh Davis Learning Society Lab University of Southampton.
Workshop – 10, December 2014, Berlin ICCS / NTUA Greece Efthymios Chondrogiannis An Intelligent Ontology Alignment Tool Dealing with Complicated Mismatches.
A Metadata Catalog Service for Data Intensive Applications Presented by Chin-Yi Tsai.
© 2006 IBM Corporation IBM WebSphere Portlet Factory Architecture.
Introduction to MDA (Model Driven Architecture) CYT.
GCMD/IDN STATUS AND PLANS Stephen Wharton CWIC Meeting February19, 2015.
Miguel Branco CERN/University of Southampton Enabling provenance on large-scale e-Science applications.
Scalable Metadata Definition Frameworks Raymond Plante NCSA/NVO Toward an International Virtual Observatory How do we encourage a smooth evolution of metadata.
RELATIONAL FAULT TOLERANT INTERFACE TO HETEROGENEOUS DISTRIBUTED DATABASES Prof. Osama Abulnaja Afraa Khalifah
1 Ontology-based Semantic Annotatoin of Process Template for Reuse Yun Lin, Darijus Strasunskas Depart. Of Computer and Information Science Norwegian Univ.
Knowledge Modeling, use of information sources in the study of domains and inter-domain relationships - A Learning Paradigm by Sanjeev Thacker.
Chad Berkley NCEAS National Center for Ecological Analysis and Synthesis (NCEAS), University of California Santa Barbara Long Term Ecological Research.
Dimitrios Skoutas Alkis Simitsis
1 Schema Registries Steven Hughes, Lou Reich, Dan Crichton NASA 21 October 2015.
Value Set Resolution: Build generalizable data normalization pipeline using LexEVS infrastructure resources Explore UIMA framework for implementing semantic.
Mining Structured vs. Unstructured Data Where is the structure and where did the semantics go? Rahim Yaseen SAP Labs LLC.
Pegasus: Running Large-Scale Scientific Workflows on the TeraGrid Ewa Deelman USC Information Sciences Institute
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
Presented by Scientific Annotation Middleware Software infrastructure to support rich scientific records and the processes that produce them Jens Schwidder.
Semantic Technologies and Application to Climate Data M. Benno Blumenthal IRI/Columbia University CDW /04-01.
Pegasus: Mapping complex applications onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.
Presented by Jens Schwidder Tara D. Gibson James D. Myers Computing & Computational Sciences Directorate Oak Ridge National Laboratory Scientific Annotation.
PHS / Department of General Practice Royal College of Surgeons in Ireland Coláiste Ríoga na Máinleá in Éirinn Knowledge representation in TRANSFoRm AMIA.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Long Term Ecological Research Network Office Trends Project Spaghetti & Linguine (aka Trends Data Store) Mark Servilla 14 September.
Ch- 8. Class Diagrams Class diagrams are the most common diagram found in modeling object- oriented systems. Class diagrams are important not only for.
Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
Dictionary based interchanges for iSURF -An Interoperability Service Utility for Collaborative Supply Chain Planning across Multiple Domains David Webber.
© 2006 University of Kansas An LSID resolver for specimens and a digression into issues raised by the use of GUIDs Steve Perry
A Semantic Web Approach for the Third Provenance Challenge Tetherless World Rensselaer Polytechnic Institute James Michaelis, Li Ding,
SEEK Science Environment for Ecological Knowledge l EcoGrid l Ecological, biodiversity and environmental data l Computational access l Standardized, open.
OOI Cyberinfrastructure and Semantics OOI CI Architecture & Design Team UCSD/Calit2 Ocean Observing Systems Semantic Interoperability Workshop, November.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Semantic Data Extraction for B2B Integration Syntactic-to-Semantic Middleware Bruno Silva 1, Jorge Cardoso 2 1 2
WonderWeb. Ontology Infrastructure for the Semantic Web. IST Project Review Meeting, 11 th March, WP2: Tools Raphael Volz Universität.
1 Pegasus and wings WINGS/Pegasus Provenance Challenge Ewa Deelman Yolanda Gil Jihie Kim Gaurang Mehta Varun Ratnakar USC Information Sciences Institute.
Semantic Interoperability in GIS N. L. Sarda Suman Somavarapu.
SCEC: An NSF + USGS Research Center Focus on Forecasts Motivation.
1 Artemis: Integrating Scientific Data on the Grid Rattapoom Tuchinda Snehal Thakkar Yolanda Gil Ewa Deelman.
Introduction: Databases and Database Systems Lecture # 1 June 19,2012 National University of Computer and Emerging Sciences.
Online Information and Education Conference 2004, Bangkok Dr. Britta Woldering, German National Library Metadata development in The European Library.
Mechanisms for Requirements Driven Component Selection and Design Automation 최경석.
Geospatial metadata Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
Data Grids, Digital Libraries and Persistent Archives: An Integrated Approach to Publishing, Sharing and Archiving Data. Written By: R. Moore, A. Rajasekar,
Alexandria Digital Library ADL Metadata Architecture Greg Janée.
Semantic Graph Mining for Biomedical Network Analysis: A Case Study in Traditional Chinese Medicine Tong Yu HCLS
USC Information Sciences Institute {jihie, gil,
Data Model.
Presentation transcript:

Ewa Deelman, Virtual Metadata Catalogs: Augmenting Existing Metadata Catalogs with Semantic Representations Yolanda Gil, Varun Ratnakar, and Ewa Deelman USC Information Sciences Institute Yolanda Gil, Varun Ratnakar, and Ewa Deelman USC Information Sciences Institute

Ewa Deelman, D a t a a n d / o r A n a l y s i s D i s c o v e r y D a t a, M e t a d a t a a n d P r o v e n a n c e M g n t E x e c u t i o n A n a l y s i s D e f i n i t i o n a n d M a p p i n g A na l ys i s D e sc r i p t i on Scientific Analysis Derived Data Metadata Provenance Raw Data Metadata

Ewa Deelman, D a t a a n d / o r A n a l y s i s D i s c o v e r y D a t a, M e t a d a t a a n d P r o v e n a n c e M g n t E x e c u t i o n A n a l y s i s D e f i n i t i o n a n d M a p p i n g A na l ys i s D e sc r i p t i on Scientific Analysis Derived Data Metadata Provenance Raw Data Metadata

Ewa Deelman, D a t a a n d / o r A n a l y s i s D i s c o v e r y D a t a, M e t a d a t a a n d P r o v e n a n c e M g n t E x e c u t i o n A n a l y s i s D e f i n i t i o n a n d M a p p i n g A na l ys i s D e sc r i p t i on Scientific Analysis Derived Data Metadata Provenance Raw Data Metadata

Ewa Deelman, D a t a a n d / o r A n a l y s i s D i s c o v e r y D a t a, M e t a d a t a a n d P r o v e n a n c e M g n t E x e c u t i o n A n a l y s i s D e f i n i t i o n a n d M a p p i n g A na l ys i s D e sc r i p t i on Scientific Analysis Derived Data Metadata Provenance Raw Data Metadata

Ewa Deelman, Raw Data Metadata Derived Data Metadata Provenance Raw Data Metadata Derived Data Metadata Provenance Raw Data Metadata Derived Data Metadata Provenance Raw Data Metadata Derived Data Metadata Provenance

Ewa Deelman, Raw Data Metadata Derived Data Metadata Provenance Raw Data Metadata Derived Data Metadata Provenance Raw Data Metadata Derived Data Metadata Provenance Raw Data Metadata Derived Data Metadata Provenance Problem: How to find the data in this environment? How to specify the characteristics of the data (the metadata attributes)? How to manage distributed provenance records? Today clients must figure out manually the meaning of the attributes, identify what are the relevant ones to query, and formulate queries

Ewa Deelman, Approach  Expose the information in the catalogs at a semantic level  Expose the attributes  Provide access to the content based on a user-selected ontology  Support semantic queries to the catalog contents  Expose the information in the catalogs at a semantic level  Expose the attributes  Provide access to the content based on a user-selected ontology  Support semantic queries to the catalog contents

Ewa Deelman, The Virtual Metadata Catalog  Augments the existing metadata catalogs with semantic representations  Do not modify the original sources  Maps virtual metadata attributes to the original attributes  Represents virtual attributes declaratively  Represents constraints as relations between attributes  Expands and translates a query to the Virtual Catalog into the original attributes  Explored approach with temporal attributes  Augments the existing metadata catalogs with semantic representations  Do not modify the original sources  Maps virtual metadata attributes to the original attributes  Represents virtual attributes declaratively  Represents constraints as relations between attributes  Expands and translates a query to the Virtual Catalog into the original attributes  Explored approach with temporal attributes

Ewa Deelman, Queries to other metadata catalogs Mappings Virtual Metadata Shared ontologies & vocabularies Metadata Catalog Metadata Attributes Mappings Virtual Metadata Virtual Metadata Catalog Service Metadata Catalog Service Queries with virtual metadata attributes Queries with metadata attributes Queries to multiple catalogs Query Mediator Queries to a specific metadata catalog

Ewa Deelman, Metadata Catalog Service (MCS)  Models logical files, logical collections and views  Provides a standard set of attributes for each object type  Supports the dynamic definition of attributes of type  String, integer, float, date  Provides a command-line interface and a Java API  Used in the Pegasus portal, SCEC, myLEAD  Models logical files, logical collections and views  Provides a standard set of attributes for each object type  Supports the dynamic definition of attributes of type  String, integer, float, date  Provides a command-line interface and a Java API  Used in the Pegasus portal, SCEC, myLEAD

Ewa Deelman, Execution time  Wall clock time or CPU time?  Could be specified as:  begin-execution-time, end-execution-time  begin-execution-time, duration  Diversity of these attributes can be represented at a semantic level  Define the virtual attributes in OWL  Wall clock time or CPU time?  Could be specified as:  begin-execution-time, end-execution-time  begin-execution-time, duration  Diversity of these attributes can be represented at a semantic level  Define the virtual attributes in OWL

Ewa Deelman, Gratuitous OWL Slide 1 1

Ewa Deelman,  Add rules to represent more expressive relations and constraints [r1: (?x rdf:type tme:IntervalThing), (?x tme:from ?a),(?x tme:duration ?t2), (?a tme:at ?t1), sum(?t1, ?t2, ?t3)makeTemp(?v)-> (?v rdf:type tme:InstantThing) (?v tme:at ?t3) (?x tme:to ?v)]  Add rules to represent more expressive relations and constraints [r1: (?x rdf:type tme:IntervalThing), (?x tme:from ?a),(?x tme:duration ?t2), (?a tme:at ?t1), sum(?t1, ?t2, ?t3)makeTemp(?v)-> (?v rdf:type tme:InstantThing) (?v tme:at ?t3) (?x tme:to ?v)]

Ewa Deelman, MCS Query Generic Catalog Ontology (file, view, collection) Query Virtual Metadata Attributes and Mappings Reasoner Distributed domain ontologies Virtual Metadata Catalog Metadata Catalog Metadata Attributes Metadata Catalog Metadata Catalog Service (MCS) Query Mapping Answer

Ewa Deelman, Query Mapping MCS Query 1. Generate query constituents 2. Convert to MCS Attribute names 3. Construct query formula OWL Query Virtual metadata attribute value pairs MCS Attribute value pairs Distributed domain ontologies Virtual Metadata Attributes and Mappings Metadata Catalog Service (MCS) Generic OWL + Rules Reasoner answer “from T10:00:00” and “duration PT30S” “startDate” and “endDate” Load OWL ontologies and rules referenced in the query generate new attributes “from T10:00:00” “to T10:00:30Z”t Convert “from” to “StartDate” Convert “to” to “Enddate” Convert from XML schema to simple strings expected by MCS

Ewa Deelman, Evaluation  Developed a prototype system  Performed queries across data sets from different domains  Climate modeling, earthquake science, workflow execution  Supported temporal queries using the OWL time ontology as the target ontology  Developed a prototype system  Performed queries across data sets from different domains  Climate modeling, earthquake science, workflow execution  Supported temporal queries using the OWL time ontology as the target ontology

Ewa Deelman, Discussion  Support only for the query functionality  Need to expand to  data publication  structuring data into collections  Supporting multiple catalogs  Need to support richer catalog structures  Support only for the query functionality  Need to expand to  data publication  structuring data into collections  Supporting multiple catalogs  Need to support richer catalog structures

Ewa Deelman, Conclusions  Example of building semantic services on top of legacy catalogs  Provided customized views at the semantic level  Views are customized to a particular user-selected ontology  Need to expand the functionality to publication  Need to address more the handling of alternative data formats  Syntactic versus semantic transformations  Transformations between date/time formats, coordinate systems, etc.  Are these transformations workflows?  Example of building semantic services on top of legacy catalogs  Provided customized views at the semantic level  Views are customized to a particular user-selected ontology  Need to expand the functionality to publication  Need to address more the handling of alternative data formats  Syntactic versus semantic transformations  Transformations between date/time formats, coordinate systems, etc.  Are these transformations workflows?