Semantic Web and Retrieval of Scientific Data Semantics. Goran Soldar, University of Brighton, UK; Dan Smith, University of East Anglia, UK.

Presentation transcript:

1 Semantic Web and Retrieval of Scientific Data Semantics
Goran Soldar, University of Brighton, UK
Dan Smith, University of East Anglia, UK

2 Introduction
• Semantic Web
• Introduced by Tim Berners-Lee
• Data and resources described, interchanged, and processed
• Machine understanding of heterogeneous data
• Most search engines on the Web are oriented towards human use
• Finding and processing scientific data on the Web is a time-consuming process
Example
• Search: Web pages containing the word "temperature"
• Search engine: Google
• Search domain:
• Results: 773 web pages

3 Inefficiency of the traditional search
• Humans have to browse through web pages
• No guarantee that the wanted information will be found
Preferred approach
• Describe the semantics of data using the RDF/XML format
• Store the data in a DBMS
• Automatically retrieve the desired information based on users' requests
• Enable client machines to learn the semantics of data described in RDF

4 Introduction
Objectives of this work
• To address the problem of extracting semantics from data files within the meteorology domain.
• To build the ontology for the meteorology domain.
• To create semantic cases with RDF Model/RDF Schema.
• To employ the DB2 DBMS as the data repository.
• To enhance a standard DBMS with an RDF Triples Engine.
• To manage the RDF graph structure with the RDF Triples Engine.

5 RDF and Domain Ontology
• RDF is a framework for describing metadata.
• It enables interoperability between machines by interchanging information about information resources.
• It is represented as a directed labelled graph.
Figure: RDF structure. Resource (Subject): File; Property (Predicate): Name; Value (Object): ltgrid.dat
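The resource/property/value structure on this slide can be sketched as a small data type. This is only an illustration: the `Triple` class and the `as_edge` helper are names assumed here, not part of the authors' system.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement (hypothetical helper type for illustration)."""
    subject: str    # the resource being described
    predicate: str  # the property of that resource
    obj: str        # the value of the property


# The statement from the slide: resource "File" has property "Name"
# with value "ltgrid.dat".
statement = Triple(subject="File", predicate="Name", obj="ltgrid.dat")


def as_edge(t: Triple) -> str:
    """Render a triple as one labelled edge of the directed graph."""
    return f"({t.subject}) --{t.predicate}--> ({t.obj})"


print(as_edge(statement))
```

Each triple is one labelled edge, so a set of triples sharing subjects and objects forms the directed labelled graph the slide refers to.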

6 RDF and Domain Ontology
• Specific domains represented with RDF
• Our focus: the meteorology domain
• The concepts, semantics and relations between the concepts are defined with RDF Schema.
• Ontology: an explicit specification of an information domain
• RDF Schema: uses the syntax of RDF Model
• Corresponds to XML's DTD or XML Schema
• RDF Schema is a basis for RDF instances

7 Modelling RDF Model for Meteorology
Three phases of modelling
• Development of the vocabulary (ontology)
• Design of semantic cases to capture resource description
• Creation of semantic case instances
• The vocabulary comprises the main concepts of the domain, represented by classes and properties
• RDF Schema uses RDF Model encoding syntax
• rdf:type distinguishes RDF classes from properties
• rdfs:subClassOf expresses inheritance relationships between RDF classes
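How rdf:type and rdfs:subClassOf play these two roles can be sketched over a handful of schema triples. The class `cru:MetData` and the three helper functions are hypothetical; `cru:Height` and `cru:FormatType` appear elsewhere in the slides, but their exact schema declarations are assumed here.

```python
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDF_PROPERTY = "rdf:Property"
SUBCLASS_OF = "rdfs:subClassOf"

# (subject, predicate, object) triples; cru:MetData is a made-up superclass.
schema = [
    ("cru:MetData", RDF_TYPE, RDFS_CLASS),
    ("cru:Height", RDF_TYPE, RDFS_CLASS),
    ("cru:Height", SUBCLASS_OF, "cru:MetData"),
    ("cru:FormatType", RDF_TYPE, RDF_PROPERTY),
]


def classes(triples):
    """Resources declared as classes via rdf:type."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == RDFS_CLASS}


def properties(triples):
    """Resources declared as properties via rdf:type."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == RDF_PROPERTY}


def superclasses(cls, triples):
    """All classes reachable from cls by following rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


print(classes(schema), properties(schema), superclasses("cru:Height", schema))
```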

8 Modelling RDF Model for Meteorology
The Meteorology domain at cru.sys.uea.ac.uk:
• Contains about 1000 data files
• Is made up of 9 meteorological topics (sub-domains)
• Has all sub-domains designed as RDF classes
• Has all concepts and elements defined in its namespace
The ontology is defined in two RDF files:
• Class.rdf
• Property.rdf
• Semantic cases are based on the existing vocabulary
• Simple semantic cases are designed first
• Complex cases are combinations of simple ones

9 Modelling RDF Model for Meteorology
Our prototype model:
• Describes 100 data sets
• Contains 4 semantic cases
The semantic cases
HeaderCase
• URL
• FormatType
• DataParameter
• Comment
• Domain
SizeCase
• Compression
• FileSize
• Value
ObservationCase
• Frequency
• TimePeriod
• Value
PeriodCase
• TimeRange
• TimePeriod
• Value
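The four semantic cases can be written down as a simple lookup table. The field names come from the slide; the `missing_fields` check and the sample instance are assumptions added for illustration, since the slides do not say whether every field is mandatory.

```python
# Field names per semantic case, as listed on the slide.
SEMANTIC_CASES = {
    "HeaderCase": ("URL", "FormatType", "DataParameter", "Comment", "Domain"),
    "SizeCase": ("Compression", "FileSize", "Value"),
    "ObservationCase": ("Frequency", "TimePeriod", "Value"),
    "PeriodCase": ("TimeRange", "TimePeriod", "Value"),
}


def missing_fields(case_name, instance):
    """Return the fields of the named case that the instance does not fill.

    Treating all fields as required is an assumption of this sketch.
    """
    return [f for f in SEMANTIC_CASES[case_name] if f not in instance]


# Illustrative, incomplete SizeCase instance.
size_instance = {"Compression": "Compressed", "FileSize": "Kilobyte"}
print(missing_fields("SizeCase", size_instance))
```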

10 Modelling RDF Model for Meteorology
RDF instance of HeaderCase for a data file (element values only): ASCII; GeopotentialHeight_AtPressure; 6-Hourly GeopotentialHeight at 1000mb; cru:Height

11 From RDF to Relational Model
Our prototype model:
• Comprises 12 RDF files
• One holds semantic case descriptions
• Two hold RDF Schema descriptions
• Nine contain RDF instances of semantic cases
Management of RDF-described data
• W3C does not recommend any method for manipulating RDF triples
• RDF structure is similar to XML
• XML comes with APIs for data manipulation (SAX, DOM); RDF does not

12 Modelling RDF Model for Meteorology
Figure: Mapping the RDF model for Meteorology into an RDBMS. Components: CRU Meteorological Domain, Ontology, Semantic Cases, RDF Triples Model, SiRPAC, RDF Triple Engine, DB2.
• We use the RDF triple structure to manipulate the data
• XML parsers check the syntax of the RDF
• RDF parsers convert it into triples
• RDF tags are removed
• Triples are converted into the relational model
• Stored in the DB2 DBMS
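The last two steps, converting triples to the relational model and storing them, might look roughly like the sketch below. It assumes the triples have already been produced by the RDF parser, and it uses Python's built-in `sqlite3` as a stand-in for DB2. The `MetInstance` table with Property, Resource and Value columns follows the later slides; the column types and the `store_triples` function are assumptions.

```python
import sqlite3

# (property, resource, value) rows taken from the MetInstance slide.
triples = [
    ("cru:FormatType", "hgt h.w1.53x21.dat.gz", "ASCII"),
    ("cru:DataParameter", "hgt h.w1.53x21.dat.gz", "GeopotentialHeight_AtPressure"),
    ("rdfs:domain", "hgt h.w1.53x21.dat.gz", "cru:Height"),
]


def store_triples(conn, rows):
    """Create the three-column triple table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS MetInstance ("
        "property TEXT NOT NULL, resource TEXT NOT NULL, value TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO MetInstance (property, resource, value) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, stand-in for DB2
store_triples(conn, triples)
count = conn.execute("SELECT COUNT(*) FROM MetInstance").fetchone()[0]
print(count)
```

One row per triple keeps the mapping trivial, at the cost of leaving the graph structure implicit in the data; that is why the engine on top needs to be graph-aware.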

13 Modelling RDF Model for Meteorology

14 Retrieval of Semantic Information
• The RDF Triple Engine (RTE) is responsible for manipulating triples and executing semantic queries
• Based on a client/server architecture with specialised RDF servers
• Records in the DBMS have a graph structure
• They are not semantically atomic
• Additional query processing is added to the RTE
• The RTE is aware of the graph structure of triples
• It can produce results that reconstruct the graph structure and present them in a format specified by users

15 Retrieval of Semantic Information
RDF graph for the Weather domain
Nodes: temperature, file, weather, daily, ltgrid.dat, size_id, 40, Kb
Edge labels: recorded, domain, frequency, name, size, unit, value, url
Relational structure of the RDF graph
Property | Resource | Value
frequency | temperature | daily
domain | temperature | weather
recorded | temperature | file
name | file | ltgrid.dat
url | file |
size | file | size_id
value | size_id | 40
unit | size_id | Kb
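To see why the rows are not semantically atomic, the graph can be rebuilt from the relational rows. The sketch below is an assumption about how such a reconstruction could work, not the RTE's actual algorithm. The `url` row is left out because its value is missing in the transcript, and the recursion assumes the graph has no cycles.

```python
from collections import defaultdict

# (property, resource, value) rows from the slide's relational table.
rows = [
    ("frequency", "temperature", "daily"),
    ("domain", "temperature", "weather"),
    ("recorded", "temperature", "file"),
    ("name", "file", "ltgrid.dat"),
    ("size", "file", "size_id"),
    ("value", "size_id", "40"),
    ("unit", "size_id", "Kb"),
]


def build_graph(rows):
    """Index the rows by resource: resource -> [(property, value), ...]."""
    graph = defaultdict(list)
    for prop, resource, value in rows:
        graph[resource].append((prop, value))
    return graph


def describe(node, graph):
    """Expand a resource into a nested dict, following values that are
    themselves resources (assumes an acyclic graph)."""
    result = {}
    for prop, value in graph.get(node, []):
        result[prop] = describe(value, graph) if value in graph else value
    return result


graph = build_graph(rows)
print(describe("temperature", graph))
```

The value `size_id` in the `size` row only becomes meaningful once the `value` and `unit` rows are joined to it, which is the extra processing a plain SQL lookup does not do.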

16 Retrieval of Semantic Information
RDF instance "MetInstance" converted into a relational table
Property | Resource | Value
cru:URL | hgt h.w1.53x21.dat.gz | 6hourly/pressure/hgt/hgt1000_6h
cru:FormatType | hgt h.w1.53x21.dat.gz | ASCII
cru:DataParameter | hgt h.w1.53x21.dat.gz | GeopotentialHeight_AtPressure
rdfs:comment | hgt h.w1.53x21.dat.gz | 6-Hourly GeopotentialHeight at 1000mb
rdfs:domain | hgt h.w1.53x21.dat.gz | cru:Height
rdf:type | cru:Height#genid2 | rdf:Seq
rdf:_1 | cru:Height#genid2 | Compressed
rdf:_2 | cru:Height#genid2 | Kilobyte
rdf:_3 | cru:Height#genid |
cru:size | hgt h.w1.53x21.dat.gz | cru:Height#genid2
rdf:type | cru:Height#genid3 | rdf:Seq
rdf:_1 | cru:Height#genid3 | Frequency
rdf:_2 | cru:Height#genid3 | Hour
rdf:_3 | cru:Height#genid3 | 6
cru:observation | hgt h.w1.53x21.dat.gz | cru:Height#genid3
rdf:type | cru:Height#genid4 | rdf:Seq
rdf:_1 | cru:Height#genid4 | TimeRange
rdf:_2 | cru:Height#genid4 | Year
rdf:_3 | cru:Height#genid |
cru:period | hgt h.w1.53x21.dat.gz | cru:Height#genid4

17 Retrieval of Semantic Information
• The RTE relies on the SQL query processor to extract relevant triples
• A Semantics Retrieval Language (SRL) prototype has been developed
• SQL-like syntax
Example
DESCRIBE RESOURCE "hgt h.w1.53x21.dat.gz";
Processing of the above SRL query
Step 1: Transform the query into a standard SQL statement and submit it to DB2
SELECT * FROM MetInstance WHERE RESOURCE = 'hgt h.w1.53x21.dat.gz';
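Step 1 can be sketched as a small translation function. Only the single DESCRIBE RESOURCE form shown on the slide is handled; the regular expression, the `srl_to_sql` name, and the use of a `?` parameter placeholder instead of an inlined string literal are assumptions of this sketch, since the full SRL grammar is not given.

```python
import re

# Matches: DESCRIBE RESOURCE "<name>";
SRL_DESCRIBE = re.compile(
    r'^\s*DESCRIBE\s+RESOURCE\s+"([^"]+)"\s*;\s*$', re.IGNORECASE
)


def srl_to_sql(query):
    """Translate one SRL DESCRIBE query into (sql_text, parameters)."""
    match = SRL_DESCRIBE.match(query)
    if match is None:
        raise ValueError(f"unsupported SRL query: {query!r}")
    # The resource name is passed as a parameter rather than spliced
    # into the SQL text, which avoids quoting problems.
    return "SELECT * FROM MetInstance WHERE resource = ?", (match.group(1),)


sql, params = srl_to_sql('DESCRIBE RESOURCE "hgt h.w1.53x21.dat.gz";')
print(sql, params)
```

The returned pair can be handed to a database cursor's `execute(sql, params)` call with a driver that uses question-mark placeholders.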

18 Retrieval of Semantic Information
Step 2: The RTE applies the following rules to generate XML as the output:
1. Extract namespace prefixes and generate the XML namespace node.
2. For every (real) atomic value, create an XML element, using the Property value as the element name.
3. For every non-atomic value, create XML nodes as sub-elements of the resource where it appears as a value.
4. If the node type is a Seq container, ensure that all its elements are ordered.
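The four rules can be sketched with Python's standard `xml.etree.ElementTree`. This is one possible reading of the rules, not the RTE's implementation. The `cru` namespace URI is a placeholder invented for the example, wrapping the output in an `rdf:Description` element is an assumption, and the recursion assumes an acyclic graph.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "cru": "http://example.org/cru#",  # hypothetical URI, not from the slides
}

FILE = "hgt h.w1.53x21.dat.gz"
rows = [  # (property, resource, value), Seq members deliberately out of order
    ("cru:FormatType", FILE, "ASCII"),
    ("cru:observation", FILE, "cru:Height#genid3"),
    ("rdf:type", "cru:Height#genid3", "rdf:Seq"),
    ("rdf:_2", "cru:Height#genid3", "Hour"),
    ("rdf:_1", "cru:Height#genid3", "Frequency"),
    ("rdf:_3", "cru:Height#genid3", "6"),
]


def qname(prefixed):
    """Turn 'prefix:local' into ElementTree's '{uri}local' form."""
    prefix, local = prefixed.split(":", 1)
    return f"{{{NAMESPACES[prefix]}}}{local}"


def build(resource, rows, parent):
    own = [(p, v) for p, r, v in rows if r == resource]
    if ("rdf:type", "rdf:Seq") in own:
        # Rule 4: emit Seq members in rdf:_1, rdf:_2, ... order.
        parent = ET.SubElement(parent, qname("rdf:Seq"))
        own = [(p, v) for p, v in own if p != "rdf:type"]
        own.sort(key=lambda pv: int(pv[0].split("_", 1)[1]))
    resources = {r for _, r, _ in rows}
    for prop, value in own:
        child = ET.SubElement(parent, qname(prop))
        if value in resources:
            build(value, rows, child)  # Rule 3: non-atomic value, nest it
        else:
            child.text = value         # Rule 2: atomic value as element text


def to_xml(resource, rows):
    # Rule 1: collect the prefixes used by the properties and register them.
    for prefix in {p.split(":", 1)[0] for p, _, _ in rows} | {"rdf"}:
        ET.register_namespace(prefix, NAMESPACES[prefix])
    root = ET.Element(qname("rdf:Description"), {qname("rdf:about"): resource})
    build(resource, rows, root)
    return root


root = to_xml(FILE, rows)
print(ET.tostring(root, encoding="unicode"))
```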

19 Conclusion
• The RTE-DBS approach enables querying and retrieval of semantic information from scientific data files available on the Web
• The retrieved information can be further processed by a machine or used by humans
• Future work will build a user interface into the RTE for maintaining individual triples, to prevent removal of triples that are nodes
• A method for identifying the data semantics of data sets, based on reasoning over semantic cases, will be developed