Co-funded by the European Union Semantic CMS Community Semantic Data Access Copyright IKS Consortium 1 SRDC Ltd. August, 2011.


Page: Outline  Semantic Data  Semantic Web  RDF  Semantic Data Storage  Triple Stores  Semantic Data Access  SPARQL  RQL  API Calls

Page: Semantic Data  Machine-understandable information  Lets computers interpret the data without user intervention  Lets computers act intelligently without being programmed for each task

Page: Semantic Data  Provides the infrastructure to get practical results  Applications discover further information by following existing relations (e.g. Eiffel Tower -> Paris -> France)  Enables reasoning  Allows extraction of related information that is not directly linked
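The Eiffel Tower -> Paris -> France chain can be illustrated with a minimal sketch. The resource names and the `locatedIn` predicate are assumptions made for illustration; the slides do not name the relation.

```python
# Hypothetical facts; the predicate name "locatedIn" is an assumption.
triples = {
    ("EiffelTower", "locatedIn", "Paris"),
    ("Paris", "locatedIn", "France"),
}


def reachable(start, predicate, triples):
    """Return every resource reachable from start by following predicate."""
    found, frontier = [], [start]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == predicate and o not in found:
                found.append(o)
                frontier.append(o)
    return found


# France is never linked to the Eiffel Tower directly, yet it is discovered.
print(reachable("EiffelTower", "locatedIn", triples))
```

A real reasoner would derive such facts from schema-level rules (for example, a transitive property) rather than from a hand-written traversal.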

Page: Semantic Web  A classical generic description:  “Web of data”  Extends the World Wide Web by encouraging:  A common language for representing data  Transformable to and from disparate sources such as relational databases, XML, etc. (RDF)  A common, reusable data model to represent data from different domains in common terms (RDFS, OWL, etc.)  Rules that enable applications to reason over the information (SWRL)

Page: Semantic Web Stack

Page: Semantic Web  Many organizations publish their data across different domains  Media  Geographic  Government  …  The whole set contains approximately 30 billion triples  One of the largest collections is DBpedia  A semantified version of Wikipedia  Example:  Obtain the cities of China that have a population over 20 million  Requires efficient storage and querying of semantic data

Page: Representation of Semantic Data  RDF  The common data format  An abstract model with several serialization formats  Consists of statements, referred to as triples, of the form (subject, predicate, object), where  Subject: any resource identifier  Predicate: the resource identifier of a property  Object: either a resource identifier or a literal value

Page: RDF  Two types of resource identifiers:  URIRef  BNode  Property  Describes a particular aspect of a resource  Must be identified by a URIRef; it may not be a BNode  Literal  Used when the object of the statement has no resource identifier  Represents the value itself
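The constraints on the three positions of a triple can be sketched as a small data model. The class names mirror the terms on the slide; the `make_triple` helper and the `http://example.org/` URIs are hypothetical and not part of any RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


def make_triple(subject, predicate, obj):
    """Build a (subject, predicate, object) tuple, enforcing the RDF term rules."""
    if not isinstance(subject, (URIRef, BNode)):
        raise TypeError("subject must be a URIRef or a BNode")
    if not isinstance(predicate, URIRef):
        raise TypeError("predicate must be a URIRef")
    if not isinstance(obj, (URIRef, BNode, Literal)):
        raise TypeError("object must be a URIRef, a BNode or a Literal")
    return (subject, predicate, obj)


# Hypothetical example resources.
tokyo = URIRef("http://example.org/Tokyo")
name = URIRef("http://example.org/name")
triple = make_triple(tokyo, name, Literal("Tokyo"))
```

Passing a `BNode` as the predicate raises `TypeError`, which reflects the rule that properties must be URIRefs.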

Page: RDF Serialization Formats  RDF/XML  N3  N-Triples  TRiG  TRiX  Turtle  JSON  JSON-LD  RDFa

Page: RDF Serialization Formats  RDF/XML is one of the most widely used serialization formats; it makes use of  Relative and absolute URIs  Namespaces  XSD datatypes  The terms declared in the XML Information Set

Page: RDF Serialization Formats  Notation 3 (N3)  More human readable RDF serialization  Aims to integrate logic and data in the same language by allowing smooth integration of rules with RDF

Page: RDF Examples  RDF/XML <rdf:RDF xml:base=“ xmlns:rdf=" xmlns:dbpprop=“ xmlns:dbpont=“

Page: RDF Examples Cont’d  N3 notation of the previous example:
    ex:Japan rdf:type dbpont:Country ;
        dbpprop:areaKm ;
        dbpprop:city ex:Tokyo .
    ex:Tokyo rdf:type dbpont:City .

Page: Storing Semantic Data  Triple collections need specialized storage designs  Two options:  Relational databases  Triple stores  The most common choice for storage  Many implementations exist  Can themselves be built on relational databases

Page: Triple Store  A purpose-built database for the storage and retrieval of RDF data  Optimized for adding, removing and querying triples  Each triple in the triple store has the form (subject, predicate, object)
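The add, remove and query operations can be sketched with a toy in-memory store. The class and method names are hypothetical, and a real triple store would use indexes rather than a linear scan.

```python
class TripleStore:
    """Toy in-memory store of (subject, predicate, object) tuples."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def remove(self, s, p, o):
        self._triples.discard((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


store = TripleStore()
store.add("ex:Japan", "rdf:type", "dbpont:Country")
store.add("ex:Japan", "dbpprop:city", "ex:Tokyo")
store.add("ex:Tokyo", "rdf:type", "dbpont:City")
```

Calling `store.match(p="rdf:type")` returns the two typing triples, and `store.match(s="ex:Japan")` returns everything stated about `ex:Japan`.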

Page: Considering XML Databases  XML databases are existing storage systems for semi-structured data  Idea: transform RDF to XML and store it in an XML database  Yet the XML data model is not the same as the semantic data model  The XML data model is a tree-like structure  RDF data is represented as a graph without a hierarchy

Page: Considering XML Databases  XML databases are not suitable for storing and querying RDF  Only simple manipulations can be handled through XML query languages  RDF Schema processing and inference are not possible  The standard RDF/XML mapping is unsuitable

Page: Monolithic approach for DB Based Triple Stores  Generic representation for all RDF schemas  Only two tables are used  Resources table  Triples table

Page: Monolithic approach for DB Based Triple Stores [Figure: Triples table with columns predid, subid, objid, objvalue; Resources table with columns id, uri]
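The two-table layout can be sketched with SQLite. The column names follow the figure (id, uri; subid, predid, objid, objvalue); the helper functions, the constraints and the `http://example.org/` URIs are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resources (id INTEGER PRIMARY KEY, uri TEXT UNIQUE NOT NULL);
CREATE TABLE triples (
    subid    INTEGER NOT NULL REFERENCES resources(id),
    predid   INTEGER NOT NULL REFERENCES resources(id),
    objid    INTEGER REFERENCES resources(id),  -- set when the object is a resource
    objvalue TEXT                               -- set when the object is a literal
);
""")


def resource_id(uri):
    """Return the id of a URI, inserting it into resources on first use."""
    conn.execute("INSERT OR IGNORE INTO resources (uri) VALUES (?)", (uri,))
    return conn.execute("SELECT id FROM resources WHERE uri = ?", (uri,)).fetchone()[0]


def add_triple(subject, predicate, obj_uri=None, obj_value=None):
    conn.execute(
        "INSERT INTO triples VALUES (?, ?, ?, ?)",
        (resource_id(subject), resource_id(predicate),
         resource_id(obj_uri) if obj_uri else None, obj_value),
    )


EX = "http://example.org/"  # hypothetical namespace
add_triple(EX + "Japan", EX + "city", obj_uri=EX + "Tokyo")
add_triple(EX + "Tokyo", EX + "name", obj_value="Tokyo")

# Reading triples back requires joining against resources for every position.
rows = conn.execute("""
    SELECT s.uri, p.uri, COALESCE(o.uri, t.objvalue)
    FROM triples t
    JOIN resources s ON s.id = t.subid
    JOIN resources p ON p.id = t.predid
    LEFT JOIN resources o ON o.id = t.objid
    ORDER BY s.uri
""").fetchall()
```

The schema never changes as new classes or properties appear, at the cost of several joins per query.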

Page: Triple Stores  Fall into three categories:  In-memory triple stores  Used for operations such as benchmarking and caching  Native triple stores  Provide their own storage implementations (Virtuoso, Mulgara, AllegroGraph, …)  Non-native, non-memory triple stores  Built on third-party databases (Jena SDB, Kaon, …)

Page: Functionalities provided by Triple Stores  RDBMS-support  General RDF model access  Query language support in the store such as RQL, SPARQL  Some stores provide:  Provenance - tracking of who-said-what  APIs for accessing triple store over network  Very few stores provide:  Full text search  Inference and rule languages

Page: Example Triple Store implementations  RDF Suite  Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001  Based on an ORDBMS model  Sesame  Relational databases (MySQL, PostgreSQL, Oracle)  Jena  Relational databases (MySQL, PostgreSQL, Oracle)  Virtuoso  Native RDF quad storage (physical quads)

Page: RDFSuite (ICS-Forth)* * IST C-Web, IST Mesmuses

Page: How triples are stored and accessed in RDF Suite  Separate tables are created to store resources  Properties, subClasses, subProperties and instances  Indices on attributes like URI, source and target  Querying is possible through RQL

Page: How triples are stored and accessed in RDF Suite [Figure from *] *Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001

Page: Sesame Architecture  DBMS-independent API for accessing triple repositories  SAIL API  A set of Java interfaces between the other modules and the repository  Abstracts from the actual storage mechanism  Query Module  RQL support  Different ways to communicate with clients  Through protocol handlers *Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002

Page: SAIL API over PostgreSQL  PostgreSQL  Object-relational DBMS  Supports sub-table relations between its tables, which provide RDF Schema class and property subsumption  Individuals are stored in separate tables created for the resources  Adding tables is difficult *Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002

Page: SAIL API over MySQL  MySQL  The database schema does not change when the RDFS changes  An advantage when the RDF Schema is unstable *Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002

Page: Jena2 Architecture

Page: Jena2 Architecture *Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds: Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB'03, The First International Workshop on Semantic Web and Databases

Page: Jena2  Denormalized schema  Avoids unnecessary joins by storing URIs and literals directly in the statement table  Multiple statement tables  Better locality and caching  Property tables

Page: Normalized vs Denormalized Tables

Page: Property Tables

Triple Store Only
    Subject  Property   Object
    person1  name       Alice
    person1  age        32
    person1  twinOf     person2
    person1  faxPhone   x1234
    person1  adminPh    x5678
    person2  name       Bob
    person2  age        35
    person2  adopteeOf  person6
    person2  friendOf   person8
    person2  gender     male

Triple Store
    Subject  Property   Object
    person1  twinOf     person2
    person1  faxPhone   x1234
    person1  adminPh    x5678
    person2  adopteeOf  person6
    person2  friendOf   person8

Person Property Table
    ID  name   age  gender
    p1  Alice  32   -
    p2  Bob    35   male

*Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds: Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB'03, The First International Workshop on Semantic Web and Databases
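The split between a property table and the remaining triple store can be sketched as follows. The data is taken from the slide; the `split_property_table` helper is hypothetical and only illustrates the idea, not Jena2's actual implementation.

```python
triples = [
    ("person1", "name", "Alice"), ("person1", "age", "32"),
    ("person1", "twinOf", "person2"), ("person1", "faxPhone", "x1234"),
    ("person1", "adminPh", "x5678"), ("person2", "name", "Bob"),
    ("person2", "age", "35"), ("person2", "adopteeOf", "person6"),
    ("person2", "friendOf", "person8"), ("person2", "gender", "male"),
]


def split_property_table(triples, clustered):
    """Move the clustered properties into one row per subject; keep the rest as triples."""
    table, remaining = {}, []
    for s, p, o in triples:
        if p in clustered:
            # Missing values stay None, like the "-" for person1's gender.
            table.setdefault(s, dict.fromkeys(clustered))[p] = o
        else:
            remaining.append((s, p, o))
    return table, remaining


person_table, triple_store = split_property_table(triples, ("name", "age", "gender"))
```

Reading a person's name, age and gender now touches a single row instead of three separate triples.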

Page: Jena Persistence Options  SDB  Scalable storage and query for RDF  Specifically designed for SPARQL support  Supports: MySQL, PostgreSQL, Oracle 11g, Microsoft SQL server and IBM DB2  Scales to graphs of 100 million triples

Page: Jena Persistence Options  TDB  Provides large-scale storage and querying of RDF datasets using a pure Java engine  Supports SPARQL  A non-transactional, faster database solution for use by a single system  Scales well beyond SDB and is simpler to set up

Page: Virtuoso  General-purpose RDBMS with extensive RDF adaptations  RDF data is stored as quads, i.e. (graph, subject, predicate, object) tuples, so RDF with named graphs is supported  The columns are G for graph, S for subject, P for predicate and O for object

Page: Querying Semantic Data  Semantic data can be queried from triple stores by  Various query languages  SPARQL  Different endpoints provided  RQL  RDQL  SeRQL  …  API Calls  Through proprietary APIs of different projects  Linked Data

Page: SPARQL  Is an RDF query language  Standardized by the W3C  Plays a role similar to that of SQL for databases  Syntactically resembles SQL  Queries RDF graphs instead of databases

Page: SPARQL  Provides 4 types of query:  SELECT  Queries an RDF graph by selecting certain fields from the query pattern  CONSTRUCT  Constructs a single RDF graph specified by a graph template  ASK  Tests whether a resource matching the query pattern exists  DESCRIBE  Constructs an RDF graph whose structure is determined by the SPARQL query processor, unlike CONSTRUCT

Page: SPARQL  SPARQL SELECT queries can be constructed through the following parts:  Prefix declaration  Field declarations  Dataset selection  Query pattern  Query modifiers

Page: Prefix Declaration  Prefix declarations allow short prefixed names to be used instead of full URIs
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

Page: Field declarations  Desired fields from the query pattern are listed after the SELECT keyword as variables prefixed with ?
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name

Page: Dataset selection  The RDF graphs to be queried are stated with a FROM clause
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name
    FROM <...>

Page: Query Pattern  A set of triple patterns that determines the target resources to be selected
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name
    FROM <...>
    WHERE { ?person foaf:name ?name . }

Page: Query Modifiers  SPARQL offers several query modifiers for post-processing query results  Order  Limit  Distinct  Offset  Reduced  Projection

Page: Query Modifiers  They are used to  Eliminate duplicate results  Order the results  Limit the number of returned results, etc.
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name
    FROM <...>
    WHERE { ?person foaf:name ?name . }
    ORDER BY ?name
    LIMIT 5
    OFFSET 10
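What the modifiers do to a result set can be sketched on plain Python tuples. The `apply_modifiers` function and the sample rows are hypothetical; the order of the steps (ordering, then duplicate elimination, then offset, then limit) follows the order in which SPARQL applies its solution modifiers.

```python
def apply_modifiers(rows, order_key=None, distinct=False, offset=0, limit=None):
    """Post-process result rows in the order SPARQL applies its modifiers."""
    if order_key is not None:
        rows = sorted(rows, key=lambda row: row[order_key])  # ORDER BY
    if distinct:
        rows = list(dict.fromkeys(rows))  # DISTINCT, keeps first occurrences
    rows = rows[offset:]  # OFFSET
    return rows if limit is None else rows[:limit]  # LIMIT


# Hypothetical (?person, ?name) bindings.
results = [("p3", "Carol"), ("p1", "Alice"), ("p2", "Bob"), ("p1", "Alice")]
page = apply_modifiers(results, order_key=1, distinct=True, offset=1, limit=1)
print(page)
```

With these rows, ordering by name and removing the duplicate leaves Alice, Bob and Carol; skipping one row and keeping one yields only Bob's row.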

Page: FILTER and OPTIONAL Clauses  FILTER allows calling a subset of the functions defined by the XQuery specification within a SPARQL query pattern
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
    SELECT ?person ?name
    FROM <...>
    WHERE {
      ?person foaf:name ?name .
      FILTER (fn:string-length(?name) > 10)
    }

Page: FILTER and OPTIONAL Clauses  The OPTIONAL clause specifies query patterns that do not have to match for a solution to be returned
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
    PREFIX info: <...>
    SELECT ?person ?name
    FROM <...>
    WHERE {
      ?person foaf:name ?name .
      FILTER (fn:string-length(?name) > 10)
      OPTIONAL { ?person foaf:homepage ?page . }
    }

Page: ASK Query Example  Checks whether a resource with "Tim Berners-Lee" as its foaf:name exists
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    ASK WHERE { ?person foaf:name "Tim Berners-Lee" . }

Page: CONSTRUCT Query Example  The example below constructs a new RDF graph by changing the type of skos:Concept resources; in other words, it transforms the SKOS vocabulary into a new custom vocabulary
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX myvocab: <...>
    CONSTRUCT { ?person rdf:type myvocab:MyType . }
    WHERE { ?person rdf:type skos:Concept . }

Page: SPARQL Endpoints  Provide functionality to query the knowledge base via the SPARQL language  Accept queries and return results over HTTP  Query results can be in different formats such as  RDF  XML  HTML  JSON  CSV
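Querying an endpoint over HTTP can be sketched with the Python standard library. The endpoint URL `http://example.org/sparql` and the `build_request` helper are hypothetical; the sketch only builds the request (a GET with the query in the `query` parameter and the desired result format in the `Accept` header) and does not send it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def build_request(endpoint, query, accept="application/sparql-results+json"):
    """Build a GET request that carries a SPARQL query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name WHERE { ?person foaf:name ?name . } LIMIT 5
"""
request = build_request("http://example.org/sparql", query)

# The query survives URL encoding unchanged.
sent_query = parse_qs(urlsplit(request.full_url).query)["query"][0]
```

Against a real endpoint, passing `request` to `urllib.request.urlopen` would return the results in the requested format, provided the endpoint supports it.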

Page: Semantic Data Access With API Calls  Open source projects provide APIs to manipulate RDF data  Jena  Apache Clerezza  Sesame  JRDF

Page: Jena  Jena provides a rich API to manipulate the RDF stored in the underlying triple store.  Model to represent graphs  CRUD methods for triples  Querying methods for existing resources  See the next slide for the code snippet…

Page: Jena Code Snippet
    // Package names as in Apache Jena 3 and later; Jena 2 used com.hp.hpl.jena
    import org.apache.jena.rdf.model.*;
    import org.apache.jena.vocabulary.VCARD;

    String personURI = "...";
    String givenName = "John";
    String familyName = "Smith";
    String fullName = givenName + " " + familyName;

    // create an empty Model which represents an RDF graph
    Model model = ModelFactory.createDefaultModel();

    // create the resource which will produce the triples in the next slide
    Resource johnSmith = model.createResource(personURI)
        .addProperty(VCARD.FN, fullName)
        .addProperty(VCARD.N, model.createResource()   // blank node
            .addProperty(VCARD.Given, givenName)
            .addProperty(VCARD.Family, familyName));

Page: Jena  Triples created by the code snippet on the previous slide:
    (personURI, VCARD.FN, "John Smith")
    (personURI, VCARD.N, _)
    (_, VCARD.Given, "John")
    (_, VCARD.Family, "Smith")
Note that the _ symbol represents a blank node

Page: Apache Clerezza  Provides an API that is independent of the triple stores it supports  Its API provides a model to represent and manipulate RDF graphs  Also provides a SPARQL endpoint to query the stored knowledge

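Page: Apache Clerezza Code Snippet  Simple code snippet adding two triples to the graph:
    import org.apache.clerezza.rdf.core.*;
    import org.apache.clerezza.rdf.core.impl.*;

    String base = "...";
    MGraph g = new SimpleMGraph();
    // rdf:type and foaf:Person written out as full URIs
    g.add(new TripleImpl(
        new UriRef(base + "JohnSmith"),
        new UriRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
        new UriRef("http://xmlns.com/foaf/0.1/Person")));
    // VCARD FN written out as a full URI
    g.add(new TripleImpl(
        new UriRef(base + "JohnSmith"),
        new UriRef("http://www.w3.org/2001/vcard-rdf/3.0#FN"),
        LiteralFactory.getInstance().createTypedLiteral("John")));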

Page: Linked Data  Datasets on the Web that are interlinked so that computers can explore them  Accessed and managed through a standard format  Enables integration of and reasoning over a huge amount of data on the Web

Page: Linked Data  Four well-known principles of Linked Data formulated by Tim Berners-Lee  Use URIs as names for things  Use HTTP URIs so that people can dereference those names  When a URI is dereferenced, provide useful information in a standard format (RDF, SPARQL)  Provide links to other URIs so that related data can be discovered

Page: Linked Data

Page: Linking Open Data Project  Is a W3C SWEO project  Aims to make data freely available to everyone  Aims to publish open datasets as RDF and set semantic relationships between them  Serves information in a machine-readable format  Enriches content  Reduces duplication  The number of linked datasets is increasing rapidly  A large number of datasets are already linked

Page: Linked Datasets As of October 2008

Page: Linked Datasets As of September 2010

Page: 2011

Page: Access Data In The Cloud  Follow the RDF links representing the “things”  SPARQL Endpoints  Ready to use software to discover linked data (See the next slide)

Page: Linked Data Applications  Many applications are built on top of linked data  Tabulator  Marbles  OpenLink RDF Browser  …  Search the Web for  RDF crawlers  RDF browsers  See also the following page, which lists a number of linked data applications:  LinkingOpenData/Applications

Page: Available SPARQL Endpoints    To see possible SPARQL endpoints providing a certain URI see   See also a list of alive SPARQL endpoints 

Page: References  Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001  Jeen Broekstra, Arjohn Kampman and Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002  Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds. Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB'03, The First International Workshop on Semantic Web and Databases