Co-funded by the European Union Semantic CMS Community Semantic Data Access Copyright IKS Consortium 1 SRDC Ltd. August, 2011.



2 www.iks-project.eu Page: Outline
• Semantic Data
  • Semantic Web
  • RDF
• Semantic Data Storage
  • Triple Stores
• Semantic Data Access
  • SPARQL
  • RQL
  • API Calls

3 Semantic Data
• Stands for machine-understandable information
• Allows computers to interpret the data without user intervention
• Allows computers to act intelligently without being programmed for each task

4 Semantic Data
• Provides infrastructure to get practical results
  • Applications can derive further information by following existing relations (e.g. Eiffel Tower -> Paris -> France)
• Allows reasoning capabilities
  • Enables extraction of related information that is not directly linked
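The Eiffel Tower -> Paris -> France example can be sketched as a walk along a chain of relations. This is only an illustration: the class name, the `LOCATED_IN` map and the idea of a single "located in" relation are assumptions made for the example, not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RelationChainSketch {
    // Hypothetical "located in" facts; the place names come from the slide's example.
    static final Map<String, String> LOCATED_IN = Map.of(
            "Eiffel Tower", "Paris",
            "Paris", "France");

    // Follows the relation step by step and returns every place reached from start.
    static List<String> containers(String start) {
        List<String> chain = new ArrayList<>();
        String current = LOCATED_IN.get(start);
        while (current != null && !chain.contains(current)) { // stop at the end or on a cycle
            chain.add(current);
            current = LOCATED_IN.get(current);
        }
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(containers("Eiffel Tower")); // [Paris, France]
    }
}
```

No fact links the Eiffel Tower to France directly; the answer is obtained only by chaining the two stored relations.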

5 Semantic Web
• A classical generic description: "Web of data"
• Extends the World Wide Web by encouraging:
  • A common language for representing data, transformable to/from disparate sources such as relational databases, XML, etc. (RDF)
  • Common reusable data models to represent data from different domains in common terms (RDFS, OWL, etc.)
  • Rules that enable applications to reason over the information (SWRL)

6 Semantic Web Stack

7 Semantic Web
• Many organizations publish their data in different domains
  • Media
  • Geographic
  • Government
  • …
• The whole set contains approximately 30 billion triples
• One of the largest collections is DBpedia
  • A semantified version of Wikipedia
• Example: obtain the cities of China that have a population over 20 million
• Needs efficient storage and querying of semantic data

8 Representation of Semantic Data
• RDF
  • The common data format
  • An abstract model with several serialization formats
  • Consists of statements, referred to as triples, of the form (subject, predicate, object), where
    • Subject: any resource identifier
    • Predicate: the resource identifier of a property
    • Object: either a resource identifier or a literal value

9 RDF
• Two types of resource identifiers:
  • URIRef
  • BNode
• Property
  • Describes a particular aspect of a resource
  • Must be identified by a URIRef and may not be given a BNode identifier
• Literal
  • Used when the object of the statement has no resource identifier
  • Represents the value itself
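The rules above (two kinds of resource identifier, URIRef-only predicates, literals only in the object position) can be expressed as a small type model. This is a sketch with invented type names, not the API of any RDF library.

```java
final class RdfModelSketch {
    // Anything that may appear in the object position of a triple.
    interface Node {}

    // Only URI references and blank nodes are resource identifiers.
    interface Resource extends Node {}

    record UriRef(String uri) implements Resource {}
    record BNode(String label) implements Resource {}
    record Literal(String value) implements Node {}

    // The predicate is typed as UriRef, so a blank-node predicate does not compile;
    // the subject is a Resource, so a literal subject does not compile either.
    record Triple(Resource subject, UriRef predicate, Node object) {}

    public static void main(String[] args) {
        UriRef japan = new UriRef("http://www.example.org/Japan");
        UriRef areaKm = new UriRef("http://dbpedia.org/property/areaKm");
        Triple t = new Triple(japan, areaKm, new Literal("377944"));
        System.out.println(t.object() instanceof Literal); // true
    }
}
```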

10 RDF Serialization Formats
• RDF/XML
• N3
• N-Triples
• TriG
• TriX
• Turtle
• JSON
• JSON-LD
• RDFa

11 RDF Serialization Formats
• RDF/XML is one of the most widely used serialization formats. It makes use of:
  • Relative and absolute URIs
  • Namespaces
  • XSD datatypes
  • The terms declared in the XML Information Set

12 RDF Serialization Formats
• Notation 3 (N3)
  • A more human-readable RDF serialization
  • Aims to integrate logic and data in the same language by allowing smooth integration of rules with RDF

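13 RDF Examples
• RDF/XML
<rdf:RDF xml:base="http://www.example.org"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dbpprop="http://dbpedia.org/property/"
         xmlns:dbpont="http://dbpedia.org/ontology/">
  <dbpont:Country rdf:about="Japan">
    <dbpprop:areaKm>377944</dbpprop:areaKm>
    <dbpprop:city>
      <dbpont:City rdf:about="Tokyo"/>
    </dbpprop:city>
  </dbpont:Country>
</rdf:RDF>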

14 RDF Examples Cont'd
• N3 notation of the previous example
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpprop: <http://dbpedia.org/property/> .
@prefix dbpont: <http://dbpedia.org/ontology/> .
@prefix ex: <http://www.example.org/> .

ex:Japan rdf:type dbpont:Country ;
    dbpprop:areaKm 377944 ;
    dbpprop:city ex:Tokyo .
ex:Tokyo rdf:type dbpont:City .

15 Storing Semantic Data
• Triple collections need specialized storage designs
• Two modalities:
  • Relational databases
  • Triple stores
    • Mostly used for storage
    • Lots of implementations
    • They can also be RDB based

16 Triple Store
• A purpose-built database for the storage and retrieval of RDF data
• A place optimized for adding, removing and querying triples
• Each triple in the triple store has the form (subject, predicate, object)
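The add / remove / query contract described above can be illustrated with a very small in-memory store. This is a minimal sketch under stated assumptions: terms are plain strings, `null` acts as a wildcard, and nothing is indexed, whereas real triple stores rely heavily on indices.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class TripleStoreSketch {
    record Triple(String subject, String predicate, String object) {}

    // A set, so adding the same statement twice has no effect.
    private final Set<Triple> triples = new LinkedHashSet<>();

    boolean add(String s, String p, String o) {
        return triples.add(new Triple(s, p, o));
    }

    boolean remove(String s, String p, String o) {
        return triples.remove(new Triple(s, p, o));
    }

    // Returns all triples matching the pattern; a null argument matches anything.
    List<Triple> find(String s, String p, String o) {
        return triples.stream()
                .filter(t -> (s == null || s.equals(t.subject()))
                        && (p == null || p.equals(t.predicate()))
                        && (o == null || o.equals(t.object())))
                .toList();
    }

    public static void main(String[] args) {
        TripleStoreSketch store = new TripleStoreSketch();
        store.add("ex:Japan", "rdf:type", "dbpont:Country");
        store.add("ex:Japan", "dbpprop:city", "ex:Tokyo");
        store.add("ex:Tokyo", "rdf:type", "dbpont:City");
        System.out.println(store.find(null, "rdf:type", null).size()); // 2
    }
}
```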

17 Considering XML Databases
• XML databases are existing storage systems for semi-structured data
• Idea: transform RDF to XML and store it in XML databases
• Yet the XML data model is not exactly the same as that of semantic data
  • The XML data model is a tree-like structure
  • RDF data is represented as a graph without a hierarchy

18 Considering XML Databases
• XML databases are not suitable for storing and querying RDF
  • Only simple manipulations can be handled through XML query languages
  • RDF Schema processing and inference are not possible
  • The standard RDF/XML mapping is unsuitable

19 Monolithic approach for DB Based Triple Stores
• Generic representation for all RDF schemas
• Only two tables are used
  • Resources table
  • Triples table

20 Monolithic approach for DB Based Triple Stores

Triples table:
predid | subid | objid | objvalue
6      | 2     | 1     |
5      | 3     | 7     |
5      | 1     | 8     |
5      | 9     | 2     |
3      | 9     |       | Sunscale

Resources table:
id | uri
1  | http://www.iks.og/topics.rdfs#Hotel
2  | http://www.iks.og/topics.rdfs#HotelDirections
3  | http://www.oclc.org/dublincore.rdfs#title
4  | http://www.iks.og/schema.rdf#Ext.Resource
5  | http://www.w3.org/1999/02/22-rdf-syntax-ns#type
6  | http://www.w3.org/2000/01/rdf-schema#subClassOf
7  | http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
8  | http://www.w3.org/2000/01/rdf-schema#Class
9  | rl
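The two-table layout can be mimicked in memory to show how statements are reduced to integer ids. The sketch below is an assumption-laden illustration: the class and method names are invented, ids are assigned in insertion order, and an actual store would keep both tables in a relational database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MonolithicStoreSketch {
    // One row of the triples table. objId points into the resources table;
    // objValue holds a literal and is used when objId is null.
    record Row(int predId, int subId, Integer objId, String objValue) {}

    private final Map<String, Integer> resources = new LinkedHashMap<>(); // uri -> id
    private final List<Row> triples = new ArrayList<>();

    // Returns the id of a URI, assigning the next free id on first use.
    int idOf(String uri) {
        Integer id = resources.get(uri);
        if (id == null) {
            id = resources.size() + 1;
            resources.put(uri, id);
        }
        return id;
    }

    void addResourceTriple(String s, String p, String o) {
        triples.add(new Row(idOf(p), idOf(s), idOf(o), null));
    }

    void addLiteralTriple(String s, String p, String value) {
        triples.add(new Row(idOf(p), idOf(s), null, value));
    }

    // Literal values stored for a given subject and predicate.
    List<String> literals(String s, String p) {
        Integer sid = resources.get(s);
        Integer pid = resources.get(p);
        if (sid == null || pid == null) return List.of();
        return triples.stream()
                .filter(r -> r.subId() == sid && r.predId() == pid && r.objValue() != null)
                .map(Row::objValue)
                .toList();
    }

    public static void main(String[] args) {
        MonolithicStoreSketch store = new MonolithicStoreSketch();
        String title = "http://www.oclc.org/dublincore.rdfs#title";
        store.addLiteralTriple("r1", title, "Sunscale");
        System.out.println(store.literals("r1", title)); // [Sunscale]
    }
}
```

Every lookup by URI first resolves ids in the resources table and then scans or joins the triples table, which is the cost this generic layout pays for being schema-independent.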

21 Triple Stores
• Can be grouped into three categories:
  • In-memory triple stores
    • Used for certain operations such as benchmarking, caching, etc.
  • Native triple stores
    • Provide their own storage implementations (Virtuoso, Mulgara, AllegroGraph, …)
  • Non-memory, non-native triple stores
    • Built on third-party databases (Jena SDB, Kaon, …)

22 Functionalities provided by Triple Stores
• RDBMS support
• General RDF model access
• Query language support in the store, such as RQL and SPARQL
• Some stores provide:
  • Provenance: tracking of who said what
  • APIs for accessing the triple store over the network
• Very few stores provide:
  • Full-text search
  • Inference and rule languages

23 Example Triple Store implementations
• RDF Suite
  • Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001
  • Based on an ORDBMS model
• Sesame
  • http://www.openrdf.org/
  • Relational databases (MySQL, PostgreSQL, Oracle)
• Jena
  • http://www.hpl.hp.com/semweb/jena2.htm
  • Relational databases (MySQL, PostgreSQL, Oracle)
• Virtuoso
  • http://virtuoso.openlinksw.com/
  • Native RDF quad storage (physical quads)

24 RDFSuite (ICS-FORTH)*
* IST-1999-13479 C-Web, IST-2000-26074 Mesmuses

25 How triples are stored and accessed in RDF Suite
• Separate tables are created to store resources
  • Properties, subclasses, subproperties and instances
• Indices on attributes such as URI, source and target
• Querying is possible through RQL

26 How triples are stored and accessed in RDF Suite
[Figure from *]
*Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001

27 Sesame Architecture
• DBMS-independent API for accessing triple repositories
  • SAIL API
    • A set of Java interfaces between the other modules and the repository
    • Abstracts from the actual storage mechanism
• Query Module
  • RQL support
• Different ways to communicate with clients
  • Through protocol handlers
*Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002

28 SAIL API over PostgreSQL
• PostgreSQL
  • Object-relational DBMS
  • Supports sub-table relations between its tables, which are used to represent RDF Schema class and property subsumption
  • Individuals are stored in separate tables created for resources
  • Difficult to add tables
*Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002

29 SAIL API over MySQL
• MySQL
  • The database schema does not change when the RDFS changes
  • This is an advantage where the RDFS is unstable
*Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002

30 Jena2 Architecture

31 Jena2 Architecture
*Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds: Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB'03, The First International Workshop on Semantic Web and Databases

32 Jena2
• Denormalized schema
  • Avoids unnecessary joins by storing URIs and literals directly in the statement table
• Multiple statement tables
  • Better locality and caching
• Property tables

33 Normalized vs Denormalized Tables

34 Property Tables

Triple Store Only:
Subject | Property  | Object
person1 | name      | Alice
person1 | age       | 32
person1 | twinOf    | person2
person1 | faxPhone  | x1234
person1 | adminPh   | x5678
person2 | name      | Bob
person2 | age       | 35
person2 | adopteeOf | person6
person2 | friendOf  | person8
person2 | gender    | male

Triple Store:
Subject | Property  | Object
person1 | twinOf    | person2
person1 | faxPhone  | x1234
person1 | adminPh   | x5678
person2 | adopteeOf | person6
person2 | friendOf  | person8

Person Property Table:
ID | name  | age | gender
p1 | Alice | 32  | -
p2 | Bob   | 35  | male

*Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds: Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB'03, The First International Workshop on Semantic Web and Databases
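The split shown in the tables above can be sketched as a routing step: statements whose property has a dedicated column go into the property table, everything else stays in the generic triple store. The names below are hypothetical, and the sketch assumes each column holds at most one value per subject (a later value simply overwrites an earlier one).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PropertyTableSketch {
    record Triple(String subject, String property, String object) {}

    // Properties that get their own column in the person property table.
    static final Set<String> COLUMNS = Set.of("name", "age", "gender");

    // subject -> (column -> value); one row per person.
    final Map<String, Map<String, String>> personTable = new LinkedHashMap<>();
    // Statements that do not fit a column remain ordinary triples.
    final List<Triple> remaining = new ArrayList<>();

    void add(Triple t) {
        if (COLUMNS.contains(t.property())) {
            personTable.computeIfAbsent(t.subject(), s -> new LinkedHashMap<>())
                    .put(t.property(), t.object());
        } else {
            remaining.add(t);
        }
    }

    public static void main(String[] args) {
        PropertyTableSketch store = new PropertyTableSketch();
        store.add(new Triple("person1", "name", "Alice"));
        store.add(new Triple("person1", "age", "32"));
        store.add(new Triple("person1", "twinOf", "person2"));
        System.out.println(store.personTable.get("person1")); // {name=Alice, age=32}
        System.out.println(store.remaining.size());           // 1
    }
}
```

Reading a person's name, age and gender then touches a single row instead of three separate statements.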

35 Jena Persistence Options
• SDB
  • Scalable storage and query for RDF
  • Specifically designed for SPARQL support
  • Supports MySQL, PostgreSQL, Oracle 11g, Microsoft SQL Server and IBM DB2
  • Scales to graphs of 100 million triples

36 Jena Persistence Options
• TDB
  • Provides large-scale storage and query of RDF datasets using a pure Java engine
  • Supports SPARQL
  • A non-transactional, faster database solution for use by a single system
  • Scales well beyond SDB and is simpler to set up

37 Virtuoso
• General-purpose RDBMS with extensive RDF adaptations
• RDF data is stored as RDF quads, i.e. (graph, subject, predicate, object) tuples, so RDF with named graphs is supported
• The columns are G for graph, S for subject, P for predicate and O for object

38 Querying Semantic Data
• Semantic data can be queried from triple stores through
  • Various query languages
    • SPARQL
      • Different endpoints provided
    • RQL
    • RDQL
    • SeRQL
    • …
  • API calls
    • Through the proprietary APIs of different projects
  • Linked Data

39 SPARQL
• Is an RDF query language
• Standardized by the W3C
• Plays a role for RDF similar to that of SQL for relational databases
  • Syntactically resembles SQL
  • Queries RDF graphs instead of databases

40 SPARQL
• Provides four types of query:
  • SELECT
    • Queries an RDF graph by selecting certain fields from the query pattern
  • CONSTRUCT
    • Constructs a single RDF graph specified by a graph template
  • ASK
    • Tests whether the query pattern has a match, e.g. whether a resource exists
  • DESCRIBE
    • Also constructs an RDF graph, but unlike CONSTRUCT the structure of the graph is determined by the SPARQL query processor

41 SPARQL
• SPARQL SELECT queries are built from the following parts:
  • Prefix declarations
  • Field declarations
  • Dataset selection
  • Query pattern
  • Query modifiers

42 Prefix Declaration
• Prefix declarations allow short prefixed names to be used instead of full URIs
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

43 Field declarations
• The desired fields (variables) of the query pattern are listed after the SELECT keyword
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name

44 Dataset selection
• The RDF graphs to be queried are stated in a FROM clause
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
FROM <...>

45 Query Pattern
• A set of triple patterns that determines the target resources to be selected
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
FROM <...>
WHERE { ?person foaf:name ?name . }
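How a single triple pattern such as `?person foaf:name ?name` selects data can be sketched as follows. This is a simplified illustration, not how a SPARQL engine is implemented: terms are plain strings, a leading `?` marks a variable, and only one pattern is matched (no joins between patterns).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class PatternMatchSketch {
    record Triple(String s, String p, String o) {}

    // Tries to match one data triple; returns the variable bindings on success.
    static Optional<Map<String, String>> match(Triple pattern, Triple data) {
        Map<String, String> bindings = new LinkedHashMap<>();
        boolean ok = bind(pattern.s(), data.s(), bindings)
                && bind(pattern.p(), data.p(), bindings)
                && bind(pattern.o(), data.o(), bindings);
        return ok ? Optional.of(bindings) : Optional.empty();
    }

    // Constants must be equal; a variable binds once and must stay consistent.
    private static boolean bind(String term, String value, Map<String, String> bindings) {
        if (!term.startsWith("?")) return term.equals(value);
        String previous = bindings.putIfAbsent(term, value);
        return previous == null || previous.equals(value);
    }

    // One result row (set of bindings) per matching triple in the graph.
    static List<Map<String, String>> select(Triple pattern, List<Triple> graph) {
        return graph.stream()
                .map(t -> match(pattern, t))
                .flatMap(Optional::stream)
                .toList();
    }

    public static void main(String[] args) {
        List<Triple> graph = List.of(
                new Triple("ex:alice", "foaf:name", "Alice"),
                new Triple("ex:alice", "foaf:age", "32"));
        System.out.println(select(new Triple("?person", "foaf:name", "?name"), graph));
        // [{?person=ex:alice, ?name=Alice}]
    }
}
```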

46 Query Modifiers
• SPARQL offers plenty of query modifiers for post-processing query results
  • Order
  • Limit
  • Distinct
  • Offset
  • Reduced
  • Projection

47 Query Modifiers
• They are used for
  • Eliminating duplicate results
  • Ordering the results
  • Limiting the number of returned results, etc.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
FROM <...>
WHERE { ?person foaf:name ?name . }
ORDER BY ?name
LIMIT 5
OFFSET 10
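The effect of the modifiers on a result set can be pictured with an ordinary stream pipeline. This is an analogy rather than engine code: the result set is reduced to a list of names, and DISTINCT is included only to show where it would sit (the query above does not use it).

```java
import java.util.List;

final class QueryModifierSketch {
    // Mirrors ORDER BY ?name, DISTINCT, OFFSET and LIMIT, applied in that order.
    static List<String> page(List<String> names, long offset, long limit) {
        return names.stream()
                .sorted()      // ORDER BY ?name
                .distinct()    // DISTINCT
                .skip(offset)  // OFFSET
                .limit(limit)  // LIMIT
                .toList();
    }

    public static void main(String[] args) {
        List<String> names = List.of("Dave", "Alice", "Carol", "Bob", "Alice");
        System.out.println(page(names, 1, 2)); // [Bob, Carol]
    }
}
```

OFFSET is applied before LIMIT, so `LIMIT 5 OFFSET 10` returns results 11 to 15 of the ordered sequence.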

48 FILTER and OPTIONAL Clauses
• FILTER allows a subset of the functions defined by the XQuery specification to be called in the query pattern of a SPARQL query
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
SELECT ?person ?name
FROM <...>
WHERE {
  ?person foaf:name ?name .
  FILTER (fn:string-length(?name) > 10)
}

49 FILTER and OPTIONAL Clauses
• The OPTIONAL clause specifies query patterns whose match is not required for a solution to be returned
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX info: <...>
SELECT ?person ?name
FROM <...>
WHERE {
  ?person foaf:name ?name .
  FILTER (fn:string-length(?name) > 10)
  OPTIONAL { ?person foaf:homepage ?page . }
}
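The difference between the two clauses can be sketched in plain Java: FILTER drops a candidate solution, while OPTIONAL only adds a binding when one is available. The data structures and names here are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FilterOptionalSketch {
    record Person(String uri, String name) {}

    // homepages maps a person URI to a homepage; not every person has one.
    static List<Map<String, String>> query(List<Person> people, Map<String, String> homepages) {
        List<Map<String, String>> rows = new ArrayList<>();
        for (Person p : people) {
            // FILTER: names of 10 characters or fewer are rejected outright.
            if (p.name().length() <= 10) continue;
            Map<String, String> row = new LinkedHashMap<>();
            row.put("person", p.uri());
            row.put("name", p.name());
            // OPTIONAL: the row is kept even when no homepage is known.
            String page = homepages.get(p.uri());
            if (page != null) row.put("page", page);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("ex:tim", "Tim Berners-Lee"),
                new Person("ex:bob", "Bob"),
                new Person("ex:dan", "Dan Brickley"));
        Map<String, String> homepages = Map.of("ex:tim", "http://example.org/tim");
        System.out.println(query(people, homepages).size()); // 2
    }
}
```

In the usage example, Bob is removed by the filter, while Dan Brickley is returned without a `page` binding.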

50 ASK Query Example
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
ASK WHERE { ?person foaf:name "Tim Berners-Lee" . }
• Checks whether a resource having "Tim Berners-Lee" as foaf:name exists

51 CONSTRUCT Query Example
• The example below constructs a new RDF graph by changing the type of skos:Concept resources. In other words, it transforms the SKOS vocabulary into a new custom vocabulary
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX myvocab: <...>
CONSTRUCT { ?person rdf:type myvocab:MyType . }
WHERE { ?person rdf:type skos:Concept . }
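What the CONSTRUCT query does can be sketched as "match, then instantiate a template". The code is a simplified stand-in with prefixed names kept as plain strings; it is not how a SPARQL processor evaluates the query.

```java
import java.util.List;

final class ConstructSketch {
    record Triple(String s, String p, String o) {}

    // WHERE part: keep subjects typed as skos:Concept.
    // CONSTRUCT part: emit a new type statement for each of them.
    static List<Triple> construct(List<Triple> graph) {
        return graph.stream()
                .filter(t -> t.p().equals("rdf:type") && t.o().equals("skos:Concept"))
                .map(t -> new Triple(t.s(), "rdf:type", "myvocab:MyType"))
                .toList();
    }

    public static void main(String[] args) {
        List<Triple> graph = List.of(
                new Triple("ex:hotel", "rdf:type", "skos:Concept"),
                new Triple("ex:hotel", "skos:prefLabel", "Hotel"));
        System.out.println(construct(graph));
        // [Triple[s=ex:hotel, p=rdf:type, o=myvocab:MyType]]
    }
}
```

The result is a new graph; the source graph is left unchanged, and statements that are not mentioned in the template (such as the label) are not copied.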

52 SPARQL Endpoints
• Provide functionality to query the knowledge base via the SPARQL language
• Accept queries and return results over HTTP
• Query results can be in different formats such as
  • RDF
  • XML
  • HTML
  • JSON
  • CSV
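A request to an endpoint can be sketched with the JDK's `java.net.http` classes. The sketch only builds the request and does not send it; it assumes the endpoint accepts the query in a `query` URL parameter via GET, as the SPARQL Protocol describes, and it asks for JSON results through the `Accept` header. The endpoint URL is one of those listed later in this deck.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class EndpointRequestSketch {
    // Builds (but does not send) a GET request carrying the query text.
    static HttpRequest build(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint + "?query=" + encoded))
                .header("Accept", "application/sparql-results+json") // requested result format
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://dbpedia.org/sparql", "ASK { ?s ?p ?o }");
        System.out.println(request.uri());
        // To execute it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```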

53 Semantic Data Access With API Calls
• Open source projects provide APIs to manipulate RDF data
  • Jena
  • Apache Clerezza
  • Sesame
  • JRDF

54 Jena
• Jena provides a rich API to manipulate the RDF stored in the underlying triple store
  • Model to represent graphs
  • CRUD methods for triples
  • Querying methods for existing resources
• See the next slide for the code snippet…

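55 Jena Code Snippet
// Package names are those of Apache Jena 3 and later;
// Jena 2.x releases use the com.hp.hpl.jena prefix instead.
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.VCARD;

String personURI = "http://somewhere/JohnSmith";
String givenName = "John";
String familyName = "Smith";
String fullName = givenName + " " + familyName;

// create an empty Model, which represents an RDF graph
Model model = ModelFactory.createDefaultModel();

// create the resource; this produces the triples listed on the next slide
Resource johnSmith = model.createResource(personURI)
    .addProperty(VCARD.FN, fullName)
    .addProperty(VCARD.N,
        model.createResource() // blank node holding the structured name
            .addProperty(VCARD.Given, givenName)
            .addProperty(VCARD.Family, familyName));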

56 Jena
• Triples created by the code snippet on the previous slide:
(<http://somewhere/JohnSmith>, VCARD.FN, "John Smith")
(<http://somewhere/JohnSmith>, VCARD.N, _)
(_, VCARD.Given, "John")
(_, VCARD.Family, "Smith")
• Note that the _ symbol represents a blank node

57 Apache Clerezza
• Provides an API that is independent of the different triple stores it supports
• Its API provides a model to represent RDF graphs and to manipulate those graphs
• Also provides a SPARQL endpoint to query the stored knowledge

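58 Apache Clerezza Code Snippet
• Simple code snippet adding two triples to the graph:
import org.apache.clerezza.rdf.core.LiteralFactory;
import org.apache.clerezza.rdf.core.MGraph;
import org.apache.clerezza.rdf.core.UriRef;
import org.apache.clerezza.rdf.core.impl.SimpleMGraph;
import org.apache.clerezza.rdf.core.impl.TripleImpl;

String base = "http://www.example.org#";
MGraph g = new SimpleMGraph();
// JohnSmith rdf:type foaf:Person
g.add(new TripleImpl(
    new UriRef(base + "JohnSmith"),
    new UriRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
    new UriRef("http://xmlns.com/foaf/0.1/Person")));
// JohnSmith VCARD:FN "John"
g.add(new TripleImpl(
    new UriRef(base + "JohnSmith"),
    new UriRef("http://www.w3.org/2001/vcard-rdf/3.0#FN"),
    LiteralFactory.getInstance().createTypedLiteral("John")));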

59 Linked Data
• Interrelates datasets on the Web so that computers can explore them
• Has a standard format for access and management
• Enables integration of, and reasoning over, a huge amount of data on the Web

60 Linked Data
• The four well-known principles of linked data, presented by Tim Berners-Lee:
  • Use URIs as names of things
  • Use HTTP URIs so that people can look up (dereference) those names
  • When a URI is dereferenced, provide useful information in a standard format (RDF, SPARQL)
  • Provide links to other URIs so that related data can be discovered
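Dereferencing a URI, as the second and third principles describe, usually means an HTTP GET with an `Accept` header naming an RDF serialization. The sketch below only builds such a request with the JDK's `java.net.http` classes; the choice of media types and the example resource URI are assumptions made for the illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class DereferenceSketch {
    // Builds (but does not send) a lookup request that prefers Turtle over RDF/XML.
    static HttpRequest lookup(String thingUri) {
        return HttpRequest.newBuilder(URI.create(thingUri))
                .header("Accept", "text/turtle, application/rdf+xml;q=0.9")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = lookup("http://dbpedia.org/resource/Paris");
        System.out.println(request.method() + " " + request.uri());
        // GET http://dbpedia.org/resource/Paris
    }
}
```

The RDF returned for one URI typically mentions further URIs, which can be looked up in the same way; this is how the fourth principle makes related data discoverable.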

61 Linked Data

62 Linking Open Data Project
• Is a W3C SWEO project
• Aims to make data freely available to everyone
• Aims to publish open data sets as RDF and set semantic relationships between them
  • Serves information in a machine-readable format
  • Enriches content
  • Reduces duplication
• The number of linked datasets is increasing rapidly
  • A large number of datasets are linked already

63 Linked Datasets As of October 2008

64 Linked Datasets As of September 2010

65 2011

66 Access Data In The Cloud
• Follow the RDF links representing the "things"
• SPARQL endpoints
• Ready-to-use software to discover linked data (see the next slide)

67 Linked Data Applications
• Lots of applications on top of linked data
  • Tabulator
  • Marbles
  • OpenLink RDF Browser
  • …
• Just google
  • RDF crawlers
  • RDF browsers
• Also see the following link, which lists a number of linked data applications:
  • http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/Applications

68 Available SPARQL Endpoints
• http://dbpedia.org/sparql
• http://www4.wiwiss.fu-berlin.de/dblp/
• To find SPARQL endpoints that provide a certain URI, see
  • http://void.rkbexplorer.com/endpoint-search/
• See also a list of live SPARQL endpoints
  • http://www.w3.org/wiki/SparqlEndpoints

69 References
• http://www.w3.org/TR/rdf-sparql-query
• http://jena.sourceforge.net/tutorial/RDF_API/index.html
• http://www.slideshare.net/ldodds/sparql-tutorial
• http://www.slideshare.net/shamod/a-hands-on-overview-of-the-semantic-web?src=related_normal&rel=1702851
• http://www.cambridgesemantics.com/2008/09/sparql-by-example
• http://linkeddata-specs.info/
• http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
• http://www.bioontology.org/wiki/images/6/6a/Triple_Stores.pdf
• Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001
• Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002
• Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds: Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB'03, The First International Workshop on Semantic Web and Databases
• http://jena.sourceforge.net/DB/index.html
• http://virtuoso.openlinksw.com/

