
RDF Databases By: Chris Halaschek. Outline Motivation / Requirements Storage Issues Sesame General Introduction Architecture Scalability RQL Introduction.


1 RDF Databases By: Chris Halaschek

2 Outline Motivation / Requirements Storage Issues Sesame General Introduction Architecture Scalability RQL Introduction Demo Future Directions

3 Motivation Having metadata available is not enough Need tools to process, transform, and reason with the information Need a way to store the metadata and interact with it

4 Requirements Scalable Good performance Useful query language

5 Storage Issues How to store the data? In relational database as tables Querying requires many joins…costly Triples Native graph structure Querying requires graph traversals…need efficient algorithms
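
The join cost mentioned on this slide can be illustrated with a minimal sketch, assuming a single three-column triples table. The table layout and the museum-style sample data are assumptions made for illustration, not Sesame's actual schema. Each extra hop in a query path adds one more self-join.

```python
import sqlite3

# Hypothetical museum-style triples; names are illustrative only.
TRIPLES = [
    ("picasso", "paints", "guernica"),
    ("guernica", "technique", "oil on canvas"),
    ("rembrandt", "paints", "nightwatch"),
    ("nightwatch", "technique", "oil on canvas"),
]


def build_store(triples):
    """Create an in-memory database holding one row per RDF statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return conn


def painters_and_techniques(conn):
    """Follow paints -> technique; the two-hop path needs one self-join."""
    query = """
        SELECT p.subject, t.object
        FROM triples AS p
        JOIN triples AS t ON t.subject = p.object
        WHERE p.predicate = 'paints' AND t.predicate = 'technique'
        ORDER BY p.subject
    """
    return conn.execute(query).fetchall()


conn = build_store(TRIPLES)
rows = painters_and_techniques(conn)
```

A path of n properties would need n - 1 self-joins on the same table, which is where the cost comes from.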

6 Sesame - Introduction Open source RDF Schema-based repository and querying facility Developed as a research prototype by Aidministrator Nederland bv NLnet Foundation sponsors its further development as open source software

7 Sesame - Introduction Can handle RDF data in XML-serialized RDF and N-Triples format Can extract the contents of a Sesame repository in XML-serialized RDF, N-Triples, and N3 format

8 Sesame – Architecture

9 Repository Many options due to Repository Abstraction Layer (RAL) DBMS – relational, object-relational, etc Existing RDF stores RDF files RDF network services

10 Repository Abstraction Layer (RAL) Interface that translates RDF-specific methods to a specific DBMS Defined by an RDF API Created their own set of interfaces rather than adopt or extend the existing RDF API proposal Existing API targeted main memory model Theirs offers specific operations that support RDF Schema semantics (i.e. subsumption reasoning)

11 RAL Continued Several of Sesame’s functional modules are clients of the RAL Problems: Must read from repository – performance decrease Solution – selectively caching data in memory For small repositories, all data can be cached
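
The caching idea can be sketched as a wrapper that sits between client modules and a slow backend. Every class and method name below is hypothetical; this is not Sesame's RAL interface, only an illustration of selective in-memory caching behind a storage abstraction.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Hypothetical stand-in for a storage backend behind an abstraction layer."""

    @abstractmethod
    def statements_about(self, subject):
        """Return all (subject, predicate, object) triples for a subject."""


class SlowRepository(Repository):
    """Pretend DBMS backend that counts how often it is actually read."""

    def __init__(self, triples):
        self._triples = list(triples)
        self.reads = 0

    def statements_about(self, subject):
        self.reads += 1
        return [t for t in self._triples if t[0] == subject]


class CachingRepository(Repository):
    """Keeps recent lookups in memory, evicting the oldest entry when full."""

    def __init__(self, backend, max_entries=1000):
        self._backend = backend
        self._max_entries = max_entries
        self._cache = {}

    def statements_about(self, subject):
        if subject not in self._cache:
            if len(self._cache) >= self._max_entries:
                # Dicts preserve insertion order, so this drops the oldest key.
                self._cache.pop(next(iter(self._cache)))
            self._cache[subject] = self._backend.statements_about(subject)
        return self._cache[subject]


backend = SlowRepository([("picasso", "paints", "guernica")])
cached = CachingRepository(backend)
first = cached.statements_about("picasso")
second = cached.statements_about("picasso")  # served from memory
```

With a `max_entries` larger than the number of distinct subjects, the whole repository ends up cached, which matches the small-repository case on the slide.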

12 Functional Modules Interact with RAL RQL query module Evaluates RQL queries RDF administration module Allows uploading RDF data and schema information, as well as deleting information RDF export module Allows extraction of schema and/or data from repository

13 RQL Query Module Proposed RQL: Developed within the European IST project C-Web Follow-up project by ICS at FORTH, in Greece Adopts the syntax of OQL Sesame’s implementation of RQL is slightly different from the proposed RQL Better compliance with W3C specifications Support for optional domain and range restrictions Queries are translated into sets of calls to the RAL Note: Also supports RDQL – based on SquishQL

14 RQL Query Module

15 Admin Module Main functions: Add RDF data/schema information Clear repository Retrieves information from an RDF(S) source and parses it using SiRPAC RDF parser Parser delivers information to admin module in statement form – (S,P,O) Module checks statements for consistency and then inserts data
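
The check-then-insert flow can be sketched as follows. The slide does not say what the consistency check involves, so the rule used here (the predicate must be a declared property) is purely an assumption, and all names are hypothetical.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One parsed RDF statement in (S, P, O) form."""
    subject: str
    predicate: str
    obj: str


def is_consistent(stmt, known_properties):
    # Assumed rule for illustration: the predicate must be a declared property.
    return stmt.predicate in known_properties


def add_statements(repository, statements, known_properties):
    """Insert consistent statements and return the ones that were rejected."""
    rejected = []
    for stmt in statements:
        if is_consistent(stmt, known_properties):
            repository.add(stmt)
        else:
            rejected.append(stmt)
    return rejected


repository = set()
incoming = [
    Statement("picasso", "paints", "guernica"),
    Statement("picasso", "pants", "guernica"),  # misspelled predicate
]
rejected = add_statements(repository, incoming, {"paints", "technique"})
```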

16 RDF Export Module Exports the contents of a repository formatted in XML-serialized RDF Supplies a basis for using Sesame in combination with other RDF tools

17 Communication with Sesame Multiple options for various contexts HTTP RMI SOAP Intermediaries between the functional modules and their clients

18 Sesame – Architecture

19 Sesame - Scalability Performance Tests Uploaded and queried collection of nouns from WordNet – 400,000 RDF statements Performed on Sun UltraSPARC 5, 256 MB RAM Used Java Servlets running on web server to communicate over HTTP PostgreSQL version 7.1.2 repository

20 Scalability Continued Uploading nouns 94 minutes 71 statements per second Querying was much slower than expected Due to distributed storage over multiple tables Retrieving data required doing many joins
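
The upload rate follows from the two figures reported in the test, 400,000 statements in 94 minutes. A quick check of the arithmetic:

```python
STATEMENTS = 400_000      # size of the WordNet noun collection
UPLOAD_MINUTES = 94       # reported upload time

# Convert minutes to seconds, then divide to get statements per second.
rate = STATEMENTS / (UPLOAD_MINUTES * 60)
print(f"{rate:.1f} statements per second")
```

The result rounds to the 71 statements per second given on the slide.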

21 Sesame’s Future Migration of Sesame to alternate repositories to boost performance DAML + OIL support

22 RQL Introduction Museum schema example

23 RQL - Syntax Query typically built upon three clauses Select Projection over query results From Bind variables to specific locations in graph model Where Optional – constraint on values of variables in the from clause

24 RQL - Example select X, @P from {X} @P {Y} where Y like "Pablo" X and Y are bound to nodes @P is bound to a connecting edge @ prefix signifies the variable is bound to properties $ prefix signifies classes http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum
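
What this query computes can be imitated over an in-memory list of triples. This is only a sketch of the binding behaviour, not an RQL engine: the sample data is invented, and `fnmatchcase` is used as a stand-in for `like` on the assumption that `like` does wildcard-style string matching.

```python
from fnmatch import fnmatchcase

# Hypothetical data; node and property names are illustrative only.
TRIPLES = [
    ("picasso", "first_name", "Pablo"),
    ("picasso", "last_name", "Picasso"),
    ("rodin", "first_name", "Auguste"),
]


def select_x_p(triples, pattern):
    """Mimic: select X, @P from {X} @P {Y} where Y like pattern."""
    return [
        (x, p)                      # select clause: project X and @P
        for (x, p, y) in triples    # from clause: bind X, @P, Y to each edge
        if fnmatchcase(y, pattern)  # where clause: constrain Y
    ]


result = select_x_p(TRIPLES, "Pablo")
```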

25 RQL - Namespaces In RDF, nodes and edges are identified by URIs Can be very long Namespace abbreviation mechanism Extra clause using namespace cult = http://www.icom.com/schema.rdf# Simply type: cult:paints
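
The abbreviation amounts to a prefix substitution. A minimal sketch, where `expand` is a hypothetical helper and not part of any Sesame API:

```python
def expand(qname, namespaces):
    """Expand an abbreviated name such as cult:paints into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in namespaces:
        # Not an abbreviation we know about; return the name unchanged.
        return qname
    return namespaces[prefix] + local


NAMESPACES = {"cult": "http://www.icom.com/schema.rdf#"}
full = expand("cult:paints", NAMESPACES)
```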

26 RQL – Path Expressions Specify a linear path through the graph select PAINTER, PAINTING, TECH from {PAINTER} cult:paints {PAINTING}. cult:technique {TECH} using namespace cult = http://www.icom.com/schema.rdf# http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum
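
A linear path expression can be read as repeated edge following. The sketch below walks a list of properties over in-memory triples; the data and the `follow_path` helper are assumptions for illustration, not how Sesame evaluates RQL.

```python
# Hypothetical data using unprefixed property names for brevity.
TRIPLES = [
    ("picasso", "paints", "guernica"),
    ("guernica", "technique", "oil on canvas"),
    ("rembrandt", "paints", "nightwatch"),
    ("nightwatch", "technique", "oil on canvas"),
]


def follow_path(triples, predicates):
    """Return every node sequence that follows the given properties in order."""
    # Index objects by (subject, predicate) so each hop is a dictionary lookup.
    index = {}
    for s, p, o in triples:
        index.setdefault((s, p), []).append(o)

    # Start from every subject that has the first property.
    starts = sorted({s for s, p, _ in triples if p == predicates[0]})
    paths = [(s,) for s in starts]

    # Extend each partial path by one edge per property.
    for pred in predicates:
        paths = [
            path + (o,)
            for path in paths
            for o in index.get((path[-1], pred), [])
        ]
    return paths


paths = follow_path(TRIPLES, ["paints", "technique"])
```

Each resulting tuple corresponds to one (PAINTER, PAINTING, TECH) row of the query.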

27 RQL – Querying Schema Retrieving the class of a resource select X, $X, Y from {X : $X} cult:paints {Y} using namespace cult = http://www.icom.com/schema.rdf# Variable $X is matched to the class of the resource value of X http://sesame.aidministrator.nl/sesame/actionFrameset.jsp?repository=museum

28 RQL – Querying Schema Constraining resources to a schema select X, Y from {X : cult:Cubist } cult:paints {Y} using namespace cult = http://www.icom.com/schema.rdf#

29 RQL – Standard Functions Class (also Property) subClassOf (also subPropertyOf) typeOf In all of the above, use ^ for only direct descendants (e.g. subClassOf^( cult:Painter ) )

30 RQL – subClassOf Example: select X, @P, Y from {X} @P {Y} where X in subClassOf^( cult:Painter ) using namespace cult = http://www.icom.com/schema.rdf#
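
The difference between subClassOf and subClassOf^ is the difference between all descendants and direct children only. A small sketch over a hypothetical class hierarchy, assuming single inheritance and no cycles:

```python
# child -> parent; an invented hierarchy for illustration only.
SUBCLASS_OF = {
    "Painter": "Artist",
    "Sculptor": "Artist",
    "Cubist": "Painter",
    "Flemish": "Painter",
    "AnalyticCubist": "Cubist",
}


def direct_subclasses(cls):
    """Analogue of subClassOf^(cls): immediate children only."""
    return sorted(c for c, parent in SUBCLASS_OF.items() if parent == cls)


def all_subclasses(cls):
    """Analogue of subClassOf(cls): children, grandchildren, and so on."""
    found = []
    stack = direct_subclasses(cls)
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(direct_subclasses(current))
    return sorted(found)


direct = direct_subclasses("Painter")
transitive = all_subclasses("Painter")
```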

31 RQL – Advanced Queries Set Operators Union, Intersection, Difference Logical Operators Domain and Range Constraints Comprehensive List: http://sesame.aidministrator.nl/publications/rql-tutorial.html

32 Future of RDF Databases Standard query language Improved storage structures Native graph model

33 References / Links Sesame: http://sesame.aidministrator.nl/ NLnet Foundation: http://www.nlnet.nl/ Original Specifications of RQL: http://139.91.183.30:9090/RDF/RQL

