RDF
David R Newman
drn05r@ecs.soton.ac.uk
15 July 2009
Overview
- RDF Generation
- Ontology Design
- SPARQL Endpoint
- OAI-ORE Export & Research Objects
- Linked Data
- Why Bother?
RDF Generation - Issues
- Don't want to expose all database data
  - Irrelevant: plugin schema and OAuth tables
  - Secret: crypted password and salt, reset password code, remember tokens, OAuth tables
  - Private: contributions, open_id, registered email account
- Rationalisation of data
  - Link up data: parts of the same object sit in different SQL tables; multiple SQL fields become a single RDF property
  - Reuse data properties: map SQL fields to pre-defined RDF properties
  - Greater / more logical abstraction: the database has evolved and is strongly tied to the web interface. The logical model is different and can, and should, be more consistent and structured.
RDF Generation - Implementation
- Bespoke SQL queries: join up tables so that all properties of an object can be generated. Restrict the records selected depending on domain (private / public / protected).
- Simple mapping language: map a field directly to an RDF property, map multiple fields to a compound property, or use a function to munge the data for formatting or to represent it more logically, e.g. working out who is the requester and who is the accepter of a Friendship/Membership.
- URIs: http://rdf.myexperiment.org/<type>/<id>
  - Resolves to an RDF/XML representation
  - Open question: are these URIs right? Should we use content resolution and the same URIs as the web interface?
- Protecting private data
  - 401 authorization required
  - Authorization credentials stored in the session
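
The field-to-property mapping described above could look roughly like the sketch below. It is only an illustration, not the actual myExperiment mapping language: the column names (`title`, `description`, `user_id`), the `GROUP_MAPPING` table and both helper functions are assumptions made up for this example. Only the URI pattern and the RDF property names come from the slides.

```python
from xml.sax.saxutils import escape, quoteattr

BASE = "http://rdf.myexperiment.org"

# Hypothetical mapping table: SQL column -> (RDF property, kind).
# kind is "literal" for plain values, or the RDF type name of the
# entity that a foreign-key column points at.
GROUP_MAPPING = {
    "title": ("sioc:name", "literal"),
    "description": ("dcterms:description", "literal"),
    "user_id": ("sioc:has_owner", "User"),
}


def entity_uri(rdf_type, entity_id):
    """Build a URI following the <type>/<id> pattern from the slide."""
    return f"{BASE}/{rdf_type}/{entity_id}"


def row_to_rdfxml(rdf_class, rdf_type, row, mapping):
    """Render one SQL row (a dict) as an RDF/XML fragment."""
    about = quoteattr(entity_uri(rdf_type, row["id"]))
    lines = [f"<{rdf_class} rdf:about={about}>"]
    for column, (prop, kind) in mapping.items():
        value = row.get(column)
        if value is None:
            continue  # unmapped or NULL columns are simply not exposed
        if kind == "literal":
            lines.append(f"  <{prop}>{escape(str(value))}</{prop}>")
        else:
            target = quoteattr(entity_uri(kind, value))
            lines.append(f"  <{prop} rdf:resource={target}/>")
    lines.append(f"</{rdf_class}>")
    return "\n".join(lines)
```

Because only columns listed in the mapping are ever emitted, secret or irrelevant fields stay out of the RDF by default, which matches the "don't expose all database data" concern.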
RDF Example

Example of a Group in RDF/XML:

    <mebase:Group rdf:about="http://rdf.myexperiment.org/Group/9">
      <mebase:human-start-page rdf:resource="http://www.myexperiment.org/groups/9"/>
      <sioc:has_owner rdf:resource="http://rdf.myexperiment.org/User/70"/>
      <dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2007-07-23T17:02:58Z</dcterms:created>
      <dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2008-09-17T09:53:02Z</dcterms:modified>
      <sioc:name rdf:datatype="http://www.w3.org/2001/XMLSchema#string">myExperiment</sioc:name>
      <dcterms:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">&lt;p&gt;This is the official group for the myExperiment team&lt;/p&gt;</dcterms:description>
      <mebase:auto-accept rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">0</mebase:auto-accept>
    </mebase:Group>

Notes:
- Had to map networks in SQL to Group in RDF
- The RDF URI domain is added to user_id to give the group owner
- Dates needed to be reformatted
- The description needs to be XML entity escaped to produce valid XML
- Datatypes are specified: pulled in automatically from the ontology. External properties are defined manually.
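
The date reformatting and entity escaping mentioned in the notes can be sketched as follows. This assumes the SQL timestamps are stored in UTC as `YYYY-MM-DD HH:MM:SS`, which the slides do not state; the helper names are invented for the illustration.

```python
from datetime import datetime
from xml.sax.saxutils import escape

XSD = "http://www.w3.org/2001/XMLSchema#"


def sql_to_xsd_datetime(sql_timestamp):
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to an xsd:dateTime string."""
    parsed = datetime.strptime(sql_timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


def typed_literal(prop, value, xsd_type):
    """Emit a property element with an explicit datatype and escaped content."""
    return f'<{prop} rdf:datatype="{XSD}{xsd_type}">{escape(value)}</{prop}>'


created = typed_literal(
    "dcterms:created", sql_to_xsd_datetime("2007-07-23 17:02:58"), "dateTime"
)
description = typed_literal(
    "dcterms:description",
    "<p>This is the official group for the myExperiment team</p>",
    "string",
)
```

Escaping the description keeps the embedded HTML as literal text, so the surrounding RDF/XML stays well-formed.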
Ontology Design
- http://rdf.myexperiment.org/ontologies/
- Reuses properties and extends classes from Dublin Core, SIOC, FOAF, OAI-ORE and Creative Commons
- Modularized
  - SNARM: policies/permissions
  - Base: Users/Groups, abstract Contribution/Annotation class, Friendships, Memberships, Invitations and Messages
  - Annotations and Contributions: the specific types
  - Viewings & Downloads, Attribution & Creditation, Packs and Experiments: fairly obvious what they are
  - Components: for Workflows. Designed for Taverna but quite generic
  - Specific: imports all the modules and pulls things together, as well as providing classes/instances specific to myExperiment
SPARQL Endpoint
- http://rdf.myexperiment.org/sparql
- Public RDF/XML imported into a Jena triplestore
- PHP hooks provide the web interface
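
As an illustration of the kind of query the endpoint is meant to answer, the sketch below builds a request URL for a query listing groups with their names and owners. Several things here are assumptions rather than facts from the slides: that the PHP front end accepts the query in a `query` parameter (as in the standard SPARQL Protocol), and that the `mebase` prefix resolves to the namespace URI shown. The code only constructs the URL; it does not contact the server.

```python
from urllib.parse import urlencode

ENDPOINT = "http://rdf.myexperiment.org/sparql"

# The mebase namespace URI is an assumption; check the published ontology.
QUERY = """\
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX mebase: <http://rdf.myexperiment.org/ontologies/base/>
SELECT ?group ?name ?owner
WHERE {
  ?group a mebase:Group ;
         sioc:name ?name ;
         sioc:has_owner ?owner .
}
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the SPARQL query as a 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_query_url(ENDPOINT, QUERY)
```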
OAI-ORE Export & Research Objects
- For Packs and Experiments
  - http://rdf.myexperiment.org/Aggregation/Pack/1
- 303 redirects dependent on the HTTP Accept value
  - application/rdf+xml -> RDF resource map
  - application/atom+xml -> Atom entry
  - text/html -> splash page
- Atom feeds
  - http://rdf.myexperiment.org/AtomFeed/Packs
  - http://rdf.myexperiment.org/AtomFeed/Experiments
  - http://rdf.myexperiment.org/AtomFeed/Experiment/12
- Build on OAI-ORE export for Research Objects: it has helped to inform the design of ROs
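
The Accept-dependent 303 redirect can be sketched as a small content negotiation routine. This is a simplified illustration, not the deployed logic: it handles q-values but ignores wildcards, and falling back to the splash page when nothing matches is an assumption. Since the slides do not give the redirect target URLs, the function returns the name of the chosen representation instead.

```python
# Media types and representations as listed on the slide.
REPRESENTATIONS = {
    "application/rdf+xml": "RDF resource map",
    "application/atom+xml": "Atom entry",
    "text/html": "splash page",
}


def parse_accept(header):
    """Return (media_type, q) pairs, highest q first, ties in header order."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def redirect_for(accept_header, default="text/html"):
    """Return (status, representation) for an aggregation URI request."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in REPRESENTATIONS:
            return 303, REPRESENTATIONS[media_type]
    return 303, REPRESENTATIONS[default]  # assumed fallback: splash page
```

For example, a browser sending `text/html` ends up at the splash page, while a feed reader sending `application/atom+xml` is redirected to the Atom entry for the same Pack.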
Linked Data
- Hash URIs or 303 redirects
  - 303 Redirects from the Mothership
  - 303 redirects are designed for content resolution
  - Current design is more suited to 303 redirects
- RDF “links”
  - Need to be linked to/from a project that is already part of the Linked Data Project (e.g. RKBExplorer, EPrints, etc.)
  - Minimum 100 links required
Why Bother?
- RDF is a standard format: other people are using it and there are lots of tools out there to consume it
- Generic rather than structured: allows easy adaptation when new data needs to be represented
- SPARQL allows flexible queries: a querying interface that gives users great flexibility in what they can query
- Align terminology with other projects: makes data interchange easier
- Inference: get more for less