Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Reasoning and Querying for the Web Dr. Axel Polleres
Problem: The Web is not(!) a database. "If HTML and the Web made all the online documents look like one huge book, RDF, schema and inference languages will make all the data in the world look like one huge database." Tim Berners-Lee, Weaving the Web, 1999. Database? Not quite! Social interactions: aggregated view vs. privacy concerns. Dynamics: historical data changes, trends. Sensors: limited resources, distribution, incomplete data. Heterogeneity: RDBMS, health records, digital libraries, CMS, blogs… And by the way: the WWW, 10 billion+ pages and going up. Still, we want an integrated view!
For today, let’s mainly stick with the first box: 0) Where is the data? 1) Formal languages & how they interplay. 2) Reasoning with Web data. 3) Querying Web data.
0) Prerequisite: good news! Thanks to UDI2, UIMR, USCS, UNLP, USM etc., we may assume that loads of RDF/OWL data are already available: crawled from the Web and the Linked Data cloud; from the Semantic Desktop; mined/aggregated from sensors; extracted from text by shallow NLP techniques; etc. Cool, now how shall we aggregate it?
1) Languages & how they interplay. Challenges: querying XML & RDF (XQuery & SPARQL); ontologies & rules (OWL2 & RIF); querying ontologies & rules (SPARQL/OWL+RIF).
XSPARQL: Bringing XML and RDF closer! What if I want to translate all my RDF and OWL data back to XML? E.g., display FOAF data on a KML map. What to use? A custom script? XSLT? SPARQL?
XSPARQL: Close the gap in the Web of Data. [Figure labels: SOAP/WSDL, RSS, HTML, SPARQL, XSLT/XQuery, XSPARQL]
Example: Generate KML from FOAF+Geo. XSPARQL plus FOAF and Geo data yields KML map data. [Figure labels: foaf:name “Stefan Decker”, foaf:locatedIn, geo:lat, geo:lat] More: { for $person $name $long $lat from where { $person a foaf:Person; foaf:name $name; foaf:based_near [ geo:long $long; geo:lat $lat ] } return {fn:concat("Location of ", $name)} {fn:concat($long, ",", $lat, ",0")} }
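The transformation on this slide can be approximated outside XSPARQL with a minimal Python sketch. It assumes the WHERE pattern has already been evaluated into (name, long, lat) rows; the sample coordinates, the function name `to_kml`, and the KML element names (`Placemark`, `name`, `Point`, `coordinates`) are assumptions, since the slide's XML markup did not survive. Only the two concatenated strings are taken from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical result rows for the WHERE pattern: (name, long, lat).
# The coordinates are made-up illustration values.
rows = [("Stefan Decker", "-9.05", "53.27")]

def to_kml(rows):
    """Build a KML document with one Placemark per (name, long, lat) row."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    for name, lon, lat in rows:
        placemark = ET.SubElement(kml, "Placemark")
        # Same strings the slide builds with fn:concat(...)
        ET.SubElement(placemark, "name").text = "Location of " + name
        point = ET.SubElement(placemark, "Point")
        ET.SubElement(point, "coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")

print(to_kml(rows))
```

XSPARQL expresses the same thing in one query, without a separate scripting step between the SPARQL result and the XML output.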
XSPARQL “crew”: Nuno Lopes, Stefan Bischof. Alumni: Waseem Akhtar, Thomas Krennwallner.
Reasoning needed! SELECT ?N WHERE { ?P foaf:name ?N } Returns: “Stefan Decker”. [Graph: a node with foaf:name “Stefan Decker”]
Reasoning needed! SELECT ?N WHERE { ?P a foaf:Person; foaf:name ?N } Returns: nothing. Why? The graph contains no triple stating that Stefan is a foaf:Person. RDFS reasoning needed! Together with foaf:knows rdfs:domain foaf:Person, we could infer that Stefan is a person… Too bad, standard SPARQL (and XSPARQL) engines don’t do this! You may think this is easy: just add the inferred triples. But… [Graph: Stefan with foaf:name “Stefan Decker” and foaf:knows axel]
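A minimal sketch of the inference this slide asks for, assuming a toy in-memory triple set. The identifiers `stefan` and `axel` and both function names are illustrative assumptions; the rule is the RDFS domain rule (rdfs2), applied in a single pass rather than to a fixpoint.

```python
# Triples as (subject, predicate, object); names are shortened for illustration.
graph = {
    ("stefan", "foaf:name", "Stefan Decker"),
    ("stefan", "foaf:knows", "axel"),
    ("foaf:knows", "rdfs:domain", "foaf:Person"),
}

def apply_domain_rule(triples):
    """RDFS rule rdfs2: (p rdfs:domain c) and (s p o) entail (s rdf:type c)."""
    domains = [(s, o) for s, p, o in triples if p == "rdfs:domain"]
    inferred = {
        (s, "rdf:type", c)
        for prop, c in domains
        for s, p, _ in triples
        if p == prop
    }
    return triples | inferred

def names_of_persons(triples):
    """Mimics: SELECT ?N WHERE { ?P a foaf:Person; foaf:name ?N }"""
    persons = {s for s, p, o in triples if p == "rdf:type" and o == "foaf:Person"}
    return sorted(o for s, p, o in triples if p == "foaf:name" and s in persons)

print(names_of_persons(graph))                     # without inference: []
print(names_of_persons(apply_domain_rule(graph)))  # with rdfs2: ['Stefan Decker']
```

Materialising inferred triples like this works for the simple RDFS case; the following slides show why it does not carry over directly to OWL.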
Reasoning and SPARQL is tricky! Let’s assume we know: every person who is known has a name. Cool, we can write this in OWL! Cool, let’s do SPARQL over it! Let’s ask for the names of all known persons: SELECT ?N { [] foaf:knows ?P. ?P foaf:name ?N } Shouldn’t have an answer, or should it? [Graph: Stefan with foaf:name “Stefan Decker” and foaf:knows axel]
Reasoning and SPARQL is tricky! Let’s assume we know: every person who is known has a name. Cool, we can write this in OWL! Now how about this one: is there a known person with a name? ASK { [] foaf:knows ?P. ?P foaf:name ?N } Hmmm, should have an answer, or shouldn’t it? [Graph: Stefan with foaf:name “Stefan Decker” and foaf:knows axel]
RDF & OWL don’t solve heterogeneity. The example assumed we had Stefan’s data in FOAF, but in my desktop application I might have, e.g., vCard instead. This needs more than OWL: rules, alignments! E.g.: { ?P foaf:name ?N } :- { ?P vCard:Given ?G; vCard:Family ?F . ?N = fn:concat(?G, " ", ?F) } [Graph: foaf:name “Stefan Decker” vs. vCard:Given “Stefan”, vCard:Family “Decker”]
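The alignment rule on this slide can be sketched over a toy triple set as follows. The identifier `stefan` and the function name `vcard_to_foaf` are illustrative assumptions, and the sketch assumes at most one given and one family name per subject.

```python
# Desktop data in vCard vocabulary, as (subject, predicate, object) triples.
graph = {
    ("stefan", "vCard:Given", "Stefan"),
    ("stefan", "vCard:Family", "Decker"),
}

def vcard_to_foaf(triples):
    """Derive foaf:name from vCard:Given and vCard:Family, as in the slide's rule."""
    given = {s: o for s, p, o in triples if p == "vCard:Given"}
    family = {s: o for s, p, o in triples if p == "vCard:Family"}
    derived = {
        (s, "foaf:name", given[s] + " " + family[s])  # fn:concat(?G, " ", ?F)
        for s in given.keys() & family.keys()         # subjects having both parts
    }
    return triples | derived

for triple in sorted(vcard_to_foaf(graph)):
    print(triple)
```

The derived value is a new literal built by string concatenation, which is the part that OWL axioms alone cannot express and that motivates a rule language.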
Reasoning needed! The W3C SPARQL Working Group, the W3C OWL2 WG and the W3C Rule Interchange Format WG work on new standards to solve these issues … and more! DERI members involved: SPARQL: Alex Passant & myself (co-chair); OWL2: Antoine Zimmermann; RIF: myself.
2) Reasoning with Web Data. Challenges: authoritativeness; scale; completeness, consistency.
Who can say what & where on the Web?? Ontology Hijacking. Authoritative reasoning: who gets to say what. FOAF spec authoritative for foaf:Person ✓ MY spec not authoritative for foaf:Person ✘ Only allow extension in authoritative documents: my:Person rdfs:subClassOf foaf:Person. (MY spec) ✓ BUT: reduce obscure memberships: foaf:Person rdfs:subClassOf my:Person. (MY spec) ✘ ALSO: protect specifications: foaf:mbox rdf:type owl:SymmetricProperty. (MY spec) ✘ Linear scale for most rules: 1.1bn in => bn out, <10 hours. Check our ASWC paper! Tech Report! Journal version to come at IJSWIS!
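The three ✓/✘ examples above can be reproduced with a deliberately simplified check: an axiom is accepted only if its subject lies in a namespace the publishing document is authoritative for. This is a sketch of that reading of the slide, not the actual SAOR implementation; the `AUTHORITY` mapping and the function name `is_authoritative` are hypothetical.

```python
# Hypothetical mapping: the namespace prefix each document is authoritative for.
AUTHORITY = {"FOAF spec": "foaf:", "MY spec": "my:"}

def is_authoritative(document, axiom):
    """Accept an axiom only if its subject lies in the document's own namespace."""
    subject, _, _ = axiom
    return subject.startswith(AUTHORITY[document])

# The three axioms from the slide, all published in "MY spec".
axioms = [
    ("MY spec", ("my:Person", "rdfs:subClassOf", "foaf:Person")),       # extension
    ("MY spec", ("foaf:Person", "rdfs:subClassOf", "my:Person")),       # hijacking
    ("MY spec", ("foaf:mbox", "rdf:type", "owl:SymmetricProperty")),    # hijacking
]

accepted = [axiom for doc, axiom in axioms if is_authoritative(doc, axiom)]
print(accepted)
```

Under this check only the first axiom survives: MY spec may extend foaf:Person with its own subclass, but it may not redefine terms from the FOAF namespace.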
Reasoning with Web Data. Aidan Hogan: SAOR/SWSE (planned work): extend SAOR to support OWL2; datatype support; develop SPARQL(2) on top of SWSE; validation of Web data. Alumni: Andreas Harth. Antoine Zimmermann: Reasoning with Alignments. Gergely Lukacsy: Complete DL Reasoning (OWL2) at large scale.
3) Querying Web Data. Challenges: finding/querying distributed SPARQL endpoints. More ideas we are currently working on: querying distributed SPARQL endpoints; querying temporal data on the Web (e.g. wikis, ontologies).
Help the end user. Challenges: end users don’t want to write RDF; end users don’t speak OWL; end users don’t care about SW! How can we change this? Help users find the right ontologies to use. How can we avoid the publication of messy Web data in the first place?
Help the end user: Semantic Drupal. Enable data mining, text analysis, reasoning, aggregation, trend detection over different platforms.
Semantic Drupal: Populate the Web of Data Reasonably. Check prototype at: Main developer: Stéphane Corlosquet. Also check our TR, which explains the reasoning in the background! Next steps: not only export RDF from Drupal, but also allow Drupal sites to consume Linked Data and act as SPARQL endpoints; Ontology Term Search Engine (powered by SWSE & Sindice; Antoine, Aidan & Renaud) for easier vocab linkage; Stéphane to conclude his MSc thesis.
What I didn’t talk about: projects, collaborations; upcoming visitors (Prof. George Vouros, Prof. Piero Bonatti). Things cooking: reasoning/querying over heterogeneous data from the DERI HCLS domain (Ratnesh Sahay & Antoine Zimmermann); Temporal SPARQL; GiaBATA, a rules- and ontology-aware SPARQL engine; …
Outlook. That was some, but not ALL, of the things happening in our research unit: 1) Formal languages & how they interplay. 2) Reasoning with Web data. 3) Querying Web data. If you have questions on SW standards, or about reasoning & querying techniques on RDF/OWL/RIF, you are welcome to ask us! Let’s deploy the Semantic Web reasonably!