Relational Databases to RDF (a.k.a. RDB2RDF)
Juan F. Sequeda
Dept. of Computer Science, University of Texas at Austin
I want RDF… but my data is in RDB!
Why RDB2RDF?
Semantic Web
  – Deep Web is 500 times bigger than Static Web (2008)
  – Where do you think that the majority of the data is stored?
  – If we want a Semantic Web, we need data to be on the web as RDF and interlinked! Where do you think this data is going to come from?
RDB
RDB2RDF
Why RDB2RDF?
Data Integration
  – Do you know why RDF is cool? Because it’s a graph!
  – How do you link/integrate two different graphs? Add edges between nodes or merge nodes!
Real world scenario
Boss: Find me clients that are based in cities that have a population of less than 1 million.
You: ???

Clients
id | Name     | c_id
10 | ACME Inc | 20
11 | Foo Bars | 21

Locations
c_id | city   | state
20   | Austin | TX
21   | Dallas | TX
Real world scenario
You: I found the population information… but it’s in a different database. Can you add a column to the Location table in order to insert the new data?
DBA: NO!

Location (in the other database)
id | city   | state | pop
1  | Austin | TX    |
   | Dallas | TX    |
[Figure: the Clients/Locations tables and the separate Location table drawn as two RDF graphs. client10 and client11 have rdf:type ex:Client, ex:name (ACME Inc, Foo Bars) and ex:basedIn; the city nodes (Austin, TX and Dallas, TX) carry ex:city and ex:state, and in the second graph also ex:pop.]
[Figure: the same data as a single graph in which the Austin, TX and Dallas, TX nodes appear once, with ex:city, ex:state and ex:pop, and are reached from the clients through ex:basedIn.]
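The integration step in this scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names (`ex:client10`, `ex:Austin_TX`, ...) are hypothetical, and the population numbers are invented placeholders, because the slides do not show the actual values. The point is that when both graphs use the same node for a city, merging them is a plain set union, and the boss's question becomes answerable.

```python
# Graph 1: triples derived from the Clients and Locations tables.
# Node names such as "ex:client10" and "ex:Austin_TX" are hypothetical.
clients_graph = {
    ("ex:client10", "rdf:type", "ex:Client"),
    ("ex:client10", "ex:name", "ACME Inc"),
    ("ex:client10", "ex:basedIn", "ex:Austin_TX"),
    ("ex:client11", "rdf:type", "ex:Client"),
    ("ex:client11", "ex:name", "Foo Bars"),
    ("ex:client11", "ex:basedIn", "ex:Dallas_TX"),
    ("ex:Austin_TX", "ex:city", "Austin"),
    ("ex:Austin_TX", "ex:state", "TX"),
    ("ex:Dallas_TX", "ex:city", "Dallas"),
    ("ex:Dallas_TX", "ex:state", "TX"),
}

# Graph 2: triples derived from the other database.
# PLACEHOLDER populations, invented for this sketch; not real figures.
population_graph = {
    ("ex:Austin_TX", "ex:pop", 900_000),
    ("ex:Dallas_TX", "ex:pop", 1_200_000),
}

# Merging two graphs that share node names is just a set union.
merged = clients_graph | population_graph


def clients_in_small_cities(graph, limit):
    """Names of clients based in a city whose ex:pop is below `limit`."""
    small = {s for (s, p, o) in graph if p == "ex:pop" and o < limit}
    based = {s for (s, p, o) in graph if p == "ex:basedIn" and o in small}
    return sorted(o for (s, p, o) in graph if p == "ex:name" and s in based)


print(clients_in_small_cities(clients_graph, 1_000_000))  # [] (no ex:pop edges yet)
print(clients_in_small_cities(merged, 1_000_000))         # ['ACME Inc'] with the placeholder values
```

No schema change was needed on either side: the DBA's tables stay as they are, and the extra edges live in the merged graph.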
A bit of history
Relational Databases on the Web. TimBL, 1998
W3C Workshop on RDF Access to Relational Databases, October 2007
  – Report: W3C RDB2RDF Incubator Group
  – Survey: Report.pdf
W3C RDB2RDF Working Group, 2009 – today
  – R2RML: RDB to RDF Mapping Language
  – A Direct Mapping of Relational Data to RDF
RDB and the Semantic Web
RELATIONAL MODEL | RDF
TABLE DEFINITION | RDFS
CONSTRAINTS      | OWL
TRIGGERS         | RIF
Overview
R2RML: RDB to RDF Mapping Language
Language for expressing customized mappings from relational databases to RDF datasets
Give precise control to the developer
  – You create the structure you want
  – You choose the target vocabulary
No RDFS/OWL is created from the schema
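To make "you create the structure you want, you choose the target vocabulary" concrete, here is a rough Python analogue of a customized mapping. It is not R2RML syntax (real R2RML mappings are themselves written as RDF) and not any tool's API: the dictionary keys, the `apply_mapping` helper, and the `http://example.com/...` IRIs are all assumptions made for this sketch.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping description, loosely analogous to one R2RML triples map:
# the developer chooses the subject IRI shape, the class, and the predicates.
CLIENT_MAPPING = {
    "query": "SELECT id, Name, c_id FROM Clients",
    "subject_template": "http://example.com/client{id}",
    "class": "http://example.com/ns#Client",
    "predicate_object_maps": [
        # (predicate IRI, source column, optional object IRI template)
        ("http://example.com/ns#name", "Name", None),
        ("http://example.com/ns#basedIn", "c_id", "http://example.com/location{c_id}"),
    ],
}


def apply_mapping(conn, mapping):
    """Run the mapping's query and emit (subject, predicate, object) triples."""
    triples = []
    for row in conn.execute(mapping["query"]):
        values = dict(row)  # column name -> value
        subject = mapping["subject_template"].format(**values)
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for predicate, column, template in mapping["predicate_object_maps"]:
            if values[column] is None:
                continue  # a NULL column produces no triple
            obj = template.format(**values) if template else values[column]
            triples.append((subject, predicate, obj))
    return triples


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets dict(row) see the column names
conn.execute("CREATE TABLE Clients (id INTEGER PRIMARY KEY, Name TEXT, c_id INTEGER)")
conn.executemany(
    "INSERT INTO Clients VALUES (?, ?, ?)",
    [(10, "ACME Inc", 20), (11, "Foo Bars", 21)],
)

for triple in apply_mapping(conn, CLIENT_MAPPING):
    print(triple)
```

Nothing about the output vocabulary is derived from the schema here: renaming `ns#basedIn` or reshaping the subject IRIs only requires editing the mapping, which is the kind of control this slide describes.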
[Diagram: R2RML Mapping. Labels: RDB, RDF, R2RML, manual]
Direct Mapping
Automatic transformation from Relational Database to RDF
  – Click a button… Voila!
Generate RDFS/OWL of the database schema
If this doesn’t get you where you want… use existing languages for mapping
  – RDF to RDF with RIF or SPARQL Construct (Semantic Web community)
  – Create SQL Views and directly map those (Database community)
[Diagram: Direct Mapping. Labels: RDB, Direct Mapping (automatic), RDF, RIF/SPARQL Construct, RDF, SQL Views]
Hybrid
Instead of starting from a blank R2RML file…
1) Direct Mapping
2) Manual Editing
[Diagram: Hybrid Mapping. Labels: RDB, RDF, Direct Mapping in R2RML, R2RML, Direct Mapping, Modify]
Materialize Triples
Data is not dynamic
Dump RDB into RDF and then insert into triplestore
RDF dump may not be consistent with RDB
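The consistency caveat is easy to reproduce. The sketch below materializes a table as N-Triples-style lines, then changes the database and shows that the earlier dump no longer matches. The base IRI, the `dump_locations` helper, and the extra Houston row are assumptions introduced for illustration only.

```python
import sqlite3

BASE = "http://example.com/"  # hypothetical base IRI


def dump_locations(conn):
    """Materialize the Locations table as a sorted list of N-Triples-style lines.

    Literals are written without escaping, which is fine for these sample values.
    """
    lines = []
    for c_id, city, state in conn.execute("SELECT c_id, city, state FROM Locations"):
        subject = f"<{BASE}location{c_id}>"
        lines.append(f'{subject} <{BASE}ns#city> "{city}" .')
        lines.append(f'{subject} <{BASE}ns#state> "{state}" .')
    return sorted(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Locations (c_id INTEGER PRIMARY KEY, city TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO Locations VALUES (?, ?, ?)",
    [(20, "Austin", "TX"), (21, "Dallas", "TX")],
)

# Step 1: dump once; this snapshot is what would be loaded into a triplestore.
snapshot = dump_locations(conn)

# Step 2: the relational database keeps changing (hypothetical new row).
conn.execute("INSERT INTO Locations VALUES (22, 'Houston', 'TX')")

# Step 3: the snapshot is now out of date with respect to the database.
is_stale = snapshot != dump_locations(conn)
print(is_stale)  # True
```

Materializing is therefore a reasonable fit when the data rarely changes; otherwise the dump has to be regenerated or the triples have to be computed on demand.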
[Diagram: Materialized Triples. Labels: RDB, RDF Dump, SPARQL]
Virtual Triples
Data is dynamic
Need to query RDB with SPARQL
Translate SPARQL to SQL
  – “Comparing the overall performance […] of the fastest rewriter with the fastest relational database shows an overhead for query rewriting of 106%. This is an indicator that there is still room for improving the rewriting algorithms” [Bizer and Schultz 2009]
  – “Current rdb2rdf systems are not capable of providing the query execution performance required [...] it is likely that with more work on query translation, suitable mechanisms for translating queries could be developed. These mechanisms should focus on exploiting the underlying database system’s capabilities to optimize queries and process large quantities of structure data” [Gray et al. 2009]
  – Ultrawrap solves this
RDF data is consistent with RDB data
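A toy version of "translate SPARQL to SQL" may help show where the rewriting overhead comes from. The sketch below exposes two tables as a SQL view of (s, p, o) rows and turns a basic graph pattern into one self-join over that view. It is not how Ultrawrap or any other system named in these slides is implemented, and it does none of the optimization the quotes call for; the view name `triples`, the `bgp_to_sql` helper, and the short node names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Clients (id INTEGER PRIMARY KEY, Name TEXT, c_id INTEGER);
CREATE TABLE Locations (c_id INTEGER PRIMARY KEY, city TEXT, state TEXT);
INSERT INTO Clients VALUES (10, 'ACME Inc', 20);
INSERT INTO Clients VALUES (11, 'Foo Bars', 21);
INSERT INTO Locations VALUES (20, 'Austin', 'TX');
INSERT INTO Locations VALUES (21, 'Dallas', 'TX');

-- Virtual triples: rows are exposed as (s, p, o) without copying any data.
CREATE VIEW triples AS
  SELECT 'client' || id AS s, 'ex:name' AS p, Name AS o FROM Clients
  UNION ALL
  SELECT 'client' || id, 'ex:basedIn', 'location' || c_id FROM Clients
  UNION ALL
  SELECT 'location' || c_id, 'ex:city', city FROM Locations;
""")


def bgp_to_sql(patterns, select):
    """Translate a basic graph pattern into one SQL query over the view.

    patterns: list of (s, p, o) terms; terms starting with '?' are variables.
    select:   variables to return, e.g. ['?name'].
    """
    first_seen = {}  # variable -> column where it first occurred, e.g. 't0.s'
    where, params = [], []
    for i, pattern in enumerate(patterns):
        for col, term in zip("spo", pattern):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in first_seen:
                    where.append(f"{ref} = {first_seen[term]}")  # join on shared variable
                else:
                    first_seen[term] = ref
            else:
                where.append(f"{ref} = ?")  # constant term, passed as a parameter
                params.append(term)
    columns = ", ".join(first_seen[v] for v in select)
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {columns} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


# SPARQL: SELECT ?name WHERE { ?c ex:name ?name . ?c ex:basedIn ?loc . ?loc ex:city "Austin" }
pattern = [
    ("?c", "ex:name", "?name"),
    ("?c", "ex:basedIn", "?loc"),
    ("?loc", "ex:city", "Austin"),
]
sql, params = bgp_to_sql(pattern, ["?name"])
print(sql)
print(conn.execute(sql, params).fetchall())  # [('ACME Inc',)]
```

Because the query always runs against the live tables, the answers stay consistent with the RDB. The cost is that a naive translation like this one produces a self-join over a UNION view, which the database then has to optimize.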
[Diagram: Virtual Triples. Labels: RDB, Mapping, SPARQL, RDF]
[Diagram: RDB2RDF Space. Labels: Materialized Triples, Virtual Triples, Direct Mapping, Custom Mapping, Hybrid]
Tuples to Triples

SID | NAME  | AGE
1   | Alice | 25
2   | Bob   | 26

SUBJECT | PREDICATE | OBJECT
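One way to fill in the SUBJECT / PREDICATE / OBJECT table automatically is sketched below. The IRI shapes (a row IRI built from the table name and key, one predicate per column) are loosely modeled on the W3C Direct Mapping, but the details are assumptions of this sketch: the slide does not name the table, so `Student` is hypothetical, as are the base IRI and the `direct_map` helper. Foreign keys are not handled.

```python
import sqlite3

BASE = "http://example.com/db/"  # hypothetical base IRI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_map(conn, table, pk):
    """Turn every row of `table` into triples, with no hand-written mapping."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    triples = []
    for row in cursor:
        values = dict(zip(columns, row))
        subject = f"{BASE}{table}/{pk}={values[pk]}"           # one node per tuple
        triples.append((subject, RDF_TYPE, f"{BASE}{table}"))  # the table acts as the class
        for column, value in values.items():
            if value is not None:                              # NULL yields no triple
                triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


conn = sqlite3.connect(":memory:")
# "Student" is a made-up table name; the slide only shows the columns and rows.
conn.execute("CREATE TABLE Student (SID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", [(1, "Alice", 25), (2, "Bob", 26)])

for s, p, o in direct_map(conn, "Student", "SID"):
    print(s, p, o)
```

Each tuple becomes one subject, each column becomes a predicate, and each cell value becomes an object, so the two rows above yield eight triples including the two rdf:type statements.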
Current Status of W3C RDB2RDF WG
R2RML: RDB to RDF Mapping Language
  – Working Draft
A Direct Mapping of Relational Data to RDF
  – Working Draft
Last Call: Sept 1 (hopefully)
Implementations
Ultrawrap
  – SPARQL and semantically equivalent SQL have equal execution time
  – Commercial databases
Spyder
  – Oracle and HSQLDB
Other non-standard RDB2RDF
  – D2R Server, Virtuoso, Triplify, …
Publicity
International Semantic Web Conference
  – Oct 23 – 27 in Bonn, Germany
Posters and Demos
  – August 15
Consuming Linked Data Workshop
  – August 15
Outrageous Ideas Track
  – Sept 5
Semantic Web Challenge
  – Sept 30
2nd Linked Data-a-thon
  – Oct
Join the Facebook group SSSW2011
Thank you
Acknowledgments:
- UT Austin
- W3C RDB2RDF WG members
- David McNeil
- Revelytix