SPARQL Query Graph Model (How to improve query evaluation?)
Ralf Heese and Olaf Hartig
Humboldt-Universität zu Berlin

Ralf Heese, SPARQL Query Graph Model

A Posting in a Newsgroup

Question: A series of SPARQL queries of the form:
    … WHERE { {?family ?d. ?d "Peter".} {?family ?m. ?m "Robin".} …
My queries run very slowly. Simple queries on a database of 10,000 trees describing families.

Answer: Put the more specific part of the query first; it makes a significant difference. …

Reply: … My time went from 33000 ms to 150 ms. …

Date: Mar 8,

One query, many ways to execute

Ordering 1:
    {?family ?d. ?d "Peter".}
    {?family ?m. ?m "Robin".}
    {?family ?p. ?p "Toller".}

Ordering 2:
    {?family ?m. ?m "Robin".}
    {?family ?d. ?d "Peter".}
    {?family ?p. ?p "Toller".}

Ordering 3:
    {?family ?m. ?m "Robin".}
    {?family ?p. ?p "Toller".}
    {?family ?d. ?d "Peter".}

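The slide shows three orderings; the full set can be enumerated mechanically. The following is a minimal sketch in Python (not part of the original talk) that lists every ordering of the three group patterns. The patterns are copied verbatim from the slide, including the missing predicates; the variable names `groups` and `orderings` are illustrative.

```python
from itertools import permutations

# The three group patterns from the slide, copied verbatim
# (the predicates are missing in the source and are left that way).
groups = [
    '{?family ?d. ?d "Peter".}',
    '{?family ?m. ?m "Robin".}',
    '{?family ?p. ?p "Toller".}',
]

# Each ordering is a differently written query over the same patterns.
orderings = list(permutations(groups))

for number, ordering in enumerate(orderings, start=1):
    print(number, " ".join(ordering))
```

With n group patterns there are n! orderings, so a query engine cannot rely on the user to pick a good one by hand.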
Outline
- Query processing in databases
- SPARQL query graph model (SQGM)
- Transforming SQGMs
- Evaluation
- Conclusion

Query Processing in Databases
Tasks of the query engine
- Query parsing
- Query rewriting
- QEP generation
- QEP execution

Internal representation of the query: SPARQL Query Graph Model

QEP = Query Execution Plan

SPARQL Query Graph Model (SQGM)
Advantages
- Supports all phases of query processing
- Adaptable to changes of the query language
- Extensible to new concepts of the query language
- Stores additional information needed for query processing

Basic Structures
- Directed graph
- Operation
  - Head: provided variables
  - Body: operation details
- Dataflow: connects the input and the output of two operations

Constructing an SQGM

    SELECT ?n ?c
    FROM
    WHERE { ?s rdf:type ub:GraduateStudent.
            OPTIONAL { ?s ub:takesCourse ?c. }
            ?s ub:name ?n. }

[Figure: SQGM of the query. Graph pattern operations: "?s rdf:type ub:GraduateStudent" (head ?s), "?s ub:takesCourse ?c" (head ?s ?c), "?s ub:name ?n" (head ?s ?n). Join (head ?s ?c) with an optional input, Join (head ?s ?c ?n), Select (head ?n ?c).]

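To make the basic structures concrete, here is a minimal sketch in Python of operations (head and body) connected by dataflows, followed by the example query assembled by hand. The class names, field names, and the `connect` helper are assumptions made for illustration; they are not taken from the authors' prototype.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    kind: str                  # e.g. "GraphPattern", "Join", "Select"
    head: tuple                # variables the operation provides
    body: str = ""             # operation details
    inputs: list = field(default_factory=list)  # incoming dataflows


@dataclass
class Dataflow:
    source: Operation          # operation whose output is consumed
    variables: tuple           # variables carried by this dataflow
    optional: bool = False     # True for the OPTIONAL side of a join


def connect(source, target, variables, optional=False):
    """Add a dataflow from source to target; source must provide the variables."""
    missing = set(variables) - set(source.head)
    if missing:
        raise ValueError(f"{source.kind} does not provide {sorted(missing)}")
    flow = Dataflow(source, tuple(variables), optional)
    target.inputs.append(flow)
    return flow


# The example query, built bottom-up.
type_op = Operation("GraphPattern", ("?s",), "?s rdf:type ub:GraduateStudent")
course_op = Operation("GraphPattern", ("?s", "?c"), "?s ub:takesCourse ?c")
name_op = Operation("GraphPattern", ("?s", "?n"), "?s ub:name ?n")

optional_join = Operation("Join", ("?s", "?c"))
connect(type_op, optional_join, ("?s",))
connect(course_op, optional_join, ("?s", "?c"), optional=True)

join = Operation("Join", ("?s", "?c", "?n"))
connect(optional_join, join, ("?s", "?c"))
connect(name_op, join, ("?s", "?n"))

select = Operation("Select", ("?n", "?c"))
connect(join, select, ("?n", "?c"))
```

Checking the head in `connect` is one way to catch a dataflow that asks for a variable its source never provides; whether the prototype enforces this is not stated in the slides.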
Operation Types and Dataflow Types
- Variable providing operations
- Graph providing operations
- Variable dataflows
- Graph dataflows

[Figure: SQGM of the example query, with its graph pattern, Join (one with an optional input), and Select operations and the dataflows between them.]

Transforming SQGMs
Query Rewriting

Goals
- More efficient evaluation of a query
- Provide more options for the generation of query plans, e.g.,
  - Data access strategy
  - Join order
  - Selection of indexes

Means
- Rule-based transformation, i.e., restructuring of the query, detection of redundancies and contradictions
- Heuristic = set of rules

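The idea "heuristic = set of rules" can be sketched as a small loop that applies rules until none of them changes the query any more. The Python below is a toy illustration under stated assumptions: a basic graph pattern is a list of triple tuples, and the two rules (`remove_duplicate_patterns`, `most_bound_first`) are hypothetical examples, not the rules used by the authors.

```python
def remove_duplicate_patterns(bgp):
    """Rule: drop repeated triple patterns (a simple redundancy)."""
    seen, result = set(), []
    for triple in bgp:
        if triple not in seen:
            seen.add(triple)
            result.append(triple)
    return result


def most_bound_first(bgp):
    """Rule: stable sort, triple patterns with fewer variables first."""
    return sorted(bgp, key=lambda triple: sum(t.startswith("?") for t in triple))


def apply_heuristic(bgp, rules):
    """Apply every rule repeatedly until no rule changes the pattern list."""
    changed = True
    while changed:
        changed = False
        for rule in rules:
            rewritten = rule(bgp)
            if rewritten != bgp:
                bgp, changed = rewritten, True
    return bgp


bgp = [
    ("?s", "ub:name", "?n"),
    ("?s", "rdf:type", "ub:GraduateStudent"),
    ("?s", "ub:name", "?n"),
]
result = apply_heuristic(bgp, [remove_duplicate_patterns, most_bound_first])
print(result)
```

The second rule mirrors the newsgroup advice to put the more specific part of the query first; counting variables is only a crude stand-in for real selectivity estimates.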
Heuristic: Combine Basic Graph Pattern
- The basic graph patterns cannot be merged as the SQGM stands.
- But they could be merged if they were operands of the same join operation.
- Apply transformation rules to the SQGM.

[Figure: SQGM of the example query before the transformation. Graph pattern operations "?s rdf:type ub:GraduateStudent" (head ?s), "?s ub:takesCourse ?c" (head ?s ?c), "?s ub:name ?n" (head ?s ?n); Join (head ?s ?c) with an optional input, Join (head ?s ?c ?n), Select (head ?n ?c).]

Next Step
- Apply another transformation rule.

[Figure: SQGM after the first transformation. "?s rdf:type ub:GraduateStudent" and "?s ub:name ?n" are now operands of the same Join (head ?s ?n); "?s ub:takesCourse ?c" is the optional input of the Join with head ?s ?c ?n; Select (head ?n ?c). The two graph patterns under the same Join are then merged into one graph pattern operation with head ?s ?n.]

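The two transformation steps can be sketched on a simplified tree form of the query. This Python sketch is an assumption-laden illustration, not the authors' rule engine: nodes are plain tuples (`"bgp"`, `"join"`, `"leftjoin"` for the optional join), and the safety check in `pull_optional_up` (variables shared by the optional part and the other operand must already occur in the mandatory part) is a condition assumed here, not one stated in the slides.

```python
def variables(node):
    """Collect the variables mentioned anywhere below a node."""
    if node[0] == "bgp":
        return {term for triple in node[1] for term in triple if term.startswith("?")}
    return variables(node[1]) | variables(node[2])


def pull_optional_up(node):
    """join(leftjoin(A, B), C) -> leftjoin(join(A, C), B), if assumed safe."""
    if node[0] == "join" and node[1][0] == "leftjoin":
        _, (_, a, b), c = node
        if variables(b) & variables(c) <= variables(a):
            return ("leftjoin", ("join", a, c), b)
    return node


def merge_bgps(node):
    """join(bgp X, bgp Y) -> bgp(X + Y)."""
    if node[0] == "join" and node[1][0] == "bgp" and node[2][0] == "bgp":
        return ("bgp", node[1][1] + node[2][1])
    return node


def rewrite(node):
    """Rewrite children first, then this node, until nothing changes."""
    if node[0] == "bgp":
        return node
    node = (node[0], rewrite(node[1]), rewrite(node[2]))
    rewritten = merge_bgps(pull_optional_up(node))
    return rewrite(rewritten) if rewritten != node else node


type_bgp = ("bgp", [("?s", "rdf:type", "ub:GraduateStudent")])
course_bgp = ("bgp", [("?s", "ub:takesCourse", "?c")])
name_bgp = ("bgp", [("?s", "ub:name", "?n")])

# Shape of the query as written: (type OPTIONAL course) joined with name.
query = ("join", ("leftjoin", type_bgp, course_bgp), name_bgp)
result = rewrite(query)
print(result)
```

After the first rule the type and name patterns are operands of the same join, so the second rule can merge them into a single basic graph pattern with the optional part attached on top.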
Evaluation
Prototype Setup
- Jena Semantic Web Framework
- ARQ – SPARQL query processor for Jena
- RDF graphs stored on secondary storage
- Extended by
  - SPARQL query graph model
  - Rule engine

Interaction between ARQ and SQGM extension

Input: SPARQL query
1. ARQ: construction of an ARQ query model
2. SQGM extension: translation into an SQGM
3. SQGM extension: rewriting of the SQGM
4. SQGM extension: translation into an ARQ model
5. ARQ: generation of a Query Execution Plan (QEP)
6. ARQ: execution of the QEP
Output: query result

Evaluation – Setup

RDF Data (Generator: UBA (v.1.7) of Lehigh University Benchmarks)

                UnivBench (1.0)   UnivBench (5.0)   UnivBench (10.0)
    #Triples            100,543           624,532          1,272,575
    #Resources           20,659           129,533            263,427

Queries
- A set of 41 SPARQL queries
- Different combinations of graph patterns including OPTIONAL, FILTER and UNION

Evaluation – Results
- Measured query execution time of a selected query: factor 2.4
- Time needed for transformation between models: < 1 ms
- Average time savings of approx. 87%
- Only one case with slightly higher execution time

Selected query:

    SELECT ?n ?c
    FROM
    WHERE { ?s rdf:type ub:GraduateStudent.
            OPTIONAL { ?s ub:takesCourse ?c. }
            ?s ub:name ?n. }

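The results mix two ways of reporting an improvement, a speed-up factor and a percentage of time saved. The slides do not define either, so the short Python sketch below assumes factor = old time / new time and savings = 1 - new time / old time, and converts between the two. The function names are illustrative.

```python
def savings_from_factor(factor):
    """Fraction of time saved for a given speed-up factor (assumed definition)."""
    return 1.0 - 1.0 / factor


def factor_from_savings(savings):
    """Speed-up factor for a given fraction of time saved (assumed definition)."""
    return 1.0 / (1.0 - savings)


print(f"{savings_from_factor(2.4):.1%}")   # factor 2.4 as time saved: 58.3%
print(f"{factor_from_savings(0.87):.1f}")  # 87% saved as a factor: 7.7
print(f"{33000 / 150:.0f}")                # newsgroup example, 33000 ms vs 150 ms: 220
```

Under these definitions the two figures on the slide describe different measurements: a factor of 2.4 for the selected query is not the same thing as the 87% average.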
Explanation for the Result

Fast path algorithm of Jena
- Performs pattern matching within the underlying relational database
- Matches multiple filtered basic graph patterns

Fast path not applicable:

    WHERE { ?s rdf:type ub:GraduateStudent.
            OPTIONAL { ?s ub:takesCourse ?c. }
            ?s ub:name ?n. }

Fast path applicable:

    WHERE { ?s rdf:type ub:GraduateStudent.
            ?s ub:name ?n.
            OPTIONAL { ?s ub:takesCourse ?c. } }

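One way to see the difference between the two WHERE clauses is to group adjacent triple patterns into basic graph patterns. The Python sketch below does only that; it is not Jena's fast path algorithm, and the `(kind, content)` encoding of a group's elements is an assumption made for this example.

```python
def basic_graph_patterns(elements):
    """Group adjacent triple patterns; any other element ends the current group."""
    groups, current = [], []
    for kind, content in elements:
        if kind == "triple":
            current.append(content)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


# Top-level elements of the two WHERE clauses, in written order.
original = [
    ("triple", "?s rdf:type ub:GraduateStudent"),
    ("optional", "?s ub:takesCourse ?c"),
    ("triple", "?s ub:name ?n"),
]
rewritten = [
    ("triple", "?s rdf:type ub:GraduateStudent"),
    ("triple", "?s ub:name ?n"),
    ("optional", "?s ub:takesCourse ?c"),
]

print(len(basic_graph_patterns(original)))   # two separate patterns
print(len(basic_graph_patterns(rewritten)))  # one combined pattern
```

In the original query the OPTIONAL splits the two triple patterns apart; in the rewritten query they sit next to each other and form a single basic graph pattern that can be matched in one step.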
Conclusion
Conclusion and Future Work

Conclusion
- SQGM: a query model for SPARQL
  - Supporting all phases of query processing
  - Easy to extend
- Transformation rules and heuristics for SQGMs
- Implementation illustrated the potential of SQGMs

Outlook
- Develop further heuristics to rewrite SPARQL queries
- Integrate index selection into the query optimization

Thank you!