Published by Blaze Lombard. Modified over 10 years ago.
1
gStore: Answering SPARQL Queries Via Subgraph Matching
Lei Zou (1), Jinghui Mo (1), Lei Chen (2), M. Tamer Özsu (3), Dongyan Zhao (1)
(1) Peking University, (2) Hong Kong University of Science and Technology, (3) University of Waterloo
2
Outline
Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions
3
Outline
Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions
4
Semantic Web
“Semantic Web Technologies” is a collection of standard technologies to realize a Web of Data.
5
RDF Data Model
URI; Literals
6
RDF Graph
Entity Vertex; Literal Vertex
7
SPARQL Queries
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
Query Graph
8
Subgraph Match vs. SPARQL Queries
9
Naïve Triple Store
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
SQL: Select T3.Object From T as T1, T as T2, T as T3 Where T1.Predicate = “BornOnDate” and T1.Object = “1809-02-12” and T2.Predicate = “DiedOnDate” and T2.Object = “1865-04-15” and T3.Predicate = “hasName” and T1.Subject = T2.Subject and T2.Subject = T3.Subject
Too many self-joins
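The self-join pattern can be tried out on a toy triple table. The following is a minimal sketch using Python's built-in sqlite3 module; the subject identifiers, the distractor rows, and the column name `Predicate` are assumptions made for illustration. The query selects `T3.Object` so that the name literal, rather than the subject, is returned.

```python
import sqlite3

# In-memory triple table T(Subject, Predicate, Object).
# Subject identifiers and the second person are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Subject TEXT, Predicate TEXT, Object TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
        ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
        ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
        ("y:Charles_Darwin", "hasName", "Charles Darwin"),
        ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
        ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
    ],
)

# Three triple patterns share ?m, so T appears three times in the FROM clause.
naive_sql = """
SELECT T3.Object
FROM T AS T1, T AS T2, T AS T3
WHERE T1.Predicate = 'BornOnDate' AND T1.Object = '1809-02-12'
  AND T2.Predicate = 'DiedOnDate' AND T2.Object = '1865-04-15'
  AND T3.Predicate = 'hasName'
  AND T1.Subject = T2.Subject AND T2.Subject = T3.Subject
"""
naive_names = [row[0] for row in conn.execute(naive_sql)]
print(naive_names)
```

Every additional triple pattern in the SPARQL query adds one more copy of T to the join, which is the cost the slide points at.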
10
Existing Solutions
Three categories of solutions have been proposed to speed up query processing:
1. Property Table: Jena [K. Wilkinson et al. SWDB 03], …
2. Vertically Partitioned Solution: SW-Store [D. J. Abadi et al. VLDB 07], …
3. Exhaustive Indexing: RDF-3X [T. Neumann et al. VLDB 08], Hexastore [C. Weiss et al. VLDB 08], …
11
Existing Solutions-Property Table
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
SQL: Select People.hasName from People where People.BornOnDate = “1809-02-12” and People.DiedOnDate = “1865-04-15”
Reducing # of join steps
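For comparison, a minimal sketch of the property-table layout, again with Python's sqlite3 module. The `Subject` key column and the sample rows are assumptions; only the `People` table name and its three property columns come from the slide.

```python
import sqlite3

# One wide row per entity: each property is a column (Subject column is assumed).
pt_conn = sqlite3.connect(":memory:")
pt_conn.execute(
    "CREATE TABLE People (Subject TEXT PRIMARY KEY, hasName TEXT, "
    "BornOnDate TEXT, DiedOnDate TEXT)"
)
pt_conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?, ?)",
    [
        ("y:Abraham_Lincoln", "Abraham Lincoln", "1809-02-12", "1865-04-15"),
        ("y:Charles_Darwin", "Charles Darwin", "1809-02-12", "1882-04-19"),
    ],
)

# The three triple patterns collapse into one single-table selection, no join.
property_rows = pt_conn.execute(
    "SELECT People.hasName FROM People "
    "WHERE People.BornOnDate = ? AND People.DiedOnDate = ?",
    ("1809-02-12", "1865-04-15"),
).fetchall()
print(property_rows)
```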
12
Existing Solutions- Vertically Partitioned Solution
Fast Merge Join
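A minimal sketch of why merge joins are fast here, assuming each predicate is stored as its own two-column (subject, object) table sorted by subject. The table contents and the helper `merge_join` are illustrative, not taken from SW-Store.

```python
def merge_join(left, right):
    """Join two lists of (subject, value) rows that are both sorted by subject."""
    i = j = 0
    joined = []
    while i < len(left) and j < len(right):
        l_subj, r_subj = left[i][0], right[j][0]
        if l_subj < r_subj:
            i += 1
        elif l_subj > r_subj:
            j += 1
        else:
            # Find the run of rows with this subject on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == l_subj:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == l_subj:
                j_end += 1
            for _, l_val in left[i:i_end]:
                for _, r_val in right[j:j_end]:
                    joined.append((l_subj, l_val, r_val))
            i, j = i_end, j_end
    return joined

# One table per predicate, each sorted by subject (hypothetical rows).
born_on_date = [("y:Abraham_Lincoln", "1809-02-12"), ("y:Charles_Darwin", "1809-02-12")]
died_on_date = [("y:Abraham_Lincoln", "1865-04-15"), ("y:Charles_Darwin", "1882-04-19")]
has_name = [("y:Abraham_Lincoln", "Abraham Lincoln"), ("y:Charles_Darwin", "Charles Darwin")]

# Apply the constant filters first; filtering keeps the subject order intact.
born_sel = [row for row in born_on_date if row[1] == "1809-02-12"]
died_sel = [row for row in died_on_date if row[1] == "1865-04-15"]

step1 = merge_join(born_sel, died_sel)
step2 = merge_join([(s, (b, d)) for s, b, d in step1], has_name)
vp_names = [name for _, _, name in step2]
print(vp_names)
```

Because both inputs are already ordered by subject, each join is a single linear pass with no sorting or hashing.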
13
Existing Solutions- Exhaustive-Indexing
Each SPARQL query statement (triple pattern) can be translated into one “range query”.
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
Range query & Merge Join
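The range-query idea can be sketched with sorted lists standing in for the on-disk indexes. This is a simplified illustration under assumptions: only two of the possible orderings are built, the triples are hypothetical, and `range_scan` is a helper invented for this example rather than an RDF-3X or Hexastore API.

```python
from bisect import bisect_left

# Illustrative (subject, predicate, object) triples; subject names are assumed.
triples = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Charles_Darwin", "hasName", "Charles Darwin"),
    ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
    ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
]

# Two orderings of the same triples, each kept sorted.
pos_index = sorted((p, o, s) for s, p, o in triples)  # predicate, object, subject
spo_index = sorted(triples)                           # subject, predicate, object

def range_scan(index, prefix):
    """Binary-search to the first entry with this prefix, then read the contiguous run."""
    start = bisect_left(index, prefix)
    end = start
    while end < len(index) and index[end][:len(prefix)] == prefix:
        end += 1
    return index[start:end]

# Each triple pattern with two constants is one prefix range on the POS ordering.
born = [s for _, _, s in range_scan(pos_index, ("BornOnDate", "1809-02-12"))]
died = [s for _, _, s in range_scan(pos_index, ("DiedOnDate", "1865-04-15"))]

# Both lists come back sorted by subject, so a merge join applies;
# a set intersection is used here only to keep the sketch short.
both = sorted(set(born) & set(died))
ei_names = [o for s in both for _, _, o in range_scan(spo_index, (s, "hasName"))]
print(ei_names)
```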
14
Some Limitations
1. Difficult to handle “wildcard queries”.
2. Difficult to handle updates.
15
Outline
Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions
16
Intuition of gStore
Finding matches over a large graph is not a trivial task.
17
Preliminaries
Entity Vertex; Literal Vertex
18
Storage Schema in gStore
Encode all neighbors of a vertex into a “bit-string”, called its signature.
19
Encoding Technique (1)
Literal “Abraham Lincoln”, 3-grams “Abr”, “bra”, “rah”, “aha”, …:
0000 0010 0000 0000
1000 0000 0000 0000
0000 0000 0100 0000
0000 0000 0000 0001
OR: 1000 0010 0100 0001
Edge label hasName: 0010 0000 0000
Edge signatures (edge-label bits followed by neighbor bits):
(hasName, “Abraham Lincoln”) 0010 0000 0000 1000 0010 0100 0001
(BornOnDate, “1809-02-12”) 0100 0000 0000 0100 0010 0100 1000
(DiedOnDate, “1865-04-15”) 0000 1000 0000 0000 0010 0100 0000
(DiedIn, “y:Washington_D.c”) 0000 0010 0000 1000 0010 0100 0001
OR (vertex signature): 0110 1010 0000 1100 0010 0100 1001
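The OR-based encoding can be checked mechanically. The sketch below takes the bit strings from the slide as given, since the hash functions that produce them are not shown; the 12-bit edge-label / 16-bit neighbor split and the pairing of the four hash values with the listed 3-grams are assumptions read off the slide layout. The helpers `to_int` and `to_bits` are named for this example only.

```python
def to_int(bit_string):
    """Parse a space-grouped bit string such as '0010 0000 0000'."""
    return int(bit_string.replace(" ", ""), 2)

def to_bits(value, width):
    """Format an integer as a bit string of the given width, grouped in fours."""
    raw = f"{value:0{width}b}"
    return " ".join(raw[k:k + 4] for k in range(0, width, 4))

# Hash values of the 3-grams of "Abraham Lincoln" as listed on the slide.
gram_hashes = [
    "0000 0010 0000 0000",
    "1000 0000 0000 0000",
    "0000 0000 0100 0000",
    "0000 0000 0000 0001",
]
literal_sig = 0
for h in gram_hashes:
    literal_sig |= to_int(h)  # OR all n-gram hashes into the literal signature

# Edge signature = edge-label bits (assumed 12) followed by neighbor bits (assumed 16).
has_name_label = to_int("0010 0000 0000")
has_name_edge = (has_name_label << 16) | literal_sig

edge_sigs = [
    has_name_edge,
    to_int("0100 0000 0000 0100 0010 0100 1000"),  # (BornOnDate, "1809-02-12")
    to_int("0000 1000 0000 0000 0010 0100 0000"),  # (DiedOnDate, "1865-04-15")
    to_int("0000 0010 0000 1000 0010 0100 0001"),  # (DiedIn, "y:Washington_D.c")
]
vertex_sig = 0
for sig in edge_sigs:
    vertex_sig |= sig  # OR all adjacent-edge signatures into the vertex signature

print(to_bits(literal_sig, 16))
print(to_bits(has_name_edge, 28))
print(to_bits(vertex_sig, 28))
```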
20
Encoding Technique (2)
21
Encoding Technique (3)
Finding matches over the signature graph G*
Verify each match in the RDF graph G
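The two steps amount to filter-and-verify. The following single-vertex sketch assumes that a data vertex is kept as a candidate when its signature contains every bit set in the query signature. Vertex 001 uses the bitwise OR of the four edge signatures from the encoding example; vertices 002 and 003, their signatures, and their edges are hypothetical, with 002 built to act as a false positive. A full gStore match would also check the edges between vertices in G*, which is left out here.

```python
def to_int(bit_string):
    """Parse a space-grouped bit string."""
    return int(bit_string.replace(" ", ""), 2)

# Signature graph G*: one signature per data vertex (002 and 003 are hypothetical).
signature_graph = {
    "001": to_int("0110 1010 0000 1100 0010 0100 1001"),
    "002": to_int("0100 1000 0000 0100 0010 0100 1000"),
    "003": to_int("0000 0001 0000 0000 0000 0001 0010"),
}

# RDF graph G: the actual (edge label, neighbor) pairs behind each signature.
rdf_edges = {
    "001": {
        ("hasName", "Abraham Lincoln"),
        ("BornOnDate", "1809-02-12"),
        ("DiedOnDate", "1865-04-15"),
        ("DiedIn", "y:Washington_D.c"),
    },
    "002": {("BornOnDate", "1865-04-15"), ("DiedOnDate", "1809-02-12")},
    "003": {("hasName", "Someone Else")},
}

# Query vertex ?m with two constant-valued edges; its signature is the OR of
# the BornOnDate and DiedOnDate edge signatures from the encoding example.
query_edges = {("BornOnDate", "1809-02-12"), ("DiedOnDate", "1865-04-15")}
query_sig = to_int("0100 0000 0000 0100 0010 0100 1000") | to_int(
    "0000 1000 0000 0000 0010 0100 0000"
)

# Step 1 (filter over G*): cheap bitwise containment test, may admit false positives.
candidates = sorted(
    v for v, sig in signature_graph.items() if sig & query_sig == query_sig
)

# Step 2 (verify in G): check the real edges of each surviving candidate.
matches = [v for v in candidates if query_edges <= rdf_edges[v]]

print(candidates)
print(matches)
```

The filter never discards a true match, because ORing in more edges can only set more bits; verification is what removes collisions such as vertex 002.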
22
Outline
Background & Related Work; Overview of gStore; Encoding Technique; VS-tree & Query Algorithm; Experiments; Conclusions
23
A Straightforward Solution (1)
u1 → L1: 001, 004, 006
u2 → L2: 002, 003, 006
24
A Straightforward Solution (2)
L1: 001, 004, 006
L2: 002, 003, 006
Large join space!
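To make the join-space concern concrete, here is a small sketch that assumes L1 and L2 are the candidate lists of two adjacent query vertices. The edge set `data_edges` is hypothetical; the slide does not list which vertices are actually connected.

```python
from itertools import product

L1 = ["001", "004", "006"]  # candidates for the first query vertex
L2 = ["002", "003", "006"]  # candidates for the second query vertex

# Without further pruning, every pairing has to be considered.
join_space = list(product(L1, L2))

# Hypothetical directed edges among data vertices, used to test each pairing.
data_edges = {("001", "002"), ("004", "003")}
joined_pairs = [pair for pair in join_space if pair in data_edges]

print(len(join_space))
print(joined_pairs)
```

The join space grows as the product of the list lengths, so shrinking each candidate list before the join pays off multiplicatively.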
25
VS-tree
26
Pruning Technique
u1, u2; 10010
001, 004, 006; 002, 003, 006
Reduced join space!
27
An Example for Pruning Effect
Query:
?x1 y:hasGivenName ?x5
?x1 y:hasFamilyName ?x6
?x1 rdf:type
?x1 y:bornIn ?x2
?x1 y:hasAcademicAdvisor ?x4
?x2 y:locatedIn
?x3 y:locatedIn
?x4 y:bornIn ?x3
Before Pruning After Pruning x1 810 x2 424197 x3 66 x4 361876686
28
Query Algorithm-Top-Down
29
Outline
Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions
30
Datasets
Dataset   Triple #     Size
Yago      20 million   3.1 GB
DBLP      8 million    0.8 GB
31
Exact Queries
32
Wildcard Queries
33
Outline
Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions
34
Conclusions
Vertex Encoding Technique; An Efficient Index Structure: VS-tree; A Novel Filtering Technique.
35
zoulei@pku.edu.cn
36
Updates- Insertion in G*
37
Updates- Insertion in VS*-tree
38
Updates- Deletion in VS*-tree
To be deleted
39
Framework in gStore
Finding candidate matches over G*
Verify each candidate match
40
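A Straightforward Solution (1)
u & 001 = u (bitwise AND of signatures: every bit set in the signature of query vertex u is also set in the signature of data vertex 001, so 001 is a candidate match for u)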