1
Lu Xing CS59000GDM 9/21/2018
2
Challenges for RDF data management
Scalability
Cause of problem: RDF data are stored as triples and managed by an RDBMS.
Single-machine systems: SW-Store, Hexastore, RDF-3X
Distributed systems: SHARD, YARS2, Virtuoso
Expensive join operations and large intermediate results!
3
Challenges for RDF data management
Generality
Cause of problem: general-purpose queries on RDF data are not supported.
4
Highlights of Trinity.RDF
Distributed system
Native graph
In-memory graph
Smaller intermediate results
5
Join vs. Graph Exploration -- Join
RDF data is scanned to generate bindings
6
Intermediate results of Join operation
?director ?award ?movie ?actor ?movie_award J_Cameron Oscar_Award Titanic L_Dicaprio Best_Picture Avatar S_Worthington G_Lucas Saturn_Award P_Haggis Crash D_Cheadle Star War VI M_Hamill
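A minimal Python sketch of the join-based approach, reusing entity names from the table above. The predicate names (directed, casts, won) and the helpers scan and join are assumptions made for illustration; they are not taken from the slides.

```python
# Toy triples; the predicate names are illustrative assumptions.
TRIPLES = [
    ("J_Cameron", "directed", "Titanic"),
    ("J_Cameron", "directed", "Avatar"),
    ("G_Lucas", "directed", "Star_War_VI"),
    ("P_Haggis", "directed", "Crash"),
    ("Titanic", "casts", "L_Dicaprio"),
    ("Avatar", "casts", "S_Worthington"),
    ("Star_War_VI", "casts", "M_Hamill"),
    ("Crash", "casts", "D_Cheadle"),
    ("J_Cameron", "won", "Oscar_Award"),
]


def scan(pattern):
    """Scan every triple and return one binding dict per match."""
    s, p, o = pattern
    out = []
    for ts, tp, to in TRIPLES:
        if tp != p:
            continue
        binding, ok = {}, True
        for term, value in ((s, ts), (o, to)):
            if term.startswith("?"):
                # A variable must bind to the same value everywhere it occurs.
                if binding.setdefault(term, value) != value:
                    ok = False
            elif term != value:
                ok = False
        if ok:
            out.append(binding)
    return out


def join(left, right):
    """Nested-loop join: keep row pairs that agree on shared variables."""
    return [
        {**a, **b}
        for a in left
        for b in right
        if all(a[k] == b[k] for k in a.keys() & b.keys())
    ]


directed = scan(("?director", "directed", "?movie"))
casts = scan(("?movie", "casts", "?actor"))
won = scan(("?director", "won", "?award"))

intermediate = join(directed, casts)  # one row per (director, movie, actor)
final = join(intermediate, won)       # only award-winning directors survive
```

In this toy data the intermediate result has more rows than the final answer, because rows for directors without an award are built first and discarded only by the last join.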
7
Join vs. Graph Exploration -- Graph
8
Join vs. Graph Exploration -- Graph
9
Join vs. Graph Exploration -- Graph
10
Join vs. Graph Exploration -- Graph
No need to explore nodes here.
11
Join vs. Graph Exploration -- Graph
Why do the queries execute in this way?
12
Trinity: A Distributed Graph Engine on a Memory Cloud (Trinity supports fast graph exploration as well as efficient parallel computing)
13
Typo
14
How to model graphs?
15
How to model graphs?
Problem: What if node n has a huge number of neighbors, but none of them are on the same machine as n?
16
Graph partition
The in-adjacency-list stored in machine i
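One way to picture the partitioning, as a hedged Python sketch: the in-adjacency list of a node is split so that machine i keeps only the in-neighbors that themselves live on machine i, under the key (nid, i). The cluster size, the CRC-based placement, and the names machine_of and partition_in_edges are assumptions made for illustration.

```python
import zlib
from collections import defaultdict

NUM_MACHINES = 3  # assumed cluster size


def machine_of(node_id: str) -> int:
    """Deterministic hash placement (an assumption; any partitioner would do)."""
    return zlib.crc32(node_id.encode()) % NUM_MACHINES


def partition_in_edges(triples):
    """stores[i][(nid, i)] = in-neighbors of nid that reside on machine i."""
    stores = [defaultdict(list) for _ in range(NUM_MACHINES)]
    for s, p, o in triples:
        i = machine_of(s)                  # the in-neighbor s lives on machine i
        stores[i][(o, i)].append((p, s))   # so this piece of o's list stays on i
    return stores


stores = partition_in_edges([("a", "p", "n"), ("b", "p", "n"), ("c", "q", "n")])
```

With this layout, a machine that holds a piece of n's adjacency list can follow those edges without a network hop, since every neighbor in that piece is local.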
17
Other auxiliary data structures and operators
Indexing predicates: local predicate indexing, global predicate indexing
Basic graph operators
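The two kinds of predicate index can be sketched roughly as follows, assuming a local index is a per-node adjacency list sorted by predicate and a global index maps each predicate to its subjects and objects. The function names build_indexes and select_by_predicate, and the sample predicates, are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def build_indexes(triples):
    """Return (local, global) predicate indexes for a list of triples."""
    out_adj = defaultdict(list)                       # node -> sorted [(predicate, neighbor)]
    global_idx = defaultdict(lambda: (set(), set()))  # predicate -> (subjects, objects)
    for s, p, o in triples:
        out_adj[s].append((p, o))
        global_idx[p][0].add(s)
        global_idx[p][1].add(o)
    for edges in out_adj.values():
        edges.sort()  # sorting by predicate enables binary search per node
    return out_adj, global_idx


def select_by_predicate(edges, p):
    """Binary-search a sorted adjacency list for the run of edges labeled p."""
    lo = bisect_left(edges, (p,))
    hi = bisect_left(edges, (p + "\0",))  # p + "\0" is the next string after p
    return [neighbor for _, neighbor in edges[lo:hi]]


out_adj, global_idx = build_indexes([
    ("J_Cameron", "directed", "Titanic"),
    ("J_Cameron", "directed", "Avatar"),
    ("J_Cameron", "won", "Oscar_Award"),
])
movies = select_by_predicate(out_adj["J_Cameron"], "directed")
```

The local index answers "which neighbors of this node are reached via p?", while the global index answers "which nodes have p at all?", which is what a pattern with an unbound subject needs.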
18
Query process
Decompose Q into an ordered sequence of triple patterns.
Find matches for each qi.
From each match, explore the graph to find matches for qi+1.
Gather the matches for all individual triple patterns at the centralized query proxy.
19
Single triple pattern matching
From subject to object
20
Single triple pattern matching
All the possibilities of src
21
Single triple pattern matching
Get the key nid to obtain node’s neighbors in machine i
22
Single triple pattern matching
Select the list of nodes that have p as a predicate
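The steps on these slides (enumerate the possible sources, fetch each source's neighbors on machine i, keep the edges labeled p) might be put together as below. This is a sketch under assumptions: stores[i] is taken to map a source node to the (predicate, target) edges whose targets live on machine i, and match_pattern is a hypothetical name.

```python
def match_pattern(pattern, stores, bindings=None):
    """Match (src, p, tgt) by exploring from src to tgt on every machine."""
    src, p, tgt = pattern
    if not src.startswith("?"):
        sources = {src}                      # src is a constant
    elif bindings and src in bindings:
        sources = set(bindings[src])         # src was bound by an earlier pattern
    else:
        # All the possibilities of src: every node with an outgoing p edge.
        sources = {
            s
            for store in stores
            for s, edges in store.items()
            if any(q == p for q, _ in edges)
        }
    matches = set()
    for store in stores:                     # each machine would do this in parallel
        for s in sources:
            for q, t in store.get(s, []):    # neighbors of s held on this machine
                if q == p and (tgt.startswith("?") or t == tgt):
                    matches.add((s, t))
    return matches


# Two toy "machines"; contents are illustrative.
stores = [
    {"J_Cameron": [("directed", "Titanic")]},
    {
        "J_Cameron": [("directed", "Avatar"), ("won", "Oscar_Award")],
        "G_Lucas": [("directed", "Star_War_VI")],
    },
]
cameron_movies = match_pattern(("J_Cameron", "directed", "?movie"), stores)
winners = match_pattern(("?director", "won", "?award"), stores)
```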
23
Single triple pattern matching
24
Example for Single Pattern Matching
25
Example for Single Pattern Matching
26
Example for Single Pattern Matching
27
Multiple Pattern Matching
No joins! Eagerly prune invalid matchings.
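A rough single-machine sketch of exploration with eager pruning: each pattern is matched only against values that earlier patterns left alive, so invalid candidates are dropped before they can multiply. The triples, predicate names, and the function explore are assumptions for illustration; pruning here is forward-only, so a final pass over the gathered matches would still be needed to produce complete answers.

```python
TRIPLES = [
    ("J_Cameron", "directed", "Titanic"),
    ("J_Cameron", "directed", "Avatar"),
    ("G_Lucas", "directed", "Star_War_VI"),
    ("P_Haggis", "directed", "Crash"),
    ("Titanic", "casts", "L_Dicaprio"),
    ("Avatar", "casts", "S_Worthington"),
    ("Star_War_VI", "casts", "M_Hamill"),
    ("Crash", "casts", "D_Cheadle"),
    ("J_Cameron", "won", "Oscar_Award"),
]


def _allowed(term, value, bindings):
    """A constant must equal the value; a bound variable must still contain it."""
    if not term.startswith("?"):
        return term == value
    return term not in bindings or value in bindings[term]


def explore(patterns, triples):
    """Match patterns in order, shrinking each variable's candidate set as we go."""
    bindings = {}      # variable -> values still alive
    per_pattern = []   # matches found for each pattern
    for s, p, o in patterns:
        matched = {
            (ts, to)
            for ts, tp, to in triples
            if tp == p and _allowed(s, ts, bindings) and _allowed(o, to, bindings)
        }
        for term, idx in ((s, 0), (o, 1)):
            if term.startswith("?"):
                bindings[term] = {m[idx] for m in matched}  # eager pruning
        per_pattern.append(matched)
    return bindings, per_pattern


q_won = ("?director", "won", "?award")
q_directed = ("?director", "directed", "?movie")
q_casts = ("?movie", "casts", "?actor")

_, selective_first = explore([q_won, q_directed, q_casts], TRIPLES)
_, selective_last = explore([q_casts, q_directed, q_won], TRIPLES)
sizes_first = [len(m) for m in selective_first]
sizes_last = [len(m) for m in selective_last]
```

Starting from the most selective pattern keeps every later step small, while starting from the least selective one touches more edges before pruning takes effect.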
28
The order of queries matters
Query Q: q1, q2, q3, …, qn
[Figure: RDF data explored by q1, q2, …, qn]
29
Exploration Plan Optimization
30
Define Exploration Point
A node that does not contain any redundant values. Hence it introduces fewer redundant values for other variables in the exploration.
[Figure: exploration points; this is how a graph grows.]
31
Use Dynamic Programming to compute
Start from subgraphs of size 1.
Fill the 2-D table with the minimum cost.
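The table-filling idea can be illustrated with a generic dynamic program over sets of patterns. This is a simplified sketch, not the exact recurrence from the slides: it ignores exploration points, and the cost function, the size estimates, and the 0.25 pruning factor are made-up assumptions.

```python
from itertools import combinations


def plan(n, step_cost):
    """Return (min_total_cost, order) over all orderings of n patterns.

    table[S] holds the cheapest way to explore exactly the patterns in S.
    """
    table = {frozenset(): (0.0, ())}
    for size in range(1, n + 1):                 # start from sub-plans of size 1
        for subset in combinations(range(n), size):
            S = frozenset(subset)
            # Try each pattern i as the last one explored within S.
            table[S] = min(
                (table[S - {i}][0] + step_cost(S - {i}, i), table[S - {i}][1] + (i,))
                for i in S
            )
    return table[frozenset(range(n))]


# Hypothetical inputs: estimated match counts and which patterns share a variable.
SIZES = [4, 4, 1]                      # 0: casts, 1: directed, 2: won
SHARES = {0: {1}, 1: {0, 2}, 2: {1}}


def step_cost(explored, i):
    """Assumed cost: a pattern linked to explored ones is pruned to a quarter."""
    if explored & SHARES[i]:
        return SIZES[i] * 0.25
    return SIZES[i]


best_cost, best_order = plan(3, step_cost)
```

Under these assumed numbers the cheapest plan begins with the most selective pattern and then follows shared variables outward.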
32
Evaluation – Running time
33
Evaluation – Running time (cont’d)
34
Thank you!
35
Single triple pattern matching
Otherwise, src is a constant
36
How about cost?
The cost is a linear combination of the size of the results and bindings of src.