Graph Indexing for Shortest-Path Finding over Dynamic Sub-Graphs

1 Graph Indexing for Shortest-Path Finding over Dynamic Sub-Graphs
Mohamed S. Hassan, Walid G. Aref, Ahmed M. Aly
Purdue University – West Lafayette, IN, USA
SIGMOD’16

2 Graphs are Everywhere
Road Network, Biological Network, Social Network, Datacenter Network

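3 Edge-Labeled Graph Model
Directed graph. Each edge carries an edge weight and an edge label, written as "weight, label" (e.g., "2, B"). Label = Color.
[Figure: directed graph on vertices 1–9 with edges annotated 2, B; 1, R; 9, R; 10, R; 8, R; 1, G; 4, B; 6, B; 7, R]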

4 Querying Sub-Graphs
Graph Query: Select Sub-Graph (Select … From … Where …)

5 Motivation
What is the shortest path between two persons considering only family relationships?
Does Protein X interact with Protein Y through stable or covalent interactions?

6 Problem Definition
Edge-Constrained Shortest Path (ECSP) Query
ECSP Query Q(s, d, A) over a labeled graph G
Given: source vertex s, destination vertex d, and a set of labels A ⊆ G.L
Find: a shortest path from Vertex s to Vertex d using only edges whose labels are in A
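
The definition above can be made concrete with a baseline that needs no index: run Dijkstra's algorithm and ignore every edge whose label is not in A. This is only a minimal sketch of the query semantics, not the EDP method from the talk; the function name `ecsp` and the toy edge list `EDGES` are made up for illustration and are not the graph drawn on the slides.

```python
import heapq

def ecsp(edges, s, d, allowed):
    """Cost of a shortest s-to-d path that uses only edges whose label is
    in `allowed`; None if no such path exists. Weights must be non-negative."""
    adj = {}
    for u, v, w, label in edges:
        if label in allowed:              # disallowed edges are never traversed
            adj.setdefault(u, []).append((v, w))
    dist = {s: 0}
    pq = [(0, s)]
    while pq:
        du, u = heapq.heappop(pq)
        if u == d:
            return du
        if du > dist[u]:                  # stale queue entry
            continue
        for v, w in adj.get(u, []):
            nd = du + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return None

# Hypothetical directed edges: (source, target, weight, label).
EDGES = [(1, 2, 2, "B"), (2, 6, 7, "R"), (1, 4, 1, "G"),
         (4, 6, 7, "B"), (1, 6, 10, "R")]

print(ecsp(EDGES, 1, 6, {"B", "R"}))       # 9: the cheaper route via the G edge is excluded
print(ecsp(EDGES, 1, 6, {"B", "R", "G"}))  # 8
```

In this toy graph the unrestricted shortest path uses a G-labeled edge, so restricting the query to {B, R} raises the answer from 8 to 9.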

7 ECSP Query Example
ECSP Query Q(1, 6, {B, R})
[Figure: the example edge-labeled graph]

8 ECSP Query Example
ECSP Query Q(1, 6, {B, R})
Dashed shortest path with cost 8 (invalid)
[Figure: the example edge-labeled graph with the invalid path dashed]

9 ECSP Query Example
ECSP Query Q(1, 6, {B, R})
Dashed shortest path with cost 9 (valid path)
[Figure: the example edge-labeled graph with the valid path dashed]

10 Intuition
|L| << |E| (e.g., 5, 32)
Regular query-answer pattern
Consecutive monochrome edges
[Figure: the example edge-labeled graph]

11 Query-Answer Regular Pattern
Dashed shortest path with cost 30 for Q(1, 9, {B, R})
[Figure: the example edge-labeled graph with the path dashed]

12 Query-Answer Regular Pattern
Blue monochrome shortest path from Vertex 1 to Vertex 7
Precomputations (index entries)
[Figure: sub-graph on vertices 1, 2, 4, 7, 8, 9 with edges annotated 2, B; 10, R; 8, R; 4, B; 6, B]

13 Challenges
A query can select one of 2^|L| possible sub-graphs to operate on.
Graph updates (e.g., label, weight, new edge): how should the precomputations affected by an update to the underlying graph be updated?

14 Edge-Disjoint Partitioning (EDP)
Proposed solution: Edge-Disjoint Partitioning (EDP) = Partitioned Index + Traversal Algorithm

15 EDP Partitioning
Indexing Graph G to obtain Index I(G)
Partition based on edge labels
[Figure: the example edge-labeled graph]

16 EDP Partitioning
Indexing Graph G to obtain Index I(G)
Partition based on edge labels (linear time)
[Figure: the graph split into one partition per edge label (PrG, PrB, PrR); the R-labeled edges 1, R; 9, R; 10, R; 8, R; 7, R appear in PrR]
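
Partitioning by edge label takes a single pass over the edge list, which is where the linear-time bound comes from. A minimal sketch follows; the name `partition_by_label` and the toy edges are illustrative assumptions, not the authors' code or the graph from the slides.

```python
from collections import defaultdict

def partition_by_label(edges):
    """Group (u, v, weight, label) edges into one edge list per label.
    Every edge lands in exactly one partition, so the partitions are
    edge-disjoint; a vertex may still appear in several of them."""
    partitions = defaultdict(list)
    for u, v, w, label in edges:          # one pass: O(|E|)
        partitions[label].append((u, v, w))
    return dict(partitions)

# Hypothetical directed edges: (source, target, weight, label).
EDGES = [(1, 2, 2, "B"), (2, 6, 7, "R"), (1, 4, 1, "G"),
         (4, 6, 7, "B"), (1, 6, 10, "R")]

parts = partition_by_label(EDGES)
print(sorted(parts))   # ['B', 'G', 'R']
print(parts["R"])      # [(2, 6, 7), (1, 6, 10)]
```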

17 EDP Partitioning (Cont’d)
Key ideas:
Efficient pruning
Indexing monochrome shortest paths
Index I(G) can cover all the queries

18 EDP Partitioning (Cont’d)
I(G) has the same connectivity as G
Bridge vertexes
OtherHosts lists: x {Label1, Label2, …}
[Figure: partitions PrG, PrB, and PrR with bridge vertexes annotated by OtherHosts lists such as {B}, {R}, and {G}]
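
One way to read the OtherHosts lists is sketched below, under the assumption that a bridge vertex is any vertex hosted by more than one partition and that its list in a given partition names the other partitions hosting it. The slide does not spell out the exact definition (for example, whether edge direction matters), so treat `other_hosts_lists` and the toy `PARTITIONS` as hypothetical.

```python
from collections import defaultdict

def other_hosts_lists(partitions):
    """For each partition label, map every bridge vertex x to the set of
    labels of the other partitions that also host x."""
    hosts = defaultdict(set)              # vertex -> labels of hosting partitions
    for label, edges in partitions.items():
        for u, v, _w in edges:
            hosts[u].add(label)
            hosts[v].add(label)
    lists = {label: {} for label in partitions}
    for x, labels in hosts.items():
        if len(labels) > 1:               # assumed bridge-vertex criterion
            for label in labels:
                lists[label][x] = labels - {label}
    return lists

# Hypothetical partitions: label -> list of (source, target, weight).
PARTITIONS = {"B": [(1, 2, 2), (4, 6, 7)],
              "R": [(2, 6, 7), (1, 6, 10)],
              "G": [(1, 4, 1)]}

print(other_hosts_lists(PARTITIONS)["G"])
```

Because every vertex keeps a pointer to the other partitions that host it, following these lists lets a traversal reach everything reachable in the original graph, which matches the claim that I(G) has the same connectivity as G.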

19 EDP Partitioning (Cont’d)
Index is incremental and query-workload aware
EDP allows the user to restrict the index growth
User-defined maximum size (e.g., 8 GB)
Index/cache replacement policy (e.g., LRU)
[Figure: partitions PrG, PrB, and PrR with bridge vertexes and their OtherHosts lists]
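
A size-bounded index with LRU replacement can be sketched with an ordered dictionary. The class below caps the number of entries rather than bytes to keep the example short; `BoundedPathIndex`, its methods, and the key layout are assumptions for illustration, not EDP's actual data structure.

```python
from collections import OrderedDict

class BoundedPathIndex:
    """Cache of precomputed monochrome shortest paths with LRU eviction."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()     # oldest (least recently used) first

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)    # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # evict least recently used

index = BoundedPathIndex(max_entries=2)
index.put(("B", 1, 7), 12)    # hypothetical key: (label, source, target) -> cost
index.put(("R", 1, 6), 10)
index.get(("B", 1, 7))        # touch the blue entry
index.put(("R", 3, 9), 9)     # evicts ("R", 1, 6), the least recently used
```

Entries are added only when a query needs them, which is what makes such an index incremental and workload aware.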

20 Query Processing in EDP
Greedy traversal algorithm that uses a priority queue
Each vertex is identified by (Partition Id, Vertex Id)
A monochrome shortest path is computed only once (incremental indexing)
[Figure: partitions PrG, PrB, and PrR with bridge vertexes and their OtherHosts lists]
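
The idea of identifying each search state by (Partition Id, Vertex Id) can be sketched as a Dijkstra-style search over those pairs: moving along an edge stays inside the current partition, and a vertex hosted by several allowed partitions permits a zero-cost hop between them. This simplified sketch expands one edge at a time and does not use precomputed monochrome shortest paths or parallel bridge-edge discovery, so it is not the EDP algorithm itself; `traverse` and `PARTITIONS` are hypothetical.

```python
import heapq

def traverse(partitions, s, d, allowed):
    """Cheapest cost from s to d over the allowed partitions, searching
    states of the form (partition label, vertex). None if unreachable."""
    if s == d:
        return 0
    adj, hosts = {}, {}
    for label in allowed:                 # disallowed partitions are never loaded
        for u, v, w in partitions.get(label, []):
            adj.setdefault((label, u), []).append((v, w))
            hosts.setdefault(u, set()).add(label)
            hosts.setdefault(v, set()).add(label)
    best = {(label, s): 0 for label in hosts.get(s, ())}
    pq = [(0, label, s) for label in hosts.get(s, ())]
    heapq.heapify(pq)
    while pq:
        cost, label, x = heapq.heappop(pq)
        if x == d:
            return cost
        if cost > best[(label, x)]:       # stale queue entry
            continue
        # Edges inside the current partition, then zero-cost hops at x.
        moves = [(label, v, cost + w) for v, w in adj.get((label, x), [])]
        moves += [(other, x, cost) for other in hosts[x] - {label}]
        for nl, nx, nc in moves:
            if nc < best.get((nl, nx), float("inf")):
                best[(nl, nx)] = nc
                heapq.heappush(pq, (nc, nl, nx))
    return None

# Hypothetical partitions: label -> list of (source, target, weight).
PARTITIONS = {"B": [(1, 2, 2), (4, 6, 7)],
              "R": [(2, 6, 7), (1, 6, 10)],
              "G": [(1, 4, 1)]}

print(traverse(PARTITIONS, 1, 6, {"R", "B"}))   # 9
```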

21 Query Processing in EDP
Consider Query Q1(1, 6, {R, B})
Start from a partition hosting the source node
Check for the destination in the current partition
Traverse other partitions through bridge vertexes
Cost = 9
[Figure: traversal across partitions PrG, PrB, and PrR through bridge vertexes; the edges 7, R and 10, R are shown]

22 Query Processing in EDP (Cont’d)
Key ideas behind efficient query evaluation:
Leveraging precomputed monochrome shortest paths
On-demand parallel computation of bridge edges

23 Graph Updates in EDP
Everything can be updated:
Topological: adding/removing a vertex; adding/removing an edge
Non-topological: an edge weight can be updated; an edge label can be updated
How does EDP handle graph updates?

24 Handling Graph Updates (Cont’d)
Key ideas:
Lazy updates (for the precomputations)
Invalidate potentially affected pre-computations
Fix invalidated pre-computations on-demand
[Figure: partitions PrG, PrB, and PrR with bridge vertexes and their OtherHosts lists]

25 Handling Graph Updates (Cont’d)
Find the naturally formed disconnected components in each partition
Use a global clock
[Figure: partitions PrG, PrB, and PrR with components C1 and C2 marked]

26 Handling Graph Updates (Cont’d)
Each pre-computation has a timestamp (TS(Entry))
Each component in a partition has a timestamp (TS(C))
[Figure: partitions PrG, PrB, and PrR with components C1 and C2 marked]

27 Handling Graph Updates (Cont’d)
On update: update the timestamp of the affected component (TS(C))
On query: re-compute Entry E iff TS(C) > TS(E)
[Figure: partitions PrG, PrB, and PrR with components C1 and C2 marked]
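
The timestamp rule can be sketched as a small lazy cache: an update only bumps the clock and stamps the affected component, and an entry is recomputed at query time only when its component's timestamp is newer than its own. The class name `TimestampedIndex`, the `compute` callback, and the string keys are illustrative assumptions; how components are detected is left out.

```python
class TimestampedIndex:
    """Lazy invalidation: recompute entry E on lookup iff TS(C) > TS(E)."""

    def __init__(self, compute):
        self.clock = 0                    # global clock
        self.component_ts = {}            # component id -> TS(C)
        self.entries = {}                 # (component, key) -> (value, TS(E))
        self.compute = compute            # callback: (component, key) -> value

    def on_update(self, component):
        # An update touches no entry; it only stamps the component.
        self.clock += 1
        self.component_ts[component] = self.clock

    def lookup(self, component, key):
        entry = self.entries.get((component, key))
        stale = entry is not None and self.component_ts.get(component, 0) > entry[1]
        if entry is None or stale:
            entry = (self.compute(component, key), self.clock)
            self.entries[(component, key)] = entry
        return entry[0]

calls = []
def compute(component, key):              # stand-in for a shortest-path computation
    calls.append((component, key))
    return len(calls)

idx = TimestampedIndex(compute)
idx.lookup("C1", "1->9")      # computed once
idx.lookup("C1", "1->9")      # served from the index
idx.on_update("C2")           # a different component: C1 entries stay valid
idx.lookup("C1", "1->9")      # still served from the index
idx.on_update("C1")           # invalidates C1 entries lazily
idx.lookup("C1", "1->9")      # recomputed on demand
print(len(calls))             # 2
```

Keeping one timestamp per component rather than per partition means an update in C2 does not force recomputation of entries that live entirely in C1.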

28 Experimental Results
Using six real edge-labeled graph datasets
Comparing with CHLR [1]
One to four orders-of-magnitude query-time speedup
[1] M. N. Rice and V. J. Tsotras. Graph indexing of road networks for shortest path queries with label restrictions. PVLDB, 4(2):69–80, 2010.

29 Average-Speedup of Query-Time

30 Index Size
Always less than 1 GB

31 Conclusions
EDP outperforms the state-of-the-art on static graphs and supports dynamic graphs
EDP efficiently prunes disallowed edges
Bridge edges are discovered in parallel to the main traversal thread
The dynamic index of EDP is incremental and query-workload aware
On-demand re-computation of the invalidated index entries
Index size can be controlled by the user
Up to four orders-of-magnitude query-time speedup w.r.t. the state-of-the-art

32 Thank You!

33 Contraction Hierarchies

34 Handling Large Bridge Vertexes
Consider Q(S, D, {R, B})
Avoid adding all the bridge edges at once
Define MaxBreadth parameter
Explore MaxBreadth bridge edges at a time
[Figure: partitions PR and PB with source S and destination D; the costs shown include 2, 6, 500, 900, and 9940]

38 Expensive Update Handling
Assume that (1⇝ 9) was computed at TS = 10
When to re-compute (1⇝ 9):
When a query asks for (1⇝ 9),
and the log of Partition PrR has updates with TS > 10,
and either an edge not in (1⇝ 9) has a decreased weight or an edge in (1⇝ 9) has an increased weight
[Figure: partition PrR with the path from Vertex 1 to Vertex 9]
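
The re-computation test on this slide translates directly into a predicate over a partition's update log. The sketch below mirrors only the conditions listed above and nothing more; the `Update` record, its field names, and `needs_recompute` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Update:
    ts: int                # timestamp of the weight update
    edge: tuple            # (u, v)
    old_weight: float
    new_weight: float

def needs_recompute(path_edges, computed_ts, log):
    """True iff some logged update newer than `computed_ts` either decreased
    the weight of an edge outside the path or increased the weight of an
    edge on the path."""
    for upd in log:
        if upd.ts <= computed_ts:
            continue                      # already reflected in the stored path
        on_path = upd.edge in path_edges
        if on_path and upd.new_weight > upd.old_weight:
            return True                   # the stored path became more expensive
        if not on_path and upd.new_weight < upd.old_weight:
            return True                   # a competing path may now be cheaper
    return False

path = {(1, 3), (3, 9)}                   # hypothetical edges of the path 1 ⇝ 9
log = [Update(8, (1, 3), 5, 9),           # older than TS = 10: ignored
       Update(12, (6, 8), 4, 7)]          # off-path increase: harmless
print(needs_recompute(path, 10, log))     # False
```

The test is more precise than a plain timestamp comparison but requires scanning the log, which is presumably why the slide calls it expensive.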

39 Expensive Update Handling (Cont’d)
Assume that BridgeEdges(1) was computed at TS = 10
When to re-compute BridgeEdges(1):
When a query asks for BridgeEdges(1),
and the log of Partition PrR has updates with TS > 10,
and an Edge (u, v) has a decreased/increased weight,
and Vertex u is reachable from Vertex 1,
and Vertex v can reach a bridge vertex
[Figure: partition PrB with bridge vertexes and their OtherHosts lists {G} and {R}]

40 Query Processing in EDP (Cont’d)
Check potential shorter paths in other allowed partitions through bridge edges (computed in parallel using lazy evaluation)
Consider Query Q2(1, 6, {R, B})
[Figure: partitions PrG, PrB, and PrR with bridge vertexes and their OtherHosts lists]

41 Query Processing in EDP (Cont’d)
Processing Query Q1(1, 6, {R, B})
Can start from PR(1) or PB(1)
PQ: {(PR(1), 0)}
[Figure: partitions PrG, PrB, and PrR with bridge vertexes and their OtherHosts lists]

42 Query Processing in EDP (Cont’d)
Processing Query Q1(1, 6, {R, B})
PQ: {(PR(1), 0)} → {(PB(1), 0), (PR(6), 10)}
[Figure: partitions PrG, PrB, and PrR; the edge 10, R is highlighted]

43 Query Processing in EDP (Cont’d)
Processing Query Q1(1, 6, {R, B})
PQ: {(PR(1), 0)} → {(PB(1), 0), (PR(6), 10)} → {(PR(2), 2), (PR(6), 10), (PR(7), 12)}
[Figure: partitions PrG, PrB, and PrR; the edge 10, R and a cost of 12 are highlighted]

45 Query Processing in EDP (Cont’d)
Processing Query Q1(1, 6, {R, B})
PQ: {(PR(1), 0)} → {(PB(1), 0), (PR(6), 10)} → {(PR(2), 2), (PR(6), 10), (PR(7), 12)} → {(PR(6), 9), (PR(6), 10), (PR(7), 12)}
[Figure: partitions PrG, PrB, and PrR; the edges 10, R and 7, R and a cost of 12 are highlighted]

46 Query Processing in EDP (Cont’d)
Processing Query Q1(1, 6, {R, B})
PQ: {(PR(1), 0)} → {(PB(1), 0), (PR(6), 10)} → {(PR(2), 2), (PR(6), 10), (PR(7), 12)} → {(PR(6), 9), (PR(6), 10), (PR(7), 12)} → Cost = 9
Slide annotation: Top-k shortest path
[Figure: partitions PrG, PrB, and PrR; the edges 10, R and 7, R and a cost of 12 are highlighted]

47 Handling Large Bridge Vertexes (Cont’d)
Breadth factor parameter (see the paper for details on how it is set)
Consider Query Q(S, D, {R, B}) with BreadthFactor = 2
Bridge edges are discovered by a thread running Dijkstra’s algorithm → sorted by cost
New attribute in a PQ element: next edge to explore
PQ: {(PR(S), 0, 0)}
[Figure: partitions PR and PB with source S and destination D; the costs shown include 500, 900, and 9940]

48 Handling Large Bridge Vertexes (Cont’d)
Processing Q(S, D, {R, B}) with BreadthFactor = 2
PQ: {(PR(S), 0, 0)}
[Figure: partitions PR and PB with source S and destination D]

49 Handling Large Bridge Vertexes (Cont’d)
Processing Q(S, D, {R, B}) with BreadthFactor = 2
PQ: {(PR(S), 0, 0)} → {(PR(1), 2, 0), (PR(2), 6, 0), (PR(S), 500, 3)}
[Figure: partitions PR and PB with source S and destination D]

50 Handling Large Bridge Vertexes (Cont’d)
Processing Q(S, D, {R, B}) with BreadthFactor = 2
PQ: {(PR(S), 0, 0)} → {(PR(1), 2, 0), (PR(2), 6, 0), (PR(S), 500, 3)} → {(PB(2), 6, 0), (PR(S), 500, 3)}
[Figure: partitions PR and PB with source S and destination D]

51 Handling Large Bridge Vertexes (Cont’d)
Processing Q(S, D, {R, B}) with BreadthFactor = 2
PQ: {(PR(S), 0, 0)} → {(PR(1), 2, 0), (PR(2), 6, 0), (PR(S), 500, 3)} → {(PB(2), 6, 0), (PR(S), 500, 3)} → {(PB(D), 8, 0), (PR(S), 500, 3)}
Destination reached with SP distance = 8
[Figure: partitions PR and PB with source S and destination D]
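
The trace above suggests a batched expansion: when a vertex is popped, only the next few of its cost-sorted edges are relaxed, and the vertex is re-inserted keyed by the cost of reaching its next unexplored edge. The sketch below applies that idea to a plain adjacency list; it ignores partitions and the parallel discovery thread, uses 0-based edge indexes (the slides appear to count from 1), and `batched_dijkstra` and `ADJ` are hypothetical names and data chosen to resemble the slide's numbers.

```python
import heapq

def batched_dijkstra(sorted_adj, source, target, max_breadth):
    """Shortest-path cost where each vertex's outgoing edges, sorted by
    ascending cost, are explored at most `max_breadth` at a time.
    Queue elements are (key, vertex, index of next edge to explore)."""
    dist = {source: 0}
    settled = set()
    pq = [(0, source, 0)]
    while pq:
        key, v, nxt = heapq.heappop(pq)
        if nxt == 0:                      # first visit of v
            if v in settled:
                continue
            settled.add(v)
            if v == target:
                return key
        edges = sorted_adj.get(v, [])
        for cost, u in edges[nxt:nxt + max_breadth]:
            nd = dist[v] + cost           # dist[v] is final once v is settled
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(pq, (nd, u, 0))
        nxt += max_breadth
        if nxt < len(edges):
            # Re-insert v; the key is a lower bound for all remaining edges.
            heapq.heappush(pq, (dist[v] + edges[nxt][0], v, nxt))
    return None

# Hypothetical graph: S has many outgoing edges, most of them expensive.
ADJ = {"S": [(2, "1"), (6, "2"), (500, "3"), (900, "4"), (9940, "5")],
       "2": [(2, "D")]}

print(batched_dijkstra(ADJ, "S", "D", 2))   # 8
```

With a breadth of 2, the destination is reached at distance 8 while the edges costing 500, 900, and 9940 are never relaxed; they stay represented by a single queue element for S.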

52 Future Work
Support non-categorical attributes (e.g., latency in a communication network)
Optimize for other graph queries (e.g., reachability)
Extend a relational engine to support graphs natively:
Extend the query language (declarative and procedural)
Introduce primitive graph operators
Seamless pipelining of graph and relational operators in the same query execution plan

