Published by Angelina Conley. Modified over 9 years ago.
1
Sum-Max Monotonic Ranked Joins for Evaluating Top-K Twig Queries on Weighted Data Graphs
Yan Qi, Arizona State University
K. Selcuk Candan, Arizona State University
Maria Luisa Sapino, University of Torino
VLDB ’07, September 23-28, 2007, Vienna, Austria
2008. 01. 25
Summarized by Dongjoo Lee, IDS Lab., Seoul National University
Presented by Dongjoo Lee, IDS Lab., Seoul National University
2
Copyright 2008 by CEBT
Contents
  Motivation
  Top-k Twig Queries over Weighted Data Graphs
  Answering Twig Queries on Weighted Graphs
  Sum-Max Monotonicity
  Progressive Result Enumeration
  HR-Join, MHR-Join
  Experiments
  Conclusion
3
Motivation
  Query processing on metadata with conflicts (FICSR, SIGMOD ’07)
  FICSR: Feedback-based InConSistency Resolution and Query Processing on Misaligned Data Sources
4
FICSR Integrated Representation
  [Figure panels: Internal FICSR Representation; Simplified Visualization for the User]
5
FICSR Agreements
  Based on source analysis and user feedback
6
Top-k Twig Queries over Weighted Data Graphs
  XPath, XQuery
  [Figure annotation: More desirable!]
  Can we find the more desirable result before the other?
7
Answering Twig Queries on Weighted Graphs
  NP-complete problem
  By reduction from the Group Steiner Tree Problem
  [Figure label: VLSI design]
8
How to Solve the Problem?
  Use ranked-join algorithms for top-k queries
    A query plan with better sub-plans is always more desirable
    The scoring function must be monotonic
  Twig query?
    Sum-max monotonicity holds!
    We can enumerate the results incrementally!
9
Sum-Max Monotonicity (1)
  [Figure: the same weighted data graph (nodes A, B, C, D, E; edge weights 5, 7, 3, 2, 10, 12) shown once per query]
  A//C: Cost = 12
  A//D: Cost = 10
  A[//C]//D: Cost = 17 < 22
  PROPOSITION 1
    For a twig query q = T_q(V_q, E_q) and a result r with sub-results SR = {sr_1, sr_2, …, sr_m}:
    $\max_{i} cost(sr_i) \le cost(r) \le \sum_{i} cost(sr_i)$
    Example: (max(10, 12) = 12) < (cost = 17) < (12 + 10 = 22)
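The bounds in the example can be checked with a few lines of Python. This is only a sketch under an assumed graph layout: the slide gives the edge weights but not their endpoints, so the edges A-E (5), E-C (7), E-B (3), and B-D (2) below are a hypothetical placement that happens to reproduce the slide's costs. The edges weighted 10 and 12 are left out because their endpoints cannot be recovered from the transcript. The cost of a result is taken to be the total weight of the distinct edges it uses.

```python
# Hypothetical edge placement; only the weights come from the slide.
WEIGHTS = {("A", "E"): 5, ("E", "C"): 7, ("E", "B"): 3, ("B", "D"): 2}

def path_edges(path):
    """Return the set of edges traversed by a node path."""
    return set(zip(path, path[1:]))

def cost(edges):
    """Cost of a (sub-)result: total weight of its distinct edges."""
    return sum(WEIGHTS[e] for e in edges)

sub_c = path_edges(["A", "E", "C"])        # sub-result for A//C
sub_d = path_edges(["A", "E", "B", "D"])   # sub-result for A//D
result = sub_c | sub_d                     # result for A[//C]//D; edge A-E is shared

lower = max(cost(sub_c), cost(sub_d))      # max bound
upper = cost(sub_c) + cost(sub_d)          # sum bound
print(lower, cost(result), upper)          # 12 17 22
```

The result is cheaper than the sum of its sub-results because the shared edge A-E is counted once, and it is at least as expensive as the costliest sub-result because it contains that sub-result.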
10
Sum-Max Monotonicity (2)
  PROPERTY 1
    Answers r_1, r_2 with sets of sub-results R_1, R_2
  Use for pruning
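The formal statement of Property 1 did not survive in this transcript, so the sketch below does not reproduce it. It only illustrates one pruning test that follows directly from the max and sum bounds: if the costliest sub-result of r_1 already exceeds the summed sub-result costs of r_2, then r_1 must cost more than r_2. The function names are hypothetical.

```python
def lower_bound(sub_costs):
    """A result costs at least as much as its costliest sub-result."""
    return max(sub_costs)

def upper_bound(sub_costs):
    """A result costs at most the sum of its sub-result costs."""
    return sum(sub_costs)

def certainly_worse(sub_costs_1, sub_costs_2):
    """True when r_1 is guaranteed to cost more than r_2.

    Follows from cost(r_1) >= max(R_1) > sum(R_2) >= cost(r_2).
    A False answer means the bounds alone cannot decide.
    """
    return lower_bound(sub_costs_1) > upper_bound(sub_costs_2)

print(certainly_worse([30, 4], [10, 12]))   # True: 30 > 22
print(certainly_worse([12, 10], [10, 12]))  # False: bounds overlap
```

A test like this lets a join discard a candidate without computing its exact cost.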
11
Progressive Result Enumeration
  [Figure: join for A[//C]//B over the cost-ranked streams A//B and A//C, drawn in successive steps along a cost axis. Stream entries are written node(cost): v2(2), v3(4), v8(5), v1(7), v9(7), v19(8), v8(9), v1(10), v4(10), v5(10), v15(10), v16(12), v11(13).]
  Horizon = ∞
  Horizon = 14 (result cost 14 ≤ 5 + 9)
  Horizon = 11 (result cost 11 ≤ 10 + 7)
  Horizon = 14
  Horizon tightening
  Horizon relaxation
  The best can be returned as the top-1 result
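The horizon steps above can be sketched as a two-way join over cost-sorted streams. This is an illustrative reading of the slide, not the authors' algorithm. It assumes that each node appears at most once per stream, that streams join on node id, and that the true cost of a joined result is available as a lookup (`TRUE_COST`, hypothetical). The stream contents are loosely based on the figure's values. The horizon is the cheapest buffered result: because a result costs at least as much as each of its sub-results, no unread tuple at or above the horizon can produce a cheaper result.

```python
import heapq
from math import inf

def hr_join_top_k(left, right, result_cost, k):
    """Return up to k (cost, node) results in ascending cost order."""
    streams = (left, right)
    pos = [0, 0]            # read position in each stream
    seen = ({}, {})         # node -> sub-result cost, per stream
    candidates = []         # min-heap of joined results found so far
    output = []
    while len(output) < k:
        # Horizon: cost of the best buffered result (infinite if none).
        horizon = candidates[0][0] if candidates else inf
        heads = [(streams[i][pos[i]][1], i)
                 for i in (0, 1) if pos[i] < len(streams[i])]
        if heads and min(heads)[0] < horizon:
            # An unread tuple below the horizon could still yield a better result.
            _, i = min(heads)
            node, cost = streams[i][pos[i]]
            pos[i] += 1
            seen[i][node] = cost
            if node in seen[1 - i]:
                # New match: pushing it may tighten the horizon.
                heapq.heappush(candidates, (result_cost[node], node))
        elif candidates:
            # Nothing below the horizon remains: emit; the horizon then relaxes.
            output.append(heapq.heappop(candidates))
        else:
            break
    return output

# Hypothetical streams and result costs, loosely based on the slide.
A_B = [("v2", 2), ("v8", 5), ("v1", 7), ("v19", 8), ("v4", 10), ("v16", 12)]
A_C = [("v3", 4), ("v9", 7), ("v8", 9), ("v1", 10), ("v5", 10), ("v11", 13)]
TRUE_COST = {"v8": 14, "v1": 11}

print(hr_join_top_k(A_B, A_C, TRUE_COST, 2))  # [(11, 'v1'), (14, 'v8')]
```

With `k=1` the loop stops before reading v16(12) and v11(13), since both lie above the tightened horizon of 11.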
12
HR-Join: Horizon-based Ranked Join
  Result Sieve
    Creates a cost-ranked output stream
    Uses a heap-sort algorithm
    Controls the horizon valves
  Horizon Valve
    Controls the data availability on a given stream of cost-ranked data
    Uses a horizon variable that is externally controlled
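The two components can be pictured as a pair of small classes. This is a minimal sketch under assumptions: the class and method names are hypothetical, the valve releases only tuples strictly below the horizon, and the sieve sets every valve's horizon to the cost of its cheapest buffered result. The slide does not specify these details.

```python
import heapq
from math import inf

class HorizonValve:
    """Gates a cost-sorted stream of (cost, payload) tuples."""
    def __init__(self, ranked_items):
        self._items = list(ranked_items)
        self._pos = 0
        self.horizon = inf   # set externally, here by the sieve

    def pull(self):
        """Return the next tuple if it lies below the horizon, else None."""
        if self._pos < len(self._items) and self._items[self._pos][0] < self.horizon:
            item = self._items[self._pos]
            self._pos += 1
            return item
        return None

class ResultSieve:
    """Buffers results in a heap and steers the valves' horizon."""
    def __init__(self, valves):
        self._heap = []
        self._valves = valves

    def _update_horizon(self):
        horizon = self._heap[0][0] if self._heap else inf
        for valve in self._valves:
            valve.horizon = horizon

    def push(self, cost, result):
        heapq.heappush(self._heap, (cost, result))
        self._update_horizon()   # may tighten the horizon

    def pop(self):
        item = heapq.heappop(self._heap)
        self._update_horizon()   # relaxes the horizon
        return item

valve = HorizonValve([(2, "v2"), (5, "v8"), (7, "v1")])
sieve = ResultSieve([valve])
print(valve.pull())      # (2, 'v2')
sieve.push(5, "r1")      # horizon tightens to 5
print(valve.pull())      # None: next tuple costs 5, not below the horizon
print(sieve.pop())       # (5, 'r1'); horizon relaxes to infinity
print(valve.pull())      # (5, 'v8')
```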
13
Query Plan using HR-Join Operators
  [Figure: a query twig and sub-queries]
14
M-way HR-Joins (MHR-Joins)
  [Figure: a query twig and sub-queries]
15
Sub Jobs
  Sub-result enumeration
    K-shortest simple paths problem
    O(k|V|(|E| + |V| log |V|))
  Dealing with “*” wildcards in twigs
    Can be expensive
    Query rewriting
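Sub-result enumeration needs the paths between two nodes in ascending cost order. The sketch below is a naive best-first enumeration of simple paths, shown only to make the idea concrete. It is not the algorithm behind the complexity bound quoted on the slide and can take exponential time in the worst case. It assumes non-negative edge weights, and the directed graph is a hypothetical example.

```python
import heapq
from itertools import islice

def ranked_simple_paths(graph, source, target):
    """Yield (cost, path) for simple paths in non-decreasing cost order.

    graph maps node -> {neighbor: weight}; weights must be non-negative.
    """
    heap = [(0, [source])]            # partial paths ordered by cost so far
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            yield cost, path
            continue
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:       # keep the path simple (no repeated node)
                heapq.heappush(heap, (cost + weight, path + [nxt]))

# Hypothetical weighted graph for illustration.
GRAPH = {
    "A": {"E": 5, "C": 12, "D": 10},
    "E": {"C": 7, "B": 3},
    "B": {"D": 2},
}

for cost, path in islice(ranked_simple_paths(GRAPH, "A", "D"), 2):
    print(cost, "->".join(path))
```

Because the generator is lazy, a join operator can request the next-cheapest sub-result only when its horizon allows it.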
16
Experiments
  Data
    FICSR weighted graph data
  Query plans
    HR-Join, M-way HR-Join
  Two significantly different join selectivity distributions: ~10% and 1-to-1
17
Results
18
Conclusion
  Twig query processing over weighted data graphs
  Optimization using HR-Join, based on sum-max monotonicity
  HR-Join, MHR-Join