Published by Sheryl Chapman. Modified over 9 years ago.
1
Answering Top-k Queries Using Views
Gautam Das (Univ. of Texas), Dimitrios Gunopulos (Univ. of California Riverside), Nick Koudas (Univ. of Toronto), Dimitris Tsirogiannis (Univ. of Toronto)
2
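VLDB '06
Introduction
Preferences are expressed as scoring functions on the attributes of a relation, e.g. f = 3*X1 + 2*X2 + 5*X3 (the linear function that reproduces the scores listed here).
Relation R:
  tid  X1  X2  X3
  1    82   1  59
  2    53  19  83
  3    29  99  15
  4    80  45   8
  5    28  32  39
Scores, in descending order:
  tid  Score
  2    612
  1    543
  4    370
  3    360
  5    343
Top-k: the k tuples with the highest score.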
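As a rough illustration of the top-k definition above, the following sketch scores the five tuples of R and returns the k best. The weights (3, 2, 5) are an assumption: the formula is missing from the slide text, and these are the weights that reproduce the five listed scores. The helper names `score` and `top_k` are hypothetical.

```python
import heapq

# Relation R from the slide: tid -> (X1, X2, X3).
R = {
    1: (82, 1, 59),
    2: (53, 19, 83),
    3: (29, 99, 15),
    4: (80, 45, 8),
    5: (28, 32, 39),
}

# Assumed weights: 3*X1 + 2*X2 + 5*X3 reproduces the scores listed on the slide.
WEIGHTS = (3, 2, 5)


def score(weights, values):
    """Linear additive score: weighted sum of the attribute values."""
    return sum(w * v for w, v in zip(weights, values))


def top_k(relation, weights, k):
    """Return the k (tid, score) pairs with the highest score, best first."""
    scored = ((score(weights, vals), tid) for tid, vals in relation.items())
    return [(tid, s) for s, tid in heapq.nlargest(k, scored)]


print(top_k(R, WEIGHTS, 2))  # [(2, 612), (1, 543)]
```

This scans the whole relation, which is the baseline that TA-style algorithms and ranking views try to avoid.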
3
Related Work
TA [Fagin et al. '96]
  Deterministic stopping condition
  Always returns the correct top-k set
PREFER [Hristidis et al. '01]
  Stores multiple copies of the base relation R
  Utilizes only one copy per query
We complement existing approaches.
4
Motivation
Query answering using views
  Space-performance tradeoff
  Improved efficiency
Can we exploit the same tradeoffs for top-k query answering?
5
Problem Statement
Base relation R:
  tid  X1  X2  X3
  1    82   1  59
  2    53  19  83
  3    29  99  15
  4    80  45   8
  5    28  32  39
Ranking view V1:
  tid  Score
  3    553
  4    385
  5    216
  2    201
  1    169
Ranking view V2:
  tid  Score
  2    351
  1    237
  5    177
  3    159
  4     88
Ranking views: materialized results of previously asked top-k queries.
Problem: Can we answer new ad-hoc top-k queries efficiently using ranking views?
6
Outline
  LPTA Algorithm
  View Selection Problem
  Cost Estimation Framework
  View Selection Algorithms
  Experimental Evaluation
  Conclusions
7
LPTA - Setting
Linear additive scoring functions, i.e. weighted sums of attribute values
Set of views:
  Each view is the materialized result of a previously executed top-k query
  Each view may score an arbitrary subset of the attributes
  Sorted access on (tid, score) pairs
Random access on the base table R
8
LPTA - Example
[Figure: tuples of R(X1, X2) plotted on axes X1 and X2; labels: V1, V2, Q, top-1, stopping condition]
9
LPTA
Linear Programming adaptation of TA
[Slide labels: Q, V1, V2, iteration d]
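The slide's formulas did not survive extraction, so the following is only a sketch of how a linear-programming adaptation of TA could work for two attributes, under stated assumptions. It reads the views in lock step, fetches each new tuple from R by random access, and after every round solves a small linear program: maximize the query score subject to each view score being at most the last score read from that view. It stops once the current k-th best score is at least that bound. The names `solve_lp_2d`, `materialize` and `lpta`, the toy relation `R2`, and the brute-force vertex-enumeration LP solver are all assumptions made for illustration; a real implementation would use a proper LP solver.

```python
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_lp_2d(objective, constraints):
    """Maximise objective . x subject to a . x <= b for every (a, b).

    Brute force over pairwise line intersections; assumes the feasible
    region is a bounded, non-empty polygon in two dimensions.
    """
    best = None
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:
            continue  # parallel constraint lines never meet
        x = (b1 * a2[1] - b2 * a1[1]) / det
        y = (a1[0] * b2 - a2[0] * b1) / det
        if all(a[0] * x + a[1] * y <= b + 1e-9 for a, b in constraints):
            value = objective[0] * x + objective[1] * y
            best = value if best is None else max(best, value)
    return best


def materialize(relation, weights):
    """Ranking view: (tid, score) pairs sorted by descending view score."""
    rows = [(tid, dot(weights, vals)) for tid, vals in relation.items()]
    return sorted(rows, key=lambda row: -row[1])


def lpta(relation, views, query, k, domain_max):
    """Return (top-k list, number of sequential accesses) for a 2-d query."""
    box = [((1, 0), domain_max), ((0, 1), domain_max), ((-1, 0), 0), ((0, -1), 0)]
    seen, accesses, topk = {}, 0, []
    for d in range(min(len(rows) for _, rows in views)):
        constraints = list(box)
        for weights, rows in views:
            tid, view_score = rows[d]            # sorted access on the view
            accesses += 1
            if tid not in seen:
                seen[tid] = dot(query, relation[tid])  # random access on R
            constraints.append((weights, view_score))
        topk = sorted(seen.items(), key=lambda kv: -kv[1])[:k]
        unseen_max = solve_lp_2d(query, constraints)   # best possible unseen score
        if len(topk) == k and topk[-1][1] >= unseen_max:
            break                                # stopping condition
    return topk, accesses


# Hypothetical 2-d data, not taken from the slides.
R2 = {1: (9, 2), 2: (3, 8), 3: (6, 6), 4: (1, 1), 5: (2, 4)}
V1, V2, Q = (2, 1), (1, 2), (1, 1)
VIEWS = [(V1, materialize(R2, V1)), (V2, materialize(R2, V2))]
print(lpta(R2, VIEWS, Q, 1, 10))  # ([(3, 12)], 4)
```

In this toy run the bound is 13 after the first round and 12 after the second, at which point tuple 3 (score 12) is confirmed as the top-1 after four sequential accesses.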
10
LPTA - Example (cont'd)
[Figure: tuples of R(X1, X2) plotted on axes X1 and X2; labels: V1, V2, Q, top-1, stopping condition]
12
View Selection Problem
Given a collection of views and a query Q, determine the most efficient subset of views on which to execute Q.
Conceptual discussion:
  Two dimensions
  Higher dimensions
13
View Selection - 2d
[Figure: axes A and B; labels: Q, V1, V2, A1, B1, M, minimum top-k tuple]
14
View Selection - Higher d
Theorem: If V is a set of views for an m-dimensional dataset and Q is a query, the optimal execution of LPTA requires a subset of views U ⊆ V such that |U| ≤ m.
Question: How do we select the optimal subset of views?
16
Cost Estimation Framework
What is the cost of running LPTA when a specific set of views is used to answer a query?
Cost = number of sequential accesses
[Figure: axes A and B; labels: Q, V1, V2, minimum top-k tuple; in the example shown, cost = 6 sequential accesses]
Can we find that cost without actually running LPTA?
17
Simulation of LPTA on Histograms
H_Q approximates the score distribution of the query Q: b buckets, n/b tuples per bucket.
1. Use H_Q to estimate the score of the k-th highest tuple (topk_min).
2. Simulate LPTA in a bucket-by-bucket lock step to estimate the cost.
[Figure: histograms H_Q, H_V1 and H_V2, with topk_min and the estimated cost marked]
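Step 1 above can be sketched as follows, assuming an equi-depth histogram (b buckets of n/b tuples each, as the slide states) and assuming scores are spread uniformly inside a bucket. The uniformity assumption and the function name `estimate_topk_min` are illustrative choices, not something the slides specify; step 2 is not covered here.

```python
def estimate_topk_min(bucket_bounds, n, k):
    """Estimate the k-th highest score from an equi-depth histogram.

    bucket_bounds: list of (low, high) score ranges in ascending order,
    each assumed to hold n / len(bucket_bounds) tuples.
    """
    per_bucket = n / len(bucket_bounds)
    remaining = k
    # Walk from the highest-scoring bucket downwards.
    for low, high in reversed(bucket_bounds):
        if remaining <= per_bucket:
            # Assumption: scores are uniform within the bucket, so
            # interpolate linearly from its upper bound.
            fraction = remaining / per_bucket
            return high - fraction * (high - low)
        remaining -= per_bucket
    return bucket_bounds[0][0]  # k exceeds n: fall back to the lowest score


# Hypothetical histogram: 4 buckets over 400 tuples.
H_Q = [(0, 10), (10, 30), (30, 60), (60, 100)]
print(estimate_topk_min(H_Q, n=400, k=50))   # 80.0
print(estimate_topk_min(H_Q, n=400, k=150))  # 45.0
```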
19
View Selection Algorithms
Exhaustive (E): Check all possible subsets of views whose size is at most the dimensionality of the dataset.
Greedy (SV): Keep expanding the set of views to use until the estimated cost stops decreasing.
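The two strategies can be sketched as follows, given any function that estimates the cost of running the query on a subset of views. The cost table here is made up purely to show that the greedy choice can differ from the exhaustive one; `select_exhaustive`, `select_greedy` and `toy_cost` are hypothetical names, and `max_size` is an assumed parameter for the subset-size bound.

```python
from itertools import combinations


def select_exhaustive(views, estimate_cost, max_size):
    """Try every non-empty subset up to max_size and keep the cheapest."""
    candidates = (
        subset
        for size in range(1, max_size + 1)
        for subset in combinations(views, size)
    )
    return min(candidates, key=estimate_cost)


def select_greedy(views, estimate_cost):
    """Add the view that lowers the estimated cost most; stop when none does."""
    selected, best_cost = (), float("inf")
    remaining = list(views)
    while remaining:
        cost, view = min(
            ((estimate_cost(selected + (v,)), v) for v in remaining),
            key=lambda pair: pair[0],
        )
        if cost >= best_cost:
            break  # estimated cost stopped decreasing
        selected += (view,)
        best_cost = cost
        remaining.remove(view)
    return selected, best_cost


# Made-up estimated costs (sequential accesses) per subset of views.
COSTS = {
    frozenset({"V1"}): 10,
    frozenset({"V2"}): 8,
    frozenset({"V3"}): 12,
    frozenset({"V1", "V2"}): 9,
    frozenset({"V1", "V3"}): 4,
    frozenset({"V2", "V3"}): 7,
    frozenset({"V1", "V2", "V3"}): 5,
}


def toy_cost(subset):
    return COSTS[frozenset(subset)]


VIEW_NAMES = ["V1", "V2", "V3"]
print(select_exhaustive(VIEW_NAMES, toy_cost, max_size=2))  # ('V1', 'V3')
print(select_greedy(VIEW_NAMES, toy_cost))                  # (('V2', 'V3', 'V1'), 5)
```

With these numbers the greedy search commits to V2 first and ends at cost 5, while the exhaustive search finds the cheaper pair (V1, V3) at cost 4, at the price of evaluating more subsets.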
20
Select Views Spherical (SVS)
Requires the solution of a single linear program.
[Figure labels: Q, T, selected views]
21
Select Views By Angle (SVA)
Sort the views by increasing angle with respect to Q.
[Figure labels: Q, V1, V2, V3, V4, selected views]
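A minimal sketch of the sorting step, assuming each view and the query are represented by the weight vector of their linear scoring function. The weight vectors below are hypothetical, and how many views to take from the front of the sorted list is not stated on the slide.

```python
import math


def angle(u, v):
    """Angle in radians between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def sort_views_by_angle(views, query):
    """Return view names ordered by increasing angle to the query vector."""
    return sorted(views, key=lambda name: angle(views[name], query))


# Hypothetical weight vectors for four views and a query.
VIEW_WEIGHTS = {
    "V1": (1.0, 0.0),
    "V2": (1.0, 2.0),
    "V3": (1.0, 1.2),
    "V4": (1.0, 4.0),
}
QUERY = (1.0, 1.0)
print(sort_views_by_angle(VIEW_WEIGHTS, QUERY))  # ['V3', 'V2', 'V4', 'V1']
```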
22
General Queries and Views
Views that materialize only their top-k tuples:
  Truncate the view histograms.
Accommodating range conditions:
  Select the views that cover the range conditions.
  Truncate each attribute's histogram.
24
Experiments
Datasets: Uniform, Zipf, Real
Experiments:
  Performance comparison of LPTA, PREFER and TA
  Accuracy of the cost estimation framework
  Performance of LPTA using each of the view selection algorithms
  Scalability of the LPTA algorithm
25
Performance comparison of LPTA, PREFER and TA
[Figure panels: uniform dataset, 3d; real dataset, 2d]
26
Cost Estimation Accuracy (2d)
[Figure panels: buckets = 0.5% of n; buckets = 1% of n]
27
Performance of LPTA using View Selection Algorithms (500K tuples, top-100)
[Figure panels: 2d; 3d]
28
Scalability Experiments on LPTA (2d, uniform dataset; 500K tuples, top-100)
29
Conclusions
Using views for top-k query answering
LPTA: linear programming adaptation of TA
View selection problem, cost estimation framework, view selection algorithms
Experimental evaluation
30
Thank you! Questions?