Answering Top-k Queries Using Views. Gautam Das (Univ. of Texas), Dimitrios Gunopulos (Univ. of California Riverside), Nick Koudas (Univ. of Toronto), Dimitris Tsirogiannis (Univ. of Toronto)


1 Answering Top-k Queries Using Views Gautam Das (Univ. of Texas), Dimitrios Gunopulos (Univ. of California Riverside), Nick Koudas (Univ. of Toronto), Dimitris Tsirogiannis (Univ. of Toronto)

2 VLDB '06 Introduction. Preferences are expressed as scoring functions on the attributes of a relation R, e.g. f = 3X1 + 2X2 + 5X3. Top-k: the k tuples with the highest score.
Relation R:
tid  X1  X2  X3
1    82   1  59
2    53  19  83
3    29  99  15
4    80  45   8
5    28  32  39
Tuples of R ranked by score:
tid  Score
2    612
1    543
4    370
3    360
5    343
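The ranking on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code: it assumes the flattened table reads as five tuples with three attributes each, and it assumes the weights (3, 2, 5), chosen because they reproduce all five scores listed on the slide. The names `score` and `top_k` are illustrative.

```python
import heapq

# Relation R as read from the slide: tid -> (X1, X2, X3).
R = {1: (82, 1, 59), 2: (53, 19, 83), 3: (29, 99, 15),
     4: (80, 45, 8), 5: (28, 32, 39)}

# Assumed weights: they reproduce the five scores shown on the slide.
WEIGHTS = (3, 2, 5)

def score(t, weights=WEIGHTS):
    """Linear additive score of one tuple."""
    return sum(w * x for w, x in zip(weights, t))

def top_k(relation, k, weights=WEIGHTS):
    """Return the k (score, tid) pairs with the highest score, best first."""
    scored = ((score(t, weights), tid) for tid, t in relation.items())
    return heapq.nlargest(k, scored)

print(top_k(R, 2))  # [(612, 2), (543, 1)]
```

Scanning the whole relation like this is the baseline that TA, PREFER and LPTA all try to avoid.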

3 Related Work. TA [Fagin et al. '96]: deterministic stopping condition; always returns the correct top-k set. PREFER [Hristidis et al. '01]: stores multiple copies of the base relation R, but utilizes only one of them. We complement existing approaches.
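For readers unfamiliar with TA, the deterministic stopping condition can be illustrated as follows. This is a simplified sketch of the well-known threshold algorithm, not code from the talk: it assumes non-negative weights, one sorted list per attribute, and an in-memory dictionary standing in for random access. The relation and the weights (3, 2, 5) are taken as assumptions from the introductory example.

```python
def threshold_algorithm(R, weights, k):
    """Return (top-k list of (tid, score), depth reached in the sorted lists)."""
    m = len(weights)
    score = lambda t: sum(w * x for w, x in zip(weights, t))
    # One list of tids per attribute, sorted by that attribute, largest first.
    lists = [sorted(R, key=lambda tid, j=j: -R[tid][j]) for j in range(m)]
    seen, top = {}, []
    for depth in range(len(R)):
        for j in range(m):
            tid = lists[j][depth]         # sorted access
            seen[tid] = score(R[tid])     # random access for the full score
        # Threshold: best score any unseen tuple could still have.
        threshold = sum(weights[j] * R[lists[j][depth]][j] for j in range(m))
        top = sorted(seen.items(), key=lambda kv: -kv[1])[:k]
        if len(top) == k and top[-1][1] >= threshold:
            return top, depth + 1
    return top, len(R)

R = {1: (82, 1, 59), 2: (53, 19, 83), 3: (29, 99, 15),
     4: (80, 45, 8), 5: (28, 32, 39)}
print(threshold_algorithm(R, (3, 2, 5), 2))  # ([(2, 612), (1, 543)], 3)
```

On this data the threshold drops from 859 to 625 to 418, so the scan stops after three rounds of sorted access instead of five.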

4 Motivation. Query answering using views offers a space-performance tradeoff and improved efficiency. Can we exploit the same tradeoffs for top-k query answering?

5 Problem Statement. Ranking views: materialized results of previously asked top-k queries. Problem: can we answer new ad-hoc top-k queries efficiently using ranking views?
Relation R:
tid  X1  X2  X3
1    82   1  59
2    53  19  83
3    29  99  15
4    80  45   8
5    28  32  39
View V1:
tid  Score
3    553
4    385
5    216
2    201
1    169
View V2:
tid  Score
2    351
1    237
5    177
3    159
4     88

6 Outline: LPTA Algorithm; View Selection Problem; Cost Estimation Framework; View Selection Algorithms; Experimental Evaluation; Conclusions

7 LPTA - Setting. Linear additive scoring functions. Set of views: each view is the materialized result of a previously executed top-k query, defined over an arbitrary subset of attributes. Sorted access on the (tid, score) pairs of each view. Random access on the base table R.

8 LPTA - Example. [Figure: relation R(X1, X2) plotted on axes X1 and X2, showing views V1 and V2, query Q, the top-1 tuple and the stopping condition.]

9 LPTA: Linear Programming adaptation of TA. [Figure: query Q and views V1 and V2 at iteration d.]

10 LPTA - Example (cont'd). [Figure: relation R(X1, X2) plotted on axes X1 and X2, showing views V1 and V2, query Q, the top-1 tuple and the stopping condition.]
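The slides show LPTA only through figures, so the following is a hedged reconstruction of the idea rather than the authors' algorithm. It assumes two attributes, non-negative weights, attribute values in [0, ub], and views read in lockstep. At depth d, an unseen tuple can score no more in each view than the last score read there, so the best query score it could have is the optimum of a small linear program. For two attributes that program is solved here by enumerating vertices. All function names and the sample views and query are hypothetical.

```python
from itertools import combinations

def dot(w, t):
    return sum(a * b for a, b in zip(w, t))

def lp_max_2d(q, rows, ub):
    """Maximize q.x over the box [0, ub]^2 subject to a.x <= b for (a, b) in rows."""
    cons = [(a[0], a[1], b) for a, b in rows]
    cons += [(-1, 0, 0), (0, -1, 0), (1, 0, ub), (0, 1, ub)]
    best = None
    for (a1, a2, b1), (c1, c2, b2) in combinations(cons, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < 1e-12:
            continue                      # parallel lines: no vertex
        x = (b1 * c2 - a2 * b2) / det     # Cramer's rule
        y = (a1 * b2 - c1 * b1) / det
        if all(p * x + r * y <= s + 1e-9 for p, r, s in cons):
            val = q[0] * x + q[1] * y
            best = val if best is None else max(best, val)
    return best

def lpta(R, view_weights, q, k, ub):
    """Return (top-k list of (tid, score), number of sequential accesses)."""
    views = [sorted(((dot(w, t), tid) for tid, t in R.items()), reverse=True)
             for w in view_weights]
    seen, top = {}, []
    for d in range(len(R)):
        for view in views:
            _, tid = view[d]              # sorted access on the view
            seen[tid] = dot(q, R[tid])    # random access on the base table
        top = sorted(seen.items(), key=lambda kv: -kv[1])[:k]
        last = [view[d][0] for view in views]
        bound = lp_max_2d(q, list(zip(view_weights, last)), ub)
        if len(top) == k and top[-1][1] >= bound:
            return top, (d + 1) * len(views)
    return top, len(R) * len(views)

# Hypothetical 2d data, views and query, for illustration only.
R2 = {1: (82, 1), 2: (53, 19), 3: (29, 99), 4: (80, 45), 5: (28, 32)}
print(lpta(R2, [(2, 5), (5, 1)], (3, 2), 1, 100))
```

The second return value counts sequential accesses, the cost measure used later in the talk.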

11 Outline: LPTA Algorithm; View Selection Problem; Cost Estimation Framework; View Selection Algorithms; Experimental Evaluation; Conclusions

12 View Selection Problem. Given a collection of views and a query Q, determine the most efficient subset of views on which to execute Q. Conceptual discussion: two dimensions; higher dimensions.

13 View Selection - 2d. [Figure: axes A and B, showing views V1 and V2, query Q, points A1, B1 and M, and the minimum top-k tuple.]

14 View Selection - Higher d. Theorem: If V is a set of views for an m-dimensional dataset and Q a query, the optimal execution of LPTA requires a subset of views U ⊆ V such that |U| ≤ m. Question: how do we select the optimal subset of views?

15 Outline: LPTA Algorithm; View Selection Problem; Cost Estimation Framework; View Selection Algorithms; Experimental Evaluation; Conclusions

16 Cost Estimation Framework. What is the cost of running LPTA when a specific set of views is used to answer a query? Cost = number of sequential accesses. Can we find that cost without actually running LPTA? [Figure: axes A and B, showing views V1 and V2, query Q and the minimum top-k tuple; in the example, cost = 6 sequential accesses.]

17 Simulation of LPTA on Histograms. HQ approximates the score distribution of the query Q, using b buckets with n/b tuples per bucket. 1. Use HQ to estimate the score of the k-th highest tuple (topk_min). 2. Simulate LPTA on the view histograms HV1 and HV2 in bucket-by-bucket lockstep to estimate the cost.
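Step 1 can be illustrated with a short sketch. It is one plausible reading, not the method from the talk: the histogram is assumed to be a list of (low, high, count) buckets ordered from the highest scores down, and scores are assumed to be spread uniformly inside each bucket. The function name `estimate_kth_score` is hypothetical, and step 2 is not covered.

```python
def estimate_kth_score(hist, k):
    """Estimate the k-th highest score from a score histogram.

    hist: list of (low, high, count) buckets, highest-scoring bucket first.
    Assumes count > 0 and a uniform spread of scores within each bucket.
    """
    seen = 0
    for low, high, count in hist:
        if seen + count >= k:
            # Fraction of this bucket consumed before reaching the k-th tuple.
            frac = (k - seen) / count
            return high - frac * (high - low)
        seen += count
    raise ValueError("k exceeds the number of tuples in the histogram")

# Hypothetical equi-depth histogram: 3 buckets of 10 tuples each.
H_Q = [(80, 100, 10), (60, 80, 10), (40, 60, 10)]
print(estimate_kth_score(H_Q, 5))   # 90.0
print(estimate_kth_score(H_Q, 15))  # 70.0
```

With n/b tuples per bucket, the estimate can be off by at most one bucket width, which is why the number of buckets matters for accuracy.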

18 Outline: LPTA Algorithm; View Selection Problem; Cost Estimation Framework; View Selection Algorithms; Experimental Evaluation; Conclusions

19 View Selection Algorithms. Exhaustive (E): check all possible subsets of views of each admissible size. Greedy (SV): keep expanding the set of views to use until the estimated cost stops decreasing.
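The greedy strategy can be sketched as follows. The slide does not say which view is added at each step, so this sketch assumes the view that yields the lowest estimated cost is chosen. The `estimate_cost` callable stands in for the cost estimation framework, and the toy cost table is invented for illustration.

```python
def greedy_select(views, estimate_cost):
    """Grow the view set one view at a time while the estimated cost decreases.

    estimate_cost: callable mapping a list of views to an estimated cost
    (a placeholder for the histogram-based estimator).
    """
    selected, best_cost = [], float("inf")
    remaining = list(views)
    while remaining:
        # Assumption: add the view that gives the lowest estimated cost.
        cand = min(remaining, key=lambda v: estimate_cost(selected + [v]))
        cost = estimate_cost(selected + [cand])
        if cost >= best_cost:
            break                         # cost stopped decreasing
        selected.append(cand)
        remaining.remove(cand)
        best_cost = cost
    return selected, best_cost

# Toy cost table (hypothetical numbers), keyed by the set of views used.
COSTS = {frozenset("A"): 10, frozenset("B"): 8, frozenset("C"): 12,
         frozenset("AB"): 5, frozenset("BC"): 7, frozenset("AC"): 9,
         frozenset("ABC"): 6}
print(greedy_select(["A", "B", "C"], lambda s: COSTS[frozenset(s)]))
# (['B', 'A'], 5)
```

Greedy calls the estimator far fewer times than the exhaustive search, at the risk of stopping in a local minimum.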

20 Select Views Spherical (SVS). Requires the solution of a single linear program. [Figure: query Q, point T and the selected views.]

21 Select Views By Angle (SVA). Sort the views by increasing angle with respect to Q. [Figure: query Q and views V1, V2, V3 and V4, with the selected views marked.]
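Sorting by angle is easy to make concrete. The sketch below treats each view and the query as a weight vector and orders the views by the angle they form with Q. Keeping only the first r views is an assumption (the slide states the sort but not the cutoff), and the vectors are hypothetical.

```python
import math

def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def select_views_by_angle(views, q, r):
    """Return the names of the r views closest in angle to the query vector q."""
    return sorted(views, key=lambda name: angle(views[name], q))[:r]

# Hypothetical view and query weight vectors.
views = {"V1": (1, 0), "V2": (1, 1), "V3": (0, 1), "V4": (2, 1)}
print(select_views_by_angle(views, (3, 1), 2))  # ['V4', 'V1']
```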

22 General Queries and Views. Views that materialize their top-k tuples: truncate the view histograms. Accommodating range conditions: select the views that cover the range conditions; truncate each attribute’s histogram.

23 Outline: LPTA Algorithm; View Selection Problem; Cost Estimation Framework; View Selection Algorithms; Experimental Evaluation; Conclusions

24 Experiments. Datasets: Uniform, Zipf, Real. Experiments: performance comparison of LPTA, PREFER and TA; accuracy of the cost estimation framework; performance of LPTA using each of the view selection algorithms; scalability of the LPTA algorithm.

25 Performance comparison of LPTA, PREFER and TA. [Figure panels: uniform dataset, 3d; real dataset, 2d.]

26 Cost Estimation Accuracy (2d). [Figure panels: buckets = 0.5% of n; buckets = 1% of n.]

27 Performance of LPTA using View Selection Algorithms (500K tuples, top-100). [Figure panels: 2d; 3d.]

28 Scalability Experiments on LPTA. [Figure panels: 2d, uniform dataset; 500K tuples, top-100.]

29 Conclusions. Using views for top-k query answering. LPTA: linear programming adaptation of TA. View selection problem, cost estimation framework, view selection algorithms. Experimental evaluation.

30 Thank you! Questions?

