Published by Leslie Lobdell. Modified over 9 years ago.
1
PREFER: A System for the Efficient Execution of Multi-parametric Ranked Queries
Vagelis Hristidis, University of California, San Diego
Nick Koudas, AT&T Research
Yannis Papakonstantinou, University of California, San Diego
6
Example
ORDER BY 0.01·Mileage + 0.6·Year + 0.03·Price
Problem: a conventional ORDER BY has to retrieve the WHOLE relation.
PREFER retrieves only part of the relation.
7
Applications
Such preference queries are used in Web sites like:
– www.Zagat.com (restaurants)
– www.personallogic.com (online retailer)
8
Definitions - Problem statement
A preference query orders the tuples of a relation according to a function of the attribute values, e.g. 0.01·Mileage + 0.6·Year + 0.03·Price.
The goal is to produce the top-K answers of a preference query while retrieving the minimum # of tuples.
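For comparison, a minimal sketch of the baseline that this problem statement wants to avoid: score every tuple with the linear preference function and keep the K best. The car tuples, the attribute values, and the convention that a higher score ranks first are assumptions made for illustration; they are not taken from the slides.

```python
import heapq


def top_k_naive(rows, weights, k):
    """Score every tuple with a linear preference function and keep the k best."""
    def score(row):
        return sum(w * row[attr] for attr, w in weights.items())

    # nlargest still has to look at every row: this is the full scan
    # of the relation that a ranked view is meant to avoid.
    return heapq.nlargest(k, rows, key=score)


# Hypothetical tuples; the numbers are invented for the example.
cars = [
    {"id": 1, "Mileage": 50, "Year": 10, "Price": 20},
    {"id": 2, "Mileage": 20, "Year": 20, "Price": 40},
    {"id": 3, "Mileage": 80, "Year": 15, "Price": 10},
    {"id": 4, "Mileage": 10, "Year": 5, "Price": 90},
]
weights = {"Mileage": 0.01, "Year": 0.6, "Price": 0.03}

best_two = [row["id"] for row in top_k_naive(cars, weights, 2)]
print(best_two)  # [2, 3]
```

Every call touches all tuples regardless of K, which is the cost the rest of the deck tries to reduce.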
9
Our Approach
PREFER materializes a number of ranked views of the relation and uses them to answer preference queries efficiently.
10
Our Approach
[Figure: Price/Year plane showing two ranked views, 0.08*Price + 0.2*Year and 0.075*Price + 0.8*Year.]
11
Our Approach
[Figure: Price/Year plane showing the ranked views 0.08*Price + 0.2*Year and 0.075*Price + 0.8*Year together with the preference query 0.07*Price + 0.35*Year.]
12
PREFER Architecture
Preprocessing stage: Views Creation
– Inputs: relation, space constraints, discretization of ranked views' vectors
– Which ranked views should we materialize?
13
PREFER Architecture
Preprocessing stage: Views Creation
– Inputs: relation, space constraints, discretization of ranked views' vectors
– Produces: materialized views (Mat. Views) and an index of mat. views
– Which ranked views should we materialize?
Runtime process: Query -> View Selection -> ranked view id -> Query Pipelining Algorithm -> output results
– Which ranked view should we use to answer a specific preference query?
– How to use a ranked view to answer a preference query?
15
[Figure: the ranked view (ordered by 0.02*Mileage + 0.4*Year + 0.04*Price) next to the result (ordered by 0.01*Mileage + 0.6*Year + 0.03*Price), with columns Car ID ... Doors and f_q. Labels: t1, Watermark = 14.26, last tuple.]
16
Calculating the Watermark
[Figure: only the label "Watermark" survives.]
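The slide's formula did not survive extraction, so the following is only a sketch under a stated assumption: the watermark for t1 is taken to be the smallest view score f_v that any tuple in the attribute domain could have while still scoring at least f_q(t1) under the query. With non-negative linear weights and box-shaped attribute bounds this is a small linear program that can be solved greedily. The function name, the `bounds` parameter, and the example numbers are hypothetical.

```python
def watermark(view_w, query_w, bounds, t1):
    """Smallest view score of any in-domain tuple that scores at least as
    high as t1 under the query (assumed definition, non-negative weights)."""
    target = sum(q * x for q, x in zip(query_w, t1))

    # Start at the lower corner of the domain, where the view score is minimal.
    lows = [lo for lo, _ in bounds]
    cost = sum(v * lo for v, lo in zip(view_w, lows))
    deficit = target - sum(q * lo for q, lo in zip(query_w, lows))

    # Raise attributes in order of the cheapest increase in view score
    # per unit of query score gained.
    order = sorted(
        (i for i, q in enumerate(query_w) if q > 0),
        key=lambda i: view_w[i] / query_w[i],
    )
    for i in order:
        if deficit <= 0:
            break
        lo, hi = bounds[i]
        step = min(hi - lo, deficit / query_w[i])
        cost += view_w[i] * step
        deficit -= query_w[i] * step
    return cost


# Two attributes in [0, 10]; view 2*x + y, query x + y, t1 = (4, 4).
print(watermark((2, 1), (1, 1), [(0, 10), (0, 10)], (4, 4)))  # 8.0
```

In the example, t1 has view score 12 but the watermark is 8: a tuple such as (0, 8) ties t1 under the query with a lower view score, so the view has to be read down to score 8 before t1's rank is certain.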
17
How to use a ranked view to answer a preference query (cont'd)
PipelineResults Algorithm
Ranked view, ordered by f_v = 0.02*Mileage + 0.4*Year + 0.04*Price
Result, ordered by f_q = 0.01*Mileage + 0.6*Year + 0.03*Price
1. Calculate the watermark for t1.
2. Find the prefix of the view with f_v greater than the watermark value and sort its tuples by f_q.
3. Output tuples up to t1.
4. Repeat, using the first unprocessed tuple as t1.
Trace from the slide animation (Car IDs in the result so far):
– Pass 1: watermark for t1 is 14.26; result: 2, 1
– Pass 2: watermark for t1 is 13.1; result: 2, 1, 3
– Pass 3: watermark for t1 is 8.3; result: 2, 1, 3, 5, 4
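The four PipelineResults steps can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: the watermark is assumed to be the smallest view score an in-domain tuple can have while matching t1's query score, the prefix test uses "greater than or equal" so that t1 itself is always included, and the random tuples, the [0, 1] bounds, and all function names are hypothetical.

```python
import random


def score(w, t):
    return sum(a * b for a, b in zip(w, t))


def watermark(view_w, query_w, bounds, t1):
    """Assumed definition: min view score s.t. query score >= that of t1."""
    lows = [lo for lo, _ in bounds]
    cost = score(view_w, lows)
    deficit = score(query_w, t1) - score(query_w, lows)
    order = sorted((i for i, q in enumerate(query_w) if q > 0),
                   key=lambda i: view_w[i] / query_w[i])
    for i in order:
        if deficit <= 0:
            break
        step = min(bounds[i][1] - bounds[i][0], deficit / query_w[i])
        cost += view_w[i] * step
        deficit -= query_w[i] * step
    return cost


def pipeline_top_k(view, view_w, query_w, bounds, k):
    """Return (top-k tuples by query score, number of view tuples read).
    `view` must already be sorted by descending view score."""
    results, pending, emitted = [], [], set()
    read = 0   # view tuples fetched so far
    nxt = 0    # position of the first unprocessed tuple
    while len(results) < k and len(emitted) < len(view):
        while nxt in emitted:
            nxt += 1
        t1 = view[nxt]
        # Step 1: watermark for t1 (clamped to guard against rounding).
        mark = min(watermark(view_w, query_w, bounds, t1), score(view_w, t1))
        # Step 2: extend the prefix down to the watermark, sort by f_q.
        while read < len(view) and score(view_w, view[read]) >= mark:
            pending.append(read)
            read += 1
        pending.sort(key=lambda i: score(query_w, view[i]), reverse=True)
        # Step 3: output tuples up to t1; the rest stay unprocessed.
        cutoff = score(query_w, t1)
        ready = [i for i in pending if score(query_w, view[i]) >= cutoff]
        pending = [i for i in pending if score(query_w, view[i]) < cutoff]
        emitted.update(ready)
        results.extend(view[i] for i in ready)
        # Step 4: loop again with the first unprocessed tuple as t1.
    return results[:k], read


random.seed(0)
view_w, query_w = (0.02, 0.4, 0.04), (0.01, 0.6, 0.03)
bounds = [(0.0, 1.0)] * 3
rows = [tuple(random.random() for _ in range(3)) for _ in range(200)]
view = sorted(rows, key=lambda t: score(view_w, t), reverse=True)
results, read = pipeline_top_k(view, view_w, query_w, bounds, 10)
print(f"read {read} of {len(view)} view tuples")
```

Tuples that were fetched but rank below t1 under the query stay in `pending`, so each view tuple is read at most once across passes.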
27
Define coverage
[Figure: Price/Year plane showing ranked view V1 = 0.8*Price + 0.2*Year and preference query q1 = 0.7*Price + 0.35*Year.]
V1 covers q1: at most k tuples are retrieved from V1 in order to output the first result of q1.
28
Which ranked view should we use to answer a specific preference query?
[Figure: Price/Year plane showing the ranked views 0.8*Price + 0.2*Year and 0.75*Price + 0.8*Year.]
30
Which ranked view should we use to answer a specific preference query?
[Figure: Price/Year plane showing the ranked views 0.8*Price + 0.2*Year and 0.75*Price + 0.8*Year and the preference query 0.7*Price + 0.35*Year, with labels V1 and q1.]
V1 covers q1
32
Which ranked views should we materialize?
ViewSelection Algorithm
while (not all preference vectors in [0,1]^n covered)
    Randomly pick v ∈ [0,1]^n and add it to the list of views L
VIEWS ← ∅
for i = 1 to C do
    select v ∈ L that covers the maximum number of uncovered vectors in [0,1]^n
    VIEWS ← VIEWS ∪ {v}
Example shown in the slides: C = 3
35
Constraint on # of views
The maximum coverage problem using the minimum # of materialized views is NP-hard.
The greedy heuristic is an approximation for maximum coverage.
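The greedy step of ViewSelection can be sketched as below. This is an illustration under assumptions: candidate views are passed in rather than drawn at random, the preference vectors are enumerated on a discretized grid, and the `covers` predicate used in the example is a toy distance test standing in for the real coverage check (which depends on how many tuples a view must return). All names are hypothetical.

```python
import itertools


def grid(n, step):
    """All preference vectors in [0,1]^n on a grid with the given step."""
    ticks = [round(i * step, 10) for i in range(int(round(1 / step)) + 1)]
    return list(itertools.product(ticks, repeat=n))


def greedy_select(candidates, queries, covers, c):
    """Pick up to c views, each time taking the candidate that covers
    the most still-uncovered query vectors."""
    uncovered = set(queries)
    chosen = []
    for _ in range(c):
        best = max(candidates,
                   key=lambda v: sum(covers(v, q) for q in uncovered))
        gained = {q for q in uncovered if covers(best, q)}
        if not gained:
            break  # no candidate covers anything new
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


def toy_covers(v, q):
    # Placeholder only: treats a view as covering every query vector
    # within 0.5 of it in each coordinate.
    return max(abs(a - b) for a, b in zip(v, q)) <= 0.5


vectors = grid(2, 0.5)  # 9 vectors: every pair drawn from {0.0, 0.5, 1.0}
chosen, uncovered = greedy_select(vectors, vectors, toy_covers, 3)
print(chosen, len(uncovered))  # [(0.5, 0.5)] 0
```

With a discretization of 0.1 and three attributes, `grid(3, 0.1)` already yields 1331 vectors, which is why the budget C on the number of views matters.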
36
Related Work
Preference Query Framework [AW00]
Top-k queries
– Joins: Fagin [F99, F96, F01], equijoins of ordered data
– Selections [reduce top-k selection to range query]
   Histograms to estimate cutoff [Chaudhuri & Gravano 99]
   Probabilistic model [Donjerkovic & Ramakrishnan 99]
   Partitioning [Carey & Kossmann 97, 98]
37
Related Work
The Onion Technique (SIGMOD 2000).
Main observation: the points of interest lie on the convex hull of the tuple space.
Drawbacks of Onion:
– Does not scale
– Computing the convex hull is very computationally intensive
– Not efficient if the domain of an attribute has a small cardinality
– Not efficient for more than the top-1 result
38
Experiments
Measured parameters:
– # attributes
– size of relation
– # views
– constraint on max # tuples retrieved
39
Parameters of Experiments
– synthetic datasets
– 3 to 5 attributes
– 10,000 to 500,000 tuples
– random & correlated data
– discretization of 0.1 or 0.05
40
Experiments (cont’d) Dual PII CPU, 512MB RAM, 4 attr, 50,000 tuples, 34 Views
41
Experiments (cont’d) 4 attr, constraint = 500 tuples, discretization = 0.1
42
Experiments (cont’d) 500,000 tuples, constraint = 500 tuples, discretization = 0.05...0.1
43
Experiments (cont’d) 4 attr, discretization = 0.1
44
Experiments (cont’d) 4 attr, discretization = 0.1
45
Experiments (cont’d) 50,000 tuples, 3 attr, discretization = 0.05
46
More Resources
www.db.ucsd.edu/PREFER
PREFER demo
PREFER Application
– Construct Materialized Views
– Issue preference queries
MS Windows, on top of Oracle DBMS
47
Conclusions
– Methodology to efficiently answer top-K linearly weighted queries
– Algorithm that uses a ranked view to answer a preference query
– Ranked materialized views were used
– Experimental evaluation