
1 Ranking Inexact Answers

2 Ranking Issues
When inexact querying is allowed, there may be MANY answers
– Different answers have a different level of incompleteness
Ranking the answers allows the user to quickly see the (hopefully) most relevant answers
Preference: Create answers in ranking order
– Why is this important?
We will consider several different approaches to this problem

3 Tree Pattern Relaxation
Amer-Yahia, Cho, Srivastava, EDBT 2002

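4 Tree Patterns
Queries are tree patterns, as considered in previous lessons
[Figure: tree pattern with root Book, whose children are Collection and Editor; Editor has a child Name and a descendant Address]
A double line indicates a descendant edge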

5 Relaxed Queries
Four types of “relaxations” are allowed on the trees
Node Generalization: Assume that we know a relationship of types/super-types among labels. Allow a label to be changed to its super-type
[Figure: the root label Book is relaxed to Document; the rest of the pattern (Collection, Editor, Name, Address) is unchanged]

6 Relaxed Queries
Leaf Node Deletion: Delete a leaf node (and its incoming edge) from the tree
[Figure: the leaf Collection is deleted, leaving Book, Editor, Name, Address]

7 Relaxed Queries
Edge Generalization: Change a parent-child edge to an ancestor-descendant edge
[Figure: original pattern and relaxed pattern with nodes Book, Editor, Name, Address, Collection]

8 Relaxed Queries
Subtree Promotion: A query subtree can be promoted so that it is directly connected to its former grandparent by an ancestor-descendant edge
[Figure: original pattern and relaxed pattern with nodes Book, Editor, Name, Address, Collection]
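The four relaxations can be sketched as small functions over a pattern. This is a minimal illustration, not the paper's implementation: the dictionary encoding (node label mapped to parent label and axis), the function names, and the assumption that labels are unique within a pattern are all made up for the example. The tree shape is the Book query used on the slides.

```python
from typing import Dict, Tuple

# Hypothetical encoding: node label -> (parent label, axis),
# axis "c" = parent-child edge, "d" = ancestor-descendant edge.
# The root (Book) has no entry of its own.
Pattern = Dict[str, Tuple[str, str]]

QUERY: Pattern = {
    "Collection": ("Book", "c"),
    "Editor": ("Book", "c"),
    "Name": ("Editor", "c"),
    "Address": ("Editor", "d"),
}

def generalize_node(p: Pattern, label: str, supertype: str) -> Pattern:
    """Node generalization: replace a label by its super-type everywhere."""
    rename = lambda x: supertype if x == label else x
    return {rename(n): (rename(parent), axis) for n, (parent, axis) in p.items()}

def delete_leaf(p: Pattern, label: str) -> Pattern:
    """Leaf node deletion: drop a leaf together with its incoming edge."""
    if any(parent == label for parent, _ in p.values()):
        raise ValueError(f"{label} is not a leaf")
    return {n: e for n, e in p.items() if n != label}

def generalize_edge(p: Pattern, label: str) -> Pattern:
    """Edge generalization: the edge into `label` becomes ancestor-descendant."""
    parent, _ = p[label]
    return {**p, label: (parent, "d")}

def promote_subtree(p: Pattern, label: str) -> Pattern:
    """Subtree promotion: attach `label` to its former grandparent with a 'd' edge."""
    parent, _ = p[label]
    if parent not in p:
        raise ValueError(f"{label} has no grandparent")
    grandparent, _ = p[parent]
    return {**p, label: (grandparent, "d")}
```

Because each function returns a new pattern, relaxations compose by nesting calls, for example `delete_leaf(generalize_node(QUERY, "Book", "Document"), "Collection")`.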

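9 Composing Relaxations
Relaxations can be composed. Are the following relaxations of Q?
[Figure: Q is the pattern with nodes Book, Collection, Editor, Name, Address; candidate patterns follow, with node labels in order: Book, Collection, Book, Collection, Address, Name, Document, Address]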

10 Approximate Answers and Ranking
An approximate answer to Q is an exact answer to a relaxed query derived from Q
In order to give different answers different rankings, tree patterns are weighted
Each node and edge has 2 weights
– Value when exactly satisfied, value when satisfied by a relaxation
[Figure: query tree (Book, Collection, Editor, Name, Address) with (exact, relaxed) weights on its nodes and edges: (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
A fragment of a document that exactly satisfies the query will have a score of: 45
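The score of 45 is simply the sum of the exact weights. The sketch below checks that arithmetic; it assumes the score of an answer is the sum, over all nodes and edges, of the exact weight when that component is matched exactly and of the relaxed weight otherwise. The transcript does not say which weight pair belongs to which node or edge, so the pairs are kept as a plain list, and the function name `score` is hypothetical.

```python
from typing import List, Tuple

# (exact, relaxed) weight pairs as listed on the slide, one per node or edge.
# Which pair belongs to which component is not recoverable from the transcript.
WEIGHTS: List[Tuple[int, int]] = [
    (7, 1), (4, 3), (2, 1), (6, 0), (5, 0), (8, 5), (6, 0), (4, 0), (3, 0),
]

def score(exactly_satisfied: List[bool]) -> int:
    """Sum the exact weight where the flag is True, the relaxed weight otherwise."""
    return sum(
        exact if is_exact else relaxed
        for (exact, relaxed), is_exact in zip(WEIGHTS, exactly_satisfied)
    )

print(score([True] * 9))   # every node and edge matched exactly: 45
print(score([False] * 9))  # every node and edge matched only by relaxation: 10
```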

11 Example Ranking
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: document fragment with nodes Book, Person, Name, Address, Details and values Sam, NY]
How much would this answer score?

12 Example Ranking
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: document fragment with nodes Book, Person, Name, Address, Details and values Sam, NY]
How much would this answer score?

13 Problem Definition
Given an XML document D, a weighted tree pattern Q and a threshold t, find all approximate answers of Q in D whose scores are ≥ t
Naive strategy to solve the problem:
– Find all relaxations of Q
– For each relaxation, compute all exact answers
– Remove answers with score below t
Is this a good strategy?

14 Problem Definition
Given an XML document D, a weighted tree pattern Q and a threshold t, find all approximate answers of Q in D whose scores are ≥ t
A better strategy to compute an answer to a relaxation of a query:
– Intuition: Compute the query as a series of joins
– Can use stack-merge algorithms (studied before) for computing joins
– Filter out intermediate results whose scores are too low

15 The Query Plan
We now show how to derive a plan for evaluating queries in this setting
First, we show how an exact plan is derived
Then, we consider how each individual relaxation can be added in
Finally, we show the complete relaxed plan

16 Query Plan: Exact Answers
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: join plan over the lists Book, Collection, Editor, Address, Name with join predicates c(Book, Collection), c(Book, Editor), c(Editor, Name), d(Editor, Address)]
c(x,y) = y is a child of x
d(x,y) = y is a descendant of x

17 Query Plan: Exact Answers
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: join plan over the lists Book, Collection, Editor, Address, Name with join predicates c(Book, Collection), c(Book, Editor), c(Editor, Name), d(Editor, Address)]
Remember, to compute a join, e.g., of Book and Collection, we actually find the list of Books and the list of Collections (from the index) and perform the stack-merge algorithm
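To make the join predicates concrete, here is a small sketch of c(x, y) and d(x, y) evaluated over element lists. It assumes a region encoding in which every element carries the positions of its opening and closing tags plus its depth; that encoding, the `Elem` type, and the sample document are assumptions for illustration. The join itself is a plain nested loop, not the stack-merge algorithm the slide refers to, which produces the same pairs more efficiently from sorted lists.

```python
from typing import Callable, List, NamedTuple, Tuple

class Elem(NamedTuple):
    label: str
    start: int  # position of the opening tag
    end: int    # position of the closing tag
    level: int  # depth in the document tree

def d(x: Elem, y: Elem) -> bool:
    """d(x, y): y is a descendant of x, i.e. y's region lies strictly inside x's."""
    return x.start < y.start and y.end < x.end

def c(x: Elem, y: Elem) -> bool:
    """c(x, y): y is a child of x, i.e. a descendant exactly one level deeper."""
    return d(x, y) and y.level == x.level + 1

def join(xs: List[Elem], ys: List[Elem],
         pred: Callable[[Elem, Elem], bool]) -> List[Tuple[Elem, Elem]]:
    """Naive nested-loop structural join: all (x, y) pairs satisfying pred."""
    return [(x, y) for x in xs for y in ys if pred(x, y)]

# Hypothetical document: <Book><Editor><Details><Name/></Details></Editor></Book>
book = Elem("Book", 1, 8, 0)
editor = Elem("Editor", 2, 7, 1)
details = Elem("Details", 3, 6, 2)
name = Elem("Name", 4, 5, 3)

print(join([book], [editor], c))  # Editor is a child of Book
print(join([editor], [name], c))  # empty: Name is not a direct child of Editor
print(join([editor], [name], d))  # Name is a descendant of Editor
```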

18 Adding Relaxations into Plan
Node generalization: Book relaxed to Document
[Figure: plan over Book, Collection, Editor, Address, Name with predicates c(Book, Collection), c(Book, Editor), c(Editor, Name), d(Editor, Address); with Document in place of Book the predicates become c(Document, Collection) and c(Document, Editor)]

19 Adding Relaxations into Plan
Edge generalization: Relax Editor-Name Edge
[Figure: plan over Book, Collection, Editor, Address, Name with predicates c(Book, Collection), c(Book, Editor), c(Editor, Name), d(Editor, Address)]
Relaxed predicate: c(Editor, Name) or (Not exists c(Editor, Name) and d(Editor, Name))
Written in short as: c(Editor, Name) or d(Editor, Name)
We only allow relaxations when a direct child does not exist
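The relaxed predicate can be read as "prefer a direct child; fall back to a descendant only if no direct child exists". The sketch below implements that reading for a single parent node. The parent-pointer document, the node ids, and the function names are hypothetical, and checking "not exists" per parent node is an interpretation of the slide rather than something it states explicitly.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical document as parent pointers: node id -> (label, parent id).
DOC: Dict[int, Tuple[str, Optional[int]]] = {
    1: ("Book", None),
    2: ("Editor", 1),
    3: ("Details", 2),
    4: ("Name", 3),     # grandchild of Editor 2, not a child
    5: ("Editor", 1),
    6: ("Name", 5),     # direct child of Editor 5
    7: ("Details", 5),
    8: ("Name", 7),     # deeper Name under Editor 5
}

def is_child(doc, x: int, y: int) -> bool:
    return doc[y][1] == x

def is_descendant(doc, x: int, y: int) -> bool:
    parent = doc[y][1]
    while parent is not None:
        if parent == x:
            return True
        parent = doc[parent][1]
    return False

def match_relaxed_edge(doc, x: int, label: str) -> List[Tuple[int, str]]:
    """c(x, label) OR (NOT EXISTS c(x, label) AND d(x, label)) for one node x."""
    candidates = [n for n, (lab, _) in doc.items() if lab == label]
    exact = [(n, "exact") for n in candidates if is_child(doc, x, n)]
    if exact:
        return exact  # a direct child exists, so no relaxation is allowed
    return [(n, "relaxed") for n in candidates if is_descendant(doc, x, n)]

print(match_relaxed_edge(DOC, 2, "Name"))  # [(4, 'relaxed')]
print(match_relaxed_edge(DOC, 5, "Name"))  # [(6, 'exact')]
```

For Editor 5 the deeper Name (node 8) is not returned, because a direct Name child already satisfies the exact predicate.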

20 Adding Relaxations into Plan
Subtree Promotion: Promote tree rooted at Name
[Figure: plan over Book, Collection, Editor, Address, Name with predicates c(Book, Collection), c(Book, Editor), c(Editor, Name), d(Editor, Address)]
Relaxed predicate: c(Editor, Name) or (Not exists c(Editor, Name) and d(Book, Name))
Written in short as: c(Editor, Name) or d(Book, Name)

21 Adding Relaxations into Plan
Leaf Node Deletion: Make Address Optional
[Figure: plan over Book, Collection, Editor, Address, Name with predicates c(Book, Collection), c(Book, Editor), c(Editor, Name), d(Editor, Address); the join with Address is marked as an outer join]
Outer Join Operator: Means that the operator should join if possible, but not delete values that cannot join
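A left outer join captures "join if possible, but keep values that cannot join". The following sketch is a generic nested-loop version; the function name, the use of `None` for a missing partner, and the toy editor and address values are assumptions for illustration only.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

L = TypeVar("L")
R = TypeVar("R")

def left_outer_join(left: List[L], right: List[R],
                    pred: Callable[[L, R], bool]) -> List[Tuple[L, Optional[R]]]:
    """Pair each left value with every matching right value.

    A left value without any match is kept and paired with None,
    which is what makes the right-hand side (here: Address) optional.
    """
    out: List[Tuple[L, Optional[R]]] = []
    for l in left:
        partners = [r for r in right if pred(l, r)]
        if partners:
            out.extend((l, r) for r in partners)
        else:
            out.append((l, None))
    return out

# Hypothetical data: each address records the editor it sits under.
editors = ["e1", "e2"]
addresses = [("a1", "e1")]
under = lambda editor, address: address[1] == editor

print(left_outer_join(editors, addresses, under))
# [('e1', ('a1', 'e1')), ('e2', None)]
```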

22 Combining All Possible Relaxations
All approximate answers can be derived from the following query plan
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: relaxed plan over Document, Collection, Editor, Address, Name with predicates:]
– c(Document, Editor) OR d(Document, Editor)
– c(Editor, Name) OR d(Editor, Name) OR d(Document, Name)
– d(Editor, Address) OR d(Document, Address)
– c(Book, Collection) OR d(Document, Collection)

23 Creating “Best Answers”
Want to find answers whose ranking is over the threshold t
Naive solution: Create all answers. Delete answers with low ranking
Algorithm Thres: Goal of the algorithm is to prune intermediate answers that cannot possibly meet the specified threshold

24 Associating Nodes with Maximal Weight
The maximal weight of a node in the evaluation plan is the largest value by which the score of an intermediate answer computed for that node can grow
[Figure: relaxed plan over Document, Collection, Editor, Address, Name with predicates:]
– c(Document, Editor) OR d(Document, Editor)
– c(Editor, Name) OR d(Editor, Name) OR d(Document, Name)
– d(Editor, Address) OR d(Document, Address)
– c(Book, Collection) OR d(Document, Collection)

25
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: relaxed plan over Document, Collection, Editor, Address, Name, annotated with maximal weights (38) (39) (30) (40) (39) (41) (21) (7) (0), with predicates:]
– c(Document, Editor) OR d(Document, Editor)
– c(Editor, Name) OR d(Editor, Name) OR d(Document, Name)
– d(Editor, Address) OR d(Document, Address)
– c(Book, Collection) OR d(Document, Collection)

26 Algorithm Thres
Relaxed query evaluation plan is computed bottom-up
– Note that the joins are computed for all matching intermediate results at the same time
At each step, intermediate results are computed, along with their scores
If the sum of an intermediate result score with the maximal weight of the current node is less than the threshold, prune the intermediate result
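The pruning rule can be sketched in a few lines. This is a simplified illustration, not the full algorithm: each plan step is reduced to a table of score gains per intermediate answer plus the maximal weight of the plan node, and the names `prune` and `run_plan`, the answer ids, and all numbers except the rule itself are hypothetical.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, int]              # (intermediate answer id, score so far)
Step = Tuple[Dict[str, int], int]     # (score gained per answer, maximal weight of the node)

def prune(results: List[Result], max_weight: int, threshold: int) -> List[Result]:
    """Keep only results that can still reach the threshold.

    A result is dropped when score + max_weight < threshold.
    """
    return [(a, s) for a, s in results if s + max_weight >= threshold]

def run_plan(results: List[Result], steps: List[Step], threshold: int) -> List[Result]:
    """Apply the plan steps bottom-up, pruning after every step."""
    for gain, max_weight in steps:
        results = [(a, s + gain.get(a, 0)) for a, s in results]
        results = prune(results, max_weight, threshold)
    return results

# Hypothetical run with threshold 35.
steps = [
    ({"a": 7, "b": 7}, 30),  # both can still reach 35 (7 + 30 = 37)
    ({"a": 9}, 21),          # b stays at 7, and 7 + 21 = 28 < 35, so b is pruned
    ({"a": 19}, 0),          # a finishes with 35
]
print(run_plan([("a", 0), ("b", 0)], steps, threshold=35))  # [('a', 35)]
```

Answer `b` is discarded at the second step, so no later join work is spent on it.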

27 Example: Threshold = 35
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: document fragment with nodes Book, Editor, Name, Address, Details and values Sam, NY]
[Figure: relaxed plan over Document, Collection, Editor, Address, Name, annotated with maximal weights (38) (39) (30) (40) (39) (41) (21) (7) (0), with predicates:]
– c(Document, Editor) OR d(Document, Editor)
– c(Editor, Name) OR d(Editor, Name) OR d(Document, Name)
– d(Editor, Address) OR d(Document, Address)
– c(Book, Collection) OR d(Document, Collection)
When will the answer be pruned?
[Figure annotations: 7, 7, 16, 27]

28 Test Yourself

29 Example Ranking
[Figure: query tree (Book, Collection, Editor, Name, Address) with weights (7, 1) (4, 3) (2, 1) (6, 0) (5, 0) (8, 5) (6, 0) (4, 0) (3, 0)]
[Figure: document fragment with nodes Document, Collection, Name, Address and values Sam, NY]
How much would this answer score?

30 Query Plan
[Figure: query tree with nodes Book, Collection, Editor, Name, FName, LName and weights (8, 5) (7, 1) (4, 3) (2, 1) (5, 0) (6, 0) (2, 1) (2, 0) (1, 0)]
1. What will the exact plan look like?
2. What will the plan look like if all possible relaxations are added?
3. What is the maximal weight by which the score of an intermediate answer can grow, for each node?

