1
A Framework for Using Materialized XPath Views in XML Query Processing (VLDB '04)
By: Andrey Balmin, Fatma Ozcan, Kevin S. Beyer, Roberta J. Cochrane, Hamid Pirahesh
Inbal Yahav, DB Seminar, Spring 2005
2
'Prologue'
Materialized views vs. views
Materialized XPath views vs. materialized views (relational databases)?
3
Agenda
Introduction
Materialized XPath Views
XPath Matching Algorithm: Definitions, Description, Examples, Complexity
Compensation Expression
Additional References
4
Introduction
With increasing amounts of data presented and exchanged as XML documents, there is a need to query those documents efficiently. To address this need:
W3C has proposed an XML query language, XQuery (discussed later).
ANSI and ISO have defined SQL/XML, which extends relational databases to handle XML.
Both use XPath to navigate through the XML document.
5
XQuery
An XML query language.
Input: XML document (tree)
Output: XML subtree
Syntax: FLWOR (For-[Let]-Where-[Order]-Return)
Example: XML document, XQuery Q1, and result (figure not reproduced).
6
XPath Complexity
The containment problem, as discussed in "Containment and Equivalence for an XPath Fragment" by G. Miklau and D. Suciu (to be reviewed in the 8th lecture), is shown to be coNP-complete.
Meaning: the complementary problem, deciding that T' is not contained in T, is NP-complete, so no polynomial-time algorithm exists for it unless P = NP.
Just for intuition: consider the query //A//b and the following XML document (figure not reproduced).
7
Materialized XPath Views
The Goal
In relational databases, indexes and materialized views are two well-known techniques for accelerating expensive SQL queries. We would like to extend the materialized view idea to speed up processing of XQuery and SQL/XML queries.
The System
Suggest a materialized XPath view structure.
Define a matching algorithm that checks whether a given view can be used to answer a given query.
Define a compensation expression that computes the query result from the information available in the view.
8
Materialized XPath Views
Storing XPath Views
To be able to answer all kinds of queries, the view structure should contain:
Node path (a list of ancestors)
Typed values
Reference to the node
Note that storing only typed values and node paths (without references to the document nodes) does not allow us to execute queries whose result is a node collection. (The slide's example is not reproduced.)
Sometimes it is also beneficial to store actual copies of XML fragments in an XPath view.
9
Materialized XPath Views
Example: V = //@*
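A minimal sketch of what materializing V = //@* could look like, assuming a view row holds the three pieces listed earlier (node path, typed value, node reference). The names ViewRow and materialize_all_attributes are hypothetical, values are kept as plain strings, and because Python's ElementTree has no attribute node objects, the reference points at the owning element instead.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class ViewRow:
    path: str          # node path: ancestors, ending in the attribute name
    value: str         # typed value (kept as a string in this sketch)
    node: ET.Element   # reference to the element that owns the attribute

def materialize_all_attributes(root):
    """Collect one row per attribute in the document, i.e. V = //@*."""
    rows = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        for name, value in elem.attrib.items():
            rows.append(ViewRow(f"{here}/@{name}", value, elem))
        for child in elem:
            walk(child, here)

    walk(root, "")
    return rows

doc = ET.fromstring(
    '<Employee Name="Ann" Salary="100" Bonus="10">'
    '<Employee Name="Bob" Salary="80"/>'
    '</Employee>'
)
rows = materialize_all_attributes(doc)
for row in rows:
    print(row.path, row.value)
```

Keeping the reference alongside the path and value is what lets a later query navigate from a stored attribute back into the document.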
10
XPath Matching Algorithm
After defining the materialized XPath view structure, we need a way to decide whether a given view can be used to answer a user query.
Definitions
We represent XPath expressions as labeled binary trees, called XPS (XPath Step) trees.
An XPS node represents a step in the XPath expression and is labeled with:
Axis: one of {"root", "child", "descendant", "self", "attribute", "descendant-or-self", "parent"}
Test: a name test, a wildcard test (*), or a kind test (node(), text())
Example: //employees/*/employee[@salary], with XPS nodes 1 to 4. Surviving diagram labels: Axis: dos, Test: employees; Axis: descendant, Test: *
11
XPath Matching Algorithm
Definitions – cont’
Each XPS node has two children:
The first child, called predicate, is an and/or node, a comparison operator, a constant, or an XPS node.
The second child, called next, points to the next step (an XPS node).
If one of the children does not exist, we represent it with null.
(The slide's example is not reproduced.)
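The XPS tree definition can be sketched as a small data structure. This is only an illustration under assumptions: the class names XPS and BoolOp, the use of None for null, and the "node()" test on the root step are all choices made here, not taken from the paper. Comparison operators and constants are left out.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class XPS:
    axis: str                       # e.g. "root", "child", "descendant", "attribute"
    test: str                       # name test, "*" or a kind test such as "node()"
    pred: Optional["Pred"] = None   # first child: predicate (None stands for null)
    next: Optional["XPS"] = None    # second child: next step (None stands for null)

@dataclass
class BoolOp:
    op: str          # "and" or "or"
    left: "Pred"
    right: "Pred"

Pred = Union[XPS, BoolOp]

# //a[b]/c as an XPS tree: root -> descendant a (predicate: child b) -> child c
expr = XPS("root", "node()",
           next=XPS("descendant", "a",
                    pred=XPS("child", "b"),
                    next=XPS("child", "c")))

def steps(node):
    """List the main path of an XPS tree by following the next children."""
    out = []
    while node is not None:
        out.append(f"{node.axis}::{node.test}")
        node = node.next
    return out

print(steps(expr))
```

Predicates hang off the step they filter, so the main path and the predicate paths stay separate branches of the same binary tree.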
12
XPath Matching Algorithm
Transition Rules
13
XPath Matching Algorithm
General
The algorithm computes all possible mappings from XPS nodes of the view to XPS nodes of the query expression in a single top-down pass over the view expression.
The basic algorithm deals only with and/or predicates and with the child, attribute, and descendant axes.
Each function in the table evaluates to a Boolean.
Important! The view expression cannot be more restrictive than the query.
14
XPath Matching Algorithm
matchStep(v,q)
Rule 1.1: It is sufficient for one of the conjuncts to be mapped by a node of v. (For the other conjunct we use the reference.)
Rule 1.2: If the view is more restrictive than the query, it cannot be used. (For example: Q2)
15
XPath Matching Algorithm
matchStep(v,q) – cont’
Rule 1.3: When the view node has a "descendant" axis, we keep looking for matches further down the tree.
Rule 1.4: If the axis matches, we try to match the predicate and the next children of the view node. (axis in {predicate, child, attribute})
Rule 1.5: If none of the rules match, the algorithm returns false.
16
XPath Matching Algorithm
matchChildren(v,q)
Rule 2.1: If the tests match, we try to match the predicate and the next step of v.
matchPred(v_pred, q)
Rule 3.1: If v does not have a predicate, then the step trivially matches.
17
XPath Matching Algorithm
matchPred(v_pred, q) – cont’
Rule 3.2: If v has a predicate and q does not (meaning v is more restrictive than q), then the match fails.
Rules 3.3 and 3.4: Match both conjuncts in the case of a conjunction, and one disjunct in the case of a disjunction. (For example, if v contains all the orders with both price and amount attributes, it cannot be used for a query that requires either price or amount.)
Rule 3.5: Match both predicates, and also the view's predicate with the query's next child, in the case of nested XPath expressions.
18
XPath Matching Algorithm
matchNext(v_next, q)
Rules 4.1 and 4.2: Same as 3.1 and 3.2.
Rule 4.3: Match the next children, and also the view's next child with the query's predicate, in the case of nested XPath expressions. (For example: v = //a[b/c], q = //a/b[c])
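The four functions can be illustrated with a much-simplified sketch. It is not the paper's transition-rule table, which is not reproduced in these slides: it assumes only root, child, descendant and attribute axes, treats a predicate as a single nested path (no and/or, no comparisons), and records no match matrix. The XPS class and the root helper are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class XPS:
    axis: str                      # "root", "child", "descendant" or "attribute"
    test: str                      # element or attribute name, or "*"
    pred: Optional["XPS"] = None   # nested path predicate, None for null
    next: Optional["XPS"] = None   # next step, None for null

def match_step(v, q):
    """Can view step v be mapped to query step q (or, for '//', below it)?"""
    if q is None:
        return False
    if v.axis == "descendant":
        if q.axis not in ("child", "descendant"):
            return False
        # Map v to q itself, or keep looking further down the query tree.
        return (match_children(v, q)
                or match_step(v, q.next)
                or match_step(v, q.pred))
    if v.axis == q.axis:
        return match_children(v, q)
    return False  # e.g. view /a against query //a: the view is too restrictive

def match_children(v, q):
    """If the tests match, match the predicate and the next step of v."""
    if v.test != "*" and v.test != q.test:
        return False
    return match_pred(v.pred, q) and match_next(v.next, q)

def match_pred(v_pred, q):
    if v_pred is None:
        return True          # no view predicate: trivially matches
    # Try the query's predicate first, then its next child (nested paths).
    return match_step(v_pred, q.pred) or match_step(v_pred, q.next)

def match_next(v_next, q):
    if v_next is None:
        return True          # view path ends here
    # Try the query's next child first, then its predicate (nested paths).
    return match_step(v_next, q.next) or match_step(v_next, q.pred)

def root(first_step):
    return XPS("root", "", next=first_step)

# v = //a[b/c] and q = //a/b[c], the example given for Rule 4.3
v = root(XPS("descendant", "a", pred=XPS("child", "b", next=XPS("child", "c"))))
q = root(XPS("descendant", "a", next=XPS("child", "b", pred=XPS("child", "c"))))
print(match_step(v, q))
```

A view with a predicate that the query lacks, such as //a[b] against //a, fails in match_pred because both query children are null, which mirrors the "view cannot be more restrictive" requirement.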
19
XPath Matching Algorithm
Example
Consider the following hierarchy of employees. Each employee has:
Name, Salary and Bonus attributes
Zero or more sub-employee elements
20
XPath Matching Algorithm
Example – cont’
Consider the following view: //Employee//@*, which contains all attributes in the subtree of any employee.
And the query: //Employee[@Bonus]/Employee[@Bonus]/@Salary, which asks for the salary of employees who, together with their direct managers, have bonuses.
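To make the view and the query concrete, here is a small sketch that evaluates both over a made-up employee hierarchy using Python's ElementTree. The sample data, the Company wrapper element, and the use of Name as an identifier are assumptions for illustration. ElementTree's limited XPath cannot select attributes directly, so the final /@Salary step is done with get().

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<Company>
  <Employee Name="Ann" Salary="100" Bonus="10">
    <Employee Name="Bob" Salary="80" Bonus="5">
      <Employee Name="Cid" Salary="60"/>
    </Employee>
    <Employee Name="Dee" Salary="70"/>
  </Employee>
</Company>
""")

# Query: //Employee[@Bonus]/Employee[@Bonus]/@Salary
matches = doc.findall(".//Employee[@Bonus]/Employee[@Bonus]")
salaries = [e.get("Salary") for e in matches]

# View: //Employee//@*, every attribute in the subtree of any employee.
# Each attribute is identified here by (employee name, attribute name).
view = {
    (e.get("Name"), attr)
    for emp in doc.iter("Employee")
    for e in emp.iter()          # emp itself and all of its descendants
    for attr in e.attrib
}

print(salaries)
print(("Bob", "Salary") in view)
```

Only Bob qualifies: he has a bonus and so does his direct manager Ann. His Salary attribute is one of the rows in the view, which is why the view is a candidate for answering the query.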
21
XPath Matching Algorithm
matchStep(1,11) -> matchChildren(1,11) -> matchPred(null,11) ^ matchNext(2,11)
matchPred(null,11) = T
Next: matchNext(2,11)
22
XPath Matching Algorithm
matchStep(2, null) = F
matchStep(2,12) -> matchChildren(2,12), matchChildren(2,14) (later)
Next: matchChildren(2,12)
23
XPath Matching Algorithm
matchPred(null,12) ^ matchNext(3,12), where matchPred(null,12) = T
matchNext(3,12) -> matchStep(3,13), matchStep(3,14)
Next: matchStep(3,14); the other branch is evaluated later
24
XPath Matching Algorithm
matchChildren(3,14) -> matchPred(null,14) ^ matchNext(4,14), where matchPred(null,14) = T
matchNext(4,14) -> matchStep(4,15), matchStep(4,16)
matchStep(4,15) -> matchChildren(4,15) = T
matchStep(4,16) -> matchChildren(4,16) = T
25
XPath Matching Algorithm
Full call tree:
matchStep(XPS1, XPS11)
  matchChildren(XPS1, XPS11)
    matchPred(null, XPS11): T
    matchNext(XPS2, XPS11)
      matchStep(XPS2, null): F
      matchStep(XPS2, XPS12)
        matchChildren(XPS2, XPS12)
          matchPred(null, XPS12): T
          matchNext(XPS3, XPS12)
            matchStep(XPS3, XPS13): F
            matchStep(XPS3, XPS14): T
        matchChildren(XPS2, XPS14): same way…
The remaining calls on the successful path evaluate to T.
26
XPath Matching Algorithm
Recording the Match
Consider the following example:
View v consists of n nodes: //a//a…//a
Query q consists of m nodes: /a/a…/a
where m > n.
Any view node n_v can map to any query expression node n_q such that n_v's parent maps to some ancestor of n_q.
Hence, there are "m choose n" distinct tree mappings of the view to the query expression.
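The count can be checked with a short sketch. It assumes, as in the example, that each view step must map strictly below the previous one, so a mapping is an increasing choice of n query positions out of m. The function name count_mappings is hypothetical.

```python
from itertools import combinations
from math import comb

def count_mappings(n, m):
    """Count mappings of the view //a (n steps) into the query /a (m steps)."""
    # Each mapping picks n of the m query positions, in increasing order.
    return sum(1 for _ in combinations(range(m), n))

for n, m in [(2, 4), (3, 6), (5, 10)]:
    print(n, m, count_mappings(n, m), comb(m, n))
```

With n equal to half of m the count grows exponentially in m, which is why enumerating the mappings one by one is not practical.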
27
XPath Matching Algorithm
Solution: Keep track of all mappings in a match matrix structure (matches between v and q).
The match matrix allows us to encode an exponential number of tree mappings in a polynomial-size structure, by recording all possible contexts for each node mapping.
It also reduces the running time of the algorithm to polynomial.
28
XPath Matching Algorithm
Match matrix structure:
Columns (query nodes, q): XPS11 root, XPS12 //Employee, XPS13 @Bonus, XPS14 /Employee, XPS15 @Bonus, XPS16 @Salary
Rows (view nodes, v) with their True cells:
XPS1 root: T
XPS2 //Employee: T, T
XPS3 dos::*: T, T
XPS4 @*: T, T, T
(The column positions of the T cells and the edges between them are not reproduced.)
Each row corresponds to an XPS node of the view tree.
Each column corresponds to an XPS node of the query tree.
Each cell may contain one of three possible values: Empty, True or False.
Edges represent the context in which the mapping was detected.
29
XPath Matching Algorithm
For example, in the previous XPS trees:
matchStep(XPS v2, XPS q2) will be called twice, but will be calculated only once.
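The "called many times, calculated once" idea can be sketched for the //a…//a against /a…/a example. This is a simplified illustration, not the paper's structure: the matrix here is a plain dictionary of True/False cells, the context edges are not recorded, and the name match_with_matrix is hypothetical.

```python
def match_with_matrix(n, m):
    """Match the view //a (n steps) against the query /a (m steps)."""
    matrix = {}        # (view step i, query step j) -> True / False
    evaluations = 0    # number of cells that were actually computed

    def match_step(i, j):
        nonlocal evaluations
        if i == n:               # every view step has been mapped
            return True
        if j == m:               # query exhausted while view steps remain
            return False
        if (i, j) not in matrix:
            evaluations += 1
            mapped_here = match_step(i + 1, j + 1)   # map view step i to j
            mapped_below = match_step(i, j + 1)      # '//' skips query step j
            matrix[(i, j)] = mapped_here or mapped_below
        return matrix[(i, j)]

    return match_step(0, 0), evaluations

ok, evaluations = match_with_matrix(10, 20)
print(ok, evaluations)
```

Both branches are explored on purpose, since the algorithm computes all possible mappings. Even so, the number of computed cells stays within n * m, while the number of distinct mappings is "20 choose 10".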
30
Extensions
Handling Comparison Predicates
The algorithm as shown so far lacks rules for matching comparison operations.
To solve this problem, the XPS trees are preprocessed so that comparison operations are transformed into filters.
After a successful match, for each filter in the view tree, we have to check that the filter of the matched node in the query tree is at least as specific.
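One way to picture the "at least as specific" check is to treat each filter as a numeric interval and test containment. This is a sketch under assumptions, not the paper's method: filters are reduced to a single (operator, constant) pair on the same value, and the function names are hypothetical.

```python
import math

def interval(op, c):
    """Return (low, low_closed, high, high_closed) for 'value op c'."""
    if op == "=":
        return (c, True, c, True)
    if op == ">":
        return (c, False, math.inf, False)
    if op == ">=":
        return (c, True, math.inf, False)
    if op == "<":
        return (-math.inf, False, c, False)
    if op == "<=":
        return (-math.inf, False, c, True)
    raise ValueError(f"unsupported operator: {op}")

def at_least_as_specific(query_filter, view_filter):
    """True if every value passing the query filter also passes the view filter."""
    ql, ql_closed, qh, qh_closed = interval(*query_filter)
    vl, vl_closed, vh, vh_closed = interval(*view_filter)
    # The query interval must lie inside the view interval at both ends.
    low_ok = ql > vl or (ql == vl and (vl_closed or not ql_closed))
    high_ok = qh < vh or (qh == vh and (vh_closed or not qh_closed))
    return low_ok and high_ok

# View keeps values > 100; a query asking for values > 200 can use it.
print(at_least_as_specific((">", 200), (">", 100)))
# A query asking for values >= 100 cannot: the value 100 is missing from the view.
print(at_least_as_specific((">=", 100), (">", 100)))
```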
31
XPath Matching Algorithm
Complexity
Space complexity:
Match matrix size: O(|V| * |Q|), where |V| is the number of XPS nodes in the view and |Q| is the number of XPS nodes in the query expression.
Each matrix cell can have at most |Q| incoming edges.
The number of edges in the matrix: O(|V| * |Q|^2)
32
XPath Matching Algorithm
Complexity – cont’
Constructing the matrix:
The matchStep(v,q) function has only |V| * |Q| distinct sets of parameters.
Each match is calculated at most once.
In the worst case, a function call may expand into |Q| function calls.
Total running time: O(|V| * |Q|^2). Polynomial!
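As a worked instance of the bound, take the running example, whose match matrix has 4 view rows and 6 query columns (|V| = 4, |Q| = 6). The number below is an upper bound implied by the formula, not a measured call count:

```latex
\underbrace{|V|\cdot|Q|}_{\text{distinct matchStep calls}}
\times
\underbrace{|Q|}_{\text{calls per expansion}}
= O\!\left(|V|\cdot|Q|^{2}\right),
\qquad
4 \cdot 6 \cdot 6 = 144 .
```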
33
Compensation Expression
After a successful match between the query and the view, we have to compute the query result from the materialized view.
For simplicity, we assume that there is only one possible mapping between the view and the query.
This computation is called compensation, and is achieved in two steps:
Step 1: Eliminate unnecessary conditions. For example:
V = //a[@b]
Q = //a[@b^@c]
The query expression does not have to include the [@b] condition, since it is implied by the view.
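Step 1 can be sketched for the simple case of a conjunction of conditions. The sketch assumes conditions are compared as opaque strings, so only syntactically identical conditions are eliminated. The function name residual_conjuncts is hypothetical.

```python
def residual_conjuncts(query_conjuncts, view_conjuncts):
    """Drop query conditions that the view already guarantees."""
    guaranteed = set(view_conjuncts)
    return [c for c in query_conjuncts if c not in guaranteed]

# V = //a[@b], Q = //a[@b ^ @c]: only [@c] still has to be checked.
print(residual_conjuncts(["@b", "@c"], ["@b"]))
```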
34
Compensation Expression
Step 2: The query statement is transformed, or relaxed, so that data can be extracted easily from the view table.
We define the last matched node of the view as the extraction point, and the last matched node of the query expression as the compensation root node.
Then we rewrite Q as an equivalent expression that starts at the compensation root node.
Now, as in this example, item elements can be directly extracted from the view table.
35
Compensation Expression
Extract from the table
Use reference
36
Additional References
XQuery and SQL/XML: http://www.datadirect.com
XML Seminar: http://www.jgreen.de
General: http://msdn.microsoft.com
37
The End