1 Rainbow XML-Query Processing Revisited: The Incomplete Story (Part II)
Xin Zhang
2 Outline
XAT Decorrelation.
Optimization:
  XAT Computation Pushdown.
  XAT Data Model Cleanup.
  XAT Cutting.
Conclusion & Future Work.
3 XAT Decorrelation
XQuery queries are correlated.
Decorrelation is required for the optimizations:
  XAT Computation Pushdown.
  XAT Data Model Cleanup.
  XAT Cutting.
4 Three Kinds of Decorrelation
Simple Decorrelation:
  No additional sources.
  No aggregate functions.
Complex Decorrelation with Additional Sources.
Complex Decorrelation with Aggregate Functions.
5 Example* of XML Use Cases
Title                Price
TCP/IP Illustrated   65.95
TCP/IP Illustrated   65.95
Data on the Web      34.95
Data on the Web      39.95
6 Simple Query Example
In the document "prices.xml", find the book title.
{ for $t in distinct(document("prices.xml")/book/title) return $t }
XAT operators (from the slide diagram): T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; FOR($t); Agg(); T([$t]):col1
7 Simple Decorrelation
Linearize the tree: T[FOR(CB, T2[])[T1[S1]]] → T[T2[T1[S1]]]
Before: T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; FOR($t); Agg(); T([$t]):col1
After: T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; Agg(); T([$t]):col1
8 Is Simple Decorrelation Right?
Every operator except Groupby has the semantics of "for each tuple in the input table".
Hence, the FOR operator can be omitted in the simple decorrelation scenario.
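The argument can be checked with a minimal sketch. It assumes a table is a list of dict tuples and uses a hypothetical `tagger` function as a stand-in for the XAT Tagger; the element name `result` is also an assumption, not something taken from the slides.

```python
# Hypothetical stand-ins for XAT operators: a table is a list of dict tuples.

def tagger(table, src, out, tag="result"):
    """Per-tuple operator: wrap the value of column `src` in an element."""
    return [{**row, out: f"<{tag}>{row[src]}</{tag}>"} for row in table]


def with_for(bindings):
    """Correlated form: run the inner plan once per binding tuple."""
    output = []
    for row in bindings:  # plays the role of the FOR operator
        output.extend(tagger([row], "$t", "col1"))
    return output


def without_for(bindings):
    """Decorrelated form: the tagger already iterates over every tuple."""
    return tagger(bindings, "$t", "col1")


titles = [{"$t": "TCP/IP Illustrated"}, {"$t": "Data on the Web"}]
assert with_for(titles) == without_for(titles)
```

Because the inner operator already works tuple by tuple, wrapping it in an explicit loop adds nothing. A Groupby would break this equivalence, since it looks at more than one tuple at a time.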
9 Two Types of Navigate
Navigate Unnesting (U): unnests the parent-child relationship and duplicates the parent value for each child.
Navigate Collection (C): nests the parent-child relationship, creating a collection of children while keeping a single parent.
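The difference between the two Navigate variants can be illustrated with a small sketch. The function names, the `children` callback, and the `PRICES` lookup are hypothetical; tables are again modelled as lists of dict tuples.

```python
def navigate_unnest(table, src, children, out):
    """One output tuple per child; the parent value is duplicated."""
    return [{**row, out: child} for row in table for child in children(row[src])]


def navigate_collect(table, src, children, out):
    """One output tuple per parent; the children form a single collection."""
    return [{**row, out: list(children(row[src]))} for row in table]


# Hypothetical parent value: a book (identified by title) and its prices.
PRICES = {"Data on the Web": ["34.95", "39.95"]}
table = [{"$b": "Data on the Web"}]

unnested = navigate_unnest(table, "$b", lambda b: PRICES[b], "col4")
collected = navigate_collect(table, "$b", lambda b: PRICES[b], "col4")
```

Here `unnested` holds two tuples that both carry the same `$b`, while `collected` holds a single tuple whose `col4` is the list of both prices.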
10 Where to Use the Two Types
Navigate Unnesting (U): FOR binding.
Navigate Collection (C): LET binding.
11 Complex Query Example
In the document "prices.xml", find the book title and its prices.
{ for $t in distinct(document("prices.xml")/book/title), let $b := document("prices.xml")/book[title = $t] return $t, $b/price }
XAT operators (from the slide diagram): C($b, price):col4; T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; FOR($t); Agg(); T([$t], [col4]):col1; S("prices.xml"):R2; C(R2, /book):$b; (col3=$t); C($b, title):col3
12 Complex Decorrelation with Additional Source
T[FOR(CB, T2[S2])[T1[S1]]] → T[T2[ [T1[S1], S2]]]
Before: C($b, price):col4; T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; FOR($t); Agg(); T([$t], [col4]):col1; S("prices.xml"):R2; C(R2, /book):$b; (col3=$t); C($b, title):col3
After: C($b, price):col4; T([$t], [col4]):col1; S("prices.xml"):R2; C(R2, /book):$b; (col3=$t); C($b, title):col3; T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; Agg()
13 Full Query Example
In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element.
{ for $t in distinct(document("prices.xml")/book/title), let $b := document("prices.xml")/book[title = $t] return $t, min($b/price/text()) }
XAT operators (from the slide diagram): C($b, price/text()):col4; T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; FOR($t); Agg(); T([$t], [col5]):col1; S("prices.xml"):R2; C(R2, /book):$b; (col3=$t); C($b, title):col3; min(col4):col5
14 Complex Query Decorrelation with One Aggregation Function
T[FOR(CB, T2[Agg(T3[])])[T1[S1]]] → T[ (DM(T1))[T1,T2[ (DM(T1),Agg(T3[ [Distinct(T1[S1]), S2]))]]]
DM(T1) is the data model computed from T1.
Before (diagram nodes): S2; Agg(); T1; S1; T3; FOR($rate); T2; T
After (diagram nodes): S1; Groupby(DM(T1), Agg()); S2; T3; T; T2; T1; Distinct
15 The Query after Decorrelation
Before: C($b, price/text()):col4; T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; FOR($t); Agg(); T([$t], [col5]):col1; S("prices.xml"):R2; C(R2, /book):$b; (col3=$t); C($b, title):col3; min(col4):col5
After: C($b, price/text()):col4; T([$t], [col4]):col1; S("prices.xml"):R2; C(R2, /book):$b; (col3=$t); C($b, title):col3; T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; Agg(); GB(DM, min(col4):col5)
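The effect of replacing the FOR loop and its per-binding aggregate with a single Groupby can be sketched as follows. This is only an illustration under assumptions: `BOOKS` holds the (title, price) pairs of the example data, the helper names are hypothetical, and a plain nested-loop join stands in for whatever binary operator the XAT plan uses.

```python
from collections import defaultdict

# (title, price) pairs from the example data.
BOOKS = [
    ("TCP/IP Illustrated", 65.95),
    ("TCP/IP Illustrated", 65.95),
    ("Data on the Web", 34.95),
    ("Data on the Web", 39.95),
]


def distinct_titles(books):
    """distinct(col2):$t, keeping first-seen order."""
    seen = []
    for title, _ in books:
        if title not in seen:
            seen.append(title)
    return seen


def correlated(books):
    """FOR each $t: rescan the source, filter on title = $t, take the min."""
    return [
        (t, min(price for title, price in books if title == t))
        for t in distinct_titles(books)
    ]


def decorrelated(books):
    """Join the bindings with the source once, then group by $t and aggregate."""
    joined = [
        (t, price)
        for t in distinct_titles(books)
        for title, price in books
        if title == t
    ]
    groups = defaultdict(list)
    for t, price in joined:
        groups[t].append(price)
    return [(t, min(prices)) for t, prices in groups.items()]


assert correlated(BOOKS) == decorrelated(BOOKS)
```

Both forms return one minimum price per distinct title; the decorrelated form expresses the per-binding aggregate as a group-by over the join result.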
16 Where are we?
XAT Decorrelation.
Optimization:
  XAT Computation Pushdown.
  XAT Data Model Cleanup.
  XAT Cutting.
Conclusion & Future Work.
17 XAT Computation Pushdown
Goal: push the execution into the relational database.
Steps:
  Push Navigation down.
  Cancel out Navigation and Tagger.
  Generate the SQL statement.
18 Navigation Pushdown
A Navigation can be pushed down through all operators until it reaches a child operator it depends on.
Example rewriting rules:
  (x1, path):x2[ (y1, path):y2[T]] → (y1, path):y2[ (x1, path):x2[T]] (if x1 != y2)
  (x1, path):x2[ (c) [T]] → (c) [ (x1, path):x2[T]]
  (x1, path):x2[ [T1, T2]] → [T1, (x1, path):x2[T2]] (if x1 in DM(T2))
  (x1, path):x2[ [T1, T2]] → [ (x1, path):x2[T1], T2] (if x1 in DM(T1))
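The first rewriting rule, swapping two Navigate operators when the outer one does not read the column the inner one produces, can be sketched on a tiny plan tree. The `Navigate` dataclass and `push_down` function are hypothetical names for illustration, and the sub-plan `T` is represented by a plain string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Navigate:
    src: str       # input column (x1 or y1 in the rule)
    path: str      # navigation path
    out: str       # output column (x2 or y2 in the rule)
    child: object  # sub-plan below this operator


def push_down(op):
    """Swap Navigate(x1):x2 over Navigate(y1):y2 when x1 != y2."""
    if not isinstance(op, Navigate) or not isinstance(op.child, Navigate):
        return op
    inner = op.child
    if op.src == inner.out:
        # The outer Navigate depends on its child: stop pushing.
        return op
    pushed = Navigate(op.src, op.path, op.out, inner.child)
    return Navigate(inner.src, inner.path, inner.out, pushed)


independent = Navigate("$b", "price", "col4", Navigate("$b", "title", "col3", "T"))
dependent = Navigate("$b", "price", "col4", Navigate("R2", "/book", "$b", "T"))

swapped = push_down(independent)
unchanged = push_down(dependent)
```

In the `independent` plan the price navigation moves below the title navigation. In the `dependent` plan it stays put, because it reads `$b`, which its child produces.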
19 Navigation Pushdown Example
Plan before and after pushdown (slide diagram; both versions list the same operators): C($b, price/text()):col4; T([$t], [col4]):col1; S("prices.xml"):R2; C(R2, /book):$b; (col3=$t); C($b, title):col3; T([col1]):col0; distinct(col2):$t; S("prices.xml"):R1; (R1, /book/title):col2; Agg(); GB(DM, min(col4):col5)
20 Navigation/Tagger Cancel Out
Used to simplify a composite XAT tree.
Transformation rule: (x, /):y[T([z]):x[s]] → s
Note: type analysis is also used for the cancel out.
21 View Query Example
View data (title, price): TCP/IP Illustrated 65.95; TCP/IP Illustrated 65.95; Data on the Web 34.95; Data on the Web 39.95
{ for $row in distinct(DXV/book/row), return $row/title, $row/price }
XAT operators (from the slide diagram): T([col6]):col5; T([col7],[col8]):col6; S(DXV):R3; (R3, /book/row):$row; Agg(); ($row, title):col7; ($row, price):col8
22 Cancel Out Example (1)
Rule applied: (x, y)[op():x[s]] → op():y[s]
Before: C($b, price/text()):col4; S("prices.xml"):R2; C(R2, /book):$b; C($b, title):col3; ...; T([col6]):col5; T([col7],[col8]):col6; S(DXV):R3; (R3, /book/row):$row; Agg(); ($row, title):col7; ($row, price):col8
After: C($b, price/text()):col4; C(R2, /book):$b; C($b, title):col3; ...; T([col6]):R2; T([col7],[col8]):col6; S(DXV):R3; (R3, /book/row):$row; Agg(); ($row, title):col7; ($row, price):col8
23 Cancel Out Example (2)
Before: C($b, price/text()):col4; C(R2, /book):$b; C($b, title):col3; ...; T([col6]):R2; T([col7],[col8]):col6; S(DXV):R3; (R3, /book/row):$row; Agg(); ($row, title):col7; ($row, price):col8
After: C($b, price/text()):col4; C($b, title):col3; ...; T([col7],[col8]):$b; S(DXV):R3; (R3, /book/row):$row; ($row, title):col7; ($row, price):col8
24 Cancel Out Example (3)
Before: C($b, price/text()):col4; C($b, title):col3; ...; T([col7],[col8]):$b; S(DXV):R3; (R3, /book/row):$row; ($row, title):col7; ($row, price):col8
After: C($b, price/text()):col4; ...; T([col7],[col8]):$b; S(DXV):R3; (R3, /book/row):$row; ($row, title):col3; ($row, price):col8
25 Cancel Out Example (4)
Before: C($b, price):temp1; ...; T([col7],[col8]):$b; S(DXV):R3; (R3, /book/row):$row; ($row, title):col3; ($row, price):col8; C(temp1, text()):col4
After: ...; S(DXV):R3; (R3, /book/row):$row; ($row, title):col3; ($row, price):temp1; C(temp1, text()):col4
26 SQL Generation
Find a pattern in the XAT.
Translate that pattern into a SQL operator that will access the relational database.
27 SQL Generation Example
Before: ...; S(DXV):R3; (R3, /book/row):$row; ($row, title):col3; ($row, price):temp1; C(temp1, text()):col4
After: ...; SQL(select title as col3, price as temp1 from book):{col3, temp1}; C(temp1, text()):col4
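To see what the generated SQL operator hands to the rest of the plan, the statement can be run against a small in-memory database. This sketch assumes a relational table `book(title, price)` populated with the example data; SQLite is used here only for illustration and is not the database named in the slides.

```python
import sqlite3

# In-memory stand-in for the relational store behind the view.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("create table book (title text, price real)")
conn.executemany(
    "insert into book values (?, ?)",
    [
        ("TCP/IP Illustrated", 65.95),
        ("TCP/IP Illustrated", 65.95),
        ("Data on the Web", 34.95),
        ("Data on the Web", 39.95),
    ],
)

# The SQL statement from the slide: one query replaces Source plus two Navigates.
rows = conn.execute("select title as col3, price as temp1 from book").fetchall()
columns = rows[0].keys()
conn.close()
```

The result columns carry the XAT column names `col3` and `temp1` directly, so the operators above the SQL node need no renaming.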
28 Where are we?
XAT Decorrelation.
Optimization:
  XAT Computation Pushdown.
  XAT Data Model Cleanup.
  XAT Cutting.
Conclusion & Future Work.
29 XAT Data Model Cleanup
By default, each operator appends one additional column to the data model.
Cleanup helps with:
  Execution: optimizes the data storage during execution.
  Cutting: gets rid of the unused operators in the XQuery.
Equation for Data Model Cleanup (only keep the columns required by ancestors):
  DM := (DM_p − P_p) ∪ C_p ∪ (P − C)
where P and C are the columns an operator produces and consumes, and the subscript p denotes its parent operator.
30 Data Model Example
for $b in document("prices.xml")/book let $prices := $b/price return $b
XAT operators (numbered 1 to 5 in the slide diagram): S("prices.xml"):R1; (R1, /book):$b; Agg(); ($b,):col1; C($b, price):$prices
Node  Produce    Consume  DM before                 DM after
1     {}                  {$prices, R1, $b, col1}   {}
2     {col1}     {$b}     {$prices, R1, $b, col1}   {col1}
3     {$prices}  {$b}     {$prices, R1, $b}         {$b, $prices}
4     {$b}       {R1}     {R1, $b}                  {$b}
5     {R1}       {}       {R1}
DM := (DM_p − P_p) ∪ C_p ∪ (P − C)
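The cleanup equation can be replayed on this example with a few lines of set arithmetic. The sketch below assumes the plan is a linear chain from node 1 (root) down to node 5, reads the equation's missing operators as set unions, and treats the root's empty Consume cell as the empty set; `PLAN` and `cleaned_data_models` are hypothetical names.

```python
# (node, produced, consumed), listed from the root (1) down to the leaf (5).
PLAN = [
    (1, set(), set()),  # assumption: the root consumes nothing
    (2, {"col1"}, {"$b"}),
    (3, {"$prices"}, {"$b"}),
    (4, {"$b"}, {"R1"}),
    (5, {"R1"}, set()),
]


def cleaned_data_models(plan):
    """Apply DM := (DM_p - P_p) | C_p | (P - C) top-down along the chain."""
    result = {}
    dm_p, p_p, c_p = set(), set(), set()  # the root has no parent
    for node, produced, consumed in plan:
        dm = (dm_p - p_p) | c_p | (produced - consumed)
        result[node] = dm
        dm_p, p_p, c_p = dm, produced, consumed
    return result


dm_after = cleaned_data_models(PLAN)
assert dm_after[3] == {"$b", "$prices"}
```

Under these assumptions the computed sets for nodes 1 to 4 agree with the "DM after" column of the table.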
31 Where are we?
XAT Decorrelation.
Optimization:
  XAT Computation Pushdown.
  XAT Data Model Cleanup.
  XAT Cutting.
Conclusion & Future Work.
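32 XAT Cutting
General idea: get rid of the operators that produce useless data.
Equations:
  R := (R_p − P) ∪ C
  (P ∪ M) ∩ (R_p ∪ M_p) = ∅
where P, C, M, and R are the columns an operator produces, consumes, modifies, and requires, and the subscript p denotes its parent operator.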
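33 XAT Cutting Example
R := (R_p − P) ∪ C
(P ∪ M) ∩ (R_p ∪ M_p) = ∅
for $b in document("prices.xml")/book let $prices := $b/price return $b
XAT operators (numbered 1 to 5 in the slide diagram): S("prices.xml"):R1; (R1, /book):$b; Agg(); ($b,):col1; C($b, price):$prices
Node  Produce    Consume  Modified  Required  Cut?
1     {}                  {*}       {}        N/A
2     {col1}     {$b}     {}        {$b}      {col1}
3     {$prices}  {$b}     {}        {$b}      {}
4     {$b}       {R1}     {}        {R1}      {$b}
5     {R1}       {}                           {R1}
The Cut? column shows (P ∪ M) ∩ (R_p ∪ M_p); node 3 yields the empty set, so it is the operator that gets cut.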
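The two cutting equations can be checked against the table with a short sketch. It rests on several assumptions: the plan is a linear chain from node 1 down to node 5, the missing operators are read as R := (R_p − P) ∪ C and (P ∪ M) ∩ (R_p ∪ M_p), the root's `{*}` is expanded to the set of all columns, and blank table cells are treated as empty sets. `PLAN` and `analyse` are hypothetical names.

```python
ALL_COLUMNS = {"R1", "$b", "$prices", "col1"}  # stands in for {*}

# (node, produced, consumed, modified), from the root (1) down to the leaf (5).
PLAN = [
    (1, set(), set(), ALL_COLUMNS),
    (2, {"col1"}, {"$b"}, set()),
    (3, {"$prices"}, {"$b"}, set()),
    (4, {"$b"}, {"R1"}, set()),
    (5, {"R1"}, set(), set()),
]


def analyse(plan):
    """Return (required columns per node, set of nodes that can be cut)."""
    required, cut = {}, set()
    r_p, m_p = set(), set()
    for index, (node, produced, consumed, modified) in enumerate(plan):
        required[node] = (r_p - produced) | consumed
        # The root is never cut (N/A in the table).
        if index > 0 and not ((produced | modified) & (r_p | m_p)):
            cut.add(node)
        r_p, m_p = required[node], modified
    return required, cut


required, cut = analyse(PLAN)
assert cut == {3}
```

Only node 3, the Navigate Collection that produces the unused `$prices`, has an empty intersection with what its parent requires or modifies.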
34 Conclusions
XQuery queries are heavily correlated and hence need to be decorrelated for better optimization.
After decorrelation, more optimization techniques can be applied:
  Computation Pushdown.
  Data Model Cleanup.
  Cutting.
35 Future Work
Write a TR to formalize the XAT. Compare with ORDB, ODB, and also XQA operators.
Wrap up:
  Finalize the uncertain operators that deal with collections: Union, Navigate.
  Formalize the Pushdown Rewriting Rules by Type (Reg. Exp. Type) Analysis.
  Finalize the XAT Rewriting Rules for:
    Order handling.
    Update propagation.
  Translation from XAT back to Query.
Next step: generate the search space and optimization algorithm for XAT, ready for Schema Generation.