1 IVOX: Incremental View Maintenance for Ordered XML
DSRG Talk, WPI, February 20th, 2003
Students: Katica Dimitrova & Maged El Sayed
Advisor: Prof. Elke Rundensteiner

2 Outline: Motivation; Problem Description; Background (XML Algebra; Order in XML Algebra); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

3 Outline: Motivation; Problem Description; Background (XML Algebra; Order in XML Algebra); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

4 Motivation
- Views in general
  - Data warehouses
  - Information integration
  - Access control, privacy, etc.
- XML views (extra useful)
  - Information inter-portability
  - Crossing gaps between different data models
- Materialized views
  - Speed up data retrieval
  - Query optimization
  - Increased availability
[Figure: a view, defined by a view definition query, over RDB, XML, and other sources]

5 Maintaining Materialized Views
When sources are updated, a materialized view may become inconsistent.
Methods of view maintenance:
- Recomputation: recompute the view from scratch from the base data
- Incremental view maintenance: compute the changes to the view in response to changes to the base sources
Heuristic: incremental view maintenance is usually cheaper than full recomputation.

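The contrast between the two methods can be illustrated with a toy sketch. It is not taken from the talk: the "view" here is just a price filter over a list of rows, the function names are hypothetical, and order is ignored (the new row is appended at the end of both the base data and the view).

```python
# Toy illustration only: a "view" that keeps rows whose price is below a limit.
# Row layout and function names are assumptions made for this sketch.

def recompute_view(base, limit):
    """Recomputation: rebuild the whole view from the base data."""
    return [row for row in base if row["price"] < limit]


def propagate_insert(view, new_row, limit):
    """Incremental maintenance: inspect only the inserted row."""
    if new_row["price"] < limit:
        view.append(new_row)
    return view


base = [{"title": "A", "price": 65.95}, {"title": "B", "price": 39.95}]
view = recompute_view(base, 60)

# A source update arrives: one new row is appended to the base data.
new_row = {"title": "C", "price": 55.48}
base.append(new_row)
view = propagate_insert(view, new_row, 60)

# Both methods must agree; the incremental one touched a single row.
assert view == recompute_view(base, 60)
```

Recomputation scans every base row on every update, while the incremental path does work proportional to the size of the change.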
6 Outline: Motivation; Problem Description; Background (The XAT Algebra; XML order in the XAT Context); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

7 The Problem
Previous work for:
- Relational [GMS93], bag semantics [GL95], [ZGHW95], [PSCP02]
- Object-Relational [LVM00]
- Object-Oriented [AFP02]
- Structured data models [AMRVW98], [ZM98]
- XML data model not handling order [LD00]
Can techniques for other data models be reused for XML?

8 Is Maintaining XML Views Different?
- XML features
  - Hierarchical
  - Optional elements
  - Self-typed
  - References
  - Ordered
- Expressiveness of the view definition language
  - Complex operations: tagging, unnesting, aggregation, ...
- Expected large auxiliary information

9 Example
Bib.xml (text content of the three books, in document order):
- 65.95, Advanced Programming in the Unix environment
- TCP/IP Illustrated
- 39.95, Data on the Web
View Definition Query: list all books that cost less than $60, including their title and price
  for $b in document("bib.xml")/bib/book
  where $b/price/text() < 60
  return $b/title, $b/price
View Extent:
- Data on the Web, 39.95

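As a rough cross-check of the view extent, the query can be mimicked in Python with the standard `xml.etree.ElementTree` module. This is a sketch under assumptions: the element names `bib`, `book`, `price`, and `title` are taken from the paths in the query, the exact markup of Bib.xml is assumed, and the function name `view_extent` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed markup for Bib.xml; only the text values and the element names
# used in the query paths come from the slide.
BIB_XML = """
<bib>
  <book><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><price>39.95</price><title>Data on the Web</title></book>
</bib>
"""


def view_extent(bib_root, limit=60.0):
    """Return (title, price) pairs for books cheaper than `limit`, in document order."""
    result = []
    for book in bib_root.findall("book"):
        price = book.find("price")
        # Books without a price never satisfy the where clause.
        if price is not None and float(price.text) < limit:
            result.append((book.find("title").text, price.text))
    return result


root = ET.fromstring(BIB_XML)
print(view_extent(root))  # [('Data on the Web', '39.95')]
```

The second book has no price, so it is filtered out until a price below 60 is inserted into it.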
10 Example
Update: insert a price element with value 55.48 into the second book
Bib.xml after the update:
- 65.95, Advanced Programming in the Unix environment
- 55.48, TCP/IP Illustrated
- 39.95, Data on the Web
View Definition Query (unchanged):
  for $b in document("bib.xml")/bib/book
  where $b/price/text() < 60
  return $b/title, $b/price
View Extent after the update:
- TCP/IP Illustrated, 55.48
- Data on the Web, 39.95

11 Our Goal
Design an incremental view maintenance strategy for XQuery views that:
- Correctly updates the view
- Is order sensitive
  - Returns the view in proper order
  - Allows for updates that specify order
- Covers at least the "core" of XQuery language views
- Minimizes auxiliary information requirements

12 Basics of IVOX Approach: Algebraic
Update propagation rules for each algebra operator and each update type
[Figure: the XQuery view definition is turned into an algebra tree over the XML sources that produces the XML view. At execution time an operator maps input data D1 to output D2; at view maintenance time the same operator maps an update on D1 to an update on D2.]

13 Why Algebraic?
- Robust: easily adaptable to changes in operator semantics
- Extensible: new operators can be added
- Allows for reuse of techniques for known operators
- Language independent: independent of syntax changes (of XQuery by W3C)
- Formal: basis for provable correctness

14 Outline: Motivation; Problem Description; Background (XML Algebra; Order in XML Algebra); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

15 Background on XML Algebra XAT
- XAT operators
  - SQL operators: Select, Project, ...
  - Special operators: Source, FOR, ...
  - XML operators: Navigate, Tagger, ...
- XAT data model (XAT table)
  - Order-sensitive table of tuples
  - Columns denote user-specified or internally generated variable bindings
  - A cell in a tuple holds an XML node or a sequence of XML nodes
[Figure: a Navigate operator (labelled "$col1, price", output column $col3) between an input XAT table with column $b and an output XAT table with columns $col3 and $b]

16 Order in XAT Context
- Order among tuples
- Order among XML nodes in a cell
[Figure: the Navigate example (input table with column $b, output table with columns $col3 and $b)]

17 Order in the XAT Context
- Order among the tuples
- Order among XML nodes in a single cell
[Figure: an Agg operator on $col5 collecting the tuples of its input table (TCP/IP ... 55.48; Data ... 39.95) into a single cell that holds the whole sequence]

18 Order in XAT Context: View Maintenance
On update, worry about:
- Order among tuples
- Order among XML nodes in a cell
[Figure: the Navigate example after the update, with the new value 55.48 placed between 65.95 and 39.95]

19 Order in XAT Context & View Maintenance
On update, worry about:
- Order among the tuples
- Order among XML nodes in a single cell
[Figure: the Agg example on $col5 (TCP/IP ... 55.48; Data ... 39.95)]

20 Duplicate Information in XAT Context
- Complex operations require auxiliary information
- Auxiliary information can be too large in the XAT context
- It may be expensive to maintain
[Figure: the Navigate example, where the same XML fragments (65.95, Advanced ...) are stored in both the input and the output table. Duplicated storage!]

21 Outline: Motivation; Problem Description; Background (XML Algebra; Order in XML Algebra); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

22 Possible Solutions to Order Preservation (I)
Sequential storage (XPROP approach by Maged, Ling & Luping)
- Assume intermediate results are stored sequentially
- Inserts and deletes are performed in physical order
- No order encoding
- Special support required for secondary storage
- May require iteration over many tuples to determine order
[Figure: the Navigate example, with the new tuple 55.48 physically inserted between the tuples 65.95 and 39.95]

23 Possible Solutions to Order Preservation (II)
Naive order encoding for tuples and sequences of XML nodes
- Assign order numbers to tuples and to XML nodes in a sequence
- Requires frequent renumbering on inserts
[Figure: the Navigate example with an Ord column (1, 2, 3 on the input table; 1, 2 on the output table). Inserting 55.48 at position 2 forces the following order numbers to be renumbered.]

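A minimal sketch of why naive integer order numbers force renumbering. The table layout (a list of `(ord, value)` pairs) and the function name are assumptions made for this illustration; the values are the prices from the running example.

```python
def insert_with_renumbering(rows, position, value):
    """Insert `value` so that it receives order number `position` (1-based).

    `rows` is a list of (ord, value) pairs sorted by ord, with ords 1..n.
    Returns the new list and the number of existing rows that were renumbered.
    """
    renumbered = 0
    out = []
    for ord_no, val in rows:
        if ord_no >= position:
            # Every row at or after the insert position must shift by one.
            out.append((ord_no + 1, val))
            renumbered += 1
        else:
            out.append((ord_no, val))
    out.insert(position - 1, (position, value))
    return out, renumbered


rows = [(1, "65.95"), (2, "39.95")]
rows, touched = insert_with_renumbering(rows, 2, "55.48")
print(rows)     # [(1, '65.95'), (2, '55.48'), (3, '39.95')]
print(touched)  # 1
```

In the worst case (insert at the front) every existing row is touched, which is the cost the later encodings try to avoid.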
24 Using Node Identity
Idea: use node identity
Usage:
- For encoding order and structure
- As a reference to base data

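25 What Encoding For Node Identity?
Existing techniques for encoding order for XML:
- Global Order (UW)  <- shown on this slide
- Local Order (UW)
- Dewey Order (UW)
- Lexicographical Order (MASS)
[Figure: the bib tree (bib; three book elements; their price and title children) with nodes numbered 1 to 9 in document order. After a price element is inserted, it receives number 6 and the following nodes are renumbered 7 to 10.]
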
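26 What Encoding For Node Identity?
Existing techniques for encoding order for XML:
- Global Order (UW)
- Local Order (UW)  <- shown on this slide
- Dewey Order (UW)
- Lexicographical Order (MASS)
[Figure: the bib tree with each node numbered by its position among its siblings: bib 1; books 1, 2, 3; children 1, 2 / 1 / 1, 2. After a price element is inserted, it receives number 1 and its sibling is renumbered to 2.]
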
27 What Encoding For Node Identity?
Existing techniques for encoding order for XML:
- Global Order (UW)
- Local Order (UW)
- Dewey Order (UW)  <- shown on this slide
- Lexicographical Order (MASS)
[Figure: the bib tree with Dewey labels: bib 1; books 1.1, 1.2, 1.3; children 1.1.1, 1.1.2 / 1.2.1 / 1.3.1, 1.3.2. After a price element is inserted, it receives label 1.2.1 and its sibling is relabelled 1.2.2.]

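The Dewey labels on this slide, and the relabelling caused by the insert, can be reproduced with a short sketch. It assumes the tree shape suggested by the example (three books, the second with only a title) and uses a hypothetical helper `dewey_labels`; it is not the UW implementation.

```python
import xml.etree.ElementTree as ET


def dewey_labels(root):
    """Return (label, tag) pairs in document order, with the root labelled '1'."""
    labels = []

    def visit(node, label):
        labels.append((label, node.tag))
        # A child's label is its parent's label plus its 1-based sibling position.
        for i, child in enumerate(node, start=1):
            visit(child, f"{label}.{i}")

    visit(root, "1")
    return labels


# Assumed shape of bib.xml: the second book has a title but no price.
doc = ET.fromstring(
    "<bib>"
    "<book><price/><title/></book>"
    "<book><title/></book>"
    "<book><price/><title/></book>"
    "</bib>"
)

before = dewey_labels(doc)
doc[1].insert(0, ET.Element("price"))  # new price becomes first child of book 2
after = dewey_labels(doc)

print(("1.2.1", "title") in before)  # True
print(("1.2.1", "price") in after)   # True
print(("1.2.2", "title") in after)   # True
```

The existing title changes its label from 1.2.1 to 1.2.2, so anything that referenced it by label would have to be updated.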
28 What Encoding For Node Identity?
Existing techniques for encoding order for XML:
- Global Order (UW)
- Local Order (UW)
- Dewey Order (UW)
- Lexicographical Order (MASS)  <- shown on this slide: the winner
[Figure: the bib tree with lexicographical keys: bib b; books b.b, b.d, b.f; children b.b.b, b.b.cd / b.d.f / b.f.cm, b.f.l. The inserted price element receives key b.d.b; no existing key changes.]

29 Lexicographical Keys: LexKeys
What are LexKeys?
- Multi-level lexicographical keys
- Examples: c, ba.c.b
Examples of comparison:
- b < b.c
- bab < bd.cc
- b.b < b.b.c
Advantages:
- All LexKeys form a totally ordered set with respect to <
- It is always possible to generate a key between two keys
- The deletion of a LexKey in a sequence does not affect other LexKeys
Usage:
- Reference to XML nodes
- Encoding order

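The comparison examples and the "key between two keys" property can be sketched as follows. The slides do not show how MASS actually generates keys, so `between` is an assumed midpoint scheme over lowercase letters, valid only for components that never end in 'a'; all function names are hypothetical.

```python
def levels(lexkey):
    """Split a LexKey such as 'ba.c.b' into its levels: ('ba', 'c', 'b')."""
    return tuple(lexkey.split("."))


def lex_less(a, b):
    """Level-by-level comparison; a proper prefix sorts before its extensions."""
    return levels(a) < levels(b)


def between(lo, hi):
    """Return a component strictly between lo and hi.

    Assumption of this sketch: lowercase components, lo < hi, and neither
    component ends in 'a' (otherwise no key may fit in between).
    """
    assert lo < hi and not lo.endswith("a") and not hi.endswith("a")
    if hi.startswith(lo):
        rest = hi[len(lo):]
        j = 0
        while rest[j] == "a":  # terminates because rest does not end in 'a'
            j += 1
        if rest[j] == "b":
            return lo + "a" * (j + 1) + "n"
        return lo + "a" * j + chr((ord("a") + ord(rest[j])) // 2)
    i = 0
    while lo[i] == hi[i]:  # first position where the components differ
        i += 1
    if ord(hi[i]) - ord(lo[i]) >= 2:
        return lo[:i] + chr((ord(lo[i]) + ord(hi[i])) // 2)
    return lo + "n"  # adjacent letters: extend the lower component


def sibling_between(left, right):
    """Generate a LexKey between two sibling LexKeys (same parent levels)."""
    *parent, l_last = levels(left)
    *parent_r, r_last = levels(right)
    assert parent == parent_r
    return ".".join(parent + [between(l_last, r_last)])


print(lex_less("b", "b.c"), lex_less("bab", "bd.cc"), lex_less("b.b", "b.b.c"))
print(sibling_between("b.b", "b.f"))  # b.d
```

No existing key is modified when a new one is generated, which is the property the later slides rely on.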
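30 LexKeys in XAT Tables
[Figure: the Navigate example ($b, price, output column $col2) shown twice. With LexKeys: input $b = b.b, b.d, b.f; output ($col2, $b) = (b.b.b, b.b), (b.f.cm, b.f). With the XML values the keys refer to: 65.95 ..., 39.95 ... in $col2, and the book fragments (65.95 Advanced ..., TCP/IP ..., ...) in $b.]
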
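31 Order Among XAT Tuples
- Notion: designate an order schema to XAT tables
- Ordering by the LexKeys in the columns of the order schema yields the correct tuple order
[Figure: a XAT table with columns $d, $c, $b and rows (c.m, b.b.b, b.f), (d.c, b.f.cm, b.b), (d.c.b, b.f.cm, b.b), annotated with its order schema and the resulting tuple order]
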
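A small sketch of ordering tuples by an order schema. The table contents reuse keys from the running example, but the row layout (a dict per tuple), the choice of `["$b"]` as order schema, and the function names are assumptions for illustration.

```python
def levels(lexkey):
    """Split a LexKey into comparable levels, e.g. 'b.f.cm' -> ('b', 'f', 'cm')."""
    return tuple(lexkey.split("."))


def order_tuples(rows, order_schema):
    """Sort tuples by the LexKeys found in the order-schema columns, in schema order."""
    return sorted(rows, key=lambda row: tuple(levels(row[col]) for col in order_schema))


# Tuples are stored in arbitrary physical order.
table = [
    {"$b": "b.f", "$col2": "b.f.cm"},
    {"$b": "b.b", "$col2": "b.b.b"},
]

# A new tuple is simply appended; no existing tuple is touched.
table.append({"$b": "b.d", "$col2": "b.d.b"})

ordered = order_tuples(table, ["$b"])
print([row["$b"] for row in ordered])  # ['b.b', 'b.d', 'b.f']
```

The logical order is derived from the keys on demand, so physical position carries no meaning.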
32 Calculating Order Schema
- Rules for each operator
- Calculated in a postorder traversal of the tree
Sample rules:
  Operator                                  | Order schema odc(out)
  Tagger T, pattern, $col' (s)              | odc(s)
  Source S, desc, $col'                     | none
  Navigate Unnest, $col, path, $col' (s)    | if col is last in odc(s): Concat(odc(s) - col, col'); else: Concat(odc(s), col')

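The three sample rules translate almost directly into code. In this sketch an order schema is a Python list of column names, "none" is modelled as the empty list, and the function names and the `$x`, `$y` columns are hypothetical.

```python
def odc_source():
    """Source: the output has no order schema ('none' in the rule table)."""
    return []


def odc_tagger(odc_in):
    """Tagger: the order schema of the input is passed through unchanged."""
    return list(odc_in)


def odc_navigate_unnest(odc_in, col, new_col):
    """Navigate Unnest from `col` producing `new_col`.

    If `col` is the last column of the input order schema it is replaced by
    `new_col`; otherwise `new_col` is appended.
    """
    if odc_in and odc_in[-1] == col:
        return odc_in[:-1] + [new_col]
    return list(odc_in) + [new_col]


# Postorder (bottom-up) evaluation along one path of an algebra tree.
odc = odc_source()                                # []
odc = odc_navigate_unnest(odc, "$S1", "$col1")    # ['$col1']
odc = odc_navigate_unnest(odc, "$col1", "$b")     # ['$b']
odc = odc_tagger(odc)                             # ['$b']
print(odc)

# When the navigated column is not last, the new column is appended.
print(odc_navigate_unnest(["$b"], "$x", "$y"))    # ['$b', '$y']
```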
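33 Order Among Tuples Example
[Figure: the Navigate example ($b, price, output column $col2) with LexKeys (input $b = b.b, b.d, b.f; output ($col2, $b) = (b.b.b, b.b), (b.f.cm, b.f)) and with the corresponding XML values, annotated with the order schema of each table and the resulting tuple order]
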
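34 Order in Collection within a cell?
[Figure: an Agg operator on $col5. With XML values: input tuples TCP/IP ... 55.48 and Data ... 39.95 are collected into one cell. With keys: the input table has columns $col5, $col4, $col2 and rows (tbb, b.f.l, b.f.cm), (tbc, b.d.b, b.d.f); the output cell holds the collection { tbb, tbc }.]
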
35 Smart Keys
What is a SmartKey?
- A SmartKey consists of a Key (LexKey) and an Overriding Order (LexKey)
- Key: the key part; by default it also represents order
- Overriding Order: optional; it represents the order only when present
Notation: key(order)
Examples:
- b.c.b(h)
- b.c.b

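One way to picture a SmartKey is as a small record whose sort position falls back to the key when no overriding order is given. This is an illustrative sketch, not the IVOX data structure: the class layout, method names, and the overriding-order values `b.f` and `b.d` are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


def levels(lexkey):
    """Split a LexKey into comparable levels."""
    return tuple(lexkey.split("."))


@dataclass(frozen=True)
class SmartKey:
    key: str                     # identifies the node; also the default order
    order: Optional[str] = None  # overriding order, used only when present

    def sort_key(self):
        """Order by the overriding order if present, otherwise by the key."""
        return levels(self.order if self.order is not None else self.key)

    def __str__(self):
        # Notation from the slide: key(order), or just key.
        return self.key if self.order is None else f"{self.key}({self.order})"


print(SmartKey("b.c.b", "h"))  # b.c.b(h)
print(SmartKey("b.c.b"))       # b.c.b

# Keys alone would sort tbb before tbc; the overriding orders reverse that.
cell = [SmartKey("tbb", "b.f"), SmartKey("tbc", "b.d")]
print([str(k) for k in sorted(cell, key=SmartKey.sort_key)])
```

Separating identity from position lets a constructed node keep a stable key while its place in a sequence is dictated by the source nodes it was built from.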
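36 SmartKeys in XAT Tables
[Figure: the Agg example on $col5. With XML values: TCP/IP ... 55.48 and Data ... 39.95 collected into one cell. With keys: the input table has columns $col5, $col4, $col2 and rows (tbb, b.f.l, b.f.cm), (tbc, b.d.b, b.d.f); the output cell holds the SmartKeys { tbb(b.f.cm..b.f.l), tbc(b.d.f..b.d.b) }.]
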
37 The Impact of SmartKeys on View Maintenance
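38 Order Among XAT Tuples during View Maintenance
- Not touching other tuples in the XAT table
- No reordering ever needed
- Gaining distributiveness with regard to bag union on the tuple level
[Figure: the Navigate example with LexKeys after the update. The new tuples (b.d in the input table; (b.d.b, b.d) in the output table) are stored after the existing ones, while the LexKeys still determine the logical tuple order.]
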
39 Order in a Sequence during View Maintenance
- Not touching other members of the sequence
- No reordering ever needed
- Gaining distributiveness with regard to bag union on the cell level
[Figure: the Agg example on $col5 with keys tb..b.f.l..b.f.cm and tb..b.d.f..b.d.b, in the input tuples and in the collected output cell]

40 Update Propagation Rules
- Use distributiveness with regard to bag union
- Reuse rules from relational view maintenance for most SQL XAT operators
[Figure: at execution time an operator maps XAT table 1 to XAT table 2; at view maintenance time the same operator maps an update to XAT table 1 into an update to XAT table 2]

41 Update Propagation Rules Example (Navigate Unnest on Insert Tuple)
$T2_{old} = \phi^{\$col'}_{\$col,path}(T1_{old})$
$T1_{new} = T1_{old} + \Delta T1$
$T2_{new} = \phi^{\$col'}_{\$col,path}(T1_{old} + \Delta T1) = \phi^{\$col'}_{\$col,path}(T1_{old}) + \phi^{\$col'}_{\$col,path}(\Delta T1) = T2_{old} + \Delta T2$
Here $\phi$ denotes Navigate Unnest and + represents bag union.
[Figure: at execution time the operator maps T1 to T2; at view maintenance time it maps $\Delta T1$ to $\Delta T2$]

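The rule can be checked on a toy table. In this sketch a XAT table is a list of dicts, bag union is list concatenation, and the `children` mapping stands in for navigation over the stored document; all of these, and the function names, are assumptions for illustration.

```python
def navigate_unnest(table, col, children, new_col):
    """Emit one output tuple per node reached from tuple[col]; no output if none."""
    out = []
    for row in table:
        for child in children.get(row[col], []):
            out.append({**row, new_col: child})
    return out


def as_bag(table):
    """Canonical form for comparing two tables as bags (order-insensitive)."""
    return sorted(tuple(sorted(row.items())) for row in table)


# Stand-in for navigating book -> price over the stored document.
children = {"b.b": ["b.b.b"], "b.d": ["b.d.b"], "b.f": ["b.f.cm"]}

t1_old = [{"$b": "b.b"}, {"$b": "b.f"}]
delta_t1 = [{"$b": "b.d"}]  # inserted tuple

t2_old = navigate_unnest(t1_old, "$b", children, "$col2")
delta_t2 = navigate_unnest(delta_t1, "$b", children, "$col2")

# Recomputing over T1_old + delta gives the same bag as T2_old + delta_T2.
t2_new = navigate_unnest(t1_old + delta_t1, "$b", children, "$col2")
assert as_bag(t2_new) == as_bag(t2_old + delta_t2)
print(delta_t2)  # [{'$b': 'b.d', '$col2': 'b.d.b'}]
```

Because tuple order is recovered from the LexKeys rather than from physical position, comparing the tables as bags is sufficient here.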
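42 Update Propagation Strategy
[Figure: an update XQuery against the XML source is translated into update primitives. Labels in the figure: XML Source, Update XQuery, Translator, xmlup, keyup, xatup, XAT, Storage Manager, XML View.]
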
43 Update Primitives (The Format of Delta)
XML Update Primitives (xup): apply to the original XML document
- Insert (xmlFragment, path)
- Delete (path)
- InsertAtt (name, value, path)
- DeleteAtt (name, path)
- Replace (oldValue, newValue, path)
XML Key Update Primitives (keyup): express the update on the original XML data in terms of LexKeys
- Insert (el, path)
- Delete (path)
- Replace (el, pos)
XAT Update Primitives (xatup): apply to a XAT table
- InsertTuple (tuple)
- DeleteTuple (tupleId)
- ChangeTuple (keyup, columnName, tupleId)

44 A Complete Example
45 A Complete Example: Execution
[Figure: execution of the view query over bib.xml.]
Algebra tree operators, as labelled in the figure: Source "bib.xml" ($S1); Navigate $S1, bib ($col1); Navigate $col1, book ($b); Navigate $b, price ($col2); Navigate $b, title ($col4); Select $col3 < 60; Tagger over $col4, $col2 ($col5); Agg $col5; Tagger over $col5 ($col6).
Storage Manager: bib.xml with LexKeys (bib b; books b.b, b.d, b.f; children b.b.b, b.b.cd / b.d.f / b.f.cm, b.f.l) and the constructed XDOMs (XDOM key tb..b.f.l..b.f.cm for a book element over b.f.l and b.f.cm; tr for the result element).
Intermediate XAT tables:
- $col1: b
- $b: b.b, b.d, b.f
- ($col2, $b): (b.b.b, b.b), (b.f.cm, b.f)
- ($col4, $col2): (b.b.cd, b.b.b), (b.f.l, b.f.cm); after the selection: (b.f.l, b.f.cm)
- $col5: tb..b.f.l..b.f.cm; after Agg: { tb..b.f.l..b.f.cm(b.f.l..b.f.cm) }
- $col6: tr

46 A Complete Example: View Maintenance
[Figure: the same algebra tree, Storage Manager, and XAT tables as on the previous slide, with the update propagated through the tree.]
Update primitives, in the order they appear in the figure:
- Insert (price, bib[1].book[2])
- Insert (price[b.d.b], bib[b].book[b.d])
- ChangeTuple(insert(price[b.d.b], bib[b].book[b.d]), $col1, b)
- ChangeTuple(insert(price[b.d.b], book[b.d]), $b, b.d)
- ChangeTuple(insert(price[b.d.b], bib[b].book[b.d]), $col2, b.f, b.f.m)
- InsertTuple({b.d, b.d.b})
- InsertTuple({b.d.b, b.d.f})
- InsertTuple({b.d.b, b.d.f})
- InsertTuple({tb..b.d.f..b.d.b})
- ChangeTuple(insert(tb..b.d.f..b.d.b, null), $col5, )
- ChangeTuple(insert(tb..b.d.f..b.d.b, result[tr]), $col6, tr)
State after the update:
- Storage Manager: new price node b.d.b; new constructed XDOM tb..b.d.f..b.d.b (a book element over b.d.f and b.d.b)
- $col5 before: { tb..b.f.l..b.f.cm(b.f.l..b.f.cm) }
- $col5 after: { tb..b.f.l..b.f.cm(b.f.l..b.f.cm), tb..b.d.f..b.d.b(..b.d.f..b.d.b) }

47 Outline: Motivation; Problem Description; Background on XAT (XML Algebra; Order in XML Algebra); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

48 System Architecture
[Figure: system architecture.]
- Processes: XML Query Engine, VM Initializer, XML View Maintainer, Update Primitive Generator, Executer, Storage Manager
- Data: XML sources, View Definition XQuery, XML Algebra Tree, User Update XQuery, XTUP, Update Propagation Rules Repository, Materialized Auxiliary Views, Materialized XML View
- Other labels: Rainbow
- Legend: process; data; persistent data storage; one-time occurrence (execution); on-update occurrence (view maintenance)

49 Outline: Motivation; Problem Description; Background on XAT (XML Algebra; Order in XML Algebra); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

50 Related Work
- A. Gupta, I. S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Applications. In Bulletin of the Technical Committee on Data Engineering, 1995.
- T. Griffin, L. Libkin. Incremental Maintenance of Views with Duplicates. In SIGMOD 1995.
- H. Liefke, S. Davidson. View Maintenance for Hierarchical Semistructured Data. In DaWaK 2000.
- S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB 1998.

51 Outline: Motivation; Problem Description; Background on XAT (XML Algebra; Order in XML Algebra); The IVOX Approach (Order Encoding; Overall strategy); System Architecture; Related Work; Future Work

52 Future Work
Near future ...
- Launch the system
- Batch update coming
- Experiments and evaluation
  - Compare the system's performance to recomputation
... and beyond
- Batching updates coming from different sources
- Integrity constraints
- Algebra tree rewrite rules