Published by Raymond Hudson. Modified over 9 years ago.
1
Schema-Based Query Optimization for XQuery over XML Streams
Hong Su, Elke A. Rundensteiner, Murali Mani
Worcester Polytechnic Institute, Massachusetts, USA
VLDB 2005
2
Schema-Based Query Optimization (SQO)
Schema knowledge can be utilized to optimize queries
Well studied in deductive/relational databases: join elimination, predicate elimination, detection of empty answer set, …
Equally applicable to XML for flat value filtering
3
SQO for XML Pattern Retrieval
General XML SQO: applicable to both static and streaming XML. E.g.: query tree minimization [Amer-Yahia+02]
Static-XML-specific SQO: focuses on expediting random access of data. E.g.: query rewrite using “extents” (indices built on element types) [Fernandez+98], …
Stream-specific XML SQO: focuses on expediting token-by-token sequential access of data
4
Stream Specific SQO Example
Query: /seller[shipTo]
Without schema: buffer seller element; retrieve /shipTo
With schema: buffer seller element; retrieve /shipTo; retrieve /sameAddr; when /sameAddr is retrieved, skip the remaining computation
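The example above can be sketched in a few lines of Python. This is only an illustration, not the Raindrop implementation: it assumes a hypothetical token stream of `(kind, name)` pairs and assumes the schema guarantees that `shipTo`, if present, precedes `sameAddr` inside a `seller`, so that a `<sameAddr>` start tag seen before any `shipTo` means the filter has failed. The function name `filter_sellers` and the `work` counter are made up for the sketch.

```python
def filter_sellers(tokens, ending_mark=None):
    """Evaluate /seller[shipTo]; return (matching sellers, tokens buffered)."""
    results = []
    buffer = None        # tokens of the current seller, None outside a seller
    has_ship_to = False
    skipping = False     # True once the schema tells us shipTo cannot appear
    work = 0             # tokens buffered and inspected
    for kind, name in tokens:
        if kind == "start" and name == "seller":
            buffer, has_ship_to, skipping = [], False, False
        elif kind == "end" and name == "seller":
            if has_ship_to:
                results.append(buffer + [(kind, name)])
            buffer = None
            continue
        if buffer is None or skipping:
            continue
        if kind == "start" and name == ending_mark and not has_ship_to:
            # Ending mark reached without shipTo: drop buffer, skip the rest.
            skipping, buffer = True, []
            continue
        work += 1
        buffer.append((kind, name))
        if kind == "start" and name == "shipTo":
            has_ship_to = True
    return results, work


def element(name, *children):
    """Helper: flatten an element with child token lists into one token list."""
    inner = [tok for child in children for tok in child]
    return [("start", name)] + inner + [("end", name)]


stream = (element("seller", element("sameAddr"), element("primary"))
          + element("seller", element("shipTo"), element("sameAddr")))
plain_results, plain_work = filter_sellers(stream)
sqo_results, sqo_work = filter_sellers(stream, ending_mark="sameAddr")
```

Both runs return the same single matching seller; the run with the ending mark simply buffers fewer tokens for the seller that has no shipTo.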
5
Related Work
YFilter [Diao02] and XSM [Ludäscher 03]: use schema to decide whether pattern results are recursive or types of child elements; essentially propose general XML SQO
FluXQuery [Koch+04]: uses schema to minimize buffer size; complementary to our focus (we aim to skip unnecessary computations)
SIX [Gupta+03]: uses indices interleaved with XML data to reduce parsing; could be combined with our techniques
6
Challenge: Constraint Useful?
/seller/shipTo: retrieve /shipTo; retrieve /sameAddr; when /sameAddr is retrieved, nothing to save: /shipTo is the only pattern retrieval
/seller[shipTo]/billTo: retrieve /shipTo; retrieve /sameAddr; retrieve /billTo; when /sameAddr is retrieved, nothing to save: /billTo has already been retrieved
7
Challenge: Benefits/Overhead?
Maximal benefits: no beneficial optimization should be missed. Any failed pattern should be detected as early as possible
Minimal overhead: no redundant optimization should be introduced. Whether a particular pattern fails should not be checked repeatedly
8
Challenge: Plan Execution
Optimization is at a lower level than query rewrite; specific physical implementations are needed
Example: /seller[shipTo]: buffer seller element; retrieve /shipTo; retrieve /sameAddr; when /sameAddr is retrieved, skip. No query can capture this optimization
9
Outline: SQO Technique Design; SQO Application; Execution of Optimized Plan; Experiments
10
Physical Implementation of Pattern Retrieval
Note: it is important to understand the physical stream engine implementation when designing effective SQO
Our implementation: the widely used automata implementation [e.g., Tukwila, YFilter]
11
Example Query and its Automata
Query:
for $a in /auctions/auction, $b in $a/seller[shipTo]
where $b/*/phone=“508-123-4567”
return for $c in $a/item where $c//keyword=“auto” return $b/*/phone
[Figure: automaton for the query, with transitions labeled auctions, auction, seller, shipTo, primary, secondary, phone, * and λ; snapshots of the runtime stack of active state sets (e.g., [0], [1], [2,3], [11], [12#]) as input tokens arrive. #: buffering flag]
12
Example Query and its Automata
[Figure: the query automaton with transitions labeled auctions, auction, seller, shipTo, primary, secondary, phone, * and λ, and snapshots of the runtime stack of active state sets (e.g., [0], [1], [2,3], [11], [12#]). #: buffering flag]
Opt. opportunities:
1. avoid transitions as much as possible
2. revoke buffering flag as soon as possible
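To make the stack snapshots concrete, here is a minimal sketch of stack-based automaton execution over a token stream. It is an assumption-laden illustration rather than the engine shown on the slide: the state numbers, the `TRANSITIONS` table, and the single path /auctions/auction/seller/shipTo are invented for the example, and buffering flags, `*` and λ transitions are left out. Each start tag pushes the set of states reached from the top of the stack; each end tag pops it.

```python
# Hypothetical automaton for /auctions/auction/seller/shipTo.
TRANSITIONS = {
    (0, "auctions"): {1},
    (1, "auction"): {2},
    (2, "seller"): {3},
    (3, "shipTo"): {4},
}
FINAL = {4}


def run(tokens):
    """Return (depths at which a final state was reached, final stack)."""
    stack = [{0}]                 # bottom entry is the start state
    matches = []
    for kind, name in tokens:
        if kind == "start":
            active = set()
            for state in stack[-1]:
                active |= TRANSITIONS.get((state, name), set())
            stack.append(active)  # may be empty: nothing to do below here
            if active & FINAL:
                matches.append(len(stack) - 1)
        else:
            stack.pop()           # end tag: return to the parent's states
    return matches, stack


tokens = [
    ("start", "auctions"), ("start", "auction"),
    ("start", "seller"), ("start", "shipTo"), ("end", "shipTo"),
    ("end", "seller"),
    ("start", "item"), ("end", "item"),
    ("end", "auction"), ("end", "auctions"),
]
matches, stack = run(tokens)
```

Every push in this loop corresponds to a transition computation, which is why the first optimization opportunity on the slide is to avoid transitions wherever the schema shows they cannot contribute to a result.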
13
Is Constraint Useful for Opt.?
Constraints are used to find “ending marks” of a pattern within a context element
E.g.: <sameAddr> is the ending mark of /shipTo within the seller element context
14
Is Constraint Useful for Opt.? Ending mark helpful if Context element can be filtered out earlier:
15
Is Constraint Useful for Opt.?
Ending mark helpful if context element can be filtered out earlier: pattern may fail to appear
Query: for $a in /auctions/auction, $b in $a/seller …
[Figure: the query combined with two alternative schema fragments; under one the ending mark for $a/seller is not helpful, under the other it is helpful]
16
Is Constraint Useful for Opt.?
Ending mark helpful if context element can be filtered out earlier: pattern may fail to appear; pattern is required
Query: for $a in /auctions/auction, $b in $a/seller …
[Figure: the query combined with two alternative schema fragments; under one the ending mark for $a/seller is not helpful, under the other it is helpful]
17
Is Constraint Useful for Opt.?
Ending mark helpful if context element can be filtered out earlier: pattern may fail to appear; pattern is required
Schema: <!element item (category?, desc, …)>
Query: for $c in $a/item return $a/category. Ending mark for $a/category is not helpful
Query: for $c in $a/item[category] return $a/category. Ending mark for $a/category is helpful
18
Is Constraint Useful for Opt.?
Ending mark helpful if
Context element can be filtered out earlier: pattern may fail to appear; pattern is required
and
The early filtering can be beneficial: transitions may happen after ending marks; buffering flags may be raised before ending marks
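The two criteria can be read as a simple predicate. The sketch below is one interpretation, not the authors' formulation: the `PatternInfo` fields are hypothetical names, and combining the two benefit conditions with "or" is an assumption (the slide lists them without a connective).

```python
from dataclasses import dataclass


@dataclass
class PatternInfo:
    may_fail: bool                 # schema allows the pattern to be absent
    required: bool                 # query needs the pattern (e.g. a filter)
    transitions_after_mark: bool   # automaton work remains after the mark
    buffering_before_mark: bool    # a buffering flag is raised before the mark


def ending_mark_is_helpful(p: PatternInfo) -> bool:
    can_filter_early = p.may_fail and p.required
    filtering_pays_off = p.transitions_after_mark or p.buffering_before_mark
    return can_filter_early and filtering_pays_off


# /seller[shipTo] with an optional shipTo and a buffered seller element.
filtered = PatternInfo(True, True, True, True)
# A pattern that is optional in the schema but not required by the query.
unfiltered = PatternInfo(True, False, True, True)
```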
19
SQO Design
Helpful ending marks are identified by our SQO
Three SQO rules are designed, using: occurrence constraints; exclusive constraints; order constraints
20
Example SQO Rule
Uses an occurrence constraint; the rule outputs an event-condition-action
Query: for $a in /auctions/auction, $b in $a/seller where $b/*/phone = “508-1234567” …
Event: second is encountered in a seller
Condition: $b/*/phone = “508-1234567” not satisfied yet
Action: skip the remaining computations within the current seller element
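An event-condition-action (ECA) triple of this kind can be modeled as a small record plus a dispatcher. The following is a sketch under assumptions: the `ECARule` class, the `fire` helper, and the context dictionary are hypothetical, and the event token `("end", "secondary")` is a stand-in chosen for illustration because the element name in the slide's event did not survive.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ECARule:
    event: tuple                        # (token kind, element name)
    condition: Callable[[dict], bool]   # checked against the current context
    action: Callable[[dict], None]      # applied when the condition holds


def fire(rules, token, ctx):
    """Run every rule whose event matches the token; return how many fired."""
    fired = 0
    for rule in rules:
        if rule.event == token and rule.condition(ctx):
            rule.action(ctx)
            fired += 1
    return fired


def skip_rest_of_seller(ctx):
    ctx["skip"] = True


# Hypothetical rule: once the event is seen and no matching phone was found,
# skip the remaining computations within the current seller element.
rule = ECARule(
    event=("end", "secondary"),
    condition=lambda ctx: not ctx["phone_matched"],
    action=skip_rest_of_seller,
)

ctx = {"phone_matched": False, "skip": False}
fired_on_other = fire([rule], ("start", "primary"), ctx)
fired_on_event = fire([rule], ("end", "secondary"), ctx)
```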
21
Outline: SQO Technique Design; SQO Application; Execution of Optimized Plan; Experiments
22
Properties of SQO Application Maximal benefits Minimal overhead
23
Maximal Benefit
Definition of “rule independence”
Proof of “maximal benefits” given: if the rules are all independent, then as long as each rule is applied on each pattern, maximal benefits are ensured
24
Minimal Overhead: Redundancy
Same pattern redundancy: multiple ending marks adopted for the same pattern
Query: for $a in /auctions/auction, $b in $a/seller[shipTo] …
[Figure: query, schema, and constraints; one ending mark for $b/shipTo is guaranteed to capture failure of /shipTo, so a second ending mark for $b/shipTo is redundant]
25
Minimal Overhead: Redundancy?
Parent-child pattern redundancy: ending marks of child patterns filter the parent pattern early
Query: for $a in /auctions/auction, $b in $a/seller[shipTo] …
[Figure: query and constraints; the ending mark for $b/shipTo (optional) can be used to capture failure of $a/seller[shipTo], so the ending mark for $a/seller (required) is redundant]
26
SQO Application Algorithm
Input: XQuery represented as a tree; XML Schema represented as a graph
Processing: query tree traversed top-down; “maximal benefits” ensured; rules applied to each tree node by local/regional appliers; same pattern redundancy excluded by the local applier; parent-child pattern redundancy excluded by the regional applier
Output: event-condition-actions attached to tree nodes
27
Outline: SQO Technique Design Guideline; SQO Application; Execution of Optimized Plan; Experiments
28
Encoding ECAs in Automata
E: push-in or pop-out of a state
C: pattern result buffer checked
A: actions include:
suspend computations by removing automata transitions
clean up results generated within the current context element
prepare for recovering computation for the next context element (e.g., back up transitions)
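The suspend and recover actions can be sketched as removing entries from a transition table while keeping a backup. This is a simplified illustration, not the engine's data structure: the `Automaton` class, its method names, and the state labels are assumptions made for the example.

```python
class Automaton:
    """Transition table keyed by (state, element name)."""

    def __init__(self, transitions):
        self.transitions = dict(transitions)
        self._backup = {}

    def suspend(self, states):
        """Cut all transitions leaving the given states, keeping a backup."""
        for key in [k for k in self.transitions if k[0] in states]:
            self._backup[key] = self.transitions.pop(key)

    def restore(self):
        """Recover the cut transitions for the next context element."""
        self.transitions.update(self._backup)
        self._backup.clear()

    def step(self, state, name):
        """Return the target state, or None if there is no transition."""
        return self.transitions.get((state, name))


automaton = Automaton({
    ("q1", "auction"): "q2",
    ("q2", "seller"): "q9",
    ("q2", "item"): "q5",
    ("q9", "shipTo"): "q10",
})

automaton.suspend({"q2", "q9"})      # action fired inside the current auction
during = automaton.step("q2", "item")
outer = automaton.step("q1", "auction")
automaton.restore()                  # current context element has ended
after = automaton.step("q2", "item")
```

While the transitions are cut, tokens inside the failed context element trigger no automaton work; the transition into the context element itself is untouched, so the next context element is processed normally after `restore`.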
29
Example: ECAs in Automata
Query: for $a in /auctions/auction, $b in $a/seller[shipTo] where $b/*/phone=“508-123-4567” return for $c in $a/item …
Event: 1st <sameAddr> encountered
Condition: none
Action: cut all transitions from
1. q2
2. states reachable via λ: q3
3. states between q2 and q13: q9 …
[Figure: automaton with states 0, 1, 2, 3, 5, 9, 10, 11, 12, 13 and transitions labeled auctions, auction, seller, shipTo, sameAddr, item, primary, secondary, phone; the ECA is encoded as (1, startTag, none, state 2) … (…, state 3)]
30
Outline: SQO Technique Design Guideline; SQO Application; Execution of Optimized Plan; Experiments
31
Optimization Affected by?
How often a pattern fails (pattern selectivity)
How much gain each early filtering brings (unit gain)
32
Necessity of Design Guideline
[Figure: chart with x-axis “Selectivity of Pattern with the Only Useful Ending Mark”, comparing three plans: plan without SQO; plan with SQO (1 ending mark); plan with SQO but no guideline considered (30 ending marks)]
33
Conclusion
First SQO on streaming XML
Support SQO on nested XQuery with “*” or “//”
Offer criteria for “useful” constraints
Ensure maximal benefits and minimal overhead in SQO application
Provide an execution strategy in the widely used automata-based model
Implement the SQO optimizer in the Raindrop system (VLDB’04 demo)
Experimentally demonstrate that SQO brings significant improvement with little overhead
34
Visit the website of RAINDROP, our XQuery engine over XML streams project: http://davis.wpi.edu/dsrg/raindrop/
Supported by the USA National Science Foundation and an IBM PhD Fellowship