Published by Lesley Wiggins. Modified over 9 years ago.
1 Self-Maintenance of Materialized XML Views with Non-Cooperative Data Sources
DBDBD 2006. Virginie Sans, ETIS/CNRS Laboratory, MIDI Team
2 Summary
1) Issue and context
   1.1 Pre-requisite
   1.2 The issue
   1.3 Context
   1.4 State of the art
2) Contributions
   2.1 View computation with the XAlgebra
   2.2 Detection and identification of source updates
   2.3 View maintenance
   2.4 Applications and performances
Conclusion
1.1 Pre-requisite
3 Mediation architecture
Introduced by Wiederhold.
The architecture: a mediator, wrappers, sources, and a query language.
4 Mediation architecture
Mediator:
- Handles the user request: canonization, atomization
- Sends each atomic request to a source via its wrapper
Wrappers:
- Translate a query coming from the mediator into a query in the native language of the web source
- Return an answer to the mediator in XML
Data sources:
- Heterogeneous
- Distributed
- In a web context: partially unavailable
Figure: source, wrapper and mediator, with the labels "XML", "atomic request", "SQL" and "tuples".
5 Views
What are views used for?
- Data integration
- Access control, security
- Data warehouses
Why?
- Interoperability
- Heterogeneous data
Materializing views:
- Fast access to the results of complex queries
- Better availability
- Query optimization
Figure: RDB, SQL and HTML sources, wrappers, mediator, materialized views.
1.2 Issue
6 Issue: view maintenance
When the data sources are updated, the view must be kept consistent with them.
Maintenance process:
- Recomputation: recompute the whole view from scratch
- Incremental maintenance: compute the changes to the view in response to changes to the base sources
Figure: Source t gives View t by view computation; an update turns Source t into Source t+1; View t+1 is obtained either by recomputation from Source t+1 or by incremental maintenance of View t.
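The difference between the two strategies can be illustrated with a minimal Python sketch. It assumes a view defined by a simple selection over a list of values; the function names and the sample prices are hypothetical and do not come from the presentation.

```python
def compute_view(source, predicate):
    """Recomputation: rebuild the whole view from the full source."""
    return [row for row in source if predicate(row)]


def maintain_incrementally(view, inserted, deleted, predicate):
    """Incremental maintenance: apply only the source changes to the view."""
    kept = [row for row in view if row not in deleted]
    # Assumption: new rows are simply appended; ordering is ignored here.
    return kept + [row for row in inserted if predicate(row)]


def cheap(price):
    """Hypothetical view predicate: keep prices below 60."""
    return price < 60


source_t = [39.95, 65.95, 55.0]
view_t = compute_view(source_t, cheap)

# Between t and t+1 the source loses 55.0 and gains 42.5 and 70.0.
source_t1 = [39.95, 65.95, 42.5, 70.0]
view_t1 = maintain_incrementally(view_t, inserted=[42.5, 70.0],
                                 deleted=[55.0], predicate=cheap)
```

Both paths should end with the same view; the incremental one never reads the full source again, which is the point when sources are costly or unavailable.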
1.3 Context
7 Context: semi-structured XML data
- XML views are materialized at the mediator level
- Hierarchical data
- No schema, except the schema implied by the query
Example (XML markup lost in extraction, element text only): 65.95, Advanced Programming in the Unix environment, TCP/IP Illustrated, 65.95, Advanced Programming in the Unix environment, 39.95, Data on the Web, Données sur le Web
8 Context: XQuery
XQuery:
- Dedicated to XML data
- Relational operators (projection, selection, join, union, ...)
- XML operators (tagging, unnesting, aggregation, ...)
- FLWOR syntax (pronounced "flower")
Example:
    for $b in doc("bib.xml")/bib/book
    let $a := $b/author
    where $b/price/text() < 60
    order by $b/year
    return $b/title
FLWOR syntax:
    for $var in forest [, $var in forest]*
    let $var := subtree
    where condition
    return result
9 Context: other specificities
- Views are computed using the XAlgebra (cf. View computation)
- Wrappers have limited resources: few computation possibilities
- A component named the logger stores the last modification date and a checksum of each source
- Non-cooperative web sources:
  - No information about their updates
  - Not always available
  - Not enough granularity
1.4 State of the art
10 State of the art (1/2)
- Relational views: not suited to semi-structured data
- Abiteboul et al.: OEM (Object Exchange Model) and the LOREL language; some operators are missing
- VOX (Rainbow team): needs to know the exact position in the XML tree where the update took place
11 State of the art (2/2)
- Cobena et al.: XDiff, an algorithm for comparing XML files; needs a copy of the source at the wrapper level
- Bonnet et al. / Papadimos et al.: parachute queries, mutant query plans; but what happens when sources are really unavailable?
Our goal:
- Reduce source accesses to a minimum
- Use the information that is stored in the view
2.1 View computation
12 View maintenance: the process
- View computation: an algebraic approach using the XAlgebra, extended with identifiers
- Update detection: comparison of the source information with the information stored in the logger
- Update identification: recovery process, then Diff algorithm
- View maintenance: propagation rules for each operator
13 View computation
Steps: (figure not preserved)
14 The XAlgebra data model
- Data structures: XRelation, XTuple, XAttributes
- Operators: XSource, XConstruct, XUnion, ...
15 XSource operator: step 1
XQuery analysis. We obtain:
- A context
- A set of patterns
Example (element constructors in the return clause lost in extraction):
    for $f in doc("informations.xml")/personnes/personne
    let $a := $f/nom
    where $f/age < 27 and $a = "Durand"
    return {$a} {$f/prenom}
Path extraction: optional, mandatory, hidden.
16 XSource operator: steps 2 and 3
From XML sub-trees to the tabular structure:
- 1 sub-tree => 1 XTuple
- XRelation = set of XTuples
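The mapping from sub-trees to a tabular structure can be sketched in Python with the standard `xml.etree.ElementTree` module. This is only an illustration: the `xsource` function, the sample document and the choice of plain tuples for XTuples are assumptions, not the actual XAlgebra implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document, shaped like the "informations.xml" example.
SOURCE = """<personnes>
  <personne><nom>Durand</nom><prenom>Eric</prenom><age>25</age></personne>
  <personne><nom>Leclerc</nom><prenom>John</prenom><age>31</age></personne>
</personnes>"""


def xsource(xml_text, context, patterns):
    """Turn each sub-tree matched by `context` into one tuple of values."""
    root = ET.fromstring(xml_text)
    relation = []
    for subtree in root.findall(context):
        # One sub-tree gives one tuple; a missing pattern yields None.
        relation.append(tuple(subtree.findtext(p) for p in patterns))
    return relation


relation = xsource(SOURCE, context="personne", patterns=["nom", "prenom"])
```

Here the context plays the role of the `for` path and the patterns play the role of the paths extracted from the query.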
17 XSource operator: extending the algebra
Adding identifiers: XTids.
An XTid is a set of pairs: {(idsource, idfragment), ...}
18 View computation: the XProject operator
19 View computation: the XJoin operator
XTid propagation: card(XTID) > 1 for some nodes
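One way to picture XTid propagation through a join is the Python sketch below. It assumes that an XValue is a `(value, XTid)` pair, that an XTid is a frozenset of `(idsource, idfragment)` pairs, and that the joined value carries the union of both XTids. The helper names and the sample data are hypothetical.

```python
def xvalue(value, *pairs):
    """Build an XValue as (value, XTid), the XTid being a frozenset of pairs."""
    return (value, frozenset(pairs))


def xjoin(left, right, key):
    """Nested-loop join on `key`; the joined value carries both XTids."""
    result = []
    for lt in left:
        for rt in right:
            if lt[key][0] != rt[key][0]:
                continue
            merged = {**lt, **rt}  # on a name clash the right value wins
            # Assumption: the join value keeps the union of the two XTids.
            merged[key] = (lt[key][0], lt[key][1] | rt[key][1])
            result.append(merged)
    return result


xr1 = [{"nom": xvalue("Durand", (1, 3)), "prenom": xvalue("Eric", (1, 3))}]
xr2 = [{"nom": xvalue("Durand", (2, 7)), "ville": xvalue("Paris", (2, 7))}]
joined = xjoin(xr1, xr2, "nom")
```

After the join, the `nom` value is identified by two pairs, while `prenom` and `ville` still carry a single pair each.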
2.2 Update detection and identification
20 Update detection and identification
Detection: comparison of the source information with the information stored in the logger:
- The last modification date
- The checksum of the source
Identification:
- Partial recovery of the source information, based on XTids
- Comparison of the recovered XRelation with the updated source
- Computation of the delta (Δ)
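The detection step can be sketched as follows in Python. The slides do not say which checksum is used or how the two tests are combined, so SHA-256, the `LoggerEntry` class and the "date first, checksum second" order are all assumptions made for this illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class LoggerEntry:
    """What the logger remembers about one source (hypothetical layout)."""
    last_modified: float
    checksum: str


def checksum(content: bytes) -> str:
    # Assumption: SHA-256; the presentation does not name the algorithm.
    return hashlib.sha256(content).hexdigest()


def source_changed(entry: LoggerEntry, last_modified: float, content: bytes) -> bool:
    """Cheap date test first, then the checksum to confirm a real change."""
    if last_modified == entry.last_modified:
        return False
    return checksum(content) != entry.checksum


entry = LoggerEntry(last_modified=100.0, checksum=checksum(b"<bib/>"))
untouched = source_changed(entry, 100.0, b"<bib/>")
touched_only = source_changed(entry, 200.0, b"<bib/>")
updated = source_changed(entry, 200.0, b"<bib><book/></bib>")
```

Only when this test reports a change does the more expensive identification step (recovery and comparison) need to run.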
21 XRecover, step 1: project XR_v on the patterns of XR_1
22 XRecover, step 2: filtering the XTuple values
23 XRecover, step 3: re-ordering XTuples (XTidUnnest)
XTuples are unnested depending on their XTids.
24 XRecover, step 3: re-ordering XTuples (XTidNest)
XTuples are nested by their XTids, then re-ordered.
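The slides only name the XTidUnnest and XTidNest operators, so the Python sketch below is one possible reading of them, not their definition. It assumes that view values are `(attribute, value, XTid)` triples, that unnesting emits one entry per `(idsource, idfragment)` pair of the requested source, and that nesting groups entries by pair and sorts them by fragment identifier.

```python
from collections import defaultdict


def xtid_unnest(xvalues, source_id):
    """Emit one (pair, attribute, value) entry per XTid pair of the source."""
    flat = []
    for attr, value, xtid in xvalues:
        for sid, fid in sorted(xtid):
            if sid == source_id:
                flat.append(((sid, fid), attr, value))
    return flat


def xtid_nest(flat):
    """Group entries by pair and order the groups by fragment identifier."""
    groups = defaultdict(dict)
    for pair, attr, value in flat:
        groups[pair][attr] = value
    return [(pair, groups[pair]) for pair in sorted(groups)]


# Hypothetical values found in a view built from sources 1 and 2.
view_values = [
    ("prenom", "Eric", frozenset({(1, 11)})),
    ("nom", "Durand", frozenset({(1, 11), (1, 3)})),
    ("prenom", "John", frozenset({(1, 3)})),
    ("ville", "Paris", frozenset({(2, 7)})),
]
recovered = xtid_nest(xtid_unnest(view_values, source_id=1))
```

The result is a partial reconstruction of source 1, in fragment order, built only from what the view stores.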
25 Update identification: comparison algorithm
Comparison of XR_1(t+1) with XR(t'):
- XR_1(t+1) is the XRelation obtained by applying XSource to source 1 at time t+1
- XR(t') is the partial recovery of the XRelation of source 1 at time t
Remark: XR_1(t+1) can also be filtered using predicates before the comparison.
The Diff algorithm is based on Unix diff (Hunt & McIlroy); the unit of comparison is the XTuple instead of the line.
26 Update identification: Diff algorithm
Delta with hunks:
- Insert(pos; XTuple)
- Delete(pos; XTuple)
- Replace(pos; XTuple_old, XTuple_new)
Example:
    Insert(2, { {Leclerc, Avide, {(1,3)}}, {John, Avide, {(1,3)}} })
    Delete(4, { {Durand, Avide, {(1,11)}}, {Marcel, Avide, {(1,11)}}, {Eric, Avide, {(1,11)}} })
    etc.
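A tuple-level delta of this shape can be approximated in Python with the standard `difflib.SequenceMatcher`. Note that `difflib` does not implement the Hunt and McIlroy algorithm; it stands in here only to show a diff whose unit is the XTuple rather than the line. The hunk layout and the sample relations are assumptions.

```python
from difflib import SequenceMatcher


def xtuple_delta(old, new):
    """Compare two lists of hashable XTuples and return a list of hunks."""
    hunks = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            # Position is the index in `old` where the tuples are inserted.
            hunks += [("insert", i1, xt) for xt in new[j1:j2]]
        elif tag == "delete":
            hunks += [("delete", i, old[i]) for i in range(i1, i2)]
        elif tag == "replace":
            hunks.append(("replace", i1, tuple(old[i1:i2]), tuple(new[j1:j2])))
    return hunks


# Hypothetical recovered relation at t and source relation at t+1.
recovered_t = [("Durand", "Eric"), ("Leclerc", "John"), ("Martin", "Anne")]
source_t1 = [("Durand", "Eric"), ("Martin", "Anne"), ("Petit", "Luc")]
delta = xtuple_delta(recovered_t, source_t1)
```

Each hunk can then be handed to the maintenance rules for deletions, insertions and modifications.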
2.3 View maintenance
27 Maintenance rules: from delta to view maintenance
Case of a deletion: delete(pos, xtuple)
- The deleted XTuple is associated with an XTid {(x)} such that card = 1.
- Each XValue of the view carries XTids, noted XTID.
1) From each XValue, we delete the pair x from its XTid whenever x ∈ XTID.
   Example: the XTuple whose XTid is x = (1,3) has been deleted. The XValue {Alain}(1,3);(1,4) becomes the XValue {Alain}(1,4).
2) We delete each XValue such that card(XTID) = 0.
   Example: if the XValue {Alain}(1,3) becomes the XValue {Alain}, we delete the XValue entirely.
3) If the XValue was concerned by the predicate (join and restriction cases), we delete the XTuple.
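The three deletion steps can be sketched in Python as below. The representation (a view as a list of dicts mapping attribute names to `(value, XTid)` pairs) and the `predicate_attrs` parameter, which lists the attributes used by a join or restriction predicate, are assumptions made for this illustration.

```python
def apply_delete(view, x, predicate_attrs=frozenset()):
    """Propagate the deletion of the source fragment identified by pair x."""
    maintained = []
    for xtuple in view:
        kept = {}
        drop_tuple = False
        for attr, (value, xtid) in xtuple.items():
            remaining = xtid - {x}          # step 1: remove x from the XTid
            if remaining:
                kept[attr] = (value, remaining)
            elif attr in predicate_attrs:   # step 3: predicate value vanished
                drop_tuple = True
            # step 2: a value whose XTid became empty is simply not kept
        if kept and not drop_tuple:
            maintained.append(kept)
    return maintained


view = [
    {"prenom": ("Alain", frozenset({(1, 3), (1, 4)}))},
    {"prenom": ("Marc", frozenset({(1, 3)})), "ville": ("Paris", frozenset({(2, 1)}))},
]
after_projection = apply_delete(view, (1, 3))
after_selection = apply_delete(view, (1, 3), predicate_attrs={"prenom"})
```

In the first call the value Alain survives with the single pair (1,4), as in the example on the slide; in the second call the tuple whose predicate value disappeared is removed as a whole.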
28 Maintenance rules: from delta to view maintenance
Case of an insertion: insert(pos; xtuple)
1) A new XTid is created. Goal: preserve the order of the XTuples for a later recovery.
2) Depending on the operator, we obtain different maintenance instructions:
- Projection: insert the projection of the xtuple
- Select: if the xtuple satisfies the predicate, insert it
- Join XR_1 * XR_2: compute XT = xtuple * XR_2; if XT is not empty, insert XT
- Union and Intersect: duplicates are preserved. Union is handled like a Select whose predicate is always true; Intersect is handled like a Join.
Depending on the predicate, we can query either XR_2 or its recovery.
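The per-operator insertion rules can be sketched in Python as follows. To keep the example short, XTuples are plain dicts without XTids and new tuples are appended at the end of the view; both simplifications, as well as the function names, are assumptions.

```python
def insert_project(view, xtuple, attrs):
    """Projection: insert the projection of the new xtuple."""
    view.append({a: xtuple[a] for a in attrs if a in xtuple})


def insert_select(view, xtuple, predicate):
    """Select: insert the xtuple only if it satisfies the predicate."""
    if predicate(xtuple):
        view.append(xtuple)


def insert_join(view, xtuple, xr2, key):
    """Join: compute XT = xtuple * XR2 and insert it when it is not empty."""
    xt = [{**xtuple, **other} for other in xr2 if other[key] == xtuple[key]]
    view.extend(xt)
    return xt


new_tuple = {"nom": "Durand", "age": 25}       # hypothetical inserted xtuple
projected, selected, joined = [], [], []
insert_project(projected, new_tuple, ["nom"])
insert_select(selected, new_tuple, lambda t: t["age"] < 27)
xr2 = [{"nom": "Durand", "ville": "Paris"}, {"nom": "Petit", "ville": "Lyon"}]
xt = insert_join(joined, new_tuple, xr2, "nom")
```

In `insert_join`, the `xr2` argument may be the actual second relation or its recovery from the view, which is what lets the join be maintained without contacting the source.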
29 Maintenance rules: from delta to view maintenance
Case of a modification: Replace(pos; XTuple_old, XTuple_new)
XTuple modification = XValue modification, OR XValue deletion followed by an insertion.
- Project and Union: modification of the concerned XValues
- Select and Intersect: if the modification applies to an XValue that must verify the condition, deletion of the XTuple; otherwise, modification of the XValues (Intersect is handled like Select)
- Join: deletion followed by insertion
30 Maintenance rules: from delta to view maintenance (figure not preserved)
31 Maintenance rules: missing information
Missing information (for a join, for example) can be obtained by:
- Source recovery
- Multi-view strategy
- Source request
Goal: limit access to the sources!
Example: View = S_1 * S_2
- An xtuple x is inserted in S_1
- Computation of S_2'
- Insertion: x * S_2'
Figure: SQL and HTML sources, wrapper, mediator, materialized views.
2.4 Applications and performances
32 Applications
- On the web: when the necessary sources are unavailable. Goal: limit access to them.
- With sensors (ANR project): with wireless sensors. Goal: preserve power resources.
33 Performance: comparison between XRecover and recomputation (figure not preserved)
34 Performance: comparison between XRecover and recomputation, continued (figure not preserved)
Conclusion
35 Contributions
- Maintenance process in the context of non-cooperative web sources
- Contribution to the XAlgebra: new operators (XRecover, XTidUnnest, XTidNest) and a new data structure (XTids)
Future work:
- Order-sensitive view maintenance
- A better Diff algorithm
36 Thanks for your attention! Any questions?