Published by Marisa Vine. Modified over 10 years ago.
1
Incremental Validation of XML Databases
Yannis Papakonstantinou, Victor Vianu
Computer Science & Eng, UCSD
2
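Incremental Validation of XML Databases
[Figure: updates applied to an XML database of n nodes, validated against a Document Type Definition (DTD) or an XML Schema/XQuery type system.]
- Document Type Definition (DTD): O(log n)
- XML Schema/XQuery type system: O(log^2 n)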
3
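XML As Labeled Ordered Trees
[Figure: labeled ordered tree rooted at cars, with children used and new; below them, car nodes with year and model children. Leaf values shown: 92, Civic, 96, Acura, Civic, Maxima, 03.]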
4
Document Type Definitions (DTDs): Abstraction & Example
root: cars
cars → used new
used → car*
new → car*
car → (year | ε) model
[Figure: the cars tree with nodes used, new, car, year, model; leaf values 92, Civic, 96, Acura, Civic, Maxima, 03; one node marked dummy.]
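As a rough illustration of what validity against the example DTD means, here is a minimal Python sketch. It assumes the rules read cars → used new, used → car*, new → car*, car → (year | ε) model, encodes a tree as a (label, children) pair, and ignores text values. The names DTD, ROOT, valid and valid_document are hypothetical and do not come from the slides.

```python
import re

# Assumed encoding of the example DTD: each label maps to a regular
# expression over the space-terminated labels of its children.
DTD = {
    "cars": r"used new ",
    "used": r"(car )*",
    "new": r"(car )*",
    "car": r"(year )?model ",
    "year": r"",
    "model": r"",
}
ROOT = "cars"


def valid(node, dtd=DTD):
    """Check every node's child-label word against the rule for its label."""
    label, children = node
    rule = dtd.get(label)
    if rule is None:
        return False  # label not declared in the DTD
    word = "".join(child[0] + " " for child in children)
    if re.fullmatch(rule, word) is None:
        return False
    return all(valid(child, dtd) for child in children)


def valid_document(tree, dtd=DTD, root=ROOT):
    """A document is valid if the root label matches and all nodes conform."""
    return tree[0] == root and valid(tree, dtd)


# Hypothetical sample documents (text values such as 92 or Civic are omitted).
GOOD = ("cars", [
    ("used", [("car", [("year", []), ("model", [])])]),
    ("new", [("car", [("model", [])])]),
])
BAD = ("cars", [
    ("used", [("car", [("year", [])])]),  # a car without a model
    ("new", []),
])
```

This full check visits every node, which is the cost that incremental validation is meant to avoid after a single update.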
5
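Tree Satisfying DTD, General Case
[Figure: general case of a tree satisfying a DTD with root r; the children of a node are numbered 1, 2, …, i-1, i, i+1, …, k-1, k, with example labels a, b, c.]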
6
XML Schemas/XQuery Types as Specialized DTDs
root: cars^T
cars^T → used^T new^T
used^T → (car^U)*
new^T → (car^N)*
car^U → year^T model^T
car^N → (year^T | ε) model^T

LABEL → TYPES
car → {car^U, car^N}
cars → {cars^T}
used → {used^T}
…
[Figure: the cars tree with each node annotated by its candidate types: cars^T at the root, used^T and new^T below it, {car^U, car^N} at the car nodes, year^T and model^T at the leaves.]
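To make the role of types concrete, the following Python sketch computes, bottom-up, the set of types each node can take. It assumes the rules are as in the cars example (a used car must have a year, while a new car may omit it). The brute-force search over combinations of child types is exponential and is only an illustration, not the algorithm of the talk; LABEL, RULES, ROOT_TYPE, types and valid_document are hypothetical names.

```python
import re
from itertools import product

# Assumed encoding: every type has a label and a regular expression over
# the space-terminated type names of the children.
LABEL = {
    "carsT": "cars", "usedT": "used", "newT": "new",
    "carU": "car", "carN": "car", "yearT": "year", "modelT": "model",
}
RULES = {
    "carsT": r"usedT newT ",
    "usedT": r"(carU )*",
    "newT": r"(carN )*",
    "carU": r"yearT modelT ",
    "carN": r"(yearT )?modelT ",
    "yearT": r"",
    "modelT": r"",
}
ROOT_TYPE = "carsT"


def types(node):
    """Return the set of types the node can be assigned (bottom-up)."""
    label, children = node
    child_types = [types(child) for child in children]
    result = set()
    for type_name, rule in RULES.items():
        if LABEL[type_name] != label:
            continue
        # Brute force: try every way of picking one type per child.
        for choice in product(*child_types):
            word = "".join(name + " " for name in choice)
            if re.fullmatch(rule, word):
                result.add(type_name)
                break
    return result


def valid_document(tree):
    return ROOT_TYPE in types(tree)


# Hypothetical documents: the same car subtree is fine under new but not under used.
CAR_WITHOUT_YEAR = ("car", [("model", [])])
CAR_WITH_YEAR = ("car", [("year", []), ("model", [])])
OK_DOC = ("cars", [("used", [CAR_WITH_YEAR]), ("new", [CAR_WITHOUT_YEAR])])
BAD_DOC = ("cars", [("used", [CAR_WITHOUT_YEAR]), ("new", [CAR_WITH_YEAR])])
```

A plain DTD cannot express this distinction, because both kinds of car share the label car; the specialized DTD separates them through the types car^U and car^N.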
7
Tree Automata and Specialized DTDs
[Figure: the cars tree with each node annotated by types: cars^T, used^T, new^T, {car^U, car^N} at each car node, year^T and model^T at the leaves.]
8
Incremental Validation Problem Statement
For each valid tree T, use an auxiliary structure A(T) so that, given a series of update commands, we can:
- efficiently decide if the updated tree T is valid
- efficiently update A(T) and T
9
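Types of Updates: Node Renaming
u(v, ·): node v receives a new label
[Figure: node r with children 1, 2, …, i-1, i, i+1, …, k-1, k (example labels a, b, c), before and after renaming node v.]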
10
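Types of Updates: Deletion
d(v): node v is deleted
[Figure: node r with children 1, 2, …, k before the deletion, and with children 1, 2, …, i-1, i+1, …, k-1, k after node v at position i is removed.]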
11
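Types of Updates: Insertion
insert_after(v_{i-1}, ·): a new node is inserted as the right sibling of v_{i-1}
[Figure: node r with children 1, 2, …, i-1, i+1, …, k-1, k; the new node at position i is placed after v_{i-1}.]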
12
Validating a Renaming u(i, ·) on a Regular String of N: Take One
- Validation of one update in O(1), given precomputed Pre(i-1) and Post(i+1)
- u(i, ·) requires recomputation of Pre(i), Pre(i+1), … and of Post(i), Post(i-1), …
[Figure: string positions 1, 2, …, i-1, i, i+1, …, n-1, n; a run from q0 over positions 1 … i-1 and a run over positions i+1 … n ending in qF.]
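The slide does not spell out Pre and Post, so the sketch below rests on an assumed reading: with a deterministic automaton, Pre(i) is the state reached after the first i symbols, and Post(i) is the set of states from which the suffix starting at position i leads to an accepting state. Under that assumption a renaming at position i is checked with one transition lookup. The names precompute and rename_is_valid and the sample automaton for (ab)* are hypothetical.

```python
def precompute(delta, q0, finals, states, word):
    """Build Pre and Post for a total DFA; positions are 1-based as on the slide."""
    n = len(word)
    pre = [q0] * (n + 1)  # pre[i]: state after reading word[0:i]
    for i in range(1, n + 1):
        pre[i] = delta[(pre[i - 1], word[i - 1])]
    post = [set() for _ in range(n + 2)]  # post[i]: states accepting word[i-1:]
    post[n + 1] = set(finals)
    for i in range(n, 0, -1):
        post[i] = {q for q in states if delta[(q, word[i - 1])] in post[i + 1]}
    return pre, post


def rename_is_valid(delta, pre, post, i, symbol):
    """Constant-time check: does the string stay accepted if position i becomes symbol?"""
    return delta[(pre[i - 1], symbol)] in post[i + 1]


# Hypothetical DFA for (ab)*: state 0 is initial and accepting, state 2 is a sink.
STATES = {0, 1, 2}
DELTA = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 2, (1, "b"): 0,
    (2, "a"): 2, (2, "b"): 2,
}
```

The check itself is cheap, but after an accepted renaming at position i every later Pre entry and every earlier Post entry may change, so maintenance costs time linear in the string length in the worst case. That is the weakness the transition relation trees address.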
13
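Transition Relation Definition
T_{i,j} = { (q, q') | the automaton can go from state q to state q' by reading positions i, i+1, …, j }
T_{i,j} = T_{i,m} ∘ T_{m+1,j}, for i ≤ m < j
[Figure: string positions 1, 2, …, i, …, m, m+1, …, j, …, n-1, n.]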
14
Transition Relation Trees
[Figure: balanced binary tree of transition relations over string positions 1 … 8.]
T_{1,8}
T_{1,4}  T_{5,8}
T_{1,2}  T_{3,4}  T_{5,6}  T_{7,8}
T_{1,1}  T_{2,2}  T_{3,3}  T_{4,4}  T_{5,5}  T_{6,6}  T_{7,7}  T_{8,8}
15
Maintenance of the Structure and Validation in O(log n)
[Figure: the transition relation tree over positions 1 … 8, with the path from leaf 6 to the root highlighted.]
- u(6, ·) requires recomputation of T_{6,6}, T_{5,6}, T_{5,8}, T_{1,8}
- If (q0, qF) ∈ T_{1,8} then valid
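The maintenance step can be sketched in Python as follows. This is a minimal sketch under assumptions, not the authors' implementation: the tree is stored in an array (a standard segment-tree layout), the automaton is an NFA given as a dictionary from (state, symbol) to a set of successor states, and unused leaves are padded with the identity relation. The names compose and TransitionRelationTree are hypothetical.

```python
def compose(r, s):
    """Relational composition: (p, q) with p -r-> x -s-> q for some state x."""
    return {(p, q) for (p, x) in r for (y, q) in s if x == y}


class TransitionRelationTree:
    def __init__(self, delta, word):
        self.delta = delta
        self.size = 1
        while self.size < len(word):
            self.size *= 2
        states = {q for (q, _) in delta} | {q for qs in delta.values() for q in qs}
        identity = {(q, q) for q in states}  # neutral element, used as padding
        self.node = [identity] * (2 * self.size)
        for i, symbol in enumerate(word):
            self.node[self.size + i] = self.leaf(symbol)
        for k in range(self.size - 1, 0, -1):
            self.node[k] = compose(self.node[2 * k], self.node[2 * k + 1])

    def leaf(self, symbol):
        """T_{i,i} for a position labeled symbol."""
        return {(p, q) for (p, a), qs in self.delta.items() if a == symbol for q in qs}

    def rename(self, i, symbol):
        """u(i, symbol): recompute only the relations on the path from leaf i to the root."""
        k = self.size + i - 1  # positions are 1-based as on the slides
        self.node[k] = self.leaf(symbol)
        k //= 2
        while k >= 1:
            self.node[k] = compose(self.node[2 * k], self.node[2 * k + 1])
            k //= 2

    def accepts(self, q0, finals):
        """Valid if some pair (q0, qF) is in the root relation."""
        return any((q0, qf) in self.node[1] for qf in finals)


# Hypothetical NFA for (ab)*: state 0 is both initial and final.
NFA = {(0, "a"): {1}, (1, "b"): {0}}
```

A renaming touches one leaf and its ancestors, so it performs about log n compositions, each of which depends only on the number of automaton states and not on n.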
16
Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions
[Figure: 2-3 tree over positions 1, 2, 3, 5, 6, 7, 9 with leaf relations T_1, T_2, T_3, T_5, T_6, T_7, T_9 and internal relations T_a, T_b, T_c.]
- T_a = T_1 ∘ T_2
- If (q0, qF) ∈ T_a ∘ T_b ∘ T_c then valid
17
Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions
[Figure: the 2-3 tree after position 8 is added: leaf relations T_1, T_2, T_3, T_5, T_6, T_7, T_8, T_9 under internal relations T_a, T_b, T_c.]
18
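Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions
[Figure: the 2-3 tree over positions 1 … 9 as position 4 is inserted; relations shown: T_1, T_2, T_7, T_8, T_9, T_a, T_b, T_c, with T_3, T_5, T_6 drawn separately.]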
19
Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions
[Figure: the 2-3 tree over positions 1 … 9 with leaves regrouped as T_3, T_4 and T_5, T_6; the other relations shown are T_1, T_2, T_7, T_8, T_9, T_a, T_b, T_c.]
20
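Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions
[Figure: the 2-3 tree over positions 1 … 9 after restructuring: leaf groups T_1, T_2; T_3, T_4; T_5, T_6; T_7, T_8, T_9; internal relations T_a, T_d, T_e, T_c; upper-level relations T_f, T_g.]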
21
Auxiliary Structures for Incremental DTD Validation
[Figure: node r with children 1, 2, …, i-1, i, i+1, …, k-1, k and an auxiliary structure over the child sequence; the renaming u(v_i, ·) is applied to child v_i.]
22
Specialized DTD Incremental Validation: Take One
[Figure: node r with children labeled a_1, …, a_{i-1}, a_i, a_{i+1}, …, a_k and b_1, …, b_{k-1}, b_k; the renaming u(v_i, ·) is applied to v_i, whose set of types types(v_i) is shown before and after the update.]
23
Inefficient for Deep Trees: Apply Divide-And-Conquer in Vertical Direction
- Turn Specialized DTD into NFA that validates a vertical line
- Fuse vertical and horizontal directions using binary tree and split work in both
24
Tree Satisfying Specialized DTD Transformed into Binary Tree Accepted by Tree Automaton
[Figure: a tree with nodes a, b, c, d, e, f, g, h, i, j, k and its binary-tree encoding, in which # leaves fill the empty positions.]
25
Designate Lines in Binary Trees
[Figure: subtree-size conditions of the form Size(·) > 2 Size(·) and Size(·) > 4 Size(·) used to designate lines.]
26
Example Line Structure
[Figure: the binary tree over nodes a … k with # leaves, and the same tree partitioned into lines.]
27
From Tree Automaton to Validating Lines with NFA
[Figure: the line structure over nodes a, c, b, j, e, k, h, g, i, d, f.]
28
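[Figure: the line structure over nodes a, c, j, e, k, h, g, i, with nodes b, d, f paired with the relations T_c, T_j, T_g respectively.]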
29
Incremental Validation of the Line Structure in O(log^2 |T|)
Insert m after k
- Number of updated lines < 1 + log |T|
- Cost of a line update: O(log |T|)
[Figure: the line structure with the new node m inserted after k; nodes b, d, f paired with T_c, T_j, T_g.]
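The two figures on this slide combine into the bound in the title. As a quick check, assuming each affected line is maintained separately at the stated per-line cost:

```latex
\underbrace{(1 + \log |T|)}_{\text{updated lines}}
\cdot
\underbrace{O(\log |T|)}_{\text{cost per line update}}
= O(\log^2 |T|)
```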
30
Validating Insertions and Deletions: the Non-Line-Preserving Case
[Figure: an insertion that changes the line structure.]
31
Key Complexity Results
Given m updates on a tree of size n, incrementally validate:
- DTDs in O(m log n)
  - given alphabet Σ and size d of the maximum regular expression: O(m |Σ| d^2 log d log n)
  - data structure of size O(d^2 n)
- Specialized DTDs in O(m log^2 n)
  - given a set of t types: O(m t^2 d^2 (log d + log t) log^2 n)
  - data structure of size O(t^2 d^2 log^2 n)
- Lower complexity for 1-unambiguous
32
Ongoing and Future Work (with Andrey Balmin)
- Incorporate Transition Relation Trees in B-Tree Structure
- Exploit locality
  - Experimental evaluation on a set of 65 DTDs: in 96% of type definitions, an update may only affect transition relations of length < 4
  - Common case much more efficient than worst case
  - Detect the property and employ algorithms that do not build transition relation trees (TRTs) in such cases
- Optimization over multiple updates
- More complex updates & edit operations