University of Lübeck, Germany, Institute of Information Systems: Incremental Validation of String-Based XML Data in Databases, File Systems and Streams.


1 University of Lübeck, Germany, Institute of Information Systems
Incremental Validation of String-Based XML Data in Databases, File Systems and Streams
Beda Christoph Hammerschmidt (3), Christian Werner (2), Ylva Brandt (2), Volker Linnemann (1), Sven Groppe (1), Stefan Fischer (2)
(1) Institute of Information Systems, University of Lübeck, Germany
(2) Institute of Telematics, University of Lübeck, Germany
(3) Oracle Corp., Redwood Shores, California, USA

2 Incremental Validation of String-based XML Data © Volker Linnemann et al. 2.10.2007
Table of Contents
1. Introduction and Motivation
2. The XML Validation Problem
3. Efficiently Validating Updates
4. Experiments
5. Conclusion

3 1. Introduction and Motivation
–XML data is important in many applications
–Valid XML data increases the correctness of applications
–Validity is defined with respect to an XML DTD or an XML Schema

4 1. Introduction and Motivation
In case of an update:
–Revalidating the whole document is time-consuming
–Solution: incremental validation, i.e. validate only the changed part of the XML document

5 1. Introduction and Motivation
Some approaches for partial validation exist, but:
–most of them are DOM-based, i.e. they work on a tree of nodes
–a DOM is inherently well-formed
We focus on the string representation of XML data as it is used in
–XML column types
–Message systems (SOAP)
–SQLXML update commands
→ Sequence of tags and values
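The "sequence of tags and values" view can be illustrated with a minimal sketch. The `tokenize` function, its regular expression and the token tuple layout are assumptions made for illustration; they are not taken from the slides, and the sketch ignores comments, CDATA sections and processing instructions.

```python
import re

# Hypothetical tokenizer: matches an opening, closing or self-closing tag,
# or a run of character data between tags.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>|([^<>]+)")

def tokenize(xml):
    """Return a list of (kind, value, offset) tokens.

    kind is 'open', 'close' or 'text'; offset is the character position
    of the token in the string representation.
    """
    tokens = []
    for m in TOKEN.finditer(xml):
        closing, name, self_closing, text = m.groups()
        if text is not None:
            if text.strip():                      # skip whitespace-only text
                tokens.append(("text", text.strip(), m.start()))
        elif closing:
            tokens.append(("close", name, m.start()))
        else:
            tokens.append(("open", name, m.start()))
            if self_closing:                      # <c/> counts as open + close
                tokens.append(("close", name, m.start()))
    return tokens
```

Unlike a DOM, such a token sequence is not guaranteed to be well-formed, so a validator working on it has to check nesting as well as the schema.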

6 2. The XML Validation Problem
XML Schema:

7 2. The XML Validation Problem
Regular Tree Grammar of XML Schema: G = (N, T, P, S)
–N: set of nonterminal symbols
–T: set of terminal symbols
–P: set of production rules
–S: set of start symbols, S ⊆ N
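As a rough illustration of the four components, a regular tree grammar can be written down as a small data structure. The class names, the production layout (a nonterminal producing one element with a regular expression over nonterminals as its content) and the example grammar `G` are assumptions for illustration only; the example on the slides is not part of this transcript.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Production:
    lhs: str      # nonterminal being defined
    element: str  # terminal, i.e. the element name it produces
    content: str  # regular expression over nonterminals (content model)

@dataclass(frozen=True)
class RegularTreeGrammar:
    N: frozenset  # nonterminal symbols
    T: frozenset  # terminal symbols
    P: tuple      # production rules
    S: frozenset  # start symbols

    def is_well_formed(self):
        # Start symbols must be nonterminals, and every rule must use
        # a known nonterminal on the left and a known terminal as element.
        return self.S <= self.N and all(
            p.lhs in self.N and p.element in self.T for p in self.P
        )

# Hypothetical grammar: an <a> element contains one <b> followed by any
# number of <c> elements; <b> and <c> are empty.
G = RegularTreeGrammar(
    N=frozenset({"A", "B", "C"}),
    T=frozenset({"a", "b", "c"}),
    P=(
        Production("A", "a", "B C*"),
        Production("B", "b", ""),
        Production("C", "c", ""),
    ),
    S=frozenset({"A"}),
)
```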

8 2. The XML Validation Problem
Example:

9 2. The XML Validation Problem
A set of finite state machines is generated from the regular tree grammar.
Example:

10 XML Schema Aware Pushdown Automaton (PDA)
[Figure: PDA run showing the stack contents step by step: Z; q0; r0 q1; q0 r1 q1; r1 q1; q0 r2 q1; r2 q1; q1; stack empty]
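The idea of a schema-aware pushdown automaton can be sketched as follows: one finite state machine per content model, plus a stack that remembers the state of every open ancestor. This is a simplified sketch, not the construction from the paper. It assumes that an element name determines its content model (as in a DTD), and the `MODELS` table, the state names and the `validate` function are hypothetical.

```python
# Hypothetical content models, one DFA per element name:
# <a> must contain one <b> followed by any number of <c>; <b>, <c> are empty.
MODELS = {
    "a": {"start": "q0",
          "delta": {("q0", "b"): "q1", ("q1", "c"): "q1"},
          "accept": {"q1"}},
    "b": {"start": "r0", "delta": {}, "accept": {"r0"}},
    "c": {"start": "s0", "delta": {}, "accept": {"s0"}},
}

def validate(events, root="a"):
    """Validate a sequence of ('open' | 'close', name) events."""
    stack = []          # entries are [element name, current DFA state]
    seen_root = False
    for kind, name in events:
        if kind == "open":
            if name not in MODELS:
                return False
            if stack:
                # Advance the parent's DFA on the child element name.
                elem, state = stack[-1]
                nxt = MODELS[elem]["delta"].get((state, name))
                if nxt is None:
                    return False
                stack[-1][1] = nxt
            else:
                if seen_root or name != root:
                    return False
                seen_root = True
            stack.append([name, MODELS[name]["start"]])
        elif kind == "close":
            # Well-formedness: the closing tag must match the open element.
            if not stack or stack[-1][0] != name:
                return False
            elem, state = stack.pop()
            # Validity: the content model must be in an accepting state.
            if state not in MODELS[elem]["accept"]:
                return False
    return seen_root and not stack   # accept only with an empty stack
```

Because the stack also checks tag nesting, the same run covers well-formedness and validity of the string representation.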

11 3. Efficiently Validating Updates
[Figure: PDA and Element/State Index]

12 3. Efficiently Validating Updates
The Element/State Index references the XML data and the PDA states for the document.
[Figure: stack contents r0; s0 r2; s1 r2; r2; stack empty, together with an example index entry for the path /a/b/c]

13 3. Efficiently Validating Updates
Finding the update position in the XML data using the index

14 3. Efficiently Validating Updates
How efficient is the incremental validation?
–The PDA is generated only once for the XML schema
–Validation time is linear in the size of the updated part and independent of the total size of the document
–Time for the index update is also linear in the size of the updated part, except for updating the offset list
But: the offset list is not needed for validating the update; it is used only for searching
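Why the cost depends only on the updated part can be sketched with a small example. The sketch assumes that the index stores, for the fragment being replaced, the parent's automaton state before and after it; this entry layout, the `MODELS` table and both function names are hypothetical and only approximate the element/state index described in the paper.

```python
# Hypothetical content models (one DFA per element name):
# <a> contains one <b> followed by any number of <c>.
MODELS = {
    "a": {"start": "q0",
          "delta": {("q0", "b"): "q1", ("q1", "c"): "q1"},
          "accept": {"q1"}},
    "b": {"start": "r0", "delta": {}, "accept": {"r0"}},
    "c": {"start": "s0", "delta": {}, "accept": {"s0"}},
}

def run_fragment(events, parent, state):
    """Validate sibling subtrees under `parent`, starting in `state`.

    Returns the parent's state after the fragment, or None if invalid.
    Only the events of the fragment are read.
    """
    stack = [[parent, state]]
    for kind, name in events:
        if kind == "open":
            elem, cur = stack[-1]
            nxt = MODELS[elem]["delta"].get((cur, name))
            if nxt is None or name not in MODELS:
                return None
            stack[-1][1] = nxt
            stack.append([name, MODELS[name]["start"]])
        else:
            if len(stack) < 2 or stack[-1][0] != name:
                return None
            elem, cur = stack.pop()
            if cur not in MODELS[elem]["accept"]:
                return None
    return stack[0][1] if len(stack) == 1 else None

def validate_replacement(index_entry, new_events):
    """Check a replacement using only the stored states and the new part."""
    after = run_fragment(new_events, index_entry["parent"],
                         index_entry["state_before"])
    # If the parent ends in the same state as before the update, the
    # unchanged remainder of the document is still accepted.
    return after is not None and after == index_entry["state_after"]
```

Requiring the same state after the fragment is a conservative choice made for this sketch: it is sufficient for the rest of the document to stay valid, but an update that leads to a different state could still be valid and would need the following siblings to be rechecked.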

15 4. Experiments
Time to validate the XMark sample data
Updated element: 20 kB
–Xenia global: PDA with no incremental update
–Xenia local: PDA with incremental update

16 5. Conclusion
Incremental validation by using a pushdown automaton (PDA):
–Costs are linear in the size of the update operation
–Validation is performed before updating the data → no invalid data
In the paper:
–formalism for generating the PDA
–element/state index in detail

17 5. Conclusion
Directions for Future Work
–Optimize the index update
–Index only selected paths → Index Selection Problem
–Update the index only when needed
Thank you for your attention!


