Presentation is loading. Please wait.

Presentation is loading. Please wait.

On-the-fly Validation of XML Markup Languages using off-the-shelf Tools Mikko Saesmaa Pekka Kilpeläinen Dept of Computer Science University of Kuopio,

Similar presentations


Presentation on theme: "On-the-fly Validation of XML Markup Languages using off-the-shelf Tools Mikko Saesmaa Pekka Kilpeläinen Dept of Computer Science University of Kuopio,"— Presentation transcript:

1 On-the-fly Validation of XML Markup Languages using off-the-shelf Tools Mikko Saesmaa Pekka Kilpeläinen Dept of Computer Science University of Kuopio, Finland

2 EML' 07 MontrealOn-the-fly Validation of XML2 The Talk in a Nutshell n How to support continuous validation in an XML editor... –easily –efficiently –against different schema languages? »DTD, XML Schema, Relax NG;..., NVDL for compound docs n Lesson: straightforward application of Java-XML APIs is effective, and quite efficient, too

3 EML' 07 MontrealOn-the-fly Validation of XML3 Intended Audience n Suitable for Java/XML developers –emphasis on practical use of technology n EXTREME enough? –Nothing extraordinary; some ideas non-obvious –Nothing extraordinary; some ideas non-obvious n Why Java? –built-in XML and GUI facilities (JAXP + Swing) –general and portable

4 EML' 07 MontrealOn-the-fly Validation of XML4 Look & Feel of ”Xeditor” - off - WF check, as XML or DTD XML or DTD - validate using DTD, using DTD, or against schema or against schema

5 EML' 07 MontrealOn-the-fly Validation of XML5 An Experimental Editor n Not production quality (yet) –UI unfinished –useful editor functionality missing –slightly unstable n Purpose to experiment with XML technology –examine, learn, and explain

6 EML' 07 MontrealOn-the-fly Validation of XML6 Background and Related Work n Travis, 1999: “Real-time XML Editor” –> Architag XRay2 n Oxygen, XMLMind, XMLSpy,... –internal solutions of commercial editors? n Clark’s nXML (on GNU Emacs) –~17,000 lines of Emacs Lisp (vs. ~1000 of ours) n DB research on incremental XML validation (Barbosa et al. 2004 & 2006, Balmin et al. 2004)

7 EML' 07 MontrealOn-the-fly Validation of XML7 Architectural solutions of Xeditor Separation of Concerns: Editor (Swing) ignorant of XML, XMLHandler (JAXP) ignorant of editing Separation of Concerns: Editor (Swing) ignorant of XML, XMLHandler (JAXP) ignorant of editing n After each edit, the doc is re-parsed completely n Pro: modularity and simplicity n Cons: –limited functionality (e.g. markup completion and syntax highlighting) –some inefficiency, but not too bad

8 EML' 07 MontrealOn-the-fly Validation of XML8 Basic Technicalities n Event-driven document checking/validation –modifications caught by a Swing DocumentListener –errors caught as SAX parse exceptions

9 EML' 07 MontrealOn-the-fly Validation of XML9 Error reporting in Xeditor

10 EML' 07 MontrealOn-the-fly Validation of XML10 Basic Technicalities (2) n JAXP is based on a Factory Design Pattern –Initialization of a Factory selects appropriate implementation of a parser, schema language, XSLT transformer,... (Implementation-independence, pluggability) –> extensibility by new schema languages

11 EML' 07 MontrealOn-the-fly Validation of XML11 Basic Technicalities (3) Well-formedness checking and DTD validation based on JAXP SAX interfaces: SAXParserFactory  SAXParser Well-formedness checking and DTD validation based on JAXP SAX interfaces: SAXParserFactory  SAXParser –Validating/non-validating parser selected based on editing mode –SAX instead of DOM relevant for efficiency –Only errors are observed; other events ignored n Schema-based validation using JAXP Validators (later)

12 EML' 07 MontrealOn-the-fly Validation of XML12 Efficiency Concerns n Rule of thumb: access via memory much faster than via disk, which again much faster than over network n Document passed to parser/validator as an in-memory stream in-memory stream in-memory stream –> normally no observable delays –Could cache external entities, like DTDs, too

13 EML' 07 MontrealOn-the-fly Validation of XML13 Different Schemas and Schema Languages n A Validator created when the user selects Schema

14 EML' 07 MontrealOn-the-fly Validation of XML14 Validation against the Schema n As already shown:

15 EML' 07 MontrealOn-the-fly Validation of XML15 Editing Schema Documents n Public schema files for schema languages used for validation (as external resource files) –xsd.rng or XMLSchema.xsd for XML Schema –relaxng.rng for Relax NG n Pro: uniformity; efficiency n Con: imprecision (vs validating by appropriate schema compiler)

16 EML' 07 MontrealOn-the-fly Validation of XML16 Editing DTDs n How to check the non-XML syntax of DTDs easily?

17 EML' 07 MontrealOn-the-fly Validation of XML17 Editing DTDs (2) n Soln 1: Wrap the DTD in the prolog of a dummy doc –how to check stand-alone DTDs (“external subsets”)? Soln 2: Parse a fixed dummy doc that refers to the DTD: Soln 2: Parse a fixed dummy doc that refers to the DTD: ]> ]> randomize ”foo”, to avoid conflict with actual DTD

18 EML' 07 MontrealOn-the-fly Validation of XML18 Editing DTDs (3) SYSTE PUBLIC

19 EML' 07 MontrealOn-the-fly Validation of XML19 Efficiency and Scalability n Is brute-force re-validation too inefficient? n No: Delays normally unnoticeable times for validating XMLSchema.xsd

20 EML' 07 MontrealOn-the-fly Validation of XML20 Validating a Larger Instance appr. speed / sec 28,000 lines, 1.2 MB 52,000 lines, 2.3 MB 180,000 lines, 8.1 MB An XML Schema of ~ 18,000 lines/800 KB

21 EML' 07 MontrealOn-the-fly Validation of XML21 Conclusions n Immediate validation against schemas in different languages (DTD, XML Schema, Relax NG) easy to support n Efficiency of brute-force application sufficient for moderate-sized documents

22 EML' 07 MontrealOn-the-fly Validation of XML22 Further Work n Potential application: teaching of XML n Practical enhancements –markup completion, highlighting etc. n Incremental validation –to remove dependency of document length –How to apply/modify standard interfaces? n Thank you! Questions? Comments?


Download ppt "On-the-fly Validation of XML Markup Languages using off-the-shelf Tools Mikko Saesmaa Pekka Kilpeläinen Dept of Computer Science University of Kuopio,"

Similar presentations


Ads by Google