Published by Christian Cooper. Modified over 9 years ago.
1
On-the-fly Validation of XML Markup Languages using off-the-shelf Tools
Mikko Saesmaa, Pekka Kilpeläinen
Dept of Computer Science, University of Kuopio, Finland
2
EML'07 Montreal: On-the-fly Validation of XML
The Talk in a Nutshell
• How to support continuous validation in an XML editor...
  – easily
  – efficiently
  – against different schema languages?
    » DTD, XML Schema, Relax NG; ..., NVDL for compound docs
• Lesson: straightforward application of Java-XML APIs is effective, and quite efficient, too
3
Intended Audience
• Suitable for Java/XML developers
  – emphasis on practical use of technology
• EXTREME enough?
  – Nothing extraordinary; some ideas non-obvious
• Why Java?
  – built-in XML and GUI facilities (JAXP + Swing)
  – general and portable
4
Look & Feel of “Xeditor”
[Screenshot of the editor; labelled checking modes:]
  – off
  – WF check, as XML or DTD
  – validate using DTD, or against schema
5
An Experimental Editor
• Not production quality (yet)
  – UI unfinished
  – useful editor functionality missing
  – slightly unstable
• Purpose to experiment with XML technology
  – examine, learn, and explain
6
Background and Related Work
• Travis, 1999: “Real-time XML Editor”
  → Architag XRay2
• Oxygen, XMLMind, XMLSpy, ...
  – internal solutions of commercial editors?
• Clark’s nXML (on GNU Emacs)
  – ~17,000 lines of Emacs Lisp (vs. ~1000 of ours)
• DB research on incremental XML validation (Barbosa et al. 2004 & 2006, Balmin et al. 2004)
7
Architectural solutions of Xeditor
• Separation of Concerns: Editor (Swing) ignorant of XML, XMLHandler (JAXP) ignorant of editing
• After each edit, the doc is re-parsed completely
• Pro: modularity and simplicity
• Cons:
  – limited functionality (e.g. no markup completion or syntax highlighting)
  – some inefficiency, but not too bad
8
Basic Technicalities
• Event-driven document checking/validation
  – modifications caught by a Swing DocumentListener
  – errors caught as SAX parse exceptions
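The event-driven loop described on this slide could look roughly like the sketch below. It is not Xeditor's actual code: the class name ReparseOnEdit and the lastError() accessor are made up for illustration, and only a well-formedness check is shown. The listener re-parses the whole document text after every insert or removal and records the first SAX parse exception, if any.

```java
import java.io.StringReader;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.Document;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical listener: re-checks well-formedness after each text change.
final class ReparseOnEdit implements DocumentListener {
    private final SAXParserFactory factory = SAXParserFactory.newInstance();
    private String lastError; // null while the text is well-formed

    String lastError() {
        return lastError;
    }

    @Override
    public void insertUpdate(DocumentEvent e) {
        recheck(e.getDocument());
    }

    @Override
    public void removeUpdate(DocumentEvent e) {
        recheck(e.getDocument());
    }

    @Override
    public void changedUpdate(DocumentEvent e) {
        // Attribute (style) changes do not alter the text, so nothing to re-check.
    }

    private void recheck(Document doc) {
        try {
            String text = doc.getText(0, doc.getLength());
            // DefaultHandler ignores all content events and rethrows fatal errors.
            factory.newSAXParser().parse(new InputSource(new StringReader(text)), new DefaultHandler());
            lastError = null;
        } catch (SAXParseException ex) {
            lastError = "line " + ex.getLineNumber() + ": " + ex.getMessage();
        } catch (Exception ex) {
            lastError = ex.toString();
        }
    }
}
```

An editor would register it with something like textComponent.getDocument().addDocumentListener(new ReparseOnEdit()) and show lastError() in a status area; how Xeditor presents errors is not specified here.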
9
Error reporting in Xeditor
[Screenshot not preserved]
10
Basic Technicalities (2)
• JAXP is based on a Factory Design Pattern
  – Initialization of a Factory selects appropriate implementation of a parser, schema language, XSLT transformer, ... (implementation-independence, pluggability)
  → extensibility by new schema languages
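As a small illustration of the factory lookup, the sketch below asks the JAXP factories which implementations they resolve to at run time. The class and method names (JaxpFactories, supportsSchemaLanguage, describe) are invented for this example. Whether a schema language other than W3C XML Schema is available depends on what is on the classpath; a stock JDK typically needs an extra library for Relax NG.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper showing JAXP's pluggable factory lookup.
final class JaxpFactories {

    /** True if some installed SchemaFactory handles the given schema language URI. */
    static boolean supportsSchemaLanguage(String schemaLanguageUri) {
        try {
            SchemaFactory.newInstance(schemaLanguageUri);
            return true;
        } catch (IllegalArgumentException noImplementation) {
            // JAXP signals "no implementation found" with IllegalArgumentException.
            return false;
        }
    }

    /** Names the concrete classes the factories resolved to. */
    static String describe() {
        return "parser factory: " + SAXParserFactory.newInstance().getClass().getName()
                + "\nXSLT factory: " + TransformerFactory.newInstance().getClass().getName()
                + "\nXSD factory: "
                + SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).getClass().getName();
    }
}
```

Calling supportsSchemaLanguage(XMLConstants.RELAXNG_NS_URI) is one way an editor could decide whether to offer Relax NG in its schema menu.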
11
Basic Technicalities (3)
• Well-formedness checking and DTD validation based on JAXP SAX interfaces: SAXParserFactory, SAXParser
  – Validating/non-validating parser selected based on editing mode
  – SAX instead of DOM relevant for efficiency
  – Only errors are observed; other events ignored
• Schema-based validation using JAXP Validators (later)
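A minimal sketch of this mode switch, assuming a two-valued Mode enum that is not taken from Xeditor: the factory is configured as validating or non-validating, and a handler that overrides only the error callbacks collects the problems while every other SAX event falls through to DefaultHandler's no-op methods.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical checker: well-formedness only, or DTD validation, depending on mode.
final class SaxChecker {
    enum Mode { WELL_FORMED, DTD_VALID }

    static List<String> check(String xml, Mode mode) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(mode == Mode.DTD_VALID);
        List<String> problems = new ArrayList<>();
        DefaultHandler errorsOnly = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                problems.add(describe(e)); // validity error: record it and keep parsing
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                problems.add(describe(e)); // well-formedness error: record it and stop
                throw e;
            }
        };
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), errorsOnly);
        } catch (SAXParseException alreadyRecorded) {
            // fatalError() stored the message before rethrowing.
        } catch (SAXException | IOException | ParserConfigurationException e) {
            problems.add(e.toString());
        }
        return problems;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }
}
```

An empty result means the text passed the selected check. A document with no DOCTYPE declaration will report validity errors in DTD_VALID mode, since there is nothing to validate against.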
12
Efficiency Concerns
• Rule of thumb: access via memory is much faster than via disk, which in turn is much faster than over the network
• Document passed to parser/validator as an in-memory stream
  → normally no observable delays
  – Could cache external entities, like DTDs, too
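Both ideas on this slide can be sketched with standard SAX classes. In the example below the editor text reaches the parser through a StringReader, and a caching EntityResolver keeps external entities such as DTDs in memory after the first load. The names InMemoryParsing and CachingResolver and the loader function are assumptions made for illustration; the slide only says that caching could be done, not how.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class InMemoryParsing {

    /** Caches external entities (e.g. DTDs) by system identifier. */
    static final class CachingResolver implements EntityResolver {
        private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
        private final Function<String, byte[]> loader; // hypothetical: fetches the bytes for a system ID

        CachingResolver(Function<String, byte[]> loader) {
            this.loader = loader;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            if (systemId == null) {
                return null; // let the parser use its default resolution
            }
            byte[] bytes = cache.computeIfAbsent(systemId, loader);
            if (bytes == null) {
                return null;
            }
            InputSource source = new InputSource(new ByteArrayInputStream(bytes));
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            return source;
        }
    }

    /** Parses the editor text straight from memory; throws on the first fatal error. */
    static void parse(String editorText, EntityResolver resolver)
            throws SAXException, IOException, ParserConfigurationException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        reader.setErrorHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(editorText)));
    }
}
```

A real loader would read from disk or the network once per system identifier; invalidating the cache when a DTD file changes is left out of this sketch.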
13
Different Schemas and Schema Languages
• A Validator is created when the user selects a schema
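The slide's own listing is not preserved. With the standard javax.xml.validation API, creating a Validator for a chosen schema could look like the following sketch; ValidatorSelection and its method names are hypothetical, and the schema language is identified by its namespace URI, as JAXP expects.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: builds a Validator once, when the user picks a schema.
final class ValidatorSelection {

    static Validator forSchema(String schemaLanguageUri, Source schemaSource) throws SAXException {
        // The factory lookup picks an implementation for the given schema language.
        SchemaFactory factory = SchemaFactory.newInstance(schemaLanguageUri);
        // Compiling the schema reports errors in the schema itself as SAXException.
        Schema schema = factory.newSchema(schemaSource);
        return schema.newValidator();
    }

    /** Convenience for a W3C XML Schema held in memory as text. */
    static Validator forXsdText(String xsdText) throws SAXException {
        return forSchema(XMLConstants.W3C_XML_SCHEMA_NS_URI, new StreamSource(new StringReader(xsdText)));
    }
}
```

Because the compiled Schema and its Validator are created only when the selection changes, each later edit pays only for validating the document, not for re-reading the schema.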
14
Validation against the Schema
• As already shown: [code listing not preserved]
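Since the original listing is missing, here is a hedged sketch of the validation step using the standard Validator API. SchemaCheck is an invented name. With no ErrorHandler installed, a JAXP Validator throws a SAXParseException on the first error, which matches the slide's earlier point that errors are caught as parse exceptions.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validates the current editor text with an existing Validator.
final class SchemaCheck {

    /** Returns null if the text is valid, otherwise a message describing the first error. */
    static String validate(Validator validator, String documentText) {
        try {
            // The document is handed over as an in-memory character stream.
            validator.validate(new StreamSource(new StringReader(documentText)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (SAXException | IOException e) {
            return e.toString();
        }
    }
}
```

The same Validator instance can be reused across edits on one thread; it is not safe to share between threads.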
15
Editing Schema Documents
• Public schema files for schema languages used for validation (as external resource files)
  – xsd.rng or XMLSchema.xsd for XML Schema
  – relaxng.rng for Relax NG
• Pro: uniformity; efficiency
• Con: imprecision (vs validating by appropriate schema compiler)
16
Editing DTDs
• How to check the non-XML syntax of DTDs easily?
17
Editing DTDs (2)
• Soln 1: Wrap the DTD in the prolog of a dummy doc
  – how to check stand-alone DTDs (“external subsets”)?
• Soln 2: Parse a fixed dummy doc that refers to the DTD:
  [listing not preserved; only the closing “]>” of the DOCTYPE declaration survives]
  – randomize “foo”, to avoid conflict with actual DTD
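One way Soln 2 might be realized is sketched below. This is a reconstruction under assumptions, not the slide's lost listing: the dummy document's exact shape, the placeholder system identifier, the DtdChecker name and the use of an entity resolver to feed the edited DTD text from memory are all illustrative choices. The root element name is randomized, as the slide suggests, so that it does not clash with a declaration in the DTD being edited.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical checker: parses a dummy document whose external subset is the edited DTD.
final class DtdChecker {
    // Placeholder identifier; it is never fetched because the resolver below answers for it.
    private static final String DTD_SYSTEM_ID = "http://example.invalid/edited.dtd";

    /** Returns null if the DTD text parses, otherwise the parser's error message. */
    static String check(String dtdText) {
        String root = "foo" + Long.toHexString(System.nanoTime()); // randomized dummy root
        String dummy = "<!DOCTYPE " + root + " SYSTEM \"" + DTD_SYSTEM_ID + "\" [\n"
                + "<!ELEMENT " + root + " ANY>\n"
                + "]>\n<" + root + "/>";
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                if (systemId != null && systemId.endsWith("/edited.dtd")) {
                    InputSource dtd = new InputSource(new StringReader(dtdText));
                    dtd.setSystemId(systemId);
                    return dtd; // serve the editor buffer instead of a file
                }
                return null; // other entities: default resolution
            }
        };
        try {
            // Non-validating parse: DTD syntax errors still surface as fatal errors.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(dummy)), handler);
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }
}
```

The sketch relies on the parser reading the external DTD subset even when not validating, which is the default behaviour of the parser bundled with the JDK but is not guaranteed for every SAX implementation.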
18
Editing DTDs (3)
[Figure not preserved; surviving labels: SYSTEM, PUBLIC]
19
Efficiency and Scalability
• Is brute-force re-validation too inefficient?
• No: Delays normally unnoticeable
[Chart: times for validating XMLSchema.xsd; values not preserved]
20
Validating a Larger Instance
An XML Schema of ~18,000 lines / 800 KB
appr. speed / sec (row labels not preserved):
  – 28,000 lines, 1.2 MB
  – 52,000 lines, 2.3 MB
  – 180,000 lines, 8.1 MB
21
Conclusions
• Immediate validation against schemas in different languages (DTD, XML Schema, Relax NG) easy to support
• Efficiency of brute-force application sufficient for moderate-sized documents
22
Further Work
• Potential application: teaching of XML
• Practical enhancements
  – markup completion, highlighting etc.
• Incremental validation
  – to remove dependency on document length
  – How to apply/modify standard interfaces?
Thank you! Questions? Comments?