

On-the-fly Validation of XML Markup Languages using off-the-shelf Tools Mikko Saesmaa Pekka Kilpeläinen Dept of Computer Science University of Kuopio, Finland

EML'07 Montreal: On-the-fly Validation of XML
Slide 2: The Talk in a Nutshell
• How to support continuous validation in an XML editor...
  – easily
  – efficiently
  – against different schema languages?
    » DTD, XML Schema, Relax NG, ...; NVDL for compound docs
• Lesson: straightforward application of Java-XML APIs is effective, and quite efficient, too

Slide 3: Intended Audience
• Suitable for Java/XML developers
  – emphasis on practical use of technology
• EXTREME enough?
  – Nothing extraordinary; some ideas non-obvious
• Why Java?
  – built-in XML and GUI facilities (JAXP + Swing)
  – general and portable

Slide 4: Look & Feel of "Xeditor"
• Checking modes (screenshot labels):
  – off
  – WF check, as XML or DTD
  – validate using DTD, or against schema

Slide 5: An Experimental Editor
• Not production quality (yet)
  – UI unfinished
  – useful editor functionality missing
  – slightly unstable
• Purpose: to experiment with XML technology
  – examine, learn, and explain

Slide 6: Background and Related Work
• Travis, 1999: “Real-time XML Editor”
  –> Architag XRay2
• Oxygen, XMLMind, XMLSpy, ...
  – internal solutions of commercial editors?
• Clark’s nXML (on GNU Emacs)
  – ~17,000 lines of Emacs Lisp (vs. ~1000 of ours)
• DB research on incremental XML validation (Barbosa et al & 2006, Balmin et al. 2004)

Slide 7: Architectural solutions of Xeditor
• Separation of Concerns: Editor (Swing) ignorant of XML, XMLHandler (JAXP) ignorant of editing
• After each edit, the doc is re-parsed completely
• Pro: modularity and simplicity
• Cons:
  – limited functionality (e.g. no markup completion or syntax highlighting)
  – some inefficiency, but not too bad

Slide 8: Basic Technicalities
• Event-driven document checking/validation
  – modifications caught by a Swing DocumentListener
  – errors caught as SAX parse exceptions
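The event-driven loop described on this slide can be sketched roughly as follows. This is a minimal illustration, not Xeditor's actual code: the class name `RecheckListener`, the `lastError` field, and the `errorAfterTyping` helper are hypothetical names introduced here. It assumes the whole text is re-parsed on every insert or remove event, and that a well-formedness error surfaces as a `SAXParseException`.

```java
import java.io.StringReader;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical listener (not Xeditor's code): re-parses the whole text after each edit. */
final class RecheckListener implements DocumentListener {
    /** Most recent diagnostic, or null if the text is well-formed. */
    String lastError;

    @Override public void insertUpdate(DocumentEvent e) { recheck(e.getDocument()); }
    @Override public void removeUpdate(DocumentEvent e) { recheck(e.getDocument()); }
    @Override public void changedUpdate(DocumentEvent e) { /* attribute changes: text is unchanged */ }

    private void recheck(Document doc) {
        try {
            String text = doc.getText(0, doc.getLength());
            // DefaultHandler ignores content events and rethrows fatal errors.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(text)), new DefaultHandler());
            lastError = null;
        } catch (SAXParseException ex) {
            lastError = "line " + ex.getLineNumber() + ", column " + ex.getColumnNumber()
                    + ": " + ex.getMessage();
        } catch (Exception ex) {
            lastError = ex.toString();
        }
    }

    /** Simulates typing text into an empty document; returns the resulting diagnostic. */
    static String errorAfterTyping(String text) {
        PlainDocument doc = new PlainDocument();
        RecheckListener listener = new RecheckListener();
        doc.addDocumentListener(listener);
        try {
            doc.insertString(0, text, null);
        } catch (BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
        return listener.lastError;
    }

    public static void main(String[] args) {
        System.out.println(errorAfterTyping("<a><b/></a>"));  // well-formed input
        System.out.println(errorAfterTyping("<a><b></a>"));   // mismatched end tag
    }
}
```

In a real editor the same listener would be attached to the text component's document, and the diagnostic would be shown in the UI instead of stored in a field.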

Slide 9: Error reporting in Xeditor

Slide 10: Basic Technicalities (2)
• JAXP is based on a Factory Design Pattern
  – Initialization of a Factory selects appropriate implementation of a parser, schema language, XSLT transformer, ... (implementation-independence, pluggability)
  –> extensibility by new schema languages
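As a small illustration of this pluggability, the sketch below asks JAXP whether a `SchemaFactory` implementation is available for a given schema language URI. The class name `SchemaLanguageProbe` and the method `isSupported` are hypothetical. A stock JDK is expected to provide W3C XML Schema only; support for other languages such as Relax NG depends on which implementations happen to be on the classpath, so the second printed line may differ between installations.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

/** Hypothetical probe: is a SchemaFactory implementation available for a language? */
final class SchemaLanguageProbe {
    static boolean isSupported(String schemaLanguageUri) {
        try {
            // The factory lookup picks whichever implementation is plugged in.
            SchemaFactory.newInstance(schemaLanguageUri);
            return true;
        } catch (IllegalArgumentException noImplementation) {
            // Thrown when no implementation for this schema language is found.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("XML Schema: " + isSupported(XMLConstants.W3C_XML_SCHEMA_NS_URI));
        System.out.println("Relax NG:   " + isSupported(XMLConstants.RELAXNG_NS_URI));
    }
}
```

An editor could use such a check to decide which schema languages to offer in its menu.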

Slide 11: Basic Technicalities (3)
• Well-formedness checking and DTD validation based on JAXP SAX interfaces: SAXParserFactory -> SAXParser
  – Validating/non-validating parser selected based on editing mode
  – SAX instead of DOM: relevant for efficiency
  – Only errors are observed; other events ignored
• Schema-based validation using JAXP Validators (later)
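A minimal sketch of the mode-dependent parser selection might look like this. The `ModeAwareChecker` class and its `Mode` enum are hypothetical names, not taken from Xeditor. One detail worth noting: `DefaultHandler` silently ignores validity errors unless `error` is overridden, so the handler below records them explicitly while still ignoring all content events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical checker: picks a validating or non-validating SAX parser per editing mode. */
final class ModeAwareChecker {
    enum Mode { WELL_FORMED, DTD_VALID }

    static List<String> check(String xml, Mode mode) {
        List<String> problems = new ArrayList<>();
        // Only error callbacks are observed; content events fall through to no-ops.
        DefaultHandler errorsOnly = new DefaultHandler() {
            @Override public void error(SAXParseException e) { problems.add(describe(e)); }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(mode == Mode.DTD_VALID);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), errorsOnly);
        } catch (SAXParseException e) {
            problems.add(describe(e));   // well-formedness (fatal) error
        } catch (Exception e) {
            problems.add(e.toString());  // configuration or I/O problem
        }
        return problems;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        String doc = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]><a/>";
        System.out.println(check(doc, Mode.WELL_FORMED)); // no problems: the text is well-formed
        System.out.println(check(doc, Mode.DTD_VALID));   // reports the missing <b/> child
    }
}
```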

Slide 12: Efficiency Concerns
• Rule of thumb: access via memory is much faster than via disk, which in turn is much faster than over a network
• Document passed to parser/validator as an in-memory stream
  –> normally no observable delays
  – Could cache external entities, like DTDs, too
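Both ideas on this slide can be illustrated with standard SAX facilities: the edited text is handed to the parser through a `StringReader`, and an entity resolver serves DTDs from memory instead of fetching them again. This is only a sketch under stated assumptions: the class `CachedDtdValidation`, its `cacheDtd` and `isValid` methods, and the placeholder URL `http://example.invalid/note.dtd` are all hypothetical, and how the cache gets filled is left open.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical validator that reads the document and its DTD from memory. */
final class CachedDtdValidation {
    /** System identifier -> DTD text; filling this cache is up to the application. */
    private final Map<String, String> dtdCache = new HashMap<>();

    void cacheDtd(String systemId, String dtdText) {
        dtdCache.put(systemId, dtdText);
    }

    boolean isValid(String xml) {
        boolean[] ok = {true};
        DefaultHandler handler = new DefaultHandler() {
            @Override public InputSource resolveEntity(String publicId, String systemId) {
                String cached = dtdCache.get(systemId);
                // Returning null lets the parser fetch the entity itself.
                return cached == null ? null : new InputSource(new StringReader(cached));
            }
            @Override public void error(SAXParseException e) { ok[0] = false; }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            // The edited text is passed as an in-memory character stream.
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            return false; // not well-formed, or the parser could not be set up
        }
        return ok[0];
    }

    public static void main(String[] args) {
        CachedDtdValidation v = new CachedDtdValidation();
        v.cacheDtd("http://example.invalid/note.dtd", "<!ELEMENT note (#PCDATA)>");
        String head = "<!DOCTYPE note SYSTEM \"http://example.invalid/note.dtd\">";
        System.out.println(v.isValid(head + "<note>hi</note>"));
        System.out.println(v.isValid(head + "<note><x/></note>"));
    }
}
```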

Slide 13: Different Schemas and Schema Languages
• A Validator is created when the user selects a schema

Slide 14: Validation against the Schema
• As already shown:
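With the standard `javax.xml.validation` API, creating a Validator once per selected schema and then validating the in-memory text on each edit could be sketched as below. The class name `SchemaSession` and the method `firstError` are hypothetical, and the example assumes W3C XML Schema because that is the language a stock JDK supports. When no `ErrorHandler` is set, the Validator throws on the first error, which is what the sketch relies on.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/** Hypothetical session object: one Validator per schema chosen by the user. */
final class SchemaSession {
    private final Validator validator;

    /** Compiles the schema once, when it is selected. */
    SchemaSession(String xsdText) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsdText)));
            validator = schema.newValidator();
        } catch (SAXException e) {
            throw new IllegalArgumentException("schema could not be compiled", e);
        }
    }

    /** Validates the current text; returns null if valid, else the first error message. */
    String firstError(String xml) {
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException | IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        SchemaSession session = new SchemaSession(
                "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                + "<xs:element name='n' type='xs:int'/></xs:schema>");
        System.out.println(session.firstError("<n>42</n>"));
        System.out.println(session.firstError("<n>forty-two</n>"));
    }
}
```

The Validator is reused across edits, so the cost of compiling the schema is paid only when the user switches schemas.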

Slide 15: Editing Schema Documents
• Public schema files for schema languages used for validation (as external resource files)
  – xsd.rng or XMLSchema.xsd for XML Schema
  – relaxng.rng for Relax NG
• Pro: uniformity; efficiency
• Con: imprecision (vs validating by appropriate schema compiler)

Slide 16: Editing DTDs
• How to check the non-XML syntax of DTDs easily?

Slide 17: Editing DTDs (2)
• Soln 1: Wrap the DTD in the prolog of a dummy doc
  – how to check stand-alone DTDs (“external subsets”)?
• Soln 2: Parse a fixed dummy doc that refers to the DTD:
  ]>
  – randomize “foo”, to avoid conflict with actual DTD
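One way to realize the second solution is sketched below. The exact shape of the dummy document is an assumption made for illustration: a DOCTYPE whose randomized root name is declared in the internal subset, with the edited DTD referenced as the external subset. The class `DtdChecker`, the method `firstProblem`, and the placeholder identifier `http://example.invalid/edited.dtd` are hypothetical; the DTD text is served from memory through an entity resolver.

```java
import java.io.StringReader;
import java.util.UUID;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical DTD checker: parses a dummy document that refers to the edited DTD. */
final class DtdChecker {
    /** Placeholder system identifier; it is never fetched over the network. */
    private static final String DTD_ID = "http://example.invalid/edited.dtd";

    /** Returns null if the DTD parses cleanly, else the first reported problem. */
    static String firstProblem(String dtdText) {
        // Randomized root name, so it is unlikely to clash with a declaration in the DTD.
        String root = "x" + UUID.randomUUID().toString().replace("-", "");
        String dummy = "<!DOCTYPE " + root + " SYSTEM \"" + DTD_ID + "\" "
                + "[<!ELEMENT " + root + " ANY>]><" + root + "/>";
        String[] problem = {null};
        DefaultHandler handler = new DefaultHandler() {
            @Override public InputSource resolveEntity(String publicId, String systemId) {
                // Serve the edited DTD text from memory as the external subset.
                return DTD_ID.equals(systemId) ? new InputSource(new StringReader(dtdText)) : null;
            }
            @Override public void error(SAXParseException e) {
                if (problem[0] == null) problem[0] = e.getMessage();
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(dummy)), handler);
        } catch (Exception e) {
            // Syntax errors in the DTD arrive here as fatal SAX parse exceptions.
            if (problem[0] == null) problem[0] = e.getMessage();
        }
        return problem[0];
    }

    public static void main(String[] args) {
        System.out.println(firstProblem("<!ELEMENT a (b*)>\n<!ELEMENT b EMPTY>"));
        System.out.println(firstProblem("<!ELEMENT a (b*>"));
    }
}
```

Because the dummy document is fixed apart from the root name, the edited DTD can be checked as a stand-alone external subset without modifying its text.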

Slide 18: Editing DTDs (3)
[Figure; recoverable labels: SYSTEM, PUBLIC]

Slide 19: Efficiency and Scalability
• Is brute-force re-validation too inefficient?
• No: delays normally unnoticeable
[Chart: times for validating XMLSchema.xsd]

Slide 20: Validating a Larger Instance
• Instance: an XML Schema of ~18,000 lines / 800 KB
• Approx. speed per second (chart values):
  – 28,000 lines, 1.2 MB
  – 52,000 lines, 2.3 MB
  – 180,000 lines, 8.1 MB

Slide 21: Conclusions
• Immediate validation against schemas in different languages (DTD, XML Schema, Relax NG) easy to support
• Efficiency of brute-force application sufficient for moderate-sized documents

Slide 22: Further Work
• Potential application: teaching of XML
• Practical enhancements
  – markup completion, highlighting etc.
• Incremental validation
  – to remove dependency on document length
  – How to apply/modify standard interfaces?
• Thank you! Questions? Comments?