SDPL 2004Notes 3: XML Processor Interfaces1 3.3 JAXP: Java API for XML Processing n How can applications use XML processors? –A Java-based answer: through.

Slides:



Advertisements
Similar presentations
J0 1 Marco Ronchetti - Web architectures – Laurea Specialistica in Informatica – Università di Trento Java XML parsing.
Advertisements

The Java Platform and XML Portable Code, Portable Data James Duncan Davidson Staff Engineer, Sun Microsystems, Inc.
Multi-Model Digital Video Library Professor: Michael Lyu Member: Jacky Ma Joan Chung Multi-Model Digital Video Library LYU9904 Multi-Model Digital Video.
XML Parsing Using Java APIs AIP Independence project Fall 2010.
SDPL 2002Notes 3: XML Processor Interfaces1 3.3 JAXP: Java API for XML Processing n How can applications use XML processors? –A Java-based answer: through.
SDPL 2003Notes 3: XML Processor Interfaces1 3.3 JAXP: Java API for XML Processing n How can applications use XML processors? –A Java-based answer: through.
SAX A parser for XML Documents. XML Parsers What is an XML parser? –Software that reads and parses XML –Passes data to the invoking application –The application.
JAXP Transformation Package and Xalan Extensions 黃立昇
Parsing XML into programming languages JAXP, DOM, SAX, JDOM/DOM4J, Xerces, Xalan, JAXB.
Xerces The Apache XML Project Yvonne Yao. Introduction Set of libraries that provides functionalities to parse XML documents Set of libraries that provides.
21-Jun-15 SAX (Abbreviated). 2 XML Parsers SAX and DOM are standards for XML parsers-- program APIs to read and interpret XML files DOM is a W3C standard.
26-Jun-15 SAX. SAX and DOM SAX and DOM are standards for XML parsers--program APIs to read and interpret XML files DOM is a W3C standard SAX is an ad-hoc.
MC365 XML Parsers. Today We Will Cover: An overview of the Java API’s used for XML processing Creating an XML document in Java Parsing an XML document.
JAX- Java APIs for XML by J. Pearce. Some XML Standards Basic –SAX (sequential access parser) –DOM (random access parser) –XSL (XSLT, XPATH) –DTD Schema.
XML: Java Dr Andy Evans. Java and XML Couple of things we might want to do: Parse/write data as XML. Load and save objects as XML. We’ll mainly discuss.
17 Apr 2002 XML Programming: TrAX Andy Clark. Java API for XML Processing Standard Java API for loading, creating, accessing, and transforming XML documents.
SDPL : (XML APIs) JAXP1 3.3 JAXP: Java API for XML Processing n How can applications use XML processors? –In Java: through JAXP –An overview of.
CSE 6331 © Leonidas Fegaras XML Tools1 XML Tools Leonidas Fegaras.
CSE 6331 © Leonidas Fegaras XML Tools1 XML Tools Leonidas Fegaras.
17 Apr 2002 XML Programming: JAXP Andy Clark. Java API for XML Processing Standard Java API for loading, creating, accessing, and transforming XML documents.
The Joy of SAX (and DOM, and JDOM…) Bill MacCartney 11 October 2004.
SDPL 2003Notes 3: XML Processor Interfaces1 3. XML Processor APIs n How can applications manipulate structured documents? –An overview of document parser.
1 Document Object Model (DOM) MV4920 – XML 24 September 2001 Simon R. Goerger MAJ, US Army
Structured-Document Processing Languages Spring 2011 Course Review Repetitio mater studiorum est!
Java API for XML Processing (JAXP) توسط : محمّدمهدي حامد استاد راهنما : دكتر مسعود رهگذر.
Structured-Document Processing Languages Spring 2005 Course Review Repetitio mater studiorum est!
Advanced Java Session 9 New York University School of Continuing and Professional Studies.
CSE 6331 © Leonidas Fegaras XML Tools1 XML Tools.
1 JAXP & XSLT. Objectives 2  TrAX API  Transforming XML Documents  Workshops.
3/29/2001 O'Reilly Java Java API for XML Processing 1.1 What’s New Edwin Goei Engineer, Sun Microsystems.
1 Java and XML Modified from presentation by: Barry Burd Drew University Portions © 2002 Hungry Minds, Inc.
EXtensible Markup Language (XML) James Atlas July 15, 2008.
SDPL 2002Notes 3: XML Processor Interfaces1 3. XML Processor APIs n How can applications manipulate structured documents? –An overview of document parser.
SDPL 20113: XML APIs and SAX1 3. XML Processor APIs n How can (Java) applications manipulate structured (XML) documents? –An overview of XML processor.
XML Parsers Overview  Types of parsers  Using XML parsers  SAX  DOM  DOM versus SAX  Products  Conclusion.
SAX. What is SAX SAX 1.0 was released on May 11, SAX is a common, event-based API for parsing XML documents Primarily a Java API but there implementations.
Beginning XML 4th Edition. Chapter 12: Simple API for XML (SAX)
1 XSLT An Introduction. 2 XSLT XSLT (extensible Stylesheet Language:Transformations) is a language primarily designed for transforming the structure of.
XML Processing in Java. Required tools Sun JDK 1.4, e.g.: JAXP (part of Java Web Services Developer Pack, already in Sun.
Java API for XML Processing (JAXP) Dr. Rebhi S. Baraka Advanced Topics in Information Technology (SICT 4310) Department of Computer.
SDPL 2002Notes 3.2: Document Object Model1 3.2 Document Object Model (DOM) n How to provide uniform access to structured documents in diverse applications.
Sheet 1XML Technology in E-Commerce 2001Lecture 3 XML Technology in E-Commerce Lecture 3 DOM and SAX.
CSE 6331 © Leonidas Fegaras XML Tools1 XML Tools.
XML Study-Session: Part III
Document Object Model DOM. Agenda l Introduction to DOM l Java API for XML Parsing (JAXP) l Installation and setup l Steps for DOM parsing l Example –Representing.
Java and XML. What is XML XML stands for eXtensible Markup Language. A markup language is used to provide information about a document. Tags are added.
COSC617 Project XML Tools Mark Liu Sanjay Srivastava Junping Zhang.
Structured-Document Processing Languages Spring 2007 Course Review Repetitio mater studiorum est!
Dom and XSLT Dom – document object model DOM – collection of nodes in a tree.
When we create.rtf document apart from saving the actual info the tool saves additional info like start of a paragraph, bold, size of the font.. Etc. This.
1 Introduction JAXP. Objectives  XML Parser  Parsing and Parsers  JAXP interfaces  Workshops 2.
Structured-Document Processing Languages Spring 2004 Course Review Repetitio mater studiorum est!
Web services. DOM parsing and SOAP.. Summary. ● Exercise: SAX-Based checkInvoice(), ● push parsing, ● event-based parsing, ● traversal order is depth-first.
SDPL 20063: XML Processor Interfaces1 3. XML Processor APIs n How can (Java) applications manipulate structured (XML) documents? –An overview of XML processor.
7-Mar-16 Simple API XML.  SAX and DOM are standards for XML parsers-- program APIs to read and interpret XML files  DOM is a W3C standard  SAX is an.
SDPL 2001Notes 3: XML Processor Interfaces1 3. XML Processor APIs n How applications can manipulate structured documents? –An overview of document parser.
1 Validation SAX-DOM. Objectives 2  Schema Validation Framework  XML Validation After Transformation  Workshops.
1 Introduction SAX. Objectives 2  Simple API for XML  Parsing an XML Document  Parsing Contents  Parsing Attributes  Processing Instructions  Skipped.
Java API for XML Processing
Lecture Transforming Data: Using Apache Xalan to apply XSLT transformations Marc Dumontier Blueprint Initiative Samuel Lunenfeld Research Institute.
{ XML Technologies } BY: DR. M’HAMED MATAOUI
XML Parsers Overview Types of parsers Using XML parsers SAX DOM
Parsing XML into programming languages
Using XML Tools CS551 – Fall 2001.
Java/XML.
{ XML Technologies } BY: DR. M’HAMED MATAOUI
XML Parsers Overview Types of parsers Using XML parsers SAX DOM
Java API for XML Processing
WaysInJavaToParseXML
Presentation transcript:

SDPL 2004Notes 3: XML Processor Interfaces1 3.3 JAXP: Java API for XML Processing n How can applications use XML processors? –A Java-based answer: through JAXP –An overview of the JAXP interface »What does it specify? »What can be done with it? »How do the JAXP components fit together? [Partly based on tutorial “An Overview of the APIs” at /3_apis.html, from which also some graphics are borrowed]

SDPL 2004Notes 3: XML Processor Interfaces2 JAXP 1.1 n An interface for “plugging-in” and using XML processors in Java applications –includes packages »org.xml.sax: SAX 2.0 interface »org.w3c.dom: DOM Level 2 interface »javax.xml.parsers: initialization and use of parsers »javax.xml.transform: initialization and use of transformers (XSLT processors) n Included in JDK since version 1.4

SDPL 2004Notes 3: XML Processor Interfaces3 JAXP 1.2 n Current version, since June 2002 n Adds property strings for controlling schema-based validation: – properties/schemaLanguage – properties/schemaSource

SDPL 2004Notes 3: XML Processor Interfaces4 JAXP: XML processor plugin (1) n Vendor-independent method for selecting processor implementation at run time –principally through system properties javax.xml.parsers.SAXParserFactory javax.xml.parsers.DocumentBuilderFactory javax.xml.transform.TransformerFactory –Set on command line (for example, to use Apache Xerces as the DOM implementation): java -D javax.xml.parsers.DocumentBuilderFactory= org.apache.xerces.jaxp.DocumentBuilderFactoryImpl

SDPL 2004Notes 3: XML Processor Interfaces5 JAXP: XML processor plugin (2) –Set during execution (  Saxon as the XSLT impl): System.setProperty( "javax.xml.transform.TransformerFactory", "com.icl.saxon.TransformerFactoryImpl"); n By default, reference implementations used –Apache Crimson/Xerces as the XML parser –Apache Xalan as the XSLT processor n Currently supported only by a few compliant XML processors: –Parsers: Apache Crimson and Xerces, Aelfred, Oracle XML Parser for Java –XSLT transformers: Apache Xalan, Saxon

SDPL 2004Notes 3: XML Processor Interfaces6 JAXP: Functionality n Parsing using SAX 2.0 or DOM Level 2 n Transformation using XSLT –(We’ll study XSLT in detail later) n Adds functionality missing from SAX 2.0 and DOM Level 2: –controlling validation and handling of parse errors »error handling can be controlled in SAX, by implementing ErrorHandler methods –loading and saving of DOM Document objects

SDPL 2004Notes 3: XML Processor Interfaces7 JAXP Parsing API Included in JAXP package javax.xml.parsers Included in JAXP package javax.xml.parsers Used for invoking and using SAX … SAXParserFactory spf = SAXParserFactory.newInstance(); Used for invoking and using SAX … SAXParserFactory spf = SAXParserFactory.newInstance(); and DOM parser implementations: DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

SDPL 2004Notes 3: XML Processor Interfaces8 XML.getXMLReader() JAXP: Using a SAX parser (1) f.xml.parse( ”f.xml”) ”f.xml”).newSAXParser()

SDPL 2004Notes 3: XML Processor Interfaces9 JAXP: Using a SAX parser (2) n We have already seen this: SAXParserFactory spf = SAXParserFactory.newInstance(); try { try { SAXParser saxParser = spf.newSAXParser(); SAXParser saxParser = spf.newSAXParser(); XMLReader xmlReader = saxParser.getXMLReader();... XMLReader xmlReader = saxParser.getXMLReader();... xmlReader.setContentHandler(handler); xmlReader.setContentHandler(handler); xmlReader.parse(fileName);... xmlReader.parse(fileName);... } catch (Exception e) { } catch (Exception e) {System.err.println(e.getMessage()); System.exit(1); };

SDPL 2004Notes 3: XML Processor Interfaces10 f.xml JAXP: Using a DOM parser (1).parse(”f.xml”).newDocument().newDocumentBuilder()

SDPL 2004Notes 3: XML Processor Interfaces11 JAXP: Using a DOM parser (2) n Parsing a file into a DOM Document: DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); DocumentBuilderFactory.newInstance(); try { // to get a new DocumentBuilder: DocumentBuilder builder = dbf.newDocumentBuilder(); Document domDoc = builder.parse(fileName); } catch (ParserConfigurationException e) { } catch (ParserConfigurationException e) {e.printStackTrace()); System.exit(1); };

SDPL 2004Notes 3: XML Processor Interfaces12 DOM building in JAXP XMLReader (SAXParser) XML ErrorHandler DTDHandler EntityResolver DocumentBuilder(ContentHandler) DOM Document DOM on top of SAX - So what?

SDPL 2004Notes 3: XML Processor Interfaces13 JAXP: Controlling parsing (1) n Errors of DOM parsing can be handled –by creating a SAX ErrorHandler » to implement error, fatalError and warning methods and passing it to the DocumentBuilder: builder.setErrorHandler(new myErrHandler()); domDoc = builder.parse(fileName); n Parser properties can be configured: –for both SAXParserFactories and DocumentBuilderFactories (before parser creation): factory.setValidating(true/false) factory.setNamespaceAware(true/false)

SDPL 2004Notes 3: XML Processor Interfaces14 JAXP: Controlling parsing (2) setIgnoringComments(true/false) setIgnoringElementContentWhitespace(true/false) setCoalescing(true/false) combine CDATA sections with surrounding text? combine CDATA sections with surrounding text? setExpandEntityReferences(true/false) n Further DocumentBuilderFactory configuration methods to control the form of the resulting DOM Document:

SDPL 2004Notes 3: XML Processor Interfaces15 JAXP Transformation API n earlier known as TrAX n Allows application to apply a Transformer to a Source document to get a Result document n Transformer can be created –from XSLT transformation instructions (to be discussed later) –without instructions »gives an identity transformation, which simply copies the Source to the Result

SDPL 2004Notes 3: XML Processor Interfaces16 XSLT JAXP: Using Transformers (1).newTransformer(…).transform(.,.) Source

SDPL 2004Notes 3: XML Processor Interfaces17 JAXP Transformation APIs javax.xml.transform: javax.xml.transform: –Classes Transformer and TransformerFactory ; initialization similar to parsers and parser factories n Transformation Source object can be –a DOM tree, a SAX XMLReader or an input stream n Transformation Result object can be –a DOM tree, a SAX ContentHandler or an output stream

SDPL 2004Notes 3: XML Processor Interfaces18 Source-Result combinations XMLReader (SAX Parser) Transformer DOM ContentHandler InputStream OutputStream DOMSourceResult

SDPL 2004Notes 3: XML Processor Interfaces19 JAXP Transformation Packages (2) n Classes to create Source and Result objects from DOM, SAX and I/O streams defined in packages –javax.xml.transform.dom, javax.xml.transform.sax, and javax.xml.transform.stream n Identity transformation to an output stream is a vendor-neutral way to serialize DOM documents (and the only option in JAXP) –“I would recommend using the JAXP interfaces until the DOM’s own load/save module becomes available” »Joe Kesselman, IBM & W3C DOM WG

SDPL 2004Notes 3: XML Processor Interfaces20 Serializing a DOM Document as XML text n By an identity transformation to an output stream: TransformerFactory tFactory = TransformerFactory.newInstance(); // Create an identity transformer: Transformer transformer = tFactory.newTransformer(); DOMSource source = new DOMSource(myDOMdoc); StreamResult result = new StreamResult(System.out); transformer.transform(source, result);

SDPL 2004Notes 3: XML Processor Interfaces21 Controlling the form of the result? We could specify the requested form of the result by an XSLT script, say, in file saveSpecSrc.xslt : We could specify the requested form of the result by an XSLT script, say, in file saveSpecSrc.xslt : <xsl:output encoding="ISO " indent="yes" <xsl:output encoding="ISO " indent="yes" doctype-system="reglist.dtd" /> doctype-system="reglist.dtd" /> </xsl:transform>

SDPL 2004Notes 3: XML Processor Interfaces22 Creating an XSLT Transformer n Then create a tailored transfomer: StreamSource saveSpecSrc = new StreamSource( new File(”saveSpec.xslt”) ); Transformer transformer = tFactory.newTransformer(saveSpecSrc); // and use it to transform a Source to a Result, // as before NB: Transformation instructions could be given also, say, as a DOMSource NB: Transformation instructions could be given also, say, as a DOMSource

SDPL 2004Notes 3: XML Processor Interfaces23 Other Java APIs for XML n JDOM –a Java-specific variant of W3C DOM – DOM4J ( ) DOM4J ( ) –roughly similar to JDOM; richer set of features: –powerful navigation with integrated XPath support n JAXB (Java Architecture for XML Binding) –compiles DTDs to DTD-specific classes for reading, manipulating and writing valid documents –

SDPL 2004Notes 3: XML Processor Interfaces24 Why then stick to DOM? n Other document models might be more convenient to use, but … “The DOM offers not only the ability to move between languages with minimal relearning, but to move between multiple implementations in a single language – which a specific set of classes such as JDOM can’t support” “The DOM offers not only the ability to move between languages with minimal relearning, but to move between multiple implementations in a single language – which a specific set of classes such as JDOM can’t support” »J. Kesselman, IBM & W3C DOM WG

SDPL 2004Notes 3: XML Processor Interfaces25 JAXP: Summary n An interface for using XML Processors –SAX/DOM parsers, XSLT transformers n Supports pluggability of XML processors n Defines means to control parsing, and handling of parse errors (through SAX ErrorHandlers) n Defines means to write out DOM Documents n Included in Java 2