Internet Technologies Java and XML (DOM and SAX) Some of the material for these slides came from the following sources: “XML a Manager’s Guide” by Kevin.

Presentation transcript:

Internet Technologies Java and XML (DOM and SAX) Some of the material for these slides came from the following sources: “XML a Manager’s Guide” by Kevin Dick “The XML Companion” by Bradley Java Documentation from Sun Microsystems “XML and Java” by Maruyama, Tamura and Uramoto On and Off the internet…

Internet Technologies Java and XML (DOM and SAX) Parser Operations with DOM and SAX overview Processing XML with SAX (locally and on the internet) Processing XML with DOM (locally and on the internet)

Internet Technologies FixedFloatSwap.xml

Internet Technologies FixedFloatSwap.dtd <!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >

Operation of a Tree-Based Parser [Diagram labels: XML Document, DTD, Tree-Based Parser, Valid XML, Document Tree, Application Logic]

Internet Technologies Tree Benefits Some data preparation tasks require early access to data that is further along in the document (e.g. we wish to extract titles to build a table of contents) New tree construction is easier (e.g. XSLT works from a tree to convert FpML to WML)

Internet Technologies Operation of an Event Based Parser Event-Based Parser Application Logic Valid XML DTD XML Document

Operation of an Event-Based Parser [Diagram labels: XML Document, DTD, Event-Based Parser, Valid XML, Application Logic] The parser reports events to the application through callbacks:
public void startDocument()
public void endDocument()
public void startElement(…)
public void endElement(…)
public void characters(…)
public void error(SAXParseException e) throws SAXException {
    System.out.println("\n\n--Invalid document ---" + e);
}

Internet Technologies Event-Driven Benefits We do not need the memory required for trees Parsing can be done faster with no tree construction going on

Internet Technologies XML API’s w/jaxpack

Internet Technologies Important SAX interfaces and classes class InputSource -- A single input source for an XML entity interface XMLReader -- defines parser behavior (implemented by Xerces’ SAXParser) Four core SAX2 handler interfaces: EntityResolver DTDHandler ContentHandler ErrorHandler Implemented by class DefaultHandler

Internet Technologies Processing XML with SAX interface XMLReader -- defines parser behavior (implemented by Xerces’ SAXParser) XMLReader is the interface that an XML parser's SAX2 driver must implement. This interface allows an application to set and query features and properties in the parser, to register event handlers for document processing, and to initiate a document parse.

Internet Technologies Processing XML with SAX We will look at the following interfaces and classes and then study an example interface ContentHandler -- reports on document events interface ErrorHandler – reports on validity errors class DefaultHandler – implements both of the above plus two others

Internet Technologies public interface ContentHandler Receive notification of general document events. This is the main interface that most SAX applications implement: if the application needs to be informed of basic parsing events, it implements this interface and registers an instance with the SAX parser using the setContentHandler method. The parser uses the instance to report basic document-related events like the start and end of elements and character data.

Some methods from the ContentHandler interface:
void characters(…): Receive notification of character data.
void endDocument(): Receive notification of the end of a document.
void endElement(…): Receive notification of the end of an element.
void startDocument(): Receive notification of the beginning of a document.
void startElement(…): Receive notification of the beginning of an element.
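The callbacks listed above can be seen in action with a small handler that records each event. This is a minimal sketch, not the slides' own example: the class name EventLogger and its log helper are hypothetical, and it uses the JAXP SAXParserFactory rather than naming a Xerces class directly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each ContentHandler callback as a string.
class EventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument"); }
    @Override public void endDocument() { events.add("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // qName is used because a default JAXP parser is not namespace aware,
        // in which case localName may be empty.
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    // Parses an in-memory document and returns the recorded events in order.
    static List<String> log(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            EventLogger handler = new EventLogger();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(log("<FixedFloatSwap><NumYears>3</NumYears></FixedFloatSwap>"));
    }
}
```

The events arrive in document order, so every start event is eventually matched by an end event for the same element.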

public interface ErrorHandler Basic interface for SAX error handlers. If a SAX application needs to implement customized error handling, it must implement this interface and then register an instance with the SAX parser. The parser will then report all errors and warnings through this interface. For XML processing errors, a SAX driver must use this interface instead of throwing an exception: it is up to the application to decide whether to throw an exception for different types of errors and warnings. Note, however, that there is no requirement that the parser continue to provide useful information after a call to fatalError.

Internet Technologies public interface ErrorHandler Some methods are: void error(SAXParseException exception) Receive notification of a recoverable error. void fatalError(SAXParseException exception) Receive notification of a non-recoverable error. void warning(SAXParseException exception) Receive notification of a warning.
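As an illustration of the three ErrorHandler methods, the sketch below collects validity errors instead of printing them. The class name CollectingErrorHandler, the validate helper, and the tiny inline DTD are assumptions made for this example; the exact wording and number of messages depend on the parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical handler: keeps recoverable (validity) errors in a list.
class CollectingErrorHandler implements ErrorHandler {
    final List<String> errors = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        System.err.println("warning: " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        // Validity errors are recoverable, so parsing continues after this call.
        errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        throw e; // not well formed: give up
    }

    // Validates an in-memory document against its own DTD.
    static List<String> validate(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            CollectingErrorHandler handler = new CollectingErrorHandler();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.errors;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE FixedFloatSwap [<!ELEMENT FixedFloatSwap (NumYears)>"
                   + "<!ELEMENT NumYears (#PCDATA)>]>";
        System.out.println(validate(dtd + "<FixedFloatSwap><NumYears>3</NumYears></FixedFloatSwap>"));
        System.out.println(validate(dtd + "<FixedFloatSwap><Bogus/></FixedFloatSwap>"));
    }
}
```

Without a registered ErrorHandler, validity errors may go unnoticed by the application, which is why the examples in these slides register one.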

Internet Technologies public class DefaultHandler extends java.lang.Object implements EntityResolver, DTDHandler, ContentHandler, ErrorHandler Default base class for handlers. This class implements the default behaviour for four SAX interfaces: EntityResolver, DTDHandler, ContentHandler, and ErrorHandler.

Internet Technologies <!ELEMENT FixedFloatSwap ( Bank, Notional, Fixed_Rate, NumYears, NumPayments ) > FixedFloatSwap.dtd Input DTD

Internet Technologies <!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [ ] > &bankname; FixedFloatSwap.xml Input XML

Internet Technologies Processing // NotifyStr.java // Adapted from XML and Java by Maruyama, Tamura and // Uramoto import java.io.*; import org.xml.sax.*; import org.xml.sax.helpers.*; import javax.xml.parsers.*; public class NotifyStr extends DefaultHandler {

Internet Technologies public static void main (String argv []) throws IOException, SAXException { if (argv.length != 1) { System.err.println ("Usage: java NotifyStr filename.xml"); System.exit (1); } XMLReader reader = XMLReaderFactory.createXMLReader( "org.apache.xerces.parsers.SAXParser"); InputSource inputSource = new InputSource(argv[0]); reader.setContentHandler(new NotifyStr()); reader.parse(inputSource); System.exit (0); }

Internet Technologies public NotifyStr() {} public void startDocument() throws SAXException { System.out.println("startDocument called:"); } public void endDocument() throws SAXException { System.out.println("endDocument called:"); }

public void startElement(String namespaceURI, String localName,
                         String qName, Attributes aMap) throws SAXException {
    System.out.println("startElement called: element name =" + localName);
    // examine the attributes
    for (int i = 0; i < aMap.getLength(); i++) {
        String attName = aMap.getLocalName(i);
        String type = aMap.getType(i);
        String value = aMap.getValue(i);
        System.out.println(" attribute name = " + attName + " type = " + type + " value = " + value);
    }
}

Internet Technologies public void characters(char[] ch, int start, int length) throws SAXException { // build String from char array String dataFound = new String(ch,start,length); System.out.println("characters called:" + dataFound); } }
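A SAX parser is free to deliver the text of one element in several characters calls, and it also reports whitespace between elements. A common way to cope, sketched here with a hypothetical TextCollector class, is to buffer the characters and read the buffer in endElement.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: maps each leaf element name to its trimmed text.
class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    final Map<String, String> values = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buffer.setLength(0); // discard whitespace seen before this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // may be called more than once per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            values.put(qName, text);
        }
        buffer.setLength(0);
    }

    // Parses an in-memory document and returns element name -> text.
    static Map<String, String> collect(String xml) {
        try {
            TextCollector handler = new TextCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(collect(
                "<FixedFloatSwap>\n<Notional>100</Notional>\n<NumYears>3</NumYears>\n</FixedFloatSwap>"));
    }
}
```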

Internet Technologies C:\McCarthy\www\95-733\examples\sax>java NotifyStr FixedFloatSwap.xml startDocument called: startElement called: element name =FixedFloatSwap startElement called: element name =Bank characters called:Pittsburgh National Corporation startElement called: element name =Notional attribute name = currency type = dollars|pounds value = pounds characters called:100 startElement called: element name =Fixed_Rate characters called:5 startElement called: element name =NumYears characters called:3 startElement called: element name =NumPayments characters called:6 endDocument called: Output

Internet Technologies Accessing the swap from the internet <!DOCTYPE FixedFloatSwap [ ] > &bankname; Saved under webapps/sax/fpml/FixedFloatSwap.xml

Internet Technologies The Deployment Descriptor <!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN" " SaxExample GetXML SaxExample /GetXML/* webapps/sax/WEB-INF/web.xml

Internet Technologies // This servlet file is stored under Tomcat in // webapps/sax/WEB-INF/classes/GetXML.java // This servlet returns a user selected xml file from // webapps/sax/fpml directory // and returns it as a string to the client. import java.io.*; import java.util.*; import javax.servlet.*; import javax.servlet.http.*; public class GetXML extends HttpServlet { // Servlet

public void doGet(HttpServletRequest req, HttpServletResponse res)
        throws ServletException, IOException {
    System.out.println("doGet called with " + req.getPathInfo());
    String theData = "";
    String extraPath = req.getPathInfo();
    extraPath = extraPath.substring(1);
    // read the file
    try {
        // open the file and wrap it in a stream
        FileInputStream theFile = new FileInputStream(
            "D:\\jakarta-tomcat-4.0.1\\webapps\\sax\\fpml\\" + extraPath);

        InputStreamReader is = new InputStreamReader(theFile);
        BufferedReader br = new BufferedReader(is);
        // read the file into the string theData
        String thisLine;
        while ((thisLine = br.readLine()) != null) {
            theData += thisLine + "\n";
        }
    } catch (Exception e) {
        System.err.println("Error " + e);
    }

    PrintWriter out = res.getWriter();
    out.write(theData);
    System.out.println("Wrote document to client");
    //System.out.println(theData);
    out.close();
}
}

// TomcatNotifyStr.java (the client)
// Adapted from XML and Java by Maruyama, Tamura and Uramoto
import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import javax.xml.parsers.*;
public class TomcatNotifyStr extends DefaultHandler {
    public static void main(String argv[]) throws IOException, SAXException {
        if (argv.length != 1) {
            System.err.println("Usage: java TomcatNotifyStr filename.xml");
            System.exit(1);
        }

Internet Technologies XMLReader reader = XMLReaderFactory.createXMLReader( "org.apache.xerces.parsers.SAXParser"); String serverString = " String fileName = argv[0]; InputSource inputSource = new InputSource(serverString + fileName); reader.setContentHandler(new TomcatNotifyStr()); reader.parse(inputSource); System.exit (0); }

Internet Technologies public TomcatNotifyStr() {} public void startDocument() throws SAXException { System.out.println("startDocument called:"); } public void endDocument() throws SAXException { System.out.println("endDocument called:"); }

public void startElement(String namespaceURI, String localName,
                         String qName, Attributes aMap) throws SAXException {
    System.out.println("startElement called: element name =" + localName);
    // examine the attributes
    for (int i = 0; i < aMap.getLength(); i++) {
        String attName = aMap.getLocalName(i);
        String type = aMap.getType(i);
        String value = aMap.getValue(i);
        System.out.println(" attribute name = " + attName + " type = " + type + " value = " + value);
    }
}

Internet Technologies public void characters(char[] ch, int start, int length) throws SAXException { // build String from char array String dataFound = new String(ch,start,length); System.out.println("characters called:" + dataFound); } }

Internet Technologies Being served by the servlet <!DOCTYPE FixedFloatSwap [ ] > &bankname;

Internet Technologies C:\McCarthy\www\95-733\examples\sax>java TomcatNotifyStr FixedFloatSwap.xml startDocument called: startElement called: element name =FixedFloatSwap characters called: startElement called: element name =Bank characters called:Pittsburgh National Corporation characters called: startElement called: element name =Notional attribute name = currency type = CDATA value = pounds characters called:100 characters called: startElement called: element name =Fixed_Rate characters called:5 characters called: startElement called: element name =NumYears characters called:3 characters called: startElement called: element name =NumPayments characters called:6 characters called: endDocument called: Output

Internet Technologies Let’s Add Back the DTD… <!ELEMENT FixedFloatSwap ( Bank, Notional, Fixed_Rate, NumYears, NumPayments ) >

Internet Technologies And reference the DTD in the XML <!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [ ] > &bankname;

Internet Technologies We get new output How many times did we visit the servlet? Twice. Once for the xml and a second time for the DTD. C:\McCarthy\www\95-733\examples\sax>java TomcatNotifyStr FixedFloatSwap.xml startDocument called: startElement called: element name =FixedFloatSwap startElement called: element name =Bank characters called:Pittsburgh National Corporation startElement called: element name =Notional attribute name = currency type = dollars|pounds value = pounds characters called:100 startElement called: element name =Fixed_Rate characters called:5 startElement called: element name =NumYears characters called:3 startElement called: element name =NumPayments characters called:6 endDocument called:

Internet Technologies We don’t have to go through a servlet…Tomcat can send the files String serverString = " String fileName = argv[0]; InputSource is = new InputSource(serverString + fileName); But the servlet illustrates that the XML data can be generated dynamically.

The InputSource Class The SAX and DOM parsers need XML input. The "output" produced by these parsers amounts to a series of method calls (SAX) or an application programmer interface to the tree (DOM). An InputSource object can be used to provide input to the parser. [Diagram labels: InputSource, SAX or DOM, Tree, Events, application] So, how do we build an InputSource object?

The InputSource Class Some InputSource constructors:
InputSource(String systemId);   // e.g. a path or URL that locates the file
InputSource(InputStream byteStream);
InputSource(Reader characterStream);
For example:
String text = " some xml ";
StringReader sr = new StringReader(text);
InputSource is = new InputSource(sr);
:
myParser.parse(is);
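The constructors above can be tried without any files. The following sketch wraps the same text first in a character stream and then in a byte stream and hands each InputSource to a JAXP DocumentBuilder; the class InputSourceDemo and its helper methods are hypothetical names for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class InputSourceDemo {
    // Parses whatever the InputSource supplies and returns the root tag name.
    static String rootName(InputSource source) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(source).getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // InputSource(Reader characterStream)
    static String rootFromReader(String xml) {
        return rootName(new InputSource(new StringReader(xml)));
    }

    // InputSource(InputStream byteStream)
    static String rootFromBytes(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return rootName(new InputSource(new ByteArrayInputStream(bytes)));
    }

    public static void main(String[] args) {
        String text = "<FixedFloatSwap><NumYears>3</NumYears></FixedFloatSwap>";
        System.out.println(rootFromReader(text));
        System.out.println(rootFromBytes(text));
    }
}
```

The third form, InputSource(String systemId), takes a location rather than content, which is how the client in these slides points the parser at a URL.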

Internet Technologies But what about the DTD? public interface EntityResolver Basic interface for resolving entities. If a SAX application needs to implement customized handling for external entities, it must implement this interface and register an instance with the SAX parser using the parser's setEntityResolver method. The parser will then allow the application to intercept any external entities (including the external DTD subset and external parameter entities, if any) before including them.

EntityResolver
public InputSource resolveEntity(String publicId, String systemId) {
    // Add this method to the client above. The systemId String
    // holds the path to the dtd as specified in the xml document.
    // We may now access the dtd from a servlet and return an
    // InputSource, or return null and let the parser resolve the
    // external entity.
    System.out.println("Attempting to resolve" + " Public id :" + publicId + " System id :" + systemId);
    return null;
}
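To make the non-null case concrete, here is a sketch of a resolver that answers the request for FixedFloatSwap.dtd with a DTD held in memory, so no file or servlet is contacted. The class name InMemoryDtdResolver and the two-line DTD are assumptions made for this example, not the DTD from the slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical resolver: serves one known DTD from a string.
class InMemoryDtdResolver implements EntityResolver {
    // Simplified DTD invented for the example.
    static final String DTD =
            "<!ELEMENT FixedFloatSwap (NumYears)>\n<!ELEMENT NumYears (#PCDATA)>";

    final List<String> requested = new ArrayList<>();

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        requested.add(systemId);
        if (systemId != null && systemId.endsWith("FixedFloatSwap.dtd")) {
            return new InputSource(new StringReader(DTD));
        }
        return null; // let the parser resolve anything else itself
    }

    // Parses the document and returns the system ids the parser asked for.
    static List<String> parseAndReport(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            InMemoryDtdResolver resolver = new InMemoryDtdResolver();
            reader.setEntityResolver(resolver);
            reader.parse(new InputSource(new StringReader(xml)));
            return resolver.requested;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAndReport(
                "<!DOCTYPE FixedFloatSwap SYSTEM \"FixedFloatSwap.dtd\">"
              + "<FixedFloatSwap><NumYears>3</NumYears></FixedFloatSwap>"));
    }
}
```

The parser usually passes an absolute form of the system identifier, so the sketch matches on the end of the string rather than on the exact relative name.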

Processing XML with DOM The following examples were tested using Sun's JAXP (Java API for XML Parsing).

Internet Technologies XML DOM The World Wide Web Consortium’s Document Object Model Provides a common vocabulary to use in manipulating XML documents. May be used from C, Java, Perl, Python, or VB Things may be quite different “under the hood”. The interface to the document will be the same.

Internet Technologies I am The Cat in The Hat I am Little Cat A I am Little Cat B I am Little Cat C The XML File “cats.xml”

[Diagram: DOM tree for cats.xml. The document node (XML doc) has a doctype child and an element child, topcat, which is called the Document Element. Below it are element and text nodes: "I am the cat in the hat", Little cat A ("I am little cat A"), Little cat B ("I am little cat B"), Little Cat C ("I am little cat C"), Little cat D.]

Internet Technologies Agreement.xml

[Diagram: document (XML doc) with children doctype and FixedFloatSwap; FixedFloatSwap has children Notional, FixedRate, NumYears, NumPayments.] All of these nodes implement the Node interface.

Internet Technologies Operation of a Tree-based Parser Tree-Based Parser Application Logic Document Tree Valid XML DTD XML Document

Internet Technologies Some DOM Documentation from JavaSoft

Internet Technologies The Node Interface The Node interface is the primary datatype for the entire Document Object Model. It represents a single node in the document tree. While all objects implementing the Node interface expose methods for dealing with children, not all objects implementing the Node interface may have children. For example, Text nodes may not have children.

Internet Technologies Properties All Nodes have properties. Not all properties are needed by all types of nodes. The attribute property is an important part of the Element node but is null for the Text nodes. We access the properties through methods…

Internet Technologies Some Methods of Node Example Methods are: String getNodeName() – depends on the Node type if Element node return tag name if Text node return #text

Internet Technologies Some Methods of Node Example Methods are: short getNodeType() Might return a constant like ELEMENT_NODE or TEXT_NODE or …

Internet Technologies Some Methods of Node Example Methods are: String getNodeValue() if the Node is an Element Node then return ‘null’ if the Node is a Text Node then return a String representing that text.

Internet Technologies Some Methods of Node Example Methods are: Node getParentNode() returns a reference to the parent

Internet Technologies Some Methods of Node Example Methods are: public Node getFirstChild() Returns the value of the firstChild property.

Internet Technologies Some Methods of Node Example Methods are: public NodeList getChildNodes() returns a NodeList object NodeList is an interface and not a Node.
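The Node methods described on the last few slides can be compared side by side on an element and its text child. This is a small sketch with hypothetical names (NodeInspector, describe); the numeric node types are the DOM constants ELEMENT_NODE (1) and TEXT_NODE (3).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeInspector {
    // Builds a DOM tree from an in-memory document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Summarises the name, type and value properties of a node.
    static String describe(Node n) {
        return n.getNodeName() + " type=" + n.getNodeType() + " value=" + n.getNodeValue();
    }

    public static void main(String[] args) {
        Element top = parse("<NumYears>3</NumYears>").getDocumentElement();
        Node text = top.getFirstChild();
        System.out.println(describe(top));                 // NumYears type=1 value=null
        System.out.println(describe(text));                // #text type=3 value=3
        System.out.println(text.getParentNode() == top);   // true
        System.out.println(top.getChildNodes().getLength()); // 1
    }
}
```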

Internet Technologies The NodeList Interface The NodeList interface provides the abstraction of an ordered collection of nodes, without defining or constraining how this collection is implemented. The items in the NodeList are accessible via an integral index, starting from 0.

Internet Technologies There are only two methods of the NodeList Interface public Node item(int index) Returns the item at index in the collection. If index is greater than or equal to the number of nodes in the list, this returns null.

Internet Technologies There are only two methods of the NodeList Interface public int getLength() Returns the value of the length property.
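Since item and getLength are the whole NodeList interface, walking a list is an ordinary indexed loop. The sketch below, using the hypothetical class ChildLister, lists the element children of the root and skips the whitespace text nodes that sit between them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildLister {
    // Returns the names of the root element's element children, in order.
    static List<String> elementChildNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList kids = doc.getDocumentElement().getChildNodes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < kids.getLength(); i++) {
                Node kid = kids.item(i);
                // The list also holds Text nodes for the line breaks between tags.
                if (kid.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(kid.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementChildNames(
                "<FixedFloatSwap>\n<Notional>100</Notional>\n<NumYears>3</NumYears>\n</FixedFloatSwap>"));
    }
}
```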

Internet Technologies The Element Interface public interface Element extends Node By far the vast majority of objects (apart from text) that authors encounter when traversing a document are Element nodes. Inheritance Nothing prevents us from extending one interface in order to create another. Those who implement Element just have more promises to keep.

Internet Technologies The Element Interface public interface Element extends Node Some methods in the Element interface String getAttribute(String name) Retrieves an attribute value by name.

Internet Technologies The Element Interface public interface Element extends Node Some methods in the Element interface public String getTagName() Returns the value of the tagName property.

The Element Interface public interface Element extends Node Some methods in the Element interface public NodeList getElementsByTagName(String name) Returns a NodeList of all descendant elements with a given tag name, in the order in which they would be encountered in a preorder traversal of the Element tree.
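The three Element methods shown on these slides combine naturally: find an element by tag name, then read its attribute. The sketch below assumes a swap document whose Notional element carries a currency attribute, as the SAX output earlier suggests; the class SwapReader is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class SwapReader {
    // Returns "tagName:currency" for the first Notional element in the document.
    static String notionalCurrency(String xml) {
        try {
            Element top = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            // getElementsByTagName searches the descendants of top.
            Element notional = (Element) top.getElementsByTagName("Notional").item(0);
            return notional.getTagName() + ":" + notional.getAttribute("currency");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(notionalCurrency(
                "<FixedFloatSwap><Notional currency=\"pounds\">100</Notional></FixedFloatSwap>"));
    }
}
```

If the attribute is absent and has no default, getAttribute returns an empty string rather than null.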

Internet Technologies The CharacterData Interface public interface CharacterData extends Node The CharacterData interface extends Node with a set of attributes and methods for accessing character data in the DOM. For clarity this set is defined here rather than on each object that uses these attributes and methods. No DOM objects correspond directly to CharacterData, though Text and others do inherit the interface from it. All offsets in this interface start from 0.

The CharacterData Interface public interface CharacterData extends Node An example method: public String getData() Returns the value of the character data of the node that implements this interface. The Text interface extends CharacterData. public void setData(String data) is also available.
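A short sketch of getData and setData on a Text node follows. The class name TextEditor and its helper are hypothetical, and the example assumes the element's first child is a Text node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class TextEditor {
    // Replaces the root element's text and returns "old->new".
    static String replaceText(String xml, String newText) {
        try {
            Element top = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            Text text = (Text) top.getFirstChild(); // assumes a leading text child
            String old = text.getData();
            text.setData(newText);
            // getNodeValue on a Text node reports the same character data.
            return old + "->" + top.getFirstChild().getNodeValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(replaceText("<NumYears>3</NumYears>", "5"));
    }
}
```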

Internet Technologies The Document Interface public interface Document extends Node The Document interface represents the entire HTML or XML document. Conceptually, it is the root of the document tree, and provides the primary access to the document's data.

Internet Technologies The Document Interface public interface Document extends Node Some methods: public Element getDocumentElement() Returns the value of the documentElement property. This is a convenience attribute that allows direct access to the child node that is the root element of the document. For HTML documents, this is the element with the tagName "HTML".

The Document Interface Some methods: public NodeList getElementsByTagName(String tagname) Returns a NodeList of all the Elements with a given tag name in the order in which they would be encountered in a preorder traversal of the Document tree. Parameters: tagname - The name of the tag to match on. The special value "*" matches all tags. Returns: A new NodeList object containing all the matched Elements.
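The preorder ordering and the "*" wildcard are easiest to see on a small nested document. In this sketch the class PreorderNames and its helper are hypothetical; note that calling the method on the Document includes the root element, whereas calling it on an Element returns only that element's descendants.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PreorderNames {
    // Returns the tag names of every element in the document, in preorder.
    static List<String> allTags(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList all = doc.getElementsByTagName("*");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                names.add(all.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(allTags("<a><b><c/></b><d/></a>")); // [a, b, c, d]
    }
}
```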

Internet Technologies FixedFloatSwap.xml

Internet Technologies document XML doc doctype FixedFloatSwap Notional FixedRate NumYearsNumPayments FixedFloatSwap.xml

Internet Technologies An Example import java.io.File; import org.w3c.dom.*; import javax.xml.parsers.DocumentBuilderFactory; import javax.xml.parsers.DocumentBuilder; import org.xml.sax.SAXException; import org.xml.sax.SAXParseException; Process a local file

Internet Technologies public class Simulator3 { public static void main(String argv[]) { Document doc; if(argv.length != 1 ) { System.err.println("usage: java Simulator3 documentname"); System.exit(1); } try { DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance(); DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();

Internet Technologies doc = docBuilder.parse(new File(argv[0])); Element top = doc.getDocumentElement(); top.normalize(); // concatenate adjacent text nodes NodeList elementList = top.getElementsByTagName("*"); int listLength = elementList.getLength(); for(int i = 0; i < listLength; i++) { Element e = (Element)elementList.item(i); System.out.print(e.getNodeName()); Text t = (Text)e.getFirstChild(); System.out.println(t.getNodeValue()); }

        } catch (SAXParseException err) {
            System.out.println("Parsing error" + ", line " + err.getLineNumber() + ", URI " + err.getSystemId());
            System.out.println(" " + err.getMessage());
        } catch (SAXException e) {
            Exception x = e.getException();
            ((x == null) ? e : x).printStackTrace();
        } catch (Throwable t) {
            t.printStackTrace();
        }
        System.exit(0);
    }
}

Internet Technologies FixedFloatSwap.xml

Internet Technologies Output Notional100 Fixed_Rate5 NumYears3 NumPayments6

Internet Technologies Another DOM Example The program then displays the DOM tree. A Java Program that reads FixedFloatSwap.xml from Tomcat and performs validation against the server based DTD.

Internet Technologies import java.net.*; import java.io.*; import org.w3c.dom.*; import javax.xml.parsers.DocumentBuilderFactory; import javax.xml.parsers.DocumentBuilder; import org.xml.sax.*; public class Simulator6 { public static void main(String argv[]) { try { DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance(); docBuilderFactory.setValidating(true); docBuilderFactory.setNamespaceAware(true); Process a file on the internet.

Internet Technologies DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder(); docBuilder.setErrorHandler( new org.xml.sax.ErrorHandler() { public void fatalError(SAXParseException e) throws SAXException { System.out.println("Fatal error"); // an exception will be thrown by SAX } public void error(SAXParseException e) throws SAXParseException { System.out.println("Validity error"); throw e; } Register our own event handler

        public void warning(SAXParseException err) throws SAXParseException {
            System.out.println("** Warning" + ", line " + err.getLineNumber() + ", uri " + err.getSystemId());
            System.out.println(" " + err.getMessage());
            throw err;
        }
    } // end of the anonymous ErrorHandler class
);
public interface ErrorHandler Basic interface for SAX error handlers. If a SAX application needs to implement customized error handling, it must implement this interface and then register an instance with the SAX parser using the parser's setErrorHandler method. The parser will then report all errors and warnings through this interface. The parser shall use this interface instead of throwing an exception: it is up to the application whether to throw an exception for different types of errors and warnings. Note, however, that there is no requirement that the parser continue to provide useful information after a call to fatalError (in other words, a SAX driver class could catch an exception and report a fatalError).

InputSource: a single input source for an XML entity. The URL names Tomcat's port, and the file sits under webapps/ROOT/fpml.

                // The URL of FixedFloatSwap.xml on the Tomcat server is elided here.
                InputSource is = new InputSource("...");
                Document doc = docBuilder.parse(is);
                System.out.println("No Problems found");
                // Let's print the tree
                TreePrinter tp = new TreePrinter(doc);
                tp.print();
            }

            catch (SAXParseException err) {
                // Raised by our error() or warning() handler, or by the parser after a fatal error
                System.out.println("Catching raised exception");
                System.out.println("Parsing error" + ", line " + err.getLineNumber()
                                   + ", URI " + err.getSystemId());
                System.out.println(" " + err.getMessage());
            }
            catch (SAXException e) {
                System.out.println("Catch clause 2");
                // Print the wrapped exception if there is one, otherwise the SAXException itself
                Exception x = e.getException();
                ((x == null) ? e : x).printStackTrace();
            }
            catch (Throwable t) {
                System.out.println("Catch clause 3");
                t.printStackTrace();
            }
            System.exit(0);
        }
    }
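
The error-handler contract quoted above can be tried without a Tomcat server. The following is a minimal sketch, not part of the original program: the class name ValidationSketch, the CollectingHandler helper, and the tiny inline DTD are all assumptions made for illustration (the real FixedFloatSwap DTD is not shown in these slides). Instead of throwing from error() and warning(), this handler records each report and lets the parse continue.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidationSketch {

    // Hypothetical inline DTD, invented for this sketch only.
    static final String DTD =
        "<!DOCTYPE FixedFloatSwap ["
        + "<!ELEMENT FixedFloatSwap (Notional)>"
        + "<!ELEMENT Notional (#PCDATA)>"
        + "]>";

    public static final String VALID =
        DTD + "<FixedFloatSwap><Notional>100</Notional></FixedFloatSwap>";
    public static final String INVALID =
        DTD + "<FixedFloatSwap><Bogus/></FixedFloatSwap>";

    // Hypothetical handler: collects warnings and validity errors instead of throwing.
    static class CollectingHandler implements ErrorHandler {
        final List<String> problems = new ArrayList<>();

        public void warning(SAXParseException e) {
            problems.add("warning: " + e.getMessage());
        }

        public void error(SAXParseException e) {
            problems.add("error: " + e.getMessage());
        }

        // Well-formedness errors are not recoverable, so rethrow them.
        public void fatalError(SAXParseException e) throws SAXParseException {
            throw e;
        }
    }

    // Parses the given XML text with validation on and returns the recorded problems.
    public static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        CollectingHandler handler = new CollectingHandler();
        builder.setErrorHandler(handler);
        builder.parse(new InputSource(new StringReader(xml)));
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid document problems: " + validate(VALID).size());
        for (String p : validate(INVALID)) {
            System.out.println(p);
        }
    }
}
```

Collecting reports rather than throwing is only one design choice. Simulator6 throws from error() and warning(), which stops the parse at the first problem.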

A TreePrinter Class

    import org.w3c.dom.*;

    public class TreePrinter {
        private Document doc;
        private int currentIndent;

        public TreePrinter(Document d) {
            currentIndent = 2;
            doc = d;
        }

        public void print() {
            privatePrint(doc, currentIndent);
        }

[Figure: DOM tree for FixedFloatSwap.xml. The document node (XML doc) has a doctype child and a FixedFloatSwap element, which contains Notional, FixedRate, NumYears, and NumPayments.]

        public void privatePrint(Node n, int indent) {
            for (int i = 0; i < indent; i++) System.out.print(" ");
            switch (n.getNodeType()) {
                // Print information as each node type is encountered.
                // Case labels use Node.X: a constant reached through the variable n
                // is not a constant expression, and current compilers reject it.
                case Node.DOCUMENT_NODE:
                    System.out.println(n.getNodeName() + "...Document Node");
                    break;
                case Node.ELEMENT_NODE:
                    System.out.println(n.getNodeName() + "...Element Node");
                    break;
                case Node.TEXT_NODE:
                    System.out.println(n.getNodeName() + "...Text Node");
                    break;
                case Node.CDATA_SECTION_NODE:
                    System.out.println(n.getNodeName() + "...CDATA Node");
                    break;
                case Node.PROCESSING_INSTRUCTION_NODE:
                    System.out.println(" " + "...PI Node");
                    break;

                case Node.COMMENT_NODE:
                    System.out.println(" " + "...Comment node");
                    break;
                case Node.ENTITY_NODE:
                    System.out.println("ENTITY " + n.getNodeName() + "...Entity Node");
                    break;
                case Node.ENTITY_REFERENCE_NODE:
                    System.out.println("&" + n.getNodeName() + ";" + "...Entity Reference Node");
                    break;
                case Node.DOCUMENT_TYPE_NODE:
                    System.out.println("DOCTYPE" + n.getNodeName() + "...Document Type Node");
                    break;
                default:
                    System.out.println("?" + n.getNodeName());
            }
            // Recurse over the children, indenting one level deeper
            Node child = n.getFirstChild();
            while (child != null) {
                privatePrint(child, indent + currentIndent);
                child = child.getNextSibling();
            }
        }
    }

Output

    C:\McCarthy\www\Financial Engineering\FixedFloatSwap>java Simulator6
    No Problems found
    #document...Document Node
    DOCTYPEFixedFloatSwap...Document Type Node
    FixedFloatSwap...Element Node
    #text...Text Node
    Notional...Element Node
    #text...Text Node
    Fixed_Rate...Element Node
    #text...Text Node
    NumYears...Element Node
    #text...Text Node
    NumPayments...Element Node
    #text...Text Node
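
The recursive walk used by TreePrinter (visit a node, then follow getFirstChild and getNextSibling one level deeper) can be seen in isolation on a small in-memory document. This is a sketch under stated assumptions: the class OutlineSketch and its methods are hypothetical names, the two-element document is invented for illustration, and the outline is built in a StringBuilder rather than printed directly so that the indentation is easy to inspect.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class OutlineSketch {

    // Appends one line per node, indented by depth, then recurses over the children.
    public static void outline(Node n, int indent, StringBuilder out) {
        for (int i = 0; i < indent; i++) {
            out.append(' ');
        }
        out.append(n.getNodeName()).append('\n');
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            outline(c, indent + 2, out);
        }
    }

    // Parses XML text (no validation) and returns its indented outline.
    public static String outlineOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        outline(doc, 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // A cut-down, made-up document using element names from the slides
        System.out.print(outlineOf(
            "<FixedFloatSwap><Notional>100</Notional></FixedFloatSwap>"));
    }
}
```

For this input the outline has four lines: #document, then FixedFloatSwap, Notional, and #text, each indented two spaces further than its parent.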

Building a DOM Tree From Scratch

MyGradeBook.xml

    <GradeBook>
      <Student>
        <Score>100</Score>
      </Student>
    </GradeBook>

Let's create this file from within a Java program.

GOAL

    C:\McCarthy\www\95-733\examples\dom>java DomExample
    C:\McCarthy\www\95-733\examples\dom>type MyGradeBook.xml

The file that is typed out holds the GradeBook, Student, and Score elements wrapped around the text 100.

    // DomExample.java
    // Building an xml document from scratch
    import java.io.*;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.ParserConfigurationException;
    import org.w3c.dom.*;
    import org.apache.xml.serialize.XMLSerializer;  // not standard
    import org.apache.xml.serialize.OutputFormat;   // not standard

    public class DomExample {
        private Document document;

        public DomExample() {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            try {
                DocumentBuilder builder = factory.newDocumentBuilder();
                document = builder.newDocument();
            }
            catch (Throwable t) {
                t.printStackTrace();
            }

            // Ask the Document object for various types of nodes
            // and add them to the tree.
            Element root = document.createElement("GradeBook");
            document.appendChild(root);
            Element student = document.createElement("Student");
            root.appendChild(student);
            Element score = document.createElement("Score");
            student.appendChild(score);
            Text value = document.createTextNode("100");
            score.appendChild(value);

            // Write the Document to disk using Xerces.
            try {
                FileOutputStream fos = new FileOutputStream("MyGradeBook.xml");
                XMLSerializer xmlWriter = new XMLSerializer(fos, null);
                xmlWriter.serialize(document);
                fos.close();  // release the file once it is written
            }
            catch (IOException ioe) {
                ioe.printStackTrace();
            }
        }

        public static void main(String a[]) {
            // The constructor builds the tree and writes MyGradeBook.xml
            DomExample tree = new DomExample();
        }
    }
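
DomExample relies on the Xerces XMLSerializer, which the imports mark as "not standard". As an alternative sketch (not the approach used in these slides), the same tree can be written out with the identity transformer from the standard javax.xml.transform package. The class name GradeBookSketch and its method names are hypothetical, and the output goes to a String here so the result is easy to check.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class GradeBookSketch {

    // Builds the same GradeBook/Student/Score tree as DomExample.
    public static Document build() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = document.createElement("GradeBook");
        document.appendChild(root);
        Element student = document.createElement("Student");
        root.appendChild(student);
        Element score = document.createElement("Score");
        student.appendChild(score);
        score.appendChild(document.createTextNode("100"));
        return document;
    }

    // Serializes the document with an identity transform (no stylesheet).
    public static String serialize(Document document) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> declaration to keep the result short
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(build()));
    }
}
```

To write a file instead of a String, pass a StreamResult wrapping a java.io.File named MyGradeBook.xml to the same transform call.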