Published by Corey Thompson. Modified over 9 years ago.
1
SNU OOPSLA Lab. DOM/SAX Applications The ubiquitous XML(9) © copyright 2001 SNU OOPSLA Lab.
2
DOM/SAX Applications
- DOM
- SAX
- How to make an XML application?
Section: DOM
3
Contents of DOM
- What is the DOM?
- Java implementation
- Nodes
- Elements
- Attributes
- Node lists
4
What is DOM?
- DOM (Document Object Model) was developed by the W3C.
- It specifies how future Web browsers and embedded scripts should access HTML and XML documents.
5
Java implementation
- Sun provides a class for parsing XML, called XmlDocument.
- An XmlDocument method parses the XML file and builds the document tree.
- To use the Sun parser:
    import org.w3c.dom.*;
    import com.sun.xml.tree.*;
    import org.xml.sax.*;
6
Nodes (1/4)
- Nodes describe elements, text, comments, processing instructions, CDATA sections, entity references, and so on.
- The Node interface itself defines a number of methods, which cover three things:
  1. Each node has characteristics (type, name, value).
  2. Each node has a contextual location in the document tree.
  3. Each node can modify its contents.
7
Nodes (2/4): Node characteristics
- getNodeType: determines the node's type
- getNodeName: returns the name of the node
- setNodeValue: replaces the value of the node
- hasChildNodes: reports whether the node has children
- getAttributes: gives access to the node's attributes
8
Nodes (3/4): Node navigation
- When processing a document via the DOM interface, each node serves as a stepping stone to the nodes around it.
- Each node has methods that return references to surrounding nodes:
  - getParentNode()
  - getPreviousSibling()
  - getNextSibling()
  - getFirstChild()
  - getLastChild()
  - getChildNodes()
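The navigation methods can be tried out with a short sketch. It assumes the JAXP parser bundled with the JDK (DocumentBuilderFactory) rather than the Sun parser named on the earlier slide; the class name NodeNavigationSketch and the element names author, last-name, and first-name are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeNavigationSketch {
    /** Parses an XML string into a DOM Document using the JDK's built-in JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Walks the children left to right with getFirstChild() and getNextSibling(). */
    static String childNames(Node parent) {
        StringBuilder names = new StringBuilder();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (names.length() > 0) {
                names.append(',');
            }
            names.append(child.getNodeName());
        }
        return names.toString();
    }

    public static void main(String[] args) {
        // Hypothetical element names; no whitespace between tags, so there are no Text siblings.
        Document doc = parse(
                "<author><last-name>Shakespeare</last-name><first-name>William</first-name></author>");
        Node root = doc.getDocumentElement();
        System.out.println(childNames(root));
        System.out.println(root.getLastChild().getPreviousSibling().getNodeName());
        System.out.println(root.getFirstChild().getParentNode().getNodeName());
    }
}
```

Each call returns a plain Node (or null at the edge of the tree), which is why the loop can stop on a null sibling.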
9
Nodes (4/4): Node manipulation
- removeChild method
- appendChild method
- insertBefore method
- replaceChild method
Ex) [Figure: an old child being replaced by a new child]
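A minimal sketch of the four manipulation methods, again assuming the JDK's JAXP implementation to create an empty document. The class name and the element names root, a, b, and c are illustrative only. Note the argument order: insertBefore and replaceChild both take the new node first.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NodeManipulationSketch {
    /** Lists the names of a node's children in document order. */
    static String childNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            names.add(c.getNodeName());
        }
        return String.join(",", names);
    }

    /** Applies the four methods in turn and records the child list after each step. */
    static List<String> run() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        Element root = doc.createElement("root");
        doc.appendChild(root);
        List<String> steps = new ArrayList<>();

        Element b = doc.createElement("b");
        root.appendChild(b);              // adds b as the last child
        steps.add(childNames(root));

        Element a = doc.createElement("a");
        root.insertBefore(a, b);          // new node first, reference node second
        steps.add(childNames(root));

        Element c = doc.createElement("c");
        root.replaceChild(c, b);          // new child first, old child second
        steps.add(childNames(root));

        root.removeChild(a);              // detaches a from the tree
        steps.add(childNames(root));
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```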
10
Documents
- An entire XML document is represented by a special type of node.
  - getDoctype
  - getImplementation
  - getDocumentElement
  - getElementsByTagName
11
Elements
- The Element interface extends the Node interface and adds element-specific functionality.
- General element processing:
  - getTagName method
  - getElementsByTagName method
  - normalize method
Ex) [Figure: normalize merging adjacent pieces of text, "Here is some text", into a single text node, “here is some text”]
12
Attributes
- Attribute characteristics:
  - getName method
  - getValue
  - setValue
  - getSpecified
- Creating an attribute:
  - createAttribute
13
Node lists
- The NodeList interface contains two methods:
    Node item(int index);
    int getLength();
[Figure: a list holding node 0, node 1, and node 2; getLength() returns 3 and item(1) returns node 1]
14
Named node maps
- The NamedNodeMap interface is designed to contain nodes, in no particular order, that can be accessed by name.
[Figure: a map holding the nodes Lang, ID, Security, and Added; getLength() returns 4. Calls shown: item(1), getNamedItem(“Security”), setNamedItem(O), removeNamedItem(“Added”)]
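The two collection interfaces can be compared side by side in a small sketch. It assumes the JDK's JAXP parser; the attribute names Lang, ID, Security, and Added come from the figure, while the element names memo and para and all attribute values are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeCollectionsSketch {
    /** Parses an XML string into a DOM Document using the JDK's built-in JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<memo Lang=\"en\" ID=\"m1\" Security=\"secret\" Added=\"2001\">"
                + "<para/><para/><para/></memo>");

        // NodeList: ordered, indexed from zero.
        NodeList paras = doc.getElementsByTagName("para");
        System.out.println(paras.getLength());
        System.out.println(paras.item(1).getNodeName());

        // NamedNodeMap: no guaranteed order, so look entries up by name.
        NamedNodeMap attrs = doc.getDocumentElement().getAttributes();
        System.out.println(attrs.getLength());
        System.out.println(attrs.getNamedItem("Security").getNodeValue());
        attrs.removeNamedItem("Added");
        System.out.println(attrs.getLength());
    }
}
```

item(int) exists on NamedNodeMap too, but since the order is unspecified it is mainly useful for looping over every entry.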
15
DOM/SAX Applications
- DOM
- SAX
- How to make an XML application?
Section: SAX
16
Contents of SAX
- What is SAX?
- Call-backs and interfaces
- The Parser
- Document handlers
- Attribute lists
- Error handlers
- Locators
- Handler bases
17
What is SAX?
- SAX (the Simple API for XML) is a standard API for event-driven processing of XML data.
- It allows parsers to deliver information to applications in digestible chunks.
18
Call-backs and interfaces
The SAX interfaces are:
- Parser
- DocumentHandler
- AttributeList
- ErrorHandler
- EntityResolver
- Locator
- DTDHandler
19
The Parser
The work of the parser:
- The parser developer creates a class that actually parses the XML document or data stream.
- The parser reads the XML source data.
- It stops reading when it encounters a meaningful object.
- It sends the information to the main application by calling an appropriate method.
- It waits for this method to return before continuing.
20
Document handlers
- In order for the application to receive basic markup events from the parser, the application developer must create a class that implements the DocumentHandler interface.
[Figure: the application creates a DocumentHandler and gives it to the parser; during parsing, the parser feeds events back to the handler: startDocument(), startElement(), characters(), endElement(), endDocument(), ...]
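The create / give / call-back cycle can be sketched as follows. This is not the SAX 1 DocumentHandler from the slide: it substitutes SAX 2's DefaultHandler and the JDK's SAXParserFactory, which ship with current JDKs, so startElement and endElement carry extra namespace parameters. The class name EventLogSketch and the sample element title are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventLogSketch extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        events.add("startElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser is allowed to deliver one run of text in several calls.
        events.add("characters " + new String(ch, start, length));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    /** The application creates the handler, gives it to the parser, and collects the events. */
    static List<String> log(String xml) {
        EventLogSketch handler = new EventLogSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        log("<title>Sonnet 130</title>").forEach(System.out::println);
    }
}
```

The parser blocks inside each call-back until it returns, so the list is filled in document order.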
21
Attribute lists
- AttributeList is a wrapper object for all attribute details.
- int getLength(); ... to ascertain how many attributes are present
- String getName(int i); ... to discover the name of one of the attributes
- String getType(int i); String getType(String name); ... when a DTD is in use, to get the data type assigned to each attribute
- String getValue(int i); String getValue(String name); ... to get the value of an attribute
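A hedged sketch of reading attributes inside a start-element call-back. It uses SAX 2's Attributes interface, the successor to AttributeList, where getQName(int) plays the role of getName(int); the class name, the element sonnet, and the attribute type are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AttributeDumpSketch extends DefaultHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        for (int i = 0; i < attrs.getLength(); i++) {
            String name = attrs.getQName(i);          // access by index
            String value = attrs.getValue(name);      // access by name
            // With no DTD declaration for the attribute, SAX reports the type "CDATA".
            seen.add(name + "=" + value + " [" + attrs.getType(i) + "]");
        }
    }

    /** Parses the XML string and returns one entry per attribute encountered. */
    static List<String> dump(String xml) {
        AttributeDumpSketch handler = new AttributeDumpSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.seen;
    }

    public static void main(String[] args) {
        System.out.println(dump("<sonnet type=\"Shakespearean\"/>"));
    }
}
```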
22
Error handlers
- When the application needs to be informed of warnings and errors, it can implement the ErrorHandler interface.
23
Locators
- Necessity: an error message is not particularly helpful when no indication is given as to where the error occurred.
- The Locator interface can tell the entity, line number, and character number of the warning or error.
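A rough sketch of both ideas, error call-backs and position reporting, assuming SAX 2's DefaultHandler with the JDK's parser. The class and field names are invented. The handler keeps the Locator the parser hands over and also records the line carried by a SAXParseException.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ErrorPositionSketch extends DefaultHandler {
    private Locator locator;
    int lastElementLine = -1;   // line reported for the most recent start tag
    int errorLine = -1;         // line reported for a fatal error, if any

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // supplied by the parser before any other event
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (locator != null) {
            lastElementLine = locator.getLineNumber();
        }
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        errorLine = e.getLineNumber();
        throw e;                // a fatal error ends the parse
    }

    /** Parses the XML string and returns the handler with whatever positions it recorded. */
    static ErrorPositionSketch parse(String xml) {
        ErrorPositionSketch handler = new ErrorPositionSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXParseException e) {
            // Position already recorded in fatalError.
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        // The closing tag on the third line does not match the open <b>.
        ErrorPositionSketch h = parse("<a>\n<b>\n</a>");
        System.out.println("fatal error reported on line " + h.errorLine);
    }
}
```

Locator values are only meaningful during a call-back, which is why the line number is copied out immediately.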
24
Handler bases
- The HandlerBase class provides some sensible default behavior for each event.
- It can be subclassed to add application-specific functionality.
25
DOM/SAX Applications
- DOM
- SAX
- How to make an XML application?
Section: Making XML Application
26
Contents
- XML Application Architecture
- Parser Basics
- Kinds of Parsers
- The Document Object Model (DOM)
- DOM Application
- The Simple API for XML (SAX)
- SAX Application
27
XML Application Architecture
- An XML application is typically built around an XML parser.
- It has an interface to its users, and an interface to some sort of back-end data store.
[Figure: User Interface, XML Application (containing the XML Parser), Data Store]
28
Parser Basics
- A parser is a piece of code that reads a document and analyzes its structure.
- How to use a parser:
  1. Create a parser object.
  2. Pass your XML document to the parser.
  3. Process the results.
- Building an XML application is obviously more involved than this.
29
Kinds of Parsers
- Validating versus non-validating parsers
  - Validating parsers validate XML documents as they parse them.
  - Non-validating parsers ignore any validation errors.
- Parsers that support the Document Object Model (DOM)
- Parsers that support the Simple API for XML (SAX)
30
DOM Parser
- Builds a tree structure that contains all of the elements of a document.
- Provides a variety of functions to examine the contents and structure of the document.
31
SAX Parser
- Generates events at various points in the document.
- It’s up to you to decide what to do with each of those events.
32
DOM vs SAX
Why use DOM?
- You need to know a lot about the structure of a document.
- You need to move parts of the document around.
- You need to use the information in the document more than once.
Why use SAX?
- You only need to extract a few elements from an XML document.
33
DOM interfaces
- Node: the base data type of the DOM.
- Element: the vast majority of the objects you’ll deal with are Elements.
- Attr: represents an attribute of an element.
- Text: the actual content of an Element or Attr.
- Document: represents the entire XML document.
34
Common DOM methods
Document class:
- getDocumentElement(): returns the root element of the document.
Node class:
- getFirstChild() and getLastChild(): return the first or last child of a given Node.
- getNextSibling() and getPreviousSibling(): return the next or previous sibling of a given Node.
- getAttribute(attrName): for a given element, returns the value of the attribute with the requested name (in the DOM API this method is defined on Element rather than on Node).
35
Our first DOM Application!
- The first application simply reads an XML document and writes the document’s contents to standard output.
- It parses sonnet.xml.
[Sample, sonnet.xml: Shakespeare, William, British, 1564, 1616, Sonnet 130, My mistress’s eyes are …]
36
domOne to Watch Over Me
domOne.java:
    public class domOne {
        public void parseAndPrint(String uri) ...
        public void printDOMTree(Node node) ...
        public static void main(String argv[]) ...
- Create a new class called domOne.
- It has two methods, parseAndPrint and printDOMTree.
- The main method processes the command line, creates a domOne object, and passes the file name to the domOne object.
- The domOne object creates a parser object, parses the document, then processes the DOM tree via the printDOMTree method.
37
Create a domOne object
    public static void main(String argv[]) {
        if (argv.length == 0) {
            System.out.println("Usage:... ");
            ...
            System.exit(1);
        }
        domOne d1 = new domOne();
        d1.parseAndPrint(argv[0]);
    }
- Create a separate class called domOne.
- To parse the file and print the results, create a new instance of the domOne class.
- Use a recursive function to go through the DOM tree and print out the results.
38
Create a parser object
    try {
        DOMParser parser = new DOMParser();
        parser.parse(uri);
        doc = parser.getDocument();
    }
    // the catch clause is not shown on the slide
- In the parseAndPrint method, create a new parser object using a DOMParser object.
- DOMParser is a Java class that implements the DOM interface.
- Exceptions can be caused by an invalid URI, a DTD that can’t be found, or an XML document that isn’t valid or well-formed.
39
Parse the XML document
    try {
        DOMParser parser = new DOMParser();
        parser.parse(uri);
        doc = parser.getDocument();
    }
    ...
    if (doc != null)
        printDOMTree(doc);
- Parsing the document is done with a single line of code.
- Get the Document object created by the parser.
- Pass it to the printDOMTree method.
40
Process the DOM tree
    public void printDOMTree(Node node) {
        // getNodeType is an instance method: call it on node, not on the Node interface
        int nodeType = node.getNodeType();
        switch (nodeType) {
            case Node.DOCUMENT_NODE:
                printDOMTree(((Document) node).getDocumentElement());
                ...
            case Node.ELEMENT_NODE:
                ...
                NodeList children = node.getChildNodes();
                if (children != null) {
                    for (int i = 0; i < children.getLength(); i++)
                        printDOMTree(children.item(i));
                }
                ...
        }
    }
- Call printDOMTree recursively for each of the node’s children.
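For a version of the recursive walk that can be run as it stands, here is a minimal sketch. It is not the domOne source: it assumes the JDK's JAXP parser in place of DOMParser, builds a string instead of printing, and handles only document, element, and text nodes. The class name DomPrinterSketch and the sample markup are invented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomPrinterSketch {
    /** Parses an XML string into a DOM Document using the JDK's built-in JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Appends the subtree rooted at node to out; special characters are not re-escaped. */
    static void render(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                render(((Document) node).getDocumentElement(), out);
                break;
            case Node.ELEMENT_NODE:
                out.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(' ').append(attr.getNodeName())
                       .append("=\"").append(attr.getNodeValue()).append('"');
                }
                out.append('>');
                NodeList children = node.getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    render(children.item(i), out);   // recurse into each child
                }
                out.append("</").append(node.getNodeName()).append('>');
                break;
            case Node.TEXT_NODE:
                out.append(node.getNodeValue());
                break;
            default:
                break;   // comments, processing instructions, etc. are skipped here
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        render(parse("<sonnet type=\"Shakespearean\"><title>Sonnet 130</title></sonnet>"), out);
        System.out.println(out);
    }
}
```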
41
Nodes a-plenty
    Document Statistics for sonnet.xml:
    ====================================
    Document Nodes:            1
    Element Nodes:            23
    Entity Reference Nodes:    0
    CDATA Sections:            0
    Text Nodes:               45
    Processing Instructions:   0
                      ----------
    Total:                    69 Nodes
- Just run the domCounter program, which counts the number of nodes.
- In sonnet.xml, there are twenty-four tags. Why not twenty-four nodes?
- There are actually 69 nodes in sonnet.xml: one document node, 23 element nodes, and 45 text nodes.
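The effect is easy to reproduce on a small scale. The sketch below is not the domCounter program; it is a stand-in that assumes the JDK's JAXP parser and tallies nodes by their DOM type code. The five-line sample document and its element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeCounterSketch {
    /** Parses an XML string into a DOM Document using the JDK's built-in JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Adds one to counts[type] for the node itself and for every descendant. */
    static void count(Node node, int[] counts) {
        counts[node.getNodeType()]++;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count(c, counts);
        }
    }

    /** Returns an array indexed by DOM node type code (the codes run from 1 to 12). */
    static int[] countAll(String xml) {
        int[] counts = new int[13];
        count(parse(xml), counts);
        return counts;
    }

    public static void main(String[] args) {
        String xml = "<sonnet>\n  <author>\n    <last-name>Shakespeare</last-name>\n  </author>\n</sonnet>";
        int[] c = countAll(xml);
        System.out.println("Document nodes: " + c[Node.DOCUMENT_NODE]);
        System.out.println("Element nodes:  " + c[Node.ELEMENT_NODE]);
        System.out.println("Text nodes:     " + c[Node.TEXT_NODE]);
    }
}
```

Three elements yield five Text nodes here: four hold only the line breaks and indentation between tags, and one holds "Shakespeare". Attributes are not children in the DOM, so a walk like this never visits them.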
42
Sample node listing
[Sample XML fragment; of its content only the text "Shakespeare" is shown]
The nodes returned by the parser:
1. The Document node
2. The Element node corresponding to the tag
3. A Text node containing the carriage return at the end of the tag and the two spaces in front of the tag
4. The Element node corresponding to the tag
5. A Text node containing the carriage return at the end of the tag and the four spaces in front of the tag
6. The Element node corresponding to the tag
7. A Text node containing the characters “Shakespeare”
All of the blank spaces at the start of the lines in the sample are Text nodes.
43
Brief: DOM
- Believe it or not, that’s about all you need to know to work with DOM objects.
- Our domOne code did several things:
  - Created a Parser object
  - Gave the Parser an XML document to parse
  - Took the Document object from the Parser and examined it
44
A wee listing of SAX events
- startDocument: signals the start of the document.
- endDocument: signals the end of the document.
- startElement: signals the start of an element.
- endElement: signals the end of an element.
- characters: contains character data, similar to a DOM Text node.
45
SAX interfaces
The SAX API actually defines four interfaces for handling events:
- EntityResolver
- DTDHandler
- DocumentHandler
- ErrorHandler
All of these interfaces are implemented by HandlerBase.
46
Our first SAX Application!
- This application is similar to domOne, except it uses the SAX API instead of DOM.
- It parses sonnet.xml.
[Sample, sonnet.xml: Shakespeare, William, British, 1564, 1616, Sonnet 130, My mistress’s eyes are …]
47
SAX methods in saxOne.java
saxOne.java:
    public class saxOne extends HandlerBase {
        public void startDocument() ...
        public void startElement(String name, AttributeList attrs) ...
        public void characters(char ch[], int start, int length) ...
        public void ignorableWhitespace(char ch[], int start, int length) ...
        public void endElement(String name) ...
        public void endDocument() ...
        public void warning(SAXParseException ex) ...
        public void error(SAXParseException ex) ...
        public void fatalError(SAXParseException ex) throws SAXException
- These are the SAX methods that handle SAX events.
48
Create a saxOne object
- Create a separate class called saxOne.
- The main procedure creates an instance of this class and uses it to parse the XML document.
- Because saxOne extends the HandlerBase class, we can use saxOne as an event handler for a SAX parser.
    public static void main(String argv[]) {
        if (argv.length == 0) {
            System.out.println("Usage:... ");
            ...
            System.exit(1);
        }
        saxOne s1 = new saxOne();
        s1.parseURI(argv[0]);
    }
49
Create a Parser object
- The code first creates a new Parser object.
- In this sample, we use the SAXParser class instead of DOMParser.
- setDocumentHandler and setErrorHandler tell our newly created SAXParser to use saxOne to handle events.
    SAXParser parser = new SAXParser();
    parser.setDocumentHandler(this);
    parser.setErrorHandler(this);
    try {
        parser.parse(uri);
    }
50
Parse the XML document
- Once our SAXParser object is set up, it takes a single line of code to process our document.
    SAXParser parser = new SAXParser();
    parser.setDocumentHandler(this);
    parser.setErrorHandler(this);
    try {
        parser.parse(uri);
    }
51
Process SAX events
    public void startDocument() ...
    public void startElement(String name, AttributeList attrs) ...
    public void characters(char ch[], int start, int length) ...
    public void ignorableWhitespace(char ch[], int start, int length) ...
- As the SAXParser object parses our document, it calls our implementations of the SAX event handlers as the various SAX events occur.
- Each event handler writes the appropriate information to System.out.
Ex) For startElement events, we write the XML syntax of the original tag out to the screen.
52
A cavalcade of ignorable events
    Document Statistics for sonnet.xml:
    ====================================
    DocumentHandler Events:
      startDocument            1
      endDocument              1
      startElement            23
      endElement              23
      processingInstruction    0
      characters              20
      ignorableWhitespace     25
    ErrorHandler Events:
      warning                  0
      error                    0
      fatalError               0
                      ----------
    Total:                    93 events
- The SAX interface returns more events than you might think.
- One advantage of the SAX interface is that the twenty-five ignorableWhitespace events are simply ignored.
- We don’t have to write code to handle those events.
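A tally like this takes only a few lines per event. The sketch below is an assumption-laden stand-in for whatever produced the statistics above: it uses SAX 2's DefaultHandler and the JDK's SAXParserFactory, and the class name and sample document are invented. How whitespace is divided between characters and ignorableWhitespace depends on whether the parser has a DTD to consult, and a parser may split text into several characters calls, so the sketch makes no claim about those two counts.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventCounterSketch extends DefaultHandler {
    final Map<String, Integer> counts = new TreeMap<>();

    private void bump(String event) {
        counts.merge(event, 1, Integer::sum);
    }

    @Override public void startDocument() { bump("startDocument"); }
    @Override public void endDocument() { bump("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        bump("startElement");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        bump("endElement");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        bump("characters");
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        bump("ignorableWhitespace");
    }

    /** Parses the XML string and returns the number of call-backs received per event type. */
    static Map<String, Integer> count(String xml) {
        EventCounterSketch handler = new EventCounterSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.counts;
    }

    public static void main(String[] args) {
        String xml = "<sonnet>\n  <author>\n    <last-name>Shakespeare</last-name>\n  </author>\n</sonnet>";
        count(xml).forEach((event, n) -> System.out.println(event + ": " + n));
    }
}
```

A handler that does not care about an event simply leaves the inherited no-op in place, which is the point the slide makes about ignorableWhitespace.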
53
Sample event listing
[Sample XML fragment; of its content only the text "Shakespeare" is shown]
The events returned by the parser:
1. A startDocument event
2. A startElement event for the element
3. An ignorableWhitespace event for the line break and the two blank spaces in front of the tag
4. A startElement event for the element
5. An ignorableWhitespace event for the line break and the four blank spaces in front of the tag
6. A startElement event for the tag
7. A characters event for the characters “Shakespeare”
8. An endElement event for the tag
54
SAX vs DOM: part one
[Sample document text:] Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. Many a brave soul did it send hurrying down to Hades, and many a hero did it yield a prey to dogs and vultures, for so were the counsels of Jove fulfilled from the day on which the son of Atreus, king of men, and great Achilles, first fell out with one another. And which of the gods was it that set them on to quarrel? It was the son of Jove and Leto; for he was angry with the king and sent a pestilence upon...
- The SAX API would be much more efficient.
- Doing this with the DOM would take a lot of memory.
55
SAX vs DOM: part two
[Sample address record:] ... Mrs. Mary McGoon 1401 Main Street Anytown NC 34829 ...
- What if we were parsing an XML document containing 10,000 addresses and wanted to sort them by last name?
- DOM would automatically store all of the data.
- We could use DOM functions to move the nodes in the DOM tree.
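Moving nodes in the tree could look like the sketch below. The markup of the address file is not shown on the slide, so the element names addresses, address, and last-name are hypothetical, as is the class name; the parser is the JDK's JAXP implementation. The sketch relies on the DOM rule that appending a node which is already in the tree moves it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AddressSortSketch {
    /** Parses an XML string into a DOM Document using the JDK's built-in JAXP parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Text of the first last-name descendant; assumes every address has one. */
    static String lastName(Element address) {
        return address.getElementsByTagName("last-name").item(0).getTextContent();
    }

    /** Reorders the element children of the root by last name, in place. */
    static void sortByLastName(Document doc) {
        Element root = doc.getDocumentElement();
        List<Element> addresses = new ArrayList<>();
        for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                addresses.add((Element) c);
            }
        }
        addresses.sort(Comparator.comparing(AddressSortSketch::lastName));
        for (Element address : addresses) {
            root.appendChild(address);   // an attached node is moved, not copied
        }
    }

    /** Last names in current document order, joined with commas. */
    static String order(Document doc) {
        NodeList names = doc.getElementsByTagName("last-name");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < names.getLength(); i++) {
            out.add(names.item(i).getTextContent());
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        Document doc = parse("<addresses>"
                + "<address><last-name>McGoon</last-name></address>"
                + "<address><last-name>Adams</last-name></address>"
                + "</addresses>");
        sortByLastName(doc);
        System.out.println(order(doc));
    }
}
```

With SAX, the application would have to build and hold its own list of records before it could sort anything, which is the storage the DOM tree already provides.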
56
Brief: SAX
- At this point, we’ve covered the two major APIs for working with XML documents.
- We’ve also discussed when you might want to use each one.
- Think about some advanced parser functions that you might need as you build an XML application.