The XML Document Object Model (DOM) Aug’10 – Dec ’10

Introduction to XML DOM
The XML DOM is used by programmers as a way to manipulate the content of an XML document.
This chapter covers the following:
❑ The purpose of the XML Document Object Model
❑ How the DOM specification was developed at the W3C
❑ Important XML DOM interfaces and objects
❑ How to add and delete elements and attributes from a DOM and manipulate a DOM tree in other ways

Purpose of the XML DOM
The XML DOM provides an interface:
❑ to create XML documents
❑ to navigate them
❑ to add, modify, or delete parts of XML documents while they are held in memory

XML Parsing
❑ Parsing means taking a stream of characters and producing an internal representation conforming to a predetermined structure.
❑ The result is an in-memory model of the XML, known as the XML DOM.
❑ This raises issues when serializing an XML DOM, because some features of the input can be lost after parsing, such as:
- the XML declaration and its specified encoding
- whether attributes were quoted with single (') or double (") quotes
❑ Two differently constructed documents, once parsed, can yield the same Infoset (XML Information Set).
❑ Serialization is the process of storing an object's state in a permanent form, such as a file or a database, or converting it to a form that can be transmitted between machines.
❑ Deserialization is the opposite of serialization.
❑ The DOM can also be created directly in memory using the appropriate DOM methods, such as createElement() and appendChild().
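
The two points above (loss of quoting style on parsing, and building a DOM directly in memory) can be sketched as follows. This is a minimal illustration that assumes Python's xml.dom.minidom as a stand-in for the MSXML parser used in these slides; the Book and Chapter names and the title attribute are hypothetical sample data.

```python
from xml.dom.minidom import getDOMImplementation, parseString

# Two inputs that differ only in how the attribute is quoted (sample data).
single = parseString("<Book title='XML'/>")
double = parseString('<Book title="XML"/>')

# After parsing, the original quoting style is gone: both trees serialize alike.
same_output = single.documentElement.toxml() == double.documentElement.toxml()

# Build a small document directly in memory with DOM methods.
impl = getDOMImplementation()
doc = impl.createDocument(None, "Book", None)
doc.documentElement.setAttribute("title", "XML")
chapter = doc.createElement("Chapter")
chapter.appendChild(doc.createTextNode("This is Chapter 1"))
doc.documentElement.appendChild(chapter)

# Serialize the in-memory tree back to text.
serialized = doc.documentElement.toxml()
print(same_output)
print(serialized)
```

Serializing from the tree rather than from the original text is why details such as quote style cannot be recovered after parsing.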

DOM concepts
❑ The XML DOM representation is equivalent to a hierarchical, treelike structure consisting of nodes.
❑ The XML document itself is represented by the Document node, which sits at the apex of the tree.
❑ The Document node differs significantly from an XPath root node.
❑ The XML parser checks the document for well-formedness and, optionally, validity. The XML DOM may then be constructed as an in-memory representation of the XML document.

DOM Document
[Figure: tree diagram of a DOM Document. Node labels: Root node (Document); Element (documentElement); Element (firstChildElement) with Attribute (attributeNode) and Text node; Element (secondChildElement) with Text node; Comment (Example comment).]
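
Serialized version
<documentElement>
  <firstChildElement attributeNode="...">Text Node</firstChildElement>
  <secondChildElement>Text Node</secondChildElement>
  <!--Example comment-->
</documentElement>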

Interfaces and Objects
❑ An interface is a more abstract concept than an object: the interface is the general concept, the object a specific instance.
❑ An interface describes the properties and behavior (methods) of a class of objects.
❑ Example: the Document interface is defined in the XML DOM and includes the documentElement property. A Document node is an object that implements the Document interface.
❑ An interface and the objects that implement it expose the same properties and methods.

The Document Object Model at the W3C
❑ The main page for the DOM specification is hosted on the W3C website.
❑ The XML DOM is a logical model of an XML document.
❑ XML DOM Level 1 provides an interface; the implementation is left to the creators of each DOM library.
❑ A shared interface improves productivity. Before DOM Level 1 there was no common interface, only proprietary ones.
❑ DOM Level 1 still fell short of a universal interface: it did not include a way to create an XML document.
❑ DOM Level 1 specified language bindings for Java and ECMAScript.
❑ For very large XML documents, the Simple API for XML (SAX) or .NET's XmlReader are preferred, because the DOM holds the whole document in memory.

DOM Level 2 and Level 3
❑ DOM Level 2 added some new functionality, including support for namespaced elements.
❑ The DOM Level 2 specification documents are published by the W3C.
❑ DOM Level 3 was finalized in 2004. It adds standards for URI handling, namespace resolution, and how the DOM maps to the XML Infoset.

XML DOM Implementations
❑ An implementation provides all interfaces described in a particular level of the DOM specification.
❑ It is free to provide additional interfaces.
Two Ways to View DOM Nodes
1. As a hierarchy of Node objects
2. As a tree whose root is a Document node (or object) and whose descendant nodes are objects of different specialized types

Overview of the XML DOM
❑ The root of the DOM hierarchy is always a Document node.
❑ The child nodes of the Document node can be a DocumentType node, an Element node (the document element), ProcessingInstruction nodes, and Comment nodes.
❑ A fragment of an XML document is represented by a DocumentFragment node.
❑ The child nodes of DocumentFragment and Element nodes can be Element, Comment, ProcessingInstruction, Text, CDATASection, and EntityReference nodes.
❑ An attribute is represented by an Attr node. It is associated with an Element node but is not considered to be a child of it (compare with XPath attributes).
❑ Entity nodes and EntityReference nodes can have child nodes of type Element, Comment, ProcessingInstruction, Text, CDATASection, and EntityReference.
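
The relationship between an element and its Attr node can be checked with a short sketch. It assumes Python's xml.dom.minidom rather than the MSXML parser used in these slides, and the sample document and its number attribute are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document with one attribute on the Chapter element.
doc = parseString('<Book><Chapter number="1">Intro</Chapter></Book>')
chapter = doc.documentElement.firstChild
attr = chapter.getAttributeNode("number")

# The only child of Chapter is its Text node; the Attr node is not among them.
child_types = [child.nodeType for child in chapter.childNodes]
attr_is_child = any(child is attr for child in chapter.childNodes)

# An Attr node has no parent, but it knows the element that owns it.
print(child_types == [chapter.TEXT_NODE])
print(attr_is_child)
print(attr.parentNode)
print(attr.ownerElement is chapter)
```

The ownerElement property used here was introduced in DOM Level 2; it is the link from an attribute back to its element, since parentNode is always null for Attr nodes.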

Tools
❑ MSXML (Microsoft XML Core Services): a set of services that allow applications written in JScript, VBScript, and Microsoft development tools to build Windows-native XML-based applications
❑ Internet Explorer 5.0 or above: to run the DOM examples
❑ Microsoft JScript: to manipulate the XML DOM

MSXML3
Usage (example file: msxmltest.html):
❑ new ActiveXObject: creates the DOM object using MSXML3
❑ loadXML: loads XML supplied as a string into the XML DOM object
❑ document.write: displays text on the web page
❑ alert: displays text in a pop-up window

Navigating to the document element
Once the XML document is loaded into the DOM object, the document element can be accessed through the documentElement property.
❑ Document object
- represents the entire XML document
- is the root of the document tree and gives primary access to the document's data
❑ documentElement property
- returns the root element node of the document
- objXMLDOM.documentElement.nodeName returns the name of that element
- the syntax uses the period character to indicate properties or methods of an object
Example file: Myxmltest2.html
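
As a rough equivalent of objXMLDOM.documentElement.nodeName, the sketch below uses Python's xml.dom.minidom instead of MSXML (an assumption made so the example can run outside Internet Explorer). The Book document is hypothetical sample data.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document.
doc = parseString("<Book><Chapter>This is Chapter 1</Chapter></Book>")

# The Document node itself is the root of the tree.
document_name = doc.nodeName

# documentElement returns the single top-level element of the document.
root_name = doc.documentElement.nodeName

print(document_name)
print(root_name)
```

The Document node and the document element are distinct nodes: the first represents the whole document, the second is its one top-level element.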

Node Object
❑ One way of viewing nodes in an XML DOM is as specializations of the Node object.
❑ The Node object has properties and methods that are also available on all other types of XML DOM node.
❑ XML DOM programming consists of:
- retrieving and setting some of these properties directly, or
- using the methods defined in the interface to manipulate the object that instantiates the interface, or related objects

The Node object of DOM Level 2 has 14 properties:
❑ attributes: a read-only property whose value is a NamedNodeMap object.
❑ childNodes: a read-only property whose value is a NodeList object.
❑ firstChild: a read-only property whose value is a Node object.
❑ lastChild: a read-only property whose value is a Node object.
❑ localName: a read-only property whose value is a String.
❑ namespaceURI: a read-only property whose value is a String.
❑ nextSibling: a read-only property whose value is a Node object.
❑ nodeName: the name of the node, if it has one; its value is a String.
❑ nodeType: a read-only property of type Number. The number value of the nodeType property maps to the names of the node types mentioned earlier.
❑ nodeValue: a property of type String. When the property is being set or retrieved, a DOMException can be raised.
❑ ownerDocument: a read-only property whose value is a Document object.
❑ parentNode: a read-only property whose value is a Node object.
❑ prefix: a property of type String. When the property is being set, a DOMException can be raised.
❑ previousSibling: a read-only property whose value is a Node object.
Depending on the particular node object, some properties made available by the Node interface may have no useful retrievable value. For example:
- A Document object does not have a parent node.
- A Comment node has no attributes or child nodes.
- nodeValue is null for Element, Document, DocumentFragment, DocumentType, Entity, EntityReference, and Notation nodes. Only Attr, Text, CDATASection, Comment, and ProcessingInstruction nodes have a non-null nodeValue.
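
A short sketch of how nodeName, nodeType, and nodeValue vary by node type. It assumes Python's xml.dom.minidom in place of MSXML, and the sample document is hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample: a Text node, a Comment node and an Element as siblings.
doc = parseString('<Book id="b1">Text<!--note--><Chapter/></Book>')
book = doc.documentElement

# Collect (nodeName, nodeType, nodeValue) for each child of the document element.
summary = [(n.nodeName, n.nodeType, n.nodeValue) for n in book.childNodes]
for row in summary:
    print(row)

# Properties with no useful value for a given node type come back as None (null).
comment = book.childNodes[1]
print(doc.parentNode)        # the Document node has no parent
print(comment.attributes)    # a Comment node has no attributes
print(book.nodeValue)        # an Element node has no nodeValue
```

The nodeType numbers are the constants defined by the DOM specification: 1 for Element, 3 for Text, and 8 for Comment.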

Exploring Child Nodes
objXMLDOM.documentElement.firstChild.nodeName
- Uses the documentElement and firstChild properties to retrieve the name of the node that is the first child of the document element.
objXMLDOM.documentElement.firstChild.firstChild.nodeValue
- Retrieves the value of the first child of the first child of the document element node.
Example file: ChildNodes.html
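
The same two property chains can be tried in a minimal sketch. It assumes Python's xml.dom.minidom instead of MSXML, and a hypothetical Book document written without white space between elements.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document (no white space between the elements).
doc = parseString(
    "<Book><Chapter>This is Chapter 1</Chapter>"
    "<Chapter>This is Chapter 2</Chapter></Book>"
)

# Name of the first child of the document element.
first_name = doc.documentElement.firstChild.nodeName

# Value of the Text node inside that first child.
first_text = doc.documentElement.firstChild.firstChild.nodeValue

# nextSibling steps sideways to the second Chapter element.
second_text = doc.documentElement.firstChild.nextSibling.firstChild.nodeValue

print(first_name, first_text, second_text, sep=" | ")
```

The text of an element lives in a child Text node, which is why the second chain needs firstChild twice before reading nodeValue.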

Methods of the Node Object
❑ appendChild(newChild): returns a Node object. The newChild argument is a Node object. This method can raise a DOMException object.
❑ cloneNode(deep): returns a Node object. The deep argument is a Boolean value. If true, all nodes underneath this node are also copied; otherwise, only the node itself is copied.
❑ hasAttributes(): returns a Boolean value. It has no arguments.
❑ hasChildNodes(): returns a Boolean value. It has no arguments.
❑ insertBefore(newChild, refChild): returns a Node object. The newChild and refChild arguments are each Node objects. This method can raise a DOMException object.
❑ isSupported(feature, version): returns a Boolean value. The feature and version arguments are each String values.
❑ normalize(): has no return value and takes no arguments.
❑ removeChild(oldChild): returns a Node object. The oldChild argument is a Node object. This method can raise a DOMException object.
❑ replaceChild(newChild, oldChild): returns a Node object. The newChild and oldChild arguments are each Node objects. This method can raise a DOMException object.
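
Several of the Node methods listed above can be exercised in a short sketch. It assumes Python's xml.dom.minidom as a stand-in for MSXML; the element names are hypothetical sample data.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document.
doc = parseString("<Book><Chapter>One</Chapter><Chapter>Three</Chapter></Book>")
book = doc.documentElement
first, last = book.childNodes[0], book.childNodes[1]

# cloneNode(True) copies the element together with its Text child.
second = first.cloneNode(True)
second.firstChild.data = "Two"

# insertBefore places the clone ahead of the reference child.
book.insertBefore(second, last)

# replaceChild swaps a node and returns the node that was replaced.
appendix = doc.createElement("Appendix")
replaced = book.replaceChild(appendix, last)

# normalize() merges adjacent Text nodes into one.
appendix.appendChild(doc.createTextNode("A"))
appendix.appendChild(doc.createTextNode("B"))
appendix.normalize()

names = [node.nodeName for node in book.childNodes]
print(names)
print(book.hasChildNodes(), len(appendix.childNodes))
```

Editing the clone leaves the original untouched, which is the point of a deep copy: first still holds the text "One" after second is changed to "Two".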

Loading an XML Document
❑ loadXML(): supply literal characters equivalent to a well-formed XML document
❑ load(): load an existing XML document
var objXMLDOM = new ActiveXObject("Msxml2.DOMDocument.3.0");
- A DOM Document node is created with no descendant nodes.
objXMLDOM.load("C:\\SimpleDoc.xml");
- The XML document SimpleDoc.xml is loaded, its XML is parsed, and the appropriate node tree is created inside the objXMLDOM object.
Example file: SimpleDoc.xml
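
The split between loading from a string and loading from a file has a close analogue in Python's xml.dom.minidom, used here as a stand-in for MSXML: parseString() plays the role of loadXML() and parse() the role of load(). The content written to SimpleDoc.xml below is hypothetical, since the slides do not show the file.

```python
import os
import tempfile
from xml.dom.minidom import parse, parseString

# Hypothetical content for SimpleDoc.xml.
xml_text = "<Book><Chapter>This is Chapter 1</Chapter></Book>"

# Analogue of loadXML(): parse XML supplied as literal characters.
from_string = parseString(xml_text)

# Analogue of load(): parse an existing file (written to a temp directory here).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "SimpleDoc.xml")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(xml_text)
    from_file = parse(path)

# Both routes produce the same node tree.
same_tree = from_string.documentElement.toxml() == from_file.documentElement.toxml()
print(same_tree)
```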

Deleting a Node
If the XML document is:
<Book>
  <Chapter>This is Chapter 1</Chapter>
  <Chapter>This is Chapter 2</Chapter>
  <Chapter>This is Chapter 3</Chapter>
</Book>
then the following code deletes the first of the three Chapter element nodes in the document:
var objToBeDeleted = objXMLDOM.documentElement.firstChild;
objXMLDOM.documentElement.removeChild(objToBeDeleted);
alert(objXMLDOM.xml);
Example file: DeleteNode.html
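
The deletion steps can be reproduced in a minimal sketch. It assumes Python's xml.dom.minidom in place of MSXML. The Book document is written on one line here because minidom, unlike MSXML's default setting, keeps white-space-only Text nodes, which would otherwise make firstChild a Text node.

```python
from xml.dom.minidom import parseString

# Sample document with three Chapter elements and no white space between them.
doc = parseString(
    "<Book><Chapter>This is Chapter 1</Chapter>"
    "<Chapter>This is Chapter 2</Chapter>"
    "<Chapter>This is Chapter 3</Chapter></Book>"
)

# Pick the first child of the document element and remove it.
to_delete = doc.documentElement.firstChild
removed = doc.documentElement.removeChild(to_delete)

# removeChild returns the removed node, which is now detached from the tree.
remaining = doc.documentElement.toxml()
print(remaining)
print(removed.parentNode)
```

The removed node still exists as an object after the call, so it can be reinserted elsewhere in the tree if needed.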

Adding new nodes
❑ createTextNode(): creates a new Text node
❑ createElement(): creates a new Element node
❑ appendChild(): adds a node as a child of an element node
- if the node already has child nodes, appendChild() adds the new node after the existing children
❑ insertBefore(): inserts a new node before another child node
Example file: AddNode.html
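
The four methods above combine as in the following sketch, which assumes Python's xml.dom.minidom as a stand-in for MSXML. The make_chapter helper and the sample document are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical starting document with a single Chapter.
doc = parseString("<Book><Chapter>This is Chapter 2</Chapter></Book>")
book = doc.documentElement


def make_chapter(text):
    """Hypothetical helper: build a Chapter element holding one Text node."""
    chapter = doc.createElement("Chapter")
    chapter.appendChild(doc.createTextNode(text))
    return chapter


# appendChild adds the new node after the existing children.
book.appendChild(make_chapter("This is Chapter 3"))

# insertBefore places the new node ahead of the given reference child.
book.insertBefore(make_chapter("This is Chapter 1"), book.firstChild)

texts = [chapter.firstChild.data for chapter in book.childNodes]
print(texts)
```

New nodes are always created through the Document object and only become part of the tree once appendChild() or insertBefore() attaches them.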
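
Effect of text nodes
The xml:space attribute preserves white space. For a document such as:
<Book xml:space="preserve">
  <Chapter>This is chapter 1</Chapter>
  <Chapter>This is chapter 2</Chapter>
  <Chapter>This is chapter 3</Chapter>
</Book>
objXMLDOM.documentElement.childNodes.length returns 7 (three Chapter elements plus four white-space Text nodes).
If xml:space="preserve" is not specified, the number of child nodes returned is 3.
Example file: WhiteSpace.html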
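
The count of 7 can be reproduced by counting node types in an indented document. This sketch assumes Python's xml.dom.minidom, which always keeps white-space-only Text nodes, so it only illustrates the preserved case. The stripping behaviour that yields 3 is specific to MSXML and is not shown. The sample document is hypothetical.

```python
from xml.dom.minidom import Node, parseString

# Hypothetical indented document: line breaks and spaces sit between elements.
xml_text = """<Book>
  <Chapter>This is chapter 1</Chapter>
  <Chapter>This is chapter 2</Chapter>
  <Chapter>This is chapter 3</Chapter>
</Book>"""

children = parseString(xml_text).documentElement.childNodes

# Separate the Element children from the white-space-only Text children.
elements = [n for n in children if n.nodeType == Node.ELEMENT_NODE]
whitespace = [
    n for n in children
    if n.nodeType == Node.TEXT_NODE and not n.data.strip()
]

total = children.length
print(total, len(elements), len(whitespace))
```

Code that walks childNodes should therefore check nodeType rather than assume every child is an element.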

The NamedNodeMap Object
❑ A named node map is an unordered set of objects.
❑ The attributes property of the Node object is a NamedNodeMap object.
❑ The NamedNodeMap object has a single property, length, which is a Number value. It indicates how many nodes are in the named node map.
The NamedNodeMap object has 7 methods:
❑ getNamedItem(name): returns a Node object. The name argument is a String value.
❑ getNamedItemNS(namespaceURI, localName): returns a Node object. The namespaceURI and localName arguments are String values.
❑ item(index): returns a Node object. The index argument is a Number value.
❑ removeNamedItem(name): returns a Node object. The name argument is a String value. This method can raise a DOMException object if the item does not exist.
❑ removeNamedItemNS(namespaceURI, localName): returns a Node object. The namespaceURI and localName arguments are String values. This method can raise a DOMException object if the item does not exist.
❑ setNamedItem(node): returns a Node object. The node argument is a new Node. This method can raise a DOMException object.
❑ setNamedItemNS(node): the same as setNamedItem, except that it handles namespaced nodes.
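
Adding and Removing Attributes
The NamedNodeMap interface is used to add, remove, and alter attributes.
objXMLDOM.documentElement.lastChild.attributes;
- returns the NamedNodeMap holding the attributes of the last child of the document element
❑ createAttribute(): creates a new Attr node (a method of the Document object)
❑ setNamedItem(): adds an Attr node to the map
❑ removeNamedItem(): removes the named Attr node from the map
Example file: ChangeAttributes.html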
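
The attribute operations above can be sketched as follows, assuming Python's xml.dom.minidom in place of MSXML. The attribute names number, status, and reviewed are hypothetical sample data.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document with two attributes on the last child.
doc = parseString('<Book><Chapter number="1" status="draft">Intro</Chapter></Book>')
chapter = doc.documentElement.lastChild

# The attributes property is a NamedNodeMap.
attrs = chapter.attributes
count_before = attrs.length
number = attrs.getNamedItem("number").value

# createAttribute builds a detached Attr node; setNamedItem adds it to the map.
reviewed = doc.createAttribute("reviewed")
reviewed.value = "yes"
attrs.setNamedItem(reviewed)

# removeNamedItem detaches the named Attr node and returns it.
removed = attrs.removeNamedItem("status")

# item(index) allows iteration; a named node map is unordered, so sort the names.
names = sorted(attrs.item(i).name for i in range(attrs.length))
print(count_before, number, removed.value, names)
```

Because a named node map is unordered, lookups should go through getNamedItem() rather than rely on positions returned by item().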

The NodeList Object
❑ A NodeList is a list of nodes.
❑ The childNodes property of the Node object has a value that is a NodeList.
❑ A NodeList object can be used to process all child nodes of a specified node.
❑ The NodeList object has one property, length, a read-only property of type Number.
❑ The NodeList object has one method, item(). It takes a single argument, which is a Number value, and returns a Node object.
❑ The index starts from 0, so item(3) returns the fourth child node.
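
A minimal sketch of length and item(), assuming Python's xml.dom.minidom as a stand-in for MSXML. The element names A to D are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document with four child elements.
doc = parseString("<Book><A/><B/><C/><D/></Book>")
nodes = doc.documentElement.childNodes

# length reports the number of nodes in the list.
count = nodes.length

# Indexing is zero-based, so item(3) is the fourth child.
fourth = nodes.item(3).nodeName

# An index past the end of the list yields None (null) rather than an error.
out_of_range = nodes.item(4)

# Process every child node in order.
names = [nodes.item(i).nodeName for i in range(nodes.length)]
print(count, fourth, out_of_range, names)
```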

The DOMException Object
When an error occurs, an exception is thrown, which is caught by the exception handler.
Examples of errors:
❑ Incorrect syntax
❑ A property or method name specified wrongly
❑ Trying to change the value of a read-only property
Example file: DOMException.html
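
One way a DOMException arises is when removeChild() is given a node that is not a child of the target. The sketch below assumes Python's xml.dom.minidom rather than MSXML, so the exception class and error-code constant come from Python's xml.dom module. The Appendix element is hypothetical.

```python
import xml.dom
from xml.dom.minidom import parseString

# Hypothetical sample document.
doc = parseString("<Book><Chapter/></Book>")

# This element is created but never attached to the tree.
stranger = doc.createElement("Appendix")

try:
    # Removing a node that is not a child raises a DOMException subclass.
    doc.documentElement.removeChild(stranger)
    error_code = None
except xml.dom.DOMException as exc:
    # Each DOMException carries a numeric code defined by the DOM specification.
    error_code = exc.code

print(error_code == xml.dom.NOT_FOUND_ERR)
```

Catching the base DOMException class and inspecting its code keeps the handler independent of which specific DOM error occurred.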

The Document Interface
The Document interface has three properties:
❑ documentElement: a read-only property that returns an Element object.
❑ doctype: a read-only property that is a DocumentType object, corresponding to a DOCTYPE declaration, if present, in the XML document.
❑ implementation: a read-only property that is a DOMImplementation object.

The Document interface has 14 methods:
❑ createAttribute(name): returns an Attr object. The name argument is a String value. This method can raise a DOMException object.
❑ createAttributeNS(namespaceURI, qualifiedName): returns an Attr object. The namespaceURI and qualifiedName arguments are String values. This method can raise a DOMException object if the name contains an invalid character.
❑ createCDATASection(data): returns a CDATASection object. The data argument is a String value.
❑ createComment(data): returns a Comment object. The data argument is a String value.
❑ createDocumentFragment(): takes no argument and returns a DocumentFragment object.
❑ createElement(tagName): returns an Element object. The tagName argument is a String value. This method can raise a DOMException object if the name contains an invalid character.
❑ createElementNS(namespaceURI, qualifiedName): returns an Element object. The namespaceURI and qualifiedName arguments are String values. This method can raise a DOMException object.
❑ createEntityReference(name): returns an EntityReference object. The name argument is a String value. This method can raise a DOMException object if the name contains an invalid character.
❑ createProcessingInstruction(target, data): returns a ProcessingInstruction object. The target and data arguments are each of type String. This method can raise a DOMException object if the target contains an invalid character.
❑ createTextNode(data): returns a Text object. The data argument is a String value.
❑ getElementById(elementId): returns an Element object. The elementId argument is a String value.
❑ getElementsByTagName(tagname): returns a NodeList object. The tagname argument is a String value.
❑ getElementsByTagNameNS(namespaceURI, localName): returns a NodeList object. The namespaceURI and localName arguments are String values.
❑ importNode(importedNode, deep): returns a Node object. The importedNode argument is a Node object. The deep argument is a Boolean value. This method can raise a DOMException object.
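
A few of the Document methods listed above are combined in the sketch below. It assumes Python's xml.dom.minidom as a stand-in for MSXML; all element names and text values are hypothetical sample data.

```python
from xml.dom.minidom import getDOMImplementation, parseString

# Create an empty document whose document element is <Book>.
doc = getDOMImplementation().createDocument(None, "Book", None)
book = doc.documentElement

# The create* methods build detached nodes owned by this document.
book.appendChild(doc.createComment("generated"))
for title in ("One", "Two"):
    chapter = doc.createElement("Chapter")
    chapter.appendChild(doc.createTextNode(title))
    book.appendChild(chapter)

# importNode copies a node from another document; deep=True copies its subtree.
other = parseString("<Notes><Appendix>Extra</Appendix></Notes>")
imported = doc.importNode(other.documentElement.firstChild, True)
book.appendChild(imported)

# getElementsByTagName returns a NodeList of matching elements.
chapters = doc.getElementsByTagName("Chapter")
result = book.toxml()
print(chapters.length)
print(result)
```

importNode() leaves the source document unchanged: the Appendix element stays in the other tree, and only its copy belongs to the new document.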