Document Object Model
The XML DOM (Document Object Model) defines a standard way for accessing and manipulating XML documents. The DOM presents an XML document as a tree structure, with elements, attributes, and text as nodes.
- DOM is a standard defined by the W3C, just like XML.
- DOM was not designed specifically for Java technology (unlike SAX).
- DOM is cross-platform and cross-language.
- It uses OMG's IDL to define its interfaces; IDL-to-language bindings map them to each programming language.

- DOM accesses an XML document as a tree structure.
- The tree is composed mostly of element nodes and text nodes.
- You can "walk" the tree back and forth.
- Larger memory requirements: fairly heavyweight to load and store.
- Use it for walking and modifying the tree.
Figure: the XML document is the input to the parser, which creates the DOM tree.
- An XML document is represented as a tree.
- A tree is made of nodes.
- There are 12 different node types.
- Nodes may contain other nodes (depending on the node type): a parent node contains child nodes.

The 12 node types:
- Document node
- Document fragment node
- Element node
- Attribute node
- Text node
- Comment node
- Processing instruction node
- Document type node
- Entity node
- Entity reference node
- CDATA section node
- Notation node
A document node contains:
- one element node (the root element node)
- optionally, one or more processing instruction nodes
An element node may contain:
- other element nodes
- one or more text nodes
- one or more attribute nodes
An attribute node contains a text node.
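Example XML document:
    <people>
      <person born="1912">
        <name>
          <first_name>Alan</first_name>
          <last_name>Turing</last_name>
        </name>
        <profession>computer scientist</profession>
      </person>
    </people>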
XML document tree:
    Document node
      element node "people"
        element node "person"
          element node "name"
            element node "first_name"
              text node "Alan"
            element node "last_name"
              text node "Turing"
          element node "profession"
            text node "computer scientist"
          attribute node "born"
            text node "1912"
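To make the tree above concrete, here is a minimal sketch that models it with plain JavaScript objects. It is not the real DOM API: the helpers `node`, `text`, `element`, and `describe` are hypothetical names introduced for illustration, and only the property names (`nodeType`, `nodeName`, `nodeValue`, `childNodes`, `attributes`) mirror the DOM.

```javascript
// Simplified stand-in for DOM nodes (NOT the real DOM API).
// nodeType follows the DOM convention: 1 = element, 2 = attribute, 3 = text, 9 = document.
function node(nodeType, nodeName, nodeValue, childNodes, attributes) {
  return {
    nodeType,
    nodeName,
    nodeValue,
    childNodes: childNodes || [],
    attributes: attributes || [],
  };
}
const text = (value) => node(3, "#text", value);
const element = (name, children, attributes) => node(1, name, null, children, attributes);

// The example document, built by hand instead of by a parser.
const doc = node(9, "#document", null, [
  element("people", [
    element("person", [
      element("name", [
        element("first_name", [text("Alan")]),
        element("last_name", [text("Turing")]),
      ]),
      element("profession", [text("computer scientist")]),
    ], [node(2, "born", "1912")]),
  ]),
]);

// Walk the tree depth-first and return one indented line per node.
function describe(n, depth = 0) {
  const label = n.nodeValue === null ? n.nodeName : `${n.nodeName} "${n.nodeValue}"`;
  const lines = ["  ".repeat(depth) + label];
  for (const attr of n.attributes) {
    lines.push("  ".repeat(depth + 1) + `@${attr.nodeName} "${attr.nodeValue}"`);
  }
  for (const child of n.childNodes) {
    lines.push(...describe(child, depth + 1));
  }
  return lines;
}

console.log(describe(doc).join("\n"));
```

Attributes are kept in a separate `attributes` list rather than in `childNodes`, which matches how the DOM exposes them.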
All modern browsers have a built-in XML parser that can be used to read and manipulate XML. The parser reads XML into memory and converts it into an XML DOM object that can be accessed with JavaScript.
    xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false;
    xmlDoc.load("people.xml");

Code explained:
- The first line creates an empty Microsoft XML document object (ActiveXObject is available only in Internet Explorer).
- The second line turns off asynchronous loading, so that the script does not continue before the document is fully loaded.
- The third line tells the parser to load an XML document called "people.xml".
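The ActiveXObject approach above is specific to Internet Explorer. As a hedged sketch, other browsers expose the standard `DOMParser` interface, which parses an XML string into a DOM document. The function name `parseXml` and the `sampleXml` string are assumptions made for illustration. The snippet only does real work where `DOMParser` exists (in browsers), so the call is guarded.

```javascript
// Parse an XML string into a DOM document using the standard DOMParser.
// parseXml is a hypothetical helper name, not part of any library.
function parseXml(xmlText) {
  if (typeof DOMParser === "undefined") {
    throw new Error("DOMParser is not available in this environment");
  }
  const xmlDoc = new DOMParser().parseFromString(xmlText, "application/xml");
  // Malformed XML does not throw; the result contains a <parsererror> element instead.
  if (xmlDoc.getElementsByTagName("parsererror").length > 0) {
    throw new Error("The XML could not be parsed");
  }
  return xmlDoc;
}

// Shortened sample document (an assumption, not the full people.xml).
const sampleXml =
  '<people><person born="1912"><profession>computer scientist</profession></person></people>';

// Only run the example where a DOMParser is available (for example, in a browser).
if (typeof DOMParser !== "undefined") {
  const xmlDoc = parseXml(sampleXml);
  console.log(xmlDoc.documentElement.nodeName); // name of the root element
}
```

Unlike `load()`, `parseFromString()` works on text that is already in memory, so fetching the file is a separate step.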
Programming Interface
The DOM models XML as a set of node objects. The nodes can be accessed with JavaScript or other programming languages. The programming interface to the DOM is defined by a set of standard properties and methods.
- Properties are often referred to as something that is (e.g. nodeName is "name").
- Methods are often referred to as something that is done (e.g. delete "name").
XML DOM Properties
These are some typical DOM properties, where x is a node object:
- x.nodeName: the name of x
- x.nodeValue: the value of x
- x.parentNode: the parent node of x
- x.childNodes: the child nodes of x
- x.attributes: the attribute nodes of x
XML DOM Methods
- x.getElementsByTagName(name): get all elements with the specified tag name
- x.appendChild(node): append a child node to x
- x.removeChild(node): remove a child node from x
Accessing Nodes
You can access a node in three ways:
1. By using the getElementsByTagName() method.
2. By looping through (traversing) the node tree.
3. By navigating the node tree, using the node relationships.
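The second approach, traversing the node tree, can be sketched without a browser by using plain objects in place of real DOM nodes. This is an illustration under stated assumptions: `text`, `element`, and `collectByTagName` are hypothetical helpers, and only `nodeType`, `nodeName`, `nodeValue`, and `childNodes` are modeled.

```javascript
// Plain-object stand-ins for DOM nodes (NOT the real DOM API).
// Only nodeType (1 = element, 3 = text), nodeName, nodeValue and childNodes are modeled.
const text = (value) => ({ nodeType: 3, nodeName: "#text", nodeValue: value, childNodes: [] });
const element = (name, ...children) => ({
  nodeType: 1,
  nodeName: name,
  nodeValue: null,
  childNodes: children,
});

const root = element("people",
  element("person",
    element("name",
      element("first_name", text("Alan")),
      element("last_name", text("Turing"))),
    element("profession", text("computer scientist"))));

// Depth-first traversal that collects every descendant element with the given
// tag name, in document order, similar in spirit to getElementsByTagName().
function collectByTagName(node, tagName, found = []) {
  for (const child of node.childNodes) {
    if (child.nodeType === 1 && child.nodeName === tagName) {
      found.push(child);
    }
    collectByTagName(child, tagName, found);
  }
  return found;
}

const firstNames = collectByTagName(root, "first_name");
console.log(firstNames[0].childNodes[0].nodeValue); // prints "Alan"
```

The third approach uses the same data differently: instead of searching, it follows relationships such as `childNodes[0]` step by step from a known node.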
The JavaScript code to get the text from the first <first_name> element in people.xml:
    txt = xmlDoc.getElementsByTagName("first_name")[0].childNodes[0].nodeValue;
After the statement executes, txt holds the value "Alan".
Explained:
- xmlDoc: the XML DOM object created by the parser
- getElementsByTagName("first_name")[0]: the first <first_name> element
- childNodes[0]: the first child of that element (the text node)
- nodeValue: the value of the node (the text itself)
DOM Node List Length
The length property gives the length of a node list (the number of nodes). You can loop through a node list by using the length property:
    xmlDoc = loadXMLDoc("people.xml");
    x = xmlDoc.getElementsByTagName("first_name");
    for (i = 0; i < x.length; i++) {
      document.write(x[i].childNodes[0].nodeValue);
      document.write(" ");
    }
Example explained:
- Load "people.xml" into xmlDoc (loadXMLDoc() is assumed to be a helper function, defined elsewhere, that loads and parses the file).
- Get all <first_name> element nodes.
- For each <first_name> element, output the value of its text node.
Node Types
- The documentElement property of the XML document is the root node.
- The nodeName property of a node is the name of the node.
- The nodeType property of a node is the type of the node.
    xmlDoc = loadXMLDoc("people.xml");
    x = xmlDoc.documentElement.childNodes;
    for (i = 0; i < x.length; i++) {
      if (x[i].nodeType == 1) { // Process only element nodes (type 1)
        document.write(x[i].nodeName);
        document.write(" ");
      }
    }
Example explained:
- Load "people.xml" into xmlDoc.
- Get the child nodes of the root element.
- For each child node, check the node type. If the node type is 1, it is an element node.
- Output the name of the node if it is an element node.
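The nodeType check matters because, in standards-compliant parsers, the line breaks and indentation between tags show up as whitespace-only text nodes (type 3) among the child nodes. The sketch below illustrates this with plain objects standing in for real DOM nodes. The `person` data and the helper `elementChildNames` are hypothetical and introduced only for this example.

```javascript
// Plain-object stand-ins for DOM nodes (NOT the real DOM API).
const ELEMENT_NODE = 1; // same numeric value the DOM uses for element nodes
const TEXT_NODE = 3;    // same numeric value the DOM uses for text nodes
const text = (value) => ({ nodeType: TEXT_NODE, nodeName: "#text", nodeValue: value, childNodes: [] });
const element = (name, ...children) => ({
  nodeType: ELEMENT_NODE,
  nodeName: name,
  nodeValue: null,
  childNodes: children,
});

// Child nodes of a hypothetical <person> element written with indentation:
// the line breaks between tags appear as whitespace-only text nodes.
const person = element("person",
  text("\n  "),
  element("name", text("Alan Turing")),
  text("\n  "),
  element("profession", text("computer scientist")),
  text("\n"));

// Collect the names of element children only, skipping text nodes.
function elementChildNames(node) {
  const names = [];
  for (let i = 0; i < node.childNodes.length; i++) {
    if (node.childNodes[i].nodeType === ELEMENT_NODE) {
      names.push(node.childNodes[i].nodeName);
    }
  }
  return names;
}

console.log(person.childNodes.length);  // 5 child nodes in total
console.log(elementChildNames(person)); // only the two element names
```

Without the type check, the loop would also visit the three whitespace text nodes, whose nodeName is "#text".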
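Get the Value of an Element
    xmlDoc = loadXMLDoc("people.xml");
    x = xmlDoc.getElementsByTagName("first_name")[0].childNodes[0];
    txt = x.nodeValue;        // read the text of the first <first_name> element
    x.nodeValue = "charles";  // replace that text with a new value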
The nodeType Property
The nodeType property specifies the type of node. nodeType is read-only. The most important node types are: