XML for E-commerce III
Helena Ahonen-Myka
In this part...
- Transforming XML
- Traversing XML
- Web publishing frameworks
Transforming XML
Extensible Stylesheet Language (XSL)
- a language for transforming XML documents
- an XML vocabulary for specifying the formatting of XML documents
XSLT
- specifies the conversion of a document from one format to another
- an XSLT transformation (stylesheet) is itself a well-formed XML document
- based on a hierarchical tree structure
- provides a mechanism for matching patterns within the original XML document and applying formatting to the matched data
XML Path Language (XPath)
- a mechanism for referring to the wide variety of element and attribute names and values in an XML document
- documents that are not validated must also be transformable, so a DTD cannot be relied on to outline the structure
- tree-based: specify the path to the element
XPath
- specify the path relative to the current element being processed
- …or relative to the root: reference an element that is not in the current element's scope
- …or using pattern matching: find an element whose parent is element E and which has a sibling element F
XPath: examples
- Match the element named Book relative to the current element:
XPath: examples
- Match the element named Contents nested within the Book element:
- Match the Contents element using an absolute path:
XPath: examples
- Match the focus attribute of the current element:
- Match the focus attribute of the Chapter element:
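As an illustration of relative paths, absolute paths, and attribute matches, here is a minimal Java sketch using the standard javax.xml.xpath API. The sample document, the class name XPathContextDemo, and the helper evalFrom are assumptions made up for this example; only the element and attribute names (Book, Contents, Chapter, focus) come from the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPathContextDemo {
    // Hypothetical sample document, invented for this sketch.
    static final String XML =
        "<Book><Contents><Chapter focus='Java'>SAX</Chapter></Contents></Book>";

    /** Selects a context node with contextExpr, then evaluates expr relative to it. */
    static String evalFrom(String contextExpr, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node context = (Node) xpath.evaluate(contextExpr, doc, XPathConstants.NODE);
            // The two-argument form returns the string value of the first matched node.
            return xpath.evaluate(expr, context);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Relative path: Contents/Chapter as seen from the Book element.
        System.out.println(evalFrom("/Book", "Contents/Chapter"));
        // Attribute of the current element.
        System.out.println(evalFrom("/Book/Contents/Chapter", "@focus"));
        // Attribute of a child element.
        System.out.println(evalFrom("/Book/Contents", "Chapter/@focus"));
        // Absolute path: works regardless of the current element.
        System.out.println(evalFrom("/Book/Contents/Chapter", "/Book/Contents"));
    }
}
```

A relative expression that does not fit the current element, such as Chapter/@focus evaluated from Book, simply yields an empty string here, because Chapter is not a direct child of Book.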
XPath: examples
- Match any Para element with an Appendix ancestor element: … select="Appendix//Para"
- id("W11") matches the element with unique ID 'W11'
- Para[1] matches any Para element that is the first Para child element of its parent
XPath: examples
- Para[last()=1] matches any Para element that is the only Para child element of its parent
- Items/Item[position()>1] matches any Item element that has an Items parent and that is not the first Item child of its parent
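The predicate examples can be tried out with a short Java sketch like the following. The sample document, the class name XPathPredicateDemo, and the helper select are assumptions for illustration; the expressions are prefixed with // so that they can be evaluated from the document root.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathPredicateDemo {
    // Hypothetical sample document, invented for this sketch.
    static final String XML =
        "<Doc>"
      + "<Section><Para>a</Para><Para>b</Para></Section>"
      + "<Section><Para>c</Para></Section>"
      + "<Items><Item>1</Item><Item>2</Item><Item>3</Item></Items>"
      + "</Doc>";

    /** Returns the text content of every node matched by expr, in document order. */
    static List<String> select(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//Para[1]"));                  // first Para child of each parent
        System.out.println(select("//Para[last()=1]"));           // Para that is an only Para child
        System.out.println(select("//Items/Item[position()>1]")); // every Item except the first
    }
}
```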
XPath
- because the input document is often not fixed, an XPath expression may match no input data, a single input element or attribute, or multiple input elements or attributes
- the result of evaluating an XPath expression is referred to as a node set
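To make the node set idea concrete, the sketch below counts how many nodes different expressions select from one document. The sample document and the names NodeSetDemo and count are hypothetical; the javax.xml.xpath calls are standard JDK API.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeSetDemo {
    // Hypothetical sample document, invented for this sketch.
    static final String XML =
        "<Book>"
      + "<Chapter>Intro</Chapter>"
      + "<Appendix><Section><Para>x</Para></Section><Para>y</Para></Appendix>"
      + "</Book>";

    /** Returns the size of the node set that expr selects. */
    static int count(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/Book/Index"));           // nothing matches: empty node set
        System.out.println(count("/Book/Chapter"));         // exactly one element
        System.out.println(count("/Book/Appendix//Para"));  // several elements at different depths
    }
}
```

Code that consumes a node set should therefore be written to cope with zero, one, or many nodes rather than assuming a single match.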
An XSL stylesheet is an XML document
- must be well-formed
- should begin with an XML declaration
- must declare all the namespaces it uses
- the XSL namespace (prefix xsl:) defines the elements needed for performing transformations
Skeleton XSL stylesheet
    <xsl:stylesheet xmlns:xsl="…" xmlns:JavaXML="…" version="1.0">
XSL Template
- locates a particular element or set of elements within the input XML document and applies a rule or set of rules to the resulting XML
Printing all the data:
    <xsl:stylesheet xmlns:xsl="…" xmlns:JavaXML="…" version="1.0">
Notes:
- JavaXML:Book is the root element
- xsl:apply-templates tells the processor to match the children of the current element and apply their templates
- every element has a built-in default template, which applies templates to its children; the default template for text prints the data content
- result (previous slide): all the data of the document is printed (unformatted)
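The effect of the default templates can be reproduced with the JDK's built-in XSLT processor. The following is a minimal sketch, not the stylesheet from the slides: the namespace URI http://example.com/javaxml, the sample book, and the names DefaultTemplateDemo and transform are assumptions. The text output method is used so that only the data content appears.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DefaultTemplateDemo {
    // Hypothetical namespace URI for the JavaXML prefix.
    static final String NS = "http://example.com/javaxml";

    // Hypothetical input document.
    static final String XML =
        "<JavaXML:Book xmlns:JavaXML='" + NS + "'>"
      + "<JavaXML:Title>Java and XML</JavaXML:Title>"
      + "<JavaXML:Contents><JavaXML:Chapter focus='XML'>"
      + "<JavaXML:Heading>Introduction</JavaXML:Heading>"
      + "</JavaXML:Chapter></JavaXML:Contents>"
      + "</JavaXML:Book>";

    // One explicit template for the root element; everything below it is
    // handled by the built-in default templates.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:JavaXML='" + NS + "' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='JavaXML:Book'>"
      + "<xsl:apply-templates/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the input and returns the result as a string. */
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the text of Title and Heading without any markup.
        System.out.println(transform(XSLT, XML));
    }
}
```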
Generating HTML
    Here is my HTML page!
HTML Output
    Here is my HTML page! Java and XML Introduction What Is It? How Do I Use It?...
xsl:value-of element
Produces:
    ... Java and XML...
Looping and iteration
- xsl:for-each
    Table of Contents
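A table of contents built with xsl:for-each might look like the following sketch. It is an assumption-laden illustration rather than the original slide code: the namespace URI, the sample chapters, and the names ForEachDemo and transform are invented, and plain text is produced instead of HTML to keep the output easy to inspect.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ForEachDemo {
    static final String NS = "http://example.com/javaxml"; // hypothetical URI

    // Hypothetical input document with two chapters.
    static final String XML =
        "<JavaXML:Book xmlns:JavaXML='" + NS + "'><JavaXML:Contents>"
      + "<JavaXML:Chapter><JavaXML:Heading>Introduction</JavaXML:Heading></JavaXML:Chapter>"
      + "<JavaXML:Chapter><JavaXML:Heading>SAX</JavaXML:Heading></JavaXML:Chapter>"
      + "</JavaXML:Contents></JavaXML:Book>";

    // xsl:for-each visits every Chapter; position() is its index in the loop.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:JavaXML='" + NS + "' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/JavaXML:Book'>"
      + "<xsl:text>Table of Contents: </xsl:text>"
      + "<xsl:for-each select='JavaXML:Contents/JavaXML:Chapter'>"
      + "<xsl:value-of select='position()'/><xsl:text>. </xsl:text>"
      + "<xsl:value-of select='JavaXML:Heading'/><xsl:text>; </xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the input and returns the result as a string. */
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSLT, XML));
    }
}
```

Inside the loop the current element is the Chapter being visited, so JavaXML:Heading is resolved relative to it.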
Conditional processing: if
- xsl:if: selects nodes that conform to both an XPath expression and some user-defined criteria
- only chapters with focus=Java:
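One way to keep only the chapters with focus=Java is sketched below. The namespace URI, the sample chapters, and the names IfDemo and transform are assumptions for this example; the test expression @focus="Java" is the user-defined criterion.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class IfDemo {
    static final String NS = "http://example.com/javaxml"; // hypothetical URI

    // Hypothetical input: one XML-focused and two Java-focused chapters.
    static final String XML =
        "<JavaXML:Book xmlns:JavaXML='" + NS + "'><JavaXML:Contents>"
      + "<JavaXML:Chapter focus='XML'><JavaXML:Heading>Introduction</JavaXML:Heading></JavaXML:Chapter>"
      + "<JavaXML:Chapter focus='Java'><JavaXML:Heading>SAX</JavaXML:Heading></JavaXML:Chapter>"
      + "<JavaXML:Chapter focus='Java'><JavaXML:Heading>DOM</JavaXML:Heading></JavaXML:Chapter>"
      + "</JavaXML:Contents></JavaXML:Book>";

    // The body of xsl:if is emitted only when the test expression is true.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:JavaXML='" + NS + "' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/JavaXML:Book'>"
      + "<xsl:for-each select='JavaXML:Contents/JavaXML:Chapter'>"
      + "<xsl:if test='@focus=\"Java\"'>"
      + "<xsl:value-of select='JavaXML:Heading'/><xsl:text>; </xsl:text>"
      + "</xsl:if>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the input and returns the result as a string. */
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSLT, XML));
    }
}
```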
Conditional processing: choose
    (Java Focus) (XML Focus)
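Where xsl:if has no else branch, xsl:choose with xsl:when and xsl:otherwise selects between alternatives. The sketch below labels each chapter heading with its focus; the namespace URI, the sample data, and the names ChooseDemo and transform are assumptions, and only the labels (Java Focus) and (XML Focus) are taken from the slide.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ChooseDemo {
    static final String NS = "http://example.com/javaxml"; // hypothetical URI

    // Hypothetical input document.
    static final String XML =
        "<JavaXML:Book xmlns:JavaXML='" + NS + "'><JavaXML:Contents>"
      + "<JavaXML:Chapter focus='XML'><JavaXML:Heading>Introduction</JavaXML:Heading></JavaXML:Chapter>"
      + "<JavaXML:Chapter focus='Java'><JavaXML:Heading>SAX</JavaXML:Heading></JavaXML:Chapter>"
      + "</JavaXML:Contents></JavaXML:Book>";

    // The first xsl:when whose test is true wins; xsl:otherwise is the fallback.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:JavaXML='" + NS + "' version='1.0'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/JavaXML:Book'>"
      + "<xsl:for-each select='JavaXML:Contents/JavaXML:Chapter'>"
      + "<xsl:value-of select='JavaXML:Heading'/>"
      + "<xsl:choose>"
      + "<xsl:when test='@focus=\"Java\"'><xsl:text> (Java Focus)</xsl:text></xsl:when>"
      + "<xsl:otherwise><xsl:text> (XML Focus)</xsl:text></xsl:otherwise>"
      + "</xsl:choose>"
      + "<xsl:text>; </xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the input and returns the result as a string. */
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSLT, XML));
    }
}
```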
Adding elements and attributes
- xsl:element, xsl:attribute
    XML is great!
Produces:
    is great!
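As a rough illustration of xsl:element and xsl:attribute, the sketch below wraps a piece of text in a newly created element and gives it an attribute computed from the input. The input document, the element name p, the attribute name class, and the Java names ElementAttributeDemo and transform are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ElementAttributeDemo {
    // Hypothetical input document.
    static final String XML = "<Note kind='tip'>XML is great!</Note>";

    // xsl:element creates the output element, xsl:attribute adds an attribute
    // to it; the attribute value is taken from the input's kind attribute.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/Note'>"
      + "<xsl:element name='p'>"
      + "<xsl:attribute name='class'><xsl:value-of select='@kind'/></xsl:attribute>"
      + "<xsl:value-of select='.'/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the input and returns the result as a string. */
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSLT, XML));
    }
}
```

Note that xsl:attribute has to come before any child content of the element it decorates.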
Numbering
Sorting James Clark …
Sorting
Copying parts without transforming
- sometimes a part of the document should be passed through as is, without any transformation
- assume: JavaXML:Copyright contains some HTML formatting:
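Passing content through unchanged is typically done with xsl:copy-of. The following sketch assumes a JavaXML:Copyright element containing a small piece of HTML; the namespace URI, the copyright text, and the names CopyOfDemo and transform are invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CopyOfDemo {
    static final String NS = "http://example.com/javaxml"; // hypothetical URI

    // Hypothetical input: the Copyright element carries inline HTML markup.
    static final String XML =
        "<JavaXML:Book xmlns:JavaXML='" + NS + "'>"
      + "<JavaXML:Copyright><i>Copyright 2000</i>, all rights reserved</JavaXML:Copyright>"
      + "</JavaXML:Book>";

    // xsl:copy-of copies the selected nodes, markup included, to the output.
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:JavaXML='" + NS + "' version='1.0'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='JavaXML:Copyright'>"
      + "<xsl:copy-of select='node()'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the input and returns the result as a string. */
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The i element survives in the output instead of being flattened to text.
        System.out.println(transform(XSLT, XML));
    }
}
```

The copied elements may carry namespace declarations that were in scope in the input, so the i element can appear with an xmlns:JavaXML attribute in the result.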
Formatting objects (e.g. for PDF)
    <fo:block font-size="24pt" text-align-last="centered" space-before.optimum="24pt">
Produces:
    <fo:block font-size="24pt" text-align="centered" space-before.optimum="24pt"> Java and XML
Traversing XML
- transforming documents requires random access to a document
- SAX cannot look backward or forward
- with SAX it is difficult to locate siblings and children
- DOM: access to any part of the tree
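The kind of random access that DOM offers can be sketched as follows: starting from any node, the previous sibling, next sibling, and parent are one call away. The sample document and the names DomNavigationDemo, parse, and describe are assumptions; the parser is obtained through the standard JAXP DocumentBuilderFactory.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomNavigationDemo {
    // Hypothetical sample document, written without whitespace between
    // elements so that every child node is a Chapter element.
    static final String XML =
        "<Contents><Chapter>Intro</Chapter><Chapter>SAX</Chapter><Chapter>DOM</Chapter></Contents>";

    /** Parses the XML string into a DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Describes the neighbours of the chapter at the given index. */
    static String describe(int index) {
        Node chapter = parse(XML).getDocumentElement().getChildNodes().item(index);
        Node prev = chapter.getPreviousSibling(); // look backward
        Node next = chapter.getNextSibling();     // look forward
        return "prev=" + (prev == null ? "none" : prev.getTextContent())
             + ", next=" + (next == null ? "none" : next.getTextContent())
             + ", parent=" + chapter.getParentNode().getNodeName();
    }

    public static void main(String[] args) {
        System.out.println(describe(0));
        System.out.println(describe(1));
    }
}
```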
DOM
- Level 1: navigation of content within a document
- Level 2: modules and options for specific content models, such as XML, HTML, and CSS
- Level 3
DOM Java bindings
- interfaces and classes that define and implement the DOM
- java-binding.html
- bindings are often included in the parser implementations (the parser generates a DOM tree)
Parsing using a DOM parser
    import org.apache.xerces.parsers.DOMParser;

    DOMParser parser = new DOMParser();
    parser.parse(uri);
Output is important
- the entire document is parsed and added to the output tree before any processing takes place
- handle: the org.w3c.dom.Document object, which sits one level above the root element of the document
    parser.parse(uri);
    Document doc = parser.getDocument();
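The relationship between the Document object and the root element can be checked with a few lines of Java. This sketch swaps in the standard JAXP DocumentBuilder for the Xerces DOMParser shown on the slides, so that it runs on a plain JDK; the sample document and the names DomParseDemo and parse are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomParseDemo {
    /** Parses the whole XML string and returns the resulting Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        Document doc = parse("<Book><Title>Java and XML</Title></Book>");
        Element root = doc.getDocumentElement();

        // The Document is a node in its own right, one level above the root element.
        System.out.println(doc.getNodeType() == Node.DOCUMENT_NODE);
        System.out.println(root.getNodeName());
        System.out.println(root.getParentNode() == doc);
    }
}
```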
Printing a document
    private static void printTree(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                // Print the contents of the Document object
                break;
            case Node.ELEMENT_NODE:
                // Print the element and its attributes
                break;
            case Node.TEXT_NODE:
            ...
…the Document node
    case Node.DOCUMENT_NODE:
        System.out.println(" \n");
        // The Document itself has no markup; continue with the root element
        Document doc = (Document) node;
        printTree(doc.getDocumentElement());
        break;
… elements
    case Node.ELEMENT_NODE:
        String name = node.getNodeName();
        System.out.print("<" + name);
        // Print out attributes… (see next slide…)
        System.out.println(">");
        // recurse on each child
        NodeList children = node.getChildNodes();
        if (children != null) {
            for (int i = 0; i < children.getLength(); i++) {
                printTree(children.item(i));
            }
        }
        // close the element
        System.out.println("</" + name + ">");
        break;
… and their attributes
    case Node.ELEMENT_NODE:
        String name = node.getNodeName();
        System.out.print("<" + name);
        // print each attribute as name="value"
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node current = attributes.item(i);
            System.out.print(" " + current.getNodeName() +
                             "=\"" + current.getNodeValue() + "\"");
        }
        System.out.println(">");
        ...
…textual nodes
    case Node.TEXT_NODE:
    case Node.CDATA_SECTION_NODE:
        // text and CDATA sections are printed as they are
        System.out.print(node.getNodeValue());
        break;
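Put together, the fragments for the document, element, attribute, and text cases could look like the sketch below. It is one possible assembly, not the original program: it writes into a StringBuilder instead of System.out so the result can be inspected, it uses the JAXP DocumentBuilder rather than the Xerces DOMParser, and the names PrintTreeDemo and serialize are assumptions. Special characters are not escaped, so the output is only suitable for simple documents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PrintTreeDemo {
    /** Recursively appends a textual form of node and its subtree to out. */
    static void printTree(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                // The Document sits above the root element: descend into it.
                printTree(((Document) node).getDocumentElement(), out);
                break;
            case Node.ELEMENT_NODE:
                String name = node.getNodeName();
                out.append('<').append(name);
                NamedNodeMap attributes = node.getAttributes();
                for (int i = 0; i < attributes.getLength(); i++) {
                    Node current = attributes.item(i);
                    out.append(' ').append(current.getNodeName())
                       .append("=\"").append(current.getNodeValue()).append('"');
                }
                out.append('>');
                // Recurse on each child, then close the element.
                NodeList children = node.getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    printTree(children.item(i), out);
                }
                out.append("</").append(name).append('>');
                break;
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                out.append(node.getNodeValue());
                break;
            default:
                // Comments, processing instructions, etc. are skipped in this sketch.
                break;
        }
    }

    /** Parses xml into a DOM tree and prints it back into a string. */
    static String serialize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            printTree(doc, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        System.out.println(serialize("<Book id='1'><Title>Java and XML</Title></Book>"));
    }
}
```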
Web publishing frameworks
- See: chapter/ch09.html