XML: eXtensible Markup Language
Creating portable data
Java E-Commerce © Martin Cooke, 2003
28/02/2019
Plan
- DOM & SAX
- Processing XML in Java
- Transforming XML with XSL
Reminder: XML example

    <?xml version="1.0" encoding="UTF-8"?>
    <Curriculum>
      <Course Title="Z" Lect="Bogdanov">
        <Lecture>Props</Lecture>
        <Lecture>Predicates</Lecture>
        <Lecture>Sets</Lecture>
      </Course>
    </Curriculum>
SAX & DOM
Document object model (DOM)
- A means of representing an XML document within a program
- Not tied to Java: cross-language, cross-platform
- Programmers use DOM through APIs
- Parsing XML results in a DOM tree
- The tree can be traversed, searched, mutated and output
- Loads the entire document into memory, so it is slow and memory-hungry
In short: DOM is a read/write model (it can generate or modify an XML document). It creates a model of the document in memory for later use, which makes it slower than SAX.
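The DOM points above can be tried with the parser that ships with the JDK (javax.xml.parsers), with no extra libraries. This is a minimal sketch rather than a recommended design: the class name DomSketch and the extra lecture title "Relations" are made up for illustration, and the document is the reminder example held in a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Sketch: parse the Curriculum example into a DOM tree, then search and mutate it. */
class DomSketch {

    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<Curriculum>"
        + "<Course Title=\"Z\" Lect=\"Bogdanov\">"
        + "<Lecture>Props</Lecture>"
        + "<Lecture>Predicates</Lecture>"
        + "<Lecture>Sets</Lecture>"
        + "</Course>"
        + "</Curriculum>";

    /** Parses the whole document into memory and returns the DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Searching: counts the Lecture elements anywhere in the tree. */
    static int countLectures(Document doc) {
        return doc.getElementsByTagName("Lecture").getLength();
    }

    /** Mutation: appends a new Lecture element to the first Course. */
    static void addLecture(Document doc, String title) {
        Element course = (Element) doc.getElementsByTagName("Course").item(0);
        Element lecture = doc.createElement("Lecture");
        lecture.setTextContent(title);
        course.appendChild(lecture);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println("Root is " + doc.getDocumentElement().getTagName());
        System.out.println("Lectures before: " + countLectures(doc));
        addLecture(doc, "Relations"); // hypothetical extra lecture
        System.out.println("Lectures after: " + countLectures(doc));
    }
}
```

Because the whole tree is held in memory, the same Document can be queried again after the mutation, which is the random access that SAX does not offer.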
Simple API for XML (SAX)
- Another way to process XML
- Incremental parsing
- Works via parser callbacks
  - The programmer registers handlers with the parser
  - As the XML is being parsed, these handlers are called
- Does not allow random access to the XML document
- Fast and efficient
- Not mutable (unless you build a copy)
- Good model for transformations (see later)
In short: SAX is a read-only, event-based model. For example, when a new <ELEMENT> is read in, a method is called to process it. It is fast.
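The callback style described above can be sketched with the SAX parser bundled in the JDK. The handler below is only an illustration (the names SaxSketch and LectureHandler are invented here): it collects the text of every Lecture element as the parser reports events, without ever building a tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: event-based parsing that collects the text of each Lecture element. */
class SaxSketch {

    /** Handler registered with the parser; its methods are called as parsing proceeds. */
    static class LectureHandler extends DefaultHandler {
        final List<String> lectures = new ArrayList<>();
        private StringBuilder text; // non-null only while inside a Lecture element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("Lecture".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("Lecture".equals(qName)) {
                lectures.add(text.toString());
                text = null;
            }
        }
    }

    /** Runs the parser over the given XML and returns the lecture titles in document order. */
    static List<String> lectureTitles(String xml) {
        try {
            LectureHandler handler = new LectureHandler();
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.lectures;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Curriculum><Course Title=\"Z\" Lect=\"Bogdanov\">"
            + "<Lecture>Props</Lecture><Lecture>Predicates</Lecture><Lecture>Sets</Lecture>"
            + "</Course></Curriculum>";
        System.out.println(lectureTitles(xml));
    }
}
```

The handler keeps only the state it needs (the current text buffer), which is why this approach stays cheap on large documents.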
The view from Java
JDOM
- Java-centric, high-performance alternative to DOM & SAX
- Far simpler in use
- Provides the high performance of SAX: rapid parsing and output
- Provides the document model of DOM without the memory problems: mutable, random-access
- Integrated with the Java Collections framework: methods return List, Map etc.
In short: JDOM offers all the benefits of DOM, but fast.
Parsing XML with JDOM

    // Requires java.io.File, java.util.Iterator, java.util.List,
    // org.jdom.Document, org.jdom.Element and org.jdom.input.SAXBuilder.
    // Exception handling (try/catch) is omitted.

    // Get a document builder
    SAXBuilder builder = new SAXBuilder();

    // Use it to read/parse an XML document
    Document doc = builder.build(new File(args[0]));

    // Get and print the root element
    Element root = doc.getRootElement();
    System.out.println("Root is " + root);

    // Now get the children whose name matches "Course"
    List children = root.getChildren("Course");
    Iterator i = children.iterator();
    while (i.hasNext()) {
        Element child = (Element) i.next();
        System.out.println(" --> " + child);
    }

Notes
- SAXBuilder implements the Builder interface, which mandates several build methods: from an InputStream, a File or a URL
- Can also use DOMBuilder (slower)
- The Builder pattern allows a client to construct a complex object by specifying its type and content, without having to worry about the internals
- Element is the fundamental class in JDOM
- Many JDOM methods return Collection objects such as List
Parsing XML with DTD validation

    // Passing true to the constructor asks the builder to validate against the DTD
    SAXBuilder builder = new SAXBuilder(true);

    // Use it to read/parse an XML document
    Document doc = builder.build(new File(args[0]));

    // ... the rest is the same as in the non-validating parsing code
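The slide turns validation on through JDOM. For comparison, the same idea can be sketched with the JDK's own parser, which needs no extra jar. Everything specific here is an assumption made for illustration: the slides do not show the contents of Curric.DTD, so the DTD below is a guess placed in the internal subset, and the class name DtdValidationSketch is invented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Sketch: DTD validation with the JDK parser (not JDOM). */
class DtdValidationSketch {

    /** Hypothetical DTD for the Curriculum example, declared in the internal subset. */
    static final String DTD =
        "<!DOCTYPE Curriculum ["
        + "<!ELEMENT Curriculum (Course*)>"
        + "<!ELEMENT Course (Lecture+)>"
        + "<!ATTLIST Course Title CDATA #REQUIRED Lect CDATA #REQUIRED>"
        + "<!ELEMENT Lecture (#PCDATA)>"
        + "]>";

    /** Returns true if the document is well-formed and valid against its DTD. */
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // switch DTD validation on
            DocumentBuilder builder = factory.newDocumentBuilder();
            // By default validity errors are only reported; rethrow them so parsing fails.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignore warnings */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or not valid
        } catch (Exception e) {
            throw new IllegalStateException("parser could not be set up or input not read", e);
        }
    }

    public static void main(String[] args) {
        String good = DTD + "<Curriculum><Course Title=\"Z\" Lect=\"Bogdanov\">"
            + "<Lecture>Props</Lecture></Course></Curriculum>";
        String bad = DTD + "<Curriculum><Course Title=\"Z\">"
            + "<Lecture>Props</Lecture></Course></Curriculum>"; // Lect attribute missing
        System.out.println("good is valid: " + isValid(good));
        System.out.println("bad is valid: " + isValid(bad));
    }
}
```

Note that a validating parser reports validity problems through the error handler rather than stopping by itself, so the handler has to rethrow them if invalid input should be rejected.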
Output
Running the parsing code on the reminder document prints:

    Root is [Element: <Curriculum />]
     --> [Element: <Course />]
Generating XML

    // Requires org.jdom.Document, org.jdom.Element and org.jdom.output.XMLOutputter.
    // Exception handling (try/catch) is omitted.

    // Creating a new XML document is simply a matter of generating a root element
    Element curriculum = new Element("Curriculum");
    Document doc = new Document(curriculum);

    // It is similarly straightforward to produce elements with attributes ...
    Element course = new Element("Course");
    course.addAttribute("Title", "Java");
    course.addAttribute("Lect", "Brown");

    // ... to add content ...
    Element lecture1 = new Element("Lecture").setText("Overview");
    Element lecture2 = new Element("Lecture").setText("Basics");
    course.addContent(lecture1);
    course.addContent(lecture2);
    curriculum.addContent(course);

    // ... and to output the document
    XMLOutputter out = new XMLOutputter(" ", true);
    out.output(doc, System.out);

Notes
- The arguments to XMLOutputter ensure indenting and pretty-printing
- JDOM can also output a DOM tree or SAX events
- addAttribute and the two-argument XMLOutputter constructor belong to the JDOM beta releases; JDOM 1.0 and later use setAttribute and XMLOutputter(Format.getPrettyFormat()) instead
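For readers without JDOM on the classpath, roughly the same document can be generated with the DOM and transformation classes in the JDK. This is a sketch of an equivalent, not the JDOM API from the slide: the class name GenerateSketch is invented, and an identity Transformer with the INDENT property stands in for XMLOutputter's pretty-printing.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Sketch: build the Curriculum document with the JDK DOM API and serialise it. */
class GenerateSketch {

    static String buildCurriculum() {
        try {
            // Create an empty document and give it a root element
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
            Element curriculum = doc.createElement("Curriculum");
            doc.appendChild(curriculum);

            // An element with attributes
            Element course = doc.createElement("Course");
            course.setAttribute("Title", "Java");
            course.setAttribute("Lect", "Brown");
            curriculum.appendChild(course);

            // Child elements with text content
            for (String title : new String[] {"Overview", "Basics"}) {
                Element lecture = doc.createElement("Lecture");
                lecture.setTextContent(title);
                course.appendChild(lecture);
            }

            // An identity transformation serialises the tree; INDENT asks for pretty-printing
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not generate XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildCurriculum());
    }
}
```

The DOM version needs the owning Document as a factory for every node, which is one of the places where the JDOM code on the slide is noticeably shorter.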
Adding a DTD

    Element curriculum = new Element("Curriculum");
    Document doc = new Document(curriculum);

    // Attach a document type declaration naming the root element and the DTD file
    doc.setDocType(new DocType("Curriculum", "Curric.DTD"));

    // ... the rest is the same as in the generation code

Note that this omits the try/catch.
Nested generation
Source: JDOM slides
Accessing nested elements
Source: JDOM slides
JDOM likely to be part of Java core
Source: http://www.jdom.org/downloads/oraoscon01-jdom.ppt
Suggested homework
- Download JDOM from jdom.org and familiarise yourself with it
Transforming XML
Motivation
- XML is useful in itself as a means of storing and transmitting data of all kinds, but it can be made more useful via transformation
- Why?
  - Store in one format, display in another
  - Deliver the same content in different forms (HTML, PDF, WML, different XML)
How?
- Option 1: DOM manipulation in a programming language such as Java
  - Gives complete control
- Option 2: Using stylesheets
  - Specifically, using the eXtensible Stylesheet Language for Transformations, XSLT
  - Usually simpler (if you know XSLT), especially for XML-to-XML transformation
XSLT
- XSLT: eXtensible Stylesheet Language for Transformations
- XSLT is one part of XSL, but "XSL" is frequently used to refer to XSLT
- It seems odd at first to consider a transformation as the application of a "style"
Basic idea
The transformation takes an input tree and outputs a tree.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE Curriculum SYSTEM "Curric.DTD">
    <Curriculum>
      <Course Title="Z" Lect="Kyrill Bogdanov">
        <Lecture>Propositions</Lecture>
        <Lecture>Predicates</Lecture>
        <Lecture>Sets</Lecture>
      </Course>
      <Course Title="UML" Lect="Marian Gheorge">
        <Lecture>Use Cases</Lecture>
        <Lecture>Class Diagrams</Lecture>
      </Course>
    </Curriculum>

The transformation consists of a set of transformation rules, each of which focuses on one level of the tree. Each rule:
1. Matches a node in the tree
2. Specifies the structure of the output
3. Indicates the action for child nodes
Finally, the results of all these rule applications are assembled to form the output tree.
Example rule

    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:template match="/">
        <html>
          <body>
            <h1> A simple transformation </h1>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>

Outputs

    <html>
      <body>
        <h1> A simple transformation </h1>
      </body>
    </html>

Notes
- The stylesheet is an XML document: ALL content must be well-formed XML (even generated HTML)
- The xsl namespace allows the processor to distinguish stylesheet infrastructure from the rest
  - A namespace is an XML concept which prevents naming clashes
  - It also helps preprocessors to extract certain elements, e.g. MathML elements
- The match attribute selects a level of the input tree: here, the root
- There are no further rules to match, so everything inside the template is copied to the output
Apply-templates

    <xsl:template match="Curriculum">
      <html>
        <body>
          <h1>Curriculum</h1>
          <ul>
            <xsl:apply-templates/>
          </ul>
        </body>
      </html>
    </xsl:template>

    <xsl:template match="Course">
      <li><xsl:value-of select="@Title"/></li>
    </xsl:template>

Outputs

    <html>
      <body>
        <h1>Curriculum</h1>
        <ul>
          <li>Z</li>
          <li>UML</li>
        </ul>
      </body>
    </html>

Notes
- The apply-templates instruction looks for appropriate templates to apply to the child nodes (here, the Course template)
- value-of extracts the value of the selected node (here, the Title attribute)
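The two templates above can be run from Java with the javax.xml.transform API in the JDK. The sketch below wraps them in a complete stylesheet and applies it to a two-course document held in a string. The class name ApplyTemplatesSketch is invented, and the xsl:output element is an addition not on the slide, included only so that the result comes back as one unindented line.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: apply the Curriculum/Course templates with the JDK's XSLT processor. */
class ApplyTemplatesSketch {

    static final String XML =
        "<Curriculum>"
        + "<Course Title=\"Z\" Lect=\"Kyrill Bogdanov\"><Lecture>Sets</Lecture></Course>"
        + "<Course Title=\"UML\" Lect=\"Marian Gheorge\"><Lecture>Use Cases</Lecture></Course>"
        + "</Curriculum>";

    static final String XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        // Assumption: plain XML output without declaration or indentation
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\" indent=\"no\"/>"
        + "<xsl:template match=\"Curriculum\">"
        + "<html><body><h1>Curriculum</h1><ul><xsl:apply-templates/></ul></body></html>"
        + "</xsl:template>"
        + "<xsl:template match=\"Course\">"
        + "<li><xsl:value-of select=\"@Title\"/></li>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Compiles the stylesheet, applies it to the XML and returns the result as a string. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

The input is written without whitespace between elements on purpose: apply-templates with no select also visits text nodes, so whitespace between Course elements would otherwise be copied into the list.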
Loops and numbering

    <xsl:template match="Curriculum">
      <html>
        <body>
          <h1>Curriculum</h1>
          <xsl:for-each select="Course">
            <xsl:number value="position()"/>. <xsl:value-of select="@Title"/>
            <ul>
              <xsl:apply-templates/>
            </ul>
          </xsl:for-each>
        </body>
      </html>
    </xsl:template>

    <xsl:template match="Lecture">
      <li><xsl:value-of select="."/></li>
    </xsl:template>

Notes
- You don't have to use separate templates: here, the Course elements are selected with for-each and its select attribute
- select="." matches the context (current) node
- There are many built-in functions, such as position()
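To see what the for-each and position() combination produces, the stylesheet can be run through the JDK's XSLT processor. As before this is only a sketch: the class name LoopSketch is invented, and the xsl:output element is an assumption added so that the output is a single unindented line that is easy to inspect.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: number the courses with for-each and position(), list lectures via a template. */
class LoopSketch {

    static final String XML =
        "<Curriculum>"
        + "<Course Title=\"Z\" Lect=\"Kyrill Bogdanov\">"
        + "<Lecture>Propositions</Lecture><Lecture>Predicates</Lecture><Lecture>Sets</Lecture>"
        + "</Course>"
        + "<Course Title=\"UML\" Lect=\"Marian Gheorge\">"
        + "<Lecture>Use Cases</Lecture><Lecture>Class Diagrams</Lecture>"
        + "</Course>"
        + "</Curriculum>";

    static final String XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        // Assumption: plain XML output without declaration or indentation
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\" indent=\"no\"/>"
        + "<xsl:template match=\"Curriculum\">"
        + "<html><body><h1>Curriculum</h1>"
        + "<xsl:for-each select=\"Course\">"
        + "<xsl:number value=\"position()\"/>. <xsl:value-of select=\"@Title\"/>"
        + "<ul><xsl:apply-templates/></ul>"
        + "</xsl:for-each>"
        + "</body></html>"
        + "</xsl:template>"
        + "<xsl:template match=\"Lecture\">"
        + "<li><xsl:value-of select=\".\"/></li>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the serialised result. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

Inside for-each, position() counts within the selected Course elements, so the courses are numbered 1 and 2 regardless of any other nodes in the document.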
Match patterns
    element              match="Course"
    attribute            match="@Lecturer1"
    alternatives         match="Lecture | Tutorial"
    absolute             match="/*/Course"
    elements in context  match="Course/Lecture"
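A small experiment helps to check how the patterns behave. The sketch below uses three of them (absolute, attribute and alternatives) in one stylesheet and runs it with the JDK's XSLT processor. The Tutorial element, the text output method and the class name MatchPatternSketch are assumptions made for this illustration; they do not come from the course documents.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: absolute, attribute and alternative match patterns in one stylesheet. */
class MatchPatternSketch {

    // Hypothetical input: a Course with one Lecture and one Tutorial
    static final String XML =
        "<Curriculum><Course Title=\"Z\" Lect=\"Bogdanov\">"
        + "<Lecture>Sets</Lecture><Tutorial>Exercises</Tutorial>"
        + "</Course></Curriculum>";

    static final String XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:output method=\"text\"/>"
        // Absolute pattern: a Course that is a child of the document element
        + "<xsl:template match=\"/*/Course\">"
        + "<xsl:apply-templates select=\"@Lect | *\"/>"
        + "</xsl:template>"
        // Attribute pattern
        + "<xsl:template match=\"@Lect\">[<xsl:value-of select=\".\"/>]</xsl:template>"
        // Alternatives: the same rule handles Lecture and Tutorial elements
        + "<xsl:template match=\"Lecture | Tutorial\"><xsl:value-of select=\".\"/>;</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the text output. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

Attributes are only processed when they are selected explicitly, which is why the Course template names @Lect in its select; the Title attribute is never selected and so never appears in the output.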
How-to
- Download Xalan from Apache
- There are various options, but the simplest for testing is to add xalan.jar and xerces.jar to the Java classpath and run from the command line:

    java org.apache.xalan.xslt.Process -IN foo.xml -XSL foo.xsl -OUT foo.out
Other means of applying XSLT stylesheets
- On the client browser (IE5 implements XSLT)
- From a servlet (see later), using a Transformer instance
- As part of a web publishing framework, e.g. Cocoon from Apache
More complex transforms
- Some conversions need more support
- E.g. XML to PDF, via Formatting Objects to PDF (FOP)
Summary
- Good support in Java for XML: JDOM
- Transformation of XML is often most easily accomplished using XSL, but can be done programmatically
- Web publishing frameworks exist to simplify document delivery to multiple devices
Resources
Books
- Ray (2001) Learning XML, O'Reilly, ISBN 0-596-00046-4
- McLaughlin (2000) Java & XML, O'Reilly, ISBN 0-596-00016-2 (check whether a later edition is available)
Online documents
- XML Tutorials for Programmers (online XML parser, requires registration): http://www-106.ibm.com/developerworks/education/tutorial-prog/abstract.html
- Transforming XML to PDF: http://www-106.ibm.com/developerworks/education/transforming-xml/xmltopdf
- Why XML?: http://www.w3.org/XML/1999/XML-in-10-points
- XSL for fun and diversion: http://www-106.ibm.com/developerworks/library/hands-on-xsl/
- Simplify XML programming with JDOM: http://www-106.ibm.com/developerworks/java/library/j-jdom/
- Easy Java/XML integration with JDOM, Part 1: http://www.javaworld.com/jw-05-2000/jw-0518-jdom.html
- Tip: Using JDOM and XSLT: http://www-106.ibm.com/developerworks/java/library/x-tipjdom.html
Websites
- http://www.xml.org
- http://www.jdom.org