Extensible Markup Language (XML)
AGENDA
  OVERVIEW OF XML
  DOCUMENT TYPE DEFINITION LANGUAGE
  XML SCHEMA
  XML PARSERS
    1) DOM PARSER
    2) SAX PARSER
    3) JAXB PARSER
  EXTENSIBLE STYLESHEET TRANSFORMATION (XSLT)
OVERVIEW OF XML
What is a markup language?
  Markup languages are designed for the processing, definition and presentation of text.
  A markup language specifies codes (tags) for formatting, covering both layout and style, within a text file.
  The best-known markup languages are HTML and XML.
XML is a framework for defining markup languages
  Each language is targeted at its own application domain and defines its own markup tags.
  A common set of generic tools exists for processing XML documents.
How is XML different from HTML?
  Markup languages generally combine two distinct functions of representing a text document: the ‘look’ and the ‘structure’.
  HTML and XML have different goals. HTML was designed to display data and hence focuses on the ‘look’ of the data; XML was designed to describe and carry data and hence focuses on ‘what the data is’.
  HTML is about displaying data and XML is about describing data.
  HTML and XML are complementary to each other.
XML FEATURES
  XML can be used to create new languages. Ex: WML, X3D (the XML-based successor to VRML)
  XML uses a DTD (Document Type Definition) to describe data
  XML with a DTD is self-descriptive
  XML separates data from display formats
  XML can be used as a format to exchange data
  Data can be stored in either files or databases
  Java = portable programs, XML = portable data
XML Syntax
  XML syntax consists of:
    XML declaration
    XML elements
    XML attributes
XML Declaration
  The first line of an XML document should always be an XML declaration stating the version of XML.
XML Element
  XML is a markup language that is used to store data in a self-explanatory manner. The data becomes "self-explanatory" because it is contained in named elements: if a piece of text is a title, it is contained within a "title" element.
XML Attributes
  Attributes specify additional information about an element. An attribute appears within the element's opening tag. The syntax for including an attribute in an element is:
    <element-name attribute-name="value">
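The three syntax pieces above can be seen together in a small sketch. The sample document below (a "book" element with an "isbn" attribute and a "title" child) is an assumption made for illustration, not the deck's own sample; the code reads it with the JAXP parser that ships with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XmlSyntaxDemo {
    // Hypothetical sample: an XML declaration, a root element with one
    // attribute, and one child element.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<book isbn=\"1243\">\n"
      + "  <title>XML for dummies</title>\n"
      + "</book>";

    // Parses the text and returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element book = parseRoot(XML);
        System.out.println("element:   " + book.getTagName());
        System.out.println("attribute: isbn=" + book.getAttribute("isbn"));
        System.out.println("title:     "
            + book.getElementsByTagName("title").item(0).getTextContent());
    }
}
```

The declaration has to be the very first thing in the document, which is why the string starts with it and has no leading whitespace.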
SAMPLE APPLICATION
  XML for dummies
    introduction to xml
      Markup languages
      Features of XML
    XML syntax
      Elements must be enclosed in tags
      Elements must be properly nested
DOCUMENT TYPE DEFINITION LANGUAGE
  A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes.
  A DTD is associated with an XML document via a Document Type Declaration, which is a tag that appears near the start of the XML document. The declaration establishes that the document is an instance of the type defined by the referenced DTD.
  The declarations in a DTD are divided into an internal subset and an external subset.
Internal DTD Declaration
  If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:
    <!DOCTYPE root-element [element-declarations]>
  Example XML document with an internal DTD:
    <!DOCTYPE book[ ]> 1243 john
External DTD Declaration
  If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:
    <!DOCTYPE root-element SYSTEM "filename">
  Example XML document with an external DTD:
    Tove Jani
  And the file "note.dtd" which contains the DTD:
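To see a DTD doing its job, the sketch below validates two small documents against an internal DTD using the JDK's JAXP parser. The DTD itself (a "note" that must contain "to" followed by "from") is an assumed, simplified stand-in written for this illustration; it is not the content of note.dtd.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdValidationDemo {
    // Assumed internal DTD: a note holds exactly one <to>, then one <from>.
    static final String DOCTYPE =
        "<!DOCTYPE note [\n"
      + "  <!ELEMENT note (to, from)>\n"
      + "  <!ELEMENT to (#PCDATA)>\n"
      + "  <!ELEMENT from (#PCDATA)>\n"
      + "]>\n";

    static final String VALID =
        DOCTYPE + "<note><to>Tove</to><from>Jani</from></note>";
    static final String INVALID =
        DOCTYPE + "<note><from>Jani</from></note>"; // <to> is missing

    // Returns true if the document is well-formed and valid against its DTD.
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validation errors are only reported, not thrown, unless the
            // error handler rethrows them.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or violates the DTD
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("valid document:   " + isValid(VALID));
        System.out.println("invalid document: " + isValid(INVALID));
    }
}
```

The error handler matters here: without one that rethrows, a validating parser would report the violation and still return a document.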
XML SCHEMA
  An XML schema is a description of a type of XML document.
  An XML schema describes the structure of an XML document.
  The XML Schema language is also referred to as XML Schema Definition (XSD).
  An XML Schema:
    defines elements that can appear in a document
    defines attributes that can appear in a document
    defines the order of child elements
    defines the number of child elements
    defines data types for elements and attributes
SCHEMA LOCATION
  In an instance document, the attribute xsi:schemaLocation gives the processor a hint about where to find the schema document for a namespace:
    <purchaseReport xmlns=" xmlns:xsi=" xsi:schemaLocation=" period="P3M" periodEnding=" ">
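A minimal sketch of schema validation, using the javax.xml.validation API from the JDK. The schema below (a "person" with a string "name" followed by a decimal "height") is an assumption made for illustration, and it is handed to the validator directly in code rather than located through xsi:schemaLocation.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdValidationDemo {
    // Assumed schema: fixes the child elements, their order and data types.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"person\">"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name=\"name\" type=\"xs:string\"/>"
      + "<xs:element name=\"height\" type=\"xs:decimal\"/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the XML text conforms to the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no error handler set, validate() throws on the first error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema violation or malformed XML
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<person><name>Jeff</name><height>1.80</height></person>"));
        System.out.println(isValid("<person><name>Jeff</name><height>tall</height></person>"));
    }
}
```

The second document fails because "tall" is not a decimal, which is the kind of data-type check a DTD cannot express.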
XML PARSERS
  Parsing means breaking something (for example a sentence) down into its component parts and explaining the form, function, and syntactical relationship of each part.
  Compilers parse program text to identify the program elements and check that the text conforms to the correct syntax.
  An XML parser is the piece of software that reads XML files and makes the information from those files available to applications and programming languages. It reads an XML document, identifies all the XML tags and passes the data to the application.
  All modern browsers have a built-in XML parser that can be used to read and manipulate XML. The parser reads XML into memory and converts it into an XML DOM object that can be accessed with JavaScript.
XML PARSERS
  DOM PARSERS
    1) DOM Characteristics
    2) DOM in Action
    3) DOM Tree and Nodes
    4) DOM Programming Procedures
  SAX PARSERS
    1) SAX Features
    2) SAX Operational Model
    3) SAX Programming Procedures
    4) Benefits of SAX
  JAXB PARSERS
    1) JAXB Design Goals
    2) JAXB Binding Lifecycle
    3) JAXB Runtime Operations
    4) JAXB Programming
DOM PARSER
  DOM is cross-platform and cross-language
  Uses OMG’s IDL to define interfaces
  IDL-to-language bindings
DOM CHARACTERISTICS
  Access the XML document as a tree structure
  Composed mostly of element nodes and text nodes
  Can “walk” the tree back and forth
  Larger memory requirements
  Fairly heavyweight to load and store
  Use it for walking and modifying the tree
DOM IN ACTION
DOM TREE AND NODES
  An XML document is represented as a tree
  A tree is made of nodes
  There are 12 different node types
Node Types
  Document node
  Document Fragment node
  Element node
  Attribute node
  Text node
  Comment node
  Processing instruction node
  Document type node
  Entity node
  Entity reference node
  CDATA section node
  Notation node
Example XML Document
    <people>
      <name>
        <first_name>Alan</first_name>
        <last_name>Turing</last_name>
      </name>
    </people>
DOM Tree Example
  XML Document node
    element node “people”
      element node “name”
        element node “first_name”
          text node “Alan”
        element node “last_name”
          text node “Turing”
Interfaces for DOM
  NodeList
  NamedNodeMap
  DOMImplementation
Node Interface
  Primary data type in DOM
  Represents a single node in a DOM tree
  Every node implements the Node interface
Useful Node interface methods
    public short getNodeType();
    public String getNodeName();
    public String getNodeValue();
    public NamedNodeMap getAttributes();
    public NodeList getChildNodes();
NodeList Interface
  Represents a collection of nodes
  Return type of the getChildNodes() method of the Node interface
    public interface NodeList {
        public Node item(int index);
        public int getLength();
    }
NamedNodeMap Interface
  Represents a collection of nodes, each of which can be identified by name
  Return type of the getAttributes() method of the Node interface
Document Interface
  Contains factory methods for creating other nodes (elements, text nodes)
  Has a method to get the root element node
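The interfaces above are enough to walk a whole tree. The following sketch parses the Alan Turing example with the JDK's JAXP parser and prints one line per node, using getNodeType(), getNodeName(), getNodeValue(), getAttributes() and getChildNodes(). The "id" attribute is an addition made here so that NamedNodeMap has something to show; it is not part of the original example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomWalkDemo {
    // The id attribute is hypothetical, added to exercise NamedNodeMap.
    static final String XML =
        "<people><name id=\"1\">"
      + "<first_name>Alan</first_name><last_name>Turing</last_name>"
      + "</name></people>";

    // Appends one line for this node, then recurses into its children.
    static void walk(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                out.append("element ").append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(" @").append(attr.getNodeName())
                       .append('=').append(attr.getNodeValue());
                }
                break;
            }
            case Node.TEXT_NODE:
                out.append("text ").append(node.getNodeValue());
                break;
            default:
                out.append("node ").append(node.getNodeName());
        }
        out.append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    // Parses the XML text and returns the indented node listing.
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(describe(XML));
    }
}
```

Attributes are reached through getAttributes() rather than getChildNodes(), since attribute nodes are not children of their element.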
DocumentType Interface
    public interface DocumentType extends Node {
        public String getName();
        public NamedNodeMap getEntities();
        public NamedNodeMap getNotations();
        public String getPublicId();
        public String getSystemId();
        public String getInternalSubset();
    }
Code Example
    case Node.PROCESSING_INSTRUCTION_NODE:
        // Print a processing instruction as <?target data?>
        System.out.println("<?" + node.getNodeName() + " " + node.getNodeValue() + "?>");
        break;
DOM Programming Procedures
  Create a parser object
  Set features and read properties
  Parse the XML document and get a Document object
  Perform operations
    Traversing DOM
    Manipulating DOM
    Creating a new DOM
    Writing out DOM
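The procedure above can be sketched end to end with the standard JAXP classes in the JDK (rather than the Xerces-specific classes used elsewhere in these slides). The method name addHeight and the sample "person" document are assumptions chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomProcedureDemo {
    // Parses xml, appends a <height> child to the root, returns the new XML text.
    static String addHeight(String xml, String height) {
        try {
            // 1. Create a parser object
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // 2. Set features
            factory.setIgnoringComments(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // 3. Parse the XML document and get a Document object
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // 4. Perform operations: here, manipulate the tree
            Element root = doc.getDocumentElement();
            Element item = doc.createElement("height");
            item.appendChild(doc.createTextNode(height));
            root.appendChild(item);
            // 5. Write out the DOM with an identity transformation
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addHeight("<person><name>Jeff</name></person>", "1.80"));
    }
}
```

DOM has no standard "save" method at this level, so writing out is commonly done by passing the tree through a Transformer created without a stylesheet.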
CREATING A DOM OBJECT
    import java.io.IOException;
    import org.apache.xerces.parsers.DOMParser;
    import org.w3c.dom.Document;
    import org.xml.sax.SAXException;

    String xmlFile = "file:///xerces-1_3_0/data/personal.xml";
    DOMParser parser = new DOMParser();          // Xerces DOM parser
    try {
        parser.parse(xmlFile);                   // read and parse the document
    } catch (SAXException se) {
        se.printStackTrace();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
    Document document = parser.getDocument();    // the parsed DOM tree
Generating A New DOM
    import org.apache.xerces.dom.DocumentImpl;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;

    try {
        // Generate a new DOM tree
        Document doc = new DocumentImpl();
        Element root = doc.createElement("person");    // create root element
        Element item = doc.createElement("name");      // create child element
        item.appendChild(doc.createTextNode("Jeff"));
        root.appendChild(item);                        // attach element to root element
        item = doc.createElement("height");
        item.appendChild(doc.createTextNode("1.80"));
        root.appendChild(item);                        // attach the second element too
        doc.appendChild(root);                         // attach root element to the document
    } catch (Exception ex) {
        ex.printStackTrace();
    }
SAX PARSER
  Simple API for XML
  Started as a community-driven project
SAX Features
  Event-driven: you provide the event handlers
  Fast and lightweight: the document does not have to be entirely in memory
  Sequential read access only
  One-time access
  Does not support modification of the document
SAX Operational Model
  [Figure: XML document -> (input) -> parser -> (events) -> provided handler]
SAX Programming Procedures
SAX Event Handlers
SAX Parser Example
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLReaderFactory;

    XMLReader parser = null;
    try {
        // Create XML (non-validating) parser
        parser = XMLReaderFactory.createXMLReader();
        // Create event handler
        myContentHandler handler = new myContentHandler();
        parser.setContentHandler(handler);
        // Call parsing method
        parser.parse(args[0]);
    } catch (SAXException ex) {
        System.err.println(ex.getMessage());
    } catch (Exception ex) {
        System.err.println(ex.getMessage());
    }
SAX Event Handler
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    // DefaultHandler implements ContentHandler with empty methods,
    // so only the events of interest need to be overridden.
    class myContentHandler extends DefaultHandler {
        public void startDocument() {
            System.out.println("XML Document START");
        }
        public void endDocument() {
            System.out.println("XML Document END");
        }
        public void startElement(String namespace, String name, String qName, Attributes atts) {
            System.out.println("<" + qName + ">");
        }
        public void endElement(String namespace, String name, String qName) {
            System.out.println("</" + qName + ">");
        }
        public void characters(char[] chars, int start, int length) {
            System.out.println(new String(chars, start, length));
        }
    }
Benefits of SAX
  It is very simple
  It is very fast
  Useful when custom data structures are needed to model the XML document
  Can parse files of any size without impacting memory usage
Drawbacks of SAX
  SAX provides read-only access
  No random access to documents
  Searching of documents is not easy
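The event-driven model and the "custom data structure" benefit can be illustrated with a short sketch. It uses the JAXP SAXParserFactory from the JDK and records each event in a list instead of printing it; the class name SaxEventDemo and the event labels are assumptions for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventDemo {
    // Parses the XML text and returns the sequence of SAX events it produced.
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() {
                log.add("startDocument");
            }
            @Override public void endDocument() {
                log.add("endDocument");
            }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                log.add("start:" + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text between elements.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : events("<name><first_name>Alan</first_name></name>")) {
            System.out.println(event);
        }
    }
}
```

The parser never builds a tree: once an event has been delivered it is gone, so anything the application needs later has to be stored by the handler itself.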
JAXB PARSER
  Provides an efficient and standard way of mapping between XML and Java code
  Programmers no longer have to create the application's Java objects themselves
  Programmers do not have to deal with XML structure; instead they deal with meaningful business data
JAXB Design Goals
  Easy to use: no need to deal with the complexities of SAX and DOM
  Customizable: allows keeping pace with schema evolution
  Portable: JAXB components can be replaced without having to make significant changes to the rest of the source code
How to Use JAXB
  Develop or obtain an XML schema
  Generate the Java source files
  Develop the JAXB client application
  Compile the Java source code
  With the generated classes and the binding framework, write Java applications that:
    1) Build object trees representing XML data
JAXB Binding Lifecycle
JAXB Runtime Operations
  Provide the following functionality for schema-derived classes:
    Unmarshal
    Process (access or modify)
    Marshal
    Validation
  A factory generates Unmarshaller, Marshaller and Validator instances for JAXB technology-based applications
  Pass the content tree as a parameter to Marshaller and Validator instances
JAXB PROGRAMMING
EXTENSIBLE STYLESHEET TRANSFORMATION (XSLT)
  Extensible Stylesheet Language (XSL) is a language for expressing stylesheets
  XSL is made of two parts:
    XSL Transformation (XSLT)
    XSL Formatting Objects (XSL-FO)
Viewpoints of XML
  Presentation Oriented Publishing (POP): useful for browsers and editors
  Message Oriented Middleware (MOM): useful for machine-to-machine data exchange, e.g. business-to-business communication
XSLT is useful in:
  Transforming data into a viewable format in a browser (POP)
  Transforming business data between content models (MOM)
<xsl:stylesheet version="1.0" xmlns:xsl=" Folks in Brandeis XML class XSLT Stylesheet RESULT Folks in Brandeis XML class
XSLT stylesheet language
  template
  value-of
  apply-templates
  for-each
  if
  choose, when, otherwise
  sort
  filtering
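Several of the instructions listed above (template, for-each, sort, value-of) fit in one small stylesheet. The sketch below applies it with the javax.xml.transform API from the JDK. The stylesheet, the input document and the plain-text output format are all assumptions made for illustration; they are not the Brandeis example from the earlier slide.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltDemo {
    // Assumed stylesheet: lists people as "last, first", sorted by last name.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/people\">"
      + "<xsl:for-each select=\"name\">"
      + "<xsl:sort select=\"last_name\"/>"
      + "<xsl:value-of select=\"last_name\"/>"
      + "<xsl:text>, </xsl:text>"
      + "<xsl:value-of select=\"first_name\"/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical input document with two people, deliberately unsorted.
    static final String XML =
        "<people>"
      + "<name><first_name>Alan</first_name><last_name>Turing</last_name></name>"
      + "<name><first_name>Grace</first_name><last_name>Hopper</last_name></name>"
      + "</people>";

    // Applies the stylesheet to the XML text and returns the result text.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(transform(XML)); // Hopper, Grace then Turing, Alan
    }
}
```

Changing the output method to "html" and wrapping the values in markup would give the presentation-oriented use; the text output here is closer to the data-exchange use.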
THANK YOU