What is XML?
- eXtensible Markup Language
- A subset of SGML (Standard Generalized Markup Language)
- Mechanism to identify structures in a document
- Markup language for documents containing structured information
- Self-descriptive
- Buzzword
XML and HTML
- Similar in nature
  - Labels
  - Elements
  - plus content
- Reference Specification
  - World Wide Web Consortium (W3C)
  - HTML transitional
  - XHTML
  - XML 1.0
XML Document Structure
- Declaration
- Elements
- Attributes
- Character Data
- Processing Instructions
- Comments
- Entity References
Declaration
- Start of the file
- Optional
- Future-proof
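To illustrate that the declaration is optional, here is a minimal sketch that parses the same document with and without one. It assumes Python's standard-library `xml.etree.ElementTree` as a stand-in parser (the slides do not prescribe one) and reuses the `Report` element from the slides; the `version` and `encoding` values are ordinary defaults, not taken from the source.

```python
import xml.etree.ElementTree as ET

# The same document, with and without the optional XML declaration.
WITH_DECLARATION = b'<?xml version="1.0" encoding="UTF-8"?><Report>XML Report</Report>'
WITHOUT_DECLARATION = b"<Report>XML Report</Report>"


def root_summary(document):
    """Parse a document and return (root tag, root text)."""
    root = ET.fromstring(document)
    return root.tag, root.text


print(root_summary(WITH_DECLARATION))
print(root_summary(WITHOUT_DECLARATION))
```

Both calls should describe the same root element, since the declaration carries information about the file rather than about its content.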
Elements
<Report> XML Report </Report>
- The highest-level element is termed the root element
- Contains:
  - Start tag
  - Some content
  - End tag
Attributes
XML Report </Report>
- Contains:
  - Name
  - Value
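As a rough sketch of how an element's name, attributes, and content look to a program, the snippet below parses a one-element document with Python's standard-library `xml.etree.ElementTree` (an assumed parser choice). The `type` attribute and its value `summary` are hypothetical, since the slides do not show a specific attribute.

```python
import xml.etree.ElementTree as ET

# The "type" attribute and its value are hypothetical examples.
SAMPLE = '<Report type="summary">XML Report</Report>'

root = ET.fromstring(SAMPLE)
element_name = root.tag          # name taken from the start tag
attributes = dict(root.attrib)   # attribute name/value pairs
content = root.text              # character data between the tags

print(element_name, attributes, content)
```

Each attribute is a name paired with a quoted value inside the start tag, which is why the parser exposes attributes as a mapping.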
Character Data
<Report>XML Report</Report>
- Element content
- Special symbols
  - '&' and '<'
  - See Entity References
Comments
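A comment is written between `<!--` and `-->` and is not part of the element's character data. The sketch below, which assumes Python's standard-library `xml.dom.minidom` as the parser, separates the two; the comment text is illustrative only.

```python
from xml.dom import minidom

# The comment text is illustrative only.
SAMPLE = "<Report><!-- draft, not for release -->XML Report</Report>"

root = minidom.parseString(SAMPLE).documentElement

# Comments and character data arrive as different node types.
comments = [n.data for n in root.childNodes if n.nodeType == n.COMMENT_NODE]
text = "".join(n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE)

print(comments)
print(text)
```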
Entity References
Entity   Character
&lt;     <
&gt;     >
&amp;    &
&apos;   '
&quot;   "
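The table above can be exercised with a short round trip: escape the special symbols before placing text in a document, then let a parser turn the entity references back into characters. This is a minimal sketch using Python's standard-library `xml.sax.saxutils.escape` and `xml.etree.ElementTree` (assumed tooling); the sample strings are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "AT&T reports 3 < 5"

# escape() replaces &, < and > with entity references by default.
escaped = escape(raw)

# Quotes are only replaced when an extra mapping is supplied.
escaped_quotes = escape('say "hi"', {'"': "&quot;", "'": "&apos;"})

# A parser resolves the references back to the original characters.
document = "<Report>" + escaped + "</Report>"
parsed = ET.fromstring(document).text

print(escaped)
print(escaped_quotes)
print(parsed)
```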
Processing Instructions
- Shown as a processing instruction node at the appropriate place in the node tree (DOM)
- Fire a processing instruction event (SAX)
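The two behaviours can be compared side by side. The sketch below uses Python's standard-library `xml.dom.minidom` and `xml.sax` as assumed stand-ins for the DOM and SAX parsers; the `render` target and its data are a hypothetical processing instruction.

```python
import xml.sax
from xml.dom import minidom

# "render" and its data are a hypothetical processing instruction.
SAMPLE = b'<Report><?render mode="draft"?>XML Report</Report>'

# DOM: the instruction is a node at its place in the tree.
root = minidom.parseString(SAMPLE).documentElement
dom_instructions = [
    (node.target, node.data)
    for node in root.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]


# SAX: the parser fires a processingInstruction event instead.
class InstructionCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.seen = []

    def processingInstruction(self, target, data):
        self.seen.append((target, data))


collector = InstructionCollector()
xml.sax.parseString(SAMPLE, collector)
sax_instructions = collector.seen

print(dom_instructions)
print(sax_instructions)
```

Either way the application receives the same target and data; only the delivery mechanism differs.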
Well-Formed XML
- Tags must be nested properly
- All start tags must have end tags
- Use quotation marks properly for tag attributes
- Use entity references
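Each rule above can be checked by handing a document to a parser and seeing whether it is rejected. This is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`; the `Title` element, the `type` attribute, and the sample texts are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Element and attribute names below are hypothetical.
samples = {
    "properly nested": "<Report><Title>XML</Title></Report>",
    "bad nesting": "<Report><Title>XML</Report></Title>",
    "missing end tag": "<Report><Title>XML</Report>",
    "unquoted attribute": "<Report type=summary>XML</Report>",
    "bare ampersand": "<Report>R&D</Report>",
    "escaped ampersand": "<Report>R&amp;D</Report>",
}

results = {label: is_well_formed(doc) for label, doc in samples.items()}
for label, ok in results.items():
    print(label, ok)
```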
Document Type Definition
- Set of rules
- May be included in the document itself
- May be linked externally
- Conforming to a DTD
  - Well-formed
  - Valid
Document Type Definition
- <!DOCTYPE rootElementName [ …insert declarations here… ]>
- <!ATTLIST element name (value1|value2)
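A DTD included in the document itself might look like the sketch below. The declarations for `Report` and its `type` attribute are hypothetical, written only to show the shape of an internal subset. Python's standard-library `xml.dom.minidom` is assumed as the parser; it reads the DOCTYPE and checks well-formedness but does not validate against the DTD, so checking validity would need a validating parser.

```python
from xml.dom import minidom

# Hypothetical internal DTD: one element with one enumerated attribute.
SAMPLE = """<!DOCTYPE Report [
  <!ELEMENT Report (#PCDATA)>
  <!ATTLIST Report type (summary|detail) "summary">
]>
<Report type="detail">XML Report</Report>"""

document = minidom.parseString(SAMPLE)

# The name in the DOCTYPE is expected to match the root element.
doctype_name = document.doctype.name
root_name = document.documentElement.tagName
report_type = document.documentElement.getAttribute("type")

print(doctype_name, root_name, report_type)
```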
XML Parsing
- Document Object Model (DOM)
- Simple API for XML (SAX)
Document Object Model (DOM)
- Document model driven
- Builds a tree model of the elements in the document
- Allows the application to access the tree
- DOM XML parser
  - Converts XML documents into a Java tree object model
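The tree-building idea can be sketched in a few lines. The example uses Python's standard-library `xml.dom.minidom` rather than a Java parser, which is an assumption made for brevity; the `Title` and `Author` elements and their contents are hypothetical.

```python
from xml.dom import minidom

# Title and Author are hypothetical child elements.
SAMPLE = "<Report><Title>XML Report</Title><Author>Staff</Author></Report>"


def outline(node, depth=0):
    """Walk the tree and return one indented line per element or text node."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + repr(node.data))
    return lines


# The whole document is parsed into a tree before the application sees it.
tree = minidom.parseString(SAMPLE)
tree_outline = outline(tree.documentElement)
print("\n".join(tree_outline))
```

Because the full tree is held in memory, the application can revisit any node in any order after parsing finishes.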
Simple API for XML (SAX)
- Event driven
- SAX XML parser processes elements serially
- XML application provides callback functions to handle elements
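The callback style can be sketched as follows, assuming Python's standard-library `xml.sax` in place of a Java SAX parser. The handler class name `EventLogger` and the `Title` and `Author` elements are hypothetical.

```python
import xml.sax

# Title and Author are hypothetical child elements.
SAMPLE = b"<Report><Title>XML Report</Title><Author>Staff</Author></Report>"


class EventLogger(xml.sax.ContentHandler):
    """Records each callback the parser fires, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


logger = EventLogger()
xml.sax.parseString(SAMPLE, logger)

for event in logger.events:
    print(event)
```

No tree is kept; the application sees each element once, as it streams past, and decides in the callbacks what to retain.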
Freely Available XML Parsers
- Apache Software Foundation Xerces XML Parser (xml.apache.org)
  - Open source
- Oracle XML Parser Version 2
  - Must register
- SAX2 Parser
  - Freely available