Extensible Markup Language (XML): XML CORE CONCEPTS
HISTORY OF XML SGML is the parent of both HTML and XML. The complexity of SGML and the lack of content tagging in HTML created the need for a new markup language for the WWW: XML. In 1996 the W3C, the organization that designs technologies for the WWW, began designing XML to combine the flexibility of SGML with the simplicity of HTML. XML 1.0 became a W3C Recommendation in February 1998.
What’s XML? XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to carry data, not to display data. XML tags are not predefined. You must define your own tags. XML is designed to be self-descriptive. XML is a W3C Recommendation.
DESIGN GOALS OF XML
· XML shall be straightforwardly usable over the Internet.
· XML shall support a wide variety of applications.
· It shall be easy to write programs that process XML documents.
· XML documents should be human-legible and reasonably clear.
· The XML design should be prepared quickly.
· The design of XML shall be formal and concise.
· XML documents shall be easy to create.
DIFFERENCE BETWEEN XML & HTML XML is not a replacement for HTML. XML and HTML were designed with different goals: XML was designed to transport and store data, with focus on what data is. HTML was designed to display data, with focus on how data looks. HTML is about displaying information, while XML is about carrying information or describing data. XML was created to structure, store, and transport information.
FEATURES OF XML XML is a software- and hardware-independent tool for carrying information. XML is Extensible. XML is a complement to HTML. XML is Just Plain Text. XML transmits data between all sorts of applications, and is becoming more and more popular in the area of storing and describing information.
How can XML be used? XML can keep data separated from your HTML. XML can be used to store data inside HTML documents. XML can be used as a format to exchange information. XML can be used to store data in files or in databases. XML simplifies data transport and is platform independent.
XML Makes Your Data More Available. XML is Used to Create New Internet Languages. Some examples are:
- XHTML, an XML-based version of HTML.
- WSDL for describing available web services.
- WML (used with WAP) as a markup language for handheld devices.
- RSS for news feeds.
- RDF and OWL for describing resources and ontologies.
- SMIL for describing multimedia for the web.
GENERAL FORMAT OF XML
XML documents form a tree structure. An XML document must contain a root element, which is the parent of all other elements. The tree starts at the root and branches to the lowest level of the tree. All elements can have sub-elements (child elements):
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
The terms parent, child, and sibling describe the relationships between elements. All elements can have text content and attributes (just like in HTML).
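The tree structure described above can be explored with a few lines of code. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the sample document (with a second, empty child added to show siblings) and the helper name outline are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one root, two sibling <child> elements, one <subchild>.
doc = "<root><child><subchild>text</subchild></child><child/></root>"
root = ET.fromstring(doc)

def outline(element, depth=0):
    """Return the element names as an indented outline, one line per element."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one parent/child step in the tree; the two child lines at the same depth are siblings.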
ELEMENTS OF XML
An XML element is everything from the element's start tag to its end tag, both included. An element can contain other elements, simple text, or a mixture of both. Elements can also have attributes.
XML Naming Rules. XML element names must follow these rules:
· Names can contain letters, digits, and other characters such as hyphens, underscores, and periods.
· Names cannot start with a digit or a punctuation character (an underscore is allowed).
· Names starting with the letters xml (or XML, or Xml, etc) are reserved and should not be used.
· Names cannot contain spaces.
Cont… Besides elements, an XML document can contain:
· Version & encoding type
· Declarations
· Processing instructions
· Attributes
· CDATA (Character DATA)
· Comments
· Special characters
· Hyperlinks
· Document Type Definition (DTD)
WHAT’S DTD?
Document Type Definition (DTD).
· Defines the syntax, grammar & semantics.
· Defines the document structure: which elements, attributes, entities, etc. are permitted, and how the document elements are related and structured.
· Is referenced by or defined in XML documents, but is not itself XML.
· Enables validation of XML documents using an XML parser.
· Can be referenced by more than one XML document.
· DTDs may reference other DTDs.
XML SYNTAX RULES
A well-formed XML document is a document that conforms to the XML syntax rules, like:
· if an XML declaration is present, it must come first in the document.
· it must have exactly one root element.
· start-tags must have matching end-tags.
· element names are case sensitive.
· all elements must be closed.
· all elements must be properly nested.
· all attribute values must be quoted.
· entities must be used for special characters such as < and &.
A well-formed document can still contain errors, for example content that breaks the rules of its DTD or schema; XML tools such as Altova XMLSpy can help find and correct them.
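Most of these rules can be checked mechanically by any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which rejects input that is not well-formed; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples, each exercising one syntax rule.
samples = {
    "matching tags": "<note><to>Tove</to></note>",
    "case mismatch": "<note><To>Tove</to></note>",
    "bad nesting": "<note><b><i>hi</b></i></note>",
    "unquoted attribute": "<note date=2008></note>",
    "two roots": "<note/><note/>",
    "raw ampersand": "<note>Tom & Jerry</note>",
    "escaped ampersand": "<note>Tom &amp; Jerry</note>",
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'NOT well-formed'}")
```

Only the first and last samples pass. Note that this checks well-formedness only; it does not validate a document against a DTD or schema.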
Valid XML Documents
A "Valid" XML document is a "Well Formed" XML document which also conforms to the rules of a DTD:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
XML DTD
The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements:
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
Explanation
· !DOCTYPE note defines that the root element of the document is note.
· !ELEMENT note defines that the note element contains four elements, in this order: to, from, heading, body.
· !ELEMENT to defines the to element to be of type "#PCDATA".
· !ELEMENT from defines the from element to be of type "#PCDATA".
· !ELEMENT heading defines the heading element to be of type "#PCDATA".
· !ELEMENT body defines the body element to be of type "#PCDATA".
#PCDATA means parsed character data, that is, text that the parser will examine for markup and entities.
EXAMPLE FOR XML(STRUCTURE)
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
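To show how a program reads elements, text content, and attributes from the bookstore document, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The document is shortened to titles and prices to keep the example small, and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the bookstore example (author and year omitted).
BOOKSTORE = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# One (category, title, price) tuple per <book> child of the root.
catalog = [
    (book.get("category"), book.findtext("title"), float(book.findtext("price")))
    for book in root.findall("book")
]

# Attributes can drive filtering: keep only the titles in the WEB category.
web_titles = [title for category, title, _ in catalog if category == "WEB"]

for category, title, price in catalog:
    print(f"{category:10} {title:20} {price:.2f}")
```

Attribute values are read with get(), while the text of a child element is read with findtext(); both are plain strings until the program converts them.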
Using DTD for Entity Declaration
A doctype declaration can also be used to define special characters and character strings used in the document.
Example:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ENTITY nbsp "&#xA0;">
  <!ENTITY writer "Writer: Donald Duck.">
  <!ENTITY copyright "Copyright: W3Schools.">
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
  <footer>&writer;&nbsp;&copyright;</footer>
</note>
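A parser replaces each entity reference with the text declared for it. The sketch below demonstrates this with Python's standard xml.etree.ElementTree module, which expands entities declared in the internal DTD subset; the document is a shortened version of the note example, with a plain space between the two references.

```python
import xml.etree.ElementTree as ET

# Shortened note document with two entities declared in the internal DTD subset.
NOTE = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY writer "Writer: Donald Duck.">
  <!ENTITY copyright "Copyright: W3Schools.">
]>
<note>
  <footer>&writer; &copyright;</footer>
</note>"""

# By the time the tree is built, the references have been replaced by their text.
footer = ET.fromstring(NOTE).findtext("footer")
print(footer)
```

Because entity expansion happens inside the parser, programs that accept XML from untrusted sources usually restrict or disable it.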
Why Use XML Schemas?
XML Schemas are much more powerful than DTDs.
XML Schemas Support Data Types. One of the greatest strengths of XML Schemas is the support for data types. With support for data types:
· It is easier to describe allowable document content.
· It is easier to validate the correctness of data.
· It is easier to work with data from a database.
· It is easier to define data facets (restrictions on data).
· It is easier to define data patterns (data formats).
· It is easier to convert data between different data types.
XML Schemas use XML Syntax. XML Schemas are written in XML.
BENEFITS OF XML SCHEMA You don't have to learn a new language. You can use your XML editor to edit your Schema files. You can use your XML parser to parse your Schema files. You can manipulate your Schema with the XML DOM. You can transform your Schema with XSLT.
XML Schemas Secure Data Communication.
XML Schemas are extensible, because they are written in XML. With an extensible Schema definition you can:
· Reuse your Schema in other Schemas
· Create your own data types derived from the standard types
· Reference multiple schemas in the same document
Well-Formed is not Enough. A well-formed XML document conforms to the XML syntax rules, but it can still contain errors that only validation against a schema will catch.
SIMPLE EXAMPLE FOR SCHEMA
<complexType name="Customer">
  <sequence>
    <element name="Person" type="Name" />
    <element name="Address" type="Address" />
  </sequence>
</complexType>
<complexType name="Address">
  <sequence>
    <element name="Street" type="string" />
    <element name="City" type="string" />
    <element name="State" type="string" />
    <element name="PostalCode" type="int" />
    <element name="Country" type="string" />
  </sequence>
</complexType>
WHAT’S DOM
DOM stands for Document Object Model.
· Programming interface for HTML & XML documents
· An in-memory representation of a document
· Defines the document structure through an object model
· Tree-view of a document: nodes, elements and attributes, text nodes, etc.
· W3C defined the DOM Level 1 and Level 2 Core:
http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/
GENERATING THE DOM STRUCTURE
[Figure: an XML document is passed to a parser, which builds a DOM tree made of a root element, child elements, and text nodes.]
Java-based DOM
· Java DOM API defined by the org.w3c.dom package
· Semantically similar to the JavaScript DOM API, but with many small syntactic differences
· Nodes of the DOM tree belong to classes such as Node, Document, Element, Text
· Non-method properties are accessed via methods. Ex: parentNode is accessed by calling getParentNode()
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved. 0-13-185603-0
Java-based DOM Default parser is non-validating and non-namespace-aware.
Java-based DOM
Methods such as getElementsByTagName() return an instance of NodeList:
· getLength() returns the number of items
· item() returns the item at a given index
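The NodeList behaviour described above is part of the W3C DOM interface, so it can be tried outside Java as well. The sketch below uses Python's standard xml.dom.minidom module rather than the Java org.w3c.dom API: Python exposes length and parentNode as properties where Java uses getLength() and getParentNode(). The sample document reuses two titles from the bookstore example.

```python
from xml.dom.minidom import parseString

# Small sample document built from two of the bookstore titles.
document = parseString(
    "<bookstore>"
    "<book><title>Everyday Italian</title></book>"
    "<book><title>Learning XML</title></book>"
    "</bookstore>"
)

titles = document.getElementsByTagName("title")  # returns a NodeList
count = titles.length                            # Java: titles.getLength()
first = titles.item(0)                           # Java: titles.item(0)

first_text = first.firstChild.nodeValue          # text node holding the title
parent_name = first.parentNode.tagName           # Java: getParentNode()
missing = titles.item(5)                         # out of range yields None (null in Java)

print(count, first_text, parent_name, missing)
```

The element's text is not stored on the element itself but in a child text node, which is why the title is read through firstChild.nodeValue.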
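EXAMPLE FOR DOM
<html>
<body>
<script type="text/javascript">
var xmlDoc;
if (window.ActiveXObject) {
  // Legacy Internet Explorer: load the file through the MSXML ActiveX control
  xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.async = false;
  xmlDoc.load("cd_catalog.xml");
} else {
  // Other browsers: fetch the file synchronously and use the parsed response
  var request = new XMLHttpRequest();
  request.open("GET", "cd_catalog.xml", false);
  request.send();
  xmlDoc = request.responseXML;
}
// Write one table row per CD element: artist in the first cell, title in the second
document.write("<table border='1'>");
var x = xmlDoc.getElementsByTagName("CD");
for (var i = 0; i < x.length; i++) {
  document.write("<tr><td>");
  document.write(x[i].getElementsByTagName("ARTIST")[0].childNodes[0].nodeValue);
  document.write("</td><td>");
  document.write(x[i].getElementsByTagName("TITLE")[0].childNodes[0].nodeValue);
  document.write("</td></tr>");
}
document.write("</table>");
</script>
</body>
</html>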
WHAT’S XSLT (STYLESHEET)?
· Widely used, open standard defined by the W3C
· A sub-specification of XSL: http://www.w3.org/TR/1999/REC-xslt-19991116
· Designed to be used independently of XSL
· Designed primarily for the transformations needed in XSL
· W3C defines XSLT as "a language for transforming XML documents"
· XSLT is more than a transformation language: it is a programming language written in XML that can have rules, evaluate conditions, etc.
· Offers the ability to transform one XML document into another, for example an XDR Schema into an XSD Schema, or an XML document into an HTML document
XSL
The Extensible Stylesheet Language (XSL) is an XML vocabulary typically used to transform XML documents from one form to another.
[Figure: an XSL document and an input XML document are fed to an XSLT processor, which produces an output XML document.]
XSL
Components of XSL:
· XSL Transformations (XSLT): defines XSL namespace elements and attributes
· XML Path Language (XPath): used in many XSL attribute values (ex: child::message)
· XSL Formatting Objects (XSL-FO): XML vocabulary for defining document style (print-oriented)
The XSLT Process: Overview
[Figure: an XSLT processor applies an XSLT style sheet to an XML source document (described by a source schema) and produces an XML target document (described by a target schema).]
XML and Browsers An XML document can contain an xml-stylesheet processing instruction telling a browser to apply XSLT to create an XHTML document.
XML and Browsers An XML document can contain an xml-stylesheet processing instruction telling a browser to apply CSS to style the XML document.
Transformation Process Overview
· Pass the source document to an XSLT processor
· The processor contains a loaded XSLT style-sheet
· The processor then:
  - Loads the specified stylesheet templates
  - Traverses the source document, node by node
  - Where a node matches a template, applies the template to the node
  - Outputs the (new) XML or HTML result document
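The match-and-apply loop above can be imitated in a few lines of ordinary code. The sketch below is not XSLT and does not use an XSLT processor (Python's standard library does not include one); it is a plain Python stand-in in which each "template" is a function registered under the element name it matches. The input is the breakfast_menu sample used in this deck, and all function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

SOURCE = """<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>Famous Belgian Waffles</description>
    <calories>650</calories>
  </food>
</breakfast_menu>"""

def apply_templates(node):
    """Transform every child of node in document order and join the results."""
    return "".join(transform(child) for child in node)

def transform(node):
    """Apply the template registered for this element name, if any."""
    template = TEMPLATES.get(node.tag)
    if template is None:
        # No matching template: just keep walking down the tree.
        return apply_templates(node)
    return template(node)

def menu_template(node):
    # Matches <breakfast_menu>: emit the page skeleton around the children.
    return "<html><body><ul>" + apply_templates(node) + "</ul></body></html>"

def food_template(node):
    # Matches <food>: emit one list item built from its name and price.
    name = escape(node.findtext("name", default=""))
    price = escape(node.findtext("price", default=""))
    return f"<li>{name}: {price}</li>"

# The "stylesheet": element name -> template function.
TEMPLATES = {"breakfast_menu": menu_template, "food": food_template}

result = transform(ET.fromstring(SOURCE))
print(result)
```

A real stylesheet expresses the same idea declaratively, with XPath patterns selecting the nodes each template applies to.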
Process of “Transmutation”
Source XML:
<Orders>
  <OrderNo>10</OrderNo>
  <ProductNo>100</ProductNo>
  <ProductNo>200</ProductNo>
</Orders>
<Orders>
  <OrderNo>20</OrderNo>
  <ProductNo>501</ProductNo>
</Orders>
The XSLT processor applies the XSLT stylesheet and produces this HTML:
<HTML>
  <BODY>
    <TABLE border="3">
      <TR> <TD>10</TD> <TD>100</TD> </TR>
      <TR> <TD>10</TD> <TD>200</TD> </TR>
      <TR> <TD>20</TD> <TD>501</TD> </TR>
    </TABLE>
  </BODY>
</HTML>
SIMPLE EXAMPLE FOR XSLT
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>Famous Belgian Waffles</description>
    <calories>650</calories>
  </food>
</breakfast_menu>
THANK YOU