XML Introduction Bill Jerome.


1 XML Introduction Bill Jerome

2 Motivation “Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing” – World Wide Web Consortium

3 Motivation XML is not meant to be a universal data encoding (see SGML). The intent is to encode publishable data in XML so that it can be presented to others through a transformation that adheres to guidelines. Thus, a text author need only write in XML and send it to book publishers, who would simply have to transform the XML into a book that meets their publication guidelines.

4 Motivation for the web Page authors could learn to write XML rather than HTML and then transform their pages for display on the web. That same XML could be transformed into printed text, etc., without altering the XML at all. Choosing stylesheets would allow a total change in appearance (similar to CSS) without altering content

5 Motivation for web From a technical standpoint, the goal is to separate data for publication from its presentation, and to do it in a way that produces a more real separation than Cascading Style Sheets do.

6 Realities In fact, transformations can be much more powerful. Tables of contents, indices, and repositories for other sources (e.g. search) can all be transformed out of the XML documents. XML is of limited use by itself and requires the design of guidelines (DTDs or schemas). Simple to learn to do, not so simple to do right.

7 XML syntax If you know HTML, XML is already familiar, but more formalized.
All tags have end tags. A certain document structure identifies an XML doc (like <HTML></HTML>). Exactly one root node per document. <tag>Marked up text</tag> <tag2 attribute="setting">More text</tag2>

8 XML not direct parent of HTML
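What in HTML is not really XML? Images: <img src="foo.jpg"> (no end tag). Line breaks: <br> (empty, no end tag). Untagged content: <html><head></head><body>Here is untagged text</body></html> (well-formed as XML, but bare text directly inside <body> is not allowed by the strict XHTML DTD).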
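
The cases on this slide can be tried against any XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module as the parser (the slides do not name one); the helper is_well_formed and the sample names are hypothetical. It only tests whether each fragment parses as XML, not whether it satisfies any DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# HTML-style fragments from the slide, plus their XML-style counterparts.
samples = {
    "unclosed img": '<body><img src="foo.jpg"></body>',
    "unclosed br": "<body>line one<br>line two</body>",
    "self-closed img": '<body><img src="foo.jpg"/></body>',
    "self-closed br": "<body>line one<br/>line two</body>",
    "untagged text": "<html><head></head><body>Here is untagged text</body></html>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'not well-formed'}")
```

The unclosed <img> and <br> fragments are rejected, while the self-closed forms and the untagged-text document parse; any objection to the last one is about what a DTD permits, not about XML syntax.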

9 XHTML A “cleaned up” form of HTML that is XML compliant
Exists in the form of the XHTML DTD. Is not rendered right in most browsers in many cases: <img src="foo.jpg"/> is just treated as an error; <br/> sometimes isn't, and won't work.

10 XML doesn’t look helpful
At first glance we have gained very little by switching to XML. The benefit will come from the structure of XML tags, described by DTDs or Schemas. We will focus on DTDs, which are not as rich as Schemas but are older and more widely used.

11 Document Type Definitions
DTDs provide promises about XML documents. They outline exactly what tags are allowed and how they are allowed to relate (do they nest, are they required like <body>, etc.). They provide for some more complicated formations ('A textbook must have at least one author, may or may not have an editor, but must have one and only one copyright').
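
In DTD terms, the textbook rule quoted above is a content model along the lines of (author+, editor?, copyright), where + means one or more and ? means optional. A content model behaves like a regular expression over child element names, which the following sketch imitates with Python's re module. The element names come from the slide's example; the function matches_model and the comma-joined encoding are assumptions made for illustration, not how a DTD processor is implemented.

```python
import re

# Imitates the DTD content model (author+, editor?, copyright).
# Each child element name is followed by a comma so names cannot run together.
TEXTBOOK_MODEL = re.compile(r"(author,)+(editor,)?copyright,")


def matches_model(child_names):
    """Check an ordered list of child element names against the model."""
    sequence = "".join(name + "," for name in child_names)
    return TEXTBOOK_MODEL.fullmatch(sequence) is not None


print(matches_model(["author", "copyright"]))                      # one author, no editor
print(matches_model(["author", "author", "editor", "copyright"]))  # two authors, one editor
print(matches_model(["editor", "copyright"]))                      # no author
print(matches_model(["author", "copyright", "copyright"]))         # two copyrights
```

The first two calls satisfy the model and the last two do not.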

12 More on DTDs DTDs are not a complete language: a DTD cannot specify "three authors with at least one who is also an editor". XML documents reference the DTD they claim to adhere to within the header of the document. XML parsers should always enforce DTDs and not assume the XML document is correct.

13 Well-formedness vs. validity
An XML document is well-formed if it can be interpreted as an XML document; thus it has matched start/end tags in an XML-formatted document. An XML document is valid if it adheres to the DTD it references.

14 XML Stylesheet So now that we have a promise of how content must be arranged, and content to go with it, how do we publish? Via Extensible Stylesheet Language Transformations (XSLT). XSL files explain to the transformer how to interpret all elements of a particular DTD. The XSLT engine then applies those rules to the XML file.

15 XSLT XSL files are ugly. The syntax is itself XML (DTDs, by contrast, use their own non-XML syntax) and is not intended for easy readability. It is extremely powerful. XSL allows 'callbacks' to its engine within XSL. The XSL files are the parts you swap out to get different outputs.

16 Map We have introduced three new file types. Here is a rough map of how they fit: [Diagram: the .dtd informs the .xml; the .xml and the .xsl go into the XSLT Engine, which produces .html.]

17 Map Different XSLs [Diagram: the same .xml, informed by the .dtd, goes into the XSLT Engine with either book.xsl, producing .ps, or web.xsl, producing .html.]

18 Map Different content [Diagram: four .xml files, all informed by the same .dtd, go into the XSLT Engine; book.xsl produces one .ps per .xml file and web.xsl produces one .html per .xml file.]
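
The maps say that the XML stays fixed while the stylesheet is swapped to change the output. The sketch below imitates that idea in plain Python rather than XSLT, because Python's standard library has no XSLT engine: two hypothetical rendering functions stand in for web.xsl and book.xsl, and a hypothetical transform function stands in for the engine. The plain-text "book" output is a stand-in for the .ps file in the diagram, and the recipe is the one from the Example XML slide.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

RECIPE_XML = """<?xml version="1.0"?>
<recipe name="Cookies">
  <author>Carol Schmidt</author>
  <ingredients>
    <item unit="C" measurement="2/3">butter</item>
    <item unit="C" measurement="2">brown sugar</item>
  </ingredients>
</recipe>"""


def to_html(recipe):
    """Stand-in for web.xsl: render the recipe as a small HTML page."""
    items = "".join(
        "<li>{} {} {}</li>".format(
            escape(item.get("measurement")),
            escape(item.get("unit")),
            escape(item.text.strip()),
        )
        for item in recipe.iter("item")
    )
    return (
        "<html><body>"
        f"<h1>{escape(recipe.get('name'))}</h1>"
        f"<p><i>{escape(recipe.findtext('author'))}</i></p>"
        f"<ul>{items}</ul>"
        "</body></html>"
    )


def to_text(recipe):
    """Stand-in for book.xsl: render the same recipe as plain text."""
    lines = [recipe.get("name"), f"by {recipe.findtext('author')}"]
    for item in recipe.iter("item"):
        lines.append(
            f"- {item.get('measurement')} {item.get('unit')} {item.text.strip()}"
        )
    return "\n".join(lines)


# Swapping the entry chosen here changes the output; the XML is untouched.
STYLESHEETS = {"web": to_html, "book": to_text}


def transform(xml_text, stylesheet):
    """Stand-in for the XSLT engine: parse once, apply the chosen rules."""
    return STYLESHEETS[stylesheet](ET.fromstring(xml_text))


print(transform(RECIPE_XML, "web"))
print(transform(RECIPE_XML, "book"))
```

Adding a third output format means adding one more entry to STYLESHEETS, which mirrors adding one more .xsl file in the map.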

19 Example XML
<?xml version="1.0"?>
<recipe name="Cookies">
  <author>Carol Schmidt</author>
  <ingredients>
    <item unit="C" measurement="2/3">butter</item>
    <item unit="C" measurement="2">brown sugar</item>
  </ingredients>
</recipe>

20 Example DTD
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT recipe (author, ingredients)>
<!ATTLIST recipe
  name CDATA #REQUIRED
>
<!ELEMENT ingredients (item+)>
<!ELEMENT item (#PCDATA|sub_item)*>
<!ATTLIST item
  measurement CDATA #REQUIRED
  unit CDATA #REQUIRED
>
<!ELEMENT author (#PCDATA)>
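
To make the difference between well-formed and valid concrete, the sketch below hand-codes a few of the rules from this DTD in Python. It is an illustration only: Python's standard xml.etree.ElementTree parser checks well-formedness but does not validate against a DTD, so the function check_recipe (a hypothetical name) re-implements the recipe, ingredients, and item declarations by hand and ignores the rest, such as the mixed content allowed inside item.

```python
import xml.etree.ElementTree as ET


def check_recipe(xml_text):
    """Return a list of problems; an empty list means these checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Not even well-formed, so validity cannot be checked at all.
        return [f"not well-formed: {err}"]

    if root.tag != "recipe":
        return [f"root element is <{root.tag}>, expected <recipe>"]

    problems = []
    # <!ATTLIST recipe name CDATA #REQUIRED>
    if "name" not in root.attrib:
        problems.append("recipe is missing required attribute 'name'")
    # <!ELEMENT recipe (author, ingredients)>: exactly these children, in order.
    if [child.tag for child in root] != ["author", "ingredients"]:
        problems.append("recipe must contain author then ingredients")

    ingredients = root.find("ingredients")
    if ingredients is not None:
        items = list(ingredients)
        # <!ELEMENT ingredients (item+)>: one or more item elements, nothing else.
        if not items or any(item.tag != "item" for item in items):
            problems.append("ingredients must contain one or more item elements")
        # <!ATTLIST item measurement CDATA #REQUIRED unit CDATA #REQUIRED>
        for item in items:
            if item.tag != "item":
                continue
            for attr in ("measurement", "unit"):
                if attr not in item.attrib:
                    problems.append(f"item is missing required attribute '{attr}'")
    return problems


valid = (
    '<recipe name="Cookies"><author>Carol Schmidt</author>'
    '<ingredients><item unit="C" measurement="2">brown sugar</item></ingredients>'
    "</recipe>"
)
no_author = (
    '<recipe name="Cookies">'
    '<ingredients><item unit="C" measurement="2">brown sugar</item></ingredients>'
    "</recipe>"
)
broken = '<recipe name="Cookies"><author>Carol Schmidt</recipe>'

print(check_recipe(valid))      # passes the hand-coded checks
print(check_recipe(no_author))  # well-formed, but breaks the content model
print(check_recipe(broken))     # not well-formed
```

The second document is the interesting case: it parses without error, yet it does not adhere to the DTD because the required author element is missing.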

21 Example XSL [Some attribute values and elements were lost in this transcript; each gap is marked with ….]
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="…">
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="recipe">
    <html>
      <head/>
      <body>
        <p/>
        <b>Recipe Name:</b>
        <xsl:value-of select="…"/>
        <br/>
        …
      </body>
    </html>
  </xsl:template>
  <xsl:template match="ingredients">
    <xsl:for-each select="item">
      <xsl:value-of select="…"/>
      <xsl:value-of select="…"/>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
  <xsl:template match="author">
    <i> … </i>
  </xsl:template>
</xsl:stylesheet>

22 XSL Output Recipe Name:Cookies Carol Schmidt 2/3Cbutter 2C brown sugar

23 Resources http://www.w3.org/XML/ http://www.w3.org/Style/XSL/

