Tutorial 11
Creating an XML Document
Developing a Document for a Cooking Web Site
New Perspectives on HTML, XHTML, and XML, Comprehensive, 3rd Edition

Introducing XML
XML stands for Extensible Markup Language. A markup language specifies the structure and content of a document. Because it is extensible, XML can be used to create a wide variety of document types.
XML is a subset of the Standard Generalized Markup Language (SGML), which was introduced in the 1980s. SGML is very complex and can be costly. This complexity and cost led to the creation of Hypertext Markup Language (HTML), a more easily used markup language. XML can be seen as sitting between SGML and HTML: easier to learn than SGML, but more robust than HTML.

The Limits of HTML
HTML was designed for formatting text on a Web page. It was not designed for dealing with the content of a Web page.
HTML is limited to a set of predefined elements.
HTML can be inconsistently applied.

XML Design Goals
1. XML must be easily usable over the Internet
2. XML must support a wide variety of applications
3. XML must be compatible with SGML
4. It must be easy to write programs that process XML documents
5. XML documents should be clear and easily understood by nonprogrammers
6. The design of XML must be exact and concise
7. XML documents must be easy to create
8. Terseness in XML markup is of minimal importance

XML Vocabularies

Well-Formed and Valid XML Documents
An XML document is well-formed if it contains no syntax errors and fulfills all of the specifications for XML code as defined by the W3C.
An XML document is valid if it is well-formed and also satisfies the rules laid out in the DTD or schema attached to the document.
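
The well-formedness test can be tried outside a browser. The following is a minimal sketch using Python's standard xml.etree.ElementTree parser; the recipe fragments and the helper name is_well_formed are made up for illustration and do not come from the tutorial's files. Note that this parser checks only well-formedness; it does not validate against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses without syntax errors."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments (element names are assumptions).
good = "<recipe><name>Plum Sauce</name></recipe>"
bad = "<recipe><name>Plum Sauce</recipe>"  # <name> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```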

The Structure of an XML Document
XML documents consist of three parts:
– The prolog
– The document body
– The epilog
The prolog is optional and provides information about the document itself.
The document body contains the document's content in a hierarchical tree structure.
The epilog is also optional and contains any final comments or processing instructions.

The XML Declaration
The XML declaration is always the first line of code in an XML document. It tells the processor that what follows is written using XML. It can also provide information about how the parser should interpret the code.
The complete syntax is: <?xml version="version number" encoding="encoding type" standalone="yes | no" ?>
A sample declaration might look like this: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
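
The "first line" rule is enforced by parsers. As a rough sketch (Python's standard library parser; the root element name recipes and the comment text are assumptions), a declaration that is pushed down by a comment causes a parse error:

```python
import xml.etree.ElementTree as ET

declaration = b'<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>'

# Declaration at the very start of the document: parses normally.
first = declaration + b"\n<recipes/>"
root = ET.fromstring(first)

# Same declaration placed after a comment: the parser rejects the document.
late = b"<!-- recipes file -->\n" + declaration + b"\n<recipes/>"
try:
    ET.fromstring(late)
    late_ok = True
except ET.ParseError:
    late_ok = False

print(root.tag, late_ok)
```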

Inserting Comments
Comments, or miscellaneous statements, may appear anywhere after the declaration.
The syntax for comments is: <!-- comment text -->
This is the same syntax used for HTML comments.
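
Comments are notes for human readers and are not part of the document's content. A small sketch, assuming Python's standard parser with its default settings and made-up element names: the comment is skipped, and only the name element shows up as a child.

```python
import xml.etree.ElementTree as ET

# Element names and comment text are illustrative assumptions.
doc = """<recipe>
  <!-- Serves four -->
  <name>Plum Sauce</name>
</recipe>"""

root = ET.fromstring(doc)

# The default parser drops comments, so only <name> is listed.
child_tags = [child.tag for child in root]
print(child_tags)
```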

Working with Elements
Elements are the basic building blocks of XML files.
Elements contain an opening tag and a closing tag.
– Content is stored between the tags.
A closed element has the following syntax: <element_name>Content</element_name>
Example: <artist>Miles Davis</artist>
Element names are case sensitive.
An open or empty element is an element that contains no content.
Elements can be nested, as follows:
1/2 cup soy sauce
1/2 cup plum sauce
1/2 cup pineapple juice
Nested elements are called child elements.
Elements must be nested correctly. Child elements must be enclosed within their parent elements.
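
The nesting rule can be sketched as follows. The ingredient text comes from the slide, but the element names ingredients and ingredient are assumptions, since the original markup is not shown here. The second fragment overlaps its tags and is rejected by the parser (Python's standard library is used only as a convenient checker).

```python
import xml.etree.ElementTree as ET

# Correct nesting: each child is fully enclosed by its parent.
nested = """<ingredients>
  <ingredient>1/2 cup soy sauce</ingredient>
  <ingredient>1/2 cup plum sauce</ingredient>
  <ingredient>1/2 cup pineapple juice</ingredient>
</ingredients>"""

parent = ET.fromstring(nested)
children = [child.text for child in parent]

# Incorrect nesting: <b> opens inside <a> but closes after it.
overlapping = "<a><b>text</a></b>"
try:
    ET.fromstring(overlapping)
    overlap_ok = True
except ET.ParseError:
    overlap_ok = False

print(children, overlap_ok)
```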

The Element Hierarchy
All elements must be nested within a single document element, also called the root element.
There can be only one root element.
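
A quick way to see the single-root rule, sketched with Python's standard parser and hypothetical element names: a document with one root parses, while two top-level elements side by side do not.

```python
import xml.etree.ElementTree as ET

# One root element (recipes) enclosing everything else.
one_root = "<recipes><recipe/><recipe/></recipes>"
# Two top-level elements and no enclosing root.
two_roots = "<recipe/><recipe/>"

root = ET.fromstring(one_root)
try:
    ET.fromstring(two_roots)
    two_roots_ok = True
except ET.ParseError:
    two_roots_ok = False

print(len(root), two_roots_ok)
```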

Writing the Document Body

Working with Attributes
An attribute is a feature or characteristic of an element.
Attribute values are text strings and must be placed in single or double quotes.
The syntax is: <element_name attribute="value"> … </element_name>
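
The quoting rule can be checked with a short sketch. The attribute names category and servings are invented for illustration. Both quote styles are accepted, attribute values always come back as text strings, and an unquoted value is a syntax error.

```python
import xml.etree.ElementTree as ET

# Double and single quotes are both accepted around attribute values.
recipe = ET.fromstring(
    """<recipe category="sauce" servings='4'>Plum Sauce</recipe>"""
)
category = recipe.get("category")
servings = recipe.get("servings")  # a text string, not a number

# An unquoted attribute value is not well-formed.
try:
    ET.fromstring("<recipe servings=4>Plum Sauce</recipe>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False

print(category, servings, unquoted_ok)
```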

Using Character and Entity References
Special characters, such as the symbol for the British pound, can be inserted into your XML document by using a character reference. The syntax is: &#nnn;
Here nnn is a character reference number from the ISO/IEC character set.
Numeric character references in XML are the same as in HTML.
XML also supports entity references, which are named references to special symbols or to extra content found in external files or extended text strings.
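
As a small sketch of both kinds of reference (the price element is hypothetical): &#163; is the numeric character reference for the pound sign, and &amp; is one of the few entity references built into XML. A named reference that HTML authors may expect, such as &nbsp;, is not defined in XML unless a DTD declares it, so the parser rejects it.

```python
import xml.etree.ElementTree as ET

# &#163; is the pound sign; &amp; is a built-in entity reference.
price = ET.fromstring("<price>&#163;5 &amp; up</price>")
text = price.text

# &nbsp; is not predefined in XML, so this is an error without a DTD.
try:
    ET.fromstring("<price>5&nbsp;pounds</price>")
    nbsp_ok = True
except ET.ParseError:
    nbsp_ok = False

print(text, nbsp_ok)
```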

Parsed Character Data
Parsed character data, or pcdata, consists of all those characters that XML treats as part of the code of an XML document:
– The XML declaration
– The opening and closing tags of an element
– Empty element tags
– Character or entity references
– Comments
– Processing instructions

Character Data and Whitespace
Character data is not processed, but instead is treated as pure data content.
Whitespace refers to nonprintable characters such as spaces (created by pressing the Spacebar), new line characters (created by pressing the Enter key), or tab characters (created by pressing the Tab key).

CDATA Sections
A CDATA section is a large block of text that the XML processor will interpret only as text and not attempt to parse.
The syntax to create a CDATA section is: <![CDATA[ character data ]]>
This example shows an element named htmlcode that contains a CDATA section, which is used to store several HTML tags for the thehelpfulcook.com Web site:
<![CDATA[ thehelpfulcook.com An online source of recipes and cooking tips ]]>
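
The effect of a CDATA section can be sketched as follows. The htmlcode element name and the two text strings come from the slide; the h1 and h2 tags are assumptions standing in for the HTML tags that are not shown here. Because the tags sit inside the CDATA section, the parser keeps them as literal text and creates no child elements.

```python
import xml.etree.ElementTree as ET

# The h1/h2 tags are illustrative; the original HTML tags are not shown.
doc = """<htmlcode><![CDATA[
<h1>thehelpfulcook.com</h1>
<h2>An online source of recipes and cooking tips</h2>
]]></htmlcode>"""

htmlcode = ET.fromstring(doc)

# The markup inside the CDATA section is plain text, not parsed elements.
print(len(htmlcode))
print(htmlcode.text)
```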

Parsing an XML Document
XML documents can be opened in Internet Explorer or in Netscape Navigator.
If there are no syntax errors, IE will display the document's contents in an expandable/collapsible outline format including all markup tags.
Starting with version 6.0, the Netscape browser included a built-in XML parser. Opera versions 8.0 and above include an XML parser, as do all versions of Mozilla Firefox and Apple Safari.

Formatting XML Data with CSS
Link the XML document to a style sheet to format the document.
The XML processor will combine the style sheet with the XML document and apply any formatting codes defined in the style sheet to display a formatted document.
– Cascading Style Sheets (CSS)
To define a style in the style sheet, use the following syntax: selector {attribute1:value1; attribute2:value2; …}
selector is an element (or set of elements) from the XML document. attribute and value are the style attributes and attribute values to be applied to the document.
For example: artist {color:red; font-weight:bold} will display the text of the artist element in a red boldface type.

Inserting a Processing Instruction
The link from the XML document to a style sheet is created using a processing instruction.
A processing instruction is a command that gives instructions to the XML parser.
The syntax is: <?xml-stylesheet type="style" href="sheet" ?>
style is the type of style sheet to access and sheet is the name and location of the style sheet.
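
A sketch of what such a link looks like to a parser, using Python's standard xml.dom.minidom. The dp.css file name appears later in this tutorial; the text/css type value and the recipes root element are assumptions. The processing instruction sits in the prolog, before the root element, and the parser exposes its target and its data separately.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical document; only the dp.css name comes from the tutorial.
doc_text = """<?xml version="1.0" ?>
<?xml-stylesheet type="text/css" href="dp.css"?>
<recipes/>"""

doc = minidom.parseString(doc_text)

# The first node of the document is the processing instruction,
# because the XML declaration itself is not exposed as a node.
pi = doc.firstChild
print(pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE)
print(pi.target)
print(pi.data)
```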

The dp.css Style Sheet
Linking to the dp.css Style Sheet
The recipes Document Formatted with the dp.css Style Sheet