XP 1 TUTORIAL 1 CREATING AN XML DOCUMENT
XP 2 INTRODUCING XML XML stands for Extensible Markup Language. A markup language specifies the structure and content of a document. Because it is extensible, XML can be used to create a wide variety of document types.
XP 3 INTRODUCING XML XML is a subset of the Standard Generalized Markup Language (SGML) which was introduced in the 1980s. SGML is very complex and can be costly. These reasons led to the creation of Hypertext Markup Language (HTML), a more easily used markup language. XML can be seen as sitting between SGML and HTML – easier to learn than SGML, but more robust than HTML.
XP 4 THE LIMITS OF HTML HTML was designed for formatting text on a Web page. It was not designed for dealing with the content of a Web page. Additional features have been added to HTML, but they do not solve data description or cataloging issues in an HTML document. Because HTML is not extensible, it cannot be modified to meet specific needs. Browser developers have added features making HTML more robust, but this has resulted in a confusing mix of different HTML standards.
XP 5 THE LIMITS OF HTML HTML cannot be applied consistently. Different browsers require different standards making the final document appear differently on one browser compared with another.
XP 6 THE 10 PRIMARY XML DESIGN GOALS 1. XML must be easily usable over the Internet 2. XML must support a wide variety of applications 3. XML must be compatible with SGML 4. It must be easy to write programs that process XML documents 5. The number of optional features in XML must be kept to a minimum, ideally zero
XP 7 THE 10 PRIMARY XML DESIGN GOALS (CONTINUED) 6. XML documents should be clear and easily understood by nonprogrammers 7. The XML design should be prepared quickly 8. The design of XML must be exact and concise 9. XML documents must be easy to create 10. Terseness in XML markup is of minimal importance
XP 8 XML VOCABULARIES
XP 9 WELL-FORMED AND VALID XML DOCUMENTS An XML document is well-formed if it contains no syntax errors and fulfills all of the specifications for XML code as defined by the W3C. An XML document is valid if it is well-formed and also satisfies the rules laid out in the DTD or schema attached to the document.
XP 10 THE STRUCTURE OF AN XML DOCUMENT XML documents consist of three parts –The prolog –The document body –The epilog The prolog is optional and provides information about the document itself
XP 11 THE STRUCTURE OF AN XML DOCUMENT The document body contains the document’s content in a hierarchical tree structure. The epilog is also optional and contains any final comments or processing instructions.
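The three parts can be sketched in a minimal document. The catalog and item element names are placeholders chosen for illustration and do not come from the tutorial files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Prolog: the XML declaration above and this comment -->
<catalog>
  <!-- Document body: a single root element holding the content tree -->
  <item>Sample content</item>
</catalog>
<!-- Epilog: optional trailing comments or processing instructions -->
```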
XP 12 THE STRUCTURE OF AN XML DOCUMENT: CREATING THE PROLOG The prolog consists of four parts in the following order: –XML declaration –Miscellaneous statements or comments –Processing instructions –Document type declaration
XP 13 THE STRUCTURE OF AN XML DOCUMENT: THE XML DECLARATION The XML declaration is always the first line of code in an XML document. It tells the processor that what follows is written using XML. It can also provide information about how the parser should interpret the code. The complete syntax is: <?xml version="version number" encoding="encoding type" standalone="yes | no" ?> A sample declaration might look like this: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
XP 14 THE STRUCTURE OF AN XML DOCUMENT: INSERTING COMMENTS Comments or miscellaneous statements go after the declaration. Comments may appear anywhere after the declaration. The syntax for comments is: <!-- comment text --> This is the same syntax used for HTML comments.
XP 15 ELEMENTS Elements are the basic building blocks of XML files. Elements contain an opening tag and a closing tag –Content is stored between tags
XP 16 ELEMENTS A closed element has the following syntax: <element_name>Content</element_name> Example: <artist>Miles Davis</artist>
XP 17 ELEMENTS Element names are case sensitive. Elements can be nested, as follows: Kind of Blue So What (9:22) Blue in Green (5:37)
XP 18 ELEMENTS Nested elements are called child elements. Elements must be nested correctly. Child elements must be enclosed within their parent elements.
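A short sketch of correct nesting, using the album data from the slides. The element names cd, title, and track are assumptions made for illustration, since the tags on the original slide are not available.

```xml
<cd>
  <title>Kind of Blue</title>
  <!-- Each track is a child element fully enclosed by its parent cd element -->
  <track>So What (9:22)</track>
  <track>Blue in Green (5:37)</track>
</cd>
<!-- Not well-formed (tags overlap): <cd><title>Kind of Blue</cd></title> -->
```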
XP 19 ELEMENTS AND ATTRIBUTES All elements must be nested within a single document or root element. There can be only one root element. An open or empty element is an element that contains no content. Empty elements can be used to mark sections of the document for the XML parser.
XP 20 WORKING WITH ATTRIBUTES An attribute is a feature or characteristic of an element. Attributes are text strings and must be placed in single or double quotes. The syntax is: <element_name attribute="value"> … </element_name>
XP 21 ELEMENTS AND ATTRIBUTES: ADDING ELEMENTS TO THE JAZZ.XML FILE This figure shows the revised document, with the document elements labeled.
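As an illustration, the sketch below attaches a hypothetical length attribute to hypothetical track elements. Neither name is taken from the tutorial files.

```xml
<cd>
  <!-- Attribute value in double quotes -->
  <track length="9:22">So What</track>
  <!-- Single quotes are equally acceptable -->
  <track length='5:37'>Blue in Green</track>
</cd>
```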
XP 22 CHARACTER REFERENCES Special characters, such as the symbol for the British pound, can be inserted into your XML document by using a character reference. The syntax is: &#nnn;
XP 23 CHARACTER REFERENCES Here, nnn is a character reference number from the ISO/IEC character set. Numeric character references in XML are the same as in HTML.
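For instance, the British pound symbol has the reference number 163. The price element and its value below are made up for illustration.

```xml
<!-- &#163; is the character reference for the pound sign -->
<price>&#163;8.95</price>
```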
XP 24 CHARACTER REFERENCES This figure shows commonly used character reference numbers
XP 25 CHARACTER REFERENCES This figure shows the revised Jazz.XML file, with the character reference labeled.
XP 26 PARSED CHARACTER DATA Parsed character data, or PCDATA, consists of all those characters that XML treats as part of the code of an XML document: –The XML declaration –The opening and closing tags of an element –Empty element tags –Character or entity references –Comments
XP 27 CDATA SECTIONS A CDATA section is a large block of text the XML processor will interpret only as text. The syntax to create a CDATA section is: <![CDATA[ Text Block ]]>
XP 28 CDATA SECTIONS In this example, a CDATA section stores several HTML tags within an element named HTMLCODE: <![CDATA[ The Jazz Warehouse Your Online Store for Jazz Music ]]>
XP 29 CDATA SECTIONS This figure shows the revised Jazz.XML file, with the CDATA section labeled.
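A possible form of this example is sketched below. The h1 and h2 tags are an assumption, since the HTML tags on the original slide are not available. Only the HTMLCODE element name and the text come from the slide.

```xml
<HTMLCODE>
<![CDATA[
  <h1>The Jazz Warehouse</h1>
  <h2>Your Online Store for Jazz Music</h2>
]]>
</HTMLCODE>
```

Because the tags sit inside the CDATA section, the processor treats them as plain text and not as markup.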
XP 30 PARSING AN XML DOCUMENT
XP 31 DISPLAYING AN XML DOCUMENT IN A WEB BROWSER XML documents can be opened in Internet Explorer or in Netscape Navigator. If there are no syntax errors, IE will display the document’s contents in an expandable/collapsible outline format including all markup tags. Netscape will display the contents but neither the tags nor the nested elements.
XP 32 DISPLAYING AN XML DOCUMENT IN A WEB BROWSER To display the Jazz.xml file in a Web browser: 1. Start the browser and open the Jazz.xml file located in the tutorial.01x/tutorial folder of your Data Disk. 2. Click the minus (-) symbols. 3. Click the resulting plus (+) symbols.
XP 33 DISPLAYING AN XML DOCUMENT IN A WEB BROWSER
XP 34 LINKING TO A STYLE SHEET Link the XML document to a style sheet to format the document. The XML processor will combine the style sheet with the XML document and apply any formatting codes defined in the style sheet to display a formatted document. There are two main style sheet languages used with XML: –Cascading Style Sheets (CSS) –Extensible Stylesheet Language (XSL)
XP 35 LINKING TO A STYLE SHEET There are some important benefits to using style sheets: –By separating content from format, you can concentrate on the appearance of the document –Different style sheets can be applied to the same XML document –Any style sheet changes will be automatically reflected in any Web page based upon the style sheet
XP 36 APPLYING A STYLE TO AN ELEMENT To apply a style to an element, use the following syntax: selector {attribute1:value1; attribute2:value2; …} Here, selector is an element (or set of elements) from the XML document, and attribute and value are the style attributes and attribute values to be applied to that element.
XP 37 APPLYING A STYLE TO AN ELEMENT For example: artist {color:red; font-weight:bold} will display the text of the artist element in a red boldface type.
XP 38 CREATING PROCESSING INSTRUCTIONS The link from the XML document to a style sheet is created using a processing instruction. A processing instruction is a command that gives instructions to the XML parser.
XP 39 CREATING PROCESSING INSTRUCTIONS For example: <?xml-stylesheet type="style" href="sheet" ?> Here, style is the type of style sheet to access and sheet is the name and location of the style sheet.
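A minimal sketch of such a link, assuming a CSS style sheet named jw.css stored in the same folder as the document. The cd and artist elements are placeholders.

```xml
<?xml version="1.0"?>
<!-- type names the style sheet language; href gives the file's name and location -->
<?xml-stylesheet type="text/css" href="jw.css"?>
<cd>
  <artist>Miles Davis</artist>
</cd>
```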
XP 40 THE JW.CSS STYLE SHEET This figure shows the cascading style sheet stored in the jw.css file
XP 41 LINKING TO THE JW.CSS STYLE SHEET This figure shows how to link the jw.css style sheet to the Jazz.xml file, with the processing instruction that accesses the jw.css style sheet labeled.
XP 42 THE JAZZ.XML DOCUMENT FORMATTED WITH THE JW.CSS STYLE SHEET This figure shows the formatted jazz.xml file
XP 43 TUTORIAL 2 WORKING WITH NAMESPACES
XP 44 COMBINING XML VOCABULARIES A document that combines several vocabularies is known as a compound document
XP 45 WORKING WITH NAMESPACES Name collision occurs when elements from two or more documents share the same name. Name collision is not a problem if you are not concerned with validation. The document content only needs to be well-formed. However, name collision will keep a document from being validated.
XP 46 NAME COLLISION This figure shows name collision
XP 47 DECLARING A NAMESPACE A namespace is a defined collection of element and attribute names. Names that belong to the same namespace must be unique. Elements can share the same name if they reside in different namespaces. Namespaces must be declared before they can be used.
XP 48 DECLARING A NAMESPACE A namespace can be declared in the prolog or as an element attribute. The syntax for an attribute used to declare a namespace in the prolog is: xmlns:prefix="URI" where URI is a Uniform Resource Identifier that assigns a unique name to the namespace, and prefix is a string of letters that associates each element or attribute in the document with the declared namespace.
XP 49 DECLARING A NAMESPACE For example, the attribute xmlns:mod="URI" declares a namespace with the prefix “mod” and the given URI. The URI does not have to be a Web address. A URI identifies a physical or an abstract resource.
XP 50 URIs, URLs, AND URNs A physical resource is a resource one can access and work with, such as a file, a Web page, or an e-mail address. A URL is one type of URI. An abstract resource is one that doesn’t have any physical existence; in that case the URI is used only as an identifier or an ID.
XP 51 TUTORIAL 3 VALIDATING AN XML DOCUMENT
XP 52 CREATING A VALID DOCUMENT You validate documents to make certain necessary elements are never omitted. For example, each customer order should include a customer name, address, and phone number.
XP 53 CREATING A VALID DOCUMENT Some elements and attributes may be optional, for example an address. An XML document can be validated using either DTDs (Document Type Definitions) or schemas.
XP 54 CUSTOMER INFORMATION COLLECTED BY KRISTEN This figure shows customer information collected by Kristen
XP 55 THE STRUCTURE OF KRISTEN’S DOCUMENT This figure shows the overall structure of Kristen’s document
XP 56 DECLARING A DTD A DTD can be used to: – Ensure all required elements are present in the document – Prevent undefined elements from being used – Enforce a specific data structure – Specify the use of attributes and define their possible values – Define default values for attributes – Describe how the parser should access non-XML or non-textual content
XP 57 DECLARING A DTD There can only be one DTD per XML document. A document type definition is a collection of rules or declarations that define the content and structure of the document. A document type declaration attaches those rules to the document’s content.
XP 58 DECLARING A DTD You create a DTD by first entering a document type declaration into your XML document. DTD in this tutorial will refer to document type definition and not the declaration. While there can only be one DTD, it can be divided into two parts: an internal subset and an external subset.
XP 59 DECLARING A DTD An internal subset consists of declarations placed in the same file as the document content. An external subset is located in a separate file.
XP 60 DECLARING A DTD The DOCTYPE declaration for an internal subset is: <!DOCTYPE root [ declarations ]> Where root is the name of the document’s root element, and declarations are the statements that comprise the DTD.
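A small sketch of an internal subset follows. The customer, name, and phone elements and their values are illustrative and are not taken from Kristen's document.

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE customer [
  <!-- The root name in the DOCTYPE must match the document's root element -->
  <!ELEMENT customer (name, phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<customer>
  <name>Jane Doe</name>
  <phone>555-0100</phone>
</customer>
```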
XP 61 DECLARING A DTD The DOCTYPE declaration for external subsets can take two forms: one that uses a SYSTEM location and one that uses a PUBLIC location. The syntax is: <!DOCTYPE root SYSTEM "uri"> or <!DOCTYPE root PUBLIC "identifier" "uri">
XP 62 DECLARING A DTD Here, root is the document’s root element, identifier is a text string that tells an application how to locate the external subset, and uri is the location and filename of the external subset. Use the PUBLIC location form when the DTD needs to be limited to an internal system or when the XML document is part of an old SGML application.
XP 63 DECLARING A DTD The SYSTEM location form specifies the name and location of the external subset through the “uri” value. Unless your application requires a public identifier, you should use the SYSTEM location form.
XP 64 DECLARING A DTD A DOCTYPE declaration can indicate both an external and an internal subset. The syntax is: <!DOCTYPE root SYSTEM "uri" [ declarations ]> or <!DOCTYPE root PUBLIC "identifier" "uri" [ declarations ]>
XP 65 DECLARING A DTD If you place the DTD within the document, it is easier to compare the DTD to the document’s content. However, the real power of XML comes from an external DTD that can be shared among many documents written by different authors.
XP 66 DECLARING A DTD If a document contains both an internal and an external subset, the internal subset takes precedence over the external subset if there is a conflict between the two. This way, the external subset would define basic rules for all the documents, and the internal subset would define those rules specific to each document.
XP 67 COMBINING AN EXTERNAL AND INTERNAL DTD SUBSET This figure shows how to combine an external and an internal DTD subset
XP 68 WRITING THE DOCUMENT TYPE DECLARATION This figure shows how to insert an internal DTD subset
XP 69 DECLARING DOCUMENT ELEMENTS Every element used in the document must be declared in the DTD for the document to be valid. An element type declaration specifies the name of the element and indicates what kind of content the element can contain.
XP 70 DECLARING DOCUMENT ELEMENTS The element declaration syntax is: <!ELEMENT element content-model> where element is the element name and content-model specifies what type of content the element contains.
XP 71 DECLARING DOCUMENT ELEMENTS The element name is case sensitive. DTDs define five different types of element content: – Any elements. No restrictions on the element’s content. – Empty elements. The element cannot store any content.
XP 72 DECLARING DOCUMENT ELEMENTS – #PCDATA. The element can only contain parsed character data. – Elements. The element can only contain child elements. – Mixed. The element contains both a text string and child elements.
XP 73 TYPES OF ELEMENT CONTENT ANY content: The declared element can store any type of content. The syntax is: <!ELEMENT element ANY> EMPTY content: This is reserved for elements that store no content. The syntax is: <!ELEMENT element EMPTY>
XP 74 TYPES OF ELEMENT CONTENT Parsed Character Data content: These elements can only contain parsed character data. The syntax is: <!ELEMENT element (#PCDATA)> The keyword #PCDATA stands for “parsed-character data” and is any well-formed text string.
XP 75 TYPES OF ELEMENT CONTENT ELEMENT content: The syntax for declaring that elements contain only child elements is: <!ELEMENT element (children)> where children is a list of child elements.
XP 76 TYPES OF ELEMENT CONTENT The declaration <!ELEMENT customer (phone)> indicates the customer element can only have one child, named phone. You cannot repeat the same child element more than once with this declaration.
XP 77 ELEMENT SEQUENCES AND CHOICES A sequence is a list of elements that follow a defined order. The syntax is: <!ELEMENT element (child1, child2, …)> The order of the child elements must match the order defined in the element declaration. A sequence can be applied to the same child element.
XP 78 ELEMENT SEQUENCES AND CHOICES Thus, <!ELEMENT customer (phone, phone, phone)> indicates the customer element should contain three child elements for each customer.
XP 79 ELEMENT SEQUENCES AND CHOICES Choice is the other way to list child elements and presents a set of possible child elements. The syntax is: <!ELEMENT element (child1 | child2 | …)> where child1, child2, etc. are the possible child elements of the parent element.
XP 80 ELEMENT SEQUENCES AND CHOICES For example, <!ELEMENT customer (name | company)> allows the customer element to contain either the name element or the company element. However, you cannot have both the name and the company child elements since the choice model allows only one of the child elements.
XP 81 MODIFYING SYMBOLS Modifying symbols are symbols appended to the content model to indicate the number of occurrences of each element. There are three modifying symbols: – a question mark (?), which allows zero or one of the item. – a plus sign (+), which allows one or more of the item. – an asterisk (*), which allows zero or more of the item.
XP 82 MODIFYING SYMBOLS For example, <!ELEMENT customers (customer+)> would allow one or more customer elements to be placed within the customers element. Modifying symbols can be applied within sequences or choices. They can also modify entire element sequences or choices by placing the character immediately following the closing parenthesis of the sequence or choice.
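Sequences, choices, and modifying symbols can be combined in one content model. The sketch below is one possible arrangement. The email element and all sample values are assumptions added for illustration.

```xml
<!DOCTYPE customers [
  <!-- One or more customer elements -->
  <!ELEMENT customers (customer+)>
  <!-- Either a name or a company, then zero or more phones, then an optional email -->
  <!ELEMENT customer ((name | company), phone*, email?)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<customers>
  <customer>
    <name>Jane Doe</name>
    <phone>555-0100</phone>
    <phone>555-0101</phone>
  </customer>
  <customer>
    <company>Acme Ltd</company>
    <email>orders@example.com</email>
  </customer>
</customers>
```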
XP 83 MIXED CONTENT Mixed content elements contain both character data and child elements. The syntax is: <!ELEMENT element (#PCDATA | child1 | child2 | …)*> This form applies the * modifying symbol to a choice of character data or elements. Therefore, the parent element can contain character data or any number of the specified child elements, or it can contain no content at all.
XP 84 MIXED CONTENT Because you cannot constrain the order in which the child elements appear or control the number of occurrences for each element, it is better not to work with mixed content if you want a tightly structured document.
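A brief sketch of a mixed content declaration and a matching element. The note, emphasis, and link names are hypothetical.

```xml
<!DOCTYPE note [
  <!-- #PCDATA must come first, and the whole choice takes the * symbol -->
  <!ELEMENT note (#PCDATA | emphasis | link)*>
  <!ELEMENT emphasis (#PCDATA)>
  <!ELEMENT link (#PCDATA)>
]>
<note>Ship <emphasis>today</emphasis> and see <link>order 42</link> for details.</note>
```

The DTD accepts the child elements in any order and any number, which is the loss of control the slide warns about.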
XP 85 DECLARING ELEMENT ATTRIBUTES For a document to be valid, all the attributes associated with elements must also be declared. To enforce attribute properties, you must add an attribute-list declaration to the document’s DTD.
XP 86 ELEMENT ATTRIBUTES IN KRISTEN’S DOCUMENT This figure shows element attributes in Kristen's document
XP 87 DECLARING ELEMENT ATTRIBUTES The attribute-list declaration : – Lists the names of all attributes associated with a specific element – Specifies the data type of the attribute – Indicates whether the attribute is required or optional – Provides a default value for the attribute, if necessary
XP 88 DECLARING ELEMENT ATTRIBUTES The syntax to declare a list of attributes is: <!ATTLIST element attribute1 type1 default1 attribute2 type2 default2 attribute3 type3 default3…> Where element is the name of the element associated with the attributes, attribute is the name of an attribute, type is the attribute’s data type, and default indicates whether the attribute is required or implied, and whether it has a fixed or default value.
XP 89 DECLARING ELEMENT ATTRIBUTES Attribute-list declarations can be placed anywhere within the document type declaration, although it is easier if they are located adjacent to the declaration for the element with which they are associated.
XP 90 WORKING WITH ATTRIBUTE TYPES While all attribute types are text strings, you can control the type of text used with the attribute. There are three general categories of attribute values: – CDATA – enumerated – Tokenized CDATA types are the simplest form and can contain any character except those reserved by XML. Enumerated types are attributes that are limited to a set of possible values.
XP 91 WORKING WITH ATTRIBUTE TYPES The general form of an enumerated type is: attribute (value1 | value2 | value3 | …) For example, the following declaration: <!ATTLIST customer custType (home | business)> restricts custType to either “home” or “business”.
XP 92 WORKING WITH ATTRIBUTE TYPES Another type of enumerated attribute is notation. It associates the value of the attribute with a declaration located elsewhere in the DTD. The notation provides information to the XML parser about how to handle non-XML data. Tokenized types are text strings that follow certain rules for the format and content. The syntax is: attribute token
XP 93 WORKING WITH ATTRIBUTE TYPES There are seven tokenized types. For example, the ID token is used with attributes that require unique values. For example, if a customer ID needs to be unique, you may use the ID token: customer custID ID This ensures each customer will have a unique ID.
XP 94 ATTRIBUTE TYPES This figure shows the attribute types
XP 95 ATTRIBUTE DEFAULTS The final part of an attribute declaration is the attribute default. There are four possible defaults: – #REQUIRED: The attribute must appear with every occurrence of the element. – #IMPLIED: The attribute is optional. – An optional default value: A validating XML parser will supply the default value if one is not specified. – #FIXED: The attribute is optional, but if one is specified, it must match the default.
XP 96 INSERTING ATTRIBUTE-LIST DECLARATIONS This figure shows the revised contents of the Orders.xml file, with the attribute declaration labeled.
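The four kinds of default can be seen side by side in a single attribute-list declaration. In this sketch custID and custType follow the slides, while the title and country attributes and all values are assumptions added for illustration.

```xml
<!DOCTYPE customer [
  <!ELEMENT customer (#PCDATA)>
  <!ATTLIST customer
    custID   ID                #REQUIRED
    custType (home | business) "home"
    title    CDATA             #IMPLIED
    country  CDATA             #FIXED "US">
]>
<!-- custID is required; custType overrides its default; title and country are omitted -->
<customer custID="c1001" custType="business">Jane Doe</customer>
```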
XP 97 WORKING WITH ENTITIES Entities are storage units for a document’s content. The most fundamental entity is the XML document itself and is known as the document entity. Entities can also refer to: – a text string – a DTD – an element or attribute declaration – an external file containing character or binary data
XP 98 WORKING WITH ENTITIES Entities can be declared in a DTD. How to declare an entity depends on how it is classified. There are three factors involved in classifying entities: – The content of the entity – How the entity is constructed – Where the definition of the entity is located.
XP 99 GENERAL PARSED ENTITIES General entities are declared in the DTD of a document. The syntax is: <!ENTITY entity "value"> where entity is the name assigned to the entity and value is the general entity’s value. For example, an entity named “DCT5Z” can be created to store a product description: <!ENTITY DCT5Z "Tapan Digital Camera 5 Mpx - zoom">
XP 100 GENERAL PARSED ENTITIES After an entity is declared, it can be referenced anywhere within the document. &DCT5Z; This is interpreted as Tapan Digital Camera 5 Mpx - zoom
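Put together, the declaration and the reference might look like the following sketch. The item element is a placeholder.

```xml
<!DOCTYPE item [
  <!ELEMENT item (#PCDATA)>
  <!ENTITY DCT5Z "Tapan Digital Camera 5 Mpx - zoom">
]>
<!-- The parser replaces &DCT5Z; with the entity's value -->
<item>&DCT5Z;</item>
```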
XP 101 ENTITIES IN THE ITEMS.DTD FILE This figure shows the entities in the codestxt.dtd file, listing each entity name and entity value.
XP 102 PARAMETER ENTITIES Parameter entities are used to store the content of a DTD. For internal parameter entities, the syntax is: <!ENTITY % entity "value"> where entity is the name of the parameter entity and value is a text string of the entity’s value. For external parameter entities, the syntax is: <!ENTITY % entity SYSTEM "uri"> where entity is the name assigned to the parameter entity and uri is the location of the external file that holds the DTD content.
XP 103 PARAMETER ENTITIES Parameter entity references can only be placed where a declaration would normally occur, such as an internal or external DTD. Parameter entities used with an internal DTD do not offer any time or effort savings. However, an external parameter entity can allow XML to use more than one DTD per document by combining declarations from multiple DTDs.
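One way to combine two external DTDs is sketched below. Only the document type declaration is shown. The file names customers.dtd and items.dtd, and the customer and item elements they are assumed to declare, are hypothetical.

```xml
<!DOCTYPE order [
  <!-- Declare one external parameter entity per DTD file -->
  <!ENTITY % customers SYSTEM "customers.dtd">
  <!ENTITY % items SYSTEM "items.dtd">
  <!-- Reference each entity where its declarations should be inserted -->
  %customers;
  %items;
  <!ELEMENT order (customer, item+)>
]>
```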
XP 104 USING PARAMETER ENTITIES TO COMBINE MULTIPLE DTDS This figure shows how to combine multiple DTDs using parameter entities
XP 105 UNPARSED ENTITIES You need to create an unparsed entity in order to reference binary data such as images or video clips, or character data that is not well formed. The unparsed entity includes instructions for how the unparsed entity should be treated. A notation is declared that identifies a resource to handle the unparsed data.
XP 106 UNPARSED ENTITIES For example, to create a notation named “audio” that points to an application Recorder.exe: <!NOTATION audio SYSTEM "Recorder.exe"> Once the notation has been declared, you then declare an unparsed entity that instructs the XML parser to associate the data to the notation.
XP 107 UNPARSED ENTITIES For example, to take unparsed data in an audio file and assign it to an unparsed entity named “Theme”, use the following: <!ENTITY Theme SYSTEM "uri" NDATA audio> Here, uri is the location of the audio file, and the notation is the audio notation that points to the Recorder.exe file. This declaration does not tell the Recorder.exe application to run the file but simply identifies for the XML parser what resource is able to handle the unparsed data.
XP 108 URIs, URLs, AND URNs A proposed type of URI is the URN or Uniform Resource Name. A URN is a persistent resource identifier, meaning the user need only know the name of a resource. An agency would then retrieve a copy of the resource independent of its location. URNs take the form: urn:NID:NSS
XP 109 APPLYING A NAMESPACE TO AN ELEMENT Once it has been declared and its URI specified, the namespace is applied to elements and attributes by inserting the namespace prefix before each element name that belongs to the namespace: <prefix:element> content </prefix:element> Here, prefix is the namespace prefix and element is the local part of the element name.
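The notation and the unparsed entity are typically used together with an attribute of type ENTITY. In the sketch below, the file name theme.wav, the clip element, and its source attribute are assumptions for illustration.

```xml
<!DOCTYPE clip [
  <!NOTATION audio SYSTEM "Recorder.exe">
  <!-- NDATA marks the entity as unparsed and ties it to the audio notation -->
  <!ENTITY Theme SYSTEM "theme.wav" NDATA audio>
  <!ELEMENT clip EMPTY>
  <!ATTLIST clip source ENTITY #REQUIRED>
]>
<!-- The attribute value is the entity name, written without & and ; -->
<clip source="Theme"/>
```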
XP 110 APPLYING A NAMESPACE TO AN ELEMENT Prefixed names are called qualified names and an element name without a namespace prefix is called an unqualified name. Qualified names can be added to a document using code entered directly into the document. However, the more common way is to add the xmlns attribute to an element.
XP 111 DECLARING A NAMESPACE AS AN ELEMENT ATTRIBUTE The syntax is: xmlns:prefix="URI" where prefix and URI are the prefix and URI for the namespace.
XP 112 DECLARING A NAMESPACE AS AN ELEMENT ATTRIBUTE For example, the code: Laser4C (PR205) Entry level color laser printer color laser 320
XP 113 DECLARING A NAMESPACE AS AN ELEMENT ATTRIBUTE …declares the namespace on the model element, and the declaration is in scope for the model element and all of its child elements. The “mod” prefix was only added to the model element name, so only the model element is explicitly qualified.
XP 114 DECLARING A NAMESPACE AS AN ELEMENT ATTRIBUTE The child elements are unqualified elements because they lack a namespace prefix; under the W3C Namespaces recommendation, an unprefixed element belongs to a namespace only when a default namespace is in scope. Declaring a namespace as an attribute of the document’s root element makes it available to all elements, since all elements are descendants of the root element.
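A sketch of a namespace declared as an attribute and then applied with qualified names. The URI http://example.com/models and the child element names title, description, type, and ordered are placeholders, since the original URI and tags are not available.

```xml
<mod:model xmlns:mod="http://example.com/models">
  <!-- The mod prefix is usable on any element inside mod:model -->
  <mod:title>Laser4C (PR205)</mod:title>
  <mod:description>Entry level color laser printer</mod:description>
  <mod:type>color laser</mod:type>
  <mod:ordered>320</mod:ordered>
</mod:model>
```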
XP 115 DECLARING A DEFAULT NAMESPACE You can specify a default namespace by omitting the prefix in the namespace declaration. The element containing the namespace attribute and all of its child elements are assumed to be part of the default namespace.
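With a default namespace, no prefix is needed. The sketch below uses a placeholder URI and placeholder element names.

```xml
<!-- xmlns without a prefix declares the default namespace -->
<model xmlns="http://example.com/models">
  <title>Laser4C (PR205)</title>
  <type>color laser</type>
</model>
```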
XP 116 USING NAMESPACES WITH ATTRIBUTES Attributes, like elements, can become qualified by adding the namespace prefix to the attribute name. For example, <element prefix:attribute="value"> content </element>
XP 117 USING NAMESPACES WITH ATTRIBUTES No element may contain two attributes with the same name. No element may contain two qualified attribute names with the same local part, pointing to identical namespaces, even if the prefixes are different.
XP 118 ADDING A NAMESPACE TO A STYLE SHEET: DECLARING A NAMESPACE To declare a namespace in a style sheet, you add the following rule to the style sheet: @namespace prefix url(uri); where prefix is the namespace prefix and uri is the URI of the namespace. For example: @namespace mod url(
XP 119 APPLYING A NAMESPACE TO A SELECTOR Once you’ve declared a namespace in a style sheet, you can associate selectors with that namespace using the syntax: prefix|selector {attribute1:value1; attribute2:value2;…} For example: mod|title {width: 150px} You also can use the wildcard symbol (*) to apply a style to any element within a namespace or to elements across different namespaces
XP 120 DEFINING NAMESPACES WITH THE ESCAPE CHARACTER Not all browsers support the use of the @namespace rule. A proposal implemented in the Internet Explorer browser was to escape the colon after the namespace prefix with a backslash in CSS style sheets: prefix\:selector {attribute1:value1; attribute2:value2;…} Browsers like Firefox, Opera, and Netscape do not support this method with XML documents.
XP 121 DECLARING AND APPLYING A NAMESPACE IN A STYLE SHEET To declare a namespace in a CSS style sheet, add the following rule before any style rules: @namespace prefix url(uri); where prefix is the namespace prefix and uri is the namespace URI. If no prefix is specified, the namespace URI is the default namespace for selectors in the style sheet. To apply a namespace to a selector, use the form prefix|selector {attribute1:value1; attribute2:value2;...} where prefix is the namespace prefix and selector is a selector for an element or group of elements in the document. For Internet Explorer browsers, use the following form to apply a namespace to a selector: prefix\:selector {attribute1:value1; attribute2:value2;...}
XP 122 COMBINING STANDARD VOCABULARIES Standard vocabularies may be combined within single documents
XP 123 CONVERTING HTML TO XHTML Use your text editor to open the reptxt.htm file from the tutorial.02x/tutorial folder. Enter your name and the date in the comment section at the top of the document. Save the file as report.htm. Insert the following xml declaration as the very first line in the file (above the comment section): Add the following attribute to the opening tag: xmlns="
XP 124 CONVERTING HTML TO XHTML
XP 125 ADDING THE ELEMENTS OF THE PARTS VOCABULARY Return to the order.xml file in your text editor. Copy the parts element from the parts namespace, including all of the elements and contents it contains. Return to the report.htm file in your text editor and paste the copied elements directly below the h2 heading “Parts List.” Add the following attribute to the opening tag: xmlns:pa=" Below the link element that links the report.htm file to the report.css style sheet, insert the following link element: Save the changes and open the report.htm file in your Web browser
XP 126 ADDING THE ELEMENTS OF THE PARTS VOCABULARY
XP 127 DESCRIBING THE ITEMS IN THE PARTS LIST Return to the report.htm file in your text editor. Scroll down to the first title element in the parts namespace. Directly after the opening tag, insert the text Title Directly after the opening tag in the next line, insert the text Description Directly after the opening tag in the following line, insert the text Parts in Stock Repeat the previous 3 steps, as necessary
XP 128 DESCRIBING THE ITEMS IN THE PARTS LIST
XP 129 DESCRIBING THE ITEMS IN THE PARTS LIST
XP 130 ADDING ELEMENTS FROM THE MODELS VOCABULARY Return to the report.htm file in your text editor and add the following namespace declaration to the opening tag: xmlns:mod=" Add the following link to the document’s head: In the table cell directly after the Title table heading, insert the element Laser4C (PR205)
XP 131 ADDING ELEMENTS FROM THE MODELS VOCABULARY In the table cell directly after the Description table heading, insert the element Entry level color laser printer In the table cell directly after the Type table heading, insert the element color laser In the table cell directly after the “Items to be Built” table heading, insert the element 320
XP 132 ADDING ELEMENTS FROM THE MODELS VOCABULARY
XP 133 TUTORIAL 4 WORKING WITH SCHEMAS
XP 134 SCHEMAS A schema is an XML document that defines the content and structure of one or more XML documents. The XML document containing the content is called the instance document.
XP 135 COMPARING SCHEMAS AND DTDS This figure compares schemas and DTDs
XP 136 SCHEMA VOCABULARIES There is no single schema form. Several schema “vocabularies” have been developed in the XML language. Support for a particular schema depends on the XML parser being used for validation.
XP 137 SCHEMA VOCABULARIES This figure shows a few schema vocabularies
XP 138 STARTING A SCHEMA FILE A schema is always placed in a separate XML document that is referenced by the instance document.
XP 139 ELEMENTS AND ATTRIBUTES OF THE PATIENTS DOCUMENT This figure shows the elements and attributes of the patients.xml document
XP 140 SCHEMA TYPES XML Schema recognizes two categories of element types: complex and simple. A complex type element has one or more attributes, or is the parent to one or more child elements. A simple type element contains only character data and has no attributes.
XP 141 SCHEMA TYPES This figure shows types of elements
XP 142 SIMPLE TYPE ELEMENTS Use the following syntax to declare a simple type element in XML Schema: <element name="name" type="type" /> Here, name is the name of the element in the instance document and type is the data type of the element. If a namespace prefix is used with the XML Schema namespace, any XML Schema tags must be qualified with the namespace prefix.
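A minimal schema sketch with one simple type element. The xs prefix is a common convention and the lastName element is hypothetical. Neither is taken from the patients document.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Because the xs prefix is declared, every XML Schema tag carries it -->
  <xs:element name="lastName" type="xs:string"/>
</xs:schema>
```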
XP 143 UNDERSTANDING DATA TYPES XML Schema supports two data types: built-in and user-derived. A built-in data type is part of the XML Schema specifications and is available to all XML Schema authors. A user-derived data type is created by the XML Schema author for specific data values in the instance document.
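XP 144 DECLARING AN ATTRIBUTE An attribute is another example of a simple type. The syntax to define an attribute is <xs:attribute name="name" type="type" default="default" fixed="fixed" /> where name is the name of the attribute, type is the data type, default is the attribute’s default value, and fixed is a fixed value for the attribute. The default and fixed attributes are optional and cannot both appear in the same declaration.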
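As a rough illustration of simple type declarations, the sketch below declares two simple type elements and two attributes at the top level of a schema. The names lastName, age, status, and units are hypothetical and are not taken from the patients document; the xs prefix is likewise an assumed choice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical names throughout; only the xs: vocabulary is standard -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Simple type elements: text content only, no attributes -->
  <xs:element name="lastName" type="xs:string" />
  <xs:element name="age" type="xs:integer" />

  <!-- Attribute with a default value, used when the attribute is omitted -->
  <xs:attribute name="status" type="xs:string" default="active" />

  <!-- Attribute with a fixed value; any other value is invalid -->
  <xs:attribute name="units" type="xs:string" fixed="mg" />

</xs:schema>
```

Note that each attribute carries either default or fixed, never both.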
XP 145 ASSOCIATING ATTRIBUTES AND ELEMENTS The basic structure for defining a complex type element with XML Schema is <xs:element name="name"> <xs:complexType> declarations </xs:complexType> </xs:element> where name is the name of the element and declarations is the set of schema commands specific to the type of complex element being defined.
XP 146 ASSOCIATING ATTRIBUTES AND ELEMENTS Four complex type elements that usually appear in an instance document are the following: – The element is an empty element and contains only attributes. – The element contains textual content and attributes but no child elements. – The element contains child elements but not attributes. – The element contains both child elements and attributes.
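XP 147 EMPTY ELEMENTS AND ATTRIBUTES The code to declare the attributes of an empty element is <xs:element name="name"> <xs:complexType> attributes </xs:complexType> </xs:element> where attributes is the set of declarations that define the attributes associated with the element.
XP 148 SIMPLE CONTENT AND ATTRIBUTES If an element is not empty and contains textual content (but no child elements), the structure of the complex type element is slightly different: <xs:element name="name"> <xs:complexType> <xs:simpleContent> <xs:extension base="type"> attributes </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> where type is the data type of the element’s text content and attributes is the set of attribute declarations.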
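A minimal sketch of the simple-content structure, assuming a hypothetical weight element whose text is a decimal number and which carries a units attribute (neither name comes from the tutorial files):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element: text content plus one attribute, no children -->
  <xs:element name="weight">
    <xs:complexType>
      <xs:simpleContent>
        <!-- base gives the data type of the element's text -->
        <xs:extension base="xs:decimal">
          <xs:attribute name="units" type="xs:string" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Under this declaration, an instance such as `<weight units="kg">72.5</weight>` would be accepted, while `<weight units="kg">heavy</weight>` would not, because the text is not a decimal number.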
XP 149 SPECIFYING THE USE OF AN ATTRIBUTE An attribute may or may not be required with a particular element. To indicate whether an attribute is required, you add the use attribute to the attribute declaration or reference. The use attribute has the following values: – required: the attribute must always appear with the element – optional: the use of the attribute is optional with the element – prohibited: the attribute cannot be used with the element
XP 150 REFERENCING AN ELEMENT OR ATTRIBUTE XML Schema allows for a great deal of flexibility in designing complex types. Rather than nesting the attribute declaration within the element, you can create a reference to it. The code to create a reference to an element or attribute declaration is <xs:element ref="elemName" /> or <xs:attribute ref="attName" /> where elemName is the name used in an element declaration and attName is the name used in an attribute declaration.
XP 151 WORKING WITH CHILD ELEMENTS Another kind of complex type element contains child elements, but no attributes. To define these child elements, use the code structure <xs:element name="name"> <xs:complexType> <xs:compositor> elements </xs:compositor> </xs:complexType> </xs:element> where elements is the list of simple type element declarations for each child element, and compositor defines how the child elements are organized.
XP 152 USING COMPOSITORS XML Schema supports the following compositors: – sequence defines a specific order for the child elements – choice allows any one of the child elements to appear in the instance document – all allows any of the child elements to appear in any order in the instance document; however, each must appear either only once or not at all.
XP 153 WORKING WITH CHILD ELEMENTS AND ATTRIBUTES The code for a complex type element that contains both attributes and child elements is <xs:element name="name"> <xs:complexType> <xs:compositor> elements </xs:compositor> attributes </xs:complexType> </xs:element> with the attribute declarations placed after the compositor.
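The following sketch combines a compositor, child elements, and a required attribute in one declaration. The patient, lastName, firstName, and patID names are assumptions made for illustration; the actual patients.xml vocabulary may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical complex type element with children and an attribute -->
  <xs:element name="patient">
    <xs:complexType>
      <!-- sequence: children must appear in exactly this order -->
      <xs:sequence>
        <xs:element name="lastName" type="xs:string" />
        <xs:element name="firstName" type="xs:string" />
      </xs:sequence>
      <!-- attribute declarations follow the compositor -->
      <xs:attribute name="patID" type="xs:ID" use="required" />
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Replacing xs:sequence with xs:choice would allow only one of the two children, and xs:all would allow both in either order.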
XP 154 SPECIFYING MIXED CONTENT When the mixed attribute is set to the value “true,” XML Schema assumes that the element contains both text and child elements. The structure of the child elements can then be defined with the conventional method. For example, the XML content Patient Cynthia Davis was enrolled in the Tamoxifen Study on 8/15/2003. can be declared in the schema file using the following complex type:
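The element names in the original example were lost, so the sketch below uses assumed names (summary, name, study, date) to show what a mixed-content declaration for a sentence like the one above might look like.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element names; mixed="true" permits text between children -->
  <xs:element name="summary">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="study" type="xs:string" />
        <!-- xs:string because 8/15/2003 is not in the xs:date format -->
        <xs:element name="date" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With this declaration, an instance such as `<summary>Patient <name>Cynthia Davis</name> was enrolled in the <study>Tamoxifen Study</study> on <date>8/15/2003</date>.</summary>` would be valid.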
XP 155 APPLYING A SCHEMA To attach a schema to the document, you must do the following: – Declare a namespace for XML Schema in the instance document. – Indicate the location of the schema file. To declare the XML Schema namespace in the instance document, you add the following attribute to the document’s root element: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
XP 156 APPLYING A SCHEMA If there is no namespace for the contents of the instance document, add the following attribute to the root element: xsi:noNamespaceSchemaLocation="schema" where schema is the location and name of the schema file.
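Put together, the two attributes might appear on the root element of an instance document as sketched here. The schema file name patients.xsd and the child elements are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- patients.xsd is an assumed file name; the content is illustrative only -->
<patients xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="patients.xsd">
  <patient patID="p1">
    <lastName>Davis</lastName>
    <firstName>Cynthia</firstName>
  </patient>
</patients>
```

A validating parser that supports XML Schema would then look for patients.xsd relative to the location of the instance document.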
XP 157 UNDERSTANDING DATA TYPES A primitive data type, also called a base type, is one of 19 fundamental data types not defined in terms of other types. A derived data type is one of 25 data types that the XML Schema developers created based on the 19 primitive types.
XP 158 UNDERSTANDING DATA TYPES This figure shows the 44 built-in data types
XP 159 UNDERSTANDING DATA TYPES This figure shows a partial description of XML string data types
XP 160 UNDERSTANDING DATA TYPES This figure shows a partial description of XML numeric data types
XP 161 UNDERSTANDING DATA TYPES This figure shows a partial description of XML date and time data types
XP 162 DERIVING NEW DATA TYPES Three components are involved in deriving new data types: – Value space: the set of values that correspond to the data type. – Lexical space: the set of textual representations of the value space. – Facets: the properties of the data type that distinguish one data type from another.
XP 163 USER DERIVED DATA New data types fall into three categories: – List: a list of values in which each item is derived from a base type. – Union: the combination of two or more data types. – Restriction: a limit placed on the facets of a base type.
XP 164 DERIVING A RESTRICTED DATA TYPE The most common way to derive a new data type is to restrict the properties of a base type. XML Schema provides twelve constraining facets for this purpose.
XP 165 CONSTRAINING FACETS This figure shows the 12 constraining facets
XP 166 THE PATTERNS FACET A pattern can be created with a formatted text string called a regular expression or regex. To apply a regular expression in a data type, you use the code <xs:pattern value="regex" /> where regex is a regular expression pattern.
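As an illustration of a restriction that uses the pattern facet, the sketch below derives a hypothetical type named mrnType whose values must consist of the letters MR followed by groups of three, three, and two digits separated by hyphens. Both the type name and the format are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical user-derived type restricting xs:string with a regex -->
  <xs:simpleType name="mrnType">
    <xs:restriction base="xs:string">
      <!-- \d matches one digit; {3} repeats the preceding item three times -->
      <xs:pattern value="MR\d{3}-\d{3}-\d{2}" />
    </xs:restriction>
  </xs:simpleType>

  <!-- An element that uses the derived type -->
  <xs:element name="medicalRecordNumber" type="mrnType" />

</xs:schema>
```

A value such as MR890-041-02 fits this pattern; MR89-041-02 does not. In XML Schema, a pattern must match the entire value, so no start or end anchors are needed.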
XP 167 PATTERN QUANTIFIERS This figure shows pattern quantifiers
XP 168 WORKING WITH NAMED TYPES Since content can be either simple or complex, it is not surprising that XML Schema also allows schema authors to create customized complex types. The advantage of creating a named complex type is that the complex structure can be reused elsewhere in the schema. For example, the following code declares an element named client containing the complex content of two child elements named firstName and lastName:
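The original code for this slide was not preserved; a plausible reconstruction, assuming the named type is called fullName (a hypothetical name) and both children are strings, is:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Named complex type; fullName is an assumed name -->
  <xs:complexType name="fullName">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:sequence>
  </xs:complexType>

  <!-- The client element reuses the named type through the type attribute -->
  <xs:element name="client" type="fullName" />

</xs:schema>
```

Any other element that needs the same structure can refer to fullName in the same way, without repeating the nested declarations.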
XP 169 NAMED MODEL GROUPS A named model group is a collection, or group, of elements. The syntax for creating a model group is <xs:group name="name"> elements </xs:group> where name is the name of the model group, and elements is a collection of element declarations.
XP 170 WORKING WITH NAMED ATTRIBUTE GROUPS Attributes can be grouped into collections called named attribute groups. This is particularly useful for attributes that you want to use with several different elements in a schema. The syntax for a named attribute group is <xs:attributeGroup name="name"> attributes </xs:attributeGroup> where name is the name of the attribute group and attributes is a collection of attributes assigned to the group.
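A short sketch of both kinds of named group, and of how an element declaration might reference them, follows. The names nameGroup, idGroup, id, and status are hypothetical. In practice the element declarations inside a model group are wrapped in a compositor such as sequence.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Named model group: a reusable set of child elements -->
  <xs:group name="nameGroup">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:sequence>
  </xs:group>

  <!-- Named attribute group: a reusable set of attributes -->
  <xs:attributeGroup name="idGroup">
    <xs:attribute name="id" type="xs:ID" use="required" />
    <xs:attribute name="status" type="xs:string" />
  </xs:attributeGroup>

  <!-- Both groups are pulled in with the ref attribute -->
  <xs:element name="client">
    <xs:complexType>
      <xs:group ref="nameGroup" />
      <xs:attributeGroup ref="idGroup" />
    </xs:complexType>
  </xs:element>

</xs:schema>
```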
XP 171 STRUCTURING A SCHEMA One schema design is a Flat Catalog Design. In this design, all element declarations are made globally. The structure of the instance document is created by referencing the global element declarations. The syntax is:
XP 172 FLAT CATALOG DESIGN This figure shows a Flat Catalog design
XP 173 STRUCTURING A SCHEMA Schemas can be structured in a number of ways. One structure is called a Russian Doll design. This design involves sets of nested declarations. While this design makes it easy to associate the schema with the instance document, it can be confusing and difficult to maintain.
XP 174 RUSSIAN DOLL DESIGN This figure shows a Russian Doll design
XP 175 VENETIAN BLIND DESIGN A Venetian blind design is similar to a flat catalog, except that instead of declaring elements and attributes globally, it creates named types and references those types within a single global element. In this layout, the only globally declared element is the patients element; all other elements and attributes are placed within element or attribute groups or, in the case of the performance element, within a named complex type.
XP 176 VENETIAN BLIND DESIGN
XP 177 COMPARING SCHEMA DESIGNS This figure compares the three schema designs
XP 178 PLACING A SCHEMA IN A NAMESPACE: TARGETING A NAMESPACE To associate a schema with a namespace, you first declare the namespace and then make that namespace the target of the schema. To do this, you add the following attributes to the schema’s root element: xmlns:prefix="uri" targetNamespace="uri" where prefix is the prefix assigned to the target namespace and uri is the URI of the target namespace.
XP 179 VALIDATING A COMBINED DOCUMENT This figure shows how schemas are combined when the data is combined
XP 180 APPLYING A SCHEMA TO A DOCUMENT WITH A NAMESPACE To apply a schema to a document with a namespace, add the following attributes to the instance document’s root element: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="uri schema" where uri is the URI of the namespace and schema is the location and name of the schema file. All global elements and attributes from the schema must be qualified in the instance document.
XP 181 INCLUDING AND IMPORTING SCHEMAS To include a schema from the same namespace, add the following element as a child of the schema element: <xs:include schemaLocation="schema" /> where schema is the name and location of the schema file. To import a schema from a different namespace, use the syntax <xs:import namespace="uri" schemaLocation="schema" /> where uri is the URI of the imported schema’s namespace and schema is the name and location of the schema file.
XP 182 REFERENCING OBJECTS FROM OTHER SCHEMAS Once a schema is imported, any objects it contains with global scope can be referenced. To reference an object from an imported schema, you must declare the namespace of the imported schema in the schema element. You can then reference the object using the ref attribute, or the type attribute for customized simple and complex types.
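The sketch below shows how targeting, including, importing, and referencing might fit together in one schema file. Every URI, file name, prefix, and object name in it (the example.com namespaces, common.xsd, studies.xsd, std, studyType, enrolledIn) is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- All URIs, file names, and object names below are assumptions -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://example.com/patients"
           xmlns:std="http://example.com/studies"
           targetNamespace="http://example.com/patients">

  <!-- Same target namespace: include -->
  <xs:include schemaLocation="common.xsd" />

  <!-- Different namespace: import, naming that namespace explicitly -->
  <xs:import namespace="http://example.com/studies"
             schemaLocation="studies.xsd" />

  <!-- Reference a global type from the imported schema via its prefix -->
  <xs:element name="enrolledIn" type="std:studyType" />

</xs:schema>
```

The std prefix is declared on the schema element so that std:studyType can be resolved against the imported namespace.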
XP 183 TUTORIAL 5 WORKING WITH XSLT AND XPATH
XP 184 OBJECTIVES In this chapter, you will: – Learn about the history and theory of XSL – Understand XPath and examine a node tree – Create an XSLT style sheet – Be introduced to the syntax of the XPath language – Transform an XML document into an HTML file – Create templates to format sections of the XML document
XP 185 OBJECTIVES In this chapter, you will: – Sort the contents of an XML document – Create conditional nodes to generate different HTML code – Use predicates to select subsets of an XML document – Insert new elements and attributes in the transformed document
XP 186 THE HISTORY OF XSL In 1998, the W3C developed the Extensible Stylesheet Language, or XSL. XSL is composed of three parts: – XSL-FO (Extensible Stylesheet Language – Formatting Objects) – XSLT (Extensible Stylesheet Language Transformations) – XPath (the language used to reference nodes in a document)
XP 187 INTRODUCING XSLT STYLE SHEETS AND PROCESSORS An XSLT style sheet contains instructions for transforming the contents of an XML document into another format. An XSLT style sheet document is itself an XML document. An XSLT style sheet document has the extension .xsl.
XP 188 GENERATING A RESULT DOCUMENT An XSLT style sheet converts a source document of XML content into a result document by using the XSLT processor
XP 189 INTRODUCING XSLT STYLE SHEETS AND PROCESSORS The transformation can be performed by a server or a client. In a server-side transformation, the server receives a request from a client, applies the style sheet to the source document, and returns the result document to the client. In a client-side transformation, a client requests retrieval of both the source document and the style sheet from the server, then performs the transformation and generates the result document.
XP 190 CREATING AN XSLT STYLE SHEET The general structure of an XSLT style sheet is: <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> Content of the style sheet </xsl:stylesheet> The <xsl:transform> tag can be substituted for the <xsl:stylesheet> tag.
XP 191 WORKING WITH DOCUMENT NODES Under XPath, each component in the document is referred to as a node, and the entire structure of the document is a node tree. The node tree consists of the following objects: – the source document itself – comments – processing instructions – namespaces – elements – element text – element attributes
XP 192 NODE TREE EXAMPLE
XP 193 WORKING WITH DOCUMENT NODES At the top of the node tree is the root node. A node that contains other nodes is called a parent node, and the nodes contained in the parent are called child nodes. Nodes that share a common parent are called sibling nodes. Any node below another node is referred to as a descendant of that node.
XP 194 WORKING WITH DOCUMENT NODES Nodes are distinguished based on the object they refer to in the document. A node for an element is called an element node. The node that stores an element attribute is called an attribute node.
XP 195 USING XPATH TO REFERENCE A NODE XPath provides the syntax to refer to the various nodes in the node tree. The syntax is similar to that used by operating systems to specify file pathnames. The location of a node can be expressed in either absolute or relative terms. XPath can also be used to extract data.
XP 196 RELATIVE PATHS With a relative path, the location of the node is indicated relative to a specific node in the tree called the context node
XP 197 USING XPATH TO REFERENCE A NODE For an absolute path, XPath begins with the root node, identified by a forward slash, and proceeds down the levels of the node tree. An absolute path: /child1/child2/child3/… To reference an element without regard to its location in the node tree, use a double forward slash with the name of the descendant node: //descendant
XP 198 REFERENCING GROUPS OF ELEMENTS XPath allows you to refer to groups of nodes by using the wildcard character (*). To select all of the element nodes in the node tree, you can use the path: //* The (*) symbol matches any element node, and the (//) symbol matches any level of the node tree. Example: /portfolio/stock/* selects every child element of each stock element.
XP 199 REFERENCING ATTRIBUTE NODES XPath uses different notation to refer to attribute nodes. The syntax for an attribute node is @attribute, where attribute is the name of the attribute.
XP 200 WORKING WITH TEXT NODES The text contained in an element node is treated as a text node The syntax for referencing a text node is: text() To match all text nodes in the document, use: //text()
XP 201 CREATING THE ROOT TEMPLATE A template is a collection of elements that define how a particular section of the source document should be transformed in the result document. The root template sets up the initial code for the result document.
XP 202 CREATING A TEMPLATE To create a template, the syntax is: <xsl:template match="node set"> styles </xsl:template> where node set is an XPath expression that references a node set from the source document and styles are the XSLT styles applied to those nodes.
XP 203 CREATING A ROOT TEMPLATE To create a root template, the syntax is: <xsl:template match="/"> styles </xsl:template>
XP 204 CREATING THE ROOT TEMPLATE A template contains two types of content: XSLT elements and literal result elements. – XSLT elements are those elements that are part of the XSLT namespace and are used to send commands to the XSLT processor. – A literal result element is text sent to the result document, but not acted upon by the XSLT processor.
XP 205 CREATING THE ROOT TEMPLATE EXAMPLE
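The original figure is not available; as a stand-in, here is a minimal sketch of a style sheet whose root template writes an HTML skeleton. The title text is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root template: match="/" selects the root node of the source document -->
  <xsl:template match="/">
    <!-- Everything below is a literal result element, copied to the output -->
    <html>
      <head>
        <title>Stock Information</title>
      </head>
      <body>
        <h1>Stock Information</h1>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Here xsl:stylesheet and xsl:template are XSLT elements, while html, head, title, body, and h1 are literal result elements.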
XP 206 SPECIFYING THE OUTPUT METHOD By default, the XSLT processor will render the result document as an XML file. To control how the processor formats the result document, you can specify the output method using the <xsl:output> element.
XP 207 ATTRIBUTES OF THE <xsl:output> ELEMENT
XP 208 TRANSFORMING A DOCUMENT A browser with a built-in XSLT processor allows you to view the result document. Alternatively, you can use XML Spy to create the result document as a separate file, and then view that file in your browser. Most XSLT processors provide the capability to create the result document as a separate file.
XP 209 VIEWING THE RESULT DOCUMENT IN A BROWSER Internet Explorer 6.0 contains a built-in XSLT processor. You can view the results of the transformation by opening the result document in the browser.
XP 210 CREATING AN HTML FILE IN XML SPY One advantage of creating a separate HTML file is that it can be viewed in any Web browser. However, you have to regenerate the HTML file every time you make a change to the source document or the style sheet. The XSLT processor adds one extra line to the document that provides additional information to the browser about the content of the document and its encoding.
XP 211 TRANSFORMING THE SOURCE DOCUMENT IN XML SPY
XP 212 EXTRACTING ELEMENT VALUES To insert a node’s value into the result document, the syntax is: <xsl:value-of select="expression" /> where expression is an expression that identifies the node from the source document’s node tree. If the node contains child elements in addition to text content, the text in those child nodes appears as well.
XP 213 INSERTING A NODE VALUE EXAMPLE
XP 214 PROCESSING SEVERAL ELEMENTS To process a batch of nodes, the syntax is: <xsl:for-each select="expression"> styles </xsl:for-each> where expression is an expression that defines the group of nodes to which the XSLT and literal result elements are applied.
XP 215 PROCESSING SEVERAL ELEMENTS
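As a sketch of batch processing combined with value extraction, the style sheet below loops over every stock element under portfolio and writes one list item per stock. The name child element and the symbol attribute are assumptions about the source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <ul>
      <!-- Each stock element becomes the context node in turn -->
      <xsl:for-each select="portfolio/stock">
        <li>
          <!-- Paths here are relative to the current stock element -->
          <xsl:value-of select="name" />
          (<xsl:value-of select="@symbol" />)
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```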
XP 216 WORKING WITH TEMPLATES To apply a template in the result document, use the XSLT element <xsl:apply-templates select="expression" /> where expression indicates the node template to be applied.
XP 217 CREATING TEMPLATE EXAMPLE
XP 218 USING THE BUILT-IN TEMPLATES Each type of node has its own built-in template. The built-in template for element nodes matches the document root and all elements in the node tree. The built-in template for text nodes matches all text nodes and causes their values to appear in the result document. For example, you can add the stock template to the style sheet.
XP 219 CREATING THE STOCK TEMPLATE EXAMPLE
XP 220 SORTING NODE SETS By default, nodes are processed in document order, that is, by their appearance in the document. To specify a different order, XSLT provides the <xsl:sort> element. This element can be used with either the <xsl:apply-templates> or the <xsl:for-each> element.
XP 221 SORTING NODE SETS The <xsl:sort> element contains several attributes to control how the XSLT processor sorts the nodes in the source document: – The select attribute determines the criteria under which the context node is sorted – The data-type attribute indicates the type of data – The order attribute indicates the direction of the sorting (ascending or descending)
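A minimal sketch of sorting with applied templates, assuming each stock element has a hypothetical name child:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Apply the stock template to each stock, sorted by name, A to Z -->
    <xsl:apply-templates select="portfolio/stock">
      <xsl:sort select="name" data-type="text" order="ascending" />
    </xsl:apply-templates>
  </xsl:template>

  <!-- Template applied to each stock element -->
  <xsl:template match="stock">
    <p><xsl:value-of select="name" /></p>
  </xsl:template>

</xsl:stylesheet>
```

For numeric values, setting data-type to number makes 9 sort before 10; with text sorting, 10 would come first.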
XP 222 CREATING CONDITIONAL NODES XSLT supports two kinds of conditional elements: – To apply a format only if a particular condition is met, use the <xsl:if> element – To test for multiple conditions and display different outcomes, use the <xsl:choose> element
XP 223 CREATING CONDITIONAL NODES EXAMPLE
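In place of the missing figure, the following sketch shows both conditional elements in one template. It assumes each stock element has hypothetical name, current, and open children holding numeric values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="stock">
    <p>
      <xsl:value-of select="name" />
      <!-- xsl:if: the note is written only when the test is true -->
      <xsl:if test="current &gt; open"> (above today's open)</xsl:if>
    </p>

    <!-- xsl:choose: the first xsl:when whose test is true is used -->
    <xsl:choose>
      <xsl:when test="current &gt; open"><p>up</p></xsl:when>
      <xsl:when test="current &lt; open"><p>down</p></xsl:when>
      <xsl:otherwise><p>unchanged</p></xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The less-than sign must be written as `&lt;` inside the test attribute, since a literal `<` is not allowed in an XML attribute value; `&gt;` is used here for symmetry.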
XP 224 USING COMPARISON OPERATORS AND FUNCTIONS
XP 225 WORKING WITH PREDICATES Predicates are XPath expressions that test for a condition and create subsets of nodes that fulfill that condition. The predicate can also indicate the position of the node in the node tree. To select a specific position in the source document, use the position() function combined with any XPath expression.
XP 226 ADDING PREDICATES TO THE ROOT TEMPLATE EXAMPLE
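Since the figure is missing, here is a rough sketch of predicates in a root template. The name child and the category attribute, along with its value tech, are assumptions about the source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <body>
        <!-- Position predicate: only the first stock element -->
        <h1><xsl:value-of select="portfolio/stock[position() = 1]/name" /></h1>
        <ul>
          <!-- Condition predicate: only stocks whose category is tech -->
          <xsl:for-each select="portfolio/stock[@category = 'tech']">
            <li><xsl:value-of select="name" /></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

The expression `stock[position() = 1]` can also be abbreviated as `stock[1]`.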
XP 227 CREATING ELEMENTS AND ATTRIBUTES To create an element, XSLT uses the <xsl:element> tag. The name attribute assigns a name to the element. The namespace attribute provides a namespace. The use-attribute-sets attribute provides a list of attribute sets.
XP 228 CREATING AN ELEMENT To create the element in the result document, use the <xsl:element> tag.
XP 229 CREATING AN ATTRIBUTE Attributes are created in XSLT by using the <xsl:attribute> element. The name attribute specifies the name of the attribute. The namespace attribute indicates the namespace. You can create inline images in the result document by using the attribute tag.
XP 230 CREATING AN ATTRIBUTE To add the href attribute to a tag, use the <xsl:attribute> element.
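A minimal sketch of building an element and its attribute in the result document. It assumes the generated element is an HTML link (a) and that each stock element has hypothetical link and name children.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="stock">
    <!-- Creates an a element in the result document -->
    <xsl:element name="a">
      <!-- Adds href, taking its value from the assumed link child -->
      <xsl:attribute name="href">
        <xsl:value-of select="link" />
      </xsl:attribute>
      <!-- Link text -->
      <xsl:value-of select="name" />
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

The xsl:attribute element must come before any child content is added to the element it belongs to.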
XP 231 CREATING COMMENTS AND PROCESSING INSTRUCTIONS The <xsl:comment> element creates a comment. You can create a processing instruction by using the <xsl:processing-instruction> element. If you want to add a processing instruction to attach the result document to the style.css sheet, use the following code:
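The original code was not preserved; one way this might be written is sketched below. The comment text and the use of xsl:copy-of to carry over the source content are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" />

  <xsl:template match="/">
    <!-- Writes a comment into the result document -->
    <xsl:comment> Generated from the source document </xsl:comment>

    <!-- Writes a processing instruction named xml-stylesheet -->
    <xsl:processing-instruction name="xml-stylesheet">href="style.css" type="text/css"</xsl:processing-instruction>

    <!-- Copies the source document's root element unchanged -->
    <xsl:copy-of select="*" />
  </xsl:template>

</xsl:stylesheet>
```

The processing instruction written to the result document would take the form `<?xml-stylesheet href="style.css" type="text/css"?>`.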
XP 232 SUMMARY The Extensible Stylesheet Language, or XSL, is composed of three parts: XSL-FO, XSLT, and XPath. The XPath language is used to reference a node. Templates are used to format sections of the XML document and transform XML data into a variety of formats.
XP 233 SUMMARY Nodes can be sorted in either alphabetical or numerical order. Conditional elements allow changing the contents of the result document based on the values of the nodes in the source document. Predicates are used to create subsets of the source document’s node tree. You can insert new elements and attributes in the transformed document.