Copyright © 2003 Pearson Education, Inc. Slide 1-1 Created by Cheryl M. Hughes, Harvard University Extension School — Cambridge, MA The Web Wizard’s Guide to XML by Cheryl M. Hughes
Slide 1-2 CHAPTER 1 An Overview of XML
Slide 1-3 What is XML?
- XML stands for “Extensible Markup Language”
- XML is a “metalanguage” that can be used to create markup languages
- XML languages can be created to describe specific data
- XML is an open standard, meaning that it is not tied to any specific technologies
- XML files can be created and edited with a text editor
Slide 1-4 Markup Language Fundamentals
- A “markup language” is a set of rules that define the structure of a document
- Programs, or applications, are used to interpret documents containing markup
- Some applications produce documents that can only be interpreted by that application; this is known as a “proprietary” format
- XML documents are “portable” because they can be interpreted by many different applications
Slide 1-5 The Beginning: SGML
- SGML stands for “Standard Generalized Markup Language”
- SGML grew out of markup work begun in the 1960s and was published as an ISO standard in 1986
- SGML provides a framework for creating other markup languages
- HTML was defined as an SGML application, and XML is a simplified subset of SGML
- SGML is used mainly for very large documentation projects
Slide 1-6 HTML
- HTML was developed in the early 1990s as a lightweight language for exchanging information over the World Wide Web
- HTML is an open standard, meaning that it is free to use and not tied to any particular technologies
- HTML documents, like XML documents, are plain text documents and can be created using a text editor
- HTML is limited in its scope and cannot be extended by document authors
Slide 1-7 HTML Document Example
Job Posting: Webmaster
JOB POSTING
Job Title: Webmaster
Job Description: We are looking for a Webmaster to oversee the management of our company website. The Webmaster will be responsible for working with other staff members to collect information for the website, and for creating and maintaining the web pages.
Skills needed: Basic writing skills, good communication skills, HTML
Slide 1-8 HTML Document Example
Slide 1-9 The Need for XML
- XML was developed partly because of the limitations of HTML
- The W3C (World Wide Web Consortium) released the official XML version 1.0 specification in 1998
- XML quickly gained popularity in the Web community
- XML itself is NOT a language, but rather a set of tools that can be used to create markup languages
Slide 1-10 Benefits of XML
XML:
- Allows data to be self-describing
- Allows an author to create rules for the content an element can contain
- Lets languages be developed for industry-specific or company-specific needs
- Uses elements that describe the data, not the format
- Provides extensive linking functionality
- Can be used to interchange data between two proprietary formats
- Can be used to define standard syntax for many different languages
- Contains robust searching capabilities
Slide 1-11 Data vs. Presentation
- XML elements describe data properties
- HTML elements describe formatting properties
- XML elements can be formatted by using “style sheets”
- A style sheet is a set of instructions that describes how to format a document
- Many style sheets can be created to provide different presentations of a single document (e.g., print vs. web page)
- A single style sheet can be used to provide formatting instructions for many XML documents
Slide 1-12 Differences Between XML and HTML
- XML is not dependent on a single document type
- XML allows an author to create elements that best fit the data
- XML separates data from presentation
- XML is strict about syntax
- XML tags are case-sensitive
- XML documents can be used with many different clients, not just web browsers
- XML documents require style sheets for their formatting information
Slide 1-13 XHTML: The Best of Both Worlds
- XHTML stands for “Extensible Hypertext Markup Language”
- XHTML is a language that is meant to merge HTML and XML
- XHTML contains the HTML element set, but adheres to XML’s syntax rules
- XHTML is extensible
- XHTML is accepted by many browsers
Slide 1-14 XML Document Example
Job Title: Webmaster
We are looking for a Webmaster to oversee the management of our company website. The Webmaster will be responsible for working with other staff members to collect information for the website, and for creating and maintaining the web pages
Basic writing skills
good communication skills
HTML
Slide 1-15 XML Document Example
Slide 1-16 XML Element Structure
Slide 1-17 CHAPTER 2 A Closer Look at XML Documents
Slide 1-18 XML Syntax
- “Syntax” refers to the rules of a language
- Syntax is needed with any language so that the documents created with that language are consistent
- Programs that process documents expect the syntax rules to be followed; otherwise the document may not be interpreted correctly
Slide 1-19 Components of an XML Document
- XML Declaration
- Elements
- Attributes
- Entities
- Comments
Slide 1-20 Components: The XML Declaration
- The XML Declaration tells the processing program that the document is an XML document, along with other optional information
- When present, the declaration must be the first line of an XML document
- Attributes that can be used in the Declaration: version, encoding, standalone
- Example:
Slide 1-21 Components: XML Elements
- Elements are used to describe the data, and consist of:
  - A start tag
  - Content
  - An end tag
- Example: Content
- The “root” element of a document is the outermost element, and contains all of the other elements in the document. There can be only one root element in a single document
- An element that does not contain any content is known as an “empty element”
Slide 1-22 Element Nesting
- The term “nesting” refers to the process of containing elements within other elements
- Terminology:
  - Child elements: elements that are contained within other elements
  - Parent elements: elements that contain other elements
  - Sibling elements: elements that share the same parent element
Slide 1-23 Nesting Example
Sally
Joe

Larry
Curly
Mo
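The parent, child, and sibling terminology can be sketched with Python's standard-library ElementTree. The element names (`people`, `managers`, `staff`, `name`) are hypothetical, chosen only for illustration; only the five text values come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; only the text values appear in the slide.
DOC = """<people>
  <managers>
    <name>Sally</name>
    <name>Joe</name>
  </managers>
  <staff>
    <name>Larry</name>
    <name>Curly</name>
    <name>Mo</name>
  </staff>
</people>"""

root = ET.fromstring(DOC)          # <people> is the root (outermost) element


def child_tags(parent):
    """Return the tag names of the direct children of an element."""
    return [child.tag for child in parent]


def child_texts(parent):
    """Return the text content of the direct children of an element."""
    return [child.text for child in parent]


# <managers> and <staff> are children of <people> and siblings of each other.
managers, staff = list(root)
print(child_tags(root))
print(child_texts(managers))
print(child_texts(staff))
```

Iterating over an element yields only its direct children, which is why the `name` elements do not show up when listing the children of the root.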
Slide 1-24 Components: XML Attributes
- Attributes help to describe XML elements
- Attributes are always contained in the start tag of the element they are describing
- Attributes are known as “name-value pairs”
- Example: address="123 Main Street"
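As a small sketch of how a program sees name-value pairs, the snippet below reads the slide's `address` attribute with ElementTree. The `customer` element name and its text are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# "customer" and its text are hypothetical; the attribute is the slide's example.
element = ET.fromstring('<customer address="123 Main Street">Jane</customer>')

address = element.get("address")   # look up one attribute by name
pairs = dict(element.attrib)       # all attributes as name-value pairs
print(address)
print(pairs)
```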
Slide 1-25 Components: XML Entities
- Two types of entities:
  - General: placeholders for information contained in the XML document
  - Parameter: used within a DTD to reference a grouping of elements
- Three types of general entities:
  - Character: used in place of special characters
  - Content: used for blocks of frequently used text
  - Unparsed: used for binary or non-text data, like image files
Slide 1-26 Examples of Entities
- Character entity:
  - Character: >
  - Entity reference: &gt; or &#62;
  - Usage: x &gt; y
- Content entity:
  - Declaration:
  - Usage: &address;
- Unparsed entity:
  - Declaration:
  - Usage: the entity name (aimage) is given as the value of an attribute declared with type ENTITY; an unparsed entity cannot be referenced as &aimage; in element content
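The character entity example can be checked with a short sketch: a parser replaces both the named reference and the numeric reference with the same character. The `expr` element name is hypothetical.

```python
import xml.etree.ElementTree as ET

# "expr" is a hypothetical element name used only to hold the text.
named = ET.fromstring("<expr>x &gt; y</expr>").text      # named reference
numeric = ET.fromstring("<expr>x &#62; y</expr>").text   # numeric reference

print(named)
print(numeric)
```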
Slide 1-27 Components: Comments
- An XML comment is ignored by applications that process XML
- Comments are commonly used for documentation, or to add information for others viewing the document
- The content of the comment is surrounded by the comment delimiters <!-- and -->
- Example:
Slide 1-28 Well-Formed XML Documents
A “well-formed” document is one which adheres to the syntax rules for XML:
- An XML document contains one root element
- All elements must have start and end tags, except for empty elements
- Elements must be properly nested
- All attributes must have a value
- Attributes can only appear in the start tag and must be unique to that element
- Element names are case-sensitive
- Special characters must be written as entities
- Element names can start only with letters or an underscore, and can contain letters, numbers, hyphens, periods and underscores
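Several of these rules can be observed with a minimal sketch that asks a non-validating parser whether a document is well-formed. The helper name `is_well_formed` and the sample documents are assumptions made for the example; ElementTree raises `ParseError` when a rule is broken.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample documents, one per rule being illustrated.
samples = {
    "properly nested": "<a><b/></a>",
    "improperly nested": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
    "attribute without quotes": "<a x=1/>",
    "duplicate attribute": '<a x="1" x="2"/>',
    "case mismatch in tags": "<A></a>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```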
Slide 1-29 XML Parsers
- A “parser” is a program that checks the syntax of an XML document to ensure that the document is well-formed
- Two types of parsers:
  - Non-validating: only checks for syntax
  - Validating: checks syntax and verifies the document against a DTD or Schema
Slide 1-30 CHAPTER 3 Describing XML Documents with DTDs and XML Schemas
Slide 1-31 XML Document Model
- A “document model” is used to enforce structure within a document
- Two types of document models for XML:
  - DTD (Document Type Definition)
  - XML Schema
- Document models are not required in XML
Slide 1-32 Validating Parsers
- A validating parser will check an XML document’s structure against a DTD or XML Schema
- Documents that conform to a document model are “valid”
- Validating parsers will report an error if the document does not conform to its document model, even if it is well-formed
Slide 1-33 Document Type Definition (DTD)
- XML elements and attributes are defined in a DTD
- DTDs are “extensible”, meaning that they can be extended to meet the needs of the task at hand
- Two types of DTDs:
  - Internal: DTD exists as part of the document
  - External: DTD is an external file
- Many free public DTDs exist today, and can be downloaded from the Internet
Slide 1-34 DTD Declarations
- Element declaration:
- Attribute declaration:
  <!ATTLIST element_name
      attribute_name-1 datatype default_value
      attribute_name-2 datatype default_value
      attribute_name-3 datatype default_value>
Slide 1-35 Elements: Content Model Types
- Text:
  - Description: text or character data
  - Syntax: (#PCDATA)
- Elements:
  - Description: contains other elements
  - Syntax: (element_1, element_2, …)
- Mixed Content:
  - Description: contains both text and other elements
  - Syntax: (#PCDATA | element_1 | element_2 | …)*
- Empty:
  - Description: does not contain any content
  - Syntax: EMPTY
- Any:
  - Description: can contain text or elements
  - Syntax: ANY
Slide 1-36 Elements: Character Notations
- Question Mark:
  - Character: ?
  - Description: element may occur zero or one time
  - Usage: ?
- Asterisk:
  - Character: *
  - Description: element may occur zero or more times
  - Usage: *
- Plus:
  - Character: +
  - Description: element may occur one or many times
  - Usage: +
Slide 1-37 Elements: Character Notations (cont.)
- Parentheses:
  - Character: ( )
  - Description: used to indicate a set
  - Usage: (name, address, zip_code)
- Vertical bar:
  - Character: |
  - Description: used to indicate a choice among a set of values
  - Usage: a | b | c
- Comma:
  - Character: ,
  - Description: used to indicate element sequence
  - Usage: (a, b, c)
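The occurrence and grouping characters behave much like regular-expression operators. As a rough illustration (not how a real validating parser is implemented), the sketch below translates an element-only content model into a Python regular expression and tests a list of child element names against it. The function names are hypothetical, and `#PCDATA`, `EMPTY`, and `ANY` are deliberately not handled.

```python
import re

# An XML-style name: starts with a letter or underscore.
NAME = re.compile(r"[A-Za-z_][\w.\-]*")


def model_to_regex(content_model):
    """Translate an element-only DTD content model into a regex string.

    Each element name becomes a group matching "name;". Commas (sequence)
    become plain concatenation; ?, *, +, | and parentheses are kept as-is.
    """
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + ";)", content_model)
    return re.sub(r"[,\s]", "", pattern)


def children_match(content_model, child_names):
    """Check a list of child element names against a content model."""
    sequence = "".join(name + ";" for name in child_names)
    return re.fullmatch(model_to_regex(content_model), sequence) is not None


print(children_match("(name, address, zip_code?)", ["name", "address"]))
print(children_match("(name, address, zip_code?)", ["address", "name"]))
print(children_match("(a | b | c)", ["b"]))
```

The `;` terminator after each name keeps one element name from matching as a prefix of another.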
Slide 1-38 Attributes: Default Values
- Attribute type: #FIXED
  - Description: value of the attribute must match the value assigned in the DTD
- Attribute type: #REQUIRED
  - Description: element must contain the attribute to be valid
- Attribute type: #IMPLIED
  - Description: attribute is optional
Slide 1-39 Attributes: A Few Data Types
- Data type: CDATA
  - Description: character data
- Data type: ID
  - Description: unique identifier to give an element a label
- Data type: Enumerated List (e.g., (a | b | c))
  - Description: list of all possible values that the attribute can contain
Slide 1-40 DTD Example: XML File
    <message num="a1" to="joe@acmeshipping.com"
             from="brenda@xyzcompany.com" date="02/09/01">
    Joe,
    Please let me know if order number has shipped.
    Thanks,
    Brenda
Slide 1-41 DTD Example: Internal DTD
    <!DOCTYPE s [
    <!ATTLIST message
        num   ID     #REQUIRED
        to    CDATA  #REQUIRED
        from  CDATA  #FIXED "brenda@xyzcompany.com"
        date  CDATA  #REQUIRED>
    <!ATTLIST subject
        title CDATA  #IMPLIED>
    <!ATTLIST reply
        status (yes | no) "no">
    ]>
Slide 1-42 DTD Example: Document with DTD
    <!DOCTYPE s [
    <!ATTLIST message
        num   ID     #REQUIRED
        to    CDATA  #REQUIRED
        from  CDATA  #FIXED "brenda@xyzcompany.com"
        date  CDATA  #REQUIRED>
    <!ATTLIST subject
        title CDATA  #IMPLIED>
    <!ATTLIST reply
        status (yes | no) "no">
    ]>
Slide 1-43 DTD Example: Document with DTD (cont.)
    <message num="a1" to="joe@acmeshipping.com"
             from="brenda@xyzcompany.com" date="02/09/01">
    Joe,
    Please let me know if order number has shipped.
    Thanks,
    Brenda
Slide 1-44 External DTD
- External DTDs are defined in files that are external to the XML document
- XML Declaration for external DTD:
- DTD Declaration for external DTD:
Slide 1-45 XML Schema Overview
- The XML Schema specification was released by the W3C in May 2001, and contains two parts:
  - Part I: structure
  - Part II: data types
- Developed as a more powerful alternative to DTDs
- Features:
  - Pattern matching
  - Rich set of data types
  - Attribute grouping
  - Supports XML namespaces
  - Follows XML syntax
Slide 1-46 CHAPTER 4 All About Style: XML Presentation
Slide 1-47 XML Presentations
- XML provides two methods for formatting:
  - Cascading Style Sheets (CSS)
  - Extensible Stylesheet Language (XSL)
- Benefits of separating style from content:
  - Allows authors to create elements that describe the data, not the format
  - Allows for multiple presentation layouts for a single document
  - Allows a single style document to format many XML documents
Slide 1-48 Cascading Style Sheets (CSS)
- CSS was introduced as a recommendation by the W3C in 1996
- CSS is widely accepted by web browsers
- CSS files are plain text files and can be edited with a text editor
- CSS style sheets work with XML and HTML files
Slide 1-49 CSS Syntax
- CSS Declaration:
- CSS rules consist of two parts:
  - Element selector
  - Property declarations
- CSS rule example: address { font-size: 12pt; font-family: arial }
- CSS comments: /* This is a comment */
Slide 1-50 CSS Properties
Major property categories:
- Font properties
- Text properties
- Color properties
- Border properties
- Display properties
Slide 1-51 CSS Example – XML Document
joe@acmeshipping.com
brenda@xyzcompany.com
02/12/01
Order
Joe,
Please let me know if order number has shipped.
Thanks,
Brenda
Slide 1-52 CSS Example – CSS File
    to, from {
        font-weight: bold;
        text-align: left;
        border-style: solid
    }
    date_sent {
        font-style: italic;
        color: blue
    }
    subject {
        text-decoration: underline;
        background-color: green;
        color: yellow
    }
    body {
        margin-top: 10;
        display: block
    }
Slide 1-53 CSS Example – Viewing in Browser
Slide 1-54 Overview of XSL
- The XSL specification was released by the W3C in October 2001
- XSL is a style sheet language developed specifically for XML
- XSL provides more powerful formatting features than CSS
- XSL behaves more like a programming language, making it very flexible
Slide 1-55 CHAPTER 5 Namespaces in XML
Slide 1-56 Overview of Namespaces
- A namespace is a group, or set, of element and attribute names that belong to or describe a document type
- Each name in a namespace must be unique within that namespace
- A naming collision occurs when an element name has two different meanings within a document
- XML namespaces are used to avoid naming collisions and to assign elements to different groupings within a document
Slide 1-57 Naming Collision Example
Introduction to XML
Jane Smith
Monday, 5:30-7:30
This course covers the basics of XML…
The Web Wizard’s Guide to XML
C Hughes
Addison Wesley
Slide 1-58 Namespace Syntax
- Namespaces must be declared before they are used
- Namespaces are declared on elements in the start tag
- Two types of namespace declarations:
  - Default: xmlns=“URI” xmlns=“
  - Prefixed: xmlns:prefix=“URI” xmlns:xlink=“
- Names that carry a namespace prefix are referred to as “qualified names”
- Both default and prefixed declarations can exist within a single document
Slide 1-59 Uniform Resource Identifier (URI)
- URIs can take two forms:
  - Uniform Resource Locator (URL) Example:
  - Uniform Resource Name (URN) Example: urn:local.gov:book
- Either form of URI can be used with both default and prefixed namespaces
Slide 1-60 Scope of a Namespace
- The “scope” of a namespace determines which elements can belong to a namespace
- Namespace scope cannot ascend above the element it is declared on within the document hierarchy
- Further namespaces can be declared on sibling and child elements of the element on which a namespace is declared
- When a namespace is used on elements outside of the proper hierarchy, an “out-of-scope” error occurs
Slide 1-61 Default Namespace Example
ACME Company Newsletter
Company Employees Participate in 5K Run
J. Fraser
July,
All elements in this document belong to the namespace
Slide 1-62 Prefixed Namespace Example
ACME Company Newsletter
Company Employees Participate in 5K Run
J. Frasier
July,
Only the and elements belong to the namespace because they include the prefix “flag”
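The difference between the two declaration styles can be sketched with ElementTree, which reports each element name as `{URI}local-name` when the element belongs to a namespace. The URIs and element names below are hypothetical (the slides' own were not preserved); only the `flag` prefix and the text values come from the examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs and element names, for illustration only.
DEFAULT_DOC = (
    '<newsletter xmlns="http://example.com/newsletter">'
    "<title>ACME Company Newsletter</title>"
    "</newsletter>"
)
PREFIXED_DOC = (
    '<newsletter xmlns:flag="http://example.com/flag">'
    "<flag:title>ACME Company Newsletter</flag:title>"
    "<author>J. Fraser</author>"
    "</newsletter>"
)


def tags(xml_text):
    """Return every element name in document order, namespace included."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


print(tags(DEFAULT_DOC))    # every element carries the default namespace
print(tags(PREFIXED_DOC))   # only the prefixed element carries a namespace
```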
Slide 1-63 CHAPTER 6 Links in XML
Slide 1-64 Overview of Hyperlinks
- A “hyperlink” in a web page is an object that a user can click on that will redirect the browser to another web page, file or position within the page
- Hyperlinks in HTML make the web interactive
- Links in XML are similar in syntax to links in the HTML language
- XML Linking Language (XLink) is the XML specification for linking
- The resource being linked to is called the “target resource”
Slide 1-65 Linking in HTML
- The anchor element <a>
  - Requires the user to take an action, usually by clicking on the link, which can consist of images or text
  - The target of the link can be an absolute or relative URL
- The <img> element
  - Does not require user intervention; the resource loads automatically when the page loads
  - Usually used for graphics in HTML
Slide 1-66 HTML Link Example: HTML file
Link Examples in HTML
Here are some examples of links in HTML:
This is an absolute link to a new page
This is a relative link to a new page
<img src="button.gif" alt="This image is a clickable button">
This is a link that launches an email message
Slide 1-67 HTML Link Example: Browser
Slide 1-68 HTML Link Limitations
- Can only point to one target resource
- Links are unidirectional: once the link is followed, there is no path back to the original document
- Only certain HTML elements can be used for providing linking functionality
Slide 1-69 XLink Overview
- The XLink specification was released by the W3C in June 2001
- Benefits over HTML links:
  - Supports multi-directional links, which allows the target resource to link back to the originating document
  - Can contain multiple destinations
  - Any XML element can be a linking element
  - XML link behavior can be programmed
- The XLink specification defines two types of links:
  - Simple links
  - Extended links
Slide 1-70 Simple XLink Links
- Syntax is similar to HTML links
- Simple links:
  - Are unidirectional
  - Can only link to one target resource
  - Can be defined on any XML element
- Defined as a namespace:
- Required attributes: href and type
Slide 1-71 Simple Links: Attributes
- type: Determines the type of link; for simple links, the value is always “simple”
- href: Defines the URL of the target resource
- show: Defines the behavior of the link after it is activated
- actuate: Defines when the link will be activated
- role: Describes the resource being linked to
- title: Used to describe the link
- arcrole: Describes the relationship between the source and target documents
Slide 1-72 Simple XLink Example
    <map
        xmlns:xlink="
        xlink:type="simple"
        xlink:href="mapimage.gif"
        xlink:actuate="onRequest"
        xlink:show="replace"
        xlink:role="image"
        xlink:title="A map image">
    Link to Map image
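As a sketch of how a program might read the link attributes from the example above, the snippet below collects every attribute in the XLink namespace. It assumes the W3C XLink namespace URI `http://www.w3.org/1999/xlink` (the URI in the slide itself was not preserved), and the helper name `xlink_attributes` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the standard W3C XLink namespace URI.
XLINK_NS = "http://www.w3.org/1999/xlink"

DOC = (
    f'<map xmlns:xlink="{XLINK_NS}" '
    'xlink:type="simple" xlink:href="mapimage.gif" '
    'xlink:actuate="onRequest" xlink:show="replace" '
    'xlink:role="image" xlink:title="A map image">'
    "Link to Map image</map>"
)


def xlink_attributes(element):
    """Return the element's XLink attributes keyed by local name."""
    prefix = "{" + XLINK_NS + "}"
    return {
        name[len(prefix):]: value
        for name, value in element.attrib.items()
        if name.startswith(prefix)
    }


link = xlink_attributes(ET.fromstring(DOC))
print(link["type"], link["href"], link["show"])
```

Note that the `xmlns:xlink` declaration itself is not reported as an attribute; the parser consumes it to resolve the prefix.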
Slide 1-73 Simple XLink Example with DTD
    <!DOCTYPE map [
    <!ATTLIST map
        xmlns:xlink CDATA #FIXED "
        xlink:type  CDATA #FIXED "simple"
        xlink:href  CDATA #REQUIRED>
    ]>
    Link to Map image
Attributes that are defined as “#FIXED” in the DTD do not have to be included in the XLink element
Slide 1-74 XLink Extended Links
- Extended links provide much greater functionality than simple links
- Extended links:
  - Can link to multiple target resources
  - Are multi-directional
- Extended links can be any of the following types: extended, resource, locator, arc or title
Slide 1-75 Extended XLink Example
    <courses xmlns:xlink="
        xlink:type="extended">
    <locator xlink:type="locator"
        xlink:href="courses/xml101.xml"
        xlink:title="XML 101"/>
    <locator xlink:type="locator"
        xlink:href="courses/advxml.xml"
        xlink:title="Advanced XML"/>
    <locator xlink:type="locator"
        xlink:href="courses/bw.xml"
        xlink:title="Basket Weaving"/>
    Link to Course
Slide 1-76 Extended XLink Example with DTD
    <!DOCTYPE courses [
    <!ATTLIST courses
        xmlns:xlink CDATA #FIXED "
        xlink:type  CDATA #FIXED "extended">
    <!ATTLIST locator
        xlink:href  CDATA #REQUIRED
        xlink:title CDATA #IMPLIED>
    ]>
    <courses xmlns:xlink="
        xlink:type="extended">
    <locator xlink:type="locator"
        xlink:href="courses/xml101.xml"
        xlink:title="XML 101"/>
    <locator xlink:type="locator"
        xlink:href="courses/advxml.xml"
        xlink:title="Advanced XML"/>
    <locator xlink:type="locator"
        xlink:href="courses/bw.xml"
        xlink:title="Basket Weaving"/>
Slide 1-77 CHAPTER 7 New XML Technologies: XSL Style Sheets and XML Schemas
Slide 1-78 XSL Style Sheets
- XSL stands for “Extensible Stylesheet Language”
- XSL, like CSS, is a language for defining format and presentation of XML documents
- The XSL specification was released by the W3C in October of 2001
- The XSL specification contains two parts:
  - XSL Formatting Objects (XSL-FO): an XML vocabulary for specifying formatting semantics
  - XSL Transformations (XSLT): a language for transforming XML documents
Slide 1-79 XSL Formatting Objects (XSL-FO)
- XSL-FO is similar to CSS in that it defines formatting and presentation properties
- The namespace for XSL-FO is:
- XSL-FO example:
    <fo:block
        font-size="12pt"
        color="red"
        text-align="left">
    Block of text…
Slide 1-80 XSL Transformations (XSLT)
- XSLT is used to transform an XML document into another document format (e.g., HTML or PDF)
- The namespace for XSLT is:
- XSLT is currently the most widely used part of XSL
Copyright © 2003 Pearson Education, Inc. Slide 1-81 XSLT Example: XML File Chevy 5 Camaro 6 Blue ,000 9 $18, Ford 13 Mustang 14 Chrome , $30,000
Copyright © 2003 Pearson Education, Inc. Slide 1-82 XSLT Example: XML File (cont.) Jaguar 21 Roadster 22 Red , $23, Porsche Black 32 8, $35,
Copyright © 2003 Pearson Education, Inc. Slide 1-83 XSLT Example: XSL Style Sheet 1 2<xsl:stylesheet version="1.0" xmlns:xsl=" Make and Model 9 Color 10 Year 11 Mileage 12 Price
Copyright © 2003 Pearson Education, Inc. Slide 1-84 XSLT Example: XSL Style Sheet (cont.)
Slide 1-85 XML Schemas
- The XML Schema specification was released by the W3C in May of 2001
- XML Schemas, like DTDs, are used to describe the structure of an XML document
- The XML Schema specification consists of two parts:
  - XML Schema: Structures. This specification consists of a definition language for describing and constraining the content of XML documents
  - XML Schema: Datatypes. This specification defines the datatypes to be used in XML schemas
- The namespace for XML Schema is:
Slide 1-86 XML Schema Datatypes
- The XML Schema specification contains a number of built-in datatypes, and also allows developers to create their own datatypes
- Some of the built-in datatypes:
  - Integer
  - String
  - Date
  - Time
Slide 1-87 XML Schema Occurrence Constraints
- Occurrence constraints define the number of times a particular element can or must occur
- Attributes:
  - minOccurs: Defines the minimum number of times an element can occur. Default value is 1
  - maxOccurs: Defines the maximum number of times an element can occur. Default value is 1
- The value of the “maxOccurs” attribute can be set to “unbounded” to indicate that there is no maximum number of times the element can occur
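The rule described above can be written out as a small check. This is only a sketch of the constraint logic, not of how a schema processor is built; the function name `occurrence_ok` is hypothetical, while the defaults of 1 and the special value "unbounded" follow the slide.

```python
def occurrence_ok(count, min_occurs=1, max_occurs=1):
    """Check an element count against minOccurs / maxOccurs.

    Both constraints default to 1; max_occurs may be the string
    "unbounded" to indicate that there is no upper limit.
    """
    if count < int(min_occurs):
        return False
    if max_occurs == "unbounded":
        return True
    return count <= int(max_occurs)


print(occurrence_ok(1))                                       # defaults: exactly once
print(occurrence_ok(0))                                       # missing required element
print(occurrence_ok(5, min_occurs=1, max_occurs="unbounded"))
```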
Copyright © 2003 Pearson Education, Inc. Slide 1-88 XML Schema Simple Type Example XML file: 1 2 < 3 xmlns:xsi = " 4 xsi:noNamespaceSchemaLocation = " _schema.xsd"> 5This is my message 6 Schema file:
Copyright © 2003 Pearson Education, Inc. Slide 1-89 XML Schema Complex Type Example XML file: 1 2 <message 3 xmlns:xsi = " 4 xsi:noNamespaceSchemaLocation = "message_schema.xsd"> 5 Joe Poller 6 Brenda Lane 7 8 Order Joe, 11Please let me know if order number has shipped. 12Thanks, 13Brenda 14 15
Slide 1-90 XML Schema Complex Type Example
Schema file:
    <xsd:element name="to" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
Slide 1-91 CHAPTER 8 XML Programs and Programming
Slide 1-92 XML Programming
Some applications for processing XML:
- Parsers
- Document Object Model (DOM)
- The Simple Application Programming Interface for XML (SAX)
- Programming languages (e.g., Java)
- Databases
Slide 1-93 XML Parsers
- Two types of parsers:
  - Validating: checks document against a document model
  - Non-validating: only checks syntax
- Parsers are code libraries written in a programming language
- The XML parser that comes with Internet Explorer is called the MSXML parser
Slide 1-94 The Document Object Model (DOM)
- The first DOM specification was released by the W3C in October of 1998
- Three levels of the recommendation:
  - Level 1: provides the core document models
  - Level 2: includes Level 1 and adds a model for style sheets
  - Level 3: includes Level 1 and adds a model for content (DTD or schema)
- What DOM does: takes an XML document as input and creates an object structure in memory, which can then be accessed by programs
- DOM creates a tree-like structure, with branches and leaves to represent the hierarchy of the document
Slide 1-95 DOM Example – Microsoft’s XMLDOM
- XMLDOM is a COM component that is included with Internet Explorer 5.0
- The DOM creates an object in memory that can then be accessed by programs
- Some DOM object properties: childNodes, firstChild, lastChild
- Some DOM object methods: load(), createNode(), save()
Slide 1-96 DOM Example – XML file
Job Title: Web master
We are looking for a Web master to oversee the management of our company Web site. The Web master will be responsible for working with other staff members to collect information for the Web site, and for creating and maintaining the Web pages
Basic writing skills
good communication skills
HTML
Slide 1-97 DOM Example – HTML file with Javascript
Microsoft DOM Example Using Javascript
    /* This is a comment in JavaScript */
    /* Create a new DOM object for our document */
    var xmlDocument = new ActiveXObject("Microsoft.XMLDOM");
    /* Load synchronously so the document is ready before we read it */
    xmlDocument.async = false;
    /* Use the "load" method to read our XML document into memory */
    xmlDocument.load("job-posting.xml");
    /* Print some HTML code */
    document.write(" Using the Microsoft DOM ");
    /* BEGIN THE EXAMPLES */
    /* 1. Prints the value of the first child node */
    document.write(" 1. firstChild is: ");
    document.write(xmlDocument.documentElement.firstChild.text);
    document.write(" ");
Slide 1-98 DOM Example – HTML file with Javascript (cont.)
    /* 2. Prints the value of the last child node */
    document.write(" 2. lastChild is: ");
    document.write(xmlDocument.documentElement.lastChild.text);
    document.write(" ");
    /* 3. Prints the name of the lastChild element */
    document.write(" 3. lastChild node name is: ");
    document.write(xmlDocument.documentElement.lastChild.nodeName);
    document.write(" ");
    /* 4. Prints the name of the child node stored in item(2) */
    /* Then, it checks to see if the node has child elements */
    document.write(" 4. item(2) is: ");
    document.write(xmlDocument.documentElement.childNodes.item(2).nodeName);
    document.write(" Does this node have child elements? ");
    document.write(xmlDocument.documentElement.childNodes.item(2).hasChildNodes());
    document.write(" ");
    /* End the examples */
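The same four lookups can be tried outside Internet Explorer with a portable sketch using Python's standard-library `xml.dom.minidom`, which exposes the same DOM properties (`firstChild`, `lastChild`, `childNodes`, `nodeName`, `hasChildNodes()`). The element names and shortened text are hypothetical, since the markup of the job-posting file was not preserved; the document is written without whitespace between elements so that whitespace text nodes do not appear among the children.

```python
from xml.dom import minidom

# Hypothetical element names and abbreviated text, for illustration only.
DOC = (
    "<job>"
    "<title>Web master</title>"
    "<description>Oversee the company Web site</description>"
    "<skills><skill>HTML</skill></skills>"
    "</job>"
)

document = minidom.parseString(DOC)   # build the DOM tree in memory
root = document.documentElement

first = root.firstChild               # 1. first child of the root
last = root.lastChild                 # 2./3. last child of the root
third = root.childNodes.item(2)       # 4. child stored at index 2

print(first.nodeName, first.firstChild.data)
print(last.nodeName)
print(third.nodeName, third.hasChildNodes())
```

The `.text` property used in the JavaScript version is specific to Microsoft's implementation; in the standard DOM the text is reached through the element's text-node child, as `first.firstChild.data` does here.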
Slide 1-99 The Simple Application Programming Interface for XML (SAX)
- SAX 1.0 was released in May 1998; it was developed by members of the XML-DEV mailing list rather than by the W3C
- SAX is an event-based API that reads the document in a serial fashion
- SAX is typically much faster than DOM, and does not build a copy of the document in memory
Slide SAX Example
XML File:
The Web Wizard’s Guide to XML
C. Hughes
SAX processor events:
1 Start document
2 Start element (book)
3 Attribute (id="1234")
4 Start element (title)
5 Text (The Web Wizard’s Guide to XML)
6 End element (title)
7 Start element (author)
8 Text (C. Hughes)
9 End element (author)
10 End element (book)
11 End document
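The event list above can be reproduced with a short sketch using Python's standard-library `xml.sax`. The XML is reconstructed from the listed events (a `book` element with `id="1234"` containing `title` and `author`); the class name `EventLogger` and the helper `sax_events` are hypothetical. Text is buffered until the next tag because a SAX parser is allowed to deliver character data in several pieces.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Reconstructed from the event list; the original markup was not preserved.
BOOK = (
    b'<book id="1234">'
    b"<title>The Web Wizard's Guide to XML</title>"
    b"<author>C. Hughes</author>"
    b"</book>"
)


class EventLogger(ContentHandler):
    """Record each SAX callback as a line of text."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush_text(self):
        # Emit buffered character data as a single Text event.
        text = "".join(self._text).strip()
        if text:
            self.events.append(f"Text ({text})")
        self._text = []

    def startDocument(self):
        self.events.append("Start document")

    def endDocument(self):
        self.events.append("End document")

    def startElement(self, name, attrs):
        self._flush_text()
        self.events.append(f"Start element ({name})")
        for key, value in attrs.items():
            self.events.append(f'Attribute ({key}="{value}")')

    def endElement(self, name):
        self._flush_text()
        self.events.append(f"End element ({name})")

    def characters(self, content):
        self._text.append(content)


def sax_events(xml_bytes):
    """Parse the bytes serially and return the recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


for event in sax_events(BOOK):
    print(event)
```

Nothing is kept after a callback returns unless the handler stores it, which is the trade-off against DOM described on the previous slide.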
Slide Comparing SAX and DOM
- When to use SAX:
  - Processing large documents
  - Searching
  - Stopping the program
- When to use DOM:
  - Accessing cross-referenced data
  - Modifying the XML document
  - Creating new XML documents
Slide XML and Programming Languages
- XML has become popular with object-oriented programming languages
- DOM creates objects that can be easily accessed
- Some popular languages that are embracing XML: Java, Perl, C++
Slide XML and Databases
- XML’s structure makes it a good technology for storing data
- XML can support relationships among pieces of data that relational databases cannot
- XML Query Language (XQuery) is being developed as a standardized query language that will span all types of XML documents and data sources
Slide Examples of XML Programs
- Distributed Authoring and Versioning on the World Wide Web (WebDAV)
- Wireless Application Protocol (WAP)
- Scalable Vector Graphics (SVG)
- Open Financial Exchange (OFX)
- Mathematical Markup Language (MathML)
- Chemical Markup Language (CML)
- Extensible Hypertext Markup Language (XHTML)
- Resource Description Framework (RDF)
Copyright © 2003 Pearson Education, Inc. Slide MathML Examples Equation: Equation: ab 1 2 x 3 ⁢ 4 b 5