1
Extensible Markup Language
XML Extensible Markup Language
2
What is XML? XML stands for EXtensible Markup Language
XML is a markup language much like HTML.
XML was designed to describe data.
XML tags are not predefined. You must define your own tags.
XML uses a Document Type Definition (DTD) or an XML Schema to describe the data.
XML with a DTD or XML Schema is designed to be self-descriptive.
3
XML Vs HTML XML was designed to carry data.
XML is not a replacement for HTML. XML and HTML were designed with different goals: XML was designed to describe data and to focus on what data is. HTML was designed to display data and to focus on how data looks. HTML is about displaying information, XML is about describing information.
4
XML was designed NOT to do anything
XML was created to structure, store, and send information. Example of a note to Tove from Jani, stored as XML:
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
The note has a header and a message body. It also has sender and receiver information. But still, this XML document does not do anything. It is just pure information wrapped in XML tags. Someone must write a piece of software to send, receive, or display it.
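As a rough illustration of the "piece of software" mentioned above, here is a minimal Python sketch that reads the note with the standard library's xml.etree.ElementTree module. The function name describe_note and the output format are assumptions made for this example, not part of the slides.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

def describe_note(xml_text):
    """Parse a note document and return a one-line summary (hypothetical format)."""
    note = ET.fromstring(xml_text)
    # findtext returns the text of the first child element with the given tag
    return "{} -> {}: {}".format(
        note.findtext("from"), note.findtext("to"), note.findtext("body")
    )

summary = describe_note(NOTE_XML)
print(summary)
```

The XML itself stays passive; all behaviour lives in the program that reads it.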
5
XML is free and extensible
XML tags are not predefined. You must invent your own tags. The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of an HTML document can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.). XML allows the author to define their own tags and their own document structure. The tags like <to> and <from> in the example above are not defined in any XML standard. These tags are invented by the author of the XML document.
6
XML is a complement to HTML
XML is not a replacement for HTML. In future Web development, it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data. XML is a cross-platform, software- and hardware-independent tool for transmitting information. XML is expected to be as important to the future of the Web as HTML has been to the foundation of the Web, and to be the most common tool for all data manipulation and data transmission.
7
XML was not designed to display data.
It is important to understand that XML was designed to store, carry, and exchange data.
8
XML Features
XML is used to exchange data: with XML, data can be exchanged between incompatible systems.
XML can be used to share data: with XML, plain text files can be used to share data.
XML can be used to store data: with XML, plain text files can be used to store data.
XML can make your data more useful: with XML, your data is available to more users.
XML can be used to create new languages: XML is the mother of WAP and WML. The Wireless Markup Language (WML), used to mark up Internet applications for handheld devices like mobile phones, is written in XML.
9
An example XML document
XML documents use a self-describing and simple syntax.
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
The first line in the document, <note>, describes the root element of the document (like: “this document is a note”).
10
An example XML document (Cont.)
The next 4 lines describe 4 child elements of the root (to, from, heading, and body):
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
And finally the last line defines the end of the root element: </note>. From the example, don’t you agree that XML is pretty self-descriptive?
11
Tags
All XML elements must have a closing tag. With XML, it is illegal to omit the closing tag. For example:
<p>this is a paragraph</p>
<p>this is another paragraph</p>
XML tags are case sensitive. Unlike HTML, XML tags are case sensitive. The tag <Letter> is different from the tag <letter>:
<Message>This is incorrect</message>
<Message>This is correct</Message>
12
Properly Nested
All XML elements must be properly nested:
<x><y>This is not properly nested</x></y>
<x><y>This is properly nested</y></x>
Improperly nested tags make no sense to XML.
All XML documents must have a root tag. The first tag in an XML document is the root tag. All XML documents must contain a single tag pair to define the root element. All other elements must be nested within the root element.
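The rules so far (closing tags, case sensitivity, proper nesting, a single root) are all well-formedness rules, so any conforming parser should reject documents that break them. A minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the "two roots" sample are hypothetical additions:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

samples = {
    "missing closing tag": "<p>this is a paragraph",
    "case mismatch": "<Message>This is incorrect</message>",
    "bad nesting": "<x><y>This is not properly nested</x></y>",
    "two roots": "<x></x><y></y>",
    "correct": "<x><y>This is properly nested</y></x>",
}

# Map each label to the parser's verdict
results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, "->", ok)
```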
13
Properly Nested (Cont.)
All elements can have sub elements (children). Sub elements must be correctly nested within their parent element: <root> <child> <subchild>......</subchild> </child> </root>
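Because correctly nested elements form a tree, a short recursive function is enough to walk it. The following sketch (the function name outline is an assumption for illustration) lists each element indented by its depth:

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return the tag names of element and its descendants, indented by depth."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an Element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring("<root><child><subchild>......</subchild></child></root>")
tree_lines = outline(root)
print("\n".join(tree_lines))
```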
14
Attribute values must always be quoted
With XML, it is illegal to omit quotation marks around attribute values. For example:
<note date=12/11/02>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
This is incorrect! The date attribute in the note element is not quoted.
15
Attribute values must always be quoted (Cont.)
<note date="12/11/02">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
This is correct: date="12/11/02"
This is incorrect: date=12/11/02
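A quick way to see the quoting rule in action is to ask a parser for the attribute. This is a small sketch using Python's standard library; the helper name read_date is hypothetical:

```python
import xml.etree.ElementTree as ET

QUOTED = '<note date="12/11/02"><to>Tove</to></note>'
UNQUOTED = '<note date=12/11/02><to>Tove</to></note>'

def read_date(xml_text):
    """Return the note's date attribute, or None if the document does not parse."""
    try:
        return ET.fromstring(xml_text).get("date")
    except ET.ParseError:
        # Unquoted attribute values make the document not well-formed
        return None

quoted_result = read_date(QUOTED)
unquoted_result = read_date(UNQUOTED)
print(quoted_result, unquoted_result)
```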
16
<!-- this is a comment -->
Comments in XML The syntax for writing comments in XML is similar to that of HTML: <!-- this is a comment -->
17
XML Elements are Extensible
XML documents can be extended to carry more information. E.g.
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
Imagine that later on the author decides to add some extra information to it:
18
XML Elements are Extensible (Cont.)
<note>
  <date> </date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
Should the application break or crash? No. The application should still be able to find the <to>, <from>, and <body> elements in the XML document and produce the same output. XML documents are extensible!
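The claim holds as long as the application looks elements up by name rather than by position. A minimal sketch of such an application, assuming Python's xml.etree.ElementTree (the function name summarize is invented for this example):

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
  <to>Tove</to><from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

EXTENDED = """<note>
  <date> </date>
  <to>Tove</to><from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

def summarize(xml_text):
    """Collect only the fields the application knows about, looked up by tag name."""
    note = ET.fromstring(xml_text)
    return {tag: note.findtext(tag) for tag in ("to", "from", "body")}

before = summarize(ORIGINAL)
after = summarize(EXTENDED)
print(before == after)
```

Code that instead read "the first child" or "the third child" would break when the date element is added, which is why name-based lookup is the safer design.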
19
XML Elements have Relationships
Elements are related as parents and children. To understand XML terminology, you have to know how relationships between XML elements are named, and how element content is described.
20
XML Elements have Relationships (Cont.)
For instance, this is the description of a book:
Book title: My First XML
Chapter 1: Introduction to XML
  What is HTML
  What is XML
Chapter 2: XML Syntax
  Elements must have a closing tag
  Elements must be properly nested
21
XML Elements have Relationships (Cont.)
Then, this is the XML document that describes the book:
<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>
22
Explanation
Book is the root element.
Title and chapter are child elements of book.
Book is the parent element of title and chapter.
Title and chapter are siblings (or sister elements) because they have the same parent.
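These relationships can be recovered programmatically. Python's ElementTree elements do not store a pointer to their parent, so the sketch below builds a child-to-parent map first; the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>"""

book = ET.fromstring(BOOK_XML)

# Map every element to its parent by visiting each element's direct children
parent_of = {child: parent for parent in book.iter() for child in parent}

title = book.find("title")
children_of_book = [child.tag for child in book]
siblings_of_title = [el.tag for el in parent_of[title] if el is not title]

print(book.tag, children_of_book, siblings_of_title)
```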
23
Elements have contents
Elements can have different content types. An XML element is everything from (and including) the element’s start tag to (and including) the element’s end tag. An element can have element content, mixed content, simple content, or empty content. An element can also have attributes. In the previous example, book has element content, because it contains other elements. Chapter has mixed content because it contains both text and other elements. Para has simple content (or text content) because it contains only text. Prod has empty content, because it carries no information.
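The four content types can be told apart by checking for child elements and for non-whitespace text. The following is a sketch under that assumption (whitespace used only for indentation is ignored); content_type is a hypothetical helper, and the book is shortened to one chapter:

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
</book>"""

def content_type(element):
    """Classify an element as 'element', 'mixed', 'simple' or 'empty' content."""
    has_children = len(element) > 0
    # Text may sit before the first child (.text) or after any child (.tail)
    has_text = bool((element.text or "").strip()) or any(
        (child.tail or "").strip() for child in element
    )
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"

book = ET.fromstring(BOOK_XML)
kinds = {el.tag: content_type(el) for el in book.iter()}
print(kinds)
```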
24
Elements have contents (Cont.)
In that example, only the prod element has attributes. The attribute named id has the value “33-657”. The attribute named media has the value “paper”.
25
Element Naming XML elements must follow these naming rules:
Names can contain letters, numbers, and other characters.
Names must not start with a number or punctuation character.
Names must not start with the letters xml (or XML or Xml …).
Any name can be used and no words are reserved, but the idea is to make names descriptive. Names with an underscore separator are nice. Examples: <first_name>, <last_name>.
26
XML Attributes XML Elements can have attributes.
Attributes are used to provide additional information about the elements. Attributes often provide information that is not a part of the data. For example:
<file type="gif">computer.gif</file>
In the example, the file type is irrelevant to the data, but important to the software that wants to manipulate the element.
27
Quote Styles Attribute values must always be enclosed in quotes, but either double or single quotes can be used. E.g.
<person sex="female"> or <person sex='female'>
Double quotes are the most common, but sometimes (if the attribute value itself contains quotes) it is necessary to use single quotes:
<gangster name='George "Shotgun" Ziegler'>
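Both quote styles produce the same attribute value once parsed. The sketch below, using Python's standard library, reads the single-quoted example and then serializes it again; the serializer is free to pick its own quoting, so only the parsed value is compared.

```python
import xml.etree.ElementTree as ET

# Single quotes delimit the attribute so the value may contain double quotes
gangster = ET.fromstring("<gangster name='George \"Shotgun\" Ziegler'/>")
name = gangster.get("name")

# Serialize and parse again: the value survives even if the quoting style changes
serialized = ET.tostring(gangster, encoding="unicode")
round_tripped = ET.fromstring(serialized).get("name")

print(name)
print(serialized)
```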
28
Elements Vs Attributes
E.g.1:
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
E.g.2:
<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
In the first example, sex is an attribute. In the last, sex is a child element. Both examples provide the same information. There are no rules about when to use attributes and when to use elements.
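From the consuming program's point of view, the two styles differ only in which accessor is used. A small sketch (get_sex is a hypothetical helper that accepts either form):

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = """<person sex="female">
  <firstname>Anna</firstname><lastname>Smith</lastname>
</person>"""

AS_ELEMENT = """<person>
  <sex>female</sex>
  <firstname>Anna</firstname><lastname>Smith</lastname>
</person>"""

def get_sex(xml_text):
    """Read 'sex' from an attribute if present, otherwise from a child element."""
    person = ET.fromstring(xml_text)
    return person.get("sex") or person.findtext("sex")

from_attribute = get_sex(AS_ATTRIBUTE)
from_element = get_sex(AS_ELEMENT)
print(from_attribute, from_element)
```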
29
Should we avoid using attributes?
Some problems with using attributes are:
- Attributes can’t contain multiple values (child elements can)
- Attributes are not easily extensible (for future changes)
- Attributes can’t describe structure (child elements can)
- Attributes are more difficult to manipulate by program code
- Attribute values are not easy to test against a DTD
Use attributes only to provide information that is not relevant to the data! Don’t end up like this:
<note day="12" month="8" year="02" to="Tove" from="Jani" heading="Reminder" body="Don’t forget me this weekend!"> </note>
30
Metadata
Metadata (or data about data) should be stored as attributes, and the data itself should be stored as elements. For example:
<messages>
  <note id="p501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don’t forget me this weekend!</body>
  </note>
  <note id="p502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not!</body>
  </note>
</messages>
31
Metadata (Cont.) The ID in the previous example is just a counter, or a unique identifier, to identify the different notes in the XML file, and not a part of the note data.
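An identifier stored as an attribute makes it easy to pick out one note. A minimal sketch using the attribute predicate supported by Python's ElementTree path syntax (the helper name body_of is hypothetical, and the notes are abbreviated):

```python
import xml.etree.ElementTree as ET

MESSAGES_XML = """<messages>
  <note id="p501">
    <to>Tove</to><from>Jani</from>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="p502">
    <to>Jani</to><from>Tove</from>
    <body>I will not!</body>
  </note>
</messages>"""

messages = ET.fromstring(MESSAGES_XML)

def body_of(note_id):
    """Return the body text of the note with the given id, or None if absent."""
    note = messages.find("note[@id='{}']".format(note_id))
    return None if note is None else note.findtext("body")

reply = body_of("p502")
missing = body_of("p999")
print(reply, missing)
```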
32
XML DTD A DTD defines the legal elements of an XML document.
The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements. A DTD can be declared inline in your XML document, or as an external reference.
33
Internal DOCTYPE Declaration
If the DTD is included in your XML source file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
Example XML document with a DTD:
<!DOCTYPE note [
  <!ELEMENT note (to, from, heading, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend</body>
</note>
34
DTD Interpretation
!DOCTYPE note (in line 1) defines that this is a document of the type note.
!ELEMENT note (in line 2) defines the note element as having four child elements: “to, from, heading, body”.
!ELEMENT to (in line 3) defines the to element to be of type “#PCDATA”.
!ELEMENT from (in line 4) defines the from element to be of type “#PCDATA”.
And so on.
35
External DOCTYPE declaration
If the DTD is external to your XML source file, it should be referenced from a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element SYSTEM "filename">
E.g.:
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don’t forget me this weekend!</body>
</note>
36
“note.dtd” The file “note.dtd” then looks like this:
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
37
Why use a DTD? With a DTD, each of your XML files can carry a description of its own format with it. With a DTD, independent groups of people can agree to use a common DTD for interchanging data. Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data.
38
DTD-XML building blocks
The building blocks of XML documents: seen from a DTD point of view, all XML documents are made up of the following simple building blocks:
- Elements
- Tags
- Attributes
- Entities
- PCDATA
- CDATA
39
DTD-Elements
In a DTD, XML elements are declared with a DTD element declaration.
Declaring an element: an element declaration has the following syntax:
<!ELEMENT element-name category> or <!ELEMENT element-name (element-content)>
Empty elements: empty elements are declared with the category keyword EMPTY:
<!ELEMENT element-name EMPTY>
example: <!ELEMENT br EMPTY>
in XML: <br />
40
DTD-Elements (Cont.) Elements with only character data Elements with only character data are declared with #PCDATA inside parentheses: <!ELEMENT element-name (#PCDATA)> example: <!ELEMENT from (#PCDATA)> Elements with any contents Elements declared with category keyword ANY, can contain any combination of parsable data: <!ELEMENT element-name ANY> example: <!ELEMENT note ANY>
41
DTD-Elements (Cont.) Elements with children (sequences): elements with one or more children are defined with the names of the child elements inside parentheses:
<!ELEMENT element-name (child-element-name, child-element-name, …)>
example: <!ELEMENT note (to, from, heading, body)>
When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children.
42
DTD-Elements (Cont.) The full declaration of the “note” element will be:
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Declaring one or more occurrences of the same element:
<!ELEMENT element-name (child-name+)>
example: <!ELEMENT note (message+)>
The + sign in the example declares that the child element message must occur one or more times inside the “note” element.
43
DTD-Elements (Cont.) Declaring zero or more occurrences of the same element:
<!ELEMENT element-name (child-name*)>
example: <!ELEMENT note (message*)>
The * sign in the example above declares that the child element message can occur zero or more times inside the “note” element.
44
DTD-Elements (Cont.) Declaring zero or one occurrences of the same
element <!ELEMENT element-name (child-name?)> example: <!ELEMENT note (message?)> The ? sign in the example above declares that the child element message can occur zero or one times inside the “note” element.
45
DTD-Elements (Cont.) Declaring either/or content
Example: <!ELEMENT note (to, from, header, (message|body))> The example declares that the “note” element must contain a “to” element, a “from” element, a “header” element, and either a “message” or a “body” element.
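The sequence, +, *, ? and | notations behave much like regular-expression operators applied to the list of child element names. The sketch below exploits that similarity to check an element's children against a content model. It is only an illustration, not a DTD validator: it handles child-element models only (no #PCDATA, EMPTY or ANY), and the function names are invented for this example.

```python
import re
import xml.etree.ElementTree as ET

OPERATORS = set("()|+*?")

def model_to_regex(model):
    """Translate a DTD children model such as '(to, from, (message|body))' to a regex."""
    parts = []
    # Tokens are element names or operators; commas and spaces only separate tokens
    for token in re.findall(r"[A-Za-z_][\w.-]*|[()|+*?]", model):
        if token in OPERATORS:
            parts.append(token)  # same meaning in DTD models and regexes
        else:
            parts.append("(?:" + re.escape(token) + " )")  # a name plus a separator
    return re.compile("".join(parts))

def children_match(model, xml_text):
    """Return True if the root's child tags, in order, satisfy the content model."""
    root = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in root)
    return model_to_regex(model).fullmatch(child_names) is not None

EITHER_OR = "(to, from, header, (message|body))"
ok_body = children_match(EITHER_OR, "<note><to/><from/><header/><body/></note>")
both = children_match(EITHER_OR, "<note><to/><from/><header/><message/><body/></note>")
one_or_more = children_match("(message+)", "<note><message/><message/></note>")
none_for_plus = children_match("(message+)", "<note/>")
none_for_star = children_match("(message*)", "<note/>")
two_for_optional = children_match("(message?)", "<note><message/><message/></note>")
print(ok_body, both, one_or_more, none_for_plus, none_for_star, two_for_optional)
```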
46
DTD-Attributes In a DTD, attributes are declared with an ATTLIST declaration.
Declaring attributes: an attribute declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>
DTD example: <!ATTLIST payment type CDATA "check">
XML example: <payment type="check"/>
47
DTD-Attributes (Cont.)
The attribute-type can have the following values:
Value           Explanation
CDATA           The value is character data
(en1|en2|..)    The value must be one from an enumerated list
ID              The value is a unique ID
IDREF           The value is the ID of another element
IDREFS          The value is a list of other IDs
NMTOKEN         The value is a valid XML name token
NMTOKENS        The value is a list of valid XML name tokens
ENTITY          The value is an entity
ENTITIES        The value is a list of entities
NOTATION        The value is a name of a notation
xml:            The value is a predefined xml value
48
DTD-Attributes (Cont.)
The default-value can have the following values:
Value           Explanation
value           The default value of the attribute
#REQUIRED       The attribute value must be included in the element
#IMPLIED        The attribute does not have to be included
#FIXED value    The attribute value is fixed
49
Attribute Declaration Example
DTD example:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">
XML example: <square width="100" />
The square element is defined to be an empty element with a “width” attribute of type CDATA. If no width is specified, it has a default value of 0.
50
Default Attribute Value
Syntax: <!ATTLIST element-name attribute-name attribute-type "default-value">
DTD example: <!ATTLIST payment type CDATA "check">
XML example: <payment type="check" />
Specifying a default value for an attribute ensures that the attribute will get a value even if the author of the XML document does not include it.
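A parser that reads the DTD can fill in the default for you. The sketch below uses Python's low-level xml.parsers.expat module, which by default reports attributes defaulted from an ATTLIST declaration in the internal DTD subset; the helper name attributes_seen is an assumption, and other parsers or settings may behave differently.

```python
import xml.parsers.expat

DTD = """<!DOCTYPE payment [
  <!ELEMENT payment EMPTY>
  <!ATTLIST payment type CDATA "check">
]>
"""

def attributes_seen(xml_text):
    """Return {element name: attributes} as reported by the expat parser."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)  # True marks the end of the input
    return seen

defaulted = attributes_seen(DTD + "<payment/>")
explicit = attributes_seen(DTD + '<payment type="cash"/>')
print(defaulted, explicit)
```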
51
Implied Attribute Syntax: <!ATTLIST element-name attribute-name attribute-type #IMPLIED>
DTD example: <!ATTLIST contact fax CDATA #IMPLIED>
XML example: <contact fax=" " />
Use an implied attribute if you don’t want to force the author to include an attribute, and you don’t have an option for a default value.
52
Required Attribute Syntax: <!ATTLIST element-name attribute-name attribute-type #REQUIRED> DTD example: <!ATTLIST person number CDATA #REQUIRED> XML example: <person number=“5678” /> Use a required attribute if you don’t have an option for a default value, but still want to force the attribute to be present.
53
Fixed Attribute Value Syntax: <!ATTLIST element-name attribute-name attribute-type #FIXED “value”> DTD example: <!ATTLIST sender company CDATA #FIXED “Microsoft”> XML example: <sender company=“Microsoft” /> Use a fixed attribute value when you want an attribute to have a fixed value without allowing the author to change it. If an author includes another value, the XML parser will return an error.
54
Enumerated Attribute Values
Syntax: <!ATTLIST element-name attribute-name (en1|en2) default-value> DTD example: <!ATTLIST payment type (check|cash) “check”> XML example: <payment type=“check” /> or <payment type=“cash” /> Use enumerated attribute values when you want the attribute values to be one of a fixed set of legal values.
55
DTD-Entities Entities are variables used to define shortcuts to common text. Entity references are references to entities. Entities can be declared internal or external.
Internal Entity Declaration
Syntax: <!ENTITY entity-name "entity-value">
DTD example:
<!ENTITY writer "Donald Duck.">
<!ENTITY copyright "Copyright W3Schools.">
XML example: <author>&writer;&copyright;</author>
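Internal entities are expanded by the parser before the application sees the text. A minimal sketch with Python's xml.etree.ElementTree, wrapping the declarations above in an internal DOCTYPE (the choice of author as the root element follows the XML example):

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE author [
  <!ENTITY writer "Donald Duck.">
  <!ENTITY copyright "Copyright W3Schools.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)
# The entity references have been replaced by their declared values
expanded = author.text
print(expanded)
```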
56
DTD-Entities (Cont.) External Entity Declaration
Syntax: <!ENTITY entity-name SYSTEM "URI/URL"> DTD example: <!ENTITY writer SYSTEM “ <!ENTITY copyright SYSTEM “ XML example: <author>&writer;&copyright;</author>
57
XML Schema Example in DTD:
<!DOCTYPE bank [
  <!ELEMENT bank ((account | customer | depositor)+)>
  <!ELEMENT account (account-number, branch-name, balance)>
  <!ELEMENT customer (customer-name, customer-street, customer-city)>
  <!ELEMENT depositor (customer-name, account-number)>
  <!ELEMENT account-number (#PCDATA)>
  <!ELEMENT branch-name (#PCDATA)>
  <!ELEMENT balance (#PCDATA)>
  <!ELEMENT customer-name (#PCDATA)>
  <!ELEMENT customer-street (#PCDATA)>
  <!ELEMENT customer-city (#PCDATA)>
]>
Can be re-written in XML Schema:
58
XML Schema (Cont.)
<xsd:schema xmlns:xsd=“
  <xsd:element name="bank" type="BankType"/>
  <xsd:element name="account">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="account-number" type="xsd:string"/>
        <xsd:element name="branch-name" type="xsd:string"/>
        <xsd:element name="balance" type="xsd:decimal"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="customer">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="customer-name" type="xsd:string"/>
        <xsd:element name="customer-street" type="xsd:string"/>
        <xsd:element name="customer-city" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
59
XML Schema (Cont.) <xsd:element name=“depositor”> <xsd:complexType> <xsd:sequence> <xsd:element name=“customer-name” type=“xsd:string”/> <xsd:element name=“account-number” type=“xsd:string”/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:complexType name=“BankType”> <xsd:sequence> <xsd:element ref=“account” minOccurs=“0” maxOccurs=“unbounded”/> <xsd:element ref=“customer” minOccurs=“0” maxOccurs=“unbounded”/> <xsd:element ref=“depositor” minOccurs=“0” maxOccurs=“unbounded”/> </xsd:sequence> </xsd:complexType> </xsd:schema>
60
XML Schema (Cont.) The benefits of XML Schema over DTDs are:
* It allows user-defined types to be created.
* It allows the text that appears in elements to be constrained to specific types, such as numeric types in specific formats or even more complicated types such as lists or unions.
* It allows types to be restricted to create specialized types, for instance by specifying max and min values.
* It allows complex types to be extended by using a form of inheritance.
* It is a superset of DTDs.
* It allows uniqueness and foreign key constraints.
However, the price for these features is that XML Schema is significantly more complicated than DTDs.
61
Querying and Transformation
Tools for querying and transformation of XML data are essential to extract information from large bodies of XML data, and to convert data between different representations (schemas) in XML. Several languages provide increasing degrees of querying and transformation capabilities: - XPath - XSLT - XQuery
62
XPATH XPATH addresses parts of an XML document by means of path expressions. A path expression in XPATH is a sequence of location steps separated by “/”. The result of a path expression is a set of values. For example:
/bank-2/customer/name
would return:
<name>Joe</name>
<name>Lisa</name>
<name>Mary</name>
The expression:
/bank-2/customer/name/text( )
would return the same names, but without the enclosing tags.
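Python's xml.etree.ElementTree supports a limited subset of XPath, which is enough to try the two expressions above. In this sketch the bank-2 document is hypothetical sample data built around the three names shown, and the paths are written relative to the root element (bank-2), because ElementTree evaluates paths from an element rather than from the document node.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data matching the names in the example
BANK_XML = (
    "<bank-2>"
    "<customer><name>Joe</name></customer>"
    "<customer><name>Lisa</name></customer>"
    "<customer><name>Mary</name></customer>"
    "</bank-2>"
)

bank = ET.fromstring(BANK_XML)

# Equivalent of /bank-2/customer/name : the matching elements themselves
name_elements = bank.findall("customer/name")
with_tags = [ET.tostring(el, encoding="unicode") for el in name_elements]

# Equivalent of /bank-2/customer/name/text() : only the text inside them
without_tags = [el.text for el in name_elements]

print(with_tags)
print(without_tags)
```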
63
XPATH (Cont.) XPATH supports a number of other features: /bank-2/account[balance > 400] returns account elements with a balance value greater than 400, while /bank-2/account[balance returns the account numbers of those accounts. /bank-2/account/[customer/count( )>2] returns accounts with more than 2 customers. returns all customers referred to from the owners attribute of account elements.
64
XPATH (Cont.) gives customers with either accounts or loans. However, the | operator cannot be nested inside other operators. /bank-2//name finds any name element anywhere under the /bank-2 element, regardless of the element in which it is contained. Note: the “//” described above is a short form for specifying “all descendants”, while “..” specifies the parent.
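Some of these features can be tried with Python's xml.etree.ElementTree, with one caveat: its XPath subset has no numeric comparison predicate, so the balance test below is done in plain Python instead. The document, including the account-number attribute and the branch element, is hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration only
BANK_XML = """<bank-2>
  <account account-number="A-401"><balance>500</balance></account>
  <account account-number="A-402"><balance>900</balance></account>
  <account account-number="A-403"><balance>300</balance></account>
  <customer><name>Joe</name></customer>
  <branch><name>Downtown</name></branch>
</bank-2>"""

bank = ET.fromstring(BANK_XML)

# Accounts with a balance greater than 400 (filtering done in Python)
large = [
    account.get("account-number")
    for account in bank.findall("account")
    if float(account.findtext("balance")) > 400
]

# ".//name" selects name elements at any depth, like /bank-2//name
all_names = [el.text for el in bank.findall(".//name")]

# ".." steps up to the parent of each selected name element
name_parents = [el.tag for el in bank.findall(".//name/..")]

print(large, all_names, name_parents)
```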
65
XSLT A style sheet is a representation of formatting options for a document, usually stored outside the document itself. The Extensible Stylesheet Language (XSL) was originally designed for generating HTML from XML. It includes a general-purpose transformation mechanism, called XSL Transformations (XSLT), which can be used to transform one XML document into another XML document, or to other formats such as HTML. XSLT transformations are expressed as a series of recursive rules, called templates.
66
XSLT (Cont.) A simple template for XSLT consists of a match part and a select part. For instance:
<xsl:template match="/bank-2/customer">
  <xsl:value-of select="customer-name"/>
</xsl:template>
<xsl:template match="."/>
The xsl:template match statement contains an XPath expression that selects one or more nodes. The first template matches customer elements that occur as children of the bank-2 root element. The xsl:value-of statement enclosed in the match statement outputs values from the nodes in the result of the XPath expression. The first template outputs the value of the customer-name subelement; note that the value does not contain the element tag.
67
XSLT (Cont.) The second template matches all nodes. This is required because the default behavior of XSLT on subtrees of the input document that do not match any template is to copy the subtrees to the output document. Structural recursion is a key part of XSLT. When a template matches an element in the tree structure, XSLT can use structural recursion to apply template rules recursively by means of the xsl:apply-templates directive, which appears inside other templates.