1
XML -07-
8
XML What is XML? XML stands for eXtensible Markup Language
XML is a markup language XML was designed to carry data <book> <title> A nice book about XML </title> <author> Hanan Shpungin </author> <edition> <volume> 1 </volume> <year> 2010 </year> </edition> </book> It wasn’t designed to display data – how would you display the book information?
12
XML What is XML? Do not confuse with HTML XML does not do anything
HTML is about displaying information, while XML is about carrying information - XML focuses on what data is - HTML focuses on how data looks XML does not do anything XML was created to structure information - it is just pure information wrapped in tags Someone must write a piece of software to send, receive or display it
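To make "someone must write a piece of software" concrete, here is a minimal Python sketch that reads the book example and decides how to display it. The function name `describe_book` and the one-line output format are assumptions made for illustration; the XML itself says nothing about presentation.

```python
import xml.etree.ElementTree as ET

# The book example from the slides, stored as plain text.
BOOK_XML = """
<book>
  <title> A nice book about XML </title>
  <author> Hanan Shpungin </author>
  <edition>
    <volume> 1 </volume>
    <year> 2010 </year>
  </edition>
</book>
"""

def describe_book(xml_text):
    """Turn the <book> data into one display line (the layout is our own choice)."""
    book = ET.fromstring(xml_text)
    title = book.findtext("title").strip()
    author = book.findtext("author").strip()
    volume = book.findtext("edition/volume").strip()
    year = book.findtext("edition/year").strip()
    return f"{title}, by {author} (vol. {volume}, {year})"

print(describe_book(BOOK_XML))
```

A different program could render the same data as a table or a web page without touching the XML.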
16
XML What is XML? XML is just plain text
Software that can handle plain text can also handle XML However, XML-aware applications can handle the XML tags specially; the actual handling depends on the tags and the application You have to invent your own tags XML language has no predefined tags (unlike HTML) XML allows the author to define his own tags and his own document structure For instance, the book example used “made-up” tags
18
XML What is XML? XML is just plain text
Software that can handle plain text can also handle XML However, XML-aware applications can handle the XML tags specially; the actual handling depends on the tags and the application You have to invent your own tags An XML document is very self-descriptive
20
XML What is XML? XML is everywhere
XML is one of the most widely used formats for transmitting data between all sorts of applications Various uses: XHTML CAP (Common Alerting Protocol) – public warnings and emergencies DocBook – technical documentation CML (Chemical Markup Language) – managing molecular information
25
XML What is XML good for? Separating data from layout
You really don’t want to have to update your HTML every time the data changes Let HTML worry about how data is presented Put the data into XML files; this way you can modify the data without having to worry about presentation The data can then be retrieved by simple JavaScript code
30
XML What is XML good for? Simple sharing of data Data accessibility
XML data is stored in plain text format; this provides a software- and hardware-independent way of storing data Data can be easily shared between different applications on the same machine or across the internet System upgrades do not require changing the data format; there is no data incompatibility Data accessibility Data can be available to all kinds of "reading machines" (e.g. smartphones, news feed readers) It can also be made available for people with disabilities
36
XML The XML tree An XML document is a tree of elements
XML documents must contain a root element; this element is “the parent” of all the other elements The elements in an XML document form a document tree The tree starts at the root and branches to the lowest level of the tree All elements can have sub elements (child elements) <root> <child> <subchild>.....</subchild> </child> </root>
37
XML The XML tree An XML document is a tree of elements
XML documents must contain a root element; this element is “the parent” of all the other elements The elements in an XML document form a document tree The tree starts at the root and branches to the lowest level of the tree All elements can have sub elements (child elements) The terms “parent”, “child”, and “sibling” are used to describe the relationships between elements
38
<bookstore> <book category="COOKING"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="CHILDREN"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="WEB"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>
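The tree structure of the bookstore document can be explored with a short Python sketch. It uses the standard-library `xml.etree.ElementTree` module; the helper name `outline` and the shortened two-book document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
"""

def outline(element, depth=0):
    """Return one indented line per element, children listed below their parent."""
    lines = ["  " * depth + element.tag]
    for child in element:            # iterating an element yields its child elements
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(BOOKSTORE_XML)  # <bookstore> is the root, the parent of everything else
print("\n".join(outline(root)))

first_book = root[0]
siblings = [child.tag for child in first_book]   # title, author, year, price are siblings
print(siblings)
```

Here `<bookstore>` is the parent of both `<book>` elements, and the four elements inside one `<book>` are siblings of each other.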
45
XML The XML tree An XML document is a tree of elements
XML declaration is not part of the tree Most XML documents start with a line which looks like this <?xml version="1.0" encoding="ISO-8859-1"?> XML documents can actually contain non-ASCII characters By omitting the declaration line you can encounter parsing problems due to incompatibility with the parser's default encoding (UTF-8 unless stated otherwise) Always place a declaration line to avoid problems Make sure you know in what encoding your XML file is saved
46
<?xml version="1.0" encoding="ISO-8859-1"?> <bookstore> <book category="COOKING"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="CHILDREN"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="WEB"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>
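Why the declaration matters can be shown with a small Python sketch. It assumes a document saved in ISO-8859-1 that contains one non-ASCII letter; the element name `name` and the helper `parses` are made up for the example.

```python
import xml.etree.ElementTree as ET

# "café" saved in ISO-8859-1: the letter é is stored as the single byte 0xE9.
body = "<name>café</name>".encode("iso-8859-1")

with_declaration = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body
print(ET.fromstring(with_declaration).text)   # the declared encoding is used to decode é

def parses(data):
    """Return True if the parser accepts the bytes, False if it reports an error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# Without a declaration the parser assumes UTF-8, and the byte 0xE9
# followed by "<" is not a valid UTF-8 sequence.
print(parses(body))
```

The same bytes are accepted or rejected depending only on whether the declaration names the encoding they were saved in.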
50
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag A starting tag starts with “<” and ends with “>” <book> A closing tag starts with “</” and ends with “>” </book> An empty element tag starts with “<” and ends with “/>” <new />
51
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag A starting tag starts with “<” and ends with “>” A closing tag starts with “</” and ends with “>” An empty element tag starts with “<” and ends with “/>” The following example is illegal <note> <sender> Hanan </sender> <text> Meet you at 15:30 </note>
53
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag A starting tag starts with “<” and ends with “>” A closing tag starts with “</” and ends with “>” An empty element tag starts with “<” and ends with “/>” The fixed example <note> <sender> Hanan </sender> <text> Meet you at 15:30 </text> </note>
54
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag A starting tag starts with “<” and ends with “>” A closing tag starts with “</” and ends with “>” An empty element tag starts with “<” and ends with “/>” The declaration is not an element, so there is no problem <?xml version="1.0" encoding="ISO-8859-1"?>
58
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag The XML tags are case-sensitive XML elements are defined using tags, which are case-sensitive <Note> is different from <note> The starting and the closing tags of an element must match <Note> this is wrong </note> <noTe> this is correct </noTe>
61
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag The XML tags are case-sensitive XML elements must be properly nested You cannot open one element and close another one Elements must be closed in reverse order <b> <i> wrong </b> </i> <b> <i> correct </i> </b>
62
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag The XML tags are case-sensitive XML elements must be properly nested An XML document must have a root element <root> <child> <subchild>.....</subchild> </child> </root>
65
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag The XML tags are case-sensitive XML elements must be properly nested An XML document must have a root element XML Attribute values must be quoted It is possible for XML elements to have attributes in the form name/value just like in HTML The attribute value must always be quoted <msg date=11/02/10> wrong </msg>
66
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag The XML tags are case-sensitive XML elements must be properly nested An XML document must have a root element XML Attribute values must be quoted It is possible for XML elements to have attributes in the form name/value just like in HTML The attribute value must always be quoted <msg date="11/02/10"> correct </msg>
68
XML XML syntax rules The XML rules are simple and strict
All XML elements must have a closing tag The XML tags are case-sensitive XML elements must be properly nested An XML document must have a root element XML Attribute values must be quoted Whitespace is preserved in XML (it is not collapsed as in HTML) Comments in XML Similar to HTML: <!-- this is a comment -->
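The last two rules can be observed with a short Python sketch. It assumes the default `xml.etree.ElementTree` parser, which drops comments from the tree it builds; the sample note is made up for the example.

```python
import xml.etree.ElementTree as ET

note = ET.fromstring(
    "<note><!-- a comment --><text>  Meet   you at 15:30  </text></note>"
)

text = note.find("text").text
print(repr(text))      # leading, trailing and repeated spaces are all still there

# The comment is not data: with the default parser it does not become a child element.
print(len(note))
```

Any trimming or collapsing of whitespace is therefore a decision of the application, not of the parser.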
72
XML XML syntax rules Entity references
Just like in HTML, it is advised to replace several special symbols with entity references The characters “<” and “&” may not appear literally in element content The following generates an error <message> if salary < 1000 then </message> Can be fixed <message> if salary &lt; 1000 then </message>
74
XML XML syntax rules Entity references
Just like in HTML, it is advised to replace several special symbols with entity references The characters “<” and “&” may not appear literally in element content There are 5 predefined entities for the symbols < > " ' & written as &lt; &gt; &quot; &apos; &amp; It is possible to address any symbol through its numeric code, e.g. &#101; for e
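In practice the replacement is usually done by a library rather than by hand. The sketch below uses `escape` from Python's standard `xml.sax.saxutils` module; the sample sentence is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 & bonus > 0 then"
escaped = escape(raw)                 # replaces &, < and > with entity references
print(escaped)

message = ET.fromstring(f"<message>{escaped}</message>")
print(message.text)                   # the parser turns the references back into characters

# A numeric character reference: 101 is the code point of the letter "e".
print(ET.fromstring("<c>&#101;</c>").text)
```

The application always sees the original characters; the entity references exist only in the serialized text.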
80
XML XML elements What is an element?
An XML element is everything from (including) the element's start tag to (including) the element's end tag Elements can contain other elements, simple text or a mixture of both Elements can also have attributes <bookstore> <book category="WEB"> <author> Erik T. Ray </author> <title> Learning XML </title> </book> </bookstore>
83
XML XML elements Naming elements
Recall that there are no predefined elements – you make your own There are no reserved words; you can use any name Some restrictions apply Names cannot start with a number or a punctuation sign Names cannot start with the letters “xml” in any case (XML, xml, Xml, etc.) Names cannot contain spaces
88
XML XML elements Naming elements Some tips for naming elements:
Use short and informative names, separating words with underscores “_”, e.g. first_name, book_title and not “the_title_of_the_book” Avoid using “-”, “.”, “:” in the names, as they might be misinterpreted by some software Use naming conventions of the other parts of your project You can use non-English letters, but it is better to avoid them, as the reader's software might not support them
92
XML XML attributes Attributes may appear in the start tag (like in HTML) Attributes provide additional information about the element <file type="png">image.png</file> Attribute values must be quoted You can use either single or double quotes; both uses are valid <file type="png">image.png</file> <file type='png'>image.png</file> If the attribute value contains double quotes, it is possible to use single quotes or entity references <musician name='Elvis "The King" Presley'>
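Both quoting styles, and both ways of embedding quotes in a value, can be checked with a short Python sketch. The self-closing `<musician .../>` form is used here only so that each snippet is a complete document.

```python
import xml.etree.ElementTree as ET

double_quoted = ET.fromstring('<file type="png">image.png</file>')
single_quoted = ET.fromstring("<file type='png'>image.png</file>")
print(double_quoted.get("type"), single_quoted.get("type"))

# A value that itself contains double quotes: wrap it in single quotes...
with_single = ET.fromstring("""<musician name='Elvis "The King" Presley'/>""")
# ...or keep the double quotes and write the inner quotes as &quot;
with_entity = ET.fromstring('<musician name="Elvis &quot;The King&quot; Presley"/>')

print(with_single.get("name"))
print(with_single.get("name") == with_entity.get("name"))
```

After parsing, the two spellings are indistinguishable: the application sees the same attribute value.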
93
XML XML attributes Elements or attributes?
There are several ways to present the same data You can use either attributes <message date="11/02/10"> … </message> or elements <message> <date>11/02/10</date> … </message> to present the same data - what is better?
94
<message> <date> <day>11</day> <month>02</month> <year>2010</year> </date> <from>Hanan</from> <to>class</to> <text>XML is fun!</text> </message> <message> <date>11/02/2010</date> <from>Hanan</from> <to>class</to> <text>XML is fun!</text> </message> <message date="11/02/2010"> <from>Hanan</from> <to>class</to> <text>XML is fun!</text> </message>
95
The worst! <message date="11/02/2010" from="Hanan" to="class" text="XML is fun!"> </message>
98
XML XML attributes Elements or attributes?
There are several ways to present the same data You can use either attributes or elements to present the same data - what is better? Attributes are generally harder to read and maintain Some drawbacks of attributes: they cannot contain multiple values and cannot be nested A nice rule of thumb: use elements for data and attributes for metadata (e.g. assigning an id to an element)
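The practical difference shows up when a program reads the data. The Python sketch below compares the nested-element form of the date with the attribute form from the message examples; the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

as_elements = ET.fromstring("""
<message>
  <date><day>11</day><month>02</month><year>2010</year></date>
  <from>Hanan</from>
  <text>XML is fun!</text>
</message>
""")

as_attribute = ET.fromstring("""
<message date="11/02/2010">
  <from>Hanan</from>
  <text>XML is fun!</text>
</message>
""")

# Nested elements: each part of the date is addressable on its own.
year_from_elements = as_elements.findtext("date/year")

# Attribute: one opaque string, so the reader must already know the day/month/year layout.
day, month, year_from_attribute = as_attribute.get("date").split("/")

print(year_from_elements, year_from_attribute)
```

With elements the structure is in the document; with the attribute it lives in the reading program, which is part of why attributes are harder to maintain.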
103
XML Well-formedness and validity A well-formed XML document
A document is well-formed if it satisfies the syntax rules If an XML document violates the syntax, it is not considered to be an XML document Yes, it’s draconian – unlike HTML, where the browser is expected to produce a reasonable result even in the presence of severe errors If a document is not well-formed, the processor is required to stop and report an error
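The "stop and report an error" behaviour is easy to observe. The Python sketch below feeds several of the illegal examples from the syntax rules to the standard-library parser; the helper name `is_well_formed` and the sample labels are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it stops with an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

samples = {
    "missing closing tag": "<note><text>Meet you at 15:30</note>",
    "case mismatch":       "<Note>this is wrong</note>",
    "bad nesting":         "<b><i>wrong</b></i>",
    "unquoted attribute":  "<msg date=11/02/10>wrong</msg>",
    "two root elements":   "<a/><b/>",
    "fixed note":          "<note><text>Meet you at 15:30</text></note>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the last sample is accepted; for the others the parser gives up at the first violation instead of guessing what was meant.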
107
XML Well-formedness and validity A valid XML document
In addition to being well-formed, an XML document may also be valid A valid document holds a reference to a Document Type Definition (DTD), and the document follows the rules of that DTD XML processors are classified as validating or non-validating, depending on whether or not they check XML documents for validity DTD is just one of the many ways to write grammar rules (schema) which define the validity of a document
109
XML Schemas and validation What is a schema?
A schema addresses the following aspects: the set of elements that may be used in a document what attributes may be applied to every element the order of elements/attributes the allowable parent/child relationships
110
XML Schemas and validation What is a schema? DTD
Defines the grammar rules of a document DTD The oldest schema language for XML Quite simple to write and read Only the string type available for data, that is you cannot define a numeric type of data No complex types Very widely used
111
XML Schemas and validation What is a schema? DTD
Defines the grammar rules of a document DTD <!DOCTYPE bookstore [ <!ELEMENT bookstore (book*)> <!ELEMENT book (title,author,year)> <!ELEMENT title (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT year (#PCDATA)> <!ATTLIST book price CDATA #REQUIRED> ]>
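To illustrate what such rules express, here is a hand-written Python check of the same constraints: a bookstore holds books, each book has a title, author and year in that order, and a price attribute is required. This is only a sketch of the idea, not a DTD validator; Python's standard-library parsers do not validate against a DTD, and the function name `check_bookstore` is made up for the example.

```python
import xml.etree.ElementTree as ET

EXPECTED_CHILDREN = ["title", "author", "year"]   # the order matters

def check_bookstore(xml_text):
    """Return a list of rule violations; an empty list means all checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "bookstore":
        problems.append(f"root is <{root.tag}>, expected <bookstore>")
    for index, book in enumerate(root, start=1):
        if book.tag != "book":
            problems.append(f"child {index} is <{book.tag}>, expected <book>")
            continue
        if [child.tag for child in book] != EXPECTED_CHILDREN:
            problems.append(f"book {index}: children must be title, author, year")
        if "price" not in book.attrib:
            problems.append(f"book {index}: missing required price attribute")
    return problems

good = ('<bookstore><book price="39.95"><title>Learning XML</title>'
        '<author>Erik T. Ray</author><year>2003</year></book></bookstore>')
bad = ('<bookstore><book><author>Erik T. Ray</author>'
       '<title>Learning XML</title></book></bookstore>')

print(check_bookstore(good))
print(check_bookstore(bad))
```

A schema states these rules declaratively, so a validating processor can apply them without any document-specific code like this.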
112
XML Schemas and validation What is a schema?
Defines the grammar rules of a document XML schema definition (XSD) Much more powerful than DTD XSD uses an XML-based format, which makes it easier to read using XML tools Complex and rich data typing (almost like a programming language) Detailed constraints on the logical structure of an XML document
114
XML Processing XML documents
The design goal: “it shall be easy to write programs which process XML documents” However, the XML specification does not say how A variety of APIs to access XML were developed SAX (Simple API for XML) A stream parser with event driven API The user defines callback methods to be invoked on specific events The parser simply scans the document and notifies of events, such as “element start”, “text node”, etc.
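A minimal SAX sketch in Python, using the standard `xml.sax` module. The handler class `EventLogger` is an assumption made for illustration; it only records which callbacks the parser invokes and in what order.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Records every event the parser reports, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        if content.strip():                      # skip whitespace-only text
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))

handler = EventLogger()
xml.sax.parseString(b"<book><title>Learning XML</title></book>", handler)
for event in handler.events:
    print(event)
```

The parser never builds a tree: once an event has been reported it is gone, so the handler must keep whatever state it needs.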
115
XML Processing XML documents
The design goal: “it shall be easy to write programs which process XML documents” However, the XML specification does not say how A variety of APIs to access XML were developed DOM (Document Object Model) Supports navigation in the whole document tree Allows manipulation of the document as well Usually very heavy on the memory
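A minimal DOM sketch in Python, using the standard `xml.dom.minidom` module. The sample document and the added `price` element are assumptions made for illustration.

```python
from xml.dom import minidom

document = minidom.parseString(
    "<bookstore><book><title>Learning XML</title><year>2003</year></book></bookstore>"
)

# Navigation: the whole tree is in memory, so we can move in any direction.
title = document.getElementsByTagName("title")[0]
print(title.firstChild.data)       # the text node inside <title>
print(title.parentNode.tagName)    # back up to the parent element

# Manipulation: set an attribute, add a new element, then serialize the result.
book = title.parentNode
book.setAttribute("category", "WEB")
price = document.createElement("price")
price.appendChild(document.createTextNode("39.95"))
book.appendChild(price)
print(document.documentElement.toxml())
```

The convenience comes at the cost noted above: every element, attribute and text node of the document is held as an object in memory.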
116
XML Processing XML documents
The design goal: “it shall be easy to write programs which process XML documents” However, the XML specification does not say how A variety of APIs to access XML were developed Pull parsing Resembles SAX, but instead of reacting to events, forces the next step The parser provides the user with an iterator, with which he can traverse the document sequentially
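A pull-style sketch in Python. It assumes `iterparse` from the standard `xml.etree.ElementTree` module as a stand-in for a pull parser: it returns an iterator, and the calling loop decides when to ask for the next event.

```python
import io
import xml.etree.ElementTree as ET

data = (b"<bookstore>"
        b"<book><title>Everyday Italian</title></book>"
        b"<book><title>Learning XML</title></book>"
        b"</bookstore>")

titles = []
# The loop pulls events one at a time instead of being called back by the parser.
for event, element in ET.iterparse(io.BytesIO(data), events=("start", "end")):
    if event == "end" and element.tag == "title":
        titles.append(element.text)
    if event == "end" and element.tag == "book":
        element.clear()            # drop a finished book to keep memory use low

print(titles)
```

Because the caller drives the iteration, it can stop early, for example as soon as the first matching element has been found.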
121
XML Related specifications
Usually, the term XML implies additional technologies We focus on a few of them XML Namespaces – using elements with the same name but from different vocabularies (namespaces) XSLT – An XML-based language for transformation of XML documents into other XML documents, plain text, HTML, etc. XPath – Used to navigate through elements and attributes in an XML document XQuery – Designed to query XML documents (uses XPath)
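Two of these can be tried with Python's standard library, which supports a limited subset of XPath and understands namespaces. In the sketch below the `inv` prefix and its URI `http://example.com/inventory` are hypothetical, chosen only to show two different `title` elements side by side.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<bookstore xmlns:inv="http://example.com/inventory">
  <book category="COOKING"><title>Everyday Italian</title><inv:title>shelf 3</inv:title></book>
  <book category="WEB"><title>Learning XML</title><inv:title>shelf 7</inv:title></book>
</bookstore>
""")

# XPath-style query: titles of all books whose category attribute is "WEB".
web_titles = [t.text for t in root.findall("./book[@category='WEB']/title")]

# Namespaces: <title> and <inv:title> share a local name but are different elements.
ns = {"inv": "http://example.com/inventory"}
shelves = [t.text for t in root.findall("./book/inv:title", ns)]

print(web_titles, shelves)
```

XSLT and XQuery are not part of Python's standard library and need a dedicated processor.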
122
XML XML so far (quick summary)
XML stands for eXtensible Markup Language XML is a markup language XML was designed to carry data <book> <title> A nice book about XML </title> <author> Hanan Shpungin </author> <edition> <volume> 1 </volume> <year> 2010 </year> </edition> </book> It wasn’t designed to display data – how would you display the book information?
123
XML XML so far (quick summary) XML is just plain text
Software that can handle plain text can also handle XML However, XML-aware applications can handle the XML tags specially; the actual handling depends on the tags and the application You have to invent your own tags XML language has no predefined tags (unlike XHTML) XML allows the author to define his own tags and his own document structure For instance, the book example used “made-up” tags
124
XML XML so far (quick summary)
An XML document is a tree of elements
- XML documents must contain a root element; this element is “the parent” of all the other elements
- The elements in an XML document form a document tree
- The tree starts at the root and branches to the lowest level of the tree
- All elements can have sub-elements (child elements)
- The terms “parent”, “child”, and “sibling” are used to describe the relationships between elements
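To make the tree picture concrete, here is a minimal sketch that parses the book example and prints its element tree. It assumes Python's standard xml.etree.ElementTree module; the helper name outline is made up for this illustration.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """
<book>
  <title>A nice book about XML</title>
  <author>Hanan Shpungin</author>
  <edition>
    <volume>1</volume>
    <year>2010</year>
  </edition>
</book>
"""

def outline(element, depth=0):
    """Return one (depth, tag) pair per element, in document order."""
    rows = [(depth, element.tag)]
    for child in element:  # iterating an Element yields its child elements
        rows.extend(outline(child, depth + 1))
    return rows

root = ET.fromstring(BOOK_XML)  # root is the <book> element
tree_rows = outline(root)
for depth, tag in tree_rows:
    print("  " * depth + tag)
```

Here book is the parent of title, author and edition (which are siblings), and edition is in turn the parent of volume and year.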
129
XML Working with XML data
XML data is usually read by parsers
- Although XML is plain text which can be easily read, using a parser allows taking advantage of the structure of the document (e.g. SAX, DOM, pull parsing, etc.)
It is possible to transform XML documents into other XML documents
- For example, converting an XML document (e.g. a web feed) into an XHTML document (to be presented in a browser)
- XML documents can be transformed (and rendered) by using a family of languages called XSL
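The difference between the parser styles mentioned above can be sketched as follows. This is only an illustration, assuming Python's standard xml.sax and xml.dom.minidom modules; the class name NameCollector is hypothetical, and the data is the course staff document used later in the deck.

```python
import xml.sax
from xml.dom import minidom

STAFF_XML = """<course_staff>
  <instructor><name>Hanan Shpungin</name></instructor>
  <teaching_assistant><name>Marian Doerk</name></teaching_assistant>
</course_staff>"""

class NameCollector(xml.sax.ContentHandler):
    """SAX style: the parser pushes events and we keep only what we need."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, tag, attrs):
        if tag == "name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, tag):
        if tag == "name":
            self.names.append("".join(self._buffer))
            self._in_name = False

handler = NameCollector()
xml.sax.parseString(STAFF_XML.encode("utf-8"), handler)
sax_names = handler.names

# DOM style: the whole document is loaded into a tree and queried afterwards.
dom = minidom.parseString(STAFF_XML)
dom_names = [node.firstChild.data for node in dom.getElementsByTagName("name")]
```

SAX never holds the whole document in memory, which suits large inputs; DOM keeps the full tree, which is more convenient when the document has to be navigated repeatedly.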
135
XML XSL (eXtensible Stylesheet Language)
XSL is to XML what CSS is to HTML
- XML tags hold no information about what to display or how to display it
XSL is actually a family of languages
- These languages define the transformation and formatting rules for XML documents
- XSLT is a language for the transformation of XML documents
- XPath is a language for navigating in XML documents
- XSL-FO is a language for formatting XML documents
139
XML XSL Transformations (XSLT)
The general model of XSLT
The XSLT processor takes two inputs:
1. the XML document
2. the XSL stylesheet
The XSLT processor then produces the output document by following the instructions of the XSL stylesheet.
With XSLT you have full control over the resulting document: you can create new elements, add existing ones, rearrange elements, sort elements, modify elements, etc.
144
XML XSL Transformations (XSLT)
The processing is template-based
- The XSL stylesheet defines templates, against which the nodes of the source document are matched
- In case of a match, the XSLT processor transforms the matching part into the result document according to the provided rules
- It can be viewed as functional expressions which evaluate to the final result
For example ...
145
XML document
<?xml version="1.0" ?>
<course_staff>
  <instructor>
    <name>Hanan Shpungin</name>
  </instructor>
  <teaching_assistant>
    <name>Marian Doerk</name>
  </teaching_assistant>
</course_staff>
146
XSL Stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl=" xmlns="
<xsl:template match="/">
<head>
  <title>SENG 513 teaching staff</title>
</head>
<body>
<h1> SENG 513 teaching staff </h1>
<h2> Instructor </h2>
<b> Name: </b> <i><xsl:value-of select="//instructor/name" /></i> <br />
<b> </b> <tt><xsl:value-of select="//instructor/ " /></tt>
147
XSL Stylesheet
<h2> Teaching assistant </h2>
<b> Name: </b> <i> <xsl:value-of select="//teaching_assistant/name" /> </i> <br />
<b> </b> <tt> <xsl:value-of select="//teaching_assistant/ " /> </tt>
</body>
</xsl:template>
</xsl:stylesheet>
148
XML XSL Transformations (XSLT)
The processing is template-based
- The XSL stylesheet defines templates, against which the nodes of the source document are matched
- In case of a match, the XSLT processor transforms the matching part into the result document according to the provided rules
- It can be viewed as functional expressions which evaluate to the final result
XSLT relies on several XML-related specifications: XML Namespaces and XPath
152
XML XML Namespace
Namespaces are used to provide unique names to elements and attributes of an XML document.
An XML document may contain elements or attributes from more than one XML vocabulary. The ambiguity can be resolved by giving each vocabulary a namespace.
For example, two vocabularies may both define a <tree> element:
<tree>
  <family> … </family>
  <age> … </age>
</tree>
<tree>
  <nodes> … </nodes>
  <edges> … </edges>
</tree>
154
XML XML Namespace
Namespaces are used to provide unique names to elements and attributes of an XML document.
An XML document may contain elements or attributes from more than one XML vocabulary. Giving each vocabulary its own namespace prefix resolves the ambiguity. For example:
<f:tree>
  <f:family> … </f:family>
  <f:age> … </f:age>
</f:tree>
<g:tree>
  <g:nodes> … </g:nodes>
  <g:edges> … </g:edges>
</g:tree>
156
XML XML Namespace
Namespaces are used to provide unique names to elements and attributes of an XML document.
The namespace is defined by the xmlns attribute in the start tag of an element:
<tag xmlns:prefix="URI">
For example:
<tree xmlns:f="…">
When a namespace is defined for an element, all child elements with the same prefix are associated with the same namespace.
157
XML XML Namespace
Namespaces are used to provide unique names to elements and attributes of an XML document.
The namespace is defined by the xmlns attribute in the start tag of an element:
<tag xmlns:prefix="URI">
For example:
<tree xmlns:f="…">
The prefix may be omitted (xmlns="URI"); this declares a default namespace that applies to the element and to its unprefixed descendants.
Note that the URI is not fetched and need not point to any data; it is just a unique name.
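The effect of namespace declarations can be seen from the parser's side. The sketch below assumes Python's standard xml.etree.ElementTree; the two URIs, the forest wrapper element and the element contents are made up for illustration and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Both URIs are invented names for this example; nothing is fetched from them.
FAMILY_NS = "http://example.org/family"
GRAPH_NS = "http://example.org/graph"

DOC = f"""<forest xmlns:f="{FAMILY_NS}" xmlns:g="{GRAPH_NS}">
  <f:tree><f:family>Smith</f:family><f:age>80</f:age></f:tree>
  <g:tree><g:nodes>5</g:nodes><g:edges>4</g:edges></g:tree>
</forest>"""

root = ET.fromstring(DOC)

# ElementTree expands every prefixed name to "{namespace-URI}local-name",
# so the two <tree> elements end up with different tags.
tags = [child.tag for child in root]

# The prefixes used in a query are independent of those used in the document;
# only the URIs have to agree.
prefixes = {"fam": FAMILY_NS, "gr": GRAPH_NS}
family_trees = root.findall("fam:tree", prefixes)
graph_trees = root.findall("gr:tree", prefixes)
```

The query uses the prefixes fam and gr while the document uses f and g, which shows that the prefix is only shorthand: the URI is the actual name.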
158
XSL Stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl=" xmlns="
<xsl:template match="/">
<head>
  <title>SENG 513 teaching staff</title>
</head>
<body>
<h1> SENG 513 teaching staff </h1>
<h2> Instructor </h2>
<b> Name: </b> <i><xsl:value-of select="//instructor/name" /></i> <br />
<b> </b> <tt><xsl:value-of select="//instructor/ " /></tt>
163
XML XML Path
XPath is used to navigate through the tree of elements and attributes in an XML document.
The navigation slightly resembles a file system. For example:
/bookstore/book
/bookstore/*
An expression selects either a single node or a set of nodes by following the steps along the given path.
167
XML XML Path Syntax
The nodes are selected by following a chain of steps.
The chain can be either absolute or relative:
/step1/step2/step3/…
step1/step2/step3/…
The steps are separated by a “/”.
The general form of a step is:
axisname::node-test[predicate]
For example:
child::book[price>35.00]
171
XML XML Path Syntax
The evaluation of a single step is relative to the current node (the context node)
- Axis name defines the relationship within the tree, for example child, descendant, attribute, parent, etc.
- Node test specifies a node (or a group of nodes) within the axis
- Predicates refine the match by imposing additional conditions that the nodes must satisfy
172
XPath expression: /bookstore
<?xml version="1.0" encoding="ISO "?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
Selects the root element bookstore.
173
XPath expression: /bookstore/book[price>35]
<?xml version="1.0" encoding="ISO "?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
Selects the book elements whose price is greater than 35 (here, the second book).
174
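XPath expression: //price/parent::book/title[@lang='eng']
<?xml version="1.0" encoding="ISO "?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title>Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
Selects the title elements with lang="eng" that belong to a book with a price child (here, only the first title).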
175
XPath expression: //book/*
<?xml version="1.0" encoding="ISO "?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
Selects all child elements of every book element.
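The expressions above can be tried out with a few lines of code. This is a sketch only: it assumes Python's standard xml.etree.ElementTree, which implements a limited subset of XPath, so the numeric predicate is applied in Python and the parent axis is written as "..".

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)  # root is the <bookstore> element

# /bookstore/book : relative to the root element this is simply "book"
books = root.findall("book")

# //book/* : every child element of every book
book_children = [child.tag for child in root.findall(".//book/*")]

# //title[@lang='eng'] : attribute predicates are supported
english_titles = [t.text for t in root.findall(".//title[@lang='eng']")]

# /bookstore/book[price>35] : numeric comparison is not supported by
# ElementTree, so the predicate is evaluated in Python instead
expensive = [b.findtext("title") for b in books
             if float(b.findtext("price")) > 35]

# //price/parent::book/title : the parent step is written ".." here
titles_via_parent = [t.text for t in root.findall(".//price/../title")]
```

A full XPath 1.0 engine would evaluate the original expressions directly; the rewritten forms here are workarounds for the subset, not part of XPath itself.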
176
XML XML Path Syntax
The evaluation of a single step is relative to the current node (the context node)
- Axis name defines the relationship within the tree
- Node test specifies a node (or a group of nodes) within the axis
- Predicates refine the match by imposing additional conditions that the nodes must satisfy
It is possible to specify several paths at once by combining them with the | operator.
177
XPath expression: //title | //book[price>35]/price
<?xml version="1.0" encoding="ISO "?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
Selects every title element together with the price of each book priced above 35.
181
XML XML Path Syntax
XPath supports a set of operators which can be used in XPath expressions, for example: *, /, div, or, and, etc.
There is also a core function library with useful functions:
- Some of the functions provide general utilities: substring, floor, string-length
- Some of the functions operate on node sets or on the current context: position, count, id
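As a rough illustration of what these functions compute, the sketch below again assumes Python's standard xml.etree.ElementTree. That module supports positional predicates such as [1] and [last()], but not the core function library, so count, string-length and floor are stood in for by ordinary Python calls.

```python
import math
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# book[1] and book[last()] : positions are 1-based, as in XPath
first_title = root.find("book[1]/title").text
last_title = root.find("book[last()]/title").text

# count(//book) : done with len() on the Python side
book_count = len(root.findall(".//book"))

# string-length(//book[1]/title) : len() of the element text
first_title_length = len(first_title)

# floor(price) for each book
floored_prices = [math.floor(float(p.text)) for p in root.iter("price")]
```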
182
XML XSL Transformations (XSLT) (recap.)
The processing is template-based
- The XSL stylesheet defines templates, against which the nodes of the source document are matched
- In case of a match, the XSLT processor transforms the matching part into the result document according to the provided rules
- It can be viewed as functional expressions which evaluate to the final result
XSLT relies on several XML-related specifications: XML Namespaces and XPath
185
XML XSL Transformations (XSLT)
The XSL stylesheet is an XML document!
Therefore it should start with the XML declaration:
<?xml version="1.0" encoding="ISO "?>
The root element can be either xsl:stylesheet or xsl:transform.
The XSLT namespace must be declared:
<?xml version="1.0" encoding="ISO "?>
<xsl:stylesheet version="1.0" xmlns:xsl="…">
</xsl:stylesheet>
187
XML XSL Transformations (XSLT)
The XML document is linked to an XSL stylesheet through a simple declaration:
<?xml version="1.0" encoding="ISO "?>
<?xml-stylesheet type="text/xsl" href="myxsl.xsl"?>
The browser will then render your XML file through the stylesheet. The latest versions of the major browsers support XSL Transformations.
189
XML XSL Stylesheet structure
The XSL stylesheet consists of one or more templates. Templates hold the rules which are applied when XML nodes are matched.
The <xsl:template> element defines a template. Its match attribute is an XPath pattern that selects the nodes the template applies to:
<xsl:template match="XPathExpression">
</xsl:template>
195
XML XSL template rules
The XSLT processor recursively searches for a match
- The processor holds a list of nodes to match (initialized with the root node)
- For each node in the list, the best matching template is chosen (if one exists) and its logic is applied
- If the node is unmatched, the processor adds its children to the list and continues to the next node
- The process ends when the list is empty
Templates will usually trigger template matching of descendant nodes.
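The matching loop described above can be modelled in a few lines. This is a toy sketch, not an XSLT processor: it assumes that a template matches by exact tag name (real XSLT matches XPath patterns and ranks competing templates by priority), it starts from the root element rather than the document root, and it ignores text nodes. The function name transform and the templates dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import deque

def transform(root, templates):
    """Toy model of the matching loop; `templates` maps a tag name to a
    function that returns the output text for a matched element."""
    output = []
    pending = deque([root])  # the list of nodes still to be matched
    while pending:  # the process ends when the list is empty
        node = pending.popleft()
        template = templates.get(node.tag)
        if template is not None:
            # Matched: apply the template; children are not queued.
            output.append(template(node))
        else:
            # Unmatched: queue the children, keeping document order.
            pending.extendleft(reversed(list(node)))
    return output

STAFF = ET.fromstring(
    "<course_staff>"
    "<instructor><name>Hanan Shpungin</name></instructor>"
    "<teaching_assistant><name>Marian Doerk</name></teaching_assistant>"
    "</course_staff>"
)

templates = {
    "instructor": lambda n: "Instructor: " + n.findtext("name"),
    "teaching_assistant": lambda n: "TA: " + n.findtext("name"),
}
result = transform(STAFF, templates)
```

Since course_staff has no template, its children are queued and matched in turn; the name elements are never visited on their own because their parents were matched.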
197
XML Producing output
The result tree is created through static XML output and several XSL elements.
The <xsl:value-of> element
- Extracts the value of the selected node
- Has the format <xsl:value-of select="XPathExpression"/>
- If not absolute, the XPathExpression is relative to the current node (the node which matched the template)
198
XML Producing output
The result tree is created through static XML output and several XSL elements.
The <xsl:for-each> element
- Used to select every XML element of a specified node set
- Has the format <xsl:for-each select="XPathExpression"> </xsl:for-each>
- If not absolute, the XPathExpression is relative to the current node (the node which matched the template)
199
XML Producing output
The <xsl:for-each> element
For example:
<xsl:template match="/bookstore">
  <xsl:for-each select="book[price &lt; 10]">
    <i> <xsl:value-of select="title" /> : </i>
    <b> <xsl:value-of select="price" /> </b>
  </xsl:for-each>
</xsl:template>
200
XML Producing output
The <xsl:sort> element
The elements can be sorted by simply placing the <xsl:sort> element inside the <xsl:for-each> element.
201
XML Producing output
The <xsl:sort> element
For example:
<xsl:template match="/bookstore">
  <xsl:for-each select="book[price &lt; 10]">
    <xsl:sort order="descending" />
    <i> <xsl:value-of select="title" /> : </i>
    <b> <xsl:value-of select="price" /> </b>
  </xsl:for-each>
</xsl:template>
202
XML Producing output
The <xsl:sort> element
The elements can be sorted by simply placing the <xsl:sort> element inside the <xsl:for-each> element.
It is possible to have primary, secondary (and so on) keys by using several <xsl:sort> elements in sequence.
203
XML Producing output
The <xsl:if> and <xsl:choose> elements
Conditional processing in a template is supported by these two elements.
The <xsl:if> element is a simple condition:
<xsl:if test="TestExpression"> . . . </xsl:if>
204
XML Producing output
The <xsl:if> and <xsl:choose> elements
For example:
<xsl:template match="/bookstore">
  <xsl:for-each select="book[price &lt; 10]">
    <i> <xsl:value-of select="title" /> : </i>
    <b> <xsl:value-of select="price" /> </b>
    <xsl:if test="price &lt; 5">
      <blink> GREAT PRICE!!! </blink>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
205
XML Producing output
The <xsl:if> and <xsl:choose> elements
Conditional processing in a template is supported by these two elements.
The <xsl:choose> element is a more complex condition, built from <xsl:when> branches and an optional <xsl:otherwise>, which resembles the Java switch statement.
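For readers who think in imperative terms, the combination of for-each, sort, if and choose corresponds roughly to a filtered, sorted loop with conditionals. The sketch below is a Python analogue, not XSLT: it assumes the standard xml.etree.ElementTree module, invented titles and prices, a sort by numeric price, and hypothetical helper names price_of and label.

```python
import xml.etree.ElementTree as ET

# Titles and prices are made up so that each condition is exercised.
BOOKSTORE = ET.fromstring("""<bookstore>
  <book><title>Pocket XML</title><price>4.50</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
  <book><title>XML Basics</title><price>9.00</price></book>
</bookstore>""")

def price_of(book):
    return float(book.findtext("price"))

lines = []
# Analogue of <xsl:for-each select="book[price &lt; 10]"> containing
# <xsl:sort select="price" data-type="number" order="descending"/>
cheap = [b for b in BOOKSTORE.findall("book") if price_of(b) < 10]
for book in sorted(cheap, key=price_of, reverse=True):
    line = f"{book.findtext('title')}: {book.findtext('price')}"
    # Analogue of <xsl:if test="price &lt; 5">
    if price_of(book) < 5:
        line += " GREAT PRICE!!!"
    lines.append(line)

def label(book):
    """Analogue of <xsl:choose>: two <xsl:when> branches and <xsl:otherwise>."""
    price = price_of(book)
    if price < 5:
        return "bargain"
    elif price < 10:
        return "cheap"
    else:
        return "regular"

labels = [label(b) for b in BOOKSTORE.findall("book")]
```

Unlike this loop, an XSLT stylesheet has no mutable variables; each instruction simply contributes nodes to the result tree.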
207
XML Producing output
The <xsl:apply-templates> element
When the XSLT processor finds a match for a node, it does not automatically process that node's children.
It is possible to invoke the matching on some of the descendants of the current node:
<xsl:apply-templates select="XPathExpression" />
If not absolute, the XPathExpression is relative to the current node (the node which matched the template).
208
XML Producing output
The <xsl:apply-templates> element
When the XSLT processor finds a match for a node, it does not automatically process that node's children.
It is possible to invoke the matching on the children of the current node:
<xsl:apply-templates />
If the select attribute is omitted, the matching is applied to all the child nodes.
209
XML XSL summary
XML data is usually read by parsers
- Although XML is plain text which can be easily read, using a parser allows taking advantage of the structure of the document (e.g. SAX, DOM, pull parsing, etc.)
It is possible to transform XML documents into other XML documents
- For example, converting an XML document (e.g. a web feed) into an XHTML document (to be presented in a browser)
- XML documents can be transformed (and rendered) by using a family of languages called XSL
210
XML XSL summary
XSL is to XML what CSS is to HTML
- XML tags hold no information about what to display or how to display it
XSL is actually a family of languages
- These languages define the transformation and formatting rules for XML documents
- XSLT is a language for the transformation of XML documents
- XPath is a language for navigating in XML documents
- XSL-FO is a language for formatting XML documents
212
XML Well-formedness and validity (recap.)
A well-formed XML document
- A document is well-formed if it satisfies the XML syntax rules
- If a document violates the syntax, it is not considered to be an XML document
- Yes, it’s draconian – unlike HTML, where the browser is expected to produce a reasonable result even in the presence of severe errors
- If a document is not well-formed, the processor is required to stop and report an error
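The stop-and-report behaviour is easy to observe. A minimal sketch, assuming Python's standard xml.etree.ElementTree (a non-validating parser, so it checks well-formedness only); the helper name is_well_formed is made up.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return (True, None), or (False, error message) for an XML string."""
    try:
        ET.fromstring(text)
    except ET.ParseError as error:
        return False, str(error)
    return True, None

good = "<book><title>A nice book about XML</title></book>"
bad = "<book><title>A nice book about XML</book>"  # <title> is never closed

ok, _ = is_well_formed(good)
broken_ok, message = is_well_formed(bad)
print(ok, broken_ok, message)
```

The parser does not try to guess what the second document meant; it raises an error at the first violation.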
213
XML Well-formedness and validity (recap.)
A valid XML document
- In addition to being well-formed, an XML document may also be valid
- A valid document holds a reference to an XML schema, and the document follows the rules of that schema
- XML processors are classified as validating or non-validating, depending on whether or not they check XML documents for validity
- Document Type Definition (DTD) is just one of the many ways to write the grammar rules (schema) which define the validity of a document
214
XML Schemas and validation (recap.)
What is an XML schema?
An XML schema addresses the following aspects:
- the set of elements that may be used in a document
- which attributes may be applied to each element
- the order of child elements
- the allowable parent/child relationships
- the data types of elements and attributes
- etc.
215
XML Schemas and validation (recap.)
What is an XML schema? Defines the grammar rules of a document
DTD
- The oldest schema language for XML
- Quite simple to write and read
- Only string-like types are available for data; you cannot define a numeric type
- No complex types
- Very widely used
216
XML Schemas and validation (recap.)
What is an XML schema? Defines the grammar rules of a document
XML Schema Definition (XSD)
- Much more powerful than DTD
- XSD uses an XML-based format, which makes it easier to process with XML tools
- Complex and rich data typing (almost like a programming language)
- Detailed constraints on the logical structure of an XML document
220
XML Schemas and validation (recap.)
What is an XML schema? Defines the grammar rules of a document
Why use a schema at all?
- Your XML documents carry their own definitions
- It may serve as a specification of the data format in a shared project
- You can validate received data to avoid parsing errors
- You can validate your own data to avoid errors
223
XML Document Type Definition (DTD)
DTD is a set of declarations
<!DOCTYPE bookstore [
  <!ELEMENT bookstore (book*)>
  <!ELEMENT book (title,author,year)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT year (#PCDATA)>
  <!ATTLIST book price CDATA #REQUIRED>
]>
Declarations define the legal structure of a valid XML document.
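To see the declarations as a parser sees them, the following sketch feeds a document with this internal DTD to Python's standard xml.parsers.expat module and records each declaration. Expat is a non-validating parser: it reads the declarations but does not check the document against them. The book content and the price value are made up for illustration.

```python
import xml.parsers.expat

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE bookstore [
  <!ELEMENT bookstore (book*)>
  <!ELEMENT book (title,author,year)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT year (#PCDATA)>
  <!ATTLIST book price CDATA #REQUIRED>
]>
<bookstore>
  <book price="29.99">
    <title>A nice book about XML</title>
    <author>Hanan Shpungin</author>
    <year>2010</year>
  </book>
</bookstore>"""

declared_elements = []
declared_attributes = []

def on_element_decl(name, model):
    # Called once per <!ELEMENT ...> declaration.
    declared_elements.append(name)

def on_attlist_decl(element, attribute, attr_type, default, required):
    # Called once per attribute declared in an <!ATTLIST ...>.
    declared_attributes.append((element, attribute, attr_type, bool(required)))

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(DOCUMENT, True)
```

Checking the document against these declarations requires a validating parser, which the Python standard library does not include.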
224
XML Document Type Definition (DTD)
DTD is a set of declarations
There are several types of declarations
Element type declarations
- Name the allowable set of elements within the document
- Specify the allowed nesting and character content of the elements
225
XML Document Type Definition (DTD)
DTD is a set of declarations
There are several types of declarations
Element type declarations
Attribute type declarations
- Name the allowable set of attributes for each declared element
- Specify the value type or a strict set of possible values for each attribute
226
XML Document Type Definition (DTD)
DTD is a set of declarations
There are several types of declarations
Element type declarations
Attribute type declarations
Entity type declarations
- Used to define abbreviations (resembling #define in the C programming language) and to specify special characters
228
XML Document Type Definition (DTD)
DTD is a set of declarations
There are several types of declarations. The declarations are composed of building blocks with different types. In addition to the basic types (elements, attributes, entities), there are types for character data: CDATA and PCDATA.
CDATA is “Character Data”: text that WILL NOT be parsed by an XML parser. Simply put, entities and markup inside it will not be parsed.
229
XML Document Type Definition (DTD)
DTD is a set of declarations
There are several types of declarations. The declarations are composed of building blocks with different types. In addition to the basic types (elements, attributes, entities), there are types for character data: CDATA and PCDATA.
PCDATA is “Parsed Character Data”: text that WILL be parsed by an XML parser. Simply put, entities and markup inside it will be parsed.
230
XML Document Type Definition (DTD)
DTD is a set of declarations
Element type declarations
Elements are declared with an ELEMENT declaration as follows:
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
231
XML Document Type Definition (DTD)
DTD is a set of declarations
Element type declarations
Elements are declared with an ELEMENT declaration as follows:
<!ELEMENT element-name category>
category specifies that the element must have no content (EMPTY) or can have any content (ANY). For example:
<!ELEMENT br EMPTY>
232
XML Document Type Definition (DTD)
DTD is a set of declarations
Element type declarations
Elements are declared with an ELEMENT declaration as follows:
<!ELEMENT element-name (element-content)>
(element-content) specifies the possible content of the element in more detail. For example, a required sequence of child elements:
<!ELEMENT html (head, body)>
Each of the child elements above must appear exactly once, in the given order.
233
XML Document Type Definition (DTD)
DTD is a set of declarations
Element type declarations
Elements are declared with an ELEMENT declaration as follows:
<!ELEMENT element-name (element-content)>
(element-content) specifies the possible content of the element in more detail. For example, specifying the number of appearances:
<!ELEMENT book (title, dedication*, authors+, toc?, chapter+)>
Here * means zero or more, + means one or more, and ? means zero or one.
234
XML Document Type Definition (DTD)
DTD is a set of declarations
Element type declarations
Elements are declared with an ELEMENT declaration as follows:
<!ELEMENT element-name (element-content)>
(element-content) specifies the possible content of the element in more detail. For example, optional and mixed content:
<!ELEMENT book (title, authors+, toc?, (chapter|story)+)>
<!ELEMENT message (#PCDATA | from | to)*>
In mixed content, #PCDATA must come first and the group must be a repeatable choice.
236
XML Document Type Definition (DTD)
DTD is a set of declarations
Attribute type declarations
Attributes are declared with an ATTLIST declaration as follows:
<!ATTLIST element-name attribute-name attribute-type default-value>
attribute-type defines the type of the attribute value, e.g. CDATA, (val1|val2|…), ID, ENTITY
<!ATTLIST cd genre (jazz|rock|pop) … >
<!ATTLIST div id ID … >
237
XML Document Type Definition (DTD) DTD is a set of declarations
Attribute type declarations Attributes are declared with an ATTLIST declaration as follows <!ATTLIST element-name attribute-name attribute-type default-value> default-value can have the following values #REQUIRED - the attribute is required #IMPLIED - the attribute is not required #FIXED value - the attribute value is fixed value - the attribute’s default value is value
238
XML Document Type Definition (DTD) DTD is a set of declarations
Attribute type declarations Attributes are declared with an ATTLIST declaration as follows <!ATTLIST element-name attribute-name attribute-type default-value> default-value can have the following values: #REQUIRED, #IMPLIED, #FIXED value, value <!ATTLIST cd genre (jazz|rock|pop) #REQUIRED>
239
XML Document Type Definition (DTD) DTD is a set of declarations
Attribute type declarations Attributes are declared with an ATTLIST declaration as follows <!ATTLIST element-name attribute-name attribute-type default-value> default-value can have the following values: #REQUIRED, #IMPLIED, #FIXED value, value <!ATTLIST div id ID #IMPLIED>
240
XML Document Type Definition (DTD) DTD is a set of declarations
Attribute type declarations Attributes are declared with an ATTLIST declaration as follows <!ATTLIST element-name attribute-name attribute-type default-value> default-value can have the following values: #REQUIRED, #IMPLIED, #FIXED value, value <!ATTLIST payment type (cash|credit) "cash">
241
XML Document Type Definition (DTD) DTD is a set of declarations
Attribute type declarations Attributes are declared with an ATTLIST declaration as follows <!ATTLIST element-name attribute-name attribute-type default-value> default-value can have the following values: #REQUIRED, #IMPLIED, #FIXED value, value <!ATTLIST record version CDATA #FIXED "1.0">
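The four kinds of default-value can be seen side by side in one small document. This is only a sketch: the order element, the EMPTY content models and the id and note attributes are assumptions added to make the example complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE order [
  <!ELEMENT order   (payment, record)>
  <!ELEMENT payment EMPTY>
  <!ELEMENT record  EMPTY>
  <!-- plain default: an omitted type is treated as type="cash" -->
  <!ATTLIST payment type (cash|credit) "cash">
  <!-- #REQUIRED: every payment must carry an id (hypothetical attribute) -->
  <!ATTLIST payment id ID #REQUIRED>
  <!-- #IMPLIED: note may be left out, and no default is supplied -->
  <!ATTLIST payment note CDATA #IMPLIED>
  <!-- #FIXED: version, if written, must be exactly "1.0" -->
  <!ATTLIST record version CDATA #FIXED "1.0">
]>
<order>
  <payment id="p1"/>
  <record version="1.0"/>
</order>
```

Note the # before FIXED, REQUIRED and IMPLIED; without it the keyword is not recognized.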
242
XML Document Type Definition (DTD) DTD is a set of declarations
Entity type declarations Entities are declared with an ENTITY declaration, which can be internal: <!ENTITY entity-name "entity-value"> or external: <!ENTITY entity-name SYSTEM "URI/URL"> The entity is then used with the following syntax: &entity-name;
243
XML Document Type Definition (DTD) DTD is a set of declarations
Entity type declarations Entities are declared with an ENTITY declaration, which can be internal: <!ENTITY entity-name "entity-value"> For example: <!ENTITY coursenum "SENG513"> Usage: <course>&coursenum;</course>
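Put together as a complete document, the internal entity might look like the sketch below. The declaration of course as a #PCDATA element is an assumption added so that the DTD is complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE course [
  <!-- assumed content model for the example -->
  <!ELEMENT course (#PCDATA)>
  <!-- internal entity: the replacement text is stored in the DTD itself -->
  <!ENTITY coursenum "SENG513">
]>
<!-- a parser that reads the internal DTD reports the content as SENG513 -->
<course>&coursenum;</course>
```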
244
XML Document Type Definition (DTD) DTD is a set of declarations
Entity type declarations Entities are declared with an ENTITY declaration, which can be internal or external: <!ENTITY entity-name SYSTEM "URI/URL"> For example: <!ENTITY coursenum SYSTEM "
245
XML Document Type Definition (DTD)
DTD can be defined internally and externally Internal definition is within the XML document The DTD needs to be wrapped in a DOCTYPE definition as follows <!DOCTYPE root-element [element-declarations]> Where root-element is the root element of the XML document and element-declarations are the DTD declarations
246
XML Document Type Definition (DTD)
DTD can be defined internally and externally Internal definition is within the XML document For example <?xml version="1.0"?> <!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> ]> <note> ... </note>
247
XML Document Type Definition (DTD)
DTD can be defined internally and externally Internal definition is within the XML document External definition is referenced from within the XML document The DTD declarations appear in an external file which needs to be referenced as follows <!DOCTYPE root-element SYSTEM "filename">
248
XML Document Type Definition (DTD)
DTD can be defined internally and externally Internal definition is within the XML document External definition is referenced from within the XML document For example, <?xml version="1.0"?> <!DOCTYPE note SYSTEM "note.dtd"> <note> ... </note>
249
XML Document Type Definition (DTD) Example from
250
<!ELEMENT CATALOG (PRODUCT+)>
<!ELEMENT PRODUCT (SPECIFICATIONS+, OPTIONS?, PRICE+, NOTES?)>
<!ELEMENT SPECIFICATIONS (#PCDATA)>
<!ELEMENT OPTIONS (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>
<!ATTLIST PRODUCT NAME CDATA #IMPLIED>
<!ATTLIST PRODUCT CATEGORY (HandTool | Table | Shop-Professional) "HandTool">
<!ATTLIST PRODUCT PARTNUM CDATA #IMPLIED>
<!ATTLIST PRODUCT PLANT (Pittsburgh | Milwaukee | Chicago) "Chicago">
<!ATTLIST PRODUCT INVENTORY (InStock | Backordered | Discontinued) "InStock">
<!ATTLIST SPECIFICATIONS WEIGHT CDATA #IMPLIED>
<!ATTLIST SPECIFICATIONS POWER CDATA #IMPLIED>
<!ATTLIST OPTIONS FINISH (Metal | Polished | Matte) "Matte">
<!ATTLIST OPTIONS ADAPTER (Included | Optional | NotApplicable) "Included">
<!ATTLIST OPTIONS CASE (HardShell | Soft | NotApplicable) "HardShell">
<!ATTLIST PRICE MSRP CDATA #IMPLIED>
<!ATTLIST PRICE WHOLESALE CDATA #IMPLIED>
<!ATTLIST PRICE STREET CDATA #IMPLIED>
<!ATTLIST PRICE SHIPPING CDATA #IMPLIED>
<!ENTITY AUTHOR "John Doe">
<!ENTITY COMPANY "JD Power Tools, Inc.">
<!ENTITY
251
XML XML Schema
252
XML XML Schema XML Schema resembles DTD in what it provides:
defines the elements/attributes that can appear in a document; defines the parent/child relationships of elements; defines the number of children and their order; defines the possible content of an element; defines data types for elements and attributes; defines default and fixed values for elements and attributes
253
XML XML Schema XML Schema resembles DTD in what it provides:
However, it holds some advantages over DTD: it is written in XML; it supports more data types and data restrictions; it is possible to reference multiple XML Schemas; it supports namespaces; existing XML Schemas can be reused and extended. DTD is still more widely used due to its simplicity and clarity
254
XML XML Schema The XML Schema is an XML document with a root
element <schema> The typical form of an XML Schema is <?xml version="1.0"?> <xs:schema xmlns:xs=" . . . </xs:schema> Note the xs namespace defined in the root element
255
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex
256
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements These elements contain only text of any type and cannot contain any other elements or attributes Some restrictions may be applied Simple elements cannot be empty
257
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements These elements contain only text of any type and cannot contain any other elements or attributes Some restrictions may be applied Simple elements cannot be empty Complex elements Any element which is not simple, is complex
258
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type
259
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type The syntax of a simple element is <xs:element name="name" type="type"/> Where name is the name of the element and type is the text data type
260
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type The syntax of a simple element is <xs:element name="name" type="type"/> Most common data types are xs:string, xs:boolean, xs:decimal, xs:date, xs:integer, xs:time
261
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type For example: <xs:element name="dogname" type="xs:string"/> <xs:element name="age" type="xs:integer"/> <xs:element name="dateborn" type="xs:date"/> <dogname>Rocky</dogname> <age>5</age> <dateborn> </dateborn>
262
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type Simple elements can have default values or fixed values through the appropriate attributes <xs:element name="color" type="xs:string" default="red"/> <xs:element name="color" type="xs:string" fixed="red"/>
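A complete schema document holding a few simple elements could look roughly like this. The xs prefix is bound to the standard XML Schema namespace; the mood element is a hypothetical addition used to show a fixed value next to a default one, since a schema cannot declare two global elements with the same name.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="dogname"  type="xs:string"/>
  <xs:element name="age"      type="xs:integer"/>
  <!-- xs:date values are written as YYYY-MM-DD -->
  <xs:element name="dateborn" type="xs:date"/>
  <!-- default: an empty <color/> is treated as if it contained "red" -->
  <xs:element name="color" type="xs:string" default="red"/>
  <!-- fixed (hypothetical element): content, if given, must be "calm" -->
  <xs:element name="mood" type="xs:string" fixed="calm"/>
</xs:schema>
```

With this schema, <age>5</age> is valid while <age>five</age> is not, because the content must be a lexical xs:integer.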
263
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type Attributes resemble simple elements <xs:attribute name="name" type="type"/> where name and type are the same as for simple elements
264
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type It is possible to use default and fixed attributes <xs:attribute name="lang" type="xs:string" default="EN"/> <xs:attribute name="lang" type="xs:string" fixed="EN"/>
265
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type It is possible to use default and fixed attributes <xs:attribute name="lang" type="xs:string" default="EN"/> <xs:attribute name="lang" type="xs:string" fixed="EN"/> By default attributes are optional; to make them required it is possible to use the use="required" attribute in the declaration
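Attribute declarations are placed inside the complex type of the element that carries them. The sketch below shows one required attribute and one optional attribute with a default; the document element and the version attribute are hypothetical names chosen for the example.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- hypothetical element with no children, only attributes -->
  <xs:element name="document">
    <xs:complexType>
      <!-- must be present: <document lang="EN"/> -->
      <xs:attribute name="lang" type="xs:string" use="required"/>
      <!-- optional; treated as "1.0" when omitted -->
      <xs:attribute name="version" type="xs:string" default="1.0"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A default is only meaningful on an optional attribute, so use="required" and default are not combined on the same declaration.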
266
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type Data types may be restricted
267
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type Data types may be restricted The general form of a restriction on a textual value is <xs:element name="name"> (or <xs:attribute name="name">) <xs:simpleType> <xs:restriction base="type"> </xs:restriction> </xs:simpleType> </xs:element>
268
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type Data types may be restricted The restriction is imposed by using special elements, e.g. totalDigits, pattern, length, enumeration For example…
269
XML XML Schema Restriction example
<xs:element name="car">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Audi"/>
      <xs:enumeration value="Toyota"/>
      <xs:enumeration value="BMW"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
270
XML XML Schema Restriction example
<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
271
XML XML Schema Restriction example
<xs:element name="password">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z0-9]{8}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
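In a complete schema document a restriction sits inside an <xs:simpleType> element, which can either be nested anonymously in the element declaration or be given a name and referenced through the type attribute. The sketch below shows both styles; the type name ageType is an assumption made for the example.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- named simple type (hypothetical name), reusable by several declarations -->
  <xs:simpleType name="ageType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="age" type="ageType"/>

  <!-- anonymous simple type, usable only by this element -->
  <xs:element name="password">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <!-- exactly eight letters or digits -->
        <xs:pattern value="[a-zA-Z0-9]{8}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

Here <age>121</age> and <password>abc</password> would both be rejected by a validating parser.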
272
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type Data types may be restricted Complex elements can contain text, other elements, have attributes, and be empty - there are four kinds
273
XML XML Schema The XML Schema is an XML document with a root
element <schema> There are two types of elements: simple and complex Simple elements and attributes are of simple type Data types may be restricted Complex elements can contain text, other elements, have attributes, and be empty - there are four kinds: empty elements; elements containing only other elements; elements containing only text; elements containing both text and other elements
274
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration <xs:element name="name"> <xs:complexType> <order-indicator> <xs:element name="name" type="type" occurs-indicator /> <xs:element name="name" type="type" occurs-indicator /> </order-indicator> </xs:complexType> </xs:element>
275
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration <xs:element name="book"> <xs:complexType> <xs:sequence> <xs:element name="title" type="xs:string"/> <xs:element name="author" type="xs:string" maxOccurs="5"/> </xs:sequence> </xs:complexType> </xs:element>
276
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators
277
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements <xs:all> specifies that the child elements can appear in any order, and that each child element must occur only once
278
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements <xs:element name="person"> <xs:complexType> <xs:all> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname" type="xs:string"/> </xs:all> </xs:complexType> </xs:element>
279
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements <xs:all> <xs:choice> specifies that either one child element or another can occur
280
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements <xs:element name="employee"> <xs:complexType> <xs:choice> <xs:element name="teamleader" type="teamleader"/> <xs:element name="developer" type="developer"/> </xs:choice> </xs:complexType> </xs:element>
281
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements <xs:all> <xs:choice> <xs:sequence> specifies that the child elements must appear in a specific order
282
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements <xs:element name="person"> <xs:complexType> <xs:sequence> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element>
283
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements: <xs:all>, <xs:choice>, <xs:sequence> Occurrence indicators define how often an element can occur by using the maxOccurs and minOccurs attributes in the elements (to specify that an element can occur an unbounded number of times, use maxOccurs="unbounded")
284
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration <xs:element name="book"> <xs:complexType> <xs:sequence> <xs:element name="title" type="xs:string"/> <xs:element name="author" type="xs:string" minOccurs="1"/> <xs:element name="chapter" type="chapter" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> </xs:element>
285
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration There are several types of indicators Order indicators define the order of the elements: <xs:all>, <xs:choice>, <xs:sequence> Occurrence indicators define how often an element can occur by using the maxOccurs and minOccurs attributes in the elements (to specify that an element can occur an unbounded number of times, use maxOccurs="unbounded") Group indicators: elements and attributes can be grouped and referenced in later declarations
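Group indicators are written with <xs:group> for elements and <xs:attributeGroup> for attributes, and are pulled into a declaration with the ref attribute. The following is a minimal sketch; the names namegroup, idgroup, nickname and id are assumptions made for the example.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- element group: a reusable piece of a content model -->
  <xs:group name="namegroup">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname"  type="xs:string"/>
    </xs:sequence>
  </xs:group>

  <!-- attribute group: a reusable set of attributes -->
  <xs:attributeGroup name="idgroup">
    <xs:attribute name="id" type="xs:string" use="required"/>
  </xs:attributeGroup>

  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:group ref="namegroup"/>
        <!-- occurrence indicators: zero or more nicknames -->
        <xs:element name="nickname" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attributeGroup ref="idgroup"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

When minOccurs and maxOccurs are left out, both default to 1, which is why firstname and lastname must each appear exactly once.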
286
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration It is also possible to define an element and to reference it in later declarations
287
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration It is also possible to define an element and to reference it in later declarations The general form would be <xs:element name="name" type="typeName"/> <xs:complexType name="typeName"> </xs:complexType>
288
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration It is also possible to define an element and to reference it in later declarations <xs:element name="employee" type="personinfo"/> <xs:complexType name="personinfo"> <xs:sequence> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname" type="xs:string"/> </xs:sequence> </xs:complexType>
289
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration It is also possible to define an element and to reference it in later declarations As a result several elements can refer to the same complex type <xs:element name="developer" type="personinfo"/> <xs:element name="qa" type="personinfo"/> <xs:element name="manager" type="personinfo"/>
290
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration It is also possible to define an element and to reference it in later declarations As a result several elements can refer to the same complex type Complex types can also be extended
291
<xs:element name="employee" type="fullpersoninfo"/> <xs:complexType name="personinfo"> <xs:sequence> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname" type="xs:string"/> </xs:sequence> </xs:complexType> <xs:complexType name="fullpersoninfo"> <xs:complexContent> <xs:extension base="personinfo"> <xs:sequence> <xs:element name="address" type="xs:string"/> <xs:element name="city" type="xs:string"/> <xs:element name="country" type="xs:string"/> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType>
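An instance of employee under the extended type would look roughly as follows. An extension appends its elements after those of the base type, so the personinfo children come first. The text values are placeholders invented for the example.

```xml
<?xml version="1.0"?>
<employee>
  <!-- inherited from personinfo -->
  <firstname>Jane</firstname>
  <lastname>Doe</lastname>
  <!-- added by fullpersoninfo -->
  <address>1 Main Street</address>
  <city>Calgary</city>
  <country>Canada</country>
</employee>
```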
292
XML XML Schema Complex elements can be defined directly, which
resembles a DTD declaration It is also possible to define an element and to reference it in later declarations As a result several elements can refer to the same complex type Complex types can also be extended So far we have shown only one kind of complex element: elements that contain only other elements
293
XML XML Schema Empty complex elements are defined the same way,
but without providing any child elements
294
XML XML Schema Empty complex elements are defined the same way,
but without providing any child elements <xs:element name="book" type="booktype"/> <xs:complexType name="booktype"> <xs:attribute name="isbn" type="xs:string"/> </xs:complexType>
295
XML XML Schema Empty complex elements are defined the same way,
but without providing any child elements An element that contains only text is defined by extending or restricting a base simple type
296
XML XML Schema Empty complex elements are defined the same way,
but without providing any child elements An element that contains only text is defined by extending or restricting a base simple type <xs:element name="cdtime" type="timelength"/> <xs:complexType name="timelength"> <xs:simpleContent> <xs:extension base="xs:integer"> <xs:attribute name="units" type="xs:string" /> </xs:extension> </xs:simpleContent> </xs:complexType>
297
XML XML Schema Empty complex elements are defined the same way,
but without providing any child elements An element that contains only text is defined by extending or restricting a base simple type An element that contains both text and other elements needs to be declared as such by setting the mixed attribute in the <xs:complexType> element <xs:complexType name="message" mixed="true">
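A mixed complex type still lists its child elements in the usual way; mixed="true" only allows text to appear between them. The sketch below assumes child elements from and to and a type name messagetype, none of which are given on the slide.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="message" type="messagetype"/>

  <!-- hypothetical type name; mixed="true" permits text around the children -->
  <xs:complexType name="messagetype" mixed="true">
    <xs:sequence>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="to"   type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- an instance that should validate:
       <message>Sent by <from>Ann</from> to <to>Bob</to> today.</message> -->
</xs:schema>
```

The value of mixed, like every XML attribute value, has to be quoted.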
298
XML XML Schema A reference to an XML Schema is specified
in the root node of the XML document <?xml version="1.0"?> <root xmlns:xsi= xsi:noNamespaceSchemaLocation="url.xsd" > . . . </root>
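For a document without a target namespace, the reference could be written as in the sketch below. The xsi prefix is bound to the standard XML Schema instance namespace; the file name note.xsd and the text values are assumptions, and the element names are borrowed from the earlier note example.

```xml
<?xml version="1.0"?>
<!-- note.xsd is a hypothetical schema file located next to this document -->
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd">
  <to>Alice</to>
  <from>Bob</from>
  <heading>Reminder</heading>
  <body>The schema location is only a hint to the validator.</body>
</note>
```

When the schema declares a target namespace, xsi:schemaLocation is used instead, with the namespace URI and the schema location given as a pair.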
299
XML Schemas and validation summary What is an XML schema?
An XML schema addresses the following aspects: the set of elements that may be used in a document; what attributes may be applied to each element; the order of elements; the allowable parent/child relationships; the data types of elements/attributes; etc.
300
XML Schemas and validation summary What is an XML schema? DTD
Defines the grammar rules of a document DTD: the oldest schema language for XML; quite simple to write and read; only the string type is available for data, that is, you cannot define a numeric data type; no complex types; very widely used
301
XML Schemas and validation summary What is an XML schema?
Defines the grammar rules of a document XML Schema definition (XSD): much more powerful than DTD; uses an XML-based format, which makes it easier to read using XML tools; complex and rich data typing (almost like a programming language); detailed constraints on the logical structure of an XML document
302
XML Schemas and validation summary What is an XML schema?
Defines the grammar rules of a document Why use any kind of schema? Your XML documents can carry their own definitions; a schema may serve as a specification for the data format in a shared project; you can validate received data to avoid parsing errors; you can validate your own data to avoid errors
303
XML XML Schema vs. DTD XML Schema resembles DTD in what it provides
However, it holds some advantages over DTD: it is written in XML; it supports more data types and data restrictions; it is possible to reference multiple XML Schemas; it supports namespaces; existing XML Schemas can be reused and extended. DTD is still more widely used due to its simplicity and clarity