Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 1 XML Schemas Roger L. Costello XML Technologies

Thanks
Special thanks to the following people for their help in answering my endless barrage of questions:
–Henry Thompson
–Andrew Layman
–Noah Mendelsohn
–David Beech
–Rick Jelliffe
Many thanks to the following people, who have carefully reviewed this tutorial and notified me of bugs, typos, etc.:
–Oliver Becker
–Kit Lueder
–David Wang

Motivation
People are dissatisfied with DTDs:
–It's not XML
  You write your XML document in one language and specify the grammar of that document in another language (the DTD). This is bad and inconsistent.
–Limited datatype capability
  DTDs support a very limited capability for specifying datatypes. You can't, for example, say "I want this element to be of datatype int".
  There is a desire for a set of datatypes compatible with those found in databases.
–DTDs support 10 datatypes; XML Schemas support 37+ datatypes

Highlights of XML Schemas
XML Schemas are a tremendous advancement over DTDs:
–Enhanced datatypes
  37+ versus 10
  Can create your own datatypes
  Can define the lexical representation
  Example: "This element can contain strings of this form: ddd-dddd, where 'd' represents a 'digit'".
–Written in XML
  Enables use of XML tools
–Object-oriented
  Can extend or restrict a type (derive new type definitions on the basis of old ones)
–Can express sets: the child elements may occur in any order
–Can specify element content as being unique (keys on content) and uniqueness within a region
–Can define multiple elements with the same name but different content
–Can define elements with null content
–Can create equivalent elements, e.g., the "subway" element is equivalent to the "train" element

Problem
Convert BookCatalogue.dtd to the XML Schema notation.
–For this first example we will make a straight, one-to-one conversion, i.e., Title, Author, Date, ISBN, and Publisher will hold strings, just as they do in the DTD (next slide).
–We will gradually modify the example to give more custom types to these elements.

BookCatalogue.dtd

BookCatalogue1.xsd (xsd = XML Schema Definition; explanations on succeeding pages)
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
A book catalogue contains zero or more books
A Book has a Title, Author, Date, ISBN, and a Publisher

BookCatalogue1.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
A book catalogue contains zero or more books
A Book has a Title, Author, Date, ISBN, and a Publisher
This (default) namespace declaration says that all these elements come from this namespace (the XML Schema namespace). By virtue of the fact that they are associated with an element that comes from the XML Schema namespace, these attributes also come from the XML Schema namespace. (Recall that a default namespace doesn't apply to attributes.)

BookCatalogue1.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
A book catalogue contains zero or more books
A Book has a Title, Author, Date, ISBN, and a Publisher
targetNamespace says that the elements and types defined in this schema are in this namespace. This will be used in instance documents to indicate that the elements contained in the instance document come from this namespace.

BookCatalogue1.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
A book catalogue contains zero or more books
A Book has a Title, Author, Date, ISBN, and a Publisher
Here we are referencing a Book element. Where is that Book element defined? In what namespace? The cat: prefix indicates what namespace this element is defined in. cat: has been set to be the same as the targetNamespace.

xmlns:cat
The schema element has an attribute xmlns:cat. The cat: namespace prefix, as we noted, is used when one element references another element within the schema. This attribute is schema-dependent: if I write a schema for cars, for example, I might have xmlns:car. There is no way for xml-schema.dtd to account for all the possibilities, so in the schema we supplement xml-schema.dtd with the internal subset declaration:
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]>

BookCatalogue1.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
A book catalogue contains zero or more books
A Book has a Title, Author, Date, ISBN, and a Publisher
The annotation/info elements are optional. Their content is intended for human consumption, i.e., it's a comment.

Referencing a schema in an XML instance document
<BookCatalogue xmlns =" xmlns:xsi=" xsi:schemaLocation=" My Life and Times Paul McCartney July, McMillin Publishing...
–In the BookCatalogue element (the root element) I declare that the schemaLocation attribute comes from the XML Schema Instance namespace (xsi).
–The value of schemaLocation is a pair of values: a namespace and the URI of a schema. When the XML parser processes this XML document it uses the schemaLocation pair to determine the XML Schema that the document conforms to. It retrieves the schema at the URI specified in schemaLocation (in this example, BookCatalogue.xsd) and then opens that schema document to confirm that its targetNamespace value matches the namespace value shown in schemaLocation. In this case it does.
–I declare (using a default namespace) that all the elements in this XML instance document come from the same namespace.
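The pairing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular XML parser works: the function names are hypothetical, and the sample value reuses the schematic "A A/BookCatalogue.xsd" shape rather than a real namespace URI.

```python
def parse_schema_location(value: str) -> list[tuple[str, str]]:
    """Split a schemaLocation value into (namespace, schema URI) pairs."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold namespace/URI pairs")
    return list(zip(tokens[0::2], tokens[1::2]))


def namespace_matches(pair: tuple[str, str], target_namespace: str) -> bool:
    """Check the namespace half of a pair against a schema's targetNamespace."""
    return pair[0] == target_namespace


# Schematic values: namespace "A", schema document "A/BookCatalogue.xsd".
pairs = parse_schema_location("A A/BookCatalogue.xsd")
print(pairs)
print(namespace_matches(pairs[0], "A"))
```

A real processor would also fetch and parse the schema document; here the targetNamespace is simply passed in as a string.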

Referencing a schema in an XML instance document
BookCatalogue.xml: schemaLocation="A A/BookCatalogue.xsd" (uses elements from namespace A)
BookCatalogue.xsd: targetNamespace="A" (defines elements for namespace A)

Note about schemaLocation
schemaLocation is just a hint to the XML Parser.
"The choice of which schema to use ultimately lies with the consumer. If you as a consumer wish to rely on the schemaLocation idiom, then you should purchase/use processors that will honor that for you. The reason that some other processors might not provide that service to you is that they are designed to run in environments where it is impractical or undesirable to allow the document author to force reference to and use of some particular schema document." (Noah Mendelsohn)
For this tutorial I will assume that we are using an XML Parser which uses the schemaLocation idiom.

Note multiple levels of checking
BookCatalogue.xml against BookCatalogue1.xsd: does the XML document conform to the rules laid out in the XML schema?
BookCatalogue1.xsd against xml-schema.dtd: is the XML schema a valid XML document, i.e., does it conform to the rules laid out in the xml-schema DTD?
Do Lab 1

Alternate Schema
On the following slide is an alternate (equivalent) way of representing the schema shown previously.

BookCatalogue2.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="

BookCatalogue2.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
Anonymous type (no name)
Do Lab 2

Named Types
The following slide shows an alternate (equivalent) schema which uses named types.

BookCatalogue3.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="

Problem
Defining the Date element to be of type string is unsatisfactory: it allows any string value as the content of the Date element, including non-date strings. We would like to constrain the allowable content of Date.
–Modify the XML Schema to restrict the content of the Date element to just date values.
–Similarly, constrain the content of the ISBN element to one of these forms: ddddd-ddddd-ddddd or d-ddd-ddddd-d or d-dd-dddddd-d, where 'd' stands for 'digit'.

BookCatalogue4.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="

"Elements of this datatype will have string content. The string must conform to one of the three patterns listed.
- The first pattern is: 5 digits followed by a dash followed by 5 digits followed by another dash followed by 5 more digits.
- The second pattern is: 1 digit followed by a dash followed by 3 digits followed by another dash followed by 5 digits followed by another dash followed by 1 more digit.
- The third pattern is: 1 digit followed by a dash followed by 2 digits followed by another dash followed by 6 digits followed by another dash followed by 1 more digit."
These patterns are specified using Regular Expressions. In a few slides we will see more of the Regular Expression syntax.
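As a rough illustration, the three ISBN forms described above can be checked with Python's re module. The pattern strings below are my own transcription of the prose description, not the ones from BookCatalogue4.xsd, and the function name is hypothetical. re.fullmatch is used because a schema pattern must match the whole value.

```python
import re

# One regular expression per form described in the text.
ISBN_PATTERNS = [
    r"\d{5}-\d{5}-\d{5}",      # ddddd-ddddd-ddddd
    r"\d{1}-\d{3}-\d{5}-\d{1}",  # d-ddd-ddddd-d
    r"\d{1}-\d{2}-\d{6}-\d{1}",  # d-dd-dddddd-d
]


def is_valid_isbn(text: str) -> bool:
    """True if the whole string conforms to at least one of the patterns."""
    return any(re.fullmatch(p, text) for p in ISBN_PATTERNS)


# Made-up sample values, one per form plus one that fits none.
for sample in ["12345-67890-12345", "1-234-56789-0", "1-23-456789-0", "1234-5678"]:
    print(sample, is_valid_isbn(sample))
```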

Built-in Datatypes
Primitive datatypes (atomic, built-in), with example values:
–string: "Hello World"
–boolean: {true, false}
–float, double: 12.56E3, 12, 12560, 0, -0, INF, -INF, NAN
–decimal: 7.08
–timeInstant: T20:12:
–timeDuration: 1Y2M3DT10H30M12.3S
–recurringInstant: same format as timeInstant
–binary
–uri
Note: 'T' is the date/time separator; INF = infinity; NAN = not-a-number

Built-in Datatypes (cont.)
Generated datatypes (each a subtype of a primitive datatype), with example values:
–language: any valid xml:lang value, e.g., EN, FR, ...
–NMTOKEN: "house"
–NMTOKENS: "house barn yard"
–Name: "hello-there"
–NCName: part (no namespace qualifier)
–QNAME: book:part
–ID: token, must be unique
–IDREF: token which matches an ID
–IDREFS: list of IDREF
–ENTITY
–ENTITIES
–NOTATION
–integer: 456
–non-negative-integer: zero to infinity
–positive-integer: one to infinity
–non-positive-integer: negative infinity to zero
–negative-integer: negative infinity to negative one
–date
–time: 13:20:
Do Lab 3

Creating your own Datatypes
We can create a new datatype from an existing datatype (called the source or basetype) by specifying values for one or more of the source's optional facets.
Example: the string primitive datatype has nine optional facets: pattern, enumeration, length, maxlength, maxInclusive, maxExclusive, minlength, minInclusive and minExclusive.

Example
This creates a new datatype called 'TelephoneNumber'. Elements of this type can hold string values, but the string length must be exactly 8 characters long and the string must follow the pattern: ddd-dddd, where 'd' represents a 'digit'.
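To make the two constraints concrete, here is a small Python sketch that applies a length check and a pattern check in the way the description reads. It is an analogy in Python, not schema syntax; the function name and the sample phone numbers are made up.

```python
import re


def is_telephone_number(text: str) -> bool:
    """Apply both constraints: length of exactly 8 and the form ddd-dddd."""
    has_length_8 = len(text) == 8
    fits_pattern = re.fullmatch(r"\d{3}-\d{4}", text) is not None
    return has_length_8 and fits_pattern


print(is_telephone_number("555-1212"))   # fits both constraints
print(is_telephone_number("5551212"))    # wrong length, no dash
```

Note that the pattern ddd-dddd already implies a length of 8, so in this particular case the length check is redundant; it is kept to mirror the two facets named in the text.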

Facets of the Integer Datatype
Facets:
–maxInclusive
–maxExclusive
–minInclusive
–minExclusive
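The difference between the inclusive and exclusive facets is easy to get wrong, so here is a minimal Python sketch of how the four bounds behave. The function and parameter names are hypothetical, and the bounds in the usage lines (0 and 100) are arbitrary illustration values, not taken from the tutorial.

```python
from typing import Optional


def satisfies_range_facets(
    value: int,
    min_inclusive: Optional[int] = None,
    max_inclusive: Optional[int] = None,
    min_exclusive: Optional[int] = None,
    max_exclusive: Optional[int] = None,
) -> bool:
    """Return True if value respects every bound that was supplied."""
    if min_inclusive is not None and value < min_inclusive:
        return False
    if max_inclusive is not None and value > max_inclusive:
        return False
    if min_exclusive is not None and value <= min_exclusive:
        return False
    if max_exclusive is not None and value >= max_exclusive:
        return False
    return True


# With an inclusive upper bound the boundary value itself is allowed...
print(satisfies_range_facets(100, min_inclusive=0, max_inclusive=100))
# ...with an exclusive upper bound it is not.
print(satisfies_range_facets(100, min_inclusive=0, max_exclusive=100))
```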

Example
This creates a new datatype called 'EarthSurfaceElevation'. Elements of this type can hold an integer. However, the integer must have a value between and 29028, inclusive.

General Form of Datatype/Facet Usage
Facets: minInclusive, maxInclusive, minExclusive, maxExclusive, length, minlength, maxlength, pattern, enumeration, ...
Sources: string, boolean, float, double, decimal, timeInstant, timeDuration, recurringInstant, binary, uri, ...
Do Lab 4

Regular Expressions
Recall that the string datatype has a pattern facet. The value of a pattern facet is a regular expression. Below are some examples of regular expressions:
Regular Expression -> Example
–Chapter \d -> Chapter 1
–a*b -> b, ab, aab, aaab, …
–[xyz]b -> xb, yb, zb
–a?b -> b, ab
–a+b -> ab, aab, aaab, …
–[a-c]x -> ax, bx, cx

Regular Expressions (cont.)
Regular Expression -> Example
–[a-c]x -> ax, bx, cx
–[-ac]x -> -x, ax, cx
–[ac-]x -> ax, cx, -x
–[^0-9]x, \Dx -> any non-digit char followed by x
–Chapter\s\d -> Chapter followed by a blank followed by a digit
–(ho){2} there -> hoho there
–(ho\s){2} there -> "ho", a blank, "ho", a blank, then " there"
–.abc -> any char followed by abc
–(a|b)+x -> ax, bx, aax, bbx, abx, bax, ...

Regular Expressions (cont.)
Regular Expression -> Example
–a{1,3}x -> ax, aax, aaax
–a{2,}x -> aax, aaax, aaaax, …
–\w\s\w -> a word character (alphanumeric plus dash) followed by a space followed by a word character
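Several of the regular expressions above can be tried out with Python's re module. This is only an approximation: Python's regex dialect is not identical to the schema regex language (for instance, the character classes \w and \s may differ in detail, so they are left out here). re.fullmatch is used on the assumption that a pattern facet has to match the entire value.

```python
import re

# Regular expression -> strings the slides list as matching it.
EXAMPLES = {
    r"Chapter \d": ["Chapter 1"],
    r"a*b": ["b", "ab", "aab", "aaab"],
    r"[xyz]b": ["xb", "yb", "zb"],
    r"a?b": ["b", "ab"],
    r"a+b": ["ab", "aab", "aaab"],
    r"[a-c]x": ["ax", "bx", "cx"],
    r"[-ac]x": ["-x", "ax", "cx"],
    r"(ho){2} there": ["hoho there"],
    r"(a|b)+x": ["ax", "bx", "aax", "bbx", "abx", "bax"],
    r"a{1,3}x": ["ax", "aax", "aaax"],
    r"a{2,}x": ["aax", "aaax", "aaaax"],
}


def all_examples_match(examples: dict) -> bool:
    """True if every listed string fully matches its regular expression."""
    return all(
        re.fullmatch(pattern, text) is not None
        for pattern, texts in examples.items()
        for text in texts
    )


print(all_examples_match(EXAMPLES))
# Counter-examples: a+b needs at least one 'a'; a{1,3}x allows at most three.
print(re.fullmatch(r"a+b", "b"), re.fullmatch(r"a{1,3}x", "aaaax"))
```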

Example R.E.
[1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]
–[1-9]?[0-9] covers 0 to 99
–1[0-9][0-9] covers 100 to 199
–2[0-4][0-9] covers 200 to 249
–25[0-5] covers 250 to 255
This regular expression restricts a string to have values between 0 and 255. Such a R.E. might be useful in describing an IP address...

IP Datatype Definition
<pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3} ([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])">
Datatype for representing IP addresses. Examples, , , etc. This datatype restricts each field of the IP address to have a value between zero and 255, i.e., [0-255].[0-255].[0-255].[0-255]
Note: in the value attribute (above) the regular expression has been split over two lines. This is for readability purposes only. In practice the R.E. would all be on one line.
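The IP pattern can be exercised in Python by building it from the 0-to-255 sub-expression, written on one line as the note recommends. The variable names are hypothetical and the sample addresses are my own test inputs, not the examples from the slide.

```python
import re

# One field of the address: 0-99, 100-199, 200-249 or 250-255.
OCTET = r"([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"
# Three fields each followed by a dot, then a final field.
IP_ADDRESS = "(" + OCTET + r"\.){3}" + OCTET


def is_ip_address(text: str) -> bool:
    """True if the whole string has the form field.field.field.field."""
    return re.fullmatch(IP_ADDRESS, text) is not None


# Every integer from 0 to 255 fits the field pattern; 256 does not.
print(all(re.fullmatch(OCTET, str(n)) for n in range(256)))
print(re.fullmatch(OCTET, "256"))
print(is_ip_address("192.168.0.1"), is_ip_address("256.1.1.1"))
```

One consequence of the field pattern is that leading zeros such as "01" are rejected, since [1-9]? cannot match a zero.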

Regular Expression Parser
Want to test your skill in writing regular expressions? Go to:
–Dan Potter has created a nice tool which allows you to enter a regular expression and then enter a string. The parser will then determine if your string conforms to your regular expression.
Do Lab 5

Derived Types
We can subclass type definitions. We call the results "derived types":
–derive by extension: extend the parent type with more elements
–derive by restriction: constrain the parent type by constraining some of the elements to have a more restricted range of values, or a more restricted number of occurrences

BookCatalogue5.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="

Elements of type Book will have 5 child elements - Title, Author, Date, ISBN, and Publisher.

Using Derived Types
If we declare an element to be of type Publication then in the XML instance document that element's content can be either a Publication or a Book (since Book is a Publication).
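Readers who know object-oriented languages may find the following Python analogy helpful. It is an analogy only, not schema syntax, and it assumes that Publication carries Title, Author and Date while Book adds ISBN and Publisher, which is how the surrounding slides read. The date and ISBN values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    title: str
    author: str
    date: str


@dataclass
class Book(Publication):
    # Derivation by extension: Book adds two members to Publication's three.
    isbn: str
    publisher: str


def catalogue_entry(content: Publication) -> str:
    """Accepts a Publication, and therefore also anything derived from it."""
    return type(content).__name__


book = Book(
    title="Illusions The Adventures of a Reluctant Messiah",
    author="Richard Bach",
    date="(placeholder date)",
    isbn="(placeholder ISBN)",
    publisher="Dell Publishing Co.",
)
print(isinstance(book, Publication))
print(catalogue_entry(book))
```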

BookCatalogue6.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="

BookCatalogue2.xml
<Catalogue xmlns =" xmlns:xsi=" xsi:schemaLocation=" Staying Young Forever Karin Granstrom Jordan, M.D. December, 1999 Illusions The Adventures of a Reluctant Messiah Richard Bach Dell Publishing Co. The First and Last Freedom J. Krishnamurti Harper & Row

Illusions The Adventures of a Reluctant Messiah Richard Bach Dell Publishing Co.
"The CatalogueEntry element is declared to be of type Publication. Book is derived from Publication. Therefore, Book is a Publication. Thus, the content of CatalogueEntry can be a Book. However, to indicate that the content is not the source type, but rather a derived type, we need to specify the derived type that is being used. The attribute 'type' comes from the XML Schema Instance (xsi) namespace."
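As a rough sketch of what a consumer of such a document might do, the Python below reads the xsi:type attribute from each CatalogueEntry and falls back to the declared type when it is absent. Several things here are assumptions: the xsi namespace URI is the one from the final W3C Recommendation (the draft this tutorial follows may have used a different one), the sample document is a cut-down stand-in without the catalogue's own namespace, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: xsi namespace URI as published in the final Recommendation.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Simplified stand-in document; the real instance also declares a default namespace.
DOC = f"""<Catalogue xmlns:xsi="{XSI}">
  <CatalogueEntry xsi:type="Book"><Title>Illusions</Title></CatalogueEntry>
  <CatalogueEntry><Title>Staying Young Forever</Title></CatalogueEntry>
</Catalogue>"""


def declared_types(xml_text: str, default: str = "Publication") -> list:
    """Return the type in force for each CatalogueEntry element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced attribute names as "{namespace}local".
    key = "{" + XSI + "}type"
    return [entry.get(key, default) for entry in root.findall("CatalogueEntry")]


print(declared_types(DOC))
```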

Why is xsi:type Needed?
Why, in an instance document, do we need to indicate the derived type being used? Suppose that there are several types derived (by extension) from Publication, and some of the extended element declarations are optional. If xsi:type were not used in the instance document, it could be very difficult for an XML Parser to determine which derived type is being used.
Do Lab 6

Derive by Restriction
Elements of type SingleAuthorPublication will have 3 child elements - Title, Author, and Date. There must be exactly one Author element.

Restricting Derivations
Sometimes we may want to create a type and disallow all derivations of it, or just disallow extensions of it, or restrictions of it.
–This type cannot be extended nor restricted
–This type cannot be restricted
–This type cannot be extended

exact Types
If you define a type to be exact, then other types may derive from it. However, in the instance document derived types may not be used in its stead.

Schema:
Instance doc: Illusions The Adventures of a Reluctant Messiah Richard Bach Dell Publishing Co.
This permits Publication elements, as well as types derived from Publication, to be used as a child of Catalogue, e.g., Book.

Schema:
Instance doc: Illusions The Adventures of a Reluctant Messiah Richard Bach Dell Publishing Co.
This prohibits types derived by extension from being used as children of CatalogueEntry, e.g., this is not allowed.

exact Types
–exact="extension": prohibits types derived by extension from being used in its stead in instance documents
–exact="restriction": prohibits types derived by restriction from being used in its stead in instance documents
–exact="#all": prohibits all derived types from being used in its stead in instance documents
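The three settings amount to a small decision rule, sketched below in Python. The function name is hypothetical, and the empty string is used here (as an assumption) to stand for a type that does not set exact at all.

```python
def may_substitute(exact: str, derivation: str) -> bool:
    """Can a type derived by `derivation` stand in for a type with this `exact` value?

    exact:      "", "extension", "restriction" or "#all"
    derivation: "extension" or "restriction"
    """
    if exact == "#all":
        return False  # no derived type may be used in its stead
    return exact != derivation  # only the named kind of derivation is blocked


for exact in ["", "extension", "restriction", "#all"]:
    print(repr(exact), may_substitute(exact, "extension"), may_substitute(exact, "restriction"))
```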

Equivalence
Oftentimes in daily conversation there are several ways to express something.
–In Boston we use the words "T" and "subway" interchangeably. For example, "we took the T into town", or "we took the subway into town". Thus, "T" and "subway" are equivalent. Which one is used may depend upon what part of the state you live in, what mood you're in, or any number of factors.
We would like to be able to express this "equivalence" capability in XML Schemas.
–We would like to be able to declare in the schema an element called "subway", an element called "T", and state that "T" is equivalent to "subway". Thus, instance documents can use either "subway" or "T", depending on their preference.

equivClass
We can create an element (called the exemplar) and then create other elements which state that they are equivalent to the exemplar.
–subway is the exemplar
–T is equivalent
So what's the big deal? Wherever the exemplar can be used in an instance document, any equivalent element can also be used!

Schema:
Instance doc: Red Line
Alternative Instance doc: Red Line
Note: the type of every element of an equivalence class must be the same as or derived from the type of the exemplar element.

Enabling Interoperability
One very hard problem in today's world is how to enable different domains to communicate effectively when they don't all speak the same language. The equivClass is a good step towards enabling interoperability.
–It makes explicit the equivalence of one element with another element.

Enabling Interoperability
Imagine that you create an application. The application is designed to input XML documents. Suppose that within the XML document the application expects to see a element. Now suppose that it receives an XML document, but it contains a element. The application can go to the XML Schema and see that is equivalent to. Thus, it accepts and understands the XML document, even though it is using a different vocabulary than what the application was written to. Wow!
Further, over time new element declarations can be added to the schema. For example, Without any change to the application, the application will be able to process XML documents with a element by simply consulting the schema.
Mike Los coined the term "interoperability schema" to describe schemas used in the fashion described above.
Do Lab 7
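The idea of an application consulting the schema for equivalences can be sketched as a lookup table. Everything below is illustrative: the table stands in for information a real application would read from the schema, the element names "subway" and "T" are borrowed from the earlier equivalence slides, and "metro" is an invented later addition.

```python
# Equivalent element name -> exemplar, as the schema would declare it.
equiv_class = {"T": "subway"}


def exemplar_of(element_name: str, table: dict) -> str:
    """Map an element name to its exemplar; exemplars map to themselves."""
    return table.get(element_name, element_name)


def handle(element_name: str, table: dict) -> str:
    """An application written only against the 'subway' vocabulary."""
    if exemplar_of(element_name, table) == "subway":
        return "processed as subway"
    return "not understood"


print(handle("subway", equiv_class))
print(handle("T", equiv_class))
print(handle("metro", equiv_class))

# Later the schema gains a new equivalent element; the application code is unchanged.
equiv_class["metro"] = "subway"
print(handle("metro", equiv_class))
```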

Abstract Elements
You can declare an element to be abstract.
–Example.
An abstract element is a kind of template/placeholder. If an element is abstract then in the XML instance document that element may not appear.
–Example. may not appear in an instance document.
However, elements that are equivalent to the abstract type may appear in its place.

BookCatalogue7.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
Since the Publication element is abstract, only equivalent elements can appear as children of Catalogue. The Book and Magazine elements are equivalent to the Publication element.

BookCatalogue3.xml (an XML instance document of BookCatalogue7.xsd)
<Catalogue xmlns =" xmlns:xsi=" xsi:schemaLocation=" Natural Health December, 1999 Illusions The Adventures of a Reluctant Messiah Richard Bach Dell Publishing Co. The First and Last Freedom J. Krishnamurti Harper & Row

Attributes
On the next slide I show a version of the BookCatalogue DTD that uses attributes. On the following slide I show how this is implemented using XML Schemas.

BookCatalogue2.dtd
<!-- A Book has three attributes - Category, InStock, and Reviewer.
     Category must be either "autobiography", "non-fiction", or "fiction". A value must be supplied for this attribute whenever a Book element is used within a document.
     InStock can be either "true" or "false". If no value is supplied it defaults to "false".
     Reviewer contains the name of the reviewer. It defaults to "" if no value is supplied -->
<!ATTLIST Book
    Category (autobiography | non-fiction | fiction) #REQUIRED
    InStock (true | false) "false"
    Reviewer CDATA "">
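The effect of these attribute declarations on a Book element can be mimicked in Python: a required enumerated attribute, an enumerated attribute with a default, and a free-text attribute with an empty default. This is a sketch of the behaviour the DTD comment describes, with a hypothetical function name, and not how a DTD validator is implemented.

```python
CATEGORIES = {"autobiography", "non-fiction", "fiction"}


def book_attributes(supplied: dict) -> dict:
    """Validate supplied attributes and fill in the declared defaults."""
    category = supplied.get("Category")
    if category not in CATEGORIES:
        # Category is #REQUIRED and must be one of the enumerated values.
        raise ValueError("Category is required and must be one of " + ", ".join(sorted(CATEGORIES)))
    in_stock = supplied.get("InStock", "false")  # default "false"
    if in_stock not in {"true", "false"}:
        raise ValueError("InStock must be 'true' or 'false'")
    reviewer = supplied.get("Reviewer", "")  # default ""
    return {"Category": category, "InStock": in_stock, "Reviewer": reviewer}


print(book_attributes({"Category": "fiction"}))
print(book_attributes({"Category": "autobiography", "InStock": "true"}))
```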

BookCatalogue8.xsd
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
A book catalogue contains zero or more books
A Book has a Title, one or more Authors, a Date, an ISBN, and a Publisher
cont. --->

BookCatalogue8.xsd (cont.)
A Book has three attributes - Category, InStock, and Reviewer. Category must be either "autobiography", "non-fiction", or "fiction". A value must be supplied for this attribute whenever a Book element is used within a document. InStock can be either "yes" or "no". If no value is supplied it defaults to "no". Reviewer contains the name of the reviewer. It defaults to "" if no value is supplied.

Copyright (c) [2000]. Roger L. Costello. All Rights Reserved. 64 "The default value for maxOccurs is 1. Thus, this attribute is REQUIRED."

Alternate Schema
The next slide shows another way of expressing the last example.

<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
BookCatalogue9.xsd

Note about Attributes
The attribute declarations always come last, after the element declarations.
Do Lab 8

group Element
The group element enables you to group together element declarations.
Note: the group element is only for grouping element declarations; attribute declarations are not allowed in it.

<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:cat="
BookCatalogue10.xsd

Expressing Alternates
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:sig=" DTD: XML Schema:

Expressing Repeats
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:bi=" DTD: XML Schema:

Expressing Any Order
Problem: create an element, Book, which contains Author, Title, Date, ISBN, and Publisher, in any order. (Note: this cannot be done with DTDs.)
XML Schema: <schema xmlns=" targetNamespace="
order="all" means that Book must contain all five child elements, but they may occur in any order.
Note: minOccurs and maxOccurs are both fixed to a value of "1".
Do Lab 9
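As a rough illustration of what order="all" asks a validator to check, here is a minimal Python sketch. Python is not used in these slides; the function name `satisfies_all` and the two sample Book documents are hypothetical. It accepts a Book only when each of the five children named above appears exactly once, in whatever order.

```python
import xml.etree.ElementTree as ET

# The five children named on the slide; each must occur exactly once.
REQUIRED_CHILDREN = {"Author", "Title", "Date", "ISBN", "Publisher"}

def satisfies_all(book):
    """Return True if `book` has each required child exactly once, in any order."""
    names = [child.tag for child in book]
    # Five children whose names cover all five required names means
    # no child is missing and no child is repeated.
    return len(names) == len(REQUIRED_CHILDREN) and set(names) == REQUIRED_CHILDREN

# Hypothetical sample data: same children, shuffled order.
shuffled = ET.fromstring(
    "<Book><ISBN>i</ISBN><Title>t</Title><Publisher>p</Publisher>"
    "<Author>a</Author><Date>d</Date></Book>"
)
# Hypothetical sample data: three of the five children are missing.
missing = ET.fromstring("<Book><Title>t</Title><Author>a</Author></Book>")

print(satisfies_all(shuffled), satisfies_all(missing))
```

A real schema processor does far more (types, namespaces, attributes); the sketch only mirrors the "all five, any order, once each" rule.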

Empty Element
<schema xmlns=" targetNamespace=" Schema: Instance doc:
Do Lab 10

Uniqueness
DTDs provide the ID attribute datatype for uniqueness (i.e., an ID value must be unique throughout the entire document, and a validating XML parser enforces this).
XML Schema has much-enhanced uniqueness capabilities:
– enables you to define element content to be unique.
– enables you to define non-ID attributes to be unique.
– enables you to define a combination of element content and attributes to be unique.
– enables you to distinguish between unique and key.
– enables you to declare the range of the document over which something is unique.

unique vs key
Key: an element/attribute (or combination thereof) which is defined to be a key must
– always be present (minOccurs must be greater than zero)
– be non-nullable (i.e., nullable="false")
– be unique
Key implies unique, but unique does not imply key.

Illusions The Adventures of a Reluctant Messiah Richard Bach Dell Publishing Co. The First and Last Freedom J. Krishnamurti Harper & Row Roger Costello
<Book titleRef="Illusions The Adventures of a Reluctant Messiah" categoryRef="fiction"/>
+ = key + = keyref

<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:lib="

Library/BookCatalogue/Book
Library/CheckoutRegister/Book
BookCatalogue11.xsd

Defining a Key
Library/BookCatalogue/Book
"I define that the combination of the contents of Title plus the value of the attribute Category is to be unique throughout an XML instance document. The Title element and Category attribute lie within the XPath expression shown in the selector element."
Note: you can have one or more field elements. The key will be the combination of all the fields.
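The rule quoted above can be pictured with a small Python sketch of the check a validator would have to make. This is an illustration under stated assumptions, not the schema processor's actual algorithm: `field_values`, `key_violations`, and the sample Library document are hypothetical, and the selector is written relative to the root Library element because that is how Python's ElementTree resolves paths.

```python
import xml.etree.ElementTree as ET

def field_values(node, fields):
    """Read each field from `node`: '@name' is an attribute, anything else a child element."""
    return tuple(
        node.get(f[1:]) if f.startswith("@") else node.findtext(f)
        for f in fields
    )

def key_violations(root, selector, fields):
    """Return a list of problems; an empty list means the key constraint holds."""
    problems, seen = [], set()
    for node in root.findall(selector):
        values = field_values(node, fields)
        if None in values:
            # A key must always be present.
            problems.append(f"missing key field: {values}")
        elif values in seen:
            # The combination of all fields must be unique.
            problems.append(f"duplicate key: {values}")
        else:
            seen.add(values)
    return problems

# Hypothetical sample data: the first and third Book share Title + Category.
library = ET.fromstring("""
<Library>
  <BookCatalogue>
    <Book Category="fiction"><Title>Illusions</Title></Book>
    <Book Category="non-fiction"><Title>Illusions</Title></Book>
    <Book Category="fiction"><Title>Illusions</Title></Book>
  </BookCatalogue>
</Library>
""")

# Selector is relative to the root element (Library), hence no leading "Library/".
problems = key_violations(library, "BookCatalogue/Book", ["Title", "@Category"])
print(problems)
```

The second Book is accepted because only the combination of Title and Category has to be unique, not each field on its own.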

Defining a Reference to a Key
"I define that the values in the two attributes, titleRef and categoryRef, combined, are a reference to a key defined by bookKey. The two attributes are located at the XPath expression found in the selector element."
Note: if there are 2 fields in the key, then there must be 2 fields in the keyref; if there are 3 fields in the key, then there must be 3 fields in the keyref, etc. Further, the fields in the keyref must match those of the key in type and position.
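A keyref can be thought of as a lookup into the set of key values. The Python sketch below shows that idea only; the helper names and the sample document are hypothetical, the field-count rule from the note is enforced, and the type-matching rule is not modelled.

```python
import xml.etree.ElementTree as ET

def field_values(node, fields):
    """Read each field from `node`: '@name' is an attribute, anything else a child element."""
    return tuple(
        node.get(f[1:]) if f.startswith("@") else node.findtext(f)
        for f in fields
    )

def dangling_references(root, key_selector, key_fields, ref_selector, ref_fields):
    """Return every keyref value combination that matches no key."""
    if len(key_fields) != len(ref_fields):
        raise ValueError("a keyref needs the same number of fields as its key")
    keys = {field_values(n, key_fields) for n in root.findall(key_selector)}
    dangling = []
    for node in root.findall(ref_selector):
        values = field_values(node, ref_fields)
        if values not in keys:  # compared position by position
            dangling.append(values)
    return dangling

# Hypothetical sample data: the second checkout entry points at no catalogued Book.
library = ET.fromstring("""
<Library>
  <BookCatalogue>
    <Book Category="fiction"><Title>Illusions</Title></Book>
  </BookCatalogue>
  <CheckoutRegister>
    <Book titleRef="Illusions" categoryRef="fiction"/>
    <Book titleRef="Illusions" categoryRef="non-fiction"/>
  </CheckoutRegister>
</Library>
""")

dangling = dangling_references(
    library,
    "BookCatalogue/Book", ["Title", "@Category"],
    "CheckoutRegister/Book", ["@titleRef", "@categoryRef"],
)
print(dangling)
```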

Defining Uniqueness
Library/BookCatalogue/Book
"I define that the combination of the contents of Title plus the value of the attribute Category is to be unique throughout an XML instance document. The Title element and Category attribute lie within the XPath expression shown in the selector element."
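The difference between unique and key shows up when a field is absent. The Python sketch below contrasts the two on the same document. It assumes that a unique constraint simply ignores a Book whose fields are not all present, whereas a key reports it; that reading follows the "must always be present" rule for keys, but the function and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

def field_values(node, fields):
    """Read each field from `node`: '@name' is an attribute, anything else a child element."""
    return tuple(
        node.get(f[1:]) if f.startswith("@") else node.findtext(f)
        for f in fields
    )

def violations(root, selector, fields, is_key):
    """Check a unique constraint (is_key=False) or a key constraint (is_key=True)."""
    problems, seen = [], set()
    for node in root.findall(selector):
        values = field_values(node, fields)
        if None in values:
            if is_key:
                problems.append(f"missing field: {values}")
            continue  # unique: an incomplete entry is not compared (assumption)
        if values in seen:
            problems.append(f"duplicate: {values}")
        seen.add(values)
    return problems

# Hypothetical sample data: the second Book has no Category attribute.
library = ET.fromstring("""
<Library>
  <BookCatalogue>
    <Book Category="fiction"><Title>Illusions</Title></Book>
    <Book><Title>Illusions</Title></Book>
  </BookCatalogue>
</Library>
""")

fields = ["Title", "@Category"]
as_unique = violations(library, "BookCatalogue/Book", fields, is_key=False)
as_key = violations(library, "BookCatalogue/Book", fields, is_key=True)
print(as_unique, as_key)
```

The same document passes as unique and fails as key, which is one way to read "key implies unique, but unique does not imply key".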

Specifying scope of uniqueness in XML Schemas?
- The key/keyref/unique elements may be placed anywhere in your schema.
- Where you place them determines the scope of the uniqueness.
- In our example we placed the key/keyref elements at the top level (direct children of the schema element). Thus, we are stating that in an instance document the uniqueness is with respect to the entire document.
- On the other hand, if we were to place the key/keyref elements as children of the Book element, then the uniqueness would have a scope of just the Book element. Thus, over the entire instance document there may be repeats, but within any one Book element the values will be unique.

any Element
The any element allows any well-formed XML.
"The free-form element can contain any well-formed XML (WFXML)."
This is great!

anyAttribute
The anyAttribute element allows any attribute.
"The free-form element can have any attribute."

(Almost) any Element/Attribute
– allows any well-formed XML element, provided the element is in a namespace other than the one we're defining.
– allows any well-formed XML element, provided it's from the specified namespace.
– allows any well-formed XML element, provided it's from the namespace that we're defining.
– allows any attribute, provided the attribute is in a namespace other than the one we're defining.
– allows any attribute, provided it's from the specified namespace.
– allows any attribute, provided it's from the namespace that we're defining.
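The three namespace restrictions listed above come down to one comparison each. The Python sketch below shows them for elements; the rule labels "other" and "target", the function names, and the urn:example namespaces are all hypothetical and are not the schema syntax. It also assumes that an element with no namespace counts as "other", which the slide does not address.

```python
import xml.etree.ElementTree as ET

def namespace_of(element):
    """Return the namespace URI of an ElementTree element ('' if it has none)."""
    tag = element.tag  # ElementTree stores qualified names as '{uri}local'
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def wildcard_allows(element, rule, target_ns):
    """Decide whether `element` is admitted by a namespace restriction.

    rule == "other"  : any namespace except the one being defined
    rule == "target" : only the namespace being defined
    anything else    : `rule` is taken to be the one specified namespace URI
    """
    ns = namespace_of(element)
    if rule == "other":
        return ns != target_ns  # assumption: no namespace also counts as "other"
    if rule == "target":
        return ns == target_ns
    return ns == rule

TARGET = "urn:example:books"  # hypothetical namespace being defined
para = ET.fromstring('<h:p xmlns:h="urn:example:html">text</h:p>')

print(
    wildcard_allows(para, "other", TARGET),
    wildcard_allows(para, "target", TARGET),
    wildcard_allows(para, "urn:example:html", TARGET),
)
```

The same comparison applies to attributes, using the attribute's namespace instead of the element's.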

Validating an XML Instance Document
Validation can apply to the entire XML instance document, or to a single element.

<Library xmlns:book=" xmlns:person=" xmlns:xsi=" xsi:schemaLocation= " Examples/Book.xsd Illusions The Adventures of a Reluctant Messiah Richard Bach Dell Publishing Co. The First and Last Freedom J. Krishnamurti Harper & Row John Doe Sally Smith
Library.xml
Validating using two schemas

Assembling a Schema from Multiple Schema Documents
The include element allows you to bring in schema definitions from other schemas.
– They must all have the same namespace.
– The net effect of include is as though you had typed all the definitions directly into the containing schema.
… Book.xsd Person.xsd Library.xsd
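The "as though you had typed them in" effect can be mimicked on parsed documents. The Python sketch below is only an analogy for what include achieves, not how a schema processor implements it; the `include` function, the urn:example namespaces, and the tiny schema documents are hypothetical.

```python
import xml.etree.ElementTree as ET

def include(containing, included):
    """Copy the top-level definitions of `included` into `containing`.

    Both arguments are parsed <schema> elements. The merge is refused
    unless the two schemas share the same targetNamespace.
    """
    if containing.get("targetNamespace") != included.get("targetNamespace"):
        raise ValueError("included schema must have the same namespace")
    containing.extend(list(included))  # append every top-level child
    return containing

# Hypothetical miniature schema documents.
library = ET.fromstring(
    '<schema targetNamespace="urn:example:library"><type name="Library"/></schema>'
)
book = ET.fromstring(
    '<schema targetNamespace="urn:example:library"><type name="Book"/></schema>'
)
other = ET.fromstring(
    '<schema targetNamespace="urn:example:elsewhere"><type name="Person"/></schema>'
)

include(library, book)
names = [child.get("name") for child in library]
print(names)
```

Trying `include(library, other)` raises an error in this sketch, mirroring the same-namespace rule above.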

<!DOCTYPE schema SYSTEM "xml-schema.dtd"[ ]> <schema xmlns=" targetNamespace=" xmlns:lib="
Library.xsd

import Element
The import element allows you to reference elements in another namespace.
<schema xmlns=" targetNamespace=" xmlns:html=" <import namespace=" schemaLocation=" …

Declaration vs Definition
You make declarations of things that will be used in an XML instance document.
You make definitions of things that are just used in the schema document; a definition creates a new type.
Declarations:
- element declarations
- attribute declarations
Definitions:
- type definitions
- attribute group definitions
- model group definitions

Symbol Space
Each type definition creates a new symbol space.
Top-level element declarations are in the top-level symbol space.

<schema xmlns=" targetNamespace="
3 different titles

Version Management
The schema element has an optional attribute, version, which you may use to indicate the version of your schema (it is for your own private version documentation of the schema).
<schema xmlns=" targetNamespace=" version="1.0">...

<type> or <datatype>?
When do you use the type element and when do you use the datatype element?
– Use the type element when there are elements and/or attributes.
– Use the datatype element when it is a primitive type (string, integer, etc.).

null Content
You can indicate in a schema that an element may be null in the instance document.
Empty content vs null:
– Empty: an element with a type of content="empty" is constrained to have no content.
– null: an instance document element may indicate that no value is available by setting an attribute, xsi:null, equal to 'true'.
John Doe XML Schema: XML instance document:
The content of middle can be an NMTOKEN value or we can indicate that its content is undefined.
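From the reader's side, an application can tell "null" apart from "no content" by looking for the xsi:null attribute. The Python sketch below shows one way to do that. The namespace URI bound to the xsi prefix is a placeholder (the real one is truncated on the slides), and the first and last element names and the `describe` function are hypothetical; only middle and the name John Doe come from the slide.

```python
import xml.etree.ElementTree as ET

# Placeholder URI standing in for whatever the xsi prefix is bound to;
# it is NOT the real schema-instance namespace.
XSI = "urn:example:xsi"

def describe(element):
    """Classify an element as 'null', 'no content', or 'value'."""
    if element.get(f"{{{XSI}}}null") == "true":
        return "null"  # the document says no value is available
    if element.text is None and len(element) == 0:
        return "no content"  # nothing inside, but not flagged as null
    return "value"

# Hypothetical instance document built around the slide's John Doe example.
name = ET.fromstring(
    f'<name xmlns:xsi="{XSI}"><first>John</first>'
    '<middle xsi:null="true"/><last>Doe</last></name>'
)

print(describe(name.find("first")), describe(name.find("middle")))
```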

ur-type
The ur-type is the source for all types which do not specify a value for the source attribute. It is the type for all elements which do not specify a type.
– Example: