Presentation is loading. Please wait.

Presentation is loading. Please wait.

Copyright © [2001]. Roger L. Costello. All Rights Reserved. 1 XML Schemas (Primer)

Similar presentations


Presentation on theme: "Copyright © [2001]. Roger L. Costello. All Rights Reserved. 1 XML Schemas (Primer)"— Presentation transcript:

1 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 1 XML Schemas http://www.w3.org/TR/xmlschema-0/ (Primer) http://www.w3.org/TR/xmlschema-1/ (Structures) http://www.w3.org/TR/xmlschema-2/ (Datatypes) Roger L. Costello XML Technologies Course

2 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 2 Acknowledgements Special thanks to the following people for their help in answering my unending questions and/or for finding errors and making suggestions: –Henry Thompson –Jonathan Rich –Francis Norton –Rick Jelliffe –Curt Arnold

3 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 3 Viewing this Tutorial This tutorial is best viewed in slide show mode –Under the View menu select Slide Show Periodically you will see an icon at the bottom, right of the slide indicating that it is time to do a lab exercise. I strongly recommend that you stop and do the lab exercise to obtain the maximum benefit from this tutorial.

4 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 4 Schema Validators Command Line Only –XSV by Henry Thompson ftp://ftp.cogsci.ed.ac.uk/pub/XSV/XSV11.EXE Has a Programmatic API –xerces by Apache http://www.apache.org/xerces-j/index.html –Oracle Schema Validator http://technet.oracle.com/tech/xml/schema_java/index.htm GUI Oriented –XML Spy http://www.xmlspy.com –XML Authority http://www.extensibility.com

5 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 5 Purpose of XML Schemas (and DTDs) Specify: –the structure of instance documents "this element contains these elements, which contains these other elements, etc" –the datatype of each element/attribute "this element shall hold an integer within the range 0 to 12,000"

6 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 6 Motivation for XML Schemas People are dissatisfied with DTDs –It's a different syntax You write your XML (instance) document using one syntax and the DTD using another syntax --> bad, inconsistent –Limited datatype capability DTDs support a very limited capability for specifying datatypes. You can't, for example, say "I want the element to hold an integer with a range of 0 to 12,000" Desire a set of datatypes compatible with those found in databases –DTD supports 10 datatypes; XML Schemas supports 41+ datatypes

7 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 7 Highlights of XML Schemas XML Schemas are a tremendous advancement over DTDs: –Enhanced datatypes 41+ versus 10 Can create your own datatypes –Example: "This is a new type based on the string type and elements of this type must follow this pattern: ddd-dddd, where 'd' represents a digit". –Written in the same syntax as instance documents less syntax to remember –Object-oriented'ish Can extend or restrict a type (derive new type definitions on the basis of old ones) –Can express sets, i.e., can define the child elements to occur in any order –Can specify element content as being unique (keys on content) and uniqueness within a region –Can define multiple elements with the same name but different content –Can define elements with null content –Can define substitutable elements - e.g., the "subway" element is substitutable for the "train" element.

8 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 8 Let's Get Started! Convert BookCatalogue.dtd (next page) to the XML Schema notation –for this first example we will make a straight, one-to-one conversion, i.e., Title, Author, Date, ISBN, and Publisher will hold strings, just like is done in the DTD –We will gradually modify the schema to give more custom types to these elements

9 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 9 BookCatalogue.dtd

10 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 10 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> BookCatalogue.xsd (see example 01) xsd = Xml-Schema Definition (explanations on succeeding pages)

11 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 11 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <!ELEMENT Book (Title, Author, Date, ISBN, Publisher)>

12 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 12 <xsd:schema xmlns:xsd="http://www.w3.org/20001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> All XML Schemas have "schema" as the root element.

13 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 13 <xsd:schema xmlns:xsd="http://www.w3.org/20001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> The elements that are used to create an XML Schema come from the XMLSchema namespace

14 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 14 element complexType schema sequence http://www.w3.org/2001/XMLSchema XMLSchema Namespace string integer boolean

15 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 15 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> Says that the elements declared in this schema (BookCatalogue, Book, Title, Author, Date, ISBN, Publisher) are to go in this namespace

16 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 16 BookCatalogue Book Title Author Date ISBN Publisher http://www.publishing.org (targetNamespace) Publishing Namespace (targetNamespace)

17 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 17 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> This is referencing a Book element declaration. The Book in what namespace? Since there is no namespace qualifier it is referencing the Book element in the default namespace, which is the targetNamespace! The default namespace is http://www.publishing.org which is the targetNamespace!

18 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 18 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> This is a directive to instance documents which use this schema: Any elements used by the instance document which were declared by this schema must be namespace qualified by the namespace specified by targetNamespace.

19 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 19 Referencing a schema in an XML instance document <BookCatalogue xmlns ="http://www.publishing.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.publishing.org BookCatalogue.xsd"> My Life and Times Paul McCartney July, 1998 94303-12021-43892 McMillin Publishing... 1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the publishing namespace. 2. Second, with schemaLocation tell the schema-validator that the http://www.publishing.org namespace is defined in BookCatalogue.xsd. 3. Third, tell the schema-validator that schemaLocation attribute we are using is the one in the schema instance namespace.

20 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 20 Referencing a schema in an XML instance document BookCatalogue.xml BookCatalogue.xsd targetNamespace="A" schemaLocation="A BookCatalogue.xsd" - defines elements in namespace A - uses elements from namespace A

21 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 21 Note multiple levels of checking BookCatalogue.xmlBookCatalogue.xsd XMLSchema.xsd (schema-for-schemas) Validate that the xml document conforms to the rules described in BookCatalogue.xsd Validate that BookCatalogue.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas

22 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 22 Default Value for minOccurs and maxOccurs The default value for minOccurs is "1" The default value for maxOccurs is "1" Equivalent! Do Lab1

23 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 23 Qualify XMLSchema, Default targetNamespace In the last example, we explicitly qualified all elements from the XML Schema namespace. The targetNamespace was the default namespace. BookCatalogue Book Title Author Date ISBN Publisher http://www.publishing.org element annotation documentation complexType schema sequence http://www.w3.org/2001/XMLSchema

24 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 24 Default XMLSchema, Qualify targetNamespace Alternatively (equivalently), we can design our schema so that XMLSchema is the default namespace. BookCatalogue Book Title Author Date ISBN Publisher http://www.publishing.org element annotation documentation complexType schema sequence http://www.w3.org/2001/XMLSchema

25 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 25 <schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns:pub="http://www.publishing.org" elementFormDefault="qualified"> (see example01.1)

26 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 26 <schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns:pub="http://www.publishing.org" elementFormDefault="qualified"> Here we are referencing a Book element. Where is that Book element defined? In what namespace? The pub: prefix indicates what namespace this element is in. pub: has been set to be the same as the targetNamespace.

27 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 27 "pub" References the targetNamespace BookCatalogue Book Title Author Date ISBN Publisher http://www.publishing.org (targetNamespace) element annotation documentation complexType schema sequence http://www.w3.org/2001/XMLSchema pub Do Lab1.1

28 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 28 Alternate Schema In the previous examples we declared an element and then we ref’ed that element declaration. Instead, we can inline the element declarations. On the following slide is an alternate (equivalent) way of representing the schema shown previously, using inlined element declarations.

29 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 29 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> (see example 02)

30 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 30 Do Lab 2 (see example 02) Anonymous types (no name) <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified">

31 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 31 Named Types The following slide shows an alternate (equivalent) schema which uses a named type.

32 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 32 (see example 03) Named type <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified">

33 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 33 Please note that is equivalent to Element A references the complexType foo. Element A has the complexType definition inlined in the element declaration.

34 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 34 type Attribute or complexType Child Element, but not Both! An element declaration can have a type attribute, or a complexType child element, but it cannot have both a type attribute and a complexType child element. …

35 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 35 Summary of Declaring Elements (two ways to do it) A simple type (e.g., xsd:string) or the name of a complexType … 1 2 A nonnegative integer A nonnegative integer or "unbounded" Note: minOccurs and maxOccurs can only be used in nested (local) element declarations.

36 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 36 Problem Defining the Date element to be of type string is unsatisfactory (it allows any string value to be input as the content of the Date element, including non-date strings). We would like to constrain the allowable content that Date can have. Modify the BookCatalogue schema to restrict the content of the Date element to just date values (actually, year values. See next two slides). Similarly, constrain the content of the ISBN element to content of this form: ddddd-ddddd-ddddd or d-ddd-ddddd- d or d-dd-dddddd-d, where 'd' stands for 'digit'

37 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 37 The date Datatype A built-in datatype Elements declared to be of type date must follow this form: CCYY- MM-DD –range for CC is: 00-99 –range for YY is: 00-99 –range for MM is: 01-12 –range for DD is: 01-28 if month is 2 01-29 if month is 2 and the year is a leap year 01-30 if month is 4, 6, 9, or 11 01-31 if month is 1, 3, 5, 7, 8, 10, or 12 –Example: 1999-05-31 represents May 31, 1999

38 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 38 The year Datatype A built-in datatype Elements declared to be of type year must follow this form: CCYY –range for CC is: 00-99 –range for YY is: 00-99 –Example: 1999 indicates the year 1999

39 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 39 (see example 04) <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified">

40 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 40 "I am defining a new type that is a restricted form of the string type. Elements declared of this type must conform to one of the following patterns: - First Pattern: 5 digits followed by a dash followed by 5 digits followed by another dash followed by 5 more digits, or - Second Pattern: 1 digit followed by a dash followed by 3 digits followed by another dash followed by 5 digits followed by another dash followed by 1 more digit, or - Third Pattern: 1 digit followed by a dash followed by 2 digits followed by another dash followed by 6 digits followed by another dash followed by 1 more digit." These patterns are specified using Regular Expressions. In a few slides we will see more of the Regular Expression syntax.

41 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 41 or ? When do you use the complexType element and when do you use the simpleType element? –Use the complexType element when you want to define child elements and/or attributes of an element –Use the simpleType element when you want to create a new type that is a refinement of a built- in type (string, integer, etc)

42 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 42 Built-in Datatypes Primitive Datatypes –string –boolean –decimal –float –double –duration –dateTime –time –date –gYearMonth –gYear –gMonthDay Atomic, built-in –"Hello World" –{true, false, 1, 0} –7.08 – 12.56E3, 12, 12560, 0, -0, INF, -INF, NAN – P1Y2M3DT10H30M12.3S – format: CCYY-MM-DDThh:mm:ss – format: hh:mm:ss.sss – format: CCYY-MM-DD – format: CCYY-MM – format: CCYY – format: --MM-DD Note: 'T' is the date/time separator INF = infinity NAN = not-a-number

43 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 43 Built-in Datatypes (cont.) Primitive Datatypes –gDay –gMonth –hexBinary –base64Binary –anyURI –QName –NOTATION Atomic, built-in – format: ---DD (note the 3 dashes) – format: --MM-- –a hex string –a base64 string –http://www.xfront.com –a namespace qualified name –a NOTATION from the XML spec

44 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 44 Built-in Datatypes (cont.) Derived types –normalizedString –token –language –IDREFS –ENTITIES –NMTOKEN –NMTOKENS –Name –NCName –ID –IDREF –ENTITY –integer –nonPositiveInteger Subtype of primitive datatype – A string without tabs, line feeds, or carriage returns – String w/o tabs, l/f, leading/trailing spaces, consecutive spaces –any valid xml:lang value, e.g., EN, FR,... –must be used only with attributes –part (no namespace qualifier) –must be used only with attributes –456 –negative infinity to 0

45 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 45 Built-in Datatypes (cont.) Derived types –negativeInteger –long –int –short –byte –nonNegativeInteger –unsignedLong –unsignedInt –unsignedShort –unsignedByte –positiveInteger Subtype of primitive datatype – negative infinity to -1 – -9223372036854775808 to 9223372036854775807 – -2147483648 to 2147483647 – -32768 to 32767 – -127 to 128 – 0 to infinity – 0 to 18446744073709551615 – 0 to 4294967295 – 0 to 65535 –0 to 255 –1 to infinity Do Lab 3 Note: the following types can only be used with attributes (which we will discuss later): ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, and ENTITIES.

46 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 46

47 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 47 Creating your own Datatypes A new datatype can be defined from an existing datatype (called the "base" type) by specifying values for one or more of the optional facets for the base type. Example. The string primitive datatype has six optional facets: –pattern –enumeration –length –minLength –maxlength –whitespace (legal values: preserve, replace, collapse)

48 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 48 Example of Creating a New Datatype by Specifying Facet Values This creates a new datatype called 'TelephoneNumber'. Elements of this type can hold string values, but the string length must be exactly 8 characters long and the string must follow the pattern: ddd-dddd, where 'd' represents a 'digit'. (Obviously, in this example the regular expression makes the length facet redundant.)

49 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 49 Another Example This creates a new type called US-Flag-Colors. An element declared to be of this type must have either the value red, or white, or blue.

50 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 50 Facets of the Integer Datatype Facets: –pattern –enumeration –whitespace –maxInclusive –maxExclusive –minInclusive –minExclusive –totalDigits

51 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 51 Example This creates a new datatype called 'EarthSurfaceElevation'. Elements declared to be of this type can hold an integer. However, the integer is restricted to have a value between -1290 and 29035, inclusive.

52 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 52 General Form of Creating a New Datatype by Specifying Facet Values … Facets: - length - minlength - maxlength - pattern - enumeration - minInclusive - maxInclusive - minExclusive - maxExclusive... Sources: - string - boolean - float - double - decimal - timeDuration - recurringDuration - uriReference... See DatatypeFacets.html for a mapping of datatypes to their facets.

53 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 53 Multiple Facets - "and" them together, or "or" them together? An element declared to be of type TelephoneNumber must be a string of length=8 and the string must follow the pattern: 3 digits, dash, 4 digits. An element declared to be of type US-Flag-Color must be a string with a value of either red, or white, or blue. Patterns, enumerations => "or" them together All other facets => "and" them together

54 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 54 Element Containing a User- Defined Simple Type Example. Create a schema element declaration for an elevation element. Declare the elevation element to be an integer with a range -1290 to 29035 5240 Here's one way of declaring the elevation element:

55 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 55 Element Containing a User- Defined Simple Type (cont.) Here's an alternative method for declaring elevation:

56 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 56 Summary of Declaring Elements (three ways to do it) … 1 2 … 3

57 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 57 Annotating Schemas The element is used for documenting the schema, both for humans and for programs. –Use for providing a comment to humans –Use for providing a comment to programs The content is any well-formed XML Note that annotations have no effect on schema validation This constraint is not expressible with XML Schema: The value of element A must be greater than the value of element B. So, we need to use a separate tool (e.g., Schematron) to check this constraint. We will express this constraint in the appinfo section (below). A must be greater than B

58 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 58 Where Can You Put Annotations? You cannot put annotations at any location in the schema. Annotations are restricted to occur at the beginning of the content model of the component that you are annotating.

59 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 59 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> Can only put annotations at these locations Suppose that we want to annotate, say, the Date element declaration. What do we do? See next page...

60 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 60 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> This is how to annotate the Date element!

61 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 61 Code to check the structure and content of the data Code to actually do the work "In a typical program, up to 60% of the code is spent checking the data!" - source unknown Save $$$ using XML Schemas Continued -->

62 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 62 Code to check the structure and content of the data Code to actually do the work If your data is structured as XML, and there is a schema, then you can hand the data-checking task off to a schema validator. Thus, your code is reduced by up to 60%!!! Big $$ savings! Save $$$ using XML Schemas (cont.)

63 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 63 Classic use of XML Schemas (Trading Partners - B2B) Supplier Consumer P.O. Schema Validator P.O. Schema Software to Process P.O. "P.O. is okay" P.O. (Schema at neutral web site)

64 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 64 What are XML Schemas? Data Model With XML Schemas you specify how your XML data will be organized, and the datatypes of your data. That is, with XML Schemas you model your XML data A Contract Organizations agree to structure their XML documents in conformance with the XML Schema. Thus, the XML Schema acts as a contract A rich source of metadata An XML Schema document contains lots of data about the data in the XML documents, i.e., XML Schemas contain metadata

65 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 65 No Limits The previous slide showed the classic use of XML Schemas - to validate your data (so that your program doesn't have to do it) However, there are many other uses for XML Schemas. Schemas are a wonderful source of metadata. Truly, your imagination is the only limit on its usefulness. On the next slide I show how to use the metadata provided by XML Schemas to create a GUI. The slide after that shows how to automatically generate an API using XML Schema metadata. Following that is a slide showing how to create a "smart editor" using XML Schemas.

66 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 66 XML Schema --> GUI P.O. Schema GUI Builder P.O. HTML Supplier Web Server cf: one schema  many styleSheets  many outputs [format] many schemas  GUI Builder  one uniform output [format] (one stylesheet for many XML schema instances)

67 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 67 schema-for schema schema A schema A instance (XML document) GUIBuilder (a StyleSheet for instance of schema-for schema) XSL StyleSheet for instance of schema A Schema A GUI Schema A API etc. various output

68 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 68 XML Schema --> API P.O. Schema API Builder P.O. API

69 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 69 XML Schema --> Smart Editor P.O. Schema Smart Editor (XML Spy) Helps you build your instance documents. For example, it pops up a menu showing you what is valid next. It knows this by looking at the XML Schema!

70 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 70 XML Schema Validate XML documents Automatic GUI generation Automatic API generation ??? Limited only by your imagination Do Lab 4 Smart Editor

71 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 71 Regular Expressions Recall that the string datatype has a pattern facet. The value of a pattern facet is a regular expression. Below are some examples of regular expressions: Regular Expression - Chapter \d - a*b - [xyz]b - a?b - a+b - [a-c]x Example - Chapter 1 - b, ab, aab, aaab, … - xb, yb, zb - b, ab - ab, aab, aaab, … - ax, bx, cx

72 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 72 Regular Expressions (cont.) Regular Expression –[a-c]x –[-ac]x –[ac-]x –[^0-9]x –\Dx –Chapter\s\d –(ho){2} there –(ho\s){2} there –.abc –(a|b)+x Example –ax, bx, cx –-x, ax, cx –ax, cx, -x – any non-digit char followed by x – Chapter followed by a blank followed by a digit –hoho there – any (one) char followed by abc –ax, bx, aax, bbx, abx, bax,...

73 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 73 Regular Expressions (concluded) a{1,3}x a{2,}x \w\s\w ax, aax, aaax aax, aaax, aaaax, … word character (alphanumeric plus dash) followed by a space followed by a word character

74 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 74 Example R.E. [1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5] 0 to 99 100 to 199 200 to 249250 to 255 This regular expression restricts a string to have values between 0 and 255. … Such a R.E. might be useful in describing an IP address...

75 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 75 IP Datatype Definition <xsd:pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3} ([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"> Datatype for representing IP addresses. Examples, 129.83.64.255, 64.128.2.71, etc. This datatype restricts each field of the IP address to have a value between zero and 255, i.e., [0-255].[0-255].[0-255].[0-255] Note: in the value attribute (above) the regular expression has been split over two lines. This is for readability purposes only. In practice the R.E. would all be on one line.

76 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 76 Regular Expression Parser Want to test your skill in writing regular expressions? Go to: http://www.xfront.org/xml-schema/ –Dan Potter has created a nice tool which allows you to enter a regular expression and then enter a string. The parser will then determine if your string conforms to your regular expression. Do Lab 5

77 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 77 Derived Types We can do a form of subclassing complexType definitions. We call this "derived types" –derive by extension: extend the parent complexType with more elements –derive by restriction: constrain the parent complexType by constraining some of the elements to have a more restricted range of values, or a more restricted number of occurrences.

78 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 78 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> Note that the Book type extends the Publication type, i.e., doing Derive by Extension (see example05)

79 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 79 Notice that an element and a type can have the same name! (Later we will see that element declarations and type definitions reside in different Symbol Spaces.)

80 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 80 Elements declared to be of type Book will have 5 child elements - Title, Author, Date, ISBN, and Publisher. Note that the elements in the derived type are appended to the elements in the base type. Do Lab 6

81 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 81 Derive by Restriction Elements of type SingleAuthorPublication will have 3 child elements - Title, Author, and Date. However, there must be exactly one Author element. Note that in the restriction type you must repeat all the declarations from the base type.

82 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 82 Derive by Restriction (cont.) You might (legitimately) ask: –why do I have to repeat all the declarations from the base type? Why can't I simply show the delta (i.e., show those declarations that are changed)? –What's the point of doing derived by restriction if I have to repeat everything? I'm certainly not saving on typing. Answer: –Even though you have to retype everything in the base type there are advantages to explicitly associating the type with the base type. Later we will see that an element’s content model may be substituted by the content model of derived types. Thus, the content of an element that has been declared to be of type Publication can be substituted with a SingleAuthorPublication content since SingleAuthorPublication derives from Publication. We will discuss this type substitutability in detail later.

83 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 83 Prohibiting Derivations Sometimes we may want to create a type and disallow all derivations of it, or just disallow extension derivations, or disallow restriction derivations. –Rationale: "For example, I may create a complexType and make it publicly available for others to use. However, I don't want them to extend it with their proprietary extensions or constrict it to remove, say, copyright information." (Jon Cleaver) Publication cannot be extended nor restricted Publication cannot be restricted Publication cannot be extended

84 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 84 Terminology: Declaration vs Definition In a schema: –You declare elements and attributes. Schema components that are declared are used in an XML instance document. –You define components that are used just within the schema document(s) Declarations: - element declarations - attribute declarations Definitions: - type definitions - attribute group definitions - model group definitions

85 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 85 Terminology: Global versus Local Global element declarations, global type definitions: –These are element declarations/type definitions that are immediate children of Local element declarations, local type definitions: –These are element declarations/type definitions that are nested within other elements/types.

86 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 86 Global type definition Global element declaration Local element declarations Local type definition <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified">

87 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 87 Global or Local … What's the Big Deal? So what if an element or type is global or local. What practical impact does it have? –Answer: only global elements/types can be referenced (i.e., reused). Thus, if an element/type is local then it is effectively invisible to the rest of the schema (and to other schemas).

88 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 88 Element Substitutability Oftentimes in daily conversation there are several ways to express something. –In Boston we use the words "T" and "subway" interchangeably. For example, "we took the T into town", or "we took the subway into town". Thus, "T" and "subway" are substitutable. Which one is used may depend upon what part of the state you live in, what mood you're in, or any number of factors. We would like to be able to express this substitutability in XML Schemas. –That is, we would like to be able to declare in the schema an element called "subway", an element called "T", and state that "T"may be substituted for "subway". Instance documents can then use either or, depending on their preference.

89 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 89 substitutionGroup We can define a group of substitutable elements (called a substitutionGroup) by declaring an element (called the head) and then declaring other elements which state that they are substitutable for the head element. subway is the head element T is substitutable for subway So what's the big deal? - Anywhere a head element can be used in an instance document, any member of the substitutionGroup can be substituted!

90 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 90 Red Line Schema: Instance doc: Red Line Alternative instance doc (substitute T for subway): This example shows the element being substituted with the element.

91 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 91 International Clients We can use substitutionGroups to create tags customized for our international clients. On the next slide is shown a Spanish version of the tags.

92 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 92 Red Line Schema: Instance doc: Linea Roja Alternative instance doc (customized for our Spanish clients):

93 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 93 Notes about using substitutionGroup The elements that are declared to be in the substitution group (e.g., subway and T) must be declared as global elements If the type of a substitutionGroup element is the same as the head element then you can omit it (the type) –In our Subway example we could have omitted the type attribute in the declaration of the T element since it is the same as Subway’s type (string).

94 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 94 Notes about using substitutionGroup (cont.) The type of every element in the substitutionGroup must be the same as, or derived from, the type of the head element. This type must be the same as "xxx" or, it must be derived from "xxx".

95 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 95 Blocking Element Substitution An element may wish to block other elements from substituting with it. This is achieved by adding a block attribute.

96 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 96 Red Line Schema: Instance doc: Red Line Not allowed! Do Lab 7

97 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 97 Attributes On the next slide I show a version of the BookCatalogue DTD that uses attributes. Then, on the following slide I show how this is implemented using XML Schemas.

98 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 98 <!ATTLIST Book Category (autobiography | non-fiction | fiction) #REQUIRED InStock (true | false) "false" Reviewer CDATA " "> BookCatalogue2.dtd

99 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 99 (see example06) InStock (true | false) "false" Reviewer CDATA " " Category (autobiography | non-fiction | fiction) #REQUIRED

100 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 100 "Instance documents are required to have the Category attribute (as indicated by use="required"). The value of Category must be either autobiography, non-fiction, or fiction (as specified by the enumeration facets)." Note: attributes can only have simpleTypes (i.e., attributes cannot have child elements).

101 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 101 Summary of Declaring Attributes (two ways to do it) required optional prohibited default value when use is optional; or fixed value use =required or optional. xsd:string xsd:integer xsd:boolean... … 1 2

102 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 102 use --> use it only with Local Attribute Declarations The "use" attribute only makes sense in the context of an element declaration. Example: "for each Book element, the Category attribute is required". When declaring a global attribute do not specify a "use"

103 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 103 … … Local attribute declaration. Must have a "use“ or a default or a fixed. Global attribute declaration. Must NOT have a "use" ("use" only makes sense in the context of an element)

104 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 104 Alternate Schema On the next slide is another way of expressing the last example - the attributes are inlined within the Book declaration rather than being separately defined in an attributeGroup. (I only show a portion of the schema - the Book element declaration.)

105 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 105 (see example07)

106 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 106 Notes about Attributes The attribute declarations always come last, after the element declarations. The attributes are always with respect to the element that they are defined (nested) within. … "bar and boo are attributes of foo"

107 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 107 These attributes apply to the element they are nested within (Book) That is, Book has three attributes - Category, InStock, and Reviewer. Do Lab 8.a,

108 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 108 Element with Simple Content and Attributes Example. Consider this: 5440 The elevation element has these two constraints: - it has a simple (integer) content - it has an attribute called units How do we declare elevation? (see next slide)

109 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 109 elevation contains an attribute. - therefore, we must use However, elevation does not contain child elements (which is what we generally use to indicate). Instead, elevation contains simpleContent. We wish to extend the simpleContent (an integer) with an attribute.

110 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 110 elevation - use Stronger Datatype In the declaration for elevation we allowed it to hold any integer. Further, we allowed the units attribute to hold any string. Let's restrict elevation to hold an integer with a range 0 - 12,000 and let's restrict units to hold either the string "feet" or the string "meters"

111 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 111

112 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 112 Summary of Declaring Elements 1. Element with Simple Content. Declaring an element using a built-in type: Declaring an element using a user-defined simpleType: An alternative formulation of the above flag example is to inline the simpleType definition:

113 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 113 Summary of Declaring Elements (cont.) 2. Element Contains Child Elements Defining the child elements inline: An alternate formulation of the above Person example is to create a named complexType and then use that type:

114 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 114 Summary of Declaring Elements (cont.) 3. Element Contains a ComplexType that is an Extension of another ComplexType

115 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 115 Summary of Declaring Elements (cont.) 4. Element Contains a ComplexType that is a Restriction of another ComplexType

116 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 116 Summary of Declaring Elements (concluded) Do Lab 8.b, 8.c 5. Element Contains Simple Content and Attributes Example. Large, green, sour

117 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 117 group Element The group element enables you to group together element declarations. Note: the group element is just for grouping together element declarations, no attribute declarations allowed!

118 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 118 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> (see example08)

119 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 119 Note about group Group definitions must be global...... Cannot inline the group definition. Instead, you must use a ref here and define the group globally.

120 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 120 Expressing Alternates DTD: XML Schema: <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.travel.org" xmlns="http://www.travel.org" elementFormDefault="qualified"> (see example 9)

121 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 121 Expressing Repeats <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.binary.org" xmlns="http://www.binary.org" elementFormDefault="qualified"> DTD: XML Schema: Notes: 1. An element can fix its value, using the fixed attribute. 2. When you don't specify a value for minOccurs, it defaults to "1". Same for maxOccurs. See the last example (transportation) where we used a element with no minOccurs or maxOccurs. (see example 10)

122 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 122 Expressing Any Order <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> XML Schema: Problem: create an element, Book, which contains Author, Title, Date, ISBN, and Publisher, in any order (Note: this cannot be easily done with DTDs). means that Book must contain all five child elements, but they may occur in any order. (see example 11)

123 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 123 Constraints on using Elements declared within must have a maxOccurs value of "1" (minOccurs can be either "0" or "1") Do Lab 9

124 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 124 Empty Element <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.photography.org" xmlns="http://www.photography.org" elementFormDefault="qualified"> Schema: Instance doc (snippet): Do Lab 10 DTD: (see example 12)

125 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 125 Using and <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.binary.org" xmlns="http://www.binary.org" elementFormDefault="qualified"> DTD: XML Schema:

126 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 126 No targetNamespace (noNamespaceSchemaLocation) Sometimes you may wish to create a schema but without a targetNamespace The targetNamespace attribute is actually an optional attribute of. Thus, if you don’t want to specify a namespace for your schema then simply don’t use the targetNamespace attribute. Consequences of having no namespace –1. In the instance document don’t qualify the elements. –2. In the instance document, instead of using schemaLocation use noNamespaceSchemaLocation.

127 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 127 (see example13) Note that there is no targetNamespace attribute, and note that there is no longer a default namespace. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">

128 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 128 <BookCatalogue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation= "BookCatalogue.xsd"> My Life and Times Paul McCartney 1998 94303-12021-43892 McMillin Publishing … (see example13) Note that there is no default namespace declaration. So, none of the elements belong to a namespace. Note that we do not use xsi:schemaLocation (since it requires a pair of values - a namespace and a URL to the schema for that namespace). Instead, we use xsi:noNamespaceSchemaLocation.

129 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 129 Assembling an Instance Document from Multiple Schema Documents An instance document may have elements from multiple schemas. Validation can apply to the entire XML instance document, or to a single element.

130 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 130 <Library xmlns:b="http://www.publishing.org" xmlns:e="http://www.employee.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.publishing.org BookCatalogue.xsd http://www.employee.org Employee.xsd"> My Life and Times Paul McCartney 1998 94303-12021-43892 McMillin Publishing Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row John Doe 123-45-6789 Sally Smith 000-11-2345 Library.xml (see example 14) Validating against two schemas

131 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 131 Assembling a Schema from Multiple Schema Documents The include element allows you to bring in schema definitions from other schemas –All the schemas you include must have the same namespace as your schema (i.e., the schema that is using the include) –The net effect of include is as though you had typed all the definitions directly into the containing schema … LibraryBookCatalogue.xsd LibraryEmployees.xsd Library.xsd

132 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 132 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.library.org" xmlns="http://www.library.org" elementFormDefault="qualified"> Library.xsd (see example 15) These are referencing the element declarations in the other schemas. Nice!

133 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 133 Assembling a Schema from a Schema with no targetNamespace A schema can another schema that has no targetNamespace. The included components take on the namespace of the schema that is doing the.

134 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 134 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"> Product.xsd (see example16) Note that this schema has no targetNamespace!

135 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 135 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.company.org" xmlns="http://www.company.org" elementFormDefault="qualified"> Company.xsd (see example16) This schema s Product.xsd. Thus, the components in Product.xsd are namespace-coerced to the company namespace.

136 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 136 Assembling a Schema from Multiple Schema Documents in Different Namespaces The import element allows you to reference elements in another namespace <xsd:import namespace="A" schemaLocation="A.xsd"/> <xsd:import namespace="B" schemaLocation="B.xsd"/> … Namespace A A.xsd Namespace B B.xsd C.xsd

137 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 137 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.camera.org" xmlns:nikon="http://www.nikon.com" xmlns:olympus="http://www.olympus.com" xmlns:pentax="http://www.pentax.com" elementFormDefault="qualified"> <xsd:import namespace="http://www.nikon.com" schemaLocation="Nikon.xsd"/> <xsd:import namespace="http://www.olympus.com" schemaLocation="Olympus.xsd"/> <xsd:import namespace="http://www.pentax.com" schemaLocation="Pentax.xsd"/> Camera.xsd (see example 17)

138 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 138 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.nikon.com" xmlns="http://www.nikon.com" elementFormDefault="qualified"> Nikon.xsd (see example 17 for Olympus.xsd and Pentax.xsd)

139 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 139 <camera xmlns="http://www.camera.org" xmlns:nikon="http://www.nikon.com" xmlns:olympus="http://www.olympus.com" xmlns:pentax="http://www.pentax.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.camera.org Camera.xsd http://www.nikon.com Nikon.xsd http://www.olympus.com Olympus.xsd http://www.pentax.com Pentax.xsd"> Ergonomically designed casing for easy handling 300mm 1.2 1/10,000 sec to 100 sec The Camera instance uses elements from the Nikon, Olympus, and Pentax namespaces. Camera.xml (see example 17)

140 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 140 Redundant! On the previous slide, the value of schemaLocation contained four pairs of values - one for camera, and three for each schema that it uses. The later three are redundant. Once you give the schema validator camera it will look in the camera schema and see the import elements, thus it will be aware of the other schemas being used (Nikon, Olympus, and Pentax) The next slide shows the non-redundant version.

141 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 141 <camera xmlns="http://www.camera.org" xmlns:nikon="http://www.nikon.com" xmlns:olympus="http://www.olympus.com" xmlns:pentax="http://www.pentax.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.camera.org Camera.xsd"> Ergonomically designed casing for easy handling 300mm 1.2 1/10,000 sec to 100 sec Camera.xml (non-redundant version)

142 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 142 Note about using Include and Import The and elements must come before any element declarations or type definitions. Do Labs 11.a, 11.b, 11.c

143 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 143 Creating Lists There are times when you will want an element to contain a list of values, e.g., "The contents of the Numbers element is a list of numbers". Example: For a document containing the Lottery drawing we might have 12 49 37 99 20 67 How do we declare the element Numbers... (1) To contain a list of integers, and (2) Each integer is restricted to be between 1 and 99, and (3) The total number of integers in the list is exactly six.

144 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 144 <LotteryDrawings xmlns="http://www.lottery.org/namespaces/Lottery" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.lottery.org/namespaces/Lottery Lottery.xsd"> July 1 21 3 67 8 90 12 July 8 55 31 4 57 98 22 July 15 70 77 19 35 44 11 Lottery.xml (see example18)

145 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 145 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.lottery.org" xmlns="http://www.lottery.org" elementFormDefault="qualified"> Lottery.xsd

146 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 146 LotteryNumbers --> Need Stronger Datatyping The list in the previous schema has two problems: –It allows to contain an arbitrarily long list –The numbers in the list may be any positiveInteger We need to: –Restrict the list to length value="6" –Restrict the numbers to maxInclusive value="99"

147 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 147 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.lottery.org" xmlns="http://www.lottery.org" elementFormDefault="qualified"> Lottery.xsd (see example18)

148 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 148 NumbersList is a list where the type of each item is OneToNintyNine. LotteryNumbers restricts NumbersList to a length of six (i.e., an element declared to be of type LotteryNumbers must hold a list of numbers, between 1 and 99, and the length of the list must be exactly six).

149 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 149 Alternatively, This is read as: "We are creating a new type called LotteryNumbers. It is a restriction. At this point we can either use the base attribute or a simpleType child element to indicate the type that we are restricting (you cannot use both the base attribute and the simpleType child element). We want to restrict the type that is a list of OneToNintyNine. We will restrict that type to a length of 6."

150 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 150 Notes about the list type You cannot create a list of lists –i.e., you cannot create a list type from another list type. You cannot create a list of complexTypes –i.e., this only applies to simpleTypes In the instance document, you must separate each item in a list with white space (blank, tab, carriage return) The only facets that you may use with a list type are: –length –minLength –maxLength –enumeration Do Lab 11.d

151 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 151 Creating a simpleType that is a Union of Types simpleType 1 simpleType 2 simpleType 1 + simpleType 2 Note: you can create a union of more that just two simpleTypes

152 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 152 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.CostelloReunion.org" xmlns="http://www.CostelloReunion.org" elementFormDefault="qualified"> Cont. -->

153 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 153 <xsd:union memberTypes="Parent PatsFamily BarbsFamily JudysFamily TomsFamily RogersFamily JohnsFamily"/> Cont. -->

154 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 154 Y2KFamilyReunion.xsd (see example 19)

155 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 155 <Y2KFamilyReunion xmlns="http://www.CostelloReunion.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.CostelloReunion.org Y2KFamilyReunion.xsd"> Mary Pat Patti Christopher Elizabeth Judy Peter Tom Cheryl Marc Joe Roger Y2KFamilyReunion.xml (see example 19)

156 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 156 Alternative … Version 2 of Y2KFamilyReunion.xsd (see example 20)

157 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 157 Review of Union simpleType Alternatively, … … … "The memberTypes attribute and the nested simpleTypes are mutually exclusive, i.e., you can have one or the other, but not both."

158 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 158 maxOccurs The value space for maxOccurs is the union of the value space for nonNegativeInteger with the value space for a simpleType which contains one enumeration value - "unbounded". See next slide for how maxOccurs is defined in the schema-for- schemas

159 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 159 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.maxOccurs.org" xmlns="http://www.maxOccurs.org" elementFormDefault="qualified"> (see example20.1)

160 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 160 any Element The element enables the instance document author to extend his/her document with elements not specified by the schema. Now an instance document author can extend the content of elements with any element after.

161 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 161 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.PublishingCompany.org" xmlns="http://www.PublishingCompany.org" elementFormDefault="qualified"> PublishingCompany.xsd (see example21) Suppose that the instance document author discovers this schema, and wants to extend his elements with a element. He/she can do so! Thus, the instance document will be extended with an element never anticipated by the schema author. Wow!

162 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 162 <BookCatalogue xmlns="http://www.BookCatalogue.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.BookCatalogue.org BookCatalogue.xsd http://www.PublishingCompany.org PublishingCompany.xsd"> My Life and Times Paul McCartney 1998 94303-12021-43892 McMillin Publishing Roger Costello Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co.

163 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 163 Extensible Instance Documents The element enables instance document authors to create instance documents which contain elements above and beyond what was specified by the schema. The instance documents are said to be extensible. Contrast this schema with previous schemas where the content of all our elements were always fixed and static.

164 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 164 Specifying the Namespace of Extension Elements allows the instance document to contain a new element, provided the element comes from a namespace other than the one the schema is defining (i.e., targetNamespace). allows a new element, provided it's from the specified namespace Note: you can specify a list of namespaces, separated by a blank space. One of the namespaces can be ##targetNamespace (see next) allows a new element, provided it's from the namespace that the schema is defining. allows an element from any namespace. This is the default. the new element must come from no namespace

165 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 165 anyAttribute The element enables the instance document author to extend his document with attributes not specified by the schema. Now an instance document author can add any number of attributes onto a element.

166 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 166 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.StandardRepository.org" xmlns="http://www.StandardRepository.org" elementFormDefault="qualified"> StandardRepository.xsd (see example21.1) Suppose that the instance document author discovers this schema, and wants to extend his elements with an id attribute. He/she can do so! Thus, the instance document will be extended with an attribute never anticipated by the schema author. Wow!

167 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 167 <BookCatalogue xmlns="http://www.BookCatalogue.org" xmlns:sr="http://www.StandardRepository.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.BookCatalogue.org BookCatalogue.xsd http://www.StandardRepository.org StandardRepository.xsd"> My Life and Times Paul McCartney 1998 94303-12021-43892 McMillin Publishing Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. BookCatalogue.xml (see example29)

168 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 168 Extensible Instance Documents The element enables instance document authors to create instance documents which contain attributes above and beyond what was specified by the schema. The instance documents are said to be extensible. Contrast this schema with previous schemas where the content of all our elements were always fixed and static.

169 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 169 Specifying the Namespace of Extension Attributes allows the instance document to contain new attributes, provided the attributes come from a namespace other than the one the schema is defining (i.e., targetNamespace). allows new attributes, provided they're from the specified namespace. Note: you can specify a list of namespaces, separated by a blank space. One of the namespaces can be ##targetNamespace (see next) allows new attributes, provided they're from the namespace that the schema is defining. allows any attributes. This is the default. allows any unqualified attributes (i.e., the attributes comes from no namespace)

170 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 170 Open Content Definition: an open content schema is one that allows instance documents to contain additional elements beyond what is declared in the schema. This is achieved by using the and elements in the schema. Sprinkling and elements liberally throughout your schema will yield benefits in terms of how evolvable your schema is. –See later slides for how open content enables the rapid evolution of schemas that is required in today's marketplace.

171 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 171 Global Openness There is a range of openness that a schema may support - anywhere from having instance documents where new elements can be inserted anywhere (global openness), to instance documents where new elements can be inserted only at specific locations (localized openness)... This schema is allowing expansion before and after every element. Further, is is allowing for attribute expansion on every element. Truly, this is the ultimate in openness!

172 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 172 Localized Openness With localized openness we design our schema to allow instance documents to extend at specific points in the document With this schema we are allowing instance documents to extend only at the end of Book's content model.

173 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 173 In today's rapidly changing market static schemas will be less commonplace, as the market pushes schemas to quickly support new capabilities. For example, consider the cellphone industry. Clearly, this is a rapidly evolving market. Any schema that the cellphone community creates will soon become obsolete as hardware/software changes extend the cellphone capabilities. For the cellphone community rapid evolution of a cellphone schema is not just a nicety, the market demands it! Suppose that the cellphone community gets together and creates a schema, cellphone.xsd. Imagine that every week NOKIA sends out to the various vendors an instance document (conforming to cellphone.xsd), detailing its current product set. Now suppose that a few months after cellphone.xsd is agreed upon NOKIA makes some breakthroughs in their cellphones - they create new memory, call, and display features, none of which are supported by cellphone.xsd. To gain a market advantage NOKIA will want to get information about these new capabilities to its vendors ASAP. Further, they will have little motivation to wait for the next meeting of the cellphone community to consider upgrades to cellphone.xsd. They need results NOW. How does open content help? That is described next. Suppose that the cellphone schema is declared "open". Immediately NOKIA can extend its instance documents to incorporate data about the new features. How does this change impact the vendor applications that receive the instance documents? The answer is - not at all. In the worst case, the vendor's application will simply skip over the new elements. More likely, however, the vendors are showing the cellphone features in a list box and these new features will be automatically captured with the other features. Let's stop and think about what has been just described … Without modifying the cellphone schema and without touching the vendor's applications, information about the new NOKIA features has been instantly disseminated to the marketplace! Open content in the cellphone schema is the enabler for this rapid dissemination. Dynamic Schema Evolution using Open Content Continued -->

174 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 174 Dynamic Schema Evolution using Open Content (cont.) Clearly some types of instance document extensions may require modification to the vendor's applications. Recognize, however, that the vendors are free to upgrade their applications in their own time. The applications do not need to be upgraded before changes can be introduced into instance documents. At the very worst, the vendor's applications will simply skip over the extensions. And, of course, those vendors do not need to upgrade in lock-step To wrap up this example … suppose that several months later the cellphone community reconvenes to discuss enhancements to the schema. The new features that NOKIA first introduced into the marketplace are then officially added into the schema. Thus completes the cycle. Changes to the instance documents have driven the evolution of the schema. Do Lab 12

175 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 175 Strategy for Defining Semantics of your XML Elements (by Mary Pulvermacher) Capture the semantics in the XML Schema –Describe the semantics within the element –Adopt the convention that every element and attribute have an annotation which provides information on the meaning Advantages: –The XML Schema will capture the data structure, meta-data, and relationships between the elements –Use of strong typing will capture much of the data content –The annotations can capture definitions and other explanatory information –The structure of the "definitions" will always be consistent with the structure used in the schema since they are linked –Since the schema itself is an XML document, we can use XSLT to extract the annotations and transform the "semantic" information into a format suitable for human consumption

176 Copyright © [2001]. Roger L. Costello. All Rights Reserved. 176 Strategy for Defining Semantics of your XML Elements (by Mary Pulvermacher)


Download ppt "Copyright © [2001]. Roger L. Costello. All Rights Reserved. 1 XML Schemas (Primer)"

Similar presentations


Ads by Google