Presentation is loading. Please wait.

Presentation is loading. Please wait.

Schemas1 XML Schema More Powerful. Schemas 2 DTD – Schema - Relax An XML schema is a description of a type of XML document, typically expressed in terms.

Similar presentations


Presentation on theme: "Schemas1 XML Schema More Powerful. Schemas 2 DTD – Schema - Relax An XML schema is a description of a type of XML document, typically expressed in terms."— Presentation transcript:

1 Schemas1 XML Schema More Powerful

2 Schemas 2 DTD – Schema - Relax An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML itself. An XML schema provides a view of the document type at a relatively high level of abstraction.XML There are languages developed specifically to express XML schemas. The Document Type Definition (DTD) language, which is native to the XML specification, is a schema language that is of relatively limited capability, but that also has other uses in XML aside from the expression of schemas. Two other very popular, more expressive XML schema languages are XML Schema (W3C) and RELAX NG.Document Type DefinitionXML Schema (W3C)RELAX NG

3 Schemas 3 Schemas 1-10. Benefits of Schema 10-21. Explaining Schema 21-30. Namespace 31-35. Inlined Schema 36-57. Datatypes 58. Summary of Defining 59.- 141. Advanced 142. Stratecy of Defining Schemantics

4 Schemas 4 It's all about integrated business processes An XML vocabulary for expressing your data's business rules RosettaNet PIPs

5 Schemas 5 Data Model With XML Schemas you specify how your XML data will be organized, and the datatypes of your data. That is, with XML Schemas you model how your data is to be represented in an instance document. A Contract Organizations agree to structure their XML documents in conformance with an XML Schema. Thus, the XML Schema acts as a contract between the organizations. A rich source of metadata An XML Schema document contains lots of data about the data in the XML instance documents, such as the datatype of the data, the data's range of values, how the data is related to another piece of data (parent/child, sibling relationship), i.e., XML Schemas contain metadata What are XML Schemas?

6 Schemas 6 With the support for data types: It is easier to describe permissible document content It is easier to validate the correctness of data It is easier to work with data from a database It is easier to define data facets (restrictions on data) It is easier to define data patterns (data formats) It is easier to convert data between different data types XML Schemas Benefits

7 Schemas 7 Example 32.904237 73.620290 2 Is this data valid? To be valid, it must meet these constraints (data business rules): 1. The location must be comprised of a latitude, followed by a longitude, followed by an indication of the uncertainty of the lat/lon measurements. 2. The latitude must be a decimal with a value between -90 to +90 3. The longitude must be a decimal with a value between -180 to +180 4. For both latitude and longitude the number of digits to the right of the decimal point must be exactly six digits. 5. The value of uncertainty must be a non-negative integer 6. The uncertainty units must be either meters or feet. We can express all these data constraints using XML Schemas

8 Schemas 8 Validating your data 32.904237 73.620290 2 -check that the latitude is between -90 and +90 -check that the longitude is between -180 and +180 - check that the fraction digits is 6 for lat and lon... XML Schema validator Data is ok!

9 Schemas 9 Purpose of XML Schemas Idea is to specify:  the structure of instance documents "this element contains these elements, which contains these other elements …"  the datatype of each element/attribute "this element is an integer with the range 0 to 15,000"

10 Schemas 10 DTDs  It's a different syntax You write your XML (instance) document using one syntax and the DTD using another syntax --> bad, inconsistent  Limited datatype capability DTDs support a very limited capability for specifying datatypes. You can't, for example, express "I want the element to hold an integer with a range of 0 to 15,000" Desire a set of datatypes compatible with those found in databases  DTD supports 10 datatypes; XML Schemas supports 44+ datatypes DTD Limitations

11 Schemas 11 Advantages of Schemas Schemas’ advantages over DTDs:  Enhanced datatypes 44+ versus 10 Can create your own datatypes e.g. type must follow this pattern: ddd-dddd, where 'd' represents a digit"  Written in the same syntax as instance documents less syntax to remember  Object-oriented'ish Can extend or restrict a type ( derive new type definitions on the basis of old ones)  Can express sets, i.e., can define the child elements to occur in any order  Can specify element content as being unique (keys on content) and uniqueness within a region  Can define multiple elements with the same name but different content  Can define elements with nil content  Can define substitutable elements - e.g., the "Book" element is substitutable for the "Publication" element.

12 Schemas 12 BookStore.dtd

13 Schemas 13 ATTLIST ELEMENT ID #PCDATA NMTOKEN ENTITY CDATA BookStore Book Title Author Date ISBN Publisher This is the vocabulary that DTDs provide to define your new vocabulary DTD Vocabulary

14 Schemas 14 element complexType schema sequence http://www.w3.org/2001/XMLSchema string integer boolean BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) This is the vocabulary that XML Schemas provide to define your new vocabulary One difference between XML Schemas and DTDs is that the XML Schema vocabulary is associated with a name (namespace). Likewise, the new vocabulary that you define must be associated with a name (namespace). Schema Vocabularity

15 Schemas 15 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> NetStore.xsd xsd = Xml-Schema Definition Example A

16 Schemas 16 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> <!ELEMENT Book (Title, Author, Date, ISBN, Publisher)> Explanation A1

17 Schemas 17 xsd:schema < xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> All XML Schemas have "schema" as the root element. Explanation A2

18 Schemas 18 xmlns:xsd="http://www.w3.org/2001/XMLSchema" <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> The elements and datatypes that are used to construct schemas - schema - element - complexType - sequence - string come from the http://…/XMLSchema namespace Explanation A3

19 Schemas 19 element complexType schema sequence http://www.w3.org/2001/XMLSchema string integer boolean XMLSchema Namespace

20 Schemas 20 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Says that the elements defined by this schema - BookStore - Book - Title - Author - Date - ISBN - Publisher are to go in this namespace Explanation A4

21 Schemas 21 BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) Target namespace

22 Schemas 22 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> ref="Book" This is referencing a Book element declaration. The Book in what namespace? Since there is no namespace qualifier it is referencing the Book element in the default namespace, which is the targetNamespace! Thus, this is a reference to the Book element declaration in this schema. The default namespace is http://www.books.org which is the targetNamespace! Explanation A5

23 Schemas 23 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> This is a directive to any instance documents which conform to this schema: Any elements used by the instance document which were declared in this schema must be namespace qualified. Explanation A6

24 Schemas 24 Referencing a schema in an XML instance document <BookStore xmlns ="http://www.books.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.books.org BookStore.xsd"> My Life and Times Paul McCartney July, 1998 94303-12021-43892 McMillin Publishing... 1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the http://www.books.org namespace. 2. Second, with schemaLocation tell the schema-validator that the http://www.books.org namespace is defined by BookStore.xsd (i.e., schemaLocation contains a pair of values). 3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the XMLSchema-instance namespace. 1 2 3

25 Schemas 25 Referencing a schema in an XML instance document NetStore.xml NetStore.xsd targetNamespace= "http://www.books.org" schemaLocation= "http://www.books.org NetStore.xsd" - defines elements in namespace http://www.books.org - uses elements from namespace http://www.books.org A schema defines a new vocabulary. Instance documents use that new vocabulary.

26 Schemas 26 Note multiple levels of checking NetStore.xmlNetStore.xsd XMLSchema.xsd (schema-for-schemas) Validate that the xml document conforms to the rules described in NetStore.xsd Validate that NetStore.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas

27 Schemas 27 Default Value for minOccurs and maxOccurs The default value for minOccurs is "1" The default value for maxOccurs is "1" Equivalent!

28 Schemas 28 Qualify XMLSchema, Default targetNamespace In the first example, we explicitly qualified all elements from the XML Schema namespace. The targetNamespace was the default namespace. BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) http://www.w3.org/2001/XMLSchema element complexType schema sequence string integer boolean

29 Schemas 29 Alternatively (equivalently), we can design our schema so that XMLSchema is the default namespace. (Example B) BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) http://www.w3.org/2001/XMLSchema element complexType schema sequence string integer boolean Default XML Schema, Qualify targetNamespace

30 Schemas 30 xmlns="http://www.w3.org/2001/XMLSchema" <schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns:bk="http://www.books.org" elementFormDefault="qualified"> (see example02) Note that http://…/XMLSchema is the default namespace. Consequently, there are no namespace qualifiers on - schema - element - complexType - sequence - string Example B

31 Schemas 31 <schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns:bk="http://www.books.org" elementFormDefault="qualified"> Here we are referencing a Book element. Where is that Book element defined? In what namespace? The bk: prefix indicates what namespace this element is in. bk: has been set to be the same as the targetNamespace. Explanation B1

32 Schemas 32 "bk:" References the targetNamespace BookStore Book Title Author Date ISBN Publisher http://www.books.org (targetNamespace) http://www.w3.org/2001/XMLSchema bk element complexType schema sequence string integer boolean Consequently, bk:Book refers to the Book element in the targetNamespace.

33 Schemas 33 Inlining Element Declarations In the previous examples we declared an element and then we ref’ed to that element declaration. Alternatively, we can inline the element declarations. On the following slide is an alternate (equivalent) way of representing the schema shown previously, using inlined element declarations.

34 Schemas 34 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Note that we have moved all the element declarations inline, and we are no longer refering to the element declarations. This results in a much more compact schema! Example C: Inlined

35 Schemas 35 Anonymous types (no name) <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Explanation C1

36 Schemas 36 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> The advantage of splitting out Book's element declarations and wrapping them in a named type is that now this type can be reused by other elements. Example D: Named types

37 Schemas 37 A simple type (e.g., xsd:string) or the name of a complexType (e.g., BookPublication) … 1 2 A nonnegative integer A nonnegative integer or "unbounded" Note: minOccurs and maxOccurs can only be used in nested (local) element declarations. Summary A of Declaring Elements

38 Schemas 38 The date Datatype A built-in datatype (i.e., schema validators know about this datatype) This datatype is used to represent a specific day (year-month-day) Elements declared to be of type date must follow this form: CCYY- MM-DD  range for CC is: 00-99  range for YY is: 00-99  range for MM is: 01-12  range for DD is: 01-28 if month is 2 01-29 if month is 2 and the gYear is a leap gYear 01-30 if month is 4, 6, 9, or 11 01-31 if month is 1, 3, 5, 7, 8, 10, or 12  Example: 1999-05-31 represents May 31, 1999

39 Schemas 39 The gYear Datatype A built-in datatype (Gregorian calendar year) Elements declared to be of type gYear must follow this form: CCYY  range for CC is: 00-99  range for YY is: 00-99  Example: 1999 indicates the gYear 1999

40 Schemas 40 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Here we are defining a new (user-defined) data- type, called ISBNType. Declaring Date to be of type gYear, and ISBN to be of type ISBNType (defined above) Example E

41 Schemas 41 "I hereby declare a new type called ISBNType. It is a restricted form of the string type. Elements declared of this type must conform to one of the following patterns: - First Pattern: 1 digit followed by a dash followed by 5 digits followed by another dash followed by 3 digits followed by another dash followed by 1 more digit, or - Second Pattern: 1 digit followed by a dash followed by 3 digits followed by another dash followed by 5 digits followed by another dash followed by 1 more digit, or - Third Pattern: 1 digit followed by a dash followed by 2 digits followed by another dash followed by 6 digits followed by another dash followed by 1 more digit." These patterns are specified using Regular Expressions. Explanation E1

42 Schemas 42 Equivalent Expressions The vertical bar means "or"

43 Schemas 43 or ? When do you use the complexType element and when do you use the simpleType element?  Use the complexType element when you want to define child elements and/or attributes of an element  Use the simpleType element when you want to create a new type that is a refinement of a built-in type (string, date, gYear, etc)

44 Schemas 44 Built-in Datatypes Primitive Datatypes  string  boolean  decimal  float  double  duration  dateTime  time  date  gYearMonth  gYear  gMonthDay Atomic, built-in  "Hello World"  {true, false, 1, 0}  7.08  12.56E3, 12, 12560, 0, -0, INF, -INF, NAN  P1Y2M3DT10H30M12.3S  format: CCYY-MM-DDThh-mm-ss  format: hh:mm:ss.sss  format: CCYY-MM-DD  format: CCYY-MM  format: CCYY  format: --MM-D D Note: 'T' is the date/time separator INF = infinity NAN = not-a-number

45 Schemas 45 Built-in Datatypes (cont.) Primitive Datatypes  gDay  gMonth  hexBinary  base64Binary  anyURI  QName  NOTATION Atomic, built-in  format: ---DD (note the 3 dashes)  format: --MM--  a hex string  a base64 string  http://www.xfront.com  a namespace qualified name  a NOTATION from the XML spec

46 Schemas 46 Derived types  normalizedString  token  language  IDREFS  ENTITIES  NMTOKEN  NMTOKENS  Name  NCName  ID  IDREF  ENTITY  integer  nonPositiveInteger Subtype of primitive datatype  A string without tabs, line feeds, or carriage returns  String w/o tabs, l/f, leading/trailing spaces, consecutive spaces  any valid xml:lang value, e.g., EN, FR,...  must be used only with attributes  part (no namespace qualifier)  must be used only with attributes  456  negative infinity to 0 Built in Datatypes (cont.)

47 Schemas 47 Derived types  negativeInteger  long  int  short  byte  nonNegativeInteger  unsignedLong  unsignedInt  unsignedShort  unsignedByte  positiveInteger Subtype of primitive datatype  negative infinity to -1  -9223372036854775808 to 9223372036854775807  -2147483648 to 2147483647  -32768 to 32767  -127 to 128  0 to infinity  0 to 18446744073709551615  0 to 4294967295  0 to 65535  0 to 255  1 to infinity Note: the following types can only be used with attributes (which we will discuss later): ID, IDREF, IDREFS, NMTOKEN, NMTOKENS, ENTITY, and ENTITIES. Built-in Datatypes (cont.)

48 Schemas 48 Creating own Datatypes A new datatype can be defined from an existing datatype (called the "base" type) by specifying values for one or more of the optional facets for the base type. Example. The string primitive datatype has six optional facets:  length  minLength  maxLength  pattern  enumeration  whitespace (legal values: preserve, replace, collapse)

49 Schemas 49 New Datatype by Specifying Facet Values 1. This creates a new datatype called 'TelephoneNumber'. 2. Elements of this type can hold string values, 3. But the string length must be exactly 8 characters long and 4. The string must follow the pattern: ddd-dddd, where 'd' represents a 'digit'. (Obviously, in this example the regular expression makes the length facet redundant.) 1 2 3 4

50 Schemas 50 Another Example This creates a new type called shape. An element declared to be of this type must have either the value circle, or triangle, or square.

51 Schemas 51 Facets of the integer Datatype The integer datatype has 8 optional facets:  totalDigits  pattern  whitespace  enumeration  maxInclusive  maxExclusive  minInclusive  minExclusive

52 Schemas 52 Example This creates a new datatype called 'EarthSurfaceElevation'. Elements declared to be of this type can hold an integer. However, the integer is restricted to have a value between -1290 and 29035, inclusive.

53 Schemas 53 General Form of Creating a New Datatype by Specifying Facet Values … Facets: - length - minlength - maxlength - pattern - enumeration - minInclusive - maxInclusive - minExclusive - maxExclusive... Sources: - string - boolean - number - float - double - duration - dateTime - time...

54 Schemas 54 Multiple Facets - "and" them together, or "or" them together? An element declared to be of type TelephoneNumber must be a string of length=8 and the string must follow the pattern: 3 digits, dash, 4 digits. An element declared to be of type shape must be a string with a value of either circle, or triangle, or square. Patterns, enumerations => "or" them together. All other facets => "and" them together

55 Schemas 55 Creating a simpleType from another simpleType Thus far we have created a simpleType using one of the built-in datatypes as our base type. However, we can create a simpleType that uses another simpleType as the base.

56 Schemas 56 This simpleType uses EarthSurfaceElevation as its base type. Example F

57 Schemas 57 Fixing a Facet Value Sometimes when we define a simpleType we want to require that one (or more) facet have an unchanging value. That is to make the facet a constant. simpleTypes which derive from this simpleType may not change this facet.

58 Schemas 58 Element Containing a User-Defined Simple Type Example. Create a schema element declaration for an elevation element. Declare the elevation element to be an integer with a range -1290 to 29035 5240 Here's one way of declaring the elevation element:

59 Schemas 59 Element Containing a User-Defined Simple Type (cont.) Here's an alternative method for declaring elevation: The simpleType definition is defined inline, it is an anonymous simpleType definition. The disadvantage of this approach is that this simpleType may not be reused by other elements.

60 Schemas 60 … 1 2 … 3 Summary B of Declaring Elements

61 Schemas 61 Annotating Schemas The element is used for documenting the schema, both for humans and for programs.  Use for providing a comment to humans  Use for providing a comment to programs The content is any well-formed XML Note that annotations have no effect on schema validation The following constraint is not expressible with XML Schema: The value of element A should be greater than the value of element B. So, we need to use a separate tool (e.g., Schematron) to check this constraint. We will express this constraint in the appinfo section (below). A should be greater than B

62 Schemas 62 Where Can You Put Annotations? You cannot put annotations at just any random location in the schema. Here are the rules for where an annotation element can go:  annotations may occur before and after any global component  annotations may occur only at the beginning of non-global components

63 Schemas 63 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Can put annotations only at these locations Suppose that you want to annotate, say, the Date element declaration. What do we do? Example G

64 Schemas 64 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> How to annotate the Date element! Inline the annotation within the Date element declaration. Explanation G1

65 Schemas 65 Two Optional Attributes for the documentation Element In the previous example we showed with no attributes. Actually, it can have two attributes:  source: this attribute contains a URL to a file which contains supplemental information  xml:lang: this attribute specifies the language that the documentation was written in

66 Schemas 66 In the previous example was showed with no attributes. Actually, it can have one attribute:  source: this attribute contains a URL to a file which contains supplemental information Appinfo

67 Schemas 67 XML Schema Validate XML documents Automatic GUI generation Automatic API generation Semantic Web??? Smart Editor Ways of Using Schema

68 Schemas 68 Describing Metadata using Schemas XML Schema Strategy - two documents are used to provide metadata:  a schema document specifies the properties (metadata) for a class of resources (objects);  each instance document provides specific values for the properties.

69 Schemas 69 XML Schema: Specifies the Properties for a Class of Resources "For the class of Book resources, we identify five properties - Title, Author, Date, ISBN, and Publisher"

70 Schemas 70 Regular Expressions Recall that the string data type has a pattern facet. The value of a pattern facet is a regular expression. Below are some examples of regular expressions: Regular Expression - Chapter \d - Chapter \d - a*b - [xyz]b - a?b - a+b - [a-c]x Example - Chapter 1 - b, ab, aab, aaab, … - xb, yb, zb - b, ab - ab, aab, aaab, … - ax, bx, cx

71 Schemas 71 Regular Expressions (cont.) Regular Expression  [a-c]x  [-ac]x  [ac-]x  [^0-9]x  \Dx  Chapter\s\d  (ho){2} there  (ho\s){2} there .abc  (a|b)+x Example  ax, bx, cx  -x, ax, cx  ax, cx, -x  any non-digit char followed by x  Chapter followed by a blank followed by a digit  hoho there  any (one) char followed by abc  ax, bx, aax, bbx, abx, bax,...

72 Schemas 72 Regular Expressions (cont.) a{1,3}x a{2,}x \w\s\w ax, aax, aaax aax, aaax, aaaax, … word character (alphanumeric plus dash) followed by a space followed by a word character [a-zA-Z-[Ol]]* A string comprised of any lower and upper case letters, except "O" and "l" \. The period "." (Without the backward slash the period means "any character")

73 Schemas 73 Regular Expressions (cont.) \n \r \t \\ \| \- \^ \? \* \+ \{ \} \( \) \[ \] linefeed carriage return tab The backward slash \ The vertical bar | The hyphen - The caret ^ The question mark ? The asterisk * The plus sign + The open curly brace { The close curly brace } The open paren ( The close paren ) The open square bracket [ The close square bracket ]

74 Schemas 74 Regular Expressions (concluded) \p{L} \p{Lu} \p{Ll} \p{N} \p{Nd} \p{P} \p{Sc} A letter, from any language An uppercase letter, from any language A lowercase letter, from any language A number - Roman, fractions, etc A digit from any language A punctuation symbol A currency sign, from any language "currency sign from any language, followed by one or more digits from any language, optionally followed by a period and two digits from any language" $45.99 ¥300

75 Schemas 75 Example R.E. [1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5] 0 to 99 100 to 199 200 to 249250 to 255 This regular expression restricts a string to have values between 0 and 255. … Such a R.E. might be useful in describing an IP address...

76 Schemas 76 IP Datatype Definition <xsd:pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3} ([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"> Datatype for representing IP addresses. Examples, 129.83.64.255, 64.128.2.71, etc. This datatype restricts each field of the IP address to have a value between zero and 255, i.e., [0-255].[0-255].[0-255].[0-255] Note: in the value attribute (above) the regular expression has been split over two lines. This is for readability purposes only. In practice the R.E. would all be on one line.

77 Schemas 77 Derived Types We can do a form of subclassing complexType definitions. We call this "derived types"  derive by extension: extend the parent complexType with more elements  derive by restriction: create a type which is a subset of the base type. There are two ways to subset the elements: redefine a base type element to have a restricted range of values, or redefine a base type element to have a more restricted number of occurrences.

78 Schemas 78 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Note that BookPublication extends the Publication type, i.e., we are doing Derive by Extension (see example06) Example H

79 Schemas 79 Elements declared to be of type BookPublication will have 5 child elements - Title, Author, Date, ISBN, and Publisher. Note that the elements in the derived type are appended to the elements in the base type. Explanation H1

80 Schemas 80 Title Author Date Publication ISBN Publisher BookPublication Explanation H2

81 Schemas 81 Deleting an element in the base type Note that in this subtype we have eliminated the Author element, i.e., the subtype is just comprised of an unbounded number of Title elements followed by a single Date element. If the base type has an element with minOccurs="0", and the subtype wishes to not have that element, then it can simply leave it out.

82 Schemas 82 Prohibiting Derivations Sometimes we may want to create a type and disallow all derivations of it, or just disallow extension derivations, or disallow restriction derivations.  Rationale: "For example, I may create a complexType and make it publicly available for others to use. However, I don't want them to extend it with their proprietary extensions or subset it to remove, say, copyright information." (Jon Cleaver) Publication cannot be extended nor restricted Publication cannot be restricted Publication cannot be extended

83 Schemas 83 Terminology: Declaration vs Definition In a schema:  You declare elements and attributes. Schema components that are declared are those that have a representation in an XML instance document.  You define components that are used just within the schema document(s). Schema components that are defined are those that have no representation in an XML instance document. Declarations: - element declarations - attribute declarations Definitions: - type (simple, complex) definitions - attribute group definitions - model group definitions

84 Schemas 84 Terminology: Global versus Local Global element declarations, global type definitions:  These are element declarations/type definitions that are immediate children of Local element declarations, local type definitions:  These are element declarations/type definitions that are nested within other elements/types.

85 Schemas 85 Global type definition Global element declaration Local element declarations Local type definition <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> Example I

86 Schemas 86 Element Substitution In daily conversation there are several ways to express something.  In Boston we use the words "T" and "subway" interchangeably. For example, "we took the T into town", or "we took the subway into town". Thus, "T" and "subway" are substitutable. Which one is used may depend upon what part of the state you live in, what mood you're in, or any number of factors. We would like to be able to express this substitutability in XML Schemas.  That is, we would like to be able to declare in a schema an element called "subway", an element called "T", and state that "T"may be substituted for "subway". Instance documents can then use either or, depending on their preference.

87 Schemas 87 substitutionGroup We can define a group of substitutable elements (called a substitutionGroup) by declaring an element (called the head) and then declaring other elements which state that they are substitutable for the head element. subway is the head element T is substitutable for subway So what's the big deal? - Anywhere a head element can be used in an instance document, any member of the substitutionGroup can be substituted!

88 Schemas 88 Red Line Schema: Instance doc: Red Line Alternative instance doc (substitute T for subway): This example shows the element being substituted with the element. Example J

89 Schemas 89 Red Line Schema: Instance doc: Linea Roja Alternative instance doc (customized for our Spanish clients): Explanation J1

90 Schemas 90 BookType and MagazineType Derive from PublicationType PublicationType BookTypeMagazineType In order for Book and Magazine to be in a substitutionGroup with Publication, their type (BookType and MagazineType, respectively) must be the same as, or derived from Publication's type (PublicationType)

91 Schemas 91 Example K

92 Schemas 92 Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. Natural Health 1999 The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row can contain any element in the substitutionGroup with Publication! XML ex K

93 Schemas 93 <!ATTLIST Book Category (autobiography | non-fiction | fiction) #REQUIRED InStock (true | false) "false" Reviewer CDATA " "> BookStore.dtd Example L: Attributes

94 Schemas 94 (see example07) InStock (true | false) "false" Reviewer CDATA " " Category (autobiography | non-fiction | fiction) #REQUIRED Example M

95 Schemas 95 "Instance documents are required to have the Category attribute (as indicated by use="required"). The value of Category must be either autobiography, non-fiction, or fiction (as specified by the enumeration facets)." Note: attributes can only have simpleTypes (i.e., attributes cannot have child elements). Explanation M1

96 Schemas 96 Summary of Declaring Attributes required optional prohibited Do not use the "use" attribute if you use either default or fixed. xsd:string xsd:integer xsd:boolean... … 1 2

97 Schemas 97 Inlining Attributes On the next slide is another way of expressing the last example - the attributes are inlined within the Book declaration rather than being separately defined in an attributeGroup. (I only show a portion of the schema - the Book element declaration.)

98 Schemas 98 (see example08) Example N

99 Schemas 99 Notes about Attributes The attribute declarations always come last, after the element declarations. The attributes are always with respect to the element that they are defined (nested) within. … "bar and boo are attributes of foo"

100 Schemas 100 These attributes apply to the element they are nested within (Book) That is, Book has three attributes - Category, InStock, and Reviewer. cont.

101 Schemas 101 Element with Simple Content and Attributes Example. Consider this: 5440 The elevation element has these two constraints: - it has a simple (integer) content - it has an attribute called units How do we declare elevation? (see next slide)

102 Schemas 102 1. elevation contains an attribute. - therefore, we must use 2. However, elevation does not contain child elements (which is what we generally use to indicate). Instead, elevation contains simpleContent. 3. We wish to extend the simpleContent (an integer)... 4. with an attribute. 1 2 3 4 cont.

103 Schemas 103 elevation - use Stronger Datatype In the declaration for elevation we allowed it to hold any integer. Further, we allowed the units attribute to hold any string. Let's restrict elevation to hold an integer with a range 0 - 12,000 and let's restrict units to hold either the string "feet" or the string "meters"

104 Schemas 104 cont.

105 Schemas 105 Summary of Declaring Elements 1. Element with Simple Content. Declaring an element using a built-in type: Declaring an element using a user-defined simpleType: An alternative formulation of the above shapes example is to inline the simpleType definition:

106 Schemas 106 Summary of Declaring Elements (cont.) 2. Element Contains Child Elements Defining the child elements inline: An alternate formulation of the above Person example is to create a named complexType and then use that type:

107 Schemas 107 Summary of Declaring Elements (cont.) 3. Element Contains a complexType that is an Extension of another complexType

108 Schemas 108 4. Element Contains a complexType that is a Restriction of another complexType Summary of Declaring Elements (cont.)

109 Schemas 109 Summary of Declaring Elements (concluded) 5. Element Contains Simple Content and Attributes Example. Large, green, sour

110 Schemas 110 complexContent versus simpleContent With complexContent you extend or restrict a complexType With simpleContent you extend or restrict a simpleType … X must be a complexType … Y must be a simpleType versus Do Lab 8.b, 8.c

111 Schemas 111 group Element The group element enables you to group together element declarations. Note: the group element is just for grouping together element declarations, no attribute declarations allowed!

112 Schemas 112 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> cont.

113 Schemas 113 Another example showing the use of the element cont.

114 Schemas 114 Expressing Alternates DTD: XML Schema: <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.travel.org" xmlns="http://www.travel.org" elementFormDefault="qualified"> (see example10) Note: the choice is an exclusive-or, that is, transportation can contain only one element - either train, or plane, or automobile.

115 Schemas 115 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.binary.org" xmlns="http://www.binary.org" elementFormDefault="qualified"> DTD: XML Schema: Notes: 1. An element can fix its value, using the fixed attribute. 2. When you don't specify a value for minOccurs, it defaults to "1". Same for maxOccurs. See the last example (transportation) where we used a element with no minOccurs or maxOccurs. (see example 11) Expressing Repeatable Choice

116 Schemas 116 fixed/default Element Values When you declare an element you can give it a fixed or default value.  Then, in the instance document, you can leave the element empty. … 0 or equivalently: … red or equivalently:

117 Schemas 117 Using and <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.binary.org" xmlns="http://www.binary.org" elementFormDefault="qualified"> DTD: XML Schema:

118 Schemas 118 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> XML Schema: Problem: create an element, Book, which contains Author, Title, Date, ISBN, and Publisher, in any order (Note: this is very difficult and ugly with DTDs). means that Book must contain all five child elements, but they may occur in any order. (see example 12) Expressing Any Order

119 Schemas 119 Constraints on using Elements declared within must have a maxOccurs value of "1" (minOccurs can be either "0" or "1") If a complexType uses and it extends another type, then that parent type must have empty content. The element cannot be nested within either,, or another The contents of must be just elements. It cannot contain or

120 Schemas 120 Empty Element <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.photography.org" xmlns="http://www.photography.org" elementFormDefault="qualified"> Schema: Instance doc (snippet): DTD:

121 Schemas 121 No targetNamespace (noNamespaceSchemaLocation) Sometimes you may wish to create a schema but without putting the elements within a namespace. The targetNamespace attribute is actually an optional attribute of. Thus, if you don’t want to specify a namespace for your schema then simply don’t use the targetNamespace attribute. Consequences of having no namespace  1. In the instance document don’t namespace qualify the elements.  2. In the instance document, instead of using schemaLocation use noNamespaceSchemaLocation.

122 Schemas 122 Note that there is no targetNamespace attribute, and note that there is no longer a default namespace. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"> cont.

123 Schemas 123 <BookStore xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation= "BookStore.xsd"> My Life and Times Paul McCartney 1998 1-56592-235-2 McMillin Publishing … (see example14) 1. Note that there is no default namespace declaration. So, none of the elements are associated with a namespace. 2. Note that we do not use xsi:schemaLocation (since it requires a pair of values - a namespace and a URL to the schema for that namespace). Instead, we use xsi:noNamespaceSchemaLocation. cont.

124 Schemas 124 Assembling an Instance Document from Multiple Schema Documents An instance document may be composed of elements from multiple schemas. Validation can apply to the entire XML instance document, or to a single element.

125 Schemas 125 <Library xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.book.org Book.xsd http://www.employee.org Employee.xsd"> My Life and Times Paul McCartney 1998 1-56592-235-2 Macmillan Publishing Illusions The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row John Doe 123-45-6789 Sally Smith 000-11-2345 Validating against two schemas The elements are defined in Book.xsd, and the elements are defined in Employee.xsd. The,, and elements are not defined in any schema! 1. A schema validator will validate each Book element against Book.xsd. 2. It will validate each Employee element against Employee.xsd. 3. It will not validate the other elements. cont.

126 Schemas 126 Assembling a Schema from Multiple Schema Documents The include element allows you to access components in other schemas  All the schemas you include must have the same namespace as your schema (i.e., the schema that is doing the include)  The net effect of include is as though you had typed all the definitions directly into the containing schema … LibraryBook.xsd LibraryEmployee.xsd Library.xsd

127 Schemas 127 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.library.org" xmlns="http://www.library.org" elementFormDefault="qualified"> Library.xsd These are referencing element declarations in the other schemas. Nice! Library.xsd

128 Schemas 128 Assembling a Schema from Multiple Schema Documents with Different Namespaces The import element allows you to access elements and types in a different namespace <xsd:import namespace="A" schemaLocation="A.xsd"/> <xsd:import namespace="B" schemaLocation="B.xsd"/> … Namespace A A.xsd Namespace B B.xsd C.xsd

129 Schemas 129 Camera Schema Camera.xsd Nikon.xsd Olympus.xsd Pentax.xsd Ex: Multiple Schemas

130 Schemas 130 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.camera.org" xmlns:nikon="http://www.nikon.com" xmlns:olympus="http://www.olympus.com" xmlns:pentax="http://www.pentax.com" elementFormDefault="qualified"> <xsd:import namespace="http://www.nikon.com" schemaLocation="Nikon.xsd"/> <xsd:import namespace="http://www.olympus.com" schemaLocation="Olympus.xsd"/> <xsd:import namespace="http://www.pentax.com" schemaLocation="Pentax.xsd"/> Camera.xsd (see example 18) These import elements give us access to the components in these other schemas. Here I am using the body_type that is defined in the Nikon namespace Ex cont.

131 Schemas 131 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.nikon.com" xmlns="http://www.nikon.com" elementFormDefault="qualified"> Nikon.xsd <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.olympus.com" xmlns="http://www.olympus.com" elementFormDefault="qualified"> Olympus.xsd <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.pentax.com" xmlns="http://www.pentax.com" elementFormDefault="qualified"> Pentax.xsd Ex. cont.

132 Schemas 132 <c:camera xmlns:c="http://www.camera.org" xmlns:nikon="http://www.nikon.com" xmlns:olympus="http://www.olympus.com" xmlns:pentax="http://www.pentax.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.camera.org Camera.xsd http://www.nikon.com Nikon.xsd http://www.olympus.com Olympus.xsd http://www.pentax.com Pentax.xsd"> Ergonomically designed casing for easy handling 300mm 1.2 1/10,000 sec to 100 sec The Camera instance uses elements from the Nikon, Olympus, and Pentax namespaces. Camera.xml Ex. cont.

133 Schemas 133 Creating Lists There are times when you will want an element to contain a list of values, e.g., "The contents of the Numbers element is a list of numbers". Example: For a document containing a Lottery drawing we might have 12 49 37 99 20 67 How do we declare the element Numbers... (1) To contain a list of integers, and (2) Each integer is restricted to be between 1 and 99, and (3) The total number of integers in the list is exactly six.

134 Schemas 134 <LotteryDrawings xmlns="http://www.lottery.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.lottery.org Lottery.xsd"> July 1 21 3 67 8 90 12 July 8 55 31 4 57 98 22 July 15 70 77 19 35 44 11 Lottery.xml (see example19) cont.

135 Schemas 135 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.lottery.org" xmlns="http://www.lottery.org" elementFormDefault="qualified"> Lottery.xsd cont.

136 Schemas 136 LotteryNumbers --> Need Stronger Datatyping The list in the previous schema has two problems:  It allows to contain an arbitrarily long list  The numbers in the list may be any positiveInteger We need to:  Restrict the list to length value="6"  Restrict the numbers to maxInclusive value="99"

137 Schemas 137 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.lottery.org" xmlns="http://www.lottery.org" elementFormDefault="qualified"> cont.

138 Schemas 138 NumbersList is a list where the type of each item is OneToNintyNine. LotteryNumbers restricts NumbersList to a length of six (i.e., an element declared to be of type LotteryNumbers must hold a list of numbers, between 1 and 99, and the length of the list must be exactly six). cont.

139 Schemas 139 Alternatively, This is read as: "We are creating a new type called LotteryNumbers. It is a restriction. At this point we can either use the base attribute or a simpleType child element to indicate the type that we are restricting (you cannot use both the base attribute and the simpleType child element). We want to restrict the type that is a list of OneToNintyNine. We will restrict that type to a length of 6."

140 Schemas 140 3. simpleType that declares a list type: where the datatype OneToNintyNine is declared as: 4. An alternate form of the above, where the list's datatype is specified using an inlined simpleType: Summary of Declaring simpleTypes

141 Schemas 141 5. simpleType that declares a union type: where the datatype UnboundedType is declared as: 6. An alternate form of the above, where the datatype UnboundedType is specified using an inline simpleType: cont.

142 Schemas 142 any Element The element enables the instance document author to extend his/her document with elements not specified by the schema. Now an instance document author can optionally extend (after ) the content of elements with any element.

143 Schemas 143 Strategy for Defining Semantics of your XML Elements Capture the semantics in the XML Schema  Describe the semantics within the element  Adopt the convention that every element and attribute have an annotation which provides information on the meaning Advantages:  The XML Schema will capture the data structure, meta-data, and relationships between the elements  Use of strong typing will capture much of the data content  The annotations can capture definitions and other explanatory information  The structure of the "definitions" will always be consistent with the structure used in the schema since they are linked  Since the schema itself is an XML document, we can use XSLT to extract the annotations and transform the "semantic" information into a format suitable for human consumption

144 Schemas 144 Cont.


Download ppt "Schemas1 XML Schema More Powerful. Schemas 2 DTD – Schema - Relax An XML schema is a description of a type of XML document, typically expressed in terms."

Similar presentations


Ads by Google