Tutorial 13 Validating Documents with Schemas


1 Tutorial 13 Validating Documents with Schemas

2 Objectives
Compare schemas and DTDs
Explore different schema vocabularies
Declare simple type elements and attributes
Declare complex type elements
Apply a schema to an instance document
Work with XML Schema data types
Derive new data types for text strings, numeric values, and dates
Create data types for patterned data using regular expressions
New Perspectives on HTML, CSS, and XML 4th edition

3 The Limits of DTDs DTDs are commonly used for validation largely because of XML’s origins as an offshoot of SGML. One complaint about DTDs is their lack of data types. DTDs also do not recognize namespaces, so they are not well suited to compound documents in which content from several vocabularies needs to be validated. DTDs employ a syntax called Extended Backus–Naur Form (EBNF), which is different from the syntax used for XML.

4 Schemas and DTDs A schema is an XML document that contains validation rules for an XML vocabulary. When a schema is applied to a specific XML file, that file is called the instance document.

5 Schemas and DTDs

6 Schema Vocabularies A single standard doesn’t exist for schemas.
A schema vocabulary is simply an XML vocabulary created for the purpose of describing schema content. Support for a particular schema vocabulary depends solely on the XML parser being used for validation.

7 Schema Vocabularies

8 Starting a Schema File A schema is always placed in an external XML file. XML Schema filenames end with the .xsd file extension. The root element in any XML Schema document is the schema element. The general structure of an XML Schema file is: <?xml version="1.0" ?> <schema xmlns=" content </schema>

9 Starting a Schema File By convention, the namespace prefix xsd or xs is assigned to the XML Schema namespace to identify elements and attributes that belong to the XML Schema vocabulary. The usual form of an XML Schema document is: <?xml version="1.0" ?> <xs:schema xmlns:xs=" content </xs:schema>
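
As a minimal sketch of the skeleton described above, the file below binds the xs prefix to the W3C XML Schema namespace URI, http://www.w3.org/2001/XMLSchema. The note element is a hypothetical placeholder for the schema content and does not come from the tutorial's files.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- The xs prefix is bound to the W3C XML Schema namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- Hypothetical declaration standing in for the schema content -->
   <xs:element name="note" type="xs:string" />

</xs:schema>
```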

10 Understanding Simple and Complex Types
XML Schema supports two types of content: simple and complex. A simple type contains only text and no nested elements. A complex type contains two or more values or elements placed within a defined structure.

11 Understanding Simple and Complex Types

12 Understanding Simple and Complex Types

13 Defining a Simple Type Element
An element in the instance document containing only text and no attributes or child elements is defined in XML Schema using the <xs:element> tag:
   <xs:element name="name" type="type" />
Here name is the name of the element in the instance document and type is the type of data stored in the element. If you use a different namespace prefix or declare XML Schema as the default namespace for the document, the prefix will be different or omitted.

14 Data Types The data type can be one of XML Schema’s built-in data types or a data type defined by the schema author, known as a user-derived data type. The most commonly used data type in XML Schema is string, which allows an element to contain any text string. Example:
   <xs:element name="lastName" type="xs:string" />
Another popular data type in XML Schema is decimal, which allows an element to contain a decimal number.
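
A short sketch of simple type declarations using the two built-in types mentioned above. The lastName declaration repeats the slide's example; the gpa element and the sample values in the comments are assumptions added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- Any text string is accepted.
        Valid instance: <lastName>Berstein</lastName> -->
   <xs:element name="lastName" type="xs:string" />

   <!-- Only decimal numbers are accepted.
        Valid instance:   <gpa>3.5</gpa>
        Invalid instance: <gpa>high</gpa> -->
   <xs:element name="gpa" type="xs:decimal" />

</xs:schema>
```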

15 Defining a Simple Type Element

16 Defining an Attribute To define an attribute in XML Schema, you use the <xs:attribute> tag:
   <xs:attribute name="name" type="type" default="default" fixed="fixed" />
Here name is the name of the attribute, type is the data type, default is the attribute’s default value, and fixed is a fixed value for the attribute. The default and fixed attributes are optional, and a single declaration may use one or the other but not both.
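
The sketch below shows the default and fixed attributes on two separate declarations, since XML Schema does not allow both on the same attribute. The attribute names degree and units and their values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- default: the value assumed when the attribute is omitted
        from the instance document -->
   <xs:attribute name="degree" type="xs:string" default="undeclared" />

   <!-- fixed: if the attribute appears in the instance document,
        it must have exactly this value -->
   <xs:attribute name="units" type="xs:string" fixed="credits" />

</xs:schema>
```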

17 Defining an Attribute

18 Defining a Complex Type Element
The basic structure for defining a complex type element with XML Schema is
   <xs:element name="name">
      <xs:complexType>
         declarations
      </xs:complexType>
   </xs:element>
Here name is the name of the element and declarations represents declarations of the type of content within the element.

19 Defining a Complex Type Element
This content could include nested child elements, basic text, attributes, or any combination of the three:
An empty element containing only attributes
An element containing text content and attributes but no child elements
An element containing child elements but no attributes
An element containing both child elements and attributes

20 Defining an Element Containing Attributes and Basic Text
The definition needs to indicate that the element contains simple content and a collection of one or more attributes. The structure of the element definition is:
   <xs:element name="name">
      <xs:complexType>
         <xs:simpleContent>
            <xs:extension base="type">
               attributes
            </xs:extension>
         </xs:simpleContent>
      </xs:complexType>
   </xs:element>

21 Defining an Element Containing Attributes and Basic Text
Example:
   <xs:element name="gpa">
      <xs:complexType>
         <xs:simpleContent>
            <xs:extension base="xs:string">
               <xs:attribute name="degree" type="xs:string" />
            </xs:extension>
         </xs:simpleContent>
      </xs:complexType>
   </xs:element>
The base attribute in the <xs:extension> element sets the data type of the text content of the gpa element. The type attribute of the <xs:attribute> element sets the data type of the degree attribute to xs:string.

22 Referencing an Element or Attribute Definition
XML Schema allows for a great deal of flexibility in writing complex types. Rather than repeating an earlier element or attribute declaration, you can create a reference to it. A reference to an element definition is
   <xs:element ref="elemName" />
where elemName is the name used in the element definition. A reference to an attribute definition is
   <xs:attribute ref="attName" />
where attName is the name used in the attribute definition. Only global declarations, those written as direct children of the schema element, can be referenced this way.
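
A minimal sketch of referencing, assuming a hypothetical student element that reuses a globally declared lastName element and degree attribute. The referenced declarations sit directly under the schema element, which is what allows ref to find them.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- Global declarations: direct children of xs:schema -->
   <xs:element name="lastName" type="xs:string" />
   <xs:attribute name="degree" type="xs:string" />

   <!-- The complex type points at the declarations above
        instead of repeating them -->
   <xs:element name="student">
      <xs:complexType>
         <xs:sequence>
            <xs:element ref="lastName" />
         </xs:sequence>
         <xs:attribute ref="degree" />
      </xs:complexType>
   </xs:element>

</xs:schema>
```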

23 Defining an Element with Nested Children
Complex elements that contain nested child elements but no attributes or text are defined with the structure:
   <xs:element name="name">
      <xs:complexType>
         <xs:compositor>
            elements
         </xs:compositor>
      </xs:complexType>
   </xs:element>
where name is the name of the element, compositor is a value that defines how the child elements appear in the document, and elements is a list of the nested child elements.

24 Defining an Element with Nested Children
The following compositors are supported:
sequence - requires the child elements to appear in the order listed in the schema
choice - allows any one of the child elements listed to appear in the instance document
all - allows the child elements to appear in any order in the instance document; however, each may appear at most once, and may be omitted only if its minOccurs value is 0
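
The three compositors can be sketched side by side as follows. All element names here (name, contact, address, and their children) are hypothetical and chosen only to show the effect of each compositor.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- sequence: firstName must appear before lastName -->
   <xs:element name="name">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="firstName" type="xs:string" />
            <xs:element name="lastName" type="xs:string" />
         </xs:sequence>
      </xs:complexType>
   </xs:element>

   <!-- choice: exactly one of phone or email may appear -->
   <xs:element name="contact">
      <xs:complexType>
         <xs:choice>
            <xs:element name="phone" type="xs:string" />
            <xs:element name="email" type="xs:string" />
         </xs:choice>
      </xs:complexType>
   </xs:element>

   <!-- all: street and city may appear in either order;
        city is optional because its minOccurs value is 0 -->
   <xs:element name="address">
      <xs:complexType>
         <xs:all>
            <xs:element name="street" type="xs:string" />
            <xs:element name="city" type="xs:string" minOccurs="0" />
         </xs:all>
      </xs:complexType>
   </xs:element>

</xs:schema>
```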

25 Defining an Element with Nested Children - Examples

26 Defining an Element Containing Nested Elements and Attributes
The code for a complex type element that contains both child elements and attributes is:
   <xs:element name="name">
      <xs:complexType>
         <xs:compositor>
            elements
         </xs:compositor>
         attributes
      </xs:complexType>
   </xs:element>
where name is the name of the element; compositor is either sequence, choice, or all; elements is a list of nested child elements; and attributes is a list of attribute definitions.
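
One way this structure might be filled in is sketched below, assuming a hypothetical student element with two child elements and a stuID attribute. Note that the attribute declaration follows the closing tag of the compositor.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="student">
      <xs:complexType>
         <!-- Child elements are listed inside the compositor -->
         <xs:sequence>
            <xs:element name="firstName" type="xs:string" />
            <xs:element name="lastName" type="xs:string" />
         </xs:sequence>
         <!-- Attributes are declared after the compositor -->
         <xs:attribute name="stuID" type="xs:string" />
      </xs:complexType>
   </xs:element>

</xs:schema>
```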

27 Defining an Element Containing Nested Elements and Attributes
Example: [figure not reproduced in this transcript]

28 Specifying Mixed Content
An element is said to have mixed content when it contains both a text string and child elements. To declare mixed content, set the mixed attribute of the <complexType> element to true. XML Schema then assumes that the element contains both text and child elements. The structure of the child elements can then be defined with the conventional method.

29 Specifying Mixed Content
   <summary>
      student <firstName>Cynthia</firstName> <lastName>Berstein</lastName> is enrolled in an IT degree program and has completed <credits>12</credits> credits since 01/01/2012.
   </summary>
The summary element for this document in a schema file can be declared using the following complex type:
   <element name="summary">
      <complexType mixed="true">
         <sequence>
            <element name="firstName" type="string" />
            <element name="lastName" type="string" />
            <element name="credits" type="string" />
         </sequence>
      </complexType>
   </element>

30 Indicating Required Attributes
To indicate whether an attribute is required, the use attribute can be added to the statement that assigns the attribute to an element:
   <xs:element name="name">
      <xs:complexType>
         element content
         <xs:attribute properties use="use" />
      </xs:complexType>
   </xs:element>

31 Indicating Required Attributes
use is one of the following three values:
required - The attribute must always appear with the element.
optional - The use of the attribute is optional with the element.
prohibited - The attribute cannot be used with the element.
Example:
   <xs:attribute name="degree" type="xs:string" use="required" />

32 Specifying the Number of Child Elements
To specify the number of times an element appears in the instance document, you can apply the minOccurs and maxOccurs attributes to the element definition:
   <xs:element name="name" type="type" minOccurs="value" maxOccurs="value" />
The value of the minOccurs attribute defines the minimum number of times the element can occur, and the value of the maxOccurs attribute defines the maximum number of times the element can occur.
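
A brief sketch of occurrence constraints, assuming a hypothetical students element. Both attributes default to 1 when omitted, and maxOccurs accepts the keyword unbounded to remove the upper limit.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="students">
      <xs:complexType>
         <xs:sequence>
            <!-- At least one student, with no upper limit -->
            <xs:element name="student" type="xs:string"
                        minOccurs="1" maxOccurs="unbounded" />
            <!-- Optional comment, at most three times -->
            <xs:element name="comment" type="xs:string"
                        minOccurs="0" maxOccurs="3" />
         </xs:sequence>
      </xs:complexType>
   </xs:element>

</xs:schema>
```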

33 Validating a Schema Document

34 Applying a Schema to an Instance Document
To attach a schema to an instance document, you: Declare the XML Schema instance namespace in the instance document. Specify the location of the schema file. To declare the XML Schema instance namespace, you add the following attribute to the root element of the instance document: xmlns:xsi="

35 Applying a Schema to an Instance Document
You add a second attribute to the root element to specify the location of the schema file. The attribute you use depends on whether the instance document is associated with a namespace. If the document is not associated with a namespace, you add the attribute:
   xsi:noNamespaceSchemaLocation="schema"
to the root element, where schema is the location and name of the schema file.
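
Putting the two attributes together, an instance document without a namespace might look like the sketch below. The xsi prefix is bound to the W3C XML Schema instance namespace URI, http://www.w3.org/2001/XMLSchema-instance; the students element and the students.xsd filename are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Root element of the instance document: declares the xsi
     namespace, then points to the (hypothetical) schema file -->
<students xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="students.xsd">
   <student>Cynthia Berstein</student>
</students>
```

For a document that does belong to a namespace, the xsi:schemaLocation attribute is used instead; its value pairs the namespace URI with the location of the schema file.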

36 Validating with Built-In Data Types
XML Schema divides its built-in data types into two classes: primitive and derived. A primitive data type, also called a base type, is one of 19 fundamental data types that are not defined in terms of other types. A derived data type is one of 25 data types that are developed from one of the base types.

37 Validating with Built-In Data Types

38 String Data Types

39 Numeric Data Types

40 Date and Time Data Types

41 Deriving Customized Data Types
The code to derive a new data type is:
   <xs:simpleType name="name">
      rules
   </xs:simpleType>
Here name is the name of the user-defined data type and rules is the list of statements that define the properties of that data type. This structure is also known as a named simple type. You can also create a simple type without a name, which is known as an anonymous simple type.

42 Deriving Customized Data Types
The following three components are involved in deriving any new data type:
value space - The set of values that correspond to the data type.
lexical space - The set of textual representations of the value space.
facets - The properties that distinguish one data type from another.

43 Deriving Customized Data Types
New data types are created by manipulating the properties of value space, lexical space, and facets. This can be done by: 1. Creating a list based on preexisting data types. 2. Creating a union of two or more of the preexisting data types. 3. Restricting the values of a preexisting data type.

44 Deriving a List Data Type
A list data type is a list of values separated by white space, in which each item in the list is derived from an established data type. The syntax for deriving a customized list data type is:
   <xs:simpleType name="name">
      <xs:list itemType="type" />
   </xs:simpleType>
Here name is the name assigned to the list data type and type is the data type from which each item in the list is derived.
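
A small sketch of a list type, assuming a hypothetical scoreList type whose items are decimals and a scores element that uses it.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- Each white-space-separated item must be a valid xs:decimal -->
   <xs:simpleType name="scoreList">
      <xs:list itemType="xs:decimal" />
   </xs:simpleType>

   <!-- Valid instance:   <scores>3.2 3.8 4.0</scores>
        Invalid instance: <scores>3.2 high 4.0</scores> -->
   <xs:element name="scores" type="scoreList" />

</xs:schema>
```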

45 Deriving a Union Data Type
A union data type is based on the value and/or lexical spaces from two or more preexisting data types. Each base data type is known as a member data type. The syntax is:
   <xs:simpleType name="name">
      <xs:union memberTypes="type1 type2 type3 ..." />
   </xs:simpleType>
where type1, type2, type3, etc., are the member types that constitute the union.

46 Deriving a Union Data Type
XML Schema also allows unions to be created from nested simple types. The syntax is:
   <xs:simpleType name="name">
      <xs:union>
         <xs:simpleType>
            rules1
         </xs:simpleType>
         <xs:simpleType>
            rules2
         </xs:simpleType>
      </xs:union>
   </xs:simpleType>
where rules1, rules2, etc., are rules for creating different user-derived data types.
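
Both union forms can be sketched with one hypothetical case: a credits value that is either a non-negative integer or one of two status keywords. The type names and the keywords pending and waived are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <!-- Form 1: member types listed by name -->
   <xs:simpleType name="creditsOrStatus">
      <xs:union memberTypes="xs:nonNegativeInteger statusType" />
   </xs:simpleType>

   <xs:simpleType name="statusType">
      <xs:restriction base="xs:string">
         <xs:enumeration value="pending" />
         <xs:enumeration value="waived" />
      </xs:restriction>
   </xs:simpleType>

   <!-- Form 2: the same union built from nested anonymous simple types -->
   <xs:simpleType name="creditsOrStatusNested">
      <xs:union>
         <xs:simpleType>
            <xs:restriction base="xs:nonNegativeInteger" />
         </xs:simpleType>
         <xs:simpleType>
            <xs:restriction base="xs:string">
               <xs:enumeration value="pending" />
               <xs:enumeration value="waived" />
            </xs:restriction>
         </xs:simpleType>
      </xs:union>
   </xs:simpleType>

   <!-- Valid instances: <credits>12</credits>, <credits>pending</credits> -->
   <xs:element name="credits" type="creditsOrStatus" />

</xs:schema>
```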

47 Deriving a Restricted Data Type

48 Constraining Facets Constraining facets are applied to a base type using the structure:
   <xs:simpleType name="name">
      <xs:restriction base="type">
         <xs:facet1 value="value1" />
         <xs:facet2 value="value2" />
      </xs:restriction>
   </xs:simpleType>
where type is the data type on which the restricted data type is based; facet1, facet2, etc., are constraining facets; and value1, value2, etc., are values for the constraining facets.
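
As an illustration of the structure above, the sketch below restricts xs:decimal with three of XML Schema's constraining facets. The gpaType name and the 0 to 4 range are assumptions, not values taken from the tutorial.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:simpleType name="gpaType">
      <xs:restriction base="xs:decimal">
         <!-- Lowest and highest allowed values, both included -->
         <xs:minInclusive value="0" />
         <xs:maxInclusive value="4" />
         <!-- At most two digits after the decimal point -->
         <xs:fractionDigits value="2" />
      </xs:restriction>
   </xs:simpleType>

   <!-- Valid instance:    <gpa>3.75</gpa>
        Invalid instances: <gpa>4.5</gpa>, <gpa>3.755</gpa> -->
   <xs:element name="gpa" type="gpaType" />

</xs:schema>
```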

49 Deriving Data Types Using Regular Expressions
A regular expression is a text string that defines a character pattern. Regular expressions can be created to define patterns for many types of data, including phone numbers, postal address codes, and addresses.

50 Deriving Data Types Using Regular Expressions
To apply a regular expression in a data type, you create the simple type:
   <xs:simpleType name="name">
      <xs:restriction base="type">
         <xs:pattern value="regex" />
      </xs:restriction>
   </xs:simpleType>
where regex is a regular expression pattern. Example:
   <xs:pattern value="ABC" />

51 Regular Expression Character Types
Character types are representations of different kinds of characters. The general form of a character type is: \char

52 Common Regular Expression Character Sets
Characters can also be grouped into lists called character sets that specify exactly what characters or ranges of characters are allowed in the pattern. The syntax of a character set is: [chars]

53 Regular Expression Quantifiers
To specify the number of occurrences for a particular character or group of characters, a quantifier can be appended to a character type or set.
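
A character set, a character type, and quantifiers can be combined in a single pattern, as in the sketch below. The stuIDType name and the ID format (two uppercase letters, a hyphen, four digits) are hypothetical. An XML Schema pattern must match the entire value, so no start or end anchors are needed.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:simpleType name="stuIDType">
      <xs:restriction base="xs:string">
         <!-- [A-Z]{2}  character set with a quantifier: two uppercase letters
              -         a literal hyphen
              \d{4}     character type with a quantifier: four digits -->
         <xs:pattern value="[A-Z]{2}-\d{4}" />
      </xs:restriction>
   </xs:simpleType>

   <!-- Valid instance:    <stuID>IT-1234</stuID>
        Invalid instances: <stuID>it-1234</stuID>, <stuID>IT-12345</stuID> -->
   <xs:element name="stuID" type="stuIDType" />

</xs:schema>
```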

54 Applying Regular Expression

