XML Technologies X-Schema
What is X-Schema? (1)
  An XML-based alternative to DTD
  Describes the structure of an XML document
  Also referred to as XML Schema Definition (XSD)

What is X-Schema? (2)
An X-Schema defines:
  the elements that can appear in a document
  the attributes that can appear in a document
  which elements are child elements
  the order of child elements
  the number of child elements
  whether an element is empty or can include text
  the data types for elements and attributes
  the default and fixed values for elements and attributes

X-Schema vs DTD
  XML Schemas are extensible to future additions
  XML Schemas are richer and more powerful than DTDs
  XML Schemas are written in XML
  XML Schemas support data types
  XML Schemas support namespaces

X-Schema makes it easy to
  describe allowable document content
  validate the correctness of data
  work with data from a database
  define restrictions on data
  define data formats
  convert data between different data types

Why is it better to use an XML-based schema?
  We already know the language
  It can be edited with XML editors
  It can be parsed with XML parsers
  It can be manipulated with the XML DOM
  It can be transformed with XSLT

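Benefits of X-Schemas
  Ensure secure data communication
    Is 3/11/2004 the 3rd of November or the 11th of March?
    A value typed as xs:date is always written YYYY-MM-DD, so sender and receiver read it the same way
  Schemas can be reused
  One or many schemas can be used in the same document
  You can create your own data types
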
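The date question above can be made concrete with a small schema. This is a minimal sketch, assuming a hypothetical element called orderdate that does not appear in the original slides:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "orderdate" is a hypothetical element name used only for illustration -->
  <xs:element name="orderdate" type="xs:date"/>
</xs:schema>

<!-- Valid instance (3 November 2004):  <orderdate>2004-11-03</orderdate> -->
<!-- Invalid instance (not an xs:date): <orderdate>3/11/2004</orderdate>  -->
```

Because the lexical form is fixed by the type, a validator rejects the ambiguous spelling instead of leaving the receiver to guess.
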
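How to use X-Schema
    <?xml version="1.0"?>
    <note xmlns="http://www.um.edu.mt"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.um.edu.mt note.xsd">
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
  xsi:schemaLocation takes pairs of values: a namespace, then the location of the schema for that namespace
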
The actual schema
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="http://www.um.edu.mt"
               xmlns="http://www.um.edu.mt"
               elementFormDefault="qualified">
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

<schema> element
  Root element of any schema
  May contain attributes:
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
      Binds the xs prefix to the XSchema namespace, which defines the schema vocabulary (element, complexType, etc.)
    targetNamespace="http://www.um.edu.mt"
      Indicates that the elements defined by this schema (to, from, etc.) belong to the above namespace
    xmlns="http://www.um.edu.mt"
      Sets the default namespace
    elementFormDefault="qualified"
      qualified = every element in the instance document, including locally declared children, must be namespace-qualified
      unqualified (the default) = only globally declared elements, such as the root, are namespace-qualified

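The difference between the two elementFormDefault settings is easiest to see in the instance document. The sketch below reuses the note example and the http://www.um.edu.mt namespace from the slides; the prefix n is an arbitrary choice made for this illustration:

```xml
<!-- elementFormDefault="qualified": note and all of its children
     are in the target namespace (here via the default namespace) -->
<note xmlns="http://www.um.edu.mt">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

<!-- elementFormDefault="unqualified" (the default): only the globally
     declared root is in the namespace, the local children carry none:

     <n:note xmlns:n="http://www.um.edu.mt">
       <to>Tove</to>
       <from>Jani</from>
       <heading>Reminder</heading>
       <body>Don't forget me this weekend!</body>
     </n:note>
-->
```
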
Elements
  A simple element contains only text; it has no child elements and no attributes
  The text can be of different types and can carry restrictions (facets)
    <xs:element name="xxx" type="yyy"/>
    <xs:element name="age" type="xs:integer"/>

Element Built-In Types
  xs:string
  xs:decimal
  xs:integer
  xs:boolean
  xs:date
  xs:time

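Default and Fixed values
  default: the value assigned when no other value is given
    <xs:element name="color" type="xs:string" default="red"/>
  fixed: the value is assigned automatically and no other value is allowed
    <xs:element name="color" type="xs:string" fixed="red"/>
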
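How default and fixed behave in an instance document can be sketched as follows. The element country and its value Malta are hypothetical additions, used only so that both cases fit in one schema:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="color" type="xs:string" default="red"/>
  <!-- "country" and "Malta" are made-up names for illustration -->
  <xs:element name="country" type="xs:string" fixed="Malta"/>
</xs:schema>

<!-- <color/>                  valid, treated as "red"
     <color>blue</color>       valid, the default is overridden
     <country/>                valid, treated as "Malta"
     <country>Malta</country>  valid, matches the fixed value
     <country>Italy</country>  invalid, differs from the fixed value -->
```
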
Attributes
  Simple elements cannot have attributes
  Only complex elements have attributes
  An attribute itself is always declared with a simple type
    <xs:attribute name="xxx" type="yyy"/>

Default, Fixed and Optional attributes
    <xs:attribute name="lang" type="xs:string" default="EN"/>
    <xs:attribute name="lang" type="xs:string" fixed="EN"/>
    <xs:attribute name="lang" type="xs:string" use="required"/>
  * Attributes are optional by default, unless use="required" is specified

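An attribute declaration only takes effect inside a complex type. The following is a minimal sketch of a required lang attribute attached to a text-only element; the element name title is an assumption made for this example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "title" is a hypothetical element name -->
  <xs:element name="title">
    <xs:complexType>
      <xs:simpleContent>
        <!-- text content stays a string; the attribute is added on top -->
        <xs:extension base="xs:string">
          <xs:attribute name="lang" type="xs:string" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- Valid:   <title lang="EN">Reminder</title>
     Invalid: <title>Reminder</title>  (the required attribute is missing) -->
```
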
Restrictions (Facets) - Range
    <xs:element name="age">
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="120"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

Restrictions (Facets) – Set 1
    <xs:element name="car">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="Audi"/>
          <xs:enumeration value="Golf"/>
          <xs:enumeration value="BMW"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

Restrictions (Facets) – Set 2
    <xs:element name="car" type="cType"/>
    <xs:simpleType name="cType">
      <xs:restriction base="xs:string">
        <xs:enumeration value="Audi"/>
        <xs:enumeration value="Golf"/>
        <xs:enumeration value="BMW"/>
      </xs:restriction>
    </xs:simpleType>

Restrictions (Facets) - Pattern
    <xs:element name="letter">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:pattern value="[a-z]"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

Restrictions (Facets) - Patterns
Regular expression examples:
  [A-Z]          any single uppercase letter
  [A-Z][A-Z]     two uppercase letters
  [a-zA-Z]       any single letter, either case
  [xyz]          x or y or z
  [0-9]          any single digit
  ([A-Z])+       one or more uppercase letters
  ([A-Z])*       zero or more uppercase letters
  male|female    either male or female
  [a-zA-Z]{8}    exactly 8 letters

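Two of the expressions listed above, dropped into pattern facets, might look like this. The element names gender and initials are assumptions made for this sketch:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "gender" and "initials" are hypothetical element names -->
  <xs:element name="gender">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="male|female"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <xs:element name="initials">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z][A-Z]"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>

<!-- Valid:   <gender>female</gender>    <initials>JS</initials>
     Invalid: <gender>females</gender>   <initials>JSX</initials> -->
```

A schema pattern has to match the whole value, not just part of it, which is why the two invalid values above are rejected even though they contain a matching substring.
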
Restricting white space (1)
    <xs:element name="address">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:whiteSpace value="preserve"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

Restricting white space (2)
The value can be:
  preserve
    White space is not removed or changed
  replace
    Every line feed, tab and carriage return is replaced with a space
  collapse
    Same as replace, then leading and trailing spaces are removed and multiple spaces are collapsed into one

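The effect of collapse is sketched below, reusing the address element from the previous slide. The sample address text is made up for illustration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:whiteSpace value="collapse"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>

<!-- Instance:            <address>  12   Main    Street  </address>
     Value after collapse: "12 Main Street" -->
```
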
Restrictions (Facets) - Length
    <xs:element name="password">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:minLength value="5"/>
          <xs:maxLength value="8"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

Summarising Restrictions
  Constraint      Description
  enumeration     Defines a list of acceptable values
  fractionDigits  Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
  length          Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
  maxExclusive    Specifies the upper bound for numeric values (the value must be less than this value)
  maxInclusive    Specifies the upper bound for numeric values (the value must be less than or equal to this value)
  maxLength       Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
  minExclusive    Specifies the lower bound for numeric values (the value must be greater than this value)
  minInclusive    Specifies the lower bound for numeric values (the value must be greater than or equal to this value)
  minLength       Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
  pattern         Defines the exact sequence of characters that is acceptable, as a regular expression
  totalDigits     Specifies the maximum number of digits allowed. Must be greater than zero
  whiteSpace      Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled

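Several facets from the table can be combined in one restriction. The following sketch assumes a hypothetical price element, not taken from the slides, limited to positive amounts with at most six digits, two of them after the decimal point:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "price" is a hypothetical element name -->
  <xs:element name="price">
    <xs:simpleType>
      <xs:restriction base="xs:decimal">
        <xs:totalDigits value="6"/>
        <xs:fractionDigits value="2"/>
        <xs:minExclusive value="0"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>

<!-- Valid:   <price>1234.56</price>    <price>0.5</price>
     Invalid: <price>12345.678</price>  (too many digits)
              <price>0</price>          (not greater than 0) -->
```
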
Complex Elements
There are four kinds:
  empty elements
  elements that contain only other elements
  elements that contain only text
  elements that contain both other elements and text

Complex Empty Elements
    <xs:element name="product">
      <xs:complexType>
        <xs:attribute name="prodid" type="xs:positiveInteger"/>
      </xs:complexType>
    </xs:element>
  Example:
    <product prodid="1345" />

Complex Elements Only
  Contains only other elements
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

Complex Elements Only
  Example:
    <person>
      <firstname>John</firstname>
      <lastname>Smith</lastname>
    </person>

Complex Text-Only Element
    <xs:element name="shoesize">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:integer">
            <xs:attribute name="country" type="xs:string" />
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
  Example:
    <shoesize country="france">35</shoesize>

Complex Mixed Content
    <xs:element name="letter">
      <xs:complexType mixed="true">
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="orderid" type="xs:positiveInteger"/>
          <xs:element name="shipdate" type="xs:date"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

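An instance document that the mixed-content declaration above would accept could look like the following. The wording of the letter and the values are made up for illustration; only the element names come from the schema:

```xml
<!-- Text may appear between the child elements because mixed="true";
     the children must still follow the declared sequence -->
<letter>
  Dear <name>John Smith</name>,
  your order <orderid>1032</orderid>
  will be shipped on <shipdate>2004-11-03</shipdate>.
</letter>
```
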
7 Indicators
  Order indicators:
    all         all child elements must occur, but in any order
    choice      either one child element or another can occur
    sequence    child elements must appear in the specified sequence
  Occurrence indicators:
    maxOccurs
    minOccurs
  Group indicators (group elements or attributes together and allow for reuse):
    group name
    attributeGroup name

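The group indicators listed above have no example of their own in these slides, so here is a minimal sketch. The names persongroup and personattrs, and the two attributes, are assumptions made for this illustration:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A named element group, defined once -->
  <xs:group name="persongroup">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:group>

  <!-- A named attribute group, defined once -->
  <xs:attributeGroup name="personattrs">
    <xs:attribute name="id" type="xs:positiveInteger"/>
    <xs:attribute name="lang" type="xs:language"/>
  </xs:attributeGroup>

  <!-- Both groups are reused here by reference -->
  <xs:element name="person">
    <xs:complexType>
      <xs:group ref="persongroup"/>
      <xs:attributeGroup ref="personattrs"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Any other complex type in the same schema can reference the same two groups, which is where the reuse comes from.
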
All, Choice, Sequence indicator
    <xs:element name="person">
      <xs:complexType>
        <xs:all>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
        </xs:all>
      </xs:complexType>
    </xs:element>

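For comparison with the all indicator shown above, a choice could be sketched as follows. The child element names employee and member are hypothetical:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <!-- exactly one of the two children may appear -->
      <xs:choice>
        <xs:element name="employee" type="xs:string"/>
        <xs:element name="member" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- Valid:   <person><employee>John</employee></person>
     Invalid: <person><employee>John</employee><member>John</member></person> -->
```
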
Occurrence indicator
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="child_name" type="xs:string" maxOccurs="10" minOccurs="0"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

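Both occurrence indicators default to 1 when omitted, and maxOccurs also accepts the value unbounded. A small variation on the example above, with a hypothetical full_name element added to show the default:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <!-- no indicators given: must occur exactly once -->
        <xs:element name="full_name" type="xs:string"/>
        <!-- may be absent, or repeated without an upper limit -->
        <xs:element name="child_name" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
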
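<any> and <anyAttribute>
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="child_name" type="xs:string" maxOccurs="10" minOccurs="0"/>
          <xs:any namespace="##other" minOccurs="0"/>
        </xs:sequence>
        <xs:anyAttribute/>
      </xs:complexType>
    </xs:element>
  * Allows for any additional element or attribute, but by default it still has to be declared in another XSchema
  * namespace="##other" restricts the wildcard to elements from other namespaces; in XSD 1.0 a wildcard that could also match the optional child_name before it makes the content model ambiguous
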
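How strictly the extra content is checked is controlled by the processContents attribute (strict by default, or lax, or skip). The sketch below uses lax, which validates the extra content only when a declaration for it is available. The ext prefix, its namespace URI, and the names nickname and verified are made-up placeholders:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="child_name" type="xs:string"
                    minOccurs="0" maxOccurs="10"/>
        <!-- ##other: only elements from a namespace other than this schema's,
             so the wildcard cannot be confused with child_name -->
        <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
      </xs:sequence>
      <xs:anyAttribute namespace="##other" processContents="lax"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- Accepted instance:
     <person xmlns:ext="http://example.org/ext" ext:verified="true">
       <child_name>Anna</child_name>
       <ext:nickname>Annie</ext:nickname>
     </person>
-->
```
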
Substitutions
    <xs:element name="name" type="xs:string"/>
    <xs:element name="isem" substitutionGroup="name"/>
  Allows for:
    <customer>
      <isem>John</isem>
    </customer>

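For the substitution to work, both elements have to be declared globally and the containing element has to refer to the head element by ref. The slides do not show the customer declaration, so the one below is an assumption about how it might be written:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- head element -->
  <xs:element name="name" type="xs:string"/>
  <!-- substitutable element; it takes the head's type when none is given -->
  <xs:element name="isem" substitutionGroup="name"/>

  <!-- hypothetical declaration of customer -->
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- Both are valid:
     <customer><name>John</name></customer>
     <customer><isem>John</isem></customer> -->
```
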
String Datatype
  Name              Description
  ENTITIES
  ENTITY
  ID                A string that represents the ID attribute in XML (only used with schema attributes)
  IDREF             A string that represents the IDREF attribute in XML (only used with schema attributes)
  IDREFS
  language          A string that contains a valid language id
  Name              A string that contains a valid XML name
  NCName
  NMTOKEN           A string that represents the NMTOKEN attribute in XML (only used with schema attributes)
  NMTOKENS
  normalizedString  A string that does not contain line feeds, carriage returns, or tabs
  string            A string
  token             A string that does not contain line feeds, carriage returns, tabs, leading or trailing spaces, or multiple spaces

Date Datatype
  Name        Description
  date        Defines a date value
  dateTime    Defines a date and time value
  duration    Defines a time interval
  gDay        Defines a part of a date - the day (---DD)
  gMonth      Defines a part of a date - the month (--MM)
  gMonthDay   Defines a part of a date - the month and day (--MM-DD)
  gYear       Defines a part of a date - the year (YYYY)
  gYearMonth  Defines a part of a date - the year and month (YYYY-MM)
  time        Defines a time value

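To show what values of these types look like in a document, here is a sketch that uses several of them. The event element and its child names are hypothetical; the sample values in the comments follow the lexical forms defined for each type:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "event" and its children are made-up names for illustration -->
  <xs:element name="event">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="day" type="xs:date"/>         <!-- 2004-11-03 -->
        <xs:element name="starts" type="xs:time"/>      <!-- 09:30:00 -->
        <xs:element name="logged" type="xs:dateTime"/>  <!-- 2004-11-03T09:30:00Z -->
        <xs:element name="length" type="xs:duration"/>  <!-- PT1H30M -->
        <xs:element name="month" type="xs:gYearMonth"/> <!-- 2004-11 -->
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
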
Numeric Datatype
  Name                Description
  byte                A signed 8-bit integer
  decimal             A decimal value
  int                 A signed 32-bit integer
  integer             An integer value
  long                A signed 64-bit integer
  negativeInteger     An integer containing only negative values (.., -2, -1)
  nonNegativeInteger  An integer containing only non-negative values (0, 1, 2, ..)
  nonPositiveInteger  An integer containing only non-positive values (.., -2, -1, 0)
  positiveInteger     An integer containing only positive values (1, 2, ..)
  short               A signed 16-bit integer
  unsignedLong        An unsigned 64-bit integer
  unsignedInt         An unsigned 32-bit integer
  unsignedShort       An unsigned 16-bit integer
  unsignedByte        An unsigned 8-bit integer

Example
Exercise
  Define an XML vocabulary for a Toaster Markup Language (TML) using XSchema
  It must support features such as:
    Print message on toast
    Display on LCD
    Time stamp

Questions?