CH 20 XML Schema
Objectives
✦ What’s wrong with DTDs?
✦ What is a schema?
✦ The W3C XML Schema Language
✦ Hello schemas
✦ Complex types
✦ Simple types
✦ Deriving simple types
What’s wrong with DTDs?
✦ XML is being used for object serialization, stock trading, remote procedure calls, vector graphics, and many more things
1. The first problem is that DTDs lack data typing, especially for element content
✦ A DTD can’t say that a PRICE element must contain a number
✦ There’s no way to say that a MONTH element must be an integer between 1 and 12
✦ There’s no way to indicate that a TITLE must contain between 1 and 255 characters
What’s wrong with DTDs? Cont…
✦ Data typing was not needed for the narrative documents that SGML was aimed at
✦ For computer-to-computer exchange of information, data typing is needed
2. The second problem is that DTDs have an unusual non-XML syntax
✦ Parsers and APIs that read an XML document can’t read a DTD
What’s wrong with DTDs? Cont…
✦ Consider this element declaration:
    <!ELEMENT TITLE (#PCDATA)>
✦ This declaration is not legal XML. Reasons:
✦ You can’t begin an element name with an exclamation point
✦ TITLE is not an attribute. Neither is (#PCDATA)
3. The third problem is that DTDs are only marginally extensible and don’t scale very well
What’s wrong with DTDs? Cont…
✦ It’s difficult to combine independent DTDs in a sensible way
✦ There is a practical limit to how large an XML application can be defined before the entire DTD becomes completely unmanageable and incomprehensible
4. The fourth problem is that DTDs don’t allow you to do things that it really feels like you ought to be able to do
What’s wrong with DTDs? Cont…
✦ DTDs cannot enforce the order or number of child elements in mixed content
✦ You can’t enforce a constraint such as "each PARAGRAPH element must begin with exactly one SUMMARY element that is followed by plain text"
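The mixed-content limitation can be illustrated with a small document. This is only a sketch: the DTD below is an assumed declaration for the PARAGRAPH and SUMMARY elements named above, and the sample text is invented. A DTD can only declare mixed content as a repeatable choice, so the document is valid even though it starts with plain text and contains two SUMMARY elements.

```xml
<?xml version="1.0"?>
<!-- Assumed internal DTD subset: mixed content must take the form
     (#PCDATA | child1 | child2 ...)*, so neither order nor count
     of SUMMARY children can be constrained. -->
<!DOCTYPE PARAGRAPH [
  <!ELEMENT PARAGRAPH (#PCDATA | SUMMARY)*>
  <!ELEMENT SUMMARY (#PCDATA)>
]>
<!-- Valid against the DTD, yet it breaks the intended rule:
     text comes first and there are two SUMMARY elements. -->
<PARAGRAPH>Plain text first, <SUMMARY>then a summary</SUMMARY>
and <SUMMARY>then another summary</SUMMARY>.</PARAGRAPH>
```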
What’s wrong with DTDs? Cont…
Schemas are an attempt to solve all these problems by defining a new XML-based syntax for describing the permissible contents of XML documents. It includes the following:
✦ Powerful data typing, including range checking
✦ Namespace-aware validation based on namespace URIs rather than on prefixes
✦ Extensibility and scalability
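As a rough sketch of what this data typing looks like, the three constraints that a DTD cannot express (a numeric PRICE, a MONTH between 1 and 12, a TITLE of 1 to 255 characters) might be written as follows. The choice of xsd:decimal for PRICE and the use of anonymous simple types are assumptions made for illustration, not the only way to write these declarations.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- PRICE must contain a number (assumed here to be a decimal) -->
  <xsd:element name="PRICE" type="xsd:decimal"/>

  <!-- MONTH must be an integer between 1 and 12 (range checking) -->
  <xsd:element name="MONTH">
    <xsd:simpleType>
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="1"/>
        <xsd:maxInclusive value="12"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>

  <!-- TITLE must contain between 1 and 255 characters -->
  <xsd:element name="TITLE">
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:minLength value="1"/>
        <xsd:maxLength value="255"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>

</xsd:schema>
```

Each restriction derives a new simple type from a built-in one, which is the mechanism behind "deriving simple types".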
What Is a Schema?
✦ Schema simply means form or shape
✦ In databases (Oracle, MySQL, etc.), a schema is a description of all the tables in a database and the fields in each table
✦ A database schema also describes what type of data each field can contain, e.g. CHAR, INT, DATE, and so on
✦ An XML schema describes the structure of an XML document
✦ There are different kinds of schemas from different technologies, including vocabulary schemas, RDF schemas, organizational schemas, X.500 schemas, and of course, XML schemas
The W3C XML Schema Language
✦ It was created by the W3C XML Schema Working Group based on many different submissions from a variety of companies and individuals
✦ There are no known patent, trademark, or other intellectual property restrictions that would prevent you from doing anything you might reasonably want to do with schemas
Hello Schemas cont…
✦ A schema can be written and saved in any text editor that knows how to save Unicode files
✦ Schema documents are XML documents and have all the privileges and responsibilities of other XML documents
Hello Schemas
The greeting document:
    <?xml version="1.0"?>
    <GREETING>
    Hello XML!
    </GREETING>
The greeting schema:
    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="GREETING" type="xsd:string"/>
    </xsd:schema>
Save the schema in a file with the .xsd extension
Hello Schemas
✦ The root element of this and all other schemas is schema
✦ It must be in the http://www.w3.org/2001/XMLSchema namespace
✦ This namespace is customarily bound to the prefix xsd or xs
✦ Elements are declared using xsd:element elements
Hello Schemas
✦ The name attribute specifies which element is being declared
✦ This xsd:element element also has a type attribute whose value is the data type of the element
✦ To attach a schema to a document, add an xsi:noNamespaceSchemaLocation attribute to the document’s root element, with the xsi prefix bound to the http://www.w3.org/2001/XMLSchema-instance namespace
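A minimal sketch of the greeting document with a schema attached might look like this. The file name greeting.xsd is an assumption; the attribute value is a URL, relative or absolute, pointing to wherever the schema was actually saved.

```xml
<?xml version="1.0"?>
<!-- greeting.xsd is a hypothetical file name for the saved schema -->
<GREETING xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="greeting.xsd">
  Hello XML!
</GREETING>
```

The attribute is only a hint: a schema-aware parser may use it to locate the schema, but it is free to obtain the schema some other way.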
Complex Types
✦ The W3C XML Schema Language divides elements into complex and simple types
✦ A simple type element is one, such as GREETING, that can only contain text and does not have any attributes
✦ It cannot contain any child elements
✦ Complex type elements can have attributes and can have child elements
✦ Most documents need a mix of both complex and simple elements
Complex Types cont…
yesiam.xml
    <?xml version="1.0"?>
    <SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="song.xsd">
      <TITLE>Yes I Am</TITLE>
      <COMPOSER>Melissa Etheridge</COMPOSER>
      <PRODUCER>Hugh Padgham</PRODUCER>
      <PUBLISHER>Island Records</PUBLISHER>
      <LENGTH>4:24</LENGTH>
      <YEAR>1993</YEAR>
      <ARTIST>Melissa Etheridge</ARTIST>
      <PRICE>$1.25</PRICE>
    </SONG>
Complex Types cont…
The schema that describes the XML document: song.xsd
    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="SONG" type="SongType"/>
      <xsd:complexType name="SongType">
        <xsd:sequence>
          <xsd:element name="TITLE" type="xsd:string"/>
          <xsd:element name="COMPOSER" type="xsd:string"/>
          <xsd:element name="PRODUCER" type="xsd:string"/>
          <xsd:element name="PUBLISHER" type="xsd:string"/>
          <xsd:element name="LENGTH" type="xsd:string"/>
          <xsd:element name="YEAR" type="xsd:string"/>
          <xsd:element name="ARTIST" type="xsd:string"/>
          <xsd:element name="PRICE" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
Complex Types cont… The xsd:complexType element defines a new type
Complex Types cont…
minOccurs and maxOccurs
✦ They specify the minimum and maximum number of instances of the element that may appear at that point in the document
✦ The value of each attribute is an integer greater than or equal to zero
✦ The maxOccurs attribute can also have the value unbounded, to indicate that an unlimited number of the particular element may appear
✦ See Listing 20-7
Complex Types cont…
This schema says that every SongType element must have, in order:
1. Exactly one TITLE (minOccurs="1" maxOccurs="1")
2. At least one, and possibly a great many, COMPOSERs (minOccurs="1" maxOccurs="unbounded")
3. Any number of PRODUCERs, although possibly no producer at all (minOccurs="0" maxOccurs="unbounded")
4. Either one PUBLISHER or no PUBLISHER at all (minOccurs="0" maxOccurs="1")
5. Exactly one LENGTH (minOccurs="1" maxOccurs="1")
6. Exactly one YEAR (minOccurs="1" maxOccurs="1")
7. At least one ARTIST, possibly more (minOccurs="1" maxOccurs="unbounded")
8. An optional PRICE (minOccurs="0" maxOccurs="1")
Complex Types cont…
✦ This is much more flexible and easier to use than the limited ?, *, and + that are available in DTDs
✦ If minOccurs and maxOccurs are not present, the default value of each is 1
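The eight occurrence constraints listed for SongType could be written roughly as below. This is a sketch reconstructed from that list rather than a copy of the book's listing; every element is assumed to be of type xsd:string, as in song.xsd.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="SONG" type="SongType"/>
  <xsd:complexType name="SongType">
    <xsd:sequence>
      <!-- Exactly one TITLE; both attributes repeat the default of 1 -->
      <xsd:element name="TITLE" type="xsd:string"
                   minOccurs="1" maxOccurs="1"/>
      <!-- One or more COMPOSERs -->
      <xsd:element name="COMPOSER" type="xsd:string"
                   minOccurs="1" maxOccurs="unbounded"/>
      <!-- Zero or more PRODUCERs -->
      <xsd:element name="PRODUCER" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
      <!-- Optional PUBLISHER -->
      <xsd:element name="PUBLISHER" type="xsd:string"
                   minOccurs="0" maxOccurs="1"/>
      <!-- Exactly one LENGTH and one YEAR, relying on the defaults -->
      <xsd:element name="LENGTH" type="xsd:string"/>
      <xsd:element name="YEAR" type="xsd:string"/>
      <!-- One or more ARTISTs; minOccurs defaults to 1 -->
      <xsd:element name="ARTIST" type="xsd:string"
                   maxOccurs="unbounded"/>
      <!-- Optional PRICE; maxOccurs defaults to 1 -->
      <xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

For comparison, the DTD operators correspond to ? (minOccurs="0"), * (minOccurs="0" maxOccurs="unbounded"), and + (maxOccurs="unbounded"). A bound such as "between 2 and 5" has no direct DTD equivalent.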
Complex Types cont…
Element content
✦ Used when elements contain other elements
✦ See Listing 20-9
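Complex Types cont…
Element Content
The document, in which each COMPOSER and PRODUCER contains a NAME child:
    <SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="song.xsd">
      <TITLE>Hot Cop</TITLE>
      <COMPOSER>
        <NAME>Jacques Morali</NAME>
      </COMPOSER>
      <COMPOSER>
        <NAME>Henri Belolo</NAME>
      </COMPOSER>
      <COMPOSER>
        <NAME>Victor Willis</NAME>
      </COMPOSER>
      <PRODUCER>
        <NAME>Jacques Morali</NAME>
      </PRODUCER>
      <PUBLISHER>PolyGram Records</PUBLISHER>
      <LENGTH>6:20</LENGTH>
      <YEAR>1978</YEAR>
      <ARTIST>Village People</ARTIST>
    </SONG>
The complex types that describe COMPOSER and PRODUCER:
    <xsd:complexType name="ComposerType">
      <xsd:sequence>
        <xsd:element name="NAME" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="ProducerType">
      <xsd:sequence>
        <xsd:element name="NAME" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>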
Complex Types cont…
Sharing content models
✦ In the Hot Cop document, COMPOSER and PRODUCER have the same content model: a single NAME child element
✦ Instead of declaring a separate type for each, both elements can share one named type:
    <xsd:complexType name="PersonType">
      <xsd:sequence>
        <xsd:element name="NAME" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
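A sketch of how the shared type might be used: both COMPOSER and PRODUCER are declared with type="PersonType". The occurrence constraints and the xsd:string types of the other elements are assumptions carried over from the earlier song schema.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="SONG" type="SongType"/>

  <!-- One content model, declared once -->
  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="NAME" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <!-- Two different elements share the same named type -->
      <xsd:element name="COMPOSER" type="PersonType"
                   maxOccurs="unbounded"/>
      <xsd:element name="PRODUCER" type="PersonType"
                   minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="LENGTH" type="xsd:string"/>
      <xsd:element name="YEAR" type="xsd:string"/>
      <xsd:element name="ARTIST" type="xsd:string"
                   maxOccurs="unbounded"/>
      <xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

A later change to PersonType, such as adding a child element, then applies to both COMPOSER and PRODUCER at once.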
Complex Types cont…
Mixed Content
The declaration: mixed="true" allows text between the GIVEN and FAMILY child elements of NAME
    <xsd:complexType name="PersonType">
      <xsd:sequence>
        <xsd:element name="NAME">
          <xsd:complexType mixed="true">
            <xsd:sequence>
              <xsd:element name="GIVEN" type="xsd:string"/>
              <xsd:element name="FAMILY" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
The document:
    <COMPOSER>
      <NAME>
        Mr. <GIVEN>Jacques</GIVEN> <FAMILY>Morali</FAMILY> Esq.
      </NAME>
    </COMPOSER>
    <COMPOSER>
      <NAME>
        Mr. <GIVEN>Henri</GIVEN> L. <FAMILY>Belolo</FAMILY>, M.D.
      </NAME>
    </COMPOSER>
    <COMPOSER>
      <NAME>
        Mr. <GIVEN>Victor</GIVEN> C. <FAMILY>Willis</FAMILY>
      </NAME>
    </COMPOSER>
    <PRODUCER>
      <NAME>
        Mr. <GIVEN>Jacques</GIVEN> S. <FAMILY>Morali</FAMILY>
      </NAME>
    </PRODUCER>
Simple Types
xsd:gYear and xsd:duration
The end of the SongType sequence now reads:
      <xsd:element name="LENGTH" type="xsd:duration"/>
      <xsd:element name="YEAR" type="xsd:gYear"/>
      <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
    </xsd:sequence>
    </xsd:complexType>
✦ These declarations say that it’s no longer okay for the YEAR and LENGTH elements to contain just any old string of text
✦ Instead, they must contain strings in particular formats
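To make "particular formats" concrete, here is a sketch of the Yes I Am document rewritten for the typed declarations. It assumes song.xsd is the earlier schema with only the LENGTH and YEAR types changed. An xsd:duration uses the form PnYnMnDTnHnMnS, so a length of 4 minutes 24 seconds is written PT4M24S, and the earlier value 4:24 would no longer validate. An xsd:gYear is a four-digit Gregorian year.

```xml
<?xml version="1.0"?>
<SONG xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="song.xsd">
  <TITLE>Yes I Am</TITLE>
  <COMPOSER>Melissa Etheridge</COMPOSER>
  <PRODUCER>Hugh Padgham</PRODUCER>
  <PUBLISHER>Island Records</PUBLISHER>
  <!-- xsd:duration: P = period, T = time part, 4M = 4 minutes,
       24S = 24 seconds -->
  <LENGTH>PT4M24S</LENGTH>
  <!-- xsd:gYear: a four-digit year -->
  <YEAR>1993</YEAR>
  <ARTIST>Melissa Etheridge</ARTIST>
  <PRICE>$1.25</PRICE>
</SONG>
```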
Simple Types
There are 44 built-in simple types in the W3C XML Schema Language
These can be unofficially divided into seven groups:
✦ Numeric types
✦ Time types
✦ XML types
✦ String types
✦ The boolean type
✦ The URI reference type
✦ The binary types
Go to the book for examples
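As a rough illustration, the schema below declares one element for each of the seven groups. The element names are hypothetical, and the assignment of each built-in type to a group follows the informal grouping above, so treat it as a sketch rather than an official classification.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- All element names below are invented for illustration -->
  <xsd:element name="SAMPLE">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Numeric type -->
        <xsd:element name="TRACKS" type="xsd:positiveInteger"/>
        <!-- Time type -->
        <xsd:element name="RELEASED" type="xsd:date"/>
        <!-- XML type -->
        <xsd:element name="LANG" type="xsd:language"/>
        <!-- String type -->
        <xsd:element name="NOTES" type="xsd:string"/>
        <!-- The boolean type -->
        <xsd:element name="IN_PRINT" type="xsd:boolean"/>
        <!-- The URI reference type -->
        <xsd:element name="HOMEPAGE" type="xsd:anyURI"/>
        <!-- Binary type -->
        <xsd:element name="COVER_ART" type="xsd:base64Binary"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```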