XML Schema Languages Dongwon Lee, Ph.D.

Slides:



Advertisements
Similar presentations
XML Schema Heewon Lee. Contents 1. Introduction 2. Concepts 3. Example 4. Conclusion.
Advertisements

1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:
1 XML DTD & XML Schema Monica Farrow G30
SDPL 2003Notes 2: Document Instances and Grammars1 2.5 XML Schemas n A quick introduction to XML Schema –W3C Recommendation, May 2, 2001: »XML Schema Part.
XML Schema Definition Language
Lecture 14 XML Validation. a simple element containing text attribute; attributes provide additional information about an element and consist of a name.
1 Week5 – Schema Why Schema? Schemas vs. DTDs Introduction – W3C vs. Microsoft XDR Schema, How To? Element Types – Simple vs. Complex Attributes Restrictions/Facets.
Sunday, June 28, 2015 Abdelali ZAHI : FALL 2003 : XML Schemas XML Schemas Presented By : Abdelali ZAHI Instructor : Dr H.Haddouti.
Introduction to XML This material is based heavily on the tutorial by the same name at
Manohar – Why XML is Required Problem: We want to save the data and retrieve it further or to transfer over the network. This.
XP New Perspectives on XML Tutorial 3 1 DTD Tutorial – Carey ISBN
Lecture 15 XML Validation. a simple element containing text attribute; attributes provide additional information about an element and consist of a name.
Why XML ? Problems with HTML HTML design - HTML is intended for presentation of information as Web pages. - HTML contains a fixed set of markup tags. This.
XML Schema Vinod Kumar Kayartaya. What is XML Schema?  XML Schema is an XML based alternative to DTD  An XML schema describes the structure of an XML.
Dr. Azeddine Chikh IS446: Internet Software Development.
McGraw-Hill/Irwin © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Schemas Ellen Pearlman Eileen Mullin Programming the Web Using XML.
XML Language Family Detailed Examples Most information contained in these slide comes from: These slides are intended.
Introduction to XML. What is XML? Extensible Markup Language XML Easier-to-use subset of SGML (Standard Generalized Markup Language) XML is a.
XML 1 Enterprise Applications CE00465-M XML. 2 Enterprise Applications CE00465-M XML Overview Extensible Mark-up Language (XML) is a meta-language that.
XP 1 DECLARING A DTD A DTD can be used to: –Ensure all required elements are present in the document –Prevent undefined elements from being used –Enforce.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
FIGIS’ML Hands-on training - © FAO/FIGIS An introduction to XML Objectives : –what is XML? –XML and HTML –XML documents structure well-formedness.
1 Dr Alexiei Dingli XML Technologies X-Schema. 2 XML-based alternative to DTD Describes the structure of an XML document Also referred to as XML Schema.
Session IV Chapter 9 – XML Schemas
XML TUTORIAL Portions from w3 schools By Dr. John Abraham.
1 Tutorial 13 Validating Documents with DTDs Working with Document Type Definitions.
Of 33 lecture 3: xml and xml schema. of 33 XML, RDF, RDF Schema overview XML – simple introduction and XML Schema RDF – basics, language RDF Schema –
SDPL 2005Notes 2.5: XML Schemas1 2.5 XML Schemas n Short introduction to XML Schema –W3C Recommendation, 1 st Ed. May, 2001; 2 nd Ed. Oct, 2004: »XML Schema.
New Perspectives on XML, 2nd Edition
XML – Part III. The Element … This type of element either has the element content or the mixed content (child element and data) The attributes of the.
An Introduction to XML Sandeep Bhattaram
1 CIS336 Website design, implementation and management (also Semester 2 of CIS219, CIS221 and IT226) Lecture 5 XML Schema (Based on Møller and Schwartzbach,
Sheet 1XML Technology in E-Commerce 2001Lecture 2 XML Technology in E-Commerce Lecture 2 Logical and Physical Structure, Validity, DTD, XML Schema.
XSD Presented by Kushan Athukorala. 2 Agenda XML Namespaces XML Schema XSD Indicators XSD Data Types XSD Schema References.
XML 2nd EDITION Tutorial 4 Working With Schemas. XP Schemas A schema is an XML document that defines the content and structure of one or more XML documents.
1 Tutorial 14 Validating Documents with Schemas Exploring the XML Schema Vocabulary.
Tutorial 13 Validating Documents with Schemas
Management of XML and Semistructured Data Lecture 10: Schemas Monday, April 30, 2001.
Processing of structured documents Spring 2003, Part 3 Helena Ahonen-Myka.
XML Validation II Schemas Robin Burke ECT 360. Outline Namespaces Documents  Data types XML Schemas Elements Attributes Derived data types RELAX NG.
QUALITY CONTROL WITH SCHEMAS CSC1310 Fall BASIS CONCEPTS SchemaSchema is a pass-or-fail test for document Schema is a minimum set of requirements.
XSD: XML Schema Language Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
XML Validation. a simple element containing text attribute; attributes provide additional information about an element and consist of a name value pair;
CITA 330 Section 4 XML Schema. XML Schema (XSD) An alternative industry standard for defining XML dialects More expressive than DTD Using XML syntax Promoting.
4 Copyright © 2004, Oracle. All rights reserved. Validating XML by Using XML Schema.
SDPL : XML Schemas1 2.5 XML Schemas n Short introduction to XML Schema –W3C Recommendation, 1 st Ed. May, 2001; 2 nd Ed. Oct, 2004: »XML Schema.
MSc in Communication Sciences Program in Technologies for Human Communication Davide Eynard Facoltà di scienze della comunicazione Università.
Extensible Markup Language (XML) Pat Morin COMP 2405.
XML Schemas Dr. Awad Khalil Computer Science Department AUC.
XML: Extensible Markup Language
XML Schema.
XML Schema Languages 1/2 Dongwon Lee, Ph.D.
CMP 051 XML Introduction Session IV
Lecture 9 XML & its applications
Database Systems Week 12 by Zohaib Jan.
Data Modeling II XML Schema & JAXB Marc Dumontier May 4, 2004
دانشكده مهندسي كامپيوتر
Design and Implementation of Software for the Web
Web Programming Maymester 2004
DTD and XML Schema.
ece 720 intelligent web: ontology and beyond
New Perspectives on XML
CMP 051 XML Introduction Session III
XML Technologies X-Schema.
DTD (Document Type Definition)
Lecture 9 XML & its applications
eXtensible Markup Language XML
XML: The new standard -Eric Taylor.
XML Schema Diyar A. Abdulqder
New Perspectives on XML
Presentation transcript:

XML Schema Languages Dongwon Lee, Ph.D. The Pennsylvania State University IST 516 / Fall 2010 http://www.practicingsafetechs.com/TechsV1/XMLSchemas/

In the Last Class

Today DUE: Proj #1 Planning Doc Assignment: Lab #1 Team Based Individual

Course Objectives Understand the purpose of using schemas Study regular expression as the framework to formalize schema languages Understand details of DTD and XML Schema Learn the concept of schema validation

Outline Schema Language in general Content Model DTD XML Schema (XSD) Schema Validation

Motivation The company “Nittany Vacations, LLC” wants to standardize all internal documents using the XML-based format  nvML Gather requirements from all employees Informal description (ie, narrative) of how nvML should look like Q How to describe formally and unambiguously? How to validate an nvML document?

Motivation: Schema = Rules XML Schemas is all about expressing rules: Rules about what data is allowed Rules about how the data must be organized Rules about the relationships between data

Example modified from Roger L. Costello’s slides @ xfront.com Motivation: sep-9.xml <Vacation date=“2010-09-09” guide-by=“Lee”> <Trip segment="1" mode="air"> <Transportation>airplane<Transportation> </Trip> <Trip segment="2" mode="water"> <Transportation>boat</Transportation> <Trip segment="3" mode="ground"> <Transportation>car</Transportation> </Vacation> Example modified from Roger L. Costello’s slides @ xfront.com

Motivation: Validate Validate the XML document against the XML Schema <Vacation date=“2010-09-09” guide-by=“Lee”> <Segment id="1" mode="air"> <Transportation>airplane</Transportation> </Segment> <Segment id="2" mode="water"> <Transportation>boat</Transportation> <Segment id="3" mode="ground"> <Transportation>car</Transportation> </Vacation> Validate the XML document against the XML Schema nvML.dtd or nvML.xsd XML Schema = RULES Rule 1: A vacation has segments. Rule 2: Each segment is uniquely identified. Rule 3: There are three modes of transportation: air, water, gound. Rule 4: Each segment has a mode of transportation. Rule 5: Each segment must identify the specific mode used.

Schema Languages Schema: a formal description of structures / constraints Eg, relational schema describes tables, attributes, keys, .. Schema Language: a formal language to describe schemas Eg, SQL DDL for relational model CREATE TABLE employees ( id INTEGER PRIMARY KEY, first_name CHAR(50) NULL, last_name CHAR(75) NOT NULL, dateofbirth DATE NULL );

Rules in Formal Schema Lang. Why bother formalizing the syntax with a schema? A formal definition provides a precise but human-readable reference Schema processing can be done with existing implementations One’s own tools for own language can benefit: by piping input documents through a schema processor, one can assume that the input is valid and defaults have been inserted

Schema Processing http://www.brics.dk/~amoeller/XML/schemas/schemas.html

Requirements for Schema Lang. Expressiveness Efficiency Comprehensibility

Regular Expressions (RE) Commonly used to describe sequences of characters or elements in schema languages RE to capture content models Σ: a finite Alphabet α in Σ: set only containing the character α α ?: matches zero or one α α *: matches zero or more α’s α +: matches one ore more α’s α β: matches the concatenation of α and β α | β: matches the union of α and β

RE Examples a|b* denotes {ε, a, b, bb, bbb, ...} (a|b)* denotes the set of all strings with no symbols other than a and b, including the empty string: {ε, a, b, aa, ab, ba, bb, aaa, ...} ab*(c|ε) denotes the set of strings starting with a, then zero or more bs and finally optionally a c: {a, ac, ab, abc, abb, abbc, ...}

RE Examples Valid integers: Valid contents of table element in XHTML: 0 | -? (1|2|3|4|5|6|7|8|9) (1|2|3|4|5|6|7|8|9) * Valid contents of table element in XHTML: caption ? (col * | colgroups *) thead ? tfoot ? (tbody * | tr *)

Which Schema Language? Many proposals competing for acceptance W3C Proposals: DTD, XML Data, DCD, DDML, SOX, XML-Schema, … Non-W3C Proposals: Assertion Grammars, Schematron, DSD, TREX, RELAX, XDuce, RELAX-NG, … Different applications have different needs from a schema language

By Rick Jelliffe

Expressive Power (content model) DTD XML-Schema XDuce, RELAX-NG “Taxonomy of XML Schema Languages using Formal Language Theory”, Makoto Murata, Dongwon Lee, Murali Mani, Kohsuke Kawaguchi, In ACM Trans. on Internet Technology (TOIT), Vol. 5, No. 4, page 1-45, November 2005

Closure (content model) DTD XML-Schema XDuce, RELAX-NG Closed under INTERSECT Closed under INTERSECT, UNION, DIFFERENCE

DTD: Document Type Definition XML DTD is a subset of SGML DTD XML DTD is the standard XML Schema Language of the past (and present maybe…) It is one of the simplest and least expressive schema languages proposed for XML model It does not use XML tag notation, but use its own weird notation It cannot express relatively complex constraint (eg, key with scope) well It is being replaced by XML-Schema of W3C and RELAX-NG of OASIS

DTD: Elements <!ELEMENT element-name content-model> Associates a content model to all elements of the given name content models EMPTY: no content is allowed ANY: any content is allowed Mixed content: (#PCDATA | e1 | … | en)* arbitrary sequence of character data and listed elements

DTD: Elements Eg: “Name” element consists of an optional FirstName, followed by mandatory LastName elements, where Both are text string <!ELEMENT Name (FirstName? , LastName) <!ELEMENT FirstName (#PCDATA)> <!ELEMENT LastName (#PCDATA)

DTD: Attributes <!ATTLIST element-name attr-name attr-type attr-default ...> Declares which attributes are allowed or required in which elements attribute types: CDATA: any value is allowed (the default) (value|...): enumeration of allowed values ID, IDREF, IDREFS: ID attribute values must be unique (contain "element identity"), IDREF attribute values must match some ID (reference to an element) ENTITY, ENTITIES, NMTOKEN, NMTOKENS, NOTATION: consider them obsolete…

DTD: Attributes Attribute defaults: #REQUIRED: the attribute must be explicitly provided #IMPLIED: attribute is optional, no default provided "value": if not explicitly provided, this value inserted by default #FIXED "value": as above, but only this value is allowed

DTD: Attributes Eg: “Name” element consists of an optional FirstName, followed by mandatory LastName attributes, where Both are text string <!ELEMENT Name (EMPTY)> <!ATTLIST Name FirstName CDATA #IMPLIED LastName CDATA #REQUIRED>

DTD: Attributes ID vs. IDREF/IDREFS ID: document-wide unique ID (like key in DB) IDREF: referring attribute (like foreign key in DB) <!ELEMENT employee (…)> <!ATTLIST employee eID ID #REQUIRED boss IDREF #IMPLIED> … <employee eID=“a”>…</>…. <employee eID=“b” boss=“a”>…</>

DTD Example <?xml version="1.0"?> XML document that conforms to “event.dtd” <?xml version="1.0"?> <!DOCTYPE event SYSTEM “../../dir/event.dtd” <event eID=“sigmod02”> <acronym>SIGMOD</acronym> <society>ACM</society> <url>www.sigmod02.org</url> <loc> <city>Madison</city> <state>WI</state> </loc> <year>2002</year> </event>

DTD: Example // event.dtd <!ELEMENT event (acronym, society*,url?, loc, year)> <!ATTLIST event eID ID #REQUIRED> <!ELEMENT acronym (#PCDATA)> <!ELEMENT society (#PCDATA)> <!ELEMENT url (#PCDATA)> <!ELEMENT loc (city, state)> <!ELEMENT city (#PCDATA)> <!ELEMENT state (#PCDATA)> <!ELEMENT year (#PCDATA)>

DTD: Type Declaration <?xml version="1.0"?> Associate a DTD schema with an XML document At the beginning lines of an XML document // DTD File: event.dtd <!ELEMENT …. > // XMLFile: event.xml <?xml version="1.0"?> <!DOCTYPE event SYSTEM “http://foo.com/event.dtd">

Exercise: Citation Authors Papers Publish *AID Name Title *PID Area Venues Attend Appear *Vname Year City 31

Exercise: Citation To RDBMS: Authors(*AID, Name, Title) Papers(*PID, Area, *Vname) Venues(*Vname, Year, City) Publish(*AID, *PID) Attend(*AID, *Pname) Appear(*PID, *Vname) 32

Exercise: Citation Relational  XML: Create a dummy root element Make entities as 1st-level children Make columns of entities as attributes of those Relationship as attributes No violation of 1st normal form for many-many relationship like RDBMS One => IDREF, Many => IDREFS 33

Exercise: Citation “Lee” publishes two papers “p1” and “p2” which appear in venues “X” and “Y” in 2006, respectively, and attend only “Y”. “p2” is co-authored by “John” who attends “X”. <Author AID=‘1’ Name=‘Lee’ Title=‘Prof.’/> <Author AID=‘2’ Name=‘John’ Title=‘Prof.’/> <Paper PID=‘p1’ Area=‘DB’/> <Paper PID=‘p2’ Area=‘DB’/> <Venue Vname=‘X’ Year=‘2006’ … /> <Venue Vname=‘Y’ Year=‘2006’ … /> 34

Exercise: Citation “Lee” publishes two papers “p1” and “p2” which appear in venues “X” and “Y” in 2006, respectively, and attend only “Y”. “p2” is co-authored by “John” who attends “X”. <Author AID=‘1’ Name=‘Lee’ Title=‘Prof.’/> <Author AID=‘2’ Name=‘John’ Title=‘Prof.’/> <Paper PID=‘p1’ Area=‘DB’ Vname=‘X’ /> <Paper PID=‘p2’ Area=‘DB’ Vname=‘Y’ /> <Venue Vname=‘X’ Year=‘2006’ … /> <Venue Vname=‘Y’ Year=‘2006’ … /> 35

Exercise: Citation “Lee” publishes two papers “p1” and “p2” which appear in venues “X” and “Y” in 2006, respectively, and attend only “Y”. “p2” is co-authored by “John” who attends “X”. <Author AID=‘1’ Name=‘Lee’ Title=‘Prof.’ Publish=‘p1 p2’ Attend=‘Y’ /> <Author AID=‘2’ Name=‘John’ Title=‘Prof.’ Publish=‘p2’ Attend=‘X’ /> <Paper PID=‘p1’ Area=‘DB’ Vname=‘X’ /> <Paper PID=‘p2’ Area=‘DB’ Vname=‘Y’ /> <Venue Vname=‘X’ Year=‘2006’ … /> <Venue Vname=‘Y’ Year=‘2006’ … /> 36

Exercise: Citation <Dummy> </Dummy> <Author AID=‘1’ Name=‘Lee’ Title=‘Prof.’ Publish=‘p1 p2’ Attend=‘Y’ /> <Author AID=‘2’ Name=‘John’ Title=‘Prof.’ Publish=‘p2’ Attend=‘X’ /> <Paper PID=‘p1’ Area=‘DB’ Vname=‘X’ /> <Paper PID=‘p2’ Area=‘DB’ Vname=‘Y’ /> <Venue Vname=‘X’ Year=‘2006’ … /> <Venue Vname=‘Y’ Year=‘2006’ … /> </Dummy> 37

Exercise: Citation <!ELEMENT Dummy (Author*|Paper*|Venue*)> <!ELEMENT Author EMPTY> <!ATTLIST Author AID ID #REQUIRED Name CDATA #IMPLIED Title CDATA #IMPLIED Publish IDREFS #IMPLIED Attend IDREFS #IMPLIED> <!ELEMENT Paper EMPTY> <!ATTLIST Paper PID ID #REQUIRED Area CDATA #IMPLIED Vname IDREF #REQUIRED> <!ELEMENT Venue EMPTY> <!ATTLIST Venue Vname ID #REQUIRED Year CDATA #IMPLIED City CDATA #IMPLIED> 38

XML Schema New XML schema language from W3C Successor of DTD Unlike DTD, XML Schema is in XML syntax http://www.w3.org/XML/Schema <xsd:complexType name="PurchaseOrderType"> <xsd:sequence> <xsd:element name="shipTo" type="USAddress"/> <xsd:element name="billTo" type="USAddress"/> <xsd:element ref="comment" minOccurs="0"/> <xsd:element name="items" type="Items"/> </xsd:sequence> <xsd:attribute name="orderDate" type="xsd:date"/> </xsd:complexType>

XML Schema vs. DTD: What’s New XML Schemas are extensible to future additions XML Schema V 1.0  1.1  … XML Schemas are richer and more powerful than DTDs XML Schemas are written in XML No <!ELEMENT …> or <!ATTLIST ..> notation XML Schemas support data types XML Schemas support namespaces

New: Data Types XML Schema support data types. Easier to: Describe allowable document content Validate the correctness of data Work with data from a database Define data facets (restrictions on data) Define data patterns (data formats) Convert data between different data types Eg, <date type="date">2010-09-11</date> Ensures a mutual understanding of the content The XML data type "date" requires the format “YYYY-MM-DD”

New: in XML Notation XML Schema uses XML notation <> and </> XML Schema file itself IS an XML file, too No need to learn a new language No need to use new tools Use an XML editor to edit XML Schema files Use XML parser to parse XML Schema files Manipulate an XML Schema using DOM Transform an XML Schema with XSLT

New: Extensibility XML Schema is extensible because XML is extensible XML Schema lets you: Reuse your schema in other schemas Create your own data types derived from the standard types  Inheritance Reference multiple schemas in the same document

Well-Formed: Not Enough Well-Formed: a document conforms to XML syntax rules such as: Begin with XML decl. One unique root Case-sensitive Matching Start / End tags Properly nested Well-formed documents can still contain semantic errors or inconsistencies  Need VALID documents according to schema

note.xml <?xml version="1.0"?> // Reference to schema goes here <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

note.dtd <!ELEMENT note (to, from, heading, body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)>

note.xml with Reference to DTD <?xml version="1.0"?> <!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

note.xsd <?xml version="1.0"?> <xs:schema xmlns:xs= “http://www.w3.org/2001/XMLSchema” targetNamespace= “http://www.w3schools.com” xmlns= “http://www.w3schools.com” elementFormDefault= "qualified"> <xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="heading" type="xs:string"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>

<schema> element <?xml version="1.0"?> <xs:schema xmlns:xs = “http://www.w3.org/2001/XMLSchema” targetNamespace = “http://www.w3schools.com” xmlns = “http://www.w3schools.com” elementFormDefault= "qualified"> . . . </xs:schema> <schema> element is the root element of every XML Schema

<schema> element <?xml version="1.0"?> <xs:schema xmlns:xs = “http://www.w3.org/2001/XMLSchema” targetNamespace = “http://www.w3schools.com” xmlns = “http://www.w3schools.com” elementFormDefault= "qualified"> . . . </xs:schema> Elements & data types in this schema file come from http://www.w3.org/2001/XMLSchema namespace They are to be prefixed with “xs:”

<schema> element <?xml version="1.0"?> <xs:schema xmlns:xs = “http://www.w3.org/2001/XMLSchema” targetNamespace = “http://www.w3schools.com” xmlns = “http://www.w3schools.com” elementFormDefault= "qualified"> . . . </xs:schema> Indicates that the elements defined by this schema (eg, note, to, from, heading, body.) come from the target namespace

<schema> element <?xml version="1.0"?> <xs:schema xmlns:xs = “http://www.w3.org/2001/XMLSchema” targetNamespace = “http://www.w3schools.com” xmlns = “http://www.w3schools.com” elementFormDefault= "qualified"> . . . </xs:schema> Default namespace

<schema> element <?xml version="1.0"?> <xs:schema xmlns:xs = “http://www.w3.org/2001/XMLSchema” targetNamespace = “http://www.w3schools.com” xmlns = “http://www.w3schools.com” elementFormDefault= "qualified"> . . . </xs:schema> Any elements used by the XML instance document which were declared in this schema must be namespace qualified.

note.xml with Reference to XML Schema <?xml version="1.0"?> <note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd”> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

note.xml with Reference to XML Schema <?xml version="1.0"?> <note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd”> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> Default namespace for the “note.xml” file Tell schema validator that all the elements used in “note.xml” file are declared in this namespace

note.xml with Reference to XML Schema <?xml version="1.0"?> <note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd”> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> Once the XML Schema Instance namespace is available  can use schemaLocation attribute

note.xml with Reference to XML Schema <?xml version="1.0"?> <note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd”> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> First value: the namespace to use Second value: the name/location of the XML schema to use for that namespace

Main Features XML Schema defines elements Simple elements: contains only “text” No sub-elements or attributes “text” can be of different types Types from XML schema built-in Eg, boolean, string, date User-defined types Can add restrictions (facets) to a data type to limit its content

Simple Element Common built-in types in XML Schema: <xs:element name="xxx" type="yyy"/> “xxx”: the name of the element “yyy”: the data type of the element Common built-in types in XML Schema: xs:string xs:decimal xs:integer xs:boolean xs:date xs:time

Simple Element Some simple XML elements: <lastname>Refsnes</lastname> <age>36</age> <dateborn>1970-03-27</dateborn> Corresponding simple element definitions: <xs:element name="lastname" type="xs:string"/> <xs:element name="age" type="xs:integer"/> <xs:element name="dateborn" type="xs:date"/>

Simple Element Simple elements may have a default value OR a fixed value specified. Default value is automatically assigned to the element when no other value is specified <xs:element name="color" type="xs:string" default="red"/> Fixed value is also automatically assigned to the element, and one cannot specify another value <xs:element name="color" type="xs:string" fixed="red"/>

<xs:attribute> The syntax for defining an attribute is: <xs:attribute name="xxx" type="yyy"/> Where xxx is the name of the attribute and yyy specifies the data type of the attribute. Simple elements can’t have attributes!

<xs:attribute> An XML element with an attribute: <lastname lang="EN">Smith</lastname> Corresponding attribute definition: <xs:attribute name="lang" type="xs:string"/> Attributes can have default or fixed values. If the attribute is required, add use=“required”

Conforming to Types When an XML element or attribute has a data type defined, it puts restrictions on the element's or attribute's content If an XML element is of type "xs:date" and contains a string like "Hello World", the element will not validate With XML Schemas, you can also add your own restrictions to your XML elements and attributes

Constraining User-Defined Types Defines an element called "age" with a restriction The value of age cannot be lower than 0 or greater than 120 <xs:element name="age"> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:minInclusive value="0"/> <xs:maxInclusive value="120"/> </xs:restriction> </xs:simpleType> </xs:element>

Constraining User-Defined Types Defines an element called "car" with a restriction The only acceptable values are: Audi, Golf, BMW: <xs:element name="car" type="carType"/> <xs:simpleType name="carType"> <xs:restriction base="xs:string"> <xs:enumeration value="Audi"/> <xs:enumeration value="Golf"/> <xs:enumeration value="BMW"/> </xs:restriction> </xs:simpleType> Note: In this case the type "carType" can be used by other elements because it is not a part of the "car" element.

Complex Element What is a Complex Element? A complex element is an XML element that contains other elements and/or attributes. There are four kinds of complex elements: empty elements elements that contain only other elements elements that contain only text elements that contain both other elements and text Note: Each of these elements may contain attributes as well!

Complex Element: Type 1 A complex XML element, "product", which is empty: <product pid="1345"/>

Complex Element: Type 2 A complex XML element, "employee", which contains only other elements: <employee> <firstname>John</firstname> <lastname>Smith</lastname> </employee>

Complex Element: Type 3 A complex XML element, "food", which contains only text: <food type="dessert">Ice cream</food>

Complex Element: Type 4 A complex XML element, "description", which contains both elements and text: <description> It happened on <date lang="norwegian">03.03.99</date> .... </description>

Eg, Define a Complex Element Type 2: element with only sub-elements <employee> <firstname>John</firstname> <lastname>Smith</lastname> </employee>

Eg, Define a Complex Element Method 1: no re-use foreseen <xs:element name="employee"> <xs:complexType> <xs:sequence> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname“ </xs:sequence> </xs:complexType> </xs:element>

Eg, Define a Complex Element Method 2: can reuse “myInfo” type <xs:element name="employee” type=“myInfo”> <xs:complexType name=“myInfo”> <xs:sequence> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname“ </xs:sequence> </xs:complexType> </xs:element>

Eg, Define a Complex Element Method 2: 3 elements can reuse “myInfo” type <xs:element name="employee" type="myInfo"/> <xs:element name="student" type="myInfo"/> <xs:element name="member" type="myInfo"/> <xs:complexType name="myInfo"> <xs:sequence> <xs:element name="firstname" type="xs:string"/> <xs:element name="lastname" type="xs:string"/> </xs:sequence> </xs:complexType>

Indicators Order Occurrence Group All: in any order, only once Choice: either A or B occur Sequence: appear in a specific order Occurrence maxOccurs minOccurs Group Group Name attributeGroup Name

Indicators Example <?xml version="1.0" encoding="ISO-8859-1"?> <persons xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="family.xsd” <person> <full_name>Hege Refsnes</full_name> <child_name>Cecilie</child_name> </person> <person> <full_name>Tove Refsnes</full_name> <child_name>Hege</child_name> <child_name>Stale</child_name> <child_name>Jim</child_name> <child_name>Borge</child_name> </person> <person> <full_name>Stale Refsnes</full_name> </person> </persons>

Indicators Example <?xml version="1.0" encoding="ISO-8859-1"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"> <xs:element name="persons">   <xs:complexType>     <xs:sequence>       <xs:element name="person" maxOccurs="unbounded">         <xs:complexType>           <xs:sequence>             <xs:element name="full_name" type="xs:string"/>             <xs:element name="child_name" type="xs:string"             minOccurs="0" maxOccurs="5"/>           </xs:sequence>         </xs:complexType>       </xs:element>     </xs:sequence>   </xs:complexType> </xs:element> </xs:schema>

DTD vs. XML Schema <!ELEMENT e1 ((e2,e3?)+|e4)> <element name=“e1”> <complexType> <choice> <sequence maxOccurs=“unbounded”> <element ref=“e2”/> <element ref=“e3” minOccurs=“0”/> </sequence> <element ref=“e4”> </choice> </complexType> </element>

Schema Validation http://www.w3.org/2001/03/webdata/xsv

Schema Validation: DTD

Schema Validation: XML Schema

Schema Editors Many open-source or shareware on the Web Eg, XMLSpy Visual schema writing (DTD, XML Schema) And More Xpath Xquery XSLT WSDL SOAP

Your Project #1 If your project is to use XML data at some point Then you should define your schema using either DTD or XML Schema And validate all your XML data according to your schema Include this aspect as part of your project #1 tasks

Lab #1 (DUE: Sep. 15) Individual Lab Given XML files, infer DTD and XML Schema Validate them using W3C’s schema validator Accessibly from the Web Submit DTD and XML Schema files Screenshots showing validation succeeded

References An Introduction to XML and Web Technologies, Anders Møller and Michael I. Schwartzbach, Addison-Wesley, 2006 http://www.brics.dk/~amoeller/XML/schemas/ W3Schools XML Schema Tutorial Much of slides for the XML Schema (2nd half) are modified from W3Schools materials http://www.w3schools.com/schema/default.asp Zvon XML Schema Tutorial http://www.zvon.org/comp/m/schema.html