XML Schema – Part 1
1. Introduction to XML-Schema
2. Schema basics
3. Mechanisms (strategies) for Designing Schema
4. Creating your own Datatypes



Simple XML Alice Smith 123 Maple Street Mill Valley CA Lawnmower Confirm this is electric Baby Monitor

XML
1. XML is for structuring data: spreadsheets, address books, configuration parameters, financial transactions, and technical drawings. XML is a text format for representing structured data. It makes it easy for a computer to generate data, read data, and ensure that the data structure is unambiguous. XML is extensible and platform-independent, and it supports internationalization and localization.
2. XML looks a bit like HTML. Like HTML, XML makes use of tags (words bracketed by '<' and '>') and attributes (of the form name="value"). While HTML specifies what each tag and attribute means, and often how the text between them will look in a browser, XML uses the tags only to delimit pieces of data and leaves the interpretation of the data completely to the application that reads it. If you see "<p>" in an XML file, do not assume it is a paragraph. Depending on the context, it may be a price, a parameter, a person, a p... (and who says it has to be a word with a "p"?). XML can keep data separated from your HTML.

3. XML is a group of technologies. XML 1.0 is the specification that defines what "tags" and "attributes" are. Beyond XML 1.0, "the XML family" is a growing set of modules that offer useful services to accomplish important and frequently demanded tasks:
- XPointer and XFragments are syntaxes in development for pointing to parts of an XML document. An XPointer is a bit like a URL, but instead of pointing to documents on the Web, it points to pieces of data inside an XML file.
- XSLT is a transformation language used for rearranging, adding and deleting tags and attributes.
- The DOM is a standard set of function calls for manipulating XML (and HTML) files from a programming language.
- XML Schemas help developers to precisely define the structures of their own XML-based formats.
There are several more modules and tools available or under development. Keep an eye on W3C's technical reports page.

Purpose of XML Schemas (and DTDs)
Specify:
- the structure of instance documents: "this element contains these elements, which contain these other elements, etc."
- the datatype of each element/attribute: "this element shall hold an integer with the range 0 to 12,000" (DTDs don't do too well with specifying datatypes like this)

What is a Schema?
A piece of information marked up by tags is called an element. Elements may be further enriched by attaching name-value pairs called attributes.
Like Document Type Definitions (DTDs), schemas:
- define the document's structure;
- define elements and attributes;
- declare elements with empty or text content;
- give default values for attributes and elements.
Unlike DTDs, schemas are more powerful and flexible, and they are written in XML syntax.
With an agreed-upon schema, parties exchanging XML data can verify the received data against the schema: the data must be valid as well as well-formed.

<xs:element name="name" type="xs:string"/> Example 1

Structure of the Data (diagram): BOOK, title, author, character, name, dob, isbn. We want to define this structure in the schema. Sample values: Pets, M.Cat, Snoopy, 1966.

Pets M. Cat Snoopy 1950 Patty 1966 XML Schema Example 1

Explanations:
- The type (xsd:string) is prefixed by the namespace prefix associated with XML Schema, indicating a predefined XML Schema datatype.
- Occurrences: specify both minOccurs and maxOccurs. maxOccurs may take the value unbounded; the default value of each is one. Both can be used only in local definitions.
- Facets …/…
- Attributes are declared after the element declarations.
- Compositors: sequence (ordered sequence), all (no order, but all), choice.

Simple/Complex Data type
simpleType is reserved for datatypes holding only values, with no attributes or child elements. complexType covers the datatypes that are not simple (for example, the Book element has attributes and child elements).

Compositors : Using and <xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified">

Compositors: Expressing Any Order <xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified"> ……. Problem: create an element, Book, which contains Author, Title, Date, ISBN, and Publisher, in any order (Note: this is very difficult and ugly with DTDs).
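The any-order problem stated above can be sketched with the all compositor. This is a minimal sketch, not the slide's original markup: the namespace URI http://www.example.org/books and the xsd:string types are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!-- Assumed, hypothetical target namespace: http://www.example.org/books -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/books"
            xmlns="http://www.example.org/books"
            elementFormDefault="qualified">
  <xsd:element name="Book">
    <xsd:complexType>
      <!-- all: every child must appear exactly once, in any order -->
      <xsd:all>
        <xsd:element name="Author" type="xsd:string"/>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Date" type="xsd:string"/>
        <xsd:element name="ISBN" type="xsd:string"/>
        <xsd:element name="Publisher" type="xsd:string"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

In XML Schema 1.0, each element inside all may occur at most once, so this approach does not cover repeated children such as several Author elements.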

Namespaces
A namespace is a collection of names used as element or attribute names in an XML document. Namespaces qualify element names to make them unique, which avoids conflicts between elements with the same name. xmlns is the keyword for a namespace declaration. The idea is that when you are dealing with XML documents from ten different (external) sources, name collisions can occur. If you use namespaces, you distinguish one element from another based on the namespace prefix: myNameSpace:Employee is not the same as yourNameSpace:Employee. When you declare a namespace you also give it a unique URI (Uniform Resource Identifier). The declaration can be explicit or default.

Explicit and Default namespace declaration
- Explicit declaration (bk is the qualifier): Tourist guide
- Default declaration: Tourist guide
A namespace is identified by a Uniform Resource Identifier (URI), which may be a Uniform Resource Locator (URL). It doesn't matter what the URI points to. URIs are used because they are globally unique across the Internet.
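The two declaration styles can be sketched side by side. The element names book and title, the wrapper element examples, and the URI http://www.example.org/books are hypothetical, since the slide's markup did not survive; only the bk prefix and the text "Tourist guide" come from the slide.

```xml
<?xml version="1.0"?>
<!-- examples is only a wrapper so that both variants fit in one well-formed document -->
<examples>
  <!-- Explicit declaration: every element carries the bk prefix -->
  <bk:book xmlns:bk="http://www.example.org/books">
    <bk:title>Tourist guide</bk:title>
  </bk:book>

  <!-- Default declaration: unprefixed elements belong to the same namespace -->
  <book xmlns="http://www.example.org/books">
    <title>Tourist guide</title>
  </book>
</examples>
```

To a namespace-aware parser, both book elements have the same qualified name, because the prefix is only shorthand for the URI.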

The elements and datatypes that are used to construct schemas (schema, element, complexType, sequence, string) come from the XML Schema namespace. Example 1 Book.xsd

targetNamespace=" xmlns=" elementFormDefault="qualified"> Example 1
- The default namespace is the same as the targetNamespace.
- elementFormDefault="qualified" is a directive to any instance documents which conform to this schema: any elements used by the instance document which were declared in this schema must be namespace qualified.
- The reference to Book: in what namespace? Since there is no namespace qualifier, it references the Book element in the default namespace, which is the targetNamespace. Thus, it is a reference to the Book element declaration in this schema.

XML Schema Namespace element complexType schema sequence string integer boolean (schema-for-schemas)

Book Namespace (target namespace): Book, Title, Author, Character, name, dob, ISBN

XMLSchema and XML instance document
<Book xmlns =" xmlns:xsi=" xsi:schemaLocation=" Book.xsd"> … My Life and Times Paul McCartney
1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the namespace.
2. Second, with schemaLocation tell the schema-validator that the namespace is defined by Book.xsd (i.e., schemaLocation contains a pair of values).
3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the XML Schema-instance namespace.
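The three steps can be sketched in a complete instance document. The namespace URI http://www.example.org/books and the child element names Title and Author are assumptions, because the slide's attribute values and tags were lost; the XMLSchema-instance URI is the standard one.

```xml
<?xml version="1.0"?>
<!-- 1. default namespace  2. schemaLocation pair (namespace, file)  3. xsi prefix declaration -->
<Book xmlns="http://www.example.org/books"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.example.org/books Book.xsd">
  <Title>My Life and Times</Title>
  <Author>Paul McCartney</Author>
</Book>
```

The first value of the schemaLocation pair must match the targetNamespace of Book.xsd; the second is a hint telling the validator where to find that schema.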

Referencing a schema in an XML instance document
- Book.xsd: targetNamespace=" ; defines elements in the namespace.
- Book.xml: schemaLocation=" ww.books.org Book.xsd" ; uses elements from the namespace.
A schema defines a new vocabulary. Instance documents use that new vocabulary.

Note multiple levels of checking: Book.xml, Book.xsd, XMLSchema.xsd (schema-for-schemas)
- Validate that the XML document conforms to the rules described in Book.xsd.
- Validate that Book.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas.

Russian Doll Design Example 1
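A Russian Doll schema for the book structure might look like the sketch below, where each element is defined inside its parent. The nesting of name and dob under character, the choice of isbn as an attribute, and the xs:string types are assumptions; the slide's original listing was not preserved.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Only book is global; everything else is declared in place -->
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="character" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="dob" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- Attributes come after the element declarations -->
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```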

Flat Catalog
<xs:element ref="character" minOccurs="0" maxOccurs="unbounded"/> Example 2
- First, definition of the simple types. These are global definitions.
- Next, definition of attributes.
- Next, definition of complex types.
- The cardinality is defined where the elements are referenced.
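The same book structure in Flat Catalog style could be sketched as follows. Element names, the isbn attribute, and the xs:string types are assumptions carried over for illustration; only the ref="character" line appears on the slide.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Simple-type elements, declared globally -->
  <xs:element name="title" type="xs:string"/>
  <xs:element name="author" type="xs:string"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="dob" type="xs:string"/>

  <!-- Attributes -->
  <xs:attribute name="isbn" type="xs:string"/>

  <!-- Complex-type elements, built from references -->
  <xs:element name="character">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="dob"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element ref="author"/>
        <!-- Cardinality is stated at the point of reference -->
        <xs:element ref="character" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="isbn"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because every element is global here, any of them could serve as the root of a valid instance document, which is one practical difference from the Russian Doll style.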

Summary: Mechanisms of definitions
Russian Doll Design:
- Tight structure.
- Multiple occurrences of the same element name with different definitions.
- Depth in the embedded definitions.
- Hardly readable and difficult to maintain when documents are complex.
Flat Catalog:
- Catalog of all the elements.
- References to element and attribute definitions that need to be within the scope of the referencer.
- Using a reference to an element or an attribute is somewhat comparable to cloning an object. The element or attribute is defined first, and it can be duplicated at another place in the document structure by the reference mechanism, in the same way an object can be cloned. The two elements (or attributes) are then two instances of the same class.

Anonymous types (no name) <xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified"> Russian Doll Design Example 3

Named Types <xsd:schema xmlns:xsd=" targetNamespace=" xmlns=" elementFormDefault="qualified"> …/….. …/.. Named type The advantage of splitting out Book's element declarations and wrapping them in a named type is that now this type can be reused by other elements. Example 4
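A named type and its reuse can be sketched as below. The type name BookType, the second element Pamphlet, the child elements, and the namespace URI http://www.example.org/books are all hypothetical, chosen only to show the reuse the slide describes.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/books"
            xmlns="http://www.example.org/books"
            elementFormDefault="qualified">
  <!-- The content model is split out and given a name -->
  <xsd:complexType name="BookType">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Two elements reuse the same named type -->
  <xsd:element name="Book" type="BookType"/>
  <xsd:element name="Pamphlet" type="BookType"/>
</xsd:schema>
```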

Please note that: is equivalent to: Element A instantiates complexType foo. Element A has the complexType definition inlined in the element declaration.
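The equivalence can be sketched as follows. The child elements B and C are assumptions, and the second element is named AInline only because a single schema cannot declare two global elements named A; in the slide both forms describe the same element A.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Form 1: element A instantiates the named complexType foo -->
  <xsd:element name="A" type="foo"/>
  <xsd:complexType name="foo">
    <xsd:sequence>
      <xsd:element name="B" type="xsd:string"/>
      <xsd:element name="C" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Form 2: the same content model inlined as an anonymous type -->
  <xsd:element name="AInline">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="B" type="xsd:string"/>
        <xsd:element name="C" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```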

Summary: Definition Mechanisms
Russian Doll Design:
- Tightly follows the structure of the instance document.
- Defines each element and attribute within its context, and allows multiple occurrences of the same element name to carry different definitions.
- Hardly readable and difficult to maintain when documents are complex.
Flat Catalog:
- A catalog of all the elements available in the instance document and, for each of them, lists of child elements and attributes.
- Uses references to element and attribute definitions that need to be within the scope of the referencer.
- Somewhat comparable to cloning an object. The element or attribute is defined first, and it can be duplicated at another place in the document structure by the reference mechanism, in the same way an object can be cloned. The two elements (or attributes) are then two instances of the same class.
Named Types:
- Give a name to the simpleType and complexType elements.
- Comparable to defining a class and using it to create an object.

Annotating Schemas
Capture the semantics in the XML Schema. The <annotation> element is used for documenting the schema, both for humans and for programs.
- Use <documentation> for providing a comment to humans.
- Use <appinfo> for providing a comment to programs.
The content is any well-formed XML. Note that annotations have no effect on schema validation.

The following constraint is not expressible with XML Schema: The value of element A should be greater than the value of element B. So, we need to use a separate tool (e.g., Schematron) to check this constraint. We will express this constraint in the appinfo section (below). A should be greater than B
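An annotation carrying this constraint might be sketched as below. The element name Pair, the integer types, and the rule element in the http://www.example.org/rules namespace are hypothetical; a real Schematron rule would use Schematron's own vocabulary, which is not reproduced here.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Pair">
    <xsd:annotation>
      <!-- For humans -->
      <xsd:documentation>A should be greater than B</xsd:documentation>
      <!-- For programs: a hypothetical machine-readable rule -->
      <xsd:appinfo>
        <rule xmlns="http://www.example.org/rules" test="A &gt; B"/>
      </xsd:appinfo>
    </xsd:annotation>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="A" type="xsd:integer"/>
        <xsd:element name="B" type="xsd:integer"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

A schema validator ignores the annotation entirely; a separate tool would have to read the appinfo content and enforce the rule.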

Save time and money using XML Schemas
- Code to check the structure and content (datatype) of the data
- Code to actually do the work
"In a typical program, up to 60% of the code is spent checking the data!"

Save time and money using XML Schemas (cont.)
- Code to check the structure and content of the data
- Code to actually do the work
If your data is structured as XML, and there is a schema, then you can hand the data-checking task off to a schema validator. Thus, your code is reduced by up to 60%!!! Big $$ savings!

Classic use of XML Schemas (Trading Partners ) Supplier Consumer XML data Schema Validator XML Schema Software to Process D. “D. is okay" D (Schema at third-party, neutral web site)

XML Schema --> GUI Schema GUI Builder HTML Supplier Web Server

XML Schema --> Smart Editor Schema Smart Editor (e.g., XML Spy) Helps you build your instance documents. For example, it pops up a menu showing you what is valid next. It knows this by looking at the XML Schema!

Element Substitution We can define a group of substitutable elements (called a substitutionGroup) by declaring an element (called the head) and then declaring other elements which state that they are substitutable for the head element. <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/> subway is the head element T is substitutable for subway

Schema: <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/>
Instance doc: Red Line
Alternative instance doc (substitute T for subway): Red Line
This example shows the subway element being substituted with the T element.
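A complete schema around the substitution group could be sketched like this. The container element transportation is an assumption added so that the head element has somewhere to be referenced; the two instance variants are shown in a comment.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Head element -->
  <xsd:element name="subway" type="xsd:string"/>
  <!-- T may appear wherever subway is referenced -->
  <xsd:element name="T" substitutionGroup="subway" type="xsd:string"/>

  <!-- Hypothetical container that references the head element -->
  <xsd:element name="transportation">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="subway"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Both of these instances are valid against this schema:
       <transportation><subway>Red Line</subway></transportation>
       <transportation><T>Red Line</T></transportation> -->
</xsd:schema>
```

Both the head and its substitutes must be declared globally, and a substitute's type must be the same as, or derived from, the head's type.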

Substitution Groups (Example)
Schema: <xsd:element name="metro" substitutionGroup="subway" type="xsd:string"/>
Instance doc: Red Line
Alternative instance doc (customized for Spanish clients): Linea Roja
Remarks:
- Transitive: A, B, C
- Not symmetric
- Blocking substitution:

Creating your own Datatypes
- Simple Types Derivation: for example, you may want to create your own "special" string whose length is 10, and so on.
- Complex Types Derivation

Creating your own Datatypes (Simple datatypes)
- Restriction: xs:restriction elements. The different kinds of restrictions that can be applied on a datatype are called facets.
- Union of datatypes.
- Whitespace-separated lists.

Creating your own Datatypes
<xsd:simpleType name="TelephoneNumber"> <xsd:restriction base="xsd:string"> … </xsd:restriction> </xsd:simpleType>
1. This creates a new datatype called 'TelephoneNumber'.
2. Elements of this type can hold string values,
3. but the string must be exactly 10 characters long, and
4. the string must follow the pattern dd-ddddddd, where 'd' represents a digit. (Obviously, in this example the regular expression makes the length facet redundant.) Patterns are specified using regular expressions.
5. In general we use: restriction, facet, value.
6. Restriction with Complex Type
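Written out in full, the TelephoneNumber type described in points 1 to 4 might look like the sketch below. The facet values follow the slide's description (length 10, two digits, a dash, seven digits); the element Phone is a hypothetical user of the type.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="TelephoneNumber">
    <!-- restriction names the base type; each child is a facet with a value -->
    <xsd:restriction base="xsd:string">
      <xsd:length value="10"/>
      <xsd:pattern value="\d{2}-\d{7}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical element using the new type, e.g. <Phone>12-3456789</Phone> -->
  <xsd:element name="Phone" type="TelephoneNumber"/>
</xsd:schema>
```

Note that simpleType has no type attribute: the base datatype is named on the restriction element instead.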

Multiple Facets: "and" them together, or "or" them together?
- An element declared to be of type TelephoneNumber must be a string of length=10 and the string must follow the pattern: 2 digits, dash, 7 digits.
- An element declared to be of type shape must be a string with a value of either circle, or triangle, or square.
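The shape type can be sketched with enumeration facets. In XML Schema, several enumeration facets (and several pattern facets) in one restriction are alternatives, so they are effectively "or"ed, while facets of different kinds, such as length and pattern, must all hold and are effectively "and"ed.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="shape">
    <xsd:restriction base="xsd:string">
      <!-- The value must match one of these: circle OR triangle OR square -->
      <xsd:enumeration value="circle"/>
      <xsd:enumeration value="triangle"/>
      <xsd:enumeration value="square"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```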

General Form of Creating a Simple Datatype by Specifying Facet Values …
Facets: length, minLength, maxLength, pattern, enumeration, minInclusive, maxInclusive, minExclusive, maxExclusive, ...
Sources: string, boolean, decimal, float, double, duration, dateTime, time, ...
The different kinds of restrictions that can be applied on a datatype are called facets.

Creating your own Datatypes
A new datatype can be defined from an existing datatype (called the "base" type) by specifying values for one or more of the optional facets for the base type. Example: the string primitive datatype has six optional facets:
- length
- minLength
- maxLength
- pattern
- enumeration
- whiteSpace (legal values: preserve, replace, collapse)

Creating a simpleType from another simpleType Thus far we have created a simpleType using one of the built-in datatypes as our base type. However, we can create a simpleType that uses another simpleType as the base.

Example: creating a simpleType from another simpleType 1. simpleType that uses a built-in base type: 2. simpleType that uses another simpleType as the base type:
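The two cases might be sketched as follows. The type names Percentage and PassingScore and their ranges are hypothetical, chosen only to show a second restriction narrowing the first.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- 1. simpleType that uses a built-in base type -->
  <xsd:simpleType name="Percentage">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- 2. simpleType that uses another simpleType as the base type -->
  <xsd:simpleType name="PassingScore">
    <xsd:restriction base="Percentage">
      <xsd:minInclusive value="50"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

A derived type may only tighten the facets it inherits: PassingScore keeps the upper bound of 100 and could not raise it.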

Creating Simple Datatypes (list)
- xs:list defines a whitespace-separated list of values.
- "TelephoneNumber" is the type from the previous example; the list type may contain an arbitrarily long list of TelephoneNumbers.
- You cannot create list types from existing list types, nor from complex types.
- Facets can be applied to list types, such as length, minLength, maxLength.
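A list of TelephoneNumber values, and a facet applied to that list, could be sketched as below. The type names TelephoneNumberList and UpToThreeNumbers are hypothetical; TelephoneNumber is repeated so that the schema is self-contained.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="TelephoneNumber">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{2}-\d{7}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Arbitrarily long, whitespace-separated list of TelephoneNumbers -->
  <xsd:simpleType name="TelephoneNumberList">
    <xsd:list itemType="TelephoneNumber"/>
  </xsd:simpleType>

  <!-- For list types, length facets count items, not characters -->
  <xsd:simpleType name="UpToThreeNumbers">
    <xsd:restriction base="TelephoneNumberList">
      <xsd:maxLength value="3"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```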

Creating simpleType - Union.../…../… <xsd:union memberTypes="TomsFamily RogersFamily "/>

Creating simpleType - Union Alternatively, … … …
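Both ways of writing a union can be sketched together. The enumeration values and the type names Party and PartyInline are hypothetical; only TomsFamily, RogersFamily, and the memberTypes form come from the slides.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="TomsFamily">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Tom"/>
      <xsd:enumeration value="Mary"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="RogersFamily">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Roger"/>
      <xsd:enumeration value="Anne"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Form 1: union of named member types -->
  <xsd:simpleType name="Party">
    <xsd:union memberTypes="TomsFamily RogersFamily"/>
  </xsd:simpleType>

  <!-- Form 2: union of anonymous member types declared inline -->
  <xsd:simpleType name="PartyInline">
    <xsd:union>
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="Tom"/>
          <xsd:enumeration value="Mary"/>
        </xsd:restriction>
      </xsd:simpleType>
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="Roger"/>
          <xsd:enumeration value="Anne"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:union>
  </xsd:simpleType>
</xsd:schema>
```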

Creating Simple Datatypes (union)
NMTOKEN is a simple type (like string, etc.) that should be used only for attributes. Example values: US, Brésil. Now isbnType may hold a value from the union of the simple types NMTOKEN and string.

Complex Datatype Derivation
We can do a form of subclassing of complexType definitions, producing "derived types":
- derive by restriction: create a type which is a subset of the base type. There are two ways to subset the elements: redefine a base type element to have a restricted range of values, or redefine a base type element to have a more restricted number of occurrences.
- derive by extension: extend the parent complexType with more elements.

Publication: Title, Author, Date. BookPublication (extends Publication): ISBN, Publisher.
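Derivation by extension for these two types might be sketched as below. The element types and the unbounded Author are assumptions; the type and element names come from the slide's diagram.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Base type -->
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Derived type: the new elements are appended after the inherited ones -->
  <xsd:complexType name="BookPublication">
    <xsd:complexContent>
      <xsd:extension base="Publication">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

An element of type BookPublication therefore contains Title, Author, Date, ISBN, and Publisher, in that order.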

Derive by Restriction
Elements of type SingleAuthorPublication will have 3 child elements: Title, Author, and Date. However, there must be exactly one Author element. Note that in the restriction type you must repeat all the declarations from the base type (except when the base type has an element with minOccurs="0" and the subtype wishes to delete it).
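The SingleAuthorPublication type could be sketched as follows, assuming a base type Publication in which Author may repeat. The element types are assumptions.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- All declarations are repeated; Author is narrowed to exactly one occurrence -->
  <xsd:complexType name="SingleAuthorPublication">
    <xsd:complexContent>
      <xsd:restriction base="Publication">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
          <xsd:element name="Author" type="xsd:string"/>
          <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

Every instance that is valid against SingleAuthorPublication is also valid against Publication, which is what makes this a restriction.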

Prohibiting Derivations
Sometimes we may want to create a type and disallow all derivations of it, or just disallow extension derivations, or disallow restriction derivations.
Rationale: "For example, I may create a complexType and make it publicly available for others to use. However, I don't want them to extend it with their proprietary extensions or subset it to remove, say, copyright information."
- Publication cannot be extended nor restricted
- Publication cannot be restricted
- Publication cannot be extended
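In XML Schema these three cases are expressed with the final attribute of complexType. The sketch below uses three differently named types only because one schema cannot define Publication three times; the type names and the Title element are hypothetical.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Cannot be extended nor restricted -->
  <xsd:complexType name="PublicationSealed" final="#all">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Cannot be restricted (extension is still allowed) -->
  <xsd:complexType name="PublicationNoRestriction" final="restriction">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Cannot be extended (restriction is still allowed) -->
  <xsd:complexType name="PublicationNoExtension" final="extension">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```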