Lecture 9 XML & its applications

Slides:



Advertisements
Similar presentations
UFCE8V-20-3 Information Systems Development 3 (SHAPE HK) Lecture 12 Extensible Stylesheet Language Transformations : XSLT.
Advertisements

1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:
1 XML DTD & XML Schema Monica Farrow G30
CS 898N – Advanced World Wide Web Technologies Lecture 21: XML Chin-Chih Chang
XML Schemas Microsoft XML Schemas W3C XML Schemas.
Lecture 14 XML Validation. a simple element containing text attribute; attributes provide additional information about an element and consist of a name.
Sunday, June 28, 2015 Abdelali ZAHI : FALL 2003 : XML Schemas XML Schemas Presented By : Abdelali ZAHI Instructor : Dr H.Haddouti.
Unit 4 – XML Schema XML - Level I Basic.
Introduction to XML This material is based heavily on the tutorial by the same name at
17 Apr 2002 XML Schema Andy Clark. What is it? A grammar definition language – Like DTDs but better Uses XML syntax – Defined by W3C Primary features.
Lecture 15 XML Validation. a simple element containing text attribute; attributes provide additional information about an element and consist of a name.
SDPL 2002Notes 2: Document Instances and Grammars1 2.5 XML Schemas n A quick introduction to XML Schema –W3C Recommendation, May 2, 2001: »XML Schema Part.
Why XML ? Problems with HTML HTML design - HTML is intended for presentation of information as Web pages. - HTML contains a fixed set of markup tags. This.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation XML Schema 1 Lecturer.
XML Schema Vinod Kumar Kayartaya. What is XML Schema?  XML Schema is an XML based alternative to DTD  An XML schema describes the structure of an XML.
1 XML Schemas. 2 Useful Links Schema tutorial links:
McGraw-Hill/Irwin © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Schemas Ellen Pearlman Eileen Mullin Programming the Web Using XML.
XML CPSC 315 – Programming Studio Fall 2008 Project 3, Lecture 1.
XP 1 CREATING AN XML DOCUMENT. XP 2 INTRODUCING XML XML stands for Extensible Markup Language. A markup language specifies the structure and content of.
1 © Netskills Quality Internet Training, University of Newcastle Introducing XML © Netskills, Quality Internet Training University.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
Lecture 14 Extensible Stylesheet Language Transformations : XSLT.
Of 33 lecture 3: xml and xml schema. of 33 XML, RDF, RDF Schema overview XML – simple introduction and XML Schema RDF – basics, language RDF Schema –
SDPL 2005Notes 2.5: XML Schemas1 2.5 XML Schemas n Short introduction to XML Schema –W3C Recommendation, 1 st Ed. May, 2001; 2 nd Ed. Oct, 2004: »XML Schema.
New Perspectives on XML, 2nd Edition
 XML DTD and XML Schema Discussion Sessions 1A and 1B Session 2.
XML Instructor: Charles Moen CSCI/CINF XML  Extensible Markup Language  A set of rules that allow you to create your own markup language  Designed.
XP 1 Creating an XML Document Developing an XML Document for the Jazz Warehouse XML Tutorial.
Lecture 16 Introduction to XML Boriana Koleva Room: C54
1 Introduction to XML XML stands for Extensible Markup Language. Because it is extensible, XML has been used to create a wide variety of different markup.
1 Credits Prepared by: Rajendra P. Srivastava Ernst & Young Professor University of Kansas Sponsored by: Ernst & Young, LLP (August 2005) XBRL Module Part.
An Introduction to XML Sandeep Bhattaram
Sheet 1XML Technology in E-Commerce 2001Lecture 2 XML Technology in E-Commerce Lecture 2 Logical and Physical Structure, Validity, DTD, XML Schema.
XML 2nd EDITION Tutorial 4 Working With Schemas. XP Schemas A schema is an XML document that defines the content and structure of one or more XML documents.
1 Tutorial 14 Validating Documents with Schemas Exploring the XML Schema Vocabulary.
Tutorial 13 Validating Documents with Schemas
XML Schema. Why Validate XML? XML documents can generally have any structure XML grammars define specific document structures Validation is the act of.
Processing of structured documents Spring 2003, Part 3 Helena Ahonen-Myka.
Internet & World Wide Web How to Program, 5/e. © by Pearson Education, Inc. All Rights Reserved.2.
QUALITY CONTROL WITH SCHEMAS CSC1310 Fall BASIS CONCEPTS SchemaSchema is a pass-or-fail test for document Schema is a minimum set of requirements.
Introduction to XML Schema John Arnett, MSc Standards Modeller Information and Statistics Division NHSScotland Tel: (x2073)
XSD: XML Schema Language Kanda Runapongsa Dept. of Computer Engineering Khon Kaen University.
XML Validation II Advanced DTDs + Schemas Robin Burke ECT 360.
XML Validation. a simple element containing text attribute; attributes provide additional information about an element and consist of a name value pair;
C Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Introduction to XML Standards.
XML Notes taken from w3schools. What is XML? XML stands for EXtensible Markup Language. XML was designed to store and transport data. XML was designed.
CITA 330 Section 4 XML Schema. XML Schema (XSD) An alternative industry standard for defining XML dialects More expressive than DTD Using XML syntax Promoting.
MSc in Communication Sciences Program in Technologies for Human Communication Davide Eynard Facoltà di scienze della comunicazione Università.
Extensible Markup Language (XML) Pat Morin COMP 2405.
eXtensible Markup Language
Unit 4 Representing Web Data: XML
XML Schema.
ACG 4401 XML Schemas XML Namespaces XLink.
XML – XML DTD.
ACG 4401 XML Schemas XML Namespaces XLink.
XML QUESTIONS AND ANSWERS
CMP 051 XML Introduction Session IV
Lecture 9 XML & its applications
Eugenia Fernandez IUPUI
Data Modeling II XML Schema & JAXB Marc Dumontier May 4, 2004
Creating an XML Document
Design and Implementation of Software for the Web
Web Programming Maymester 2004
Introduction to XML Extensible Markup Language
ece 720 intelligent web: ontology and beyond
Lecture 8 XML & its applications
CMP 051 XML Introduction Session III
Lecture 4 Introduction to XML Extensible Markup Language
New Perspectives on XML
Presentation transcript:

Lecture 9 XML & its applications XML Validation

Well formed XML (reminder from Lecture 13) xml declaration (optional) used by xml processor; this documents conforms to xml version 1 and uses the UTF-8 standard (Unicode optimized for ASCII) <?xml version="1.0" encoding="UTF-8"?> <patient nhs-no="7503557856"> <!-- Patient demographics --> <name > <first>Joseph</first> <middle>Michael</middle> <last>Bloggs</last> <previous/> <preferred>Joe</preferred> </name> <title>Mr</title> <address> <street>2 Gloucester Road</street> <street /> <city>Bristol</city> <county>Avon</county> <postcode>BS2 4QS</postcode> </address> <tel> <home>0117 9541054</home> <mobile>07710 234674</mobile> </tel> <email>joe.bloggs@email.com</email> <fax /> </patient> root element; every well formed xml document must be enclosed by exactly one root element. attribute; attributes provide additional information about an element and consist of a name value pair; the value must be enclosed in a single (‘) or double quote (“) a comment; comments must be delimited by the <!-- --> characters as in xhtml a simple element containing text a complex element containing other elements and text empty elements

Well formed XML displayed in IE & Netscape

Vocabularies and Validity XML documents are not directly written; instead XML is used to create one or more vocabularies, specific custom markup languages (often referred to as XML applications), and it is these languages which are used to create documents. such a language (a set of namespaces, elements, attributes etc. – a vocabulary) is defined using a set of rules which specify the set (potentially infinite) of complying documents. such a set of rules is generically referred to as a schema. for instance, in our example document, we may want to specify rules that state that the <name> element must always contain exactly one each of the <first>, <middle>, <last>, <previous> & <preferred> elements and that they must occur in this order. additional rules we might want to specify are that the <first> & <last> elements must always contain alphanumeric values (not empty) and that they must never exceed 256 characters each.

So what again is Validation? A document conforming to a particular schema is said to be valid, and the process of checking that conformance is called validation. Schema languages differentiate between at least four levels of validation : - The validation of the markup -- controlling the structure of a document. The validation of the content of individual leaf nodes (datatyping) The validation of integrity, i.e. of the links between nodes within a document or between documents. - Any other tests (often called "business rules").

XML schema systems (1) more formally, an XML schema language is a formalization of the constraints, expressed as rules or a model of structure, that apply to a class of XML documents. an XML document constrained (described) by a schema is called an instance document and such a document is considered schema-valid. schemas can serve as design tools, establishing a framework on which implementations can be built. many schema languages are now available including DTD, W3C Schema, Microsoft XML-Data Reduced (XDR), Schematron, NG Relax, TREX, Examplotron and others. the most widely used of these is W3C Schema but first we briefly consider the Document Type Definition (DTD) approach which originated in the days of SGML.

There are currently two types of schema languages : XML Schema Systems (2) - A set of rules that are used to validate a xml document is referred to as a schema. A document conforming to a particular schema is said to be valid against that schema, and the process of checking that conformance is called validation. Schema languages differentiate between at least four levels of validation: The validation of the markup -- controlling the structure of a document. The validation of the content of individual leaf nodes (data-typing) The validation of integrity, i.e. of the links between nodes within a document or between documents. Any other tests (often called "business rules"). There are currently two types of schema languages : - grammar based - for specifying structure, form, and syntax (e.g. DTD, XML Schema, Relax NG) - rule based - for expressing data relationships, such as operational and business rules (e.g. Schematron) We will look at XML validation in detail in a forthcoming lecture

The Document Type Definition (DTD) approach. XML schema systems (3) The Document Type Definition (DTD) approach. DTD’s are written in a formal notation (BNF) that specifies exactly which elements and entities may appear where in the document and what the elements’ contents and attributes are. a DTD can make statements of the type such as “a ‘ul’ element can only contain ‘li’ elements” and every ‘student’ element must have a ‘student_number’ attribute” hence a DTD lists all the elements, attributes and entities the document uses and the context in which it uses them. a validating parser compares a document to its DTD and lists any places where the document differs from the DTD. validity operates on the principal that everything not permitted is forbidden. if an instance document satisfies the DTD it is said to be valid otherwise it is said to be invalid.

Example DTD for patient.xml XML schema languages (1) Example DTD for patient.xml <!ELEMENT patient (name, title, address, tel+, email?, fax?)> <!ATTLIST patient nhs-no CDATA #REQUIRED> <!ELEMENT name (first, middle, last, previous, preferred)> <!ELEMENT title (#PCDATA)> <!ELEMENT address (street+, city, county, postcode)> <!ELEMENT tel (home*, mobile*)> <!ELEMENT email (#PCDATA)> <!ELEMENT fax (#PCDATA)> <!ELEMENT first (#PCDATA)> <!ELEMENT middle (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT previous (#PCDATA)> <!ELEMENT preferred (#PCDATA)> <!ELEMENT street (#PCDATA)> <!ELEMENT city (#PCDATA)> <!ELEMENT county (#PCDATA)> <!ELEMENT postcode (#PCDATA)> <!ELEMENT home (#PCDATA)> <!ELEMENT mobile (#PCDATA)>

XML schema languages (2) So what’s the problem with DTD’s? DTD’s work (to an extent) but there are many issues and limitations with this approach, for example DTD’s do not specify how many instances of each kind of element appear in a document what the character data inside the element look like the semantic meaning of the element; for instance, whether it contains a date or a person’s name. DTD’s cannot specify anything about the length, structure, meaning, allowed values, or other aspects of the text content of an element. what the root element of a document is DTD’s are not in themselves XML documents

XML schema languages (3) W3C XML Schema XML Schemas (http://www.w3.org/XML/Schema) offers a much more powerful way of constraining XML documents than DTD’s. Advantages of Schemas over DTD’s include: in additional to the traditional constraints, XML Schemas allow content model constraints for generic data formats to be built. these defined constraints can be shared (using namespaces) and referenced from other schemas using XLink and XPointer. it follows an object oriented approach, allowing for the definitions of types and inheritance which allows for better maintainability and can save a significant amount of design time.

W3C XML Schema: simple example XML schema languages (4) W3C XML Schema: simple example consider the following simple document <?xml version=“1.0”?> <studentName>Joseph Bloggs</studentName> assuming that the studentName element can only contain a simple string value, the schema for this document would look like <xs:schema xmlns:xsd=http://www.w3.org/2001/XMLSchema> <xs:element name=“studentName” type=“xs:string” /> </xs:schema> - Validatating an instance doc against its schema requires a validating parser such as the Xerces parsar from the Apache XML Project (http://xml.apache.org/xerces-j/)

W3C XML Schema: simple and complex types XML schema systems (5) W3C XML Schema: simple and complex types schemas support two different types of of content: simple and complex. Simple types equates with basic data types (strings, integers, dates, times, etc.) – simple types by definition cannot contain nested elements. <xs:element name=“studentName” type=“xs:string” /> elements that complex types may contain nested elements elements and attributes. Only elements can have complex types, attributes always have simple types. <xs:complexType name="addressType"> <xs:sequence> <xs:element ref="street" minOccurs="2" maxOccurs="unbounded"/> <xs:element ref="city"/> <xs:element ref="county"/> <xs:element ref="postcode"/> </xs:sequence> </xs:complexType>

W3C XML Schema: local versus global declarations XML schema systems (6) W3C XML Schema: local versus global declarations Instance elements declared at the top level of the schema (immediate child of the xs:schema element) are considered global elements. According to the schema specification, any elements declared globally can act as the root element of the instance doc. elements declared with another element declaration (i.e. within a complex type) are considered local. You can element declarations within a schema that have the same name but different semantics if they are declared locally. the side effect of using global declarations may include: - naming conflicts when schemas are shared and/or merged - if more than one element is declared globally, a schema valid document may not contain the expected root element

W3C XML Schema: attributes, data-types and derivation XML schema systems (7) W3C XML Schema: attributes, data-types and derivation attribute declarations attributes are declared using the xs:attribute element. Attributes may be declared globally or locally as part of a complex type definition. data-types there are great range of data-types bulit into XML Schema; xs:string, xs:integer, xs:dateTime, xs:decimal etc. etc. derivation there are three derivation methods in XML Schema - derivation by restriction – where constraints are added on datatype without changing its original meaning, - derivation by list – where new data-types are defined as being lists of values belonging to a data type - derivation by union – where new data-types are defined as allowing values from a set of other data types and lose most of their meaning

XML schema for patient.xml patient.xsd (fragment) <xs:element name="patient"> <xs:complexType> <xs:sequence> <xs:element name="name" type="nameType"/> <xs:element name="title" type="titleType"/> <xs:element name="address" type="addressType"/> <xs:element name="tel" type="telType" maxOccurs="2"/> <xs:element name="email" type="emailType" minOccurs="0"/> <xs:element name="fax" type="xs:string" minOccurs="0"/> </xs:sequence> <xs:attribute name="nhs-no" type="xs:integer" use="required"/> </xs:complexType> </xs:element> <xs:complexType name="nameType"> <xs:element name="first" type="nameStringType"/> <xs:element name="middle" type="nameStringType"/> <xs:element name="last" type="nameStringType"/> <xs:element name="previous" type="nameStringType"/> <xs:element name="preferred" type="nameStringType"/> <xs:simpleType name="nameStringType"> <xs:restriction base="xs:string"> <xs:maxLength value="64"/> </xs:restriction> </xs:simpleType> <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type ="text/xsl" href='patient.xslt'?> <patient nhs-no="7503557856" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="patient.xsd"> <name> <first>Joseph</first> <middle>Michael</middle> <last>Bloggs</last> <previous/> <preferred>Joe</preferred> </name> <title>Mr</title> <address> <street>2 Gloucester Road</street> <street/> <city>Bristol</city> <county>Avon</county> <postcode>BS2 4QS</postcode> </address> <tel> <home>0117 9541054</home> <mobile>07710 234674</mobile> </tel> <email>joe.bloggs@email.com</email> <fax></fax> </patient>

Resources Introduction to XML Schemas http://www.xml.com/pub/a/2000/11/29/schemas/part1.html W3C Schools XML Schema Tutorial http://www.w3schools.com/Schema/default.asp On-line Schema Validator http://schneegans.de/sv/