XML & XML Schema Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology.


Semantic web - Computer Engineering Dept. - Spring
Outline
Markup Languages
  – SGML, HTML, XML
XML Building Blocks
XML Applications
Namespaces
XML Schema

SGML (ISO 8879)
Standard Generalized Markup Language
The international standard for defining descriptions of structure and content in text documents
Interchangeable: device-independent and system-independent
Tags are not predefined
Uses a DTD to validate the structure of the document
Large, powerful, and very complex
Heavily used in industry and commerce for over a decade

HTML (RFC 1866)
HyperText Markup Language
A small SGML application used on the Web (a DTD and a set of processing conventions)
Uses only a predefined set of tags

What is XML?
eXtensible Markup Language
A simplified version of SGML
Maintains the most useful parts of SGML
Designed so that SGML can be delivered over the Web
More flexible and adaptable than HTML
XHTML: a reformulation of HTML 4 in XML 1.0

HTML vs. XML
HTML is used to mark up text so it can be displayed to users.
XML is used to mark up data so it can be processed by computers.
HTML describes both structure (e.g., <p>, <h2>) and appearance (e.g., <font>, <i>).
XML describes only content, or “meaning”.
HTML uses a fixed, unchangeable set of tags.
In XML, you make up your own tags.

HTML vs. XML (2)
HTML is for humans
  – HTML describes web pages
  – You don’t want to see error messages about the web pages you visit
  – Browsers ignore and/or correct as many HTML errors as they can, so HTML is often sloppy
XML is for computers
  – XML describes data
  – The rules are strict and errors are not allowed; in this way, XML is like a programming language
  – Current versions of most browsers can display XML; however, browser support of XML is spotty at best

XML-related technologies
DTD (Document Type Definition) and XML Schemas are used to define legal XML tags and their attributes for particular purposes
XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another
SAX (Simple API for XML)

XML Building blocks - Elements
Delimited by angle brackets
Identify the nature of the content they surround
General format: <element>…</element>
Empty element: <element/>
XML elements have relationships
  – Elements are related as parents and children
Elements have content
  – Elements can have different content types: element, mixed, simple, empty

XML Building blocks - Attributes
Name-value pairs that occur inside start tags after the element name, like: <element attribute="value">
Provide additional information about elements that often is not part of the data
Attributes and elements are somewhat interchangeable
Should I use an element or an attribute?
  – Example using just elements: <name><first>David</first><last>Matuszek</last></name>
  – Example using attributes: <name first="David" last="Matuszek"/>
  – A common guideline: metadata (data about data) should be stored as attributes, and the data itself should be stored as elements
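
The element-versus-attribute choice also affects how a program reads the data. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the tag and attribute names first and last are assumptions made for illustration, and only the name David Matuszek comes from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag and attribute names are illustrative only.
as_elements = ET.fromstring(
    "<name><first>David</first><last>Matuszek</last></name>"
)
as_attributes = ET.fromstring('<name first="David" last="Matuszek"/>')

# Data stored as element content is reached through child elements...
first_from_element = as_elements.find("first").text

# ...while data stored as attributes is read with get() on the element itself.
first_from_attribute = as_attributes.get("first")
last_from_attribute = as_attributes.get("last")

print(first_from_element, first_from_attribute, last_from_attribute)
```

Both documents carry the same information; the difference is whether a consumer walks child elements or reads name-value pairs from the start tag.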

XML Building blocks - Entities
Five special characters must be written as entities:
  – &amp; for & (almost always necessary)
  – &lt; for < (almost always necessary)
  – &gt; for > (not usually necessary)
  – &quot; for " (necessary inside double quotes)
  – &apos; for ' (necessary inside single quotes)
These entities can be used even in places where they are not absolutely required.
These are the only predefined entities in XML.
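
As a rough illustration of the predefined entities, the sketch below escapes a string with xml.sax.saxutils.escape from the Python standard library and then lets a parser turn the entity references back into characters. The sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "AT&T says 1 < 2"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
print(escaped)

# A parser resolves the entity references back into plain characters.
element = ET.fromstring(f"<note>{escaped}</note>")
print(element.text)

# Quotes are only escaped when asked for, via the optional entities mapping.
quoted = escape('say "hi"', {'"': "&quot;"})
print(quoted)
```

Escaping on output and relying on the parser on input keeps special characters from being mistaken for markup.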

XML Building blocks - Declaration
The XML declaration looks like this: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  – The XML declaration is optional in XML 1.0, but including it is good practice (and some tools expect it)
  – If present, the XML declaration must be first; not even whitespace should precede it
  – Note that the brackets are <? and ?>
  – version="1.0" is required (XML 1.1 exists but is rarely used)
  – encoding can be "UTF-8" (a Unicode encoding that is a superset of ASCII), "UTF-16", or something else, or it can be omitted
  – standalone tells whether the document depends on external markup declarations, such as a separate DTD
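
The rule that nothing may precede the declaration can be checked with a parser. This is a small sketch using Python's xml.etree.ElementTree; the helper name parses and the sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

good = b'<?xml version="1.0" encoding="UTF-8"?><note>ok</note>'
# Same document, but with a single space before the declaration.
bad = b' <?xml version="1.0" encoding="UTF-8"?><note>ok</note>'

def parses(data: bytes) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

print(parses(good), parses(bad))
```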

XML Building blocks - Processing instructions
PIs (Processing Instructions) may occur anywhere in the XML document (but usually first)
A PI is a command to the program processing the XML document to handle it in a certain way
XML documents are typically processed by more than one program
Programs that do not recognize a given PI should just ignore it
General format of a PI: <?target instructions?>

XML Building blocks - Comments
Comments are written as <!-- comment text --> and can be put almost anywhere in an XML document (but not inside a tag)
Comments are useful for:
  – Explaining the structure of an XML document
  – Commenting out parts of the XML during development and testing
The character sequence -- cannot occur in the comment
Comments are not displayed by browsers, but can be seen by anyone who looks at the source code

CDATA
By default, all text inside an XML document is parsed
You can force text to be treated as unparsed character data by enclosing it in <![CDATA[ ... ]]>
Any characters, even & and <, can occur inside a CDATA section
Whitespace inside a CDATA section is (usually) preserved
The only real restriction is that the character sequence ]]> cannot occur inside a CDATA section
CDATA is useful when your text has a lot of characters that would otherwise need escaping (for example, if your XML document contains some HTML text)
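
A short sketch of how a parser treats a CDATA section, using Python's xml.etree.ElementTree. The element name example and its content are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The markup-like text inside the CDATA section is not parsed as XML.
doc = "<example><![CDATA[<b>bold</b> & more]]></example>"
element = ET.fromstring(doc)

print(element.text)   # the raw characters, including < and &
print(len(element))   # number of child elements
```

The text comes back exactly as written, and the b tag inside the CDATA section does not become a child element.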

XML Syntax
All XML elements must have a closing tag
XML tags are case sensitive
All XML elements must be properly nested
All XML documents must have a root element
Attribute values must always be quoted
With XML, white space is preserved
With XML, a new line is always stored as LF
Comments in XML: <!-- This is a comment -->

Well-formed XML
Every element must have both a start tag and an end tag, e.g. <name>...</name>
  – But empty elements can be abbreviated: <break />
  – XML tags are case sensitive
  – XML tags may not begin with the letters xml, in any combination of cases
Elements must be properly nested, e.g. not <b><i>bold and italic</b></i>
Every XML document must have one and only one root element
The values of attributes must be enclosed in single or double quotes, e.g. <time unit="days">
Character data cannot contain an unescaped < or &
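
The well-formedness rules above can be exercised with any conforming parser. The sketch below uses Python's xml.etree.ElementTree and treats a ParseError as "not well formed"; the helper name is_well_formed and the sample fragments are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "properly nested": "<p><b><i>bold and italic</i></b></p>",
    "badly nested": "<p><b><i>bold and italic</b></i></p>",
    "unquoted attribute": "<time unit=days>3</time>",
    "two root elements": "<a/><b/>",
    "raw ampersand": "<p>fish & chips</p>",
    "case mismatch": "<Note>text</note>",
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first sample is accepted; each of the others breaks one of the rules listed on the slide.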

Displaying XML
XML documents do not carry information about how to display the data
We can add display information to XML with
  – CSS (Cascading Style Sheets)
  – XSL (eXtensible Stylesheet Language), the preferred option

XML Applications (1) Separate data
XML can separate data from HTML
  – Store data in separate XML files
  – Use HTML for layout and display
  – Use Data Islands, which can be bound to HTML elements
Benefits: changes in the underlying data will not require any changes to your HTML

XML Applications (2) Exchange data
XML is used to exchange data
  – Text format
  – Software-independent, hardware-independent
  – Exchange data between incompatible systems, given that they agree on the same tag definitions
  – Can be read by many different types of applications
Benefits:
  – Reduces the complexity of interpreting data
  – Makes it easier to expand and upgrade a system

XML Applications (3) Store data
XML can be used to store data
  – Plain text file
  – Store data in files or databases
  – Applications can be written to store and retrieve information from the store
  – Other clients and applications can access your XML files as data sources
Benefits: accessible to more applications

XML Applications (4) Create new languages
XML can be used to create new languages, e.g.:
  – WML (Wireless Markup Language), used to mark up Internet applications for handheld devices like mobile phones (WAP)
  – MusicXML, used to publish musical scores

Names in XML
Names (as used for tags and attributes) must begin with a letter or underscore, and can consist of:
  – Letters, both Roman (English) and foreign
  – Digits, both Roman and foreign
  – . (dot)
  – - (hyphen)
  – _ (underscore)
  – : (colon), which should be used only for namespaces
  – Combining characters and extenders (not used in English)

Namespaces
Namespaces are a simple mechanism for creating globally unique names for the elements and attributes of your markup language.
Benefits:
  – Resolves conflicts between identical names in different markup languages
  – Allows different markup languages to be mixed together without ambiguity
Namespaces are implemented by letting an XML name consist of two parts, a prefix and a local part: prefix:localName

Namespaces and URIs
A namespace is defined as a unique string
  – To guarantee uniqueness, typically a URI (Uniform Resource Identifier) is used, because the author “owns” the domain
  – It doesn't have to be a “real” URI; it just has to be a unique string
There are two ways to use namespaces:
  – Declare a default namespace
  – Associate a prefix with a namespace, then use the prefix in the XML to refer to the namespace

Namespace syntax
In any start tag you can use the reserved attribute name xmlns: <element xmlns="namespaceURI">
  – This namespace will be used as the default for all elements up to the corresponding end tag
  – You can override it with a specific prefix
You can use almost this same form to declare a prefix: <element xmlns:prefix="namespaceURI">
  – Use this prefix on every tag and attribute you want to use from this namespace, including end tags; it is not a default prefix
  – You can use the prefix in the start tag in which it is defined: <prefix:element xmlns:prefix="namespaceURI">
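
To see how a default namespace and a prefixed namespace look to a program, here is a minimal sketch with Python's xml.etree.ElementTree. The namespace URIs, the element names, and the prefixes b and r in the lookup mapping are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: one declared as the default, one bound to "rev".
doc = """
<book xmlns="http://example.org/book"
      xmlns:rev="http://example.org/review">
  <title>XML Basics</title>
  <rev:rating>5</rev:rating>
</book>
"""
root = ET.fromstring(doc)

# ElementTree expands every name to {namespace-URI}local-part.
print(root.tag)
for child in root:
    print(child.tag, child.text)

# Lookups use a prefix-to-URI mapping; these prefixes are local to the code
# and need not match the prefixes used in the document.
ns = {"b": "http://example.org/book", "r": "http://example.org/review"}
title = root.find("b:title", ns).text
rating = root.find("r:rating", ns).text
```

The unprefixed book and title elements land in the default namespace, while rating lands in the namespace bound to rev; the prefix itself is only shorthand for the URI.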

Review of XML rules
Start with <?xml version="1.0"?>
XML is case sensitive
You must have exactly one root element that encloses all the rest of the XML
Every element must have a closing tag
Elements must be properly nested
Attribute values must be enclosed in double or single quotation marks
There are only five pre-declared entities

XML as a tree
An XML document represents a hierarchy; a hierarchy is a tree
novel
  foreword
    paragraph: This is the great American novel.
  chapter number="1"
    paragraph: It was a dark and stormy night.
    paragraph: Suddenly, a shot rang out!
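
The tree view can be reproduced by parsing a document and walking it recursively. The sketch below uses Python's xml.etree.ElementTree; the exact nesting of the novel document is an assumption reconstructed from the slide text, and the helper name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the slide's example document.
novel = ET.fromstring(
    "<novel>"
    "<foreword><paragraph>This is the great American novel.</paragraph></foreword>"
    '<chapter number="1">'
    "<paragraph>It was a dark and stormy night.</paragraph>"
    "<paragraph>Suddenly, a shot rang out!</paragraph>"
    "</chapter>"
    "</novel>"
)

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    label = element.tag
    if element.attrib:
        label += " " + " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(novel)
print("\n".join(tree_lines))
```

Each level of indentation corresponds to one parent-child step in the hierarchy.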

Extended document standards
You can define your own XML tag sets, but here are some already available:
  – XHTML: HTML redefined in XML
  – SMIL: Synchronized Multimedia Integration Language
  – MathML: Mathematical Markup Language
  – SVG: Scalable Vector Graphics
  – DrawML: Drawing MetaLanguage
  – ICE: Information and Content Exchange
  – ebXML: Electronic Business with XML
  – cXML: Commerce XML
  – CBL: Common Business Library

XML Schema

XML Validation
"Well Formed" XML document
  – Correct XML syntax
"Valid" XML document
  – “Well formed”
  – Conforms to the rules of a DTD
XML DTD
  – Defines the legal building blocks of an XML document
  – Can be inline in the XML or an external reference
XML Schema
  – An XML-based alternative to DTD, more powerful
  – Supports namespaces and data types

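An Example XML with DTD
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, from, heading, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>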

XML Schemas
“Schema” is a general term
  – DTDs are a form of XML schemas
When we say “XML Schemas,” we usually mean the W3C XML Schema Language
  – This is also known as “XML Schema Definition” language, or XSD

XSD vs. DTD
DTDs provide a very weak specification language
  – You can’t put any restrictions on text content
  – You have very little control over mixed content (text plus elements)
  – You have little control over ordering of elements
DTDs are written in a strange (non-XML) format
  – You need separate parsers for DTDs and XML
The XML Schema Definition language solves these problems
  – XSD gives you much more control over structure and content
  – XSD is written in XML

Referring to a schema
To refer to a DTD in an XML document, the reference goes before the root element:
  – <!DOCTYPE rootElement SYSTEM "url"> <rootElement>...</rootElement>
To refer to an XML Schema in an XML document, the reference goes in the root element:
  – <rootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="url.xsd"> (the second attribute is where your XML Schema definition can be found) ...</rootElement>

The XSD document
Since the XSD is written in XML, it can get confusing which we are talking about.
The file extension is .xsd
The root element is <schema>
The XSD starts like this: <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

The <schema> element may have attributes:
  – xmlns:xs="http://www.w3.org/2001/XMLSchema"
    This is necessary to specify where all our XSD tags are defined
  – elementFormDefault="qualified"
    This means that all XML elements must be qualified (use a namespace)
It is highly desirable to qualify all elements, or problems will arise when another schema is added
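
Because an XSD is itself an XML document, an ordinary XML parser can read it. The sketch below loads a small, hypothetical schema with Python's xml.etree.ElementTree and lists the elements it declares; the element names title and pages are assumptions, and ElementTree only parses the schema, it does not validate documents against it.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# A hypothetical schema; the declared element names are illustrative.
xsd = f"""<?xml version="1.0"?>
<xs:schema xmlns:xs="{XS}" elementFormDefault="qualified">
  <xs:element name="title" type="xs:string"/>
  <xs:element name="pages" type="xs:integer"/>
</xs:schema>
"""

schema = ET.fromstring(xsd)

# The root is the schema element in the XML Schema namespace.
print(schema.tag)
print(schema.get("elementFormDefault"))

# Collect the top-level element declarations as name -> type.
declared = {
    el.get("name"): el.get("type")
    for el in schema.findall(f"{{{XS}}}element")
}
print(declared)
```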

“Simple” and “complex” elements
A “simple” element is one that contains text and nothing else
  – A simple element cannot have attributes
  – A simple element cannot contain other elements
  – A simple element cannot be empty
  – However, the text can be of many different types, and may have various restrictions applied to it
If an element isn’t simple, it’s “complex”
  – A complex element may have attributes
  – A complex element may be empty, or it may contain text, other elements, or both text and other elements

Defining a simple element
A simple element is defined as <xs:element name="name" type="type" /> where:
  – name is the name of the element
  – the most common values for type are xs:boolean, xs:integer, xs:date, xs:string, xs:decimal, xs:time
Other attributes a simple element may have:
  – default="default value": used if no other value is specified
  – fixed="value": no other value may be specified

Defining an attribute
Attributes themselves are always declared as simple types
An attribute is defined as <xs:attribute name="name" type="type" /> where:
  – name and type are the same as for xs:element
Other attributes an attribute declaration may have:
  – default="default value": used if no other value is specified
  – fixed="value": no other value may be specified
  – use="optional": the attribute is not required (default)
  – use="required": the attribute must be present

Restrictions, or “facets”
The general form for putting a restriction on a text value is:
  – <xs:element name="name"> (or xs:attribute) <xs:simpleType> <xs:restriction base="type"> ... the restrictions ... </xs:restriction> </xs:simpleType> </xs:element>

Restrictions on numbers
minInclusive -- number must be ≥ the given value
minExclusive -- number must be > the given value
maxInclusive -- number must be ≤ the given value
maxExclusive -- number must be < the given value
totalDigits -- number must have no more than value digits in total
fractionDigits -- number must have no more than value digits after the decimal point

Restrictions on strings
length -- the string must contain exactly value characters
minLength -- the string must contain at least value characters
maxLength -- the string must contain no more than value characters
pattern -- the value is a regular expression that the string must match
whiteSpace -- not really a “restriction”; tells what to do with whitespace
  – value="preserve": keep all whitespace
  – value="replace": change all whitespace characters to spaces
  – value="collapse": remove leading and trailing whitespace, and replace all sequences of whitespace with a single space
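
The three whiteSpace values can be emulated on an ordinary string to see what each one does. This is a plain-Python sketch of the behaviour described above, not a call into an XSD processor; the function name apply_white_space and the sample string are assumptions for illustration.

```python
def apply_white_space(text: str, mode: str) -> str:
    """Emulate the whiteSpace facet values on a Python string."""
    if mode == "preserve":
        return text
    # "replace": each tab, line feed and carriage return becomes one space.
    replaced = text.translate(str.maketrans("\t\n\r", "   "))
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # After replacing, trim the ends and squeeze runs of spaces to one.
        return " ".join(part for part in replaced.split(" ") if part)
    raise ValueError(f"unknown whiteSpace mode: {mode!r}")

sample = "  red\t green\n"
preserved = apply_white_space(sample, "preserve")
replaced = apply_white_space(sample, "replace")
collapsed = apply_white_space(sample, "collapse")
print(repr(preserved), repr(replaced), repr(collapsed))
```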

Enumeration
An enumeration restricts the value to be one of a fixed set of values
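
As a sketch of what an enumeration looks like in practice, the following Python snippet parses a hypothetical restriction with xml.etree.ElementTree and checks candidate values against the listed ones. The season element and its four values are made up, and the membership test is a hand-rolled stand-in for what a real XSD validator would do.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: a <season> element limited to four string values.
xsd = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="season">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="Spring"/>
        <xs:enumeration value="Summer"/>
        <xs:enumeration value="Autumn"/>
        <xs:enumeration value="Winter"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(xsd)
allowed = [e.get("value") for e in schema.iter(f"{{{XS}}}enumeration")]

def is_allowed(value: str) -> bool:
    """Hand-rolled check: is the value one of the enumerated ones?"""
    return value in allowed

print(allowed, is_allowed("Summer"), is_allowed("Monsoon"))
```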

Complex elements
A complex element is defined as <xs:element name="name"> <xs:complexType> ... information about the complex type ... </xs:complexType> </xs:element>
Inside the complex type, <xs:sequence> says that elements must occur in this order
Remember that attributes are always simple types
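
The ordering rule of xs:sequence can be illustrated with a small, hypothetical person type. The Python sketch below reads the declared order from the schema with xml.etree.ElementTree and compares it with the child order of an instance document. It checks names only and assumes every child occurs exactly once, so it is an illustration of the idea, not a validator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: a <person> whose children must appear in this order.
xsd = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstName" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(xsd)
sequence = schema.find(f".//{{{XS}}}sequence")
expected_order = [child.get("name") for child in sequence]

def children_in_order(instance_xml: str) -> bool:
    """Compare child tag names with the declared xs:sequence order."""
    instance = ET.fromstring(instance_xml)
    return [child.tag for child in instance] == expected_order

ok = children_in_order(
    "<person><firstName>Ada</firstName><lastName>Lovelace</lastName></person>"
)
swapped = children_in_order(
    "<person><lastName>Lovelace</lastName><firstName>Ada</firstName></person>"
)
print(expected_order, ok, swapped)
```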

Declaration and use
So far we’ve been talking about how to declare types, not how to use them
To use a type we have declared, use it as the value of type="..."
  – Scope is important: you cannot use a type if it is local to some other type

xs:sequence
An xs:sequence group defines a complex type whose elements must occur in a specific order

xs:all
xs:all allows elements to appear in any order
Despite the name, the members of an xs:all group can occur once or not at all
You can use minOccurs="0" to specify that an element is optional (default value is 1)
  – In this context, maxOccurs is always 1

Empty elements
Empty elements are (ridiculously) complex

Mixed elements
Mixed elements may contain both text and elements
We add mixed="true" to the xs:complexType element
The text itself is not mentioned in the element, and may go anywhere (it is basically ignored)

Extensions
You can base a complex type on another complex type: <xs:complexType name="newType"> <xs:complexContent> <xs:extension base="otherType"> ...new stuff... </xs:extension> </xs:complexContent> </xs:complexType>

Predefined string types
Recall that a simple element is defined as: <xs:element name="name" type="type" />
Here are a few of the possible string types:
  – xs:string -- a string
  – xs:normalizedString -- a string that doesn’t contain tabs, newlines, or carriage returns
  – xs:token -- a string that doesn’t contain any whitespace other than single spaces
Allowable restrictions on strings:
  – enumeration, length, maxLength, minLength, pattern, whiteSpace

Predefined date and time types
xs:date -- A date in the format CCYY-MM-DD
xs:time -- A time in the format hh:mm:ss (hours, minutes, seconds)
xs:dateTime -- Format is CCYY-MM-DDThh:mm:ss
  – The T is part of the syntax
Allowable restrictions on dates and times:
  – enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, pattern, whiteSpace
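
These lexical formats are close to the ISO 8601 strings that Python's datetime module parses, so they can be tried out as follows. This is only an approximation: Python's fromisoformat methods and the XSD types do not accept exactly the same set of strings (for instance around time zones), and the sample values and the helper name matches are assumptions for illustration.

```python
from datetime import date, datetime, time

def matches(parser, text: str) -> bool:
    """Return True if the given ISO parser accepts the text."""
    try:
        parser(text)
        return True
    except ValueError:
        return False

checks = {
    "date": matches(date.fromisoformat, "2007-03-21"),
    "time": matches(time.fromisoformat, "14:30:00"),
    "dateTime": matches(datetime.fromisoformat, "2007-03-21T14:30:00"),
    "bad month": matches(date.fromisoformat, "2007-13-21"),
}
for label, accepted in checks.items():
    print(f"{label}: {accepted}")
```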

Predefined numeric types
Here are some of the predefined numeric types:
  – xs:decimal, xs:byte, xs:short, xs:int, xs:long
  – xs:positiveInteger, xs:negativeInteger, xs:nonPositiveInteger, xs:nonNegativeInteger
Allowable restrictions on numeric types:
  – enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, fractionDigits, totalDigits, pattern, whiteSpace

Questions?
