Introduction to XML Extensible Markup Language


What is XML? XML stands for eXtensible Markup Language. A markup language is used to provide information about a document: tags are added to the document to supply the extra information. HTML tags tell a browser how to display the document. XML tags give a reader some idea of what the data means.

Advantages of XML XML documents are used to transfer data from one place to another, often over the Internet. XML is text (Unicode) based, which takes up less space and can be transmitted efficiently. XML documents can be modularized, so parts can be reused.

Example of an HTML Document
<html>
<head><title>Example</title></head>
<body>
<h1>This is an example of a page.</h1>
<h2>Some information goes here.</h2>
</body>
</html>

Example of an XML Document
<?xml version="1.0"?>
<address>
  <name>Alice Lee</name>
  <email>alee@aol.com</email>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>

Difference Between HTML and XML HTML tags have a fixed meaning and browsers know what it is. XML tags are different for different applications, and users know what they mean. HTML tags are used for display. XML tags are used to describe documents and data.

XML Rules Tags are enclosed in angle brackets. Tags come in pairs, with start-tags and end-tags. Tags must be properly nested: <name><email>…</email></name> is allowed, but <name><email>…</name></email> is not. Tags that do not have end-tags must be terminated by a '/': <br /> is an XHTML example.

More XML Rules Tags are case sensitive: <address> is not the same as <Address>. Names that begin with the letters xml, in any combination of cases, are reserved and should not be used for tags. Tag names may not contain '<' or '&'. Tag names follow roughly the same conventions as Java identifiers, except that hyphens, periods, and a colon are also allowed (the colon is used for namespace prefixes). They must begin with a letter or an underscore and may not contain white space. Documents must have a single root element that encloses all the other elements.

Well-Formed Documents An XML document is said to be well-formed if it follows all the rules. An XML parser is used to check that all the rules have been obeyed. Recent browsers such as Internet Explorer 5 and Netscape 7 come with XML parsers. Parsers are also available for free download over the Internet. Java 1.4 also supports an open-source parser.
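As a minimal sketch of the well-formedness check described above, the following uses the non-validating parser in Python's standard library (xml.etree.ElementTree). The helper name is_well_formed and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser stops at the first rule violation it finds.
        return False


good = "<name><email>alee@aol.com</email></name>"
bad = "<name><email>alee@aol.com</name></email>"  # improperly nested
two_roots = "<name/><email/>"                     # more than one root element

for sample in (good, bad, two_roots):
    print(sample, "->", is_well_formed(sample))
```

A check like this only covers the syntax rules; it says nothing about whether the document conforms to a DTD or schema.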

XML Example Revisited
<?xml version="1.0"?>
<address>
  <name>Alice Lee</name>
  <email>alee@aol.com</email>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>
Markup for the data aids understanding of its purpose. A flat text file is not nearly so clear:
Alice Lee
alee@aol.com
212-346-1234
1985-03-22
The last line looks like a date, but what is it for?

Expanded Example
<?xml version="1.0"?>
<address>
  <name>
    <first>Alice</first>
    <last>Lee</last>
  </name>
  <email>alee@aol.com</email>
  <phone>123-45-6789</phone>
  <birthday>
    <year>1983</year>
    <month>07</month>
    <day>15</day>
  </birthday>
</address>
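To illustrate why the nested form is convenient for programs, here is a small sketch that parses the expanded address document with Python's standard xml.etree.ElementTree and reads individual fields by path. The variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<address>
  <name><first>Alice</first><last>Lee</last></name>
  <email>alee@aol.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year><month>07</month><day>15</day></birthday>
</address>"""

root = ET.fromstring(doc)

# Child elements of the root, in document order.
top_level = [child.tag for child in root]

# Paths such as "name/first" walk down the element tree.
full_name = root.findtext("name/first") + " " + root.findtext("name/last")
birthday = "-".join(
    root.findtext(f"birthday/{part}") for part in ("year", "month", "day")
)

print(top_level)
print(full_name, birthday)
```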

Document Type Definitions A DTD describes the tree structure of a document and something about its data. There are two data types, PCDATA and CDATA. PCDATA is parsed character data. CDATA is character data, not usually parsed. A DTD determines how many times a node may appear, and how child nodes are ordered.

1. Introduction to DTD An XML document may have an optional DTD, which defines the document's grammar. Since the DTD defines the XML document's grammar, we can use an XML parser to check whether an XML document conforms to the grammar defined by the DTD.

The purpose of a DTD is to define the legal building blocks of an XML document. Terminology for XML: -- well-formed: the document follows the XML syntax rules (for example, tags are correctly closed and nested). -- valid: the document is well-formed, has a DTD, and conforms to it. Validation is useful in data exchange.

A DTD can be declared inside the XML document, or as an external reference. 1) Internal DTD This is an example of a simple XML document with an internal DTD:

<!DOCTYPE company [
  <!ELEMENT company ((person | product)*)>
  <!ELEMENT person (ssn, name, office, phone?)>
  <!ELEMENT ssn (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT office (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT product (pid, name, description?)>
  <!ELEMENT pid (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
]>
<company>
  <person>
    <ssn> 12345678 </ssn>
    <name> John </name>
    <office> B432 </office>
    <phone> 1234 </phone>
  </person>
  <product> … </product>
  …
</company>

2) External DTD This is the same XML document with an external DTD:
<!DOCTYPE company SYSTEM "company.dtd">
<company>
  <person>
    <ssn> 12345678 </ssn>
    <name> John </name>
    <office> B432 </office>
    <phone> 1234 </phone>
  </person>
  <product> … </product>
  …
</company>
The DTD can reside at a different URL, and then we refer to it as:
<!DOCTYPE company SYSTEM "http://www.mydtd.com/company.dtd">

This is the file "company.dtd" containing the DTD:
<!ELEMENT company ((person | product)*)>
<!ELEMENT person (ssn, name, office, phone?)>
<!ELEMENT ssn (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT office (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT product (pid, name, description?)>
<!ELEMENT pid (#PCDATA)>
<!ELEMENT description (#PCDATA)>

DTD for address Example-1
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>

DTD Example-2: XML Document and Corresponding DTD
<!DOCTYPE BOOKLIST [
  <!ELEMENT BOOKLIST (BOOK)*>
  <!ELEMENT BOOK (AUTHOR)>
  <!ELEMENT AUTHOR (FIRSTNAME, LASTNAME)>
  <!ELEMENT FIRSTNAME (#PCDATA)>
  <!ELEMENT LASTNAME (#PCDATA)>
  <!ATTLIST BOOK GENRE (Science|Fiction) #REQUIRED>
  <!ATTLIST BOOK FORMAT (Paperback|Hardcover) "Paperback">
]>
<BOOKLIST>
  <BOOK GENRE="Science" FORMAT="Hardcover">
    <AUTHOR>
      <FIRSTNAME> RICHRD </FIRSTNAME>
      <LASTNAME> KARTER </LASTNAME>
    </AUTHOR>
  </BOOK>
</BOOKLIST>

2. DTD - XML building blocks The building blocks of XML documents: 1) Elements: the main building blocks. Example: "company", "person" … 2) Tags: used to mark up elements. 3) Attributes: provide extra information about elements. Attributes are placed inside the start tag of an element. Example: <img src="computer.gif" />

4) PCDATA: parsed character data. Think of character data as the text found between the start tag and the end tag of an XML element, for example <name>John</name>. PCDATA is text that will be parsed by a parser. 5) CDATA: character data. CDATA is text in which the parser does not look for tags. Think of an attribute value. 6) Entities: entities are variables used to define common text.

3. DTD - Elements In the DTD, XML elements are declared with an element declaration. An element declaration has the following syntax: <!ELEMENT element-name (element-content)> The content is either a parenthesized content model or one of the keywords EMPTY and ANY. Element-content models: 1) Empty elements: keyword EMPTY, written without parentheses. <!ELEMENT element-name EMPTY> Example: <!ELEMENT img EMPTY>

2) Text-only: elements that contain only text are declared with #PCDATA: <!ELEMENT element-name (#PCDATA)> Example: <!ELEMENT name (#PCDATA)> 3) Any: the keyword ANY, written without parentheses, declares an element with any content: <!ELEMENT element-name ANY>

4) Complex: a regular expression over other elements. Example: <!ELEMENT company ((person | product)*)> Note: The elements in the regular expression must be defined in the DTD.
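Because an element-content model is a regular expression over child element names, it can be checked with an ordinary regex engine. The sketch below is only an illustration of that idea, not how a real validating parser is implemented: the helpers model_to_regex and children_match are hypothetical, the sample product content is made up, and only element content is handled (no #PCDATA, EMPTY, or ANY).

```python
import re
import xml.etree.ElementTree as ET

# Simplified pattern for element names (ASCII letters, digits, '_', '.', '-').
NAME = re.compile(r"[A-Za-z_][\w.\-]*")


def model_to_regex(model: str) -> "re.Pattern[str]":
    """Translate a DTD element-content model into a Python regex.

    Each child is encoded as "<name>", so the DTD sequence operator ','
    becomes plain concatenation; '|', '?', '*', '+' and parentheses
    already mean the same thing in both notations.
    """
    pattern = NAME.sub(lambda m: "(?:<" + re.escape(m.group(0)) + ">)", model)
    pattern = re.sub(r"[\s,]", "", pattern)  # drop separators and spaces
    return re.compile(pattern)


def children_match(elem: ET.Element, model: str) -> bool:
    """Check the sequence of child element names against a content model."""
    sequence = "".join(f"<{child.tag}>" for child in elem)
    return model_to_regex(model).fullmatch(sequence) is not None


MODELS = {
    "company": "((person | product)*)",
    "person": "(ssn, name, office, phone?)",
    "product": "(pid, name, description?)",
}

company = ET.fromstring(
    "<company>"
    "<person><ssn>12345678</ssn><name>John</name><office>B432</office></person>"
    "<product><pid>1</pid><name>Cam</name></product>"  # hypothetical product
    "</company>"
)
bad_person = ET.fromstring(
    "<person><name>John</name><ssn>12345678</ssn><office>B432</office></person>"
)

print(children_match(company, MODELS["company"]))
print(all(children_match(child, MODELS[child.tag]) for child in company))
print(children_match(bad_person, MODELS["person"]))  # ssn must come first
```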

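5) Mixed content: text interleaved with child elements. A mixed-content declaration must list #PCDATA first, separate the alternatives with '|', and end the group with '*'. Example: <!ELEMENT note (#PCDATA | to | from | header | message)*> This example declares that the element note may contain parsed character data and any number of to, from, header, and message child elements, in any order. A DTD cannot constrain the order or the number of child elements in mixed content.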
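As a small sketch of what mixed content looks like to a program, the following parses a note element whose text is interleaved with child elements, using Python's standard xml.etree.ElementTree. The sample sentence and variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

note = ET.fromstring(
    "<note>Dear <to>Tove</to>, regards from <from>Jani</from>.</note>"
)

# Text before the first child is stored on the parent element...
leading_text = note.text
# ...and text that follows a child is stored as that child's "tail".
after_to = note.find("to").tail
after_from = note.find("from").tail

# itertext() yields every text fragment in document order.
full_text = "".join(note.itertext())

print(repr(leading_text), repr(after_to), repr(after_from))
print(full_text)
```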

4. DTD - Attributes In a DTD, XML element attributes are declared with an ATTLIST declaration. An attribute declaration has the following syntax: <!ATTLIST element-name attribute-name attribute-type default-value> Example: <!ELEMENT person (ssn, name, office, phone?)> <!ATTLIST person age CDATA #REQUIRED>

Attribute declaration examples Example 1: DTD example: <!ELEMENT square EMPTY> <!ATTLIST square width CDATA "0"> XML example: <square width="100"></square> or <square width="100"/>
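The default value in an ATTLIST declaration is supplied by the parser when the attribute is omitted. The sketch below shows this with Python's standard expat binding (xml.parsers.expat), which by default reports attributes derived from declarations in the internal DTD subset. The helper root_attributes is a hypothetical name introduced for this illustration.

```python
from xml.parsers import expat

DTD = (
    "<!DOCTYPE square ["
    "<!ELEMENT square EMPTY>"
    '<!ATTLIST square width CDATA "0">'
    "]>"
)


def root_attributes(text: str) -> dict:
    """Return the attributes the parser reports for the root element."""
    seen = []

    def on_start(name, attrs):
        # attrs includes defaulted attributes unless
        # parser.specified_attributes is switched on.
        seen.append(dict(attrs))

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(text, True)
    return seen[0]


explicit = root_attributes(DTD + '<square width="100"/>')
defaulted = root_attributes(DTD + "<square/>")

print(explicit)
print(defaulted)
```

Note that expat is a non-validating parser: it applies declared defaults but does not check the document against the rest of the DTD.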

5. DTD - Entities Entities are variables used to define shortcuts to common text. Entity references are references to entities. Entities can be declared internally or externally.

Internal Entity Declaration Syntax: <!ENTITY entity-name "entity-value"> DTD Example: <!ENTITY writer "Jan Egil Refsnes."> <!ENTITY copyright "Copyright XML101."> XML example: <author>&writer;&copyright;</author>
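A minimal sketch of how a parser expands internal entities, using Python's standard xml.etree.ElementTree. Wrapping the two declarations in an internal DOCTYPE for the author element is an assumption made so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE author [
  <!ENTITY writer "Jan Egil Refsnes.">
  <!ENTITY copyright "Copyright XML101.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(doc)

# The entity references have been replaced by their declared values.
print(author.text)
```

Entity expansion is also a known attack surface, so parsers that read untrusted input often restrict or disable it.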

XML Schema An XML Schema:
-- defines elements that can appear in a document
-- defines attributes that can appear within elements
-- defines which elements are child elements
-- defines the sequence in which the child elements can appear
-- defines the number of child elements
-- defines whether an element is empty or can include text
-- defines default values for attributes
The purpose of a Schema is to define the legal building blocks of an XML document, just like a DTD.

Schema for First address Example
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Explanation of Example Schema
<?xml version="1.0" encoding="ISO-8859-1" ?>
ISO-8859-1, Latin-1, is the same as UTF-8 in the first 128 characters.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
www.w3.org/2001/XMLSchema contains the schema standards.
<xs:element name="address"> <xs:complexType>
This states that address is a complex type element.
<xs:sequence>
This states that the following elements form a sequence and must come in the order shown.
<xs:element name="name" type="xs:string"/>
This says that the element, name, must be a string.
<xs:element name="birthday" type="xs:date"/>
This states that the element, birthday, is a date. Dates are written in the form yyyy-mm-dd.
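To make the yyyy-mm-dd form concrete, here is a rough sketch of the kind of check a schema validator performs on an xs:date value, written with Python's standard library only. It is a simplification: the helper parse_simple_date is a hypothetical name, and the sketch ignores parts of the xs:date lexical form such as an optional time zone suffix.

```python
import re
from datetime import date
from typing import Optional

# Four-digit year, two-digit month, two-digit day, separated by hyphens.
DATE_FORM = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_simple_date(text: str) -> Optional[date]:
    """Return a date for a yyyy-mm-dd string, or None if it is not one."""
    match = DATE_FORM.fullmatch(text.strip())
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)  # rejects e.g. month 13 or Feb 30
    except ValueError:
        return None


for value in ("1985-03-22", "03/22/1985", "1985-02-30"):
    print(value, "->", parse_simple_date(value))
```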

DTD:
<!ELEMENT paper (title, author*, year, (journal|conference))>
XML Schema:
<xsd:element name="paper" type="papertype"/>
<xsd:complexType name="papertype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="year"/>
    <xsd:choice>
      <xsd:element name="journal"/>
      <xsd:element name="conference"/>
    </xsd:choice>
  </xsd:sequence>
</xsd:complexType>

XML Schema – Better than DTDs XML Schemas:
-- are easier to learn than DTDs
-- are extensible to future additions
-- are richer and more useful than DTDs
-- are written in XML
-- support data types

Example: Shipping Order
<?xml version="1.0"?>
<shipOrder>
  <shipTo>
    <name>Svendson</name>
    <street>Oslo St</street>
    <address>400 Main</address>
    <country>Norway</country>
  </shipTo>
  <items>
    <item>
      <title>Wheel</title>
      <quantity>1</quantity>
      <price>10.90</price>
    </item>
    <item>
      <title>Cam</title>
      <price>9.90</price>
    </item>
  </items>
</shipOrder>

XML Schema for Shipping Order
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="shipOrder" type="order"/>
<xsd:complexType name="order">
  <xsd:sequence>
    <xsd:element name="shipTo" type="shipAddress"/>
    <xsd:element name="items" type="cdItems"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="shipAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="address" type="xsd:string"/>
    <xsd:element name="country" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

XML Schema - Shipping Order (continued)
<xsd:complexType name="cdItems">
  <xsd:sequence>
    <xsd:element name="item" minOccurs="0" maxOccurs="unbounded" type="cdItem"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:complexType name="cdItem">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="quantity" type="xsd:positiveInteger"/>
    <xsd:element name="price" type="xsd:decimal"/>
  </xsd:sequence>
</xsd:complexType>
</xsd:schema>