XML I.

Learning Objectives
- What is XML
- Features of XML
- Uses of XML
- Structure of an XML document
- Document Type Declaration
- Document Type Definitions (DTDs)

What is XML?
- XML stands for Extensible Markup Language. It is NOT a version of HTML.
- Derived from SGML (Standard Generalized Markup Language), which was established in 1986 as a standard for generalized electronic document exchange.
- Has 3 main features: structure, extensibility and validation.
- XML defines a framework for transmitting structured data; hence an XML document is essentially a structured document for storing information.
- Allows creation of custom mark-up tags for describing virtually anything.
- XML documents are processed by an XML processor.

Uses of XML
- Storage and exchange of structured data between the applications that constitute the core of systems.
- Examples of XML applications are Chemical Markup Language (CML), Extensible Financial Reporting Markup Language (XFRML), and Mathematical Markup Language (MathML).
- Used in e-commerce to store and transmit product and other data, including financial information.
- Used in Open Financial eXchange.
- Used in search engines to store and search data.
- Applied in virtually every sector.

XML documents can be validated by including or referencing a Document Type Definition (DTD).

XML Syntax Fundamentals
- XML syntax describes the constructs used to define the structure and layout of an XML document, as well as the constraints involved.
- An XML processor is a software module that reads an XML document and provides access to its content and structure.
- XML processors typically process documents on behalf of applications, and are readily available as software plug-ins.
- Internet Explorer 5.0 is an example of an application that processes and displays XML documents.

Entity: The basic building block of an XML document.
- Contains either parsed or unparsed data.
- Parsed data consists of characters that are treated as character data or mark-up, and is processed by an XML processor.
- Unparsed data is handled as raw text and is not processed.
- E.g. in <name>John</name>, <name> and </name> are mark-up, while John is character data.

Markup: Used to describe a document's storage structure (entities) and logical structure (elements).

Elements: Describe the logical structure. They have a start tag, e.g. <name>, and an end tag, e.g. </name>, or a single empty-element tag, e.g. <name/>.

XML mark-up components include:
- Tags: The most obvious component of XML syntax, used to delimit elements.
- Processing instructions: Passed by the parser to the application. Begin with <? and end with ?>. The XML declaration is written with the same delimiters, e.g. <?xml version="1.0"?> indicates that the document is based on XML version 1.0.
- Document type declarations: Used to specify information about the document, including the document's root element and the Document Type Definition (DTD). Must appear after the XML declaration, but before the root element, e.g.

    <?xml version="1.0"?>
    <!DOCTYPE addressbook SYSTEM "Addressbook.dtd">
    <addressbook>
      <contact>

  The name addressbook declared in line 2 must match <addressbook> in line 3, the root element of the document.

- Entity references: Used to assign aliases to pieces of data. They are written between an ampersand (&) and a semicolon (;). E.g. &apos; corresponds to an apostrophe (') while &amp; corresponds to an ampersand (&).
- Comments: Used to present information that is technically not part of the document's content. Begin with <!-- and end with -->.
- Marked (CDATA) sections: Used to block off text that the parser should not interpret as mark-up. Defined by enclosing the text within <![CDATA[ and ]]>. E.g. <![CDATA[<name>John</name>]]>. In this example, <name> and </name> are not recognized as mark-up; the whole string, tags included, is treated as literal character data. It is common to use CDATA sections to quote a piece of XML code, e.g. in a tutorial.
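The three constructs above can be seen side by side in one small document. This is only a sketch: the element names note, text and sample are made up for illustration.

```xml
<?xml version="1.0"?>
<!-- A comment: not part of the document's content -->
<note>
  <!-- The two entity references below stand for an ampersand and an apostrophe -->
  <text>Tom &amp; Jerry&apos;s address book</text>
  <!-- Inside a CDATA section, the tags are plain text, not mark-up -->
  <sample><![CDATA[<name>John</name>]]></sample>
</note>
```

An application reading this document would see the text "Tom & Jerry's address book" in the text element, and the literal string <name>John</name> as the content of the sample element.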

Styling XML for display
Accomplished in 2 ways:
- With CSS.
- With XSL, which is more complex and advanced than CSS.

Parsing XML
- Parsers can be validating or non-validating.
- Validating parsers validate XML documents against a DTD or XML Schema.
- Examples of XML parsers are the Lark and Larval XML parsers for Java, Sun's Project X Parser for Java, IBM's XML Parser for Java, Oracle XML Parser for Java, and IBM's XML Parser for C++.

Example of an XML Document

    <?xml version="1.0"?>
    <!DOCTYPE addressbook SYSTEM "Addressbook.dtd">
    <addressbook>
      <contact>
        <name>Tony Benn</name>
        <address>210 Temple road</address>
        <city>London</city>
        <postcode>NW9 0RT</postcode>
        <phone>02082049565</phone>
      </contact>
      <contact>
        <name>Peter Bloggs</name>
        <address>230 The Vale</address>
        <city>London</city>
        <postcode>NW6 2BT</postcode>
        <phone>02082029517</phone>
      </contact>
    </addressbook>

The above example is a well-formed XML document used to store contact information. However, it cannot be called valid yet: that requires checking it against the DTD it references. Note that the root element (<addressbook>) has nested child elements, each defined with an opening and a closing tag.

XML Data Modelling
- Involves describing the structure of XML documents for the purpose of validation.
- After defining a data model, you can create structured XML documents that must adhere to that model to be valid.
- Valid vs well-formed XML: It is perfectly legal to create an XML document without a data model, in which case the document can be well-formed but cannot be valid.
- There are 2 approaches to creating data models: DTDs (Document Type Definitions) and XML Schemas.
- The data model (DTD or XML Schema) defines the arrangement of mark-up and character data within a valid XML document, i.e. the order and nesting of the elements.

Modelling Data with DTDs
- DTDs (Document Type Definitions) rely on a specialized syntax for describing the structure of an XML vocabulary (a class of documents).
- DTDs can be broken down into 2 subsets:
  - Internal or local DTD: Mark-up declarations are contained in the prolog (the section of the document preceding the root element) of the same document.
  - External DTD: External mark-up declarations that can be referenced by one or more documents.
- The 2 subsets may be combined, with the internal subset taking precedence.
- The DTD declares every element, attribute and entity used in the XML document. It must be declared or referenced in the document type declaration.
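As a minimal sketch of an internal DTD, the declarations can sit inside the document type declaration in the prolog. The cut-down contact (name and phone only) is an assumption made to keep the illustration short.

```xml
<?xml version="1.0"?>
<!-- Internal subset: the declarations live between [ and ] in the prolog -->
<!DOCTYPE addressbook [
  <!ELEMENT addressbook (contact)+>
  <!ELEMENT contact (name, phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<addressbook>
  <contact>
    <name>Tony Benn</name>
    <phone>02082049565</phone>
  </contact>
</addressbook>
```

Because the declarations travel with the document, no separate .dtd file is needed to validate it.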

Example: Addressbook.dtd

    <!ELEMENT addressbook (contact)+>
    <!ELEMENT contact (name, address, city, postcode, phone)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT address (#PCDATA)>
    <!ELEMENT city (#PCDATA)>
    <!ELEMENT postcode (#PCDATA)>
    <!ELEMENT phone (#PCDATA)>

Document content that follows this structure:

    <addressbook>
      <contact>
        <name>Tony Benn</name>
        <address>210 Temple road</address>
        <city>London</city>
        <postcode>NW9 0RT</postcode>
        <phone>02082049565</phone>
      </contact>
    </addressbook>
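To illustrate the difference between well-formed and valid, here is a hypothetical document that every XML parser would accept as well-formed, but that a validating parser should reject when checking it against the Addressbook.dtd declarations above.

```xml
<?xml version="1.0"?>
<!DOCTYPE addressbook SYSTEM "Addressbook.dtd">
<addressbook>
  <contact>
    <name>Peter Bloggs</name>
    <!-- city comes before address: wrong order for the contact sequence -->
    <city>London</city>
    <address>230 The Vale</address>
    <postcode>NW6 2BT</postcode>
    <!-- the required phone element is missing -->
  </contact>
</addressbook>
```

All tags are properly nested and closed, so the document is well-formed. It is not valid because the contact content model demands name, address, city, postcode and phone, in that order, each exactly once.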

Document type declaration syntax:

    <!DOCTYPE rootElem SYSTEM ExtDTDRef [InternalDTDDecl]>

where rootElem is the root element, ExtDTDRef is the external DTD reference, and InternalDTDDecl is the internal DTD declaration. Illustration:

    <!DOCTYPE movies SYSTEM "Movies.dtd" [
      <!ELEMENT actor (#PCDATA)>
    ]>
    <movies>
      <title>Lord of the rings</title>
      <!-- the other child elements go here -->
    </movies>

External DTDs are more commonly used, and are especially useful when you are creating multiple documents of the same class, when you would like to use an existing DTD, or when you want to make your document as concise as possible.

Internal DTDs are preferable in situations where you're creating only one document, or where you want to reduce the overhead associated with your documents.

Elements and Attributes
- The primary contents described in a DTD are elements and attributes.
- Think of an element as a logical unit of information, and an attribute as a characteristic of that information.
- By looking at a document as a group of information objects, it is usually possible to associate each object with an element. Any leftover information would usually be represented as attributes.
- Another approach is to consider the type of information and how it will be used.
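The choice between an element and an attribute can be sketched by writing the same fact both ways. The names examples, kind, first and last are hypothetical and chosen only for this illustration.

```xml
<?xml version="1.0"?>
<examples>
  <!-- Attribute style: the contact kind is a characteristic of the contact -->
  <contact kind="business">
    <name>Tony Benn</name>
  </contact>

  <!-- Element style: the same fact as a child element, and name now nests -->
  <contact>
    <kind>business</kind>
    <name>
      <first>Tony</first>
      <last>Benn</last>
    </name>
  </contact>
</examples>
```

The attribute form is shorter, while the element form leaves room for nested structure such as the split name.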

- Attributes provide tighter constraints on information; elements, on the other hand, are very loosely constrained and are better suited to long strings of text.
- Attributes can be constrained against a predefined list of values, and can have default values.
- Attributes are very concise and are easier to parse. However, they cannot contain nested information.

Elements
- Declared with element declarations in the DTD.
- Syntax: <!ELEMENT ElementName Type>
- ElementName corresponds to the tag used to mark up that element in the XML document.
- Type specifies the content. 4 types are supported in XML:

- Empty type: The element doesn't contain any content, but may carry attributes. In the DTD, empty elements are declared in the form <!ELEMENT ElementName EMPTY>, e.g. <!ELEMENT img EMPTY>. Empty elements are written in the XML document in 2 ways:
  - with a start tag immediately followed by an end tag, with nothing in between, e.g. <img src="pic.gif"></img>
  - with an empty-element tag, e.g. <img/> or <img src="pic.gif"/>
- Element-only type: The element contains only child elements. Declared in the form <!ELEMENT ElementName contentModel>. The content model is specified using a combination of special element declaration symbols and child element names. The symbols represent the relationship of each child to the container element.

Table of Special Symbols

    Symbol              Usage
    Parentheses ( )     Enclose a sequence or choice group of child elements
    Comma ,             Separates the items in a sequence and establishes the order in which they must appear
    Pipe |              Separates the items in a choice group of elements
    No symbol           The child element must appear exactly once
    Question mark ?     The child element must appear once or not at all
    Asterisk *          The child element can appear any number of times, including not at all
    Plus sign +         The child element must appear at least once

Example: <!ELEMENT resume (intro, (education | experience+)+, hobbies?, references*)>
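To see how the symbols combine, here is a sketch of a document that follows the resume content model. The #PCDATA declarations for the child elements and all of the text values are assumptions added so that the example is complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE resume [
  <!ELEMENT resume (intro, (education | experience+)+, hobbies?, references*)>
  <!ELEMENT intro (#PCDATA)>
  <!ELEMENT education (#PCDATA)>
  <!ELEMENT experience (#PCDATA)>
  <!ELEMENT hobbies (#PCDATA)>
  <!ELEMENT references (#PCDATA)>
]>
<resume>
  <!-- intro: no symbol, so exactly once -->
  <intro>Seeking a developer role</intro>
  <!-- the choice group repeats: one education, then two experience items -->
  <education>BSc Computing</education>
  <experience>Junior developer</experience>
  <experience>Senior developer</experience>
  <!-- hobbies? is optional and omitted here; references* may repeat -->
  <references>Available on request</references>
</resume>
```

One caveat: because experience+ sits inside a group that itself repeats, some validating parsers may flag this content model as non-deterministic. The model (education | experience)+ accepts the same sequences without that ambiguity.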

Mixed Elements
- Contain both character data and child elements.
- The simplest mixed element is one declared to contain only character data.
- Takes the following form: <!ELEMENT ElementName (#PCDATA)>, e.g. <!ELEMENT city (#PCDATA)>

ANY Elements
- The ANY element, so named because it is declared with the keyword ANY, can contain any declared element, character data, or a mixture of the two.
- Due to its lack of structure, you should avoid using it.
- Typically used during development of a DTD, but should not appear in a production DTD.
- Form: <!ELEMENT ElementName ANY>
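For the mixed elements described above, a declaration that allows character data interleaved with child elements lists #PCDATA first in a choice group and ends with an asterisk. The element names para and emph below are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE para [
  <!-- Mixed content: text and emph children in any order, any number of times -->
  <!ELEMENT para (#PCDATA | emph)*>
  <!ELEMENT emph (#PCDATA)>
]>
<para>XML is <emph>not</emph> a version of HTML.</para>
```

A mixed content model can say which child elements are allowed, but it cannot constrain their order or how many times they appear.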

Attributes
- Used to specify additional information about elements.
- Within an element, attributes form name/value pairs that describe a particular property of the element.
- Declared in a DTD with attribute list declarations, which take the form: <!ATTLIST ElementName AttrName AttrType Default>
- There are 4 kinds of default that can be specified:
  - #REQUIRED: The attribute is required
  - #IMPLIED: The attribute is optional
  - #FIXED value: The attribute has a fixed value
  - default: The default value of the attribute
- #REQUIRED means that you must specify the attribute whenever you use the element.
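The four kinds of default can be sketched in a single attribute list declaration. The team and player elements, the attribute names and all of the values are made up for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE team [
  <!ELEMENT team (player)+>
  <!ELEMENT player (#PCDATA)>
  <!ATTLIST player
    number   CDATA #REQUIRED
    nickname CDATA #IMPLIED
    league   CDATA #FIXED "premier"
    status   CDATA "active">
]>
<team>
  <!-- only the required attribute is given; league and status take their defaults -->
  <player number="7">Tony Benn</player>
  <!-- optional nickname supplied, default status overridden -->
  <player number="9" nickname="PB" status="injured">Peter Bloggs</player>
</team>
```

For the first player, a parser that reads the DTD reports league="premier" and status="active" even though neither is written in the document. Writing any league value other than "premier" would make the document invalid.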

Attribute Type
Must be specified, in addition to the attribute default value. XML supports 10 attribute types:
- CDATA: Unparsed character data
- Enumerated: A series of string values
- NOTATION: A notation declared somewhere else in the DTD
- ENTITY: An external binary (unparsed) entity
- ENTITIES: Multiple external binary entities separated by whitespace
- ID: A unique identifier
- IDREF: A reference to an ID declared somewhere else in the document
- IDREFS: Multiple references to IDs declared somewhere else in the document
- NMTOKEN: A name consisting of XML token characters (letters, digits, periods, dashes, colons and underscores)
- NMTOKENS: Multiple names consisting of XML token characters
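A short sketch of three of the types listed above (ID, IDREF and NMTOKENS) working together. The staff and employee elements and their attribute names are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee)+>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee
    id      ID       #REQUIRED
    manager IDREF    #IMPLIED
    skills  NMTOKENS #IMPLIED>
]>
<staff>
  <!-- id values must be unique within the document and start with a letter or underscore -->
  <employee id="e1" skills="xml dtd">Tony Benn</employee>
  <!-- manager must match the id of some element in the same document -->
  <employee id="e2" manager="e1">Peter Bloggs</employee>
</staff>
```

A validating parser should reject the document if two employees shared the same id, or if manager pointed at an id that does not exist.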

String Attributes
- The most commonly used attribute type.
- Example: <!ATTLIST player team CDATA #REQUIRED>
- In the above example, the team to which a player belongs is a required character data attribute that must be specified in the player element.
- <!ATTLIST player team CDATA #IMPLIED> would have made the team optional.

Another example, this time with an enumerated attribute:

    <!ELEMENT movie (Producer, Director, Actor, Writer+, Duration)>
    <!ATTLIST movie type (comedy | thriller) #REQUIRED>

In this example, the movie element contains the child elements listed, and it also has a mandatory attribute called type, which has 2 possible values.
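A sketch of a document that follows the movie declarations above. The #PCDATA declarations for the child elements and all of the text values are placeholders assumed for this illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE movie [
  <!ELEMENT movie (Producer, Director, Actor, Writer+, Duration)>
  <!ATTLIST movie type (comedy | thriller) #REQUIRED>
  <!ELEMENT Producer (#PCDATA)>
  <!ELEMENT Director (#PCDATA)>
  <!ELEMENT Actor (#PCDATA)>
  <!ELEMENT Writer (#PCDATA)>
  <!ELEMENT Duration (#PCDATA)>
]>
<!-- type is mandatory and must be exactly comedy or thriller -->
<movie type="thriller">
  <Producer>Producer name</Producer>
  <Director>Director name</Director>
  <Actor>Actor name</Actor>
  <!-- Writer+ allows one or more writers -->
  <Writer>First writer</Writer>
  <Writer>Second writer</Writer>
  <Duration>120 minutes</Duration>
</movie>
```

Element and attribute names are case-sensitive, so writing Type="thriller" or <producer> would not match these declarations.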