XML Syntax - Writing XML and Designing DTD's

HTML – 1st Example
<html><head><title>Chocolate Cake</title></head><body> <b>Ingredient List</b><hr /> <br>2 cups flour <br>1 cup sugar <br>2 bars chocolate <br>1 cup milk <br><br><b>Instructions</b> <hr><br>Mix flour, sugar and milk <br>Eat chocolate <br>Bake at 400 degrees </body></html>
The ingredient and instruction sections are marked only by the <b> (bold) tag, and each ingredient or instruction is distinguishable only by the <br> (line break) tag.

XML Document Structure
Text file containing Elements, Attributes & Text
<?xml version="1.0" ?>
<Recipe name="Chocolate Cake" type="Dessert">
  <IngredientList>
    <Ingredient>2 cups flour</Ingredient>
    <Ingredient>1 cup sugar</Ingredient>
  </IngredientList>
  <Instruction>Sift the flour</Instruction>
</Recipe>
XML is just the data, with no presentation instructions. Later we'll see how the presentation of the data is taken care of. Show the elements, the attributes and the text. Point out the hierarchical / tree structure of the document.
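To make the tree structure concrete, here is a minimal sketch that parses the recipe with Python's standard xml.etree.ElementTree module and prints one indented line per element. The helper name outline is made up for this illustration; it is not part of the slides.

```python
import xml.etree.ElementTree as ET

RECIPE = """<?xml version="1.0" ?>
<Recipe name="Chocolate Cake" type="Dessert">
  <IngredientList>
    <Ingredient>2 cups flour</Ingredient>
    <Ingredient>1 cup sugar</Ingredient>
  </IngredientList>
  <Instruction>Sift the flour</Instruction>
</Recipe>"""


def outline(element, depth=0):
    """Return one indented line per element: tag, attributes, and text."""
    text = (element.text or "").strip()
    lines = ["  " * depth + f"{element.tag} {element.attrib} {text!r}"]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(RECIPE)
print("\n".join(outline(root)))
```

Elements become nodes, attributes become a dictionary on the node, and text hangs off the element that encloses it.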

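10 Rules – Well Formed XML
1. Start with the XML declaration: <?xml version="1.0" ?>
(Strictly, the declaration is optional in XML 1.0, but it is recommended, and if present it must be the very first thing in the document.)
In order for XML to be used, it has to be "WELL FORMED". This means it is in a structure that any XML parser can read.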

2. Must be only one document element
Valid example(s): <?xml version="1.0" ?> <recipe> </recipe> or <recipeBook> <recipe></recipe> </recipeBook>
Invalid example: <?xml version="1.0"?> <recipe> </recipe> <recipe> </recipe>
This also highlights the hierarchical / tree structure of all XML documents.

3. Match opening & closing tags
Carry-over from HTML origins: <hr> <p> or <bold><italic></bold></italic>. Browsers forgive; XML parsers do NOT.
Correct: <p></p> or <br /> <bold><italic></italic></bold> <recipe></recipe>
HTML doesn't care whether you close your tags or in what order you close them. XML is much fussier: you must close ALL tags, and you must close them in hierarchical order (innermost first).
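Rules 2 and 3 can be tried out with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper is_well_formed is a name invented for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


examples = {
    "one root": "<recipeBook><recipe></recipe></recipeBook>",
    "two roots": "<recipe></recipe><recipe></recipe>",
    "crossed tags": "<p><bold><italic></bold></italic></p>",
    "unclosed br": "<p><br></p>",
    "self-closed br": "<p><br /></p>",
}

for label, text in examples.items():
    print(f"{label:15} well formed: {is_well_formed(text)}")
```

The parser stops at the first violation, so a document that is not well formed yields no tree at all.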

4. Comments allowed, but not inside attribute or element tag <!-- Isn’t XML really cool? --> <!-- Just like being a student!!! --> Comments can also extend across multiple lines.

5. Element and attribute names must start with a letter (or an underscore)
<Recipe> OK
<Second third="false"> OK
<2nd> INVALID
<Recipe 2nd="true"> INVALID

6. Attributes must go in the opening tag
Valid: <recipe name="Chocolate Cake" category="Dessert"></recipe>
Invalid: <recipe></recipe name="Chocolate Cake">

7. Attribute values must be enclosed in matching quotes
Can use either single or double quotes, but must use the same type to start and end the attribute value:
Name="Australian Computer Society"
Name='Australian Computer Society'
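As a quick check of rules 5 to 7, the snippet below feeds a few of the tags above to Python's standard xml.etree.ElementTree parser and records which ones it accepts. The snippets are self-closed so that each is a complete document; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

snippets = [
    '<Recipe/>',
    '<Second third="false"/>',
    '<2nd/>',                                    # name starts with a digit
    '<Recipe 2nd="true"/>',                      # attribute name starts with a digit
    '<recipe></recipe name="Chocolate Cake">',   # attribute in the closing tag
    "<recipe name='Chocolate Cake'/>",           # single quotes are fine
    '<recipe name="Chocolate Cake\'/>',          # quotes do not match
    '<recipe name=Cake/>',                       # no quotes at all
]

accepted = {}
for snippet in snippets:
    try:
        ET.fromstring(snippet)
        accepted[snippet] = True
    except ET.ParseError:
        accepted[snippet] = False

for snippet, ok in accepted.items():
    print("OK     " if ok else "INVALID", snippet)
```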

Let's finish these rules!
8. Only simple text for attributes, no nested values. Nesting is allowed in elements, not in attributes.
9. Use &lt; &amp; &gt; &quot; and &apos; for the special characters < & > " '
10. Write empty elements using the <recipe /> syntax if there are no nested values; the tag can still have attributes: <recipe type="dessert" />
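Rules 9 and 10 in practice: a small sketch using the escape and quoteattr helpers from Python's standard xml.sax.saxutils module, then parsing the result back to confirm the original text survives. The sample strings are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Rule 9: special characters in element text are written as entity references.
text = 'Mix <flour> & "sugar"'
escaped = escape(text)  # replaces &, < and > only
instruction = ET.fromstring(f"<Instruction>{escaped}</Instruction>")

# quoteattr picks a quote style and escapes whatever would clash with it.
name = 'Tom\'s "best" cake'
quoted = quoteattr(name)

# Rule 10: an empty element may still carry attributes.
recipe = ET.fromstring(f"<recipe name={quoted} />")

print(escaped)
print(quoted)
print(recipe.get("name"))
```

After parsing, the application sees the plain characters again; the entity references exist only in the serialized text.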

With these 10 rules, we have a "Well Formed" XML document
It means the XML can be read, processed or parsed. It doesn't mean the structure makes sense:
<recipe model="Holden"> <chapter></chapter> <engine cylinders="4"></engine> </recipe>
All XML must be well formed. If it is not well formed, it can't be treated as XML and can't be processed or parsed. The 2nd part of XML basics is validating the XML document. This is optional and allows us to ensure the document follows a specific structure.

Examples: Buggy dictionary, Non-buggy dictionary, FIDA

DTD – Document Type Definition
Allows us to define the exact elements and attributes for the document. These effectively become the rules of our own markup language: the "extensible" part of XML.
A DTD really only defines the structure. It is limited in what it can validate about the text values of an element or attribute.

Recipe DTD
<!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Ingredient (Qty, Item)>
<!ELEMENT Qty (#PCDATA)>
<!ATTLIST Qty unit CDATA #REQUIRED>
<!ELEMENT Item (#PCDATA)>
<!ATTLIST Item optional CDATA "0"
               isVegetarian CDATA "true">

Elements
Basic rules:
Start tag <tag_name> and end tag </tag_name>
Tags must be nested: <tag1><tag2>…</tag2></tag1>
Tags may be empty (no enclosed data): <empty_tag/>
Whitespace in element content is usually ignored, so these are equivalent: <section><p> … </p></section> and <section> <p> … </p> </section>

Element Declarations
Used to define new elements and their content: <!ELEMENT name (#PCDATA)> allows <name> … </name>
Empty element has no content: <!ELEMENT name EMPTY> allows <name/>
When children are allowed, use ANY or a model group: <!ELEMENT name ANY> or <!ELEMENT person (name, e-mail*)>

Model Groups
Used to define content of elements: <!ELEMENT person (name, e-mail*)>
Used to define hierarchies of elements:
<!ELEMENT name (fname, surname)>
<!ELEMENT fname (#PCDATA)>
<!ELEMENT surname (#PCDATA)>
<!ELEMENT e-mail (#PCDATA)>
Control organisation of elements:
Sequence connector - ',' - (A, B, C) [then]
Choice connector - '|' - (A | B | C) [or]

Model Group Quantity Indicators
Describe constraints on elements in DTD:
A?      May occur [0..1]
A+      Must occur [1..*]
A*      May occur [0..*]
A | B   Either A or B
A, B    A followed by B
Can be combined: (A, B)+ or ((A,B?) | C+)*
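The connectors and quantity indicators behave much like a regular expression over the sequence of child element names. The sketch below makes that analogy literal by translating a model group into a Python regex. It is an illustration only, not how a validating parser is implemented; the function names are invented here, and #PCDATA, EMPTY and ANY are not handled.

```python
import re

# An element name as it appears in a model group, e.g. "name" or "e-mail".
NAME = re.compile(r"[A-Za-z_][\w.-]*")


def model_to_regex(model):
    """Translate a DTD model group such as "(name, e-mail*)" into a regex.

    Each name becomes a group matching "name;". The ',' connector becomes
    plain concatenation, while '|', '?', '+', '*' and parentheses already
    mean the same thing in regex syntax.
    """
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group()) + ";)", model)
    pattern = pattern.replace(",", "").replace(" ", "")
    return re.compile(pattern)


def children_match(model, children):
    """Check a list of child element names against a model group."""
    encoded = "".join(child + ";" for child in children)
    return model_to_regex(model).fullmatch(encoded) is not None


print(children_match("(name, e-mail*)", ["name", "e-mail", "e-mail"]))
print(children_match("(name, e-mail*)", ["e-mail", "name"]))
print(children_match("((A,B?) | C+)*", ["A", "C", "C", "A", "B"]))
```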

Attributes
Provide additional information about an element
Enclosed by quotes - either " or '
Case-sensitive
May be character data or tokenized:
value="Blue Peter" (character data)
value="blue" (single token)
value="red green blue" (tokens)
Values may be enumerated or defaulted (DTD)

Attribute Declarations
Attributes can be attached to elements
Declared separately in an ATTLIST declaration: <!ATTLIST tag … >
Rest of definition specifies: attribute name, attribute type, default value

Attribute Names and Types
<!ATTLIST tag name type default>
<!ATTLIST tag first_attr … second_attr … third_attr … >
Attribute types: CDATA, NMTOKEN, NMTOKENS, ENTITY, ENTITIES, ID, IDREF, IDREFS, NOTATION, name group

Attribute Types
CDATA       Character data
NMTOKEN     Single token
NMTOKENS    Multiple tokens
ENTITY      Attribute is entity ref
ENTITIES    Multiple entity refs
ID          Unique ID
IDREF       Match to ID
IDREFS      Match to multiple IDs
NOTATION    Describe non-XML data
Name group  Restricted list

Attribute Types
CDATA       <!ATTLIST person name CDATA … >
NMTOKEN     <!ATTLIST mug color NMTOKEN … >
NMTOKENS    <!ATTLIST temp values NMTOKENS … >
ENTITY      <!ATTLIST person photo ENTITY … >
ENTITIES    <!ATTLIST album photos ENTITIES … >
ID          <!ATTLIST person id ID … >
IDREF       <!ATTLIST person father IDREF … >
IDREFS      <!ATTLIST person children IDREFS … >
NOTATION    <!ATTLIST image format NOTATION (TeX|TIFF) … >
Name group  <!ATTLIST point coord (X|Y|Z) … >

Attribute Types
CDATA       name = "Tom Jones"
NMTOKEN     color="red"
NMTOKENS    values="12 15 34"
ENTITY      photo="MyPic"
ENTITIES    photos="pic1 pic2"
ID          ID = "P09567"
IDREF       IDREF="P09567"
IDREFS      IDREFS="A01 A02"
NOTATION    FORMAT="TeX"
Name group  coord="X"
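ID, IDREF and IDREFS are the types that constrain values across the whole document: every ID must be unique, and every reference must match some ID. Python's standard xml.etree.ElementTree does not validate against a DTD, so the sketch below re-implements just those two checks by hand. The function check_ids, the attribute names id, father and children (borrowed from the person declarations above) and the sample document are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

FAMILY = """<family>
  <person id="P1" children="P2 P3"/>
  <person id="P2" father="P1"/>
  <person id="P3" father="P9"/>
  <person id="P2"/>
</family>"""


def check_ids(root, id_attr="id", ref_attrs=("father",), refs_attrs=("children",)):
    """Report duplicate IDs and references that match no ID."""
    problems = []
    seen = set()
    for element in root.iter():
        value = element.get(id_attr)
        if value is not None:
            if value in seen:
                problems.append(f"duplicate ID: {value}")
            seen.add(value)
    for element in root.iter():
        references = [element.get(a) for a in ref_attrs if element.get(a)]
        for attr in refs_attrs:
            # IDREFS holds several whitespace-separated references.
            references.extend(element.get(attr, "").split())
        for reference in references:
            if reference not in seen:
                problems.append(f"unmatched reference: {reference}")
    return problems


problems = check_ids(ET.fromstring(FAMILY))
print(problems)
```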

Default Attribute Values
Can specify a default attribute value for when it's missing from the XML document, or state that a value must be entered:
#REQUIRED   Must be specified
#IMPLIED    May be specified
"default"   Default value if unspecified
#FIXED      Only one value allowed
<!ATTLIST tag name type default>
<!ATTLIST seqlist sepchar NMTOKEN #REQUIRED
                  type (alpha|num) "num">
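Defaults declared in the internal DTD subset are applied even by a non-validating parser. The sketch below uses Python's standard xml.etree.ElementTree, which relies on the expat parser: the missing type attribute is filled in with "num", while the #REQUIRED rule is not enforced because no validation takes place. The sample documents are made up around the seqlist declaration above.

```python
import xml.etree.ElementTree as ET

DTD = """<!DOCTYPE seqlist [
  <!ELEMENT seqlist EMPTY>
  <!ATTLIST seqlist
      sepchar NMTOKEN     #REQUIRED
      type    (alpha|num) "num">
]>
"""

# type is omitted, so the declared default is supplied by the parser.
with_default = ET.fromstring(DTD + '<seqlist sepchar="x"/>')
print(with_default.attrib)

# An explicit value overrides the default.
explicit = ET.fromstring(DTD + '<seqlist sepchar="x" type="alpha"/>')
print(explicit.attrib)

# sepchar is #REQUIRED, but a non-validating parser does not complain.
not_validated = ET.fromstring(DTD + "<seqlist/>")
print(not_validated.attrib)
```

Catching the missing #REQUIRED attribute needs a validating parser, which the Python standard library does not provide.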

Declarations
Instructions for the XML processor
Format - <! … > or <! … [<! … >]>
Document type - <!DOCTYPE … >
Character data - <![CDATA[ … ]]>
Entities - <!ENTITY … >
Notation - <!NOTATION … >
Element - <!ELEMENT … >
Attributes - <!ATTLIST … >
Conditional sections - <![INCLUDE[…]]> and <![IGNORE[…]]>

Document Type Declaration
Identifies the name of the document root element: <!DOCTYPE My_XML_Doc>
May also add entity definitions and DTD: <!DOCTYPE My_XML_Doc [ … ] > <My_XML_Doc> ... </My_XML_Doc>

Comment Declaration
Comments are not considered part of the XML document and should not be published: <!-- A comment -->
Cannot have additional '--' in comment
Cannot embed inside other declarations

Character Data Declaration
For occasions when text must contain uninterpreted markup characters
To include the text: Press <<<ENTER>>>
Write: <![CDATA[Press <<<ENTER>>>]]>
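A short sketch of what the CDATA section buys you, using Python's standard xml.etree.ElementTree. The para element is just a wrapper chosen for the example. The same text can be written with entity references instead, and the parser hands back identical character data either way.

```python
import xml.etree.ElementTree as ET

with_cdata = ET.fromstring("<para><![CDATA[Press <<<ENTER>>>]]></para>")
with_escapes = ET.fromstring("<para>Press &lt;&lt;&lt;ENTER&gt;&gt;&gt;</para>")

print(with_cdata.text)
print(with_escapes.text)

# Without CDATA or escaping, the raw '<' characters are read as markup.
try:
    ET.fromstring("<para>Press <<<ENTER>>></para>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False
print("raw text accepted:", raw_accepted)
```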

Processing Instructions
Information required by an external application
Format - <? … ?>
XML declaration - <?xml version='1.0' ?>
Confusingly, the XML declaration looks like a processing instruction, but the XML specification treats it as a separate construct rather than a PI.

Entities
An XML document may be distributed among a number of files
Each unit of information is called an entity
Each entity has a name to identify it
Defined using an entity declaration
Used by means of an entity reference

When to use Entities
Use an entity when the information:
Is used in several places
May be represented differently
Is part of a larger document that needs to be split up to be manageable
Conforms to a data format other than XML

Types of Entity
General Entity: referred to in XML document
Parameter Entity: referred to in markup declarations in DTD
Internal Entity: stored in main document; text content only
External Entity: stored externally to the main document; text or binary; can be used to group many internal entities together

General Entities
Declared in 'Document Type Declaration': <!DOCTYPE My_XML_Doc [ <!ENTITY name "replacement"> ]>
Example: <!ENTITY xml "eXtensible Markup Language">
Source: The &xml; includes entities
Result: The eXtensible Markup Language includes entities
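The expansion can be observed directly. In this sketch Python's standard xml.etree.ElementTree (backed by expat) replaces the internal general entity before the application sees the text, and rejects a reference to an entity that was never declared. The entity name nope is invented for the failing case.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE My_XML_Doc [
  <!ENTITY xml "eXtensible Markup Language">
]>
<My_XML_Doc>The &xml; includes entities</My_XML_Doc>"""

root = ET.fromstring(DOC)
print(root.text)

# Referencing an undeclared entity is an error.
try:
    ET.fromstring("<My_XML_Doc>The &nope; is undeclared</My_XML_Doc>")
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False
print("undeclared entity accepted:", undeclared_accepted)
```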

Parameter Entities
Declared in 'Document Type Declaration': <!DOCTYPE My_XML_Doc [ <!ENTITY % name "replacement"> ]>
Example: <!ENTITY % param "(para | list)"> <!ELEMENT section (%param;)*>

External Entities
External Text Entities
Location specified with SYSTEM keyword: <!ENTITY ent SYSTEM "/ENTS/MYENT.XML">
May specify with public identifier: <!ENTITY ent PUBLIC "-//EBI//ENTITIES ents//EN" … >
External Binary Entities
Need to identify format of data - NDATA
<!ELEMENT pic EMPTY> <!ATTLIST pic name ENTITY #REQUIRED> <!ENTITY photo SYSTEM "/ENTS/photo.tif" NDATA TIFF>
Referenced by empty element: A photograph <pic name="photo"/>.

Restrictions on Entities
General text entities:
Can appear in element content: <para> … &ent; … </para>
Can appear in attribute value: <para name="&ent;"> … </para>
Can appear in internal entity content: <!ENTITY cod "&ent;">
Cannot appear in other parts of DTD

Restrictions on Entities (2)
Binary entities:
If entity content is not XML, the entity cannot be used as a textual reference
Error - <!ELEMENT sec (para|&photo;)>
Error - <para> &photo; </para>
Binary entity can only appear as an attribute of type ENTITY:
<!ENTITY photo SYSTEM "photo.tif" NDATA TIFF> … <!ELEMENT pic (#PCDATA)> <!ATTLIST pic name ENTITY #REQUIRED>

Parameter Entities
Use parameter entities within DTD:
<!ENTITY % common "(para|list|table)">
<!ELEMENT chapter ((%common;)*, section*)>
<!ELEMENT section (%common;)*>
Safest to include parentheses in entity definition and around entity reference
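Why the parentheses matter can be seen by treating parameter entity expansion as text substitution. The sketch below is plain string replacement in Python, not a DTD parser, and the function name expand is made up for the example. Real parsers apply further rules, such as padding the replacement text with spaces, so take it as an approximation of the idea only.

```python
import re


def expand(declaration, entities):
    """Replace each %name; reference with the entity's replacement text."""
    return re.sub(r"%([\w.-]+);", lambda m: entities[m.group(1)], declaration)


# Parentheses in the definition: the choice stays a self-contained group.
safe = expand(
    "<!ELEMENT chapter ((%common;)*, section*)>",
    {"common": "(para|list|table)"},
)
print(safe)

# No parentheses anywhere: '|' and ',' end up in the same group,
# which is not a legal content model.
unsafe = expand(
    "<!ELEMENT chapter (%common;, section*)>",
    {"common": "para|list|table"},
)
print(unsafe)
```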

Putting it all together...
We have now been introduced to the main components and rules of XML and DTDs: entities, elements, declarations, processing instructions and attribute lists.
Use all these components in the Document Type Definition (DTD) to specify the rules about the format of the XML document.