Document Type Definition DTDs

What is a DTD?
- Defines the structure of an XML document
- Only the elements declared in the DTD can be used in a valid XML document
- Can be internal or external
- A DTD defines the structure of a "valid" XML document
- Processing overhead is incurred when validating XML with a DTD

An internal DTD

    <?xml version="1.0"?>
    <!DOCTYPE invoice [
      <!ELEMENT invoice (sku, qty, desc, price)>
      <!ELEMENT sku (#PCDATA)>
      <!ELEMENT qty (#PCDATA)>
      <!ELEMENT desc (#PCDATA)>
      <!ELEMENT price (#PCDATA)>
    ]>
    <invoice>
      <sku>12345</sku>
      <qty>55</qty>
      <desc>Left handed monkey wrench</desc>
      <price>14.95</price>
    </invoice>

A referenced external DTD

    <?xml version="1.0"?>
    <!DOCTYPE invoice SYSTEM "invoice.dtd">
    <invoice>
      <sku>12345</sku>
      <qty>55</qty>
      <desc>Left handed monkey wrench</desc>
      <price>14.95</price>
    </invoice>

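An external DTD (invoice.dtd)

    <?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT invoice (sku, qty, desc, price)>
    <!ELEMENT sku (#PCDATA)>
    <!ELEMENT qty (#PCDATA)>
    <!ELEMENT desc (#PCDATA)>
    <!ELEMENT price (#PCDATA)>

Note: the first line of an external DTD is a text declaration, which must include an encoding declaration; it can also be omitted entirely.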

Content Model
- An element declaration identifies the name of the element and the nature of that element's content
- The example declares an element named note; the parenthesized list is its content model

    <!ELEMENT note (to, from, subject, body)>

  Name: note
  Content model: (to, from, subject, body)
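
As a hedged illustration, a complete document built around this declaration might look like the sketch below. The child declarations (all #PCDATA) and the sample text are assumptions added for the example; the slide only gives the declaration of note itself.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!-- name: note; content model: the sequence to, from, subject, body -->
  <!ELEMENT note (to, from, subject, body)>
  <!-- assumed for this sketch: every child holds plain text -->
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Student</to>
  <from>Instructor</from>
  <subject>Reminder</subject>
  <body>The exam is a week from today</body>
</note>
```

Swapping two children, or leaving one out, would make the document invalid against this content model even though it would still be well formed.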

Document Type Declarations
There are four types of declarations:
- Element type declarations: http://www.w3.org/TR/REC-xml#elemdecls
- Attribute list declarations: http://www.w3.org/TR/REC-xml#attdecls
- Entity declarations: http://www.w3.org/TR/REC-xml#sec-entity-decl
- Notation declarations: http://www.w3.org/TR/REC-xml#Notations

Element Type Declarations
Three types of elements:
- EMPTY elements
- ANY elements
- MIXED elements

Empty Elements
- An element that cannot contain any content
- The HTML image tag in XML would typically be empty, such as <image></image> or <image/>
- Empty elements are more useful with the use of attributes

    <!ELEMENT test EMPTY>
    <!ELEMENT image EMPTY>
    <!ELEMENT br EMPTY>
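
To illustrate why empty elements become more useful with attributes, here is a minimal sketch. The wrapper element page and the attribute names src and alt are hypothetical, chosen only for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE page [
  <!ELEMENT page (image, br)>
  <!-- image carries no content; all of its information is in attributes -->
  <!ELEMENT image EMPTY>
  <!ATTLIST image
    src CDATA #REQUIRED
    alt CDATA #IMPLIED>
  <!ELEMENT br EMPTY>
]>
<page>
  <image src="logo.png" alt="Logo"/>
  <br/>
</page>
```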

ANY Element
- An element that can contain any content
- It is recommended not to get into the habit of declaring elements with the ANY keyword
- Useful when transferring a lot of mixed or unknown data

    <!ELEMENT test ANY>

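Mixed Element
- Elements that can contain a set of content alternatives (character data mixed with child elements)
- Separate the options with the "or" symbol "|"
- #PCDATA must come first, and the group ends with "*"

    <!ELEMENT test (#PCDATA | name)*>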
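
A short sketch of mixed content in use. In standard DTD syntax the declaration is written with parentheses, #PCDATA first, and a trailing "*". The declaration of name and the sample sentence are assumptions made for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE test [
  <!-- text and name elements may be interleaved in any order, any number of times -->
  <!ELEMENT test (#PCDATA | name)*>
  <!ELEMENT name (#PCDATA)>
]>
<test>Signed by <name>Jim Peters</name> on receipt.</test>
```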

Data Types
- Parsed character data: #PCDATA

    <!ELEMENT firstname (#PCDATA)>
    <!ELEMENT lastname (#PCDATA)>

- Unparsed character data: CDATA

    <firstname><![CDATA[<b>Jim</b>]]></firstname>
    <lastname><![CDATA[<b>Peters</b>]]></lastname>

Structure Symbols
- Parentheses: (samp1, samp2) - the element must contain the sequence samp1 and samp2
- Comma: (samp1, samp2, samp3) - the element must contain samp1, samp2 and samp3 in that order
- Or: (samp1 | samp2 | samp3) - the element can contain samp1, samp2 or samp3
- ?: samp1? - the element might contain samp1; if it does, it can only do so once
- *: samp1* - the element can contain samp1 zero or more times
- +: samp1+ - the element must contain samp1 at least once
- none: samp1 - the element must contain samp1 exactly once

Elements with more structure

    <!ELEMENT email (to+, from, subject?, body)>

- to: is required and can appear more than once
- from: must appear exactly once
- subject: optional, but if included can only appear once
- body: required, and must appear exactly once
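
As a sketch of how the occurrence symbols play out, the document below has two to elements and no subject, and should still be valid against the content model (to+, from, subject?, body). The child declarations and the sample values are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE email [
  <!ELEMENT email (to+, from, subject?, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<email>
  <!-- to+ : one or more recipients -->
  <to>first.student</to>
  <to>second.student</to>
  <from>instructor</from>
  <!-- subject? : omitted here, which is allowed -->
  <body>The exam is a week from today</body>
</email>
```

Removing body, or repeating from, would make the document invalid.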

XML Element Attributes
- XML tags can contain attributes, similar to attributes in HTML tags
- Attributes are usually used to provide processing information to the XML application (the application that is going to consume the XML)
- HTML examples:

    <h1 align="center">An XML Example</h1>
    <table width=page> </table>

Attribute Rules
- Attribute values must be placed in quotes (" "); in HTML this is only required if the attribute value contains a space character
- Attribute values are not parsed as markup by the XML parser; this means the internal structure of a value can't be automatically checked by the parser

Attributes or Elements?
- Is it better to use attributes or to just make additional XML elements?
- There are no set rules for when to use one over the other; experience is the best teacher
- To help you decide:
  - attribute values are not parsed as markup
  - they can contain special characters that aren't allowed in elements
  - drawback: the internal structure of a value cannot be validated by the parser and must be validated by additional code in the application

An Example

This can be validated (the date is broken into child elements):

    <?xml version="1.0"?>
    <invoice>
      <date>
        <month>12</month>
        <day>22</day>
        <year>2002</year>
      </date>
      <sku>12345</sku>
      <qty>55</qty>
      <desc>Left handed monkey wrench</desc>
      <price>14.95</price>
    </invoice>

This can't (the date is a single attribute string):

    <?xml version="1.0"?>
    <invoice date="7/22/2002">
      <sku>12345</sku>
      <qty>55</qty>
      <desc>Left handed monkey wrench</desc>
      <price>14.95</price>
    </invoice>
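
A hedged sketch of what "can be validated" means for the element-based version: a DTD can require that month, day and year are all present, once each and in order. The DTD below is an assumption written for this example, not part of the original slides. A DTD has no numeric types, so it still cannot check that the month is between 1 and 12.

```xml
<?xml version="1.0"?>
<!DOCTYPE invoice [
  <!ELEMENT invoice (date, sku, qty, desc, price)>
  <!-- the date must contain month, day and year, once each, in this order -->
  <!ELEMENT date (month, day, year)>
  <!ELEMENT month (#PCDATA)>
  <!ELEMENT day (#PCDATA)>
  <!ELEMENT year (#PCDATA)>
  <!ELEMENT sku (#PCDATA)>
  <!ELEMENT qty (#PCDATA)>
  <!ELEMENT desc (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
]>
<invoice>
  <date>
    <month>12</month>
    <day>22</day>
    <year>2002</year>
  </date>
  <sku>12345</sku>
  <qty>55</qty>
  <desc>Left handed monkey wrench</desc>
  <price>14.95</price>
</invoice>
```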

Attribute Declarations

Element declaration:

    <?xml version="1.0" ?>
    <!ELEMENT employee (#PCDATA)>

General form of an attribute list declaration, followed by a concrete one:

    <!ATTLIST ElementName AttributeName Type Default>
    <!ATTLIST employee type (FullTime | PartTime) "FullTime">

Usage in XML file:

    <?xml version="1.0" ?>
    <employee type="PartTime"/>

Other Attribute Declarations
- CDATA: CDATA attributes are strings; any text is allowed.
- ID: The value of an ID attribute must be a name. All of the ID values used in a document must be unique. IDs uniquely identify individual elements in a document. An element can only have a single ID attribute.
- IDREF or IDREFS: An IDREF attribute's value must be the value of a single ID attribute on some element in the document. The value of an IDREFS attribute may contain multiple IDREF values separated by white space.
- ENTITY or ENTITIES: An ENTITY attribute's value must be the name of a single entity. The value of an ENTITIES attribute may contain multiple entity names separated by white space.
- NMTOKEN or NMTOKENS: Name token attributes are a restricted form of string attribute. An NMTOKEN value must be a single word, but there are no other restrictions on the word. The value of an NMTOKENS attribute may contain multiple NMTOKEN values separated by white space.
- List of names (enumerated): You can specify that the value of an attribute must be taken from a specific list of names. This is frequently called an enumerated type because each of the possible values must be explicitly enumerated in the declaration.
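
The sketch below puts several of these attribute types side by side. The element names staff and employee, the attribute names id and manager, and the sample values are all hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee+)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee
    id      ID                    #REQUIRED
    manager IDREF                 #IMPLIED
    type    (FullTime | PartTime) "FullTime">
]>
<staff>
  <!-- id values must be unique across the whole document -->
  <employee id="e1">Ann</employee>
  <!-- manager must match the id of some element in the document -->
  <employee id="e2" manager="e1" type="PartTime">Bob</employee>
</staff>
```

A validating parser should reject this document if two employees shared an id, or if manager pointed at an id that does not exist.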

Attribute Defaults
- #REQUIRED: The attribute must have an explicitly specified value for every occurrence of the element in the document.
- #IMPLIED: The attribute value is not required and no default value is provided. If a value is not specified, the XML processor must proceed without one.
- "value": An attribute can be given any legal value as a default. The attribute value is not required on each element of the document, and if it is not present it will appear to be the specified default.
- #FIXED "value": An attribute declaration may specify that an attribute has a fixed value. In this case, the attribute is not required, but if it occurs, it must have the specified value. If it is not present, it will appear to be the specified default.
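
A minimal sketch showing the four kinds of default in one declaration. The attribute names id, replyto and version are assumptions made up for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE email [
  <!ELEMENT email (#PCDATA)>
  <!ATTLIST email
    id       CDATA                 #REQUIRED
    replyto  CDATA                 #IMPLIED
    priority (normal | high | low) "normal"
    version  CDATA                 #FIXED "1.0">
]>
<!-- Only id is written out. A parser that reads the DTD also reports
     priority="normal" and version="1.0"; replyto is simply absent. -->
<email id="m1">The exam is a week from today</email>
```

Writing version="2.0" on the element would be a validity error, because the declared value is fixed.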

A Code sample

    <?xml version="1.0" ?>
    <!DOCTYPE email [
      <!ELEMENT email (to, from, subject, message)>
      <!ATTLIST email
        language (english | french | spanish) "english"
        priority (normal | high | low) "normal">
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT subject (#PCDATA)>
      <!ELEMENT message (#PCDATA)>
    ]>
    <email language="spanish" priority="high">
      <to>Peter Brenner</to>
      <from>Dick Steflik</from>
      <subject>Test Reminder</subject>
      <message>The exam is a week from today</message>
    </email>

Attribute Summary
- Attributes:
  - cannot contain multiple values
  - cannot be validated beyond their declared type
  - cannot describe structures like child elements can
- It is recommended to use attributes sparingly
- The following code would not be good form:

    <?xml version="1.0" ?>
    <email language="english" priority="high" to="you" from="me"
           subject="Reminder" message="The test is a week from today !" />