Writing Your Last DTD?
Alex Brown, Griffin Brown Digital Publishing Ltd.

Background
By DTD I mean simply the formal declarations as allowed by XML 1.x
A ‘last DTD’ doesn’t mean a last validation mechanism: the future is not well-formed
This presentation is in two parts:
– Modelling
– DTD-specific features

DTDs on the Wane?
Some say DTDs are on the way out, and have been saying so for a while
There is some evidence of a shift, mostly driven by new tools and new XML implementers
The rise of the pipelining model of validation (DSDL) is likely, so DTDs need to cooperate with other technologies
DTDs are not very complete instruments of validation

Part I - Modelling

Human-facing XML Models
XML can be seen as ‘just’ a serialisation format, in which case the models need ‘just’ to work
This presentation is also concerned with models that people experience (at some level)
People often look at raw markup, and experience content models through tools (e.g. syntax-directed editors)

Machine-facing XML Models
Desirable features:
– Normalised
– Machine efficient
– Programmer efficient
Techniques fairly easily borrowed from other disciplines (database schema design, type system design, etc.)

Machines vs People
Also known as data vs documents?
In reality few resources are at the extremes of this spectrum
Many resources mix data-like and document-like features
The challenge is in finding a balance and tolerating the mess

Data Normalisation
i.e., single items of data appear once
A really good idea for some data
E.g. link targets, database dumps
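One way to picture normalised link targets in a DTD is the ID/IDREF mechanism. The sketch below is an illustration only: the `library`, `author` and `book` names and all of the content are invented, not taken from the presentation. Each author is stated once and every book points at it.

```xml
<!-- Hypothetical vocabulary: each author appears once and is referenced by ID. -->
<!DOCTYPE library [
  <!ELEMENT library (author+, book+)>
  <!ELEMENT author  (#PCDATA)>
  <!ATTLIST author  id ID    #REQUIRED>
  <!ELEMENT book    (#PCDATA)>
  <!ATTLIST book    by IDREF #REQUIRED>
]>
<library>
  <author id="a1">Jane Example</author>
  <book by="a1">First Title</book>
  <book by="a1">Second Title</book>
</library>
```

A validating parser checks that every `by` value matches some declared `id`, which is about as far as a DTD goes in enforcing this kind of normalisation.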

Mixed Content
Normalisation is not a natural feature of human languages
The cat sat on the mat
not
The cat on the mat

When natural language is suitable
Don’t be afraid to model mixed content (‘diamonds in the mud’ approach)
– e.g. bibliographic references
Sometimes the precision of human language cannot be modelled precisely
– e.g. addresses
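A rough sketch of the ‘diamonds in the mud’ idea for a bibliographic reference follows. The element names and the reference itself are made up for illustration. Only the pieces worth finding are tagged, and the punctuation and connecting words stay as plain text.

```xml
<!-- Hypothetical mixed-content model: tagged 'diamonds' inside free text. -->
<!DOCTYPE bibref [
  <!ELEMENT bibref (#PCDATA | author | title | year)*>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT title  (#PCDATA)>
  <!ELEMENT year   (#PCDATA)>
]>
<bibref><author>Example, J.</author> (<year>2003</year>) <title>An Invented Title</title>. Somewhere: A Publisher.</bibref>
```

In XML 1.0 a mixed-content declaration must take this form, `#PCDATA` first and a starred choice, so the DTD cannot constrain the order or number of the tagged items.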

Type Hierarchies (1)
[Diagram: a card type with Name, Number and Expiry, and a second card type with Name, Number, Expiry and Issue Number, marked with a question mark]

Type Hierarchies (2)
[Diagram: Credit-card type hierarchy with visa-card and switch-card types, each carrying Name, Number and Expiry, and Issue Number as an additional field]

Optional Elements?
Optional often doesn’t mean ‘optional’: in practice it is used to mean ‘must exist’ or ‘must not exist’
Consider making the choice explicit: e.g., ( issue-number|no-issue-number )
Type-safe models are good for machine-facing data, but require maintenance
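As a sketch of the explicit-choice idea, the declarations below contrast an optional `issue-number` with the `( issue-number|no-issue-number )` form. The `credit-card`, `name`, `number` and `expiry` names are borrowed loosely from the type-hierarchy diagrams, and the exact content models are assumptions.

```xml
<!-- Optional form: whether issue-number should be present is an unstated rule. -->
<!-- <!ELEMENT credit-card (name, number, expiry, issue-number?)> -->

<!-- Explicit form: the author has to say which case applies. -->
<!ELEMENT credit-card     (name, number, expiry, (issue-number | no-issue-number))>
<!ELEMENT name            (#PCDATA)>
<!ELEMENT number          (#PCDATA)>
<!ELEMENT expiry          (#PCDATA)>
<!ELEMENT issue-number    (#PCDATA)>
<!ELEMENT no-issue-number EMPTY>
```

With the explicit form, a missing issue number is a deliberate statement in the data, and it can no longer be confused with an omission.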

Mega Markup
‘Just Tag It’?
Models should have a justification (often a business justification)
Rich inline tagging in particular needs to be thought through (KM technologies are often better for enriching documents)

Part II - Practicalities

Documentation
DTDs are comparatively easy to document: content models are terse but expressive (people like them)
A DTD is not a .DTD, and documentation is costly!
Don’t make the limits of the DTD the limits of your specification; DTDs ‘rough out’ content
We need a graphical standard for representing models (not UML please)
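One low-cost way to keep specification and DTD together is to put the rules the DTD cannot express into comments next to the declarations. The fragment below is a sketch only: the `expiry` element and its MM/YY rule are invented for illustration.

```xml
<!-- expiry: the card expiry date.
     Specification rule (not enforceable in the DTD): the content is
     two digits for the month, a slash, then two digits for the year,
     e.g. 09/05. The DTD can only say that the element holds text. -->
<!ELEMENT expiry (#PCDATA)>
```

The declaration roughs out the structure, and the comment carries the part of the specification that lies beyond the DTD’s limits.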

Deployment
Deploy a normalised version of your DTD via a web server
Require that this authoritative version is used during data handovers
Consider requiring the use of PUBLIC identifiers
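A document type declaration that follows this advice might look like the sketch below. Both the public identifier and the URL are placeholders invented for illustration, not identifiers from the presentation.

```xml
<!-- Hypothetical identifiers: the public identifier names the DTD,
     the system identifier points at the authoritative copy on a web server. -->
<!DOCTYPE article PUBLIC "-//Example Corp//DTD Article 1.0//EN"
                         "http://dtd.example.com/article/1.0/article.dtd">
```

The public identifier gives the DTD a name that is independent of its location, so a resolver (for example an XML catalog) can map it to a local copy without editing the documents.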

Parameterisation
Parameter entities: macro-like features for use in DTDs
More useful during development than in the mature phases of a DTD’s lifetime
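A small sketch of the macro-like use of parameter entities follows. The `inline`, `para`, `title`, `emphasis`, `link` and `code` names are assumptions chosen for illustration. The fragment is meant for an external DTD file, because XML 1.0 does not allow parameter-entity references inside declarations in the internal subset.

```xml
<!-- Hypothetical external DTD fragment: one macro shared by two content models. -->
<!ENTITY % inline "emphasis | link | code">

<!ELEMENT para     (#PCDATA | %inline;)*>
<!ELEMENT title    (#PCDATA | %inline;)*>
<!ELEMENT emphasis (#PCDATA)>
<!ELEMENT link     (#PCDATA)>
<!ELEMENT code     (#PCDATA)>

<!-- A normalised (flattened) form of the first declaration would read:
     <!ELEMENT para (#PCDATA | emphasis | link | code)*>  -->
```

Changing the `inline` entity updates every model that uses it, which helps while a DTD is still in flux. Once the models settle, the flattened form is easier to read.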

Entities
Entity declarations are a DTD-only feature. Not in W3C XML Schema or RELAX NG (but maybe in DSDL)
A good reason for sticking with DTDs, especially for character entities
But they will make your data DTD-dependent
In publishing, losing entities has not proved a problem (surprisingly)
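The DTD dependence can be seen in a tiny example. The `p` element is an assumption made for illustration. The named entity only resolves when its declaration has been read, whereas the numeric character reference needs no declaration at all.

```xml
<!DOCTYPE p [
  <!ELEMENT p (#PCDATA)>
  <!-- Character entity: a name for U+00E9 (e with acute accent). -->
  <!ENTITY eacute "&#233;">
]>
<p>caf&eacute; and caf&#233; are the same text, but only the second form survives without the declarations.</p>
```

If the declaration moved to an external DTD that a non-validating parser chose not to read, `&eacute;` could not be expanded, while `&#233;` would still work.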

Namespaces
DTDs and Namespaces are uneasy partners
– Prefix inflexibility
– Conventions and kludges, not standard
– Buggy software (Microsoft parsers)
Avoid using Namespaces with DTDs whenever possible

But if you must …
Do not use #FIXED or default attributes in the DTD (tools will complain)
Pre-pick your prefixes, and qualify the names of vocabularies within your DTD (e.g. m: for MathML)
Declare the xmlns attribute(s) on your root elements as #REQUIRED, and use an external tool to enforce this

Example
<!ATTLIST root
  xmlns   CDATA #REQUIRED
  xmlns:m CDATA #REQUIRED>
…

But if you must (2)
This works with tools, and means your namespaces work with or without the DTD being present
Don’t get stressed: remember XSLT
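Put together with the attribute list from the example slide, a complete document might look like the sketch below. The default-namespace URI and the toy content models for `m:math` and `m:mi` are assumptions for illustration. The MathML namespace URI is the real one.

```xml
<!DOCTYPE root [
  <!ELEMENT root   (m:math)>
  <!ATTLIST root
    xmlns   CDATA #REQUIRED
    xmlns:m CDATA #REQUIRED>
  <!-- Pre-picked prefix: the DTD only knows the literal names m:math and m:mi. -->
  <!ELEMENT m:math (m:mi)>
  <!ELEMENT m:mi   (#PCDATA)>
]>
<root xmlns="http://example.com/ns/doc"
      xmlns:m="http://www.w3.org/1998/Math/MathML">
  <m:math><m:mi>x</m:mi></m:math>
</root>
```

Because the namespace declarations are written out in the instance, a namespace-aware parser sees the same names whether or not it reads the DTD. The DTD cannot check the URI values, and that check is the external tool’s job.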

Defaulting
DTDs provide the means to add items to the infoset: default attribute values
So do W3C XML Schemas; RELAX NG does not *
Using defaulting makes your document depend on your DTD/Schema; do not use it (remember XSLT)

Example
Make the value inferable, and document it
Again, remember XSLT
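The slide’s own example did not survive in this transcript. A sketch of what ‘inferable and documented’ could mean is given below, using an invented `cell` element and `align` attribute.

```xml
<!-- Defaulted form (discouraged here): a parser that reads the DTD reports
     align="left" for every cell that omits it; one that skips the DTD does not.
     <!ATTLIST cell align (left | center | right) "left">  -->

<!-- Inferable form: the attribute is simply optional.
     Documented rule: a cell with no align attribute is left-aligned. -->
<!ELEMENT cell (#PCDATA)>
<!ATTLIST cell align (left | center | right) #IMPLIED>
```

Every consumer now sees the same infoset, and the documented rule can be applied wherever it is needed, for instance in a stylesheet.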

Off-the-shelf standards
For XML: MathML, SVG, CALS or Exchange Tables, XHTML, etc.
Forget XLink: much pain, no gain
Remember there are standards for many things: country, language, date and time, latitude/longitude
Good DTDs leverage standards
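As an illustration of leaning on existing standards for attribute values, the sketch below uses an invented `event` element with made-up content. The value formats are the standard ones: a language tag in `xml:lang`, an ISO 3166-1 two-letter country code, and an ISO 8601 calendar date.

```xml
<!DOCTYPE event [
  <!ELEMENT event (#PCDATA)>
  <!ATTLIST event
    xml:lang NMTOKEN #IMPLIED
    country  NMTOKEN #REQUIRED
    date     NMTOKEN #REQUIRED>
]>
<!-- xml:lang: language tag; country: ISO 3166-1 alpha-2; date: ISO 8601 -->
<event xml:lang="en-GB" country="GB" date="2004-04-19">Example event</event>
```

The DTD can only say that the values are tokens. The standards define what the tokens mean, so nothing has to be invented or documented from scratch.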

In Summary
Pick good models
Document your DTD and control its deployment
Use Namespaces defensively
Do not use entity (or notation) declarations
Do not use attribute defaulting
Use standards where possible

Thank You
Any Questions?