Writing Your Last DTD ? Alex Brown Griffin Brown Digital Publishing Ltd.

1 Writing Your Last DTD ? Alex Brown Griffin Brown Digital Publishing Ltd

2 Background By DTD I mean simply the formal declarations as allowed by XML 1.x. A ‘last DTD’ doesn’t mean a last validation mechanism: the future is not well-formed. This presentation is in two parts: –Modelling –DTD-specific features

3 DTDs on the Wane? Some say DTDs are on the way out, and have been saying so for a while. There is some evidence of a shift, mostly driven by new tools and new XML implementers. A rise of the pipelining model of validation (DSDL) is likely. DTDs need to cooperate with other technologies. DTDs are not very complete instruments of validation.

4 Part I - Modelling

5 Human-facing XML Models XML can be seen as ‘just’ a serialisation format, in which case the models need ‘just’ to work. This presentation is also concerned with models that people experience (at some level). People often look at raw markup, and experience content models through tools (e.g. syntax-directed editors).

6 Machine-facing XML Models Desirable features: –Normalised –Machine efficient –Programmer efficient. Techniques are fairly easily borrowed from other disciplines (database schema design, type system design, etc.).

7 Machines vs People Also known as data vs documents? In reality few resources are at the extremes of this spectrum. Many resources mix data-like and document-like features. The challenge is in finding a balance and tolerating the mess.

8 Data Normalisation i.e., single items of data appear once. A really good idea for some data, e.g. link targets, database dumps.
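
For link targets, one way a DTD can express "appears once" is with ID and IDREF attributes. The following is only a sketch: the element and attribute names (article, section, xref, target) are made up for illustration and do not come from the presentation.

```xml
<?xml version="1.0"?>
<!-- Hypothetical vocabulary: all names here are illustrative assumptions. -->
<!DOCTYPE article [
  <!ELEMENT article (section+)>
  <!ELEMENT section (title, para+)>
  <!ELEMENT title   (#PCDATA)>
  <!ELEMENT para    (#PCDATA | xref)*>
  <!ELEMENT xref    EMPTY>
  <!-- The link target is declared once, identified by an ID. -->
  <!ATTLIST section id     ID    #REQUIRED>
  <!-- References point at the target by IDREF instead of copying it. -->
  <!ATTLIST xref    target IDREF #REQUIRED>
]>
<article>
  <section id="intro">
    <title>Introduction</title>
    <para>Some text.</para>
  </section>
  <section id="detail">
    <title>Detail</title>
    <para>As noted in <xref target="intro"/>, the title is stored once.</para>
  </section>
</article>
```

A validating parser checks that every ID is unique and that every IDREF matches some ID, so the section title lives in one place and the cross-reference cannot dangle.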

9 Mixed Content Normalisation is not a natural feature of human languages. The cat sat on the mat, not The cat on the mat

10 When natural language is suitable Don’t be afraid to model mixed content (‘diamonds in the mud’ approach) –e.g. bibliographic references. Sometimes the precision of human language cannot be modelled precisely –e.g. addresses.
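
A minimal sketch of the ‘diamonds in the mud’ idea for a bibliographic reference, assuming a hypothetical bibref element: the few items worth capturing are tagged, and the punctuation and connecting words around them stay as ordinary text. The names and the sample reference are invented for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical vocabulary: bibref, author, year and title are assumptions. -->
<!DOCTYPE bibref [
  <!-- Mixed content: text with the tagged items appearing in any order. -->
  <!ELEMENT bibref (#PCDATA | author | year | title)*>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT year   (#PCDATA)>
  <!ELEMENT title  (#PCDATA)>
]>
<bibref><author>Smith, J.</author> (<year>2001</year>)
<title>An Example Title</title>, 2nd edn. Example Press.</bibref>
```

The trade-off is that a DTD cannot constrain the order or number of items inside mixed content, so this model describes the reference without fully policing it.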

11 Type Hierarchies (1) [Diagram: a Credit-card element with @type=‘…’ and children Expiry, Number, Name; beside it a Credit-card element with @type=‘SWITCH’ and children Expiry, Number, Name, Issue Number ?]

12 Type Hierarchies (2) [Diagram: Credit-card as the parent of separate card-type elements: visa-card (Expiry, Number, Name), switch-card (Expiry, Number, Name, Issue Number), and further card types (etc.)]
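
One possible DTD reading of the second hierarchy, sketched under the assumption that each card type gets its own element and that only the Switch card carries an issue number. The lower-case element names and the sample values are illustrative guesses, not taken from the slides.

```xml
<?xml version="1.0"?>
<!-- Hypothetical names and values, for illustration only. -->
<!DOCTYPE credit-card [
  <!-- The wrapper holds exactly one concrete card type. -->
  <!ELEMENT credit-card  (visa-card | switch-card)>
  <!-- Each card type states precisely which children it needs. -->
  <!ELEMENT visa-card    (expiry, number, name)>
  <!ELEMENT switch-card  (expiry, number, name, issue-number)>
  <!ELEMENT expiry       (#PCDATA)>
  <!ELEMENT number       (#PCDATA)>
  <!ELEMENT name         (#PCDATA)>
  <!ELEMENT issue-number (#PCDATA)>
]>
<credit-card>
  <switch-card>
    <expiry>2030-01</expiry>
    <number>0000000000000000</number>
    <name>A N Other</name>
    <issue-number>1</issue-number>
  </switch-card>
</credit-card>
```

Compared with a single credit-card element carrying a type attribute and an optional issue number, this version lets the DTD itself reject a Switch card that lacks an issue number, at the cost of one more declaration per card type.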

13 Optional Elements? Optional often doesn’t mean ‘optional’; in practice it is used to mean ‘must exist’ or ‘must not exist’. Consider making the choice explicit, e.g. ( issue-number|no-issue-number ). Type-safe models are good for machine-facing data, but require maintenance.
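
The two styles can be put side by side in a DTD fragment. This is a sketch only: card-implicit and card-explicit are hypothetical names chosen so both versions can sit in one file, and no-issue-number is assumed to be an empty marker element.

```xml
<!-- Hypothetical DTD fragment: all element names are assumptions. -->

<!-- Implicit: an absent issue-number could mean 'not applicable'
     or 'forgotten'; the DTD cannot tell the difference. -->
<!ELEMENT card-implicit (expiry, number, name, issue-number?)>

<!-- Explicit: the author must state which case applies. -->
<!ELEMENT card-explicit (expiry, number, name,
                         (issue-number | no-issue-number))>

<!ELEMENT expiry          (#PCDATA)>
<!ELEMENT number          (#PCDATA)>
<!ELEMENT name            (#PCDATA)>
<!ELEMENT issue-number    (#PCDATA)>
<!-- Empty marker recording that there deliberately is no issue number. -->
<!ELEMENT no-issue-number EMPTY>
```

With the explicit form, a missing issue number is a validation error, not a silent ambiguity; the maintenance cost is an extra element that authors and downstream code have to know about.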

14 Mega Markup ‘Just Tag It’? Models should have a justification (often a business justification). Rich inline tagging in particular needs to be thought through (KM technologies are often better for enriching documents).

15 Part II - Practicalities

16 Documentation DTDs are comparatively easy to document: content models are terse but expressive (people like them), e.g. [example missing from transcript]. A DTD is not a .DTD – and documentation is costly! Don’t make the limits of the DTD the limits of your specification; DTDs ‘rough out’ content. We need a graphical standard for representing models (not UML please).

17 Deployment Deploy a normalised version of your DTD via a web server. Require that this authoritative version is used during data handovers. Consider requiring the use of PUBLIC identifiers.
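
A document handed over under these rules might open with a declaration like the one sketched here. Both the public identifier and the URL are placeholders invented for illustration; a real deployment would substitute its own owner name and server address.

```xml
<?xml version="1.0"?>
<!-- Hypothetical identifiers: the PUBLIC string and the URL are assumptions. -->
<!DOCTYPE article
  PUBLIC "-//Example Org//DTD Article 1.0//EN"
         "http://www.example.org/dtd/article-1.0.dtd">
<article>
  <!-- Content validated against the single authoritative DTD. -->
</article>
```

The PUBLIC identifier names the DTD independently of where it is stored, while the system identifier after it points at the authoritative copy on the web server.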

18 Parameterisation Parameter entities: macro-like features for use in DTDs. More useful in the development phase than in the mature phases of a DTD’s lifetime.
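
A short sketch of the macro-like use, assuming a hypothetical set of inline elements and a shared attribute. Note that parameter-entity references inside declarations, as used here, are only permitted in the external DTD subset, so this fragment is meant to live in its own .dtd file.

```xml
<!-- Hypothetical external DTD fragment: all names are assumptions. -->

<!-- A reusable list of inline elements. -->
<!ENTITY % inline      "emph | xref">
<!-- A reusable attribute definition. -->
<!ENTITY % common.atts "id ID #IMPLIED">

<!ELEMENT para (#PCDATA | %inline;)*>
<!ELEMENT emph (#PCDATA | %inline;)*>
<!ELEMENT xref EMPTY>

<!ATTLIST para %common.atts;>
<!ATTLIST emph %common.atts;>
<!ATTLIST xref target IDREF #REQUIRED>
```

During development, adding a new inline element means changing one entity. Once the model has settled, the indirection mostly makes the DTD harder to read, which is one argument for deploying a normalised version with the entities expanded.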

19 Entities Entity declarations are a DTD-only feature: not in W3C XML Schema or RELAX NG (but maybe in DSDL). A good reason for sticking with DTDs, especially character entities. But they will make your data DTD-dependent. In publishing, losing entities has not proved a problem (surprisingly).
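
The dependency can be seen in a small sketch. The element name para is hypothetical; the two entities follow the familiar HTML names but are declared locally here so the example is self-contained.

```xml
<?xml version="1.0"?>
<!-- Hypothetical example: 'para' is an assumed element name. -->
<!DOCTYPE para [
  <!ELEMENT para (#PCDATA)>
  <!-- Character entities: readable names for single characters. -->
  <!ENTITY eacute "&#xE9;">
  <!ENTITY nbsp   "&#xA0;">
]>
<para>caf&eacute;&nbsp;au lait, or without the DTD: caf&#xE9;&#xA0;au lait</para>
```

Both halves of the paragraph produce the same characters, but only the numeric character references survive if the declarations are stripped; a reference such as &eacute; is an error once its declaration is gone.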

20 Namespaces DTDs and Namespaces are uneasy partners: –Prefix inflexibility –Conventions and kludges, not standard –Buggy software (Microsoft parsers). Avoid using Namespaces with DTDs whenever possible.

21 But if you must … Do not use #FIXED or default attributes in the DTD (tools will complain). Pre-pick your prefixes, and qualify the names of vocabularies within your DTD (e.g. m: for MathML). Make the xmlns attribute(s) #REQUIRED on your root elements, and use an external tool to enforce this.

22 Example <!ATTLIST root xmlns CDATA #REQUIRED xmlns:m CDATA #REQUIRED> …
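
The declaration above can be filled out into a complete, if tiny, document. This is a sketch under assumptions: the default namespace URI is a placeholder, and the single m:math declaration stands in for the real MathML DTD, which would normally be pulled in separately.

```xml
<?xml version="1.0"?>
<!DOCTYPE root [
  <!-- Element names are declared with the pre-picked prefix baked in. -->
  <!ELEMENT root   (m:math)>
  <!ATTLIST root
            xmlns   CDATA #REQUIRED
            xmlns:m CDATA #REQUIRED>
  <!-- Stand-in for the MathML declarations (assumption for brevity). -->
  <!ELEMENT m:math ANY>
]>
<!-- The default namespace URI below is a hypothetical placeholder. -->
<root xmlns="http://www.example.org/ns/doc"
      xmlns:m="http://www.w3.org/1998/Math/MathML">
  <m:math/>
</root>
```

Because the namespace declarations are written out in the instance and not supplied by the DTD, a namespace-aware parser sees the same names whether or not it reads the DTD. The DTD only checks that the attributes are present, not that they hold the right URIs, which is why an external tool is still needed.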

23 But if you must (2) This works with tools, and means your namespaces work with or without the DTD being present. Don’t get stressed: remember XSLT.

24 Defaulting DTDs provide the means to add items to the infoset: default attribute values. So do W3C XML Schemas; RELAX NG does not *. Using defaulting makes your document depend on your DTD/Schema; do not use it (remember XSLT).

25 Example Make the value inferable, and document it. Again, remember XSLT.
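
The slide's own example did not survive in this transcript. As a stand-in, here is one guess at what ‘inferable and documented’ could look like, using hypothetical list elements: the first declaration relies on DTD defaulting, the second leaves the attribute optional and records the rule in a comment.

```xml
<!-- Hypothetical DTD fragment: list-defaulted, list-inferable and item
     are assumed names, not taken from the presentation. -->

<!ELEMENT item (#PCDATA)>

<!-- Defaulting: the parser adds style="bullet" to the infoset,
     but only when it has actually read this DTD. -->
<!ELEMENT list-defaulted (item+)>
<!ATTLIST list-defaulted style (bullet | number) "bullet">

<!-- Inferable: nothing is added by the parser. Documented rule:
     an absent style attribute means 'bullet'. -->
<!ELEMENT list-inferable (item+)>
<!ATTLIST list-inferable style (bullet | number) #IMPLIED>
```

With the second form the document means the same thing with or without the DTD, and the documented rule can be applied downstream, for instance by a stylesheet that treats a missing style attribute as bullet.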

26 Off-the-shelf standards For XML: MathML, SVG, CALS or Exchange Tables, XHTML, etc. Forget XLink: much pain, no gain. Remember there are standards for many things: country, language, date and time, latitude/longitude. Good DTDs leverage standards.

27 In Summary Pick good models. Document your DTD and control its deployment. Use Namespaces defensively. Do not use entity (or notation) declarations. Do not use attribute defaulting. Use standards where possible.

28 Thank You Any Questions? alexb@griffinbrown.co.uk http://www.griffinbrown.co.uk/

