Worshipping at the Shrine: Myths and Legends from comp.text.xml
Kerry “the heretic” Raymond, CiTR

2 XML is the new religion
Why is XML so popular?
No new idea, but a simplified SGML
Right place at the right time
–HTML introduced mark-up to a wide audience
–W3C “brand name”
–“Worse is better” or “simple is better than perfect”
Lots of XML-enabled products announced

3 So simple, anyone can do it!
“I am a 45yr old medical Doctor (General Practice, England) who needs the challenge of a new career. Electronics/communications/computing have been a lifetime hobby interest. I propose to learn XML etc. and combine this with my people skills to work personally with customers to provide them with information management solutions, specialising in medical environments.”

4 So inadequate, please extend it?
XML 1.0 not enough
–XML Namespaces
–XML Schema
–XML Data
–XML Link
–XML Pointer
–And so on …

5 What XML is
XML is a data format with a basic syntax and generic well-formedness rules
DTD and Schema provide model-specific validity rules
Semantics of the model
–Out of scope!
Often confused with related APIs and tools, e.g. DOM
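The split between generic rules and model-specific rules can be sketched as follows. This is a minimal illustration using Python's standard-library parser, which checks well-formedness only; the `matches_model` function and the `<patient>`/`<name>` vocabulary are hypothetical stand-ins for what a DTD or schema would express.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Generic rule: does the text parse as XML at all?"""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def matches_model(text: str) -> bool:
    """Hypothetical model-specific rule: a <patient> root with exactly one <name>.

    A DTD or schema would state this declaratively; it is hand-coded here
    only to show that it is a separate layer from well-formedness.
    """
    root = ET.fromstring(text)
    return root.tag == "patient" and len(root.findall("name")) == 1

good = "<patient><name>Smith</name></patient>"
off_model = "<invoice><total>10</total></invoice>"   # well-formed, wrong model
broken = "<patient><name>Smith</patient>"            # mismatched tags
```

Neither check says anything about what a patient name means to the application; that part stays out of scope.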

6 Ancient Myths: when comp.text.xml was new
XML is compact
XML is fast
XML will replace word processors
XML will replace relational databases
XML will replace CORBA
XML-enabled applications will interwork

7 Myth: XML is compact
XML is bigger than most proprietary formats (for the same information content)
–Content presented as text not binary
–Markup is verbose …
–Lots of nesting but little content
Often more bytes in tags than in content
–But does it matter?
Only if short of space (e.g. floppy disk)
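A rough way to see the size difference is to count bytes. The record, the element names, and the fixed binary layout below are invented for illustration; they do not correspond to any real proprietary format.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record, used only to compare encodings.
record = {"id": 1234, "temperature": 36.6, "pulse": 72}

xml_text = (
    "<reading>"
    "<id>1234</id>"
    "<temperature>36.6</temperature>"
    "<pulse>72</pulse>"
    "</reading>"
)

# Assumed binary layout: 4-byte unsigned int, 4-byte float, 2-byte unsigned int,
# little-endian with no padding.
binary = struct.pack("<IfH", record["id"], record["temperature"], record["pulse"])

root = ET.fromstring(xml_text)
# Bytes of actual content (the text inside the leaf elements).
content_bytes = sum(len((child.text or "").encode("utf-8")) for child in root)
xml_bytes = len(xml_text.encode("utf-8"))
# Everything else in the XML form is markup.
markup_bytes = xml_bytes - content_bytes

print(f"XML: {xml_bytes} bytes ({markup_bytes} markup, {content_bytes} content)")
print(f"Binary: {len(binary)} bytes")
```

In this small case the tags outweigh the content several times over, which is the point of the slide; whether that matters still depends on how short of space you are.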

8 Myth: XML is fast
How do you measure the speed of a data format anyway?
XML is both generic and heavily text-based (requiring lexical analysis and parsing), so specialised “binary” formats will be “faster”
Development of an XML-enabled tool is faster (XML parser is a generic COTS tool)
Does it matter? Depends on your application.
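One possible way to put a number on "speed" is to time how long it takes to decode the same values from each format. The sketch below assumes a made-up payload of 1000 integers and a made-up fixed-width binary layout; absolute timings depend on the machine and the parser, so none are quoted here.

```python
import struct
import timeit
import xml.etree.ElementTree as ET

# Hypothetical payload: the same 1000 integers in both encodings.
values = list(range(1000))
xml_doc = "<values>" + "".join(f"<v>{n}</v>" for n in values) + "</values>"
bin_doc = struct.pack(f"<{len(values)}I", *values)

def decode_xml(text):
    """Lexical analysis, parsing, then text-to-int conversion for every value."""
    return [int(element.text) for element in ET.fromstring(text)]

def decode_binary(data):
    """Fixed-width fields: unpack 4-byte unsigned ints directly."""
    return list(struct.unpack(f"<{len(data) // 4}I", data))

xml_seconds = timeit.timeit(lambda: decode_xml(xml_doc), number=50)
bin_seconds = timeit.timeit(lambda: decode_binary(bin_doc), number=50)
print(f"XML decode: {xml_seconds:.4f}s  binary decode: {bin_seconds:.4f}s")
```

Note what the comparison leaves out: the XML decoder is a generic off-the-shelf parser, while the binary decoder had to be written for this one layout.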

9 Myth: XML will replace word processors
A word processor is a program, not a data format
XML is about semantic markup; word processors are mostly about presentation
Word processors may use XML in addition to proprietary formats
–But interchange will require an agreed DTD/schema
–Interchange will address presentation not semantics
Word processors may enable embedding of XML tags and become XML editors

10 Myth: XML will replace relational databases
While XML can express the content of a (relational) database, it is not an efficient format for either query or update
–Could build a database engine based on XML but performance will be an issue
Use of XML may replace the use of small static databases
XML has a role for interchange
–But only if a DTD/schema is agreed
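To make the update cost concrete, here is a minimal sketch of changing one value in an XML document held as text. The `<patients>` vocabulary and the `update_age` helper are hypothetical; the point is only that the whole document is parsed, scanned, and re-serialised to change a single field.

```python
import xml.etree.ElementTree as ET

# Hypothetical data set: 1000 patient records in one XML document.
doc = (
    "<patients>"
    + "".join(f"<patient id='{i}'><age>{i % 90}</age></patient>" for i in range(1000))
    + "</patients>"
)

def update_age(text, patient_id, new_age):
    """Change one patient's age and return the new document text."""
    root = ET.fromstring(text)                  # parse the entire document
    for patient in root.iter("patient"):        # linear scan, no index
        if patient.get("id") == str(patient_id):
            patient.find("age").text = str(new_age)
            break
    return ET.tostring(root, encoding="unicode")  # re-serialise everything

updated = update_age(doc, 500, 33)
```

A database engine would normally locate the row through an index and rewrite only the affected page, which is why the slide expects performance to be an issue for an engine built directly on XML text.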

11 Myth: XML will replace CORBA
XML could be used as the underlying message format (invisible to CORBA users)
–But larger and slower than current format
Could enable interaction with non-CORBA applications
–Provided they were CORBA-DTD/Schema compliant and responded according to CORBA protocol
So not CORBA but very CORBA-aware!
XML can be carried as CORBA payload – but so can Shakespearean sonnets

12 Myth: XML-enabled applications will interwork
Yes, to the extent of parsing an XML file
Yes, to doing generic actions on that file
–E.g. create a database with corresponding structure
After that, you need semantic knowledge of the DTD/Schema (if any)
–E.g. update an existing database
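The "generic action" case can be sketched as a loader that builds a database table straight from the document's structure. The `<patients>` document and the `load_generic` helper are hypothetical, and the sketch assumes a flat document where every record has the same child elements as the first.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document.
xml_text = """
<patients>
  <patient><name>Smith</name><age>45</age></patient>
  <patient><name>Jones</name><age>61</age></patient>
</patients>
"""

def load_generic(text, conn):
    """Create a table named after the record element, one TEXT column per child."""
    root = ET.fromstring(text)
    rows = list(root)
    table = rows[0].tag
    columns = [child.tag for child in rows[0]]
    column_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({column_defs})')
    placeholders = ", ".join("?" for _ in columns)
    for row in rows:
        conn.execute(
            f'INSERT INTO "{table}" VALUES ({placeholders})',
            [row.findtext(c) for c in columns],
        )
    return table, columns

conn = sqlite3.connect(":memory:")
table, columns = load_generic(xml_text, conn)
stored = conn.execute(f'SELECT * FROM "{table}"').fetchall()
```

The loader works without knowing anything about patients, but it also cannot tell that `age` is a number or that `name` might be a key. Merging these rows into an existing database needs exactly the semantic knowledge the generic tool lacks.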

13 What XML isn’t!
Compact
Fast
A program of any sort
A communications protocol
A solution to interoperability problems
–But it can help in lots of ways

14 New Myths about XML!
We need to define specialised network protocols for it
–What’s wrong with SMTP, FTP, and HTTP?
–XML is neither small enough nor fast enough!
Driven by desire to use XML in applications in small mobile devices
–e.g. PDAs, mobile phones – low bandwidth and limited computation

15 WBXML: XML as binary
“The binary format was designed to allow for compact transmission with no loss of functionality or semantic information … allowing more effective use of XML data on narrowband communication channels … The binary format encodes the parsed physical form of an XML document, i.e., the structure and content of the document entities.”

16 Disillusionment sets in …
“Specifically I am complaining that W3C taking years and years to release XML schema including simple and obvious things like data types, which are desperately needed by small business. … instead of meeting human needs, W3C includes an endless progression of enhancements that make XML too large and complex for production developers to economically traverse. … the reason … is corruption: too many individuals in the process are motivated to keep making XML complex, delaying competitors uptake, and intentionally preventing the general population from using XML for data interoperability in business.”

17 Conclusions on XML
Use it
Abuse it
Just don’t worship it!

