We Need Smart XML Processing
- HTML has ultra-complex semantics
- XML has no semantics
- Something must bridge the gap
  - A program?
  - A clear set of data semantics?

HTML’s Complex Semantics
- In an HTML file
  - data semantics
  - presentation semantics
  - user interface semantics
  - behavioural semantics
- are all mixed together

Where has XML’s Complexity Gone?
- An XML file only has data semantics
- The remainder
  - presentation semantics
  - user interface semantics
  - behavioural semantics
- must be mapped on somehow

Who has the Complexity?
- CSS stylesheets provide a simple, linear transformation from data to presentation
- DOM programming provides an ad-hoc, completely flexible transformation
- Either way, the complexity of the application is taken on by a program

Where Should the Complexity Be?
- One approach is to have different standards which cleanly model the different semantics
  - Presentation: CSS
  - User Interface: XForms
  - Behaviour: XLink

Back to Square One with HTML?
- However, many of these new standards are under-developed
- Hence we resort to HTML as a unified presentation layer

XML Processing: XSL
- XSL provides
  - an XML vocabulary for specifying formatting semantics
  - a language for transforming XML data
- The vocabulary is the Formatting Objects DTD
- The language is XSL(T)

XSL: XML Stylesheets
- An XSL stylesheet describes how to transform your XML document into a different XML document (one that uses the formatting vocabulary)

Stylesheet Processing (theory)
mydoc.xml (+ mydoc.dtd) + style.xsl -> XSLT processor -> FOdoc.xml (+ FO.dtd) -> XSL processor -> Printed document / Web page / WAP page

Stylesheet Problems: Pragmatics
- XSLT is a final W3C Recommendation
- XSL is still under discussion
- Few practical XSL implementations exist
- Many FO features are similar to CSS[2]
- IE5+ defines HTML as the formatting language to be used with XSLT

Stylesheet Processing (practice)
mydoc.xml (+ mydoc.dtd) + style.xsl -> IE 5 / XSLT -> mydoc.html [FO.dtd] -> IE 5 -> Printed document / Web page / WAP page

XSL Stylesheet: Templates
- An XSL stylesheet consists of a number of templates
- Each template
  - matches an element in the original document
  - specifies the new content to replace the element with

Simple Stylesheet
- A simple stylesheet might look like this...
- Replace each matched swear word with *!#$%!* (censor all swear words)
- The replacement content is XHTML

XSL Complications
- …but of course it's more complicated than that
- A stylesheet is a random mix of
  - XSL native elements
  - formatting language elements
- Namespaces must be used to distinguish the two
- Also, no DTD is possible

XSL Namespaces
- An XSL processor will only treat element names in the correct namespace as significant
- Standard namespace is http://www.w3.org/1999/XSL/Transform
- Old Microsoft namespace was http://www.w3.org/TR/WD-xsl

Simple Stylesheet (2)
- So the stylesheet might look more like this...
- Replacement text: *!#$%!*

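As a minimal sketch of a namespaced censoring stylesheet: the XSL elements carry the `xsl:` prefix and the XHTML output elements sit in the default namespace. The source element name `swear` and the choice of `em` for the replacement are assumptions made for illustration only.

```xml
<?xml version="1.0"?>
<!-- Sketch only: "swear" is a hypothetical element name in the source document. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- Replace every swear element with a censored placeholder. -->
  <xsl:template match="swear">
    <em>*!#$%!*</em>
  </xsl:template>

</xsl:stylesheet>
```

Only elements in the `http://www.w3.org/1999/XSL/Transform` namespace are treated as instructions; `em` is copied to the output as literal content.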
Simple Stylesheet (3)
- or this...
- Replacement text: *!#$%!*

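One plausible reading of this variant is that the namespaces swap roles: XSL becomes the default namespace and the XHTML output elements take a prefix. In the sketch below the `html:` prefix and the `swear` element name are assumptions, not names taken from the slides.

```xml
<?xml version="1.0"?>
<!-- Sketch only: "swear" and the "html" prefix are hypothetical choices. -->
<stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml">

  <!-- Unprefixed elements are XSL instructions here;
       prefixed ones are literal output. -->
  <template match="swear">
    <html:em>*!#$%!*</html:em>
  </template>

</stylesheet>
```

Either arrangement is acceptable to a processor, because it looks at the namespace URI and not at the prefix.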
XSL: Recursive Templates
- Of course the templates get more complex too
- Example: a template that puts “ ” around the processed content of an element
- apply-templates causes all children of the current element to be processed

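A recursive template of this kind might be sketched as follows. The element name `quote` is an assumption; the point is that `xsl:apply-templates` hands the children back to the processor, so nested markup inside the element is still transformed by its own templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: "quote" is a hypothetical element name. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Wrap the processed children of quote in quotation marks. -->
  <xsl:template match="quote">
    <xsl:text>“</xsl:text>
    <xsl:apply-templates/>
    <xsl:text>”</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```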
XSL: Starting to Match at the Top
- The basis of any XSL stylesheet will be a rule that matches the document root
- Example rule content: "Here is my document", "Contact Les Carr with problems"

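A root rule along these lines could be sketched as below. The two text strings come from the slide; the surrounding XHTML layout (`h1`, `p`, the page title) is an assumption.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the XHTML layout around the two strings is assumed. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- "/" matches the document root, so this rule fires first. -->
  <xsl:template match="/">
    <html>
      <head><title>My document</title></head>
      <body>
        <h1>Here is my document</h1>
        <!-- Process the rest of the document inside the page body. -->
        <xsl:apply-templates/>
        <p>Contact Les Carr with problems</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```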
XSL: Matching with XPath
- A separate standard is used to specify the template matches
- XPath navigates around the elements in an XML document like a URL navigates around documents in the Web
- Also used in conjunction with new standards for queries and linking

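To make the URL analogy concrete, here is a small sketch of XPath expressions used inside a stylesheet. The document vocabulary (`book`, `chapter`, `title`, `para`, `isbn`) is entirely hypothetical.

```xml
<?xml version="1.0"?>
<!-- Sketch only: book, chapter, title, para and isbn are hypothetical names. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Child steps separated by "/", like path segments in a URL. -->
    <xsl:value-of select="book/chapter/title"/>
    <!-- "@" selects an attribute instead of a child element. -->
    <xsl:value-of select="book/@isbn"/>
    <!-- A predicate in brackets picks the second chapter. -->
    <xsl:value-of select="book/chapter[2]/title"/>
    <!-- "//" searches all descendants; count() is an XPath function. -->
    <xsl:value-of select="count(//para)"/>
  </xsl:template>

</xsl:stylesheet>
```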
XSL: Common Elements
- <xsl:value-of select="..."/> evaluates to the contents of the selected elements (or attributes)
- <xsl:for-each select="..."> … </xsl:for-each> explicitly iterates over each of the selected elements (or attributes)

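The two elements are often used together. The following sketch assumes a hypothetical `contacts` document whose `person` children each have a `name` child and an `email` attribute.

```xml
<?xml version="1.0"?>
<!-- Sketch only: contacts, person, name and email are hypothetical names. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:template match="contacts">
    <ul>
      <!-- Visit each person child in document order. -->
      <xsl:for-each select="person">
        <!-- Inside the loop, paths are relative to the current person. -->
        <li><xsl:value-of select="name"/> (<xsl:value-of select="@email"/>)</li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

In XSLT 1.0, `xsl:value-of` outputs the string value of the first node selected, which is why `xsl:for-each` is needed when several nodes should each produce output.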
Invoking XSL
- To attach an XSL stylesheet to an XML file, use an xml-stylesheet processing instruction
- Alternatively, script the processing in JavaScript:
  htmlString = xmlID.transformNode(xslID.XMLDocument);

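As a sketch, the processing instruction goes in the prolog of the XML file, before the top-level element. The file name `style.xsl` is the one used earlier in these slides; the `mydoc` and `para` element names are assumptions.

```xml
<?xml version="1.0"?>
<!-- The processing instruction must come before the top-level element. -->
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<!-- Sketch only: mydoc and para are hypothetical element names. -->
<mydoc>
  <para>Document content goes here.</para>
</mydoc>
```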
Microsoft XSL Gotcha
- XSL defines two implied templates
  - Recursively applied to all elements / text
- Microsoft misses these out!
- '/' is NOT equal to the top-level element

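One workaround is to declare the implied templates explicitly. The sketch below writes out the two built-in rules in standard XSLT 1.0 syntax; whether the older Microsoft dialect accepts exactly this syntax is an assumption that should be checked against the processor in use.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Implied rule 1: for the root and any element, process the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Implied rule 2: copy text (and attribute values) to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

If the stylesheet already has its own template for `/`, drop `/` from the first pattern so that two rules do not compete for the root.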
XSL Worked Example
- Here is an XML document, a message with these parts:
  - from: Les Carr
  - to: Hugh Davis
  - subject: Teaching Duties
  - content: Using Java to teach Programming Principles seems to have worked well. The standard of software engineering seems to have increased.

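The message might be marked up roughly as follows. The element names `from`, `to`, `name`, `address`, `subject` and `content` are the ones the later slides refer to; the top-level element name `message` and both addresses are placeholders invented for this sketch.

```xml
<?xml version="1.0"?>
<!-- Sketch only: "message" and the example.org addresses are assumptions. -->
<message>
  <from>
    <name>Les Carr</name>
    <address>les.carr@example.org</address>
  </from>
  <to>
    <name>Hugh Davis</name>
    <address>hugh.davis@example.org</address>
  </to>
  <subject>Teaching Duties</subject>
  <content>Using Java to teach Programming Principles seems to have worked well. The standard of software engineering seems to have increased.</content>
</message>
```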
Example: Outline
- First, define the basic outline
- Page text: "Message", "Converted by XSLMail, Les Carr"

Example: Result
- The message should appear as follows:
    From:    Les Carr
    To:      Hugh Davis
    Subject: Teaching Duties
    blah blah blah blah blah blah blah blah blah blah ….
- So we can use a table for alignment

Example: Root Template
- So here is the updated root template
- Page text: "Message", "Converted by XSLMail, Les Carr"

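A root template with a table might be sketched like this. The page text is taken from the slide; the exact XHTML layout is an assumption, as is the choice to select the parts of the message with `*/*`, which avoids depending on the name of the top-level element.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the XHTML layout is assumed, not taken from the slides. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:template match="/">
    <html>
      <head><title>Message</title></head>
      <body>
        <table>
          <!-- "/" is the node above the top-level element, so "*/*"
               selects the children of the top-level element. -->
          <xsl:apply-templates select="*/*"/>
        </table>
        <p>Converted by XSLMail, Les Carr</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```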
Example: From
- We want to convert the from element into a row in the table
- Row: From: | name (address)
  - The first cell contains the label From:
  - The next cell contains the contents of the name and address subelements

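A template for this row could be sketched as follows; the use of `tr` and `td` for the row and cells is an assumption consistent with the table layout. A template for the to element would be identical apart from the match pattern and the label.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the tr/td layout is assumed. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:template match="from">
    <tr>
      <!-- First cell: the fixed label. -->
      <td>From:</td>
      <!-- Second cell: the name, then the address in parentheses. -->
      <td><xsl:value-of select="name"/> (<xsl:value-of select="address"/>)</td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```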
Example: To
- Similarly the to element becomes a row
- Row: To: | name (address)
  - The first cell contains the label To:
  - The next cell contains the contents of the name and address subelements

Example: Subject
- And the subject element
- Row: Subject: | contents of the element
  - The first cell contains the label Subject:
  - The next cell contains the contents of this element

Example: Content
- And finally the content element
- There is only 1 cell, which contains the contents of this element

Example: Final Touch
- But don't forget the gotcha
- The full version with namespaces added is in Worked.xsl and Worked.xml

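Putting the pieces together, a complete stylesheet might look like the sketch below. It is not a copy of Worked.xsl: the XHTML layout, the `colspan` on the content cell and the `*/*` selection are assumptions, and the last two templates spell out the implied rules for processors that omit them.

```xml
<?xml version="1.0"?>
<!-- Sketch only: layout details are assumed; this is not Worked.xsl. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- Root rule: page outline with a table for alignment. -->
  <xsl:template match="/">
    <html>
      <head><title>Message</title></head>
      <body>
        <table>
          <!-- Children of the top-level element, in document order. -->
          <xsl:apply-templates select="*/*"/>
        </table>
        <p>Converted by XSLMail, Les Carr</p>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="from">
    <tr>
      <td>From:</td>
      <td><xsl:value-of select="name"/> (<xsl:value-of select="address"/>)</td>
    </tr>
  </xsl:template>

  <xsl:template match="to">
    <tr>
      <td>To:</td>
      <td><xsl:value-of select="name"/> (<xsl:value-of select="address"/>)</td>
    </tr>
  </xsl:template>

  <xsl:template match="subject">
    <tr>
      <td>Subject:</td>
      <td><xsl:apply-templates/></td>
    </tr>
  </xsl:template>

  <!-- A single cell spanning both columns holds the message body. -->
  <xsl:template match="content">
    <tr>
      <td colspan="2"><xsl:apply-templates/></td>
    </tr>
  </xsl:template>

  <!-- The gotcha: implied rules written out explicitly. They have lower
       priority than the named-element templates above. -->
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

In a conforming XSLT 1.0 processor the last two templates are redundant but harmless, since the named-element templates take precedence over the `*` pattern.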