Semantic web course – Computer Engineering Department – Sharif Univ. of Technology – Fall XML Transformation: XSLT Semantic Web - Fall 2005 Computer Engineering Department Sharif University of Technology
Outline
  Fundamentals of XSLT
  XPath
  Cocoon
XSLT
  XSLT stands for Extensible Stylesheet Language Transformations
  It is used to transform XML documents into other kinds of documents, e.g. HTML, PDF, XML, …
  XSLT uses two input files:
    – The XML document containing the actual data
    – The XSL document containing both the “framework” in which to insert the data, and XSLT commands to do so
XSLT Architecture
  Source XML doc + XSL stylesheet → XSL processor → Target document
Some special transforms
  XML to HTML: for old browsers
  XML to LaTeX: for TeX layout
  XML to SVG: graphs, charts, trees
  XML to tab-delimited: for db/stat packages
  XML to plain-text: occasionally useful
  XML to XSL-FO formatting objects
XSLT Data Model
  XSLT reads an XML document as a source tree
  It transforms the document into a result tree
  Transformations are specified in a stylesheet
  To navigate the tree, XSLT uses XPath
Introduction to XPath
  XPath is a syntax for addressing parts of an XML document by
    – describing paths through the document hierarchy
    – specifying constraints to match against the document's structure
  XSL uses XPath expressions to
    – determine which elements match a template
    – select nodes upon which to perform operations
XPath Basics
  XPath expressions superficially resemble UNIX pathnames, e.g. poem/stanza/line refers to "all line elements which are children of stanza elements which are children of poem elements"
  XPath expressions are evaluated relative to a "context node", which is analogous to the "current working directory" in UNIX or DOS
  The XPath expression for the context node itself is "."
XPath Basics: a Simple Example
  Consider the following XML document:
    Roses Ima Poet Roses are red violets are blue I'm a poet and you're not!
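The markup of the sample document did not survive on this slide, so the following is only a plausible reconstruction. The element names poem, stanza, line, punch and author are taken from the XPath expressions on the next slides; the name title, the single stanza, and the assignment of each piece of text to an element are assumptions.

```xml
<?xml version="1.0"?>
<!-- Hypothetical reconstruction: nesting and the title element are assumed -->
<poem>
  <title>Roses</title>
  <author>Ima Poet</author>
  <stanza>
    <line>Roses are red</line>
    <line>violets are blue</line>
    <line>I'm a poet</line>
    <punch>and you're not!</punch>
  </stanza>
</poem>
```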
XPath Basics: a Simple Example (cont.)
  The XPath "poem/stanza/line" selects every line element that is a child of a stanza element inside poem
XPath Basics: wildcards
  The XPath "poem/stanza/*" selects every element child of each stanza, whatever its name
XPath Basics: descendants
  The XPath "poem//punch" selects every punch element that is a descendant of poem, at any depth
XPath Basics: sequencing
  "poem/stanza/line[1]" selects the first line child of each stanza
XPath Basics: sequencing (cont.)
  "poem/stanza/line[position() = last()]" selects the last line child of each stanza
XPath Basics: selecting text nodes
  "poem/author/text()" selects the text inside the author element (here, Ima Poet)
XPath Basics: conditionals
  "poem/stanza[punch]" selects each stanza that has at least one punch child
XPath Basics: conditionals: equality
  "//line[text()="I'm a poet"]" selects every line element, anywhere in the document, whose text is exactly "I'm a poet"
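To see a few of these XPath expressions in use, here is a minimal sketch of a stylesheet that evaluates them against a poem document. It uses xsl:value-of and xsl:for-each, which the deck introduces later, and it assumes the input has a poem root with author, stanza, line and punch elements as the expressions imply.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- text node inside the author element -->
    <xsl:value-of select="poem/author/text()"/>
    <xsl:text>&#10;</xsl:text>

    <!-- first line child of each stanza, one per output line -->
    <xsl:for-each select="poem/stanza/line[1]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- number of stanzas that have at least one punch child -->
    <xsl:value-of select="count(poem/stanza[punch])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```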
A simple XSL example
  File data.xml: a message element containing Hello World!
  File render.xsl: a stylesheet with a single template that matches the root
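The two listings on this slide were lost, so what follows is a minimal sketch of what the files might look like rather than the original text. The element name message comes from the later slides; the xml-stylesheet processing instruction and the HTML wrapper (html, body, h1) are assumptions.

```xml
<?xml version="1.0"?>
<!-- data.xml: the processing instruction below (assumed) links the stylesheet -->
<?xml-stylesheet type="text/xsl" href="render.xsl"?>
<message>Hello World!</message>
```

```xml
<?xml version="1.0"?>
<!-- render.xsl: one rule, which transforms the input root (/) -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <!-- the HTML wrapper is an assumption; message comes from the slides -->
        <h1><xsl:value-of select="message"/></h1>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```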
Stylesheet (.xsl file)
  It is a well-formed XML document
  It is a collection of template rules
  A template rule consists of a pattern and a template
  The pattern is specified in XPath and locates a node of the XML tree
  The located node is replaced by the instantiated template in the result tree
The .xsl file
  An XSLT document has the .xsl extension
  The XSLT document begins with an opening xsl:stylesheet tag, which declares the XSLT version and the XSL namespace
  It contains one or more templates (xsl:template elements)
  It ends with the closing xsl:stylesheet tag
Finding the message text
  The template that matches "/" says to select the entire file
    – You can think of this as selecting the root node of the XML tree
  Inside this template, an xsl:value-of with select="message" selects the message child
    – Alternative XPath expressions that would also work: ./message, /message/text(), ./message/text()
Putting it together
  Walking through render.xsl:
  The template that matches "/" chooses the root
  The literal markup before xsl:value-of is written to the output file
  The contents of message are written to the output file
  The literal markup after xsl:value-of is written to the output file
  The resultant file contains the text Hello World! wrapped in that markup
How XSLT works
  The XML text document is read in and stored as a tree of nodes (the source tree)
  The template that matches "/" is used to select the entire tree
  The rules within the template are applied to the matching nodes, building a result tree with the new structure
    – If there are other templates, they must be called explicitly from the main template
  The source tree itself is never modified; nodes that no template or instruction selects do not appear in the result
  After the template is applied, the result tree is written out as a text document
Where XSLT can be used
  A server can use XSLT to change XML files into HTML files before sending them to the client
  A modern browser can use XSLT to change XML into HTML on the client side
    – This is what we will mostly be doing in this class
  Most users seldom update their browsers
    – If you want “everyone” to see your pages, do any XSL processing on the server side
Modern browsers
  Internet Explorer 6 best supports XML
  Netscape 6 supports some of XML
  Internet Explorer 5.x supports an obsolete version of XSL
    – If you must use IE5, the initial PI is different (you can look it up if you ever need it)
xsl:value-of
  <xsl:value-of select="XPath expression"/> selects the contents of an element and adds it to the output stream
    – The select attribute is required
    – Notice that xsl:value-of is an empty element, not a container, hence its tag needs to end with a slash
  Example (from an earlier slide): <xsl:value-of select="message"/>
xsl:for-each
  xsl:for-each is a kind of loop statement
  The syntax is <xsl:for-each select="XPath expression"> Text to insert and rules to apply </xsl:for-each>
  Example: to select every book (//book) and make an unordered list (<ul>) of their titles (title), loop over //book and write each title inside an <li>
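The loop described above can be sketched as a complete stylesheet. This is a reconstruction, not the slide's original listing; it assumes an input document containing book elements, each with a title child.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <ul>
      <!-- visit every book element anywhere in the input -->
      <xsl:for-each select="//book">
        <!-- inside the loop, paths are relative to the current book -->
        <li><xsl:value-of select="title"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```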
Filtering output
  You can filter (restrict) output by adding a criterion, in square brackets, to the select attribute's value
  For example, select="title[../author='Terry Pratchett']" will select book titles by Terry Pratchett
Filter details
  Here is the filter we just used: title[../author='Terry Pratchett']
  author is a sibling of title, so from title we have to go up to its parent, book, then back down to author
  This filter requires a quote within a quote, so we need both single quotes and double quotes
  Legal filter operators include: = != &lt; &gt; &lt;= &gt;=
    – Inside an attribute value, < must be written as &lt;
    – Numbers do not have to be quoted; a quoted number is a string, so = and != then compare it as text
But it doesn't work right!
  Here's what we did: the <li> tags sit inside the loop, and only the xsl:value-of is filtered
  This will output <li> and </li> for every book, so we will get empty bullets for authors other than Terry Pratchett
  There is no obvious way to solve this with just xsl:value-of
xsl:if
  xsl:if allows us to include content if a given condition (in the test attribute) is true
  Example: wrap the <li> in an xsl:if whose test is author='Terry Pratchett'
  This does work correctly!
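A minimal sketch of the xsl:if version follows. It is a reconstruction under the same assumptions as the book examples in the slides: book elements with title and author children. Because the whole list item sits inside the xsl:if, books by other authors produce no output at all.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="//book">
        <!-- the test is evaluated relative to the current book -->
        <xsl:if test="author='Terry Pratchett'">
          <li><xsl:value-of select="title"/></li>
        </xsl:if>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```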
xsl:choose
  The xsl:choose ... xsl:when ... xsl:otherwise construct is XSLT's equivalent of Java's switch ... case ... default statement
  The syntax is: <xsl:choose> <xsl:when test="some condition"> ... some code ... </xsl:when> <xsl:otherwise> ... some code ... </xsl:otherwise> </xsl:choose>
  xsl:choose is often used within an xsl:for-each loop
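As an illustration of xsl:choose inside a loop, the sketch below lists every title and puts Terry Pratchett's in bold. The bold formatting and the book, title and author structure are assumptions chosen for the example, not something taken from the slide.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="//book">
        <li>
          <xsl:choose>
            <!-- the first xsl:when whose test is true wins -->
            <xsl:when test="author='Terry Pratchett'">
              <b><xsl:value-of select="title"/></b>
            </xsl:when>
            <!-- used only when no xsl:when matched -->
            <xsl:otherwise>
              <xsl:value-of select="title"/>
            </xsl:otherwise>
          </xsl:choose>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```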
xsl:sort
  You can place an xsl:sort inside an xsl:for-each
  The select attribute of the sort tells what field to sort on
  Example: an xsl:sort with select="author", placed first inside a loop over //book that writes each title followed by "by" and the author
    – This example creates a list of titles and authors, sorted by author
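The sorted list can be sketched as follows. This is a reconstruction that assumes book elements with title and author children; note that xsl:sort has to come before any other content of the xsl:for-each.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="//book">
        <!-- sort the selected books by the text of their author child -->
        <xsl:sort select="author"/>
        <li>
          <xsl:value-of select="title"/>
          <xsl:text> by </xsl:text>
          <xsl:value-of select="author"/>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```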
xsl:text
  <xsl:text>...</xsl:text> helps deal with two common problems:
    – XSL isn't very careful with whitespace in the document
        This doesn't matter much for HTML, which collapses all whitespace anyway (though the HTML source may look ugly)
        <xsl:text> gives you much better control over whitespace; it acts like the <pre> element in HTML
    – Since XML defines only five entities, you cannot readily put other entities (such as &nbsp;) in your XSL
        &amp;nbsp; almost works, but &nbsp; is visible on the page
        Here's the secret formula for entities: put &amp;nbsp; inside an xsl:text element with disable-output-escaping="yes"
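Both uses of xsl:text can be sketched in one small stylesheet. The book, title and author input is an assumption. Processors are not required to support disable-output-escaping, so the numeric character reference shown at the end is offered as a more portable alternative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <p>
      <xsl:value-of select="//book[1]/title"/>
      <!-- whitespace inside xsl:text is kept; whitespace-only text
           placed directly in the template would be stripped -->
      <xsl:text> </xsl:text>
      <xsl:value-of select="//book[1]/author"/>
      <!-- writes the five characters &nbsp; to the output unescaped,
           if the processor supports disable-output-escaping -->
      <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>
      <!-- alternative: the no-break space itself, as a character reference -->
      <xsl:text>&#160;</xsl:text>
    </p>
  </xsl:template>
</xsl:stylesheet>
```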
Creating tags from XML data
  Suppose the XML contains the link text Dr. Abolhassani's Home Page together with the page's URL
  And you want to turn this into an HTML link (an <a> element with an href attribute) that displays Dr. Abolhassani's Home Page
  We need additional tools to do this!
Creating tags--solution 1
  Suppose the XML contains the link text Dr. Abolhassani's Home Page and its URL
  <xsl:attribute name="..."> adds the named attribute to the enclosing tag
  The value of the attribute is the content of this tag
  Example: an xsl:attribute named href, placed inside <a>, whose content is the value of the URL
  Result: a link that displays Dr. Abolhassani's Home Page
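A minimal sketch of solution 1 follows. The input element names link, name and url are hypothetical, chosen only for illustration, since the slide's own markup was lost.

```xml
<?xml version="1.0"?>
<!-- Assumed input: <link><name>...</name><url>...</url></link> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <a>
      <!-- adds href="..." to the enclosing <a> element -->
      <xsl:attribute name="href">
        <xsl:value-of select="link/url"/>
      </xsl:attribute>
      <!-- the link text -->
      <xsl:value-of select="link/name"/>
    </a>
  </xsl:template>
</xsl:stylesheet>
```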
Creating tags--solution 2
  Suppose the XML contains the link text Dr. Abolhassani's Home Page and its URL
  An attribute value template (AVT) consists of braces { } inside the attribute value
  The XPath expression inside the braces is replaced by its value
  Example: an <a> whose href attribute is written as braces around the path to the URL
  Result: a link that displays Dr. Abolhassani's Home Page
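The attribute value template version can be sketched like this. As before, the element names link, name and url are hypothetical stand-ins for the markup that was lost from the slide.

```xml
<?xml version="1.0"?>
<!-- Assumed input: <link><name>...</name><url>...</url></link> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- {link/url} is evaluated as XPath and replaced by its string value -->
    <a href="{link/url}">
      <xsl:value-of select="link/name"/>
    </a>
  </xsl:template>
</xsl:stylesheet>
```

The AVT form is shorter; xsl:attribute remains useful when the attribute's value needs conditional logic to build.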
Modularization
  Modularization--breaking up a complex program into simpler parts--is an important programming tool
    – In programming languages modularization is often done with functions or methods
    – In XSL we can do something similar with xsl:apply-templates
  For example, suppose we have a DTD for book with parts titlePage, tableOfContents, chapter, and index
    – We can create separate templates for each of these parts
Book example
  The root template wraps the output and calls xsl:apply-templates
  The template for tableOfContents writes the heading "Table of Contents" and applies templates to its parts
  Etc.
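One way the book example might look is sketched below. Only the part names (titlePage, tableOfContents, chapter, index) and the heading "Table of Contents" come from the slides; the HTML elements and the content of each template are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- main template: wraps the output and hands off to the others -->
  <xsl:template match="/">
    <html><body><xsl:apply-templates select="book"/></body></html>
  </xsl:template>

  <!-- process the parts of the book in a fixed order -->
  <xsl:template match="book">
    <xsl:apply-templates select="titlePage"/>
    <xsl:apply-templates select="tableOfContents"/>
    <xsl:apply-templates select="chapter"/>
    <xsl:apply-templates select="index"/>
  </xsl:template>

  <xsl:template match="titlePage">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>

  <!-- children without a template of their own fall back to the
       built-in rules, which copy their text to the output -->
  <xsl:template match="tableOfContents">
    <h2>Table of Contents</h2>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="chapter">
    <div><xsl:apply-templates/></div>
  </xsl:template>

  <xsl:template match="index">
    <h2>Index</h2>
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```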
xsl:apply-templates
  The xsl:apply-templates element applies a template rule to the current element or to the current element's child nodes
  If we add a select attribute, it applies the template rule only to the child that matches
  If we have multiple xsl:apply-templates elements with select attributes, the child nodes are processed in the same order as the xsl:apply-templates elements
Applying templates to children
  The XML data contains the title "XML" and the author "Gregory Brill"; a separate template for the author writes "by" followed by the author's name
  With the xsl:apply-templates line: XML by Gregory Brill
  Without this line: XML
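The listing for this slide was lost; a sketch consistent with the two results shown is given below. It assumes the input is a book element with title and author children, and the HTML formatting (b, i) is also an assumption.

```xml
<?xml version="1.0"?>
<!-- Assumed input:
     <book><title>XML</title><author>Gregory Brill</author></book> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body>
      <b><xsl:value-of select="/book/title"/></b>
      <!-- remove this line and only the title is written -->
      <xsl:apply-templates select="/book/author"/>
    </body></html>
  </xsl:template>

  <!-- invoked by the xsl:apply-templates call above -->
  <xsl:template match="author">
    <xsl:text> by </xsl:text>
    <i><xsl:value-of select="."/></i>
  </xsl:template>
</xsl:stylesheet>
```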
Tools for XSL Development
  There are a number of free and commercial XSL tools available
    – XSLT processors:
        MSXML, which currently supports the latest XSLT specification (native Win32)
        Xalan from Apache (C++, Java)
    – Editors and browsers:
        Internet Explorer 6.0
        XML Spy (commercial)
Cocoon
  Cocoon is Apache's dynamic XML publishing framework
  Cocoon uses XSLT
  Cocoon allows separation of content, logic and presentation, so that people can interact and collaborate on a project without stepping on each other's toes, and it supports component-based web development
  Cocoon is a web application that runs using Apache Tomcat (Cocoon.war)
What Cocoon can do (figure)
Cocoon Pipeline
  Cocoon introduced the idea of a pipeline to handle a request
  A pipeline is a series of steps for processing a particular kind of content
Sitemap
  In Cocoon, configuration information for the pipelines that an application requires is defined in a file named sitemap.xmap, the sitemap
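To make the pipeline idea concrete, here is a minimal sketch of a sitemap in Cocoon 2.x syntax. The request pattern hello.html and the file names data.xml and render.xsl are hypothetical, and the sketch assumes that the default generator, the XSLT transformer and the html serializer are already declared in a parent sitemap.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- a request for hello.html runs the three steps below -->
      <map:match pattern="hello.html">
        <!-- step 1: read the XML source -->
        <map:generate src="data.xml"/>
        <!-- step 2: transform it with an XSLT stylesheet -->
        <map:transform src="render.xsl"/>
        <!-- step 3: write the result out as HTML -->
        <map:serialize type="html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```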