Transforming XML: XML Namespaces, XSLT
XML Namespaces
Sometimes it is necessary to mix XML elements
  – Different types of content
  – Use of markup to convey meta-information
Some documents combine markup from different XML languages
But:
  – Elements and attributes from different XML languages may share the same name
  – Need to group elements for processing
XML Namespaces
XML Namespaces is the XML standard for distinguishing XML elements
Namespaces are declared with attributes (xmlns or xmlns:prefix)
Elements from the same namespace can be recognised by software as a group
Each namespace is uniquely identified by a URI
URL, URN, URI
URL: a Uniform Resource Locator
  – specifies the mechanism by which a resource is accessed
URN: a Uniform Resource Name
  – a unique sequence of characters naming an internet resource, e.g. urn:Turquoise.Inflatable.Walrus
  – the name has persistence even if the resource becomes unavailable
URI: a Uniform Resource Identifier
  – a URL or a URN (see RFC 2396)
The namespace prefix
A short string representing the namespace URI
Distinguishes element and attribute names
Defined using an xmlns:prefix attribute
A prefixed element name is called a qualified name, or QName, or a raw name
QName syntax: prefix:local_part
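A minimal sketch of the syntax above. The prefix bk, the URI http://example.org/books and the element names are made up for illustration; only the xmlns:prefix mechanism and the prefix:local_part form come from the slide.

```xml
<?xml version="1.0"?>
<!-- xmlns:bk binds the hypothetical prefix "bk" to a hypothetical namespace URI -->
<bk:catalogue xmlns:bk="http://example.org/books">
  <!-- QName = prefix:local_part, here prefix "bk" and local part "title" -->
  <bk:title>An Example Title</bk:title>
</bk:catalogue>
```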
Example
SVG and MathML both contain a set element
Both SVG and MathML can be embedded in XHTML documents
The prefixes svg and mathml are used to distinguish the set elements
  – svg:set is distinct from mathml:set
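The clash can be sketched as follows. This is an illustrative document, not the one from the slide: the surrounding elements and attribute values are assumptions, while the three namespace URIs are the standard ones for XHTML, SVG and MathML.

```xml
<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:svg="http://www.w3.org/2000/svg"
      xmlns:mathml="http://www.w3.org/1998/Math/MathML">
  <head><title>Two kinds of set</title></head>
  <body>
    <svg:svg width="4cm" height="2cm">
      <svg:rect width="1cm" height="1cm">
        <!-- svg:set is an animation element: it sets an attribute value -->
        <svg:set attributeName="fill" to="red" begin="1s"/>
      </svg:rect>
    </svg:svg>
    <mathml:math>
      <!-- mathml:set is a mathematical set with two members -->
      <mathml:set>
        <mathml:cn>1</mathml:cn>
        <mathml:cn>2</mathml:cn>
      </mathml:set>
    </mathml:math>
  </body>
</html>
```

Both elements have the local name set, but software can tell them apart because each prefix maps to a different namespace URI.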
Example 2: XML with multiple namespaces
<html xmlns= xmlns:xlink=
  Three Namespaces
  Ellipse and Rectangle
Annotations:
  – the document starts with the XML declaration
  – the xlink prefix is associated with the xlink namespace everywhere within the root element
  – the xhtml namespace is associated with the root html element and all its descendants (no prefix needed)
  – all elements shown here (blue on the slide) are in the xhtml namespace
XML with multiple namespaces (continued)
<svg xmlns= width="12cm" height="10cm">
  <rect x="4cm" y="1cm" width="3cm" height="6cm">
Annotations:
  – the svg namespace is associated with the root svg element and all its descendants (no prefix needed)
  – all elements shown here (red on the slide) are in the svg namespace
XML with multiple namespaces (continued)
<p xlink:type="simple" xlink:href="ellipses.html"> More about ellipses
<p xlink:type="simple" xlink:href="rectangles.html"> More about rectangles
Last Modified 7th October 2003
Annotations:
  – all elements shown here (blue on the slide) are in the xhtml namespace
  – the prefixed attributes (green on the slide) are in the xlink namespace
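Putting the three fragments together, the whole document might look like the sketch below. It is a reconstruction, not the original file: the head, title, h1 and body structure and the ellipse element with its attribute values are assumptions, and the namespace URIs are the standard XHTML, XLink and SVG ones.

```xml
<?xml version="1.0"?>
<!-- Default namespace: XHTML. The xlink prefix is in scope for the whole document. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <head><title>Three Namespaces</title></head>
  <body>
    <h1>Ellipse and Rectangle</h1>
    <!-- A new default namespace: svg and its descendants are SVG elements -->
    <svg xmlns="http://www.w3.org/2000/svg" width="12cm" height="10cm">
      <!-- assumed shape and values, for illustration only -->
      <ellipse cx="3cm" cy="4cm" rx="2cm" ry="3cm"/>
      <rect x="4cm" y="1cm" width="3cm" height="6cm"/>
    </svg>
    <!-- Back in the XHTML default namespace; only the attributes are in XLink -->
    <p xlink:type="simple" xlink:href="ellipses.html">More about ellipses</p>
    <p xlink:type="simple" xlink:href="rectangles.html">More about rectangles</p>
    <p>Last Modified 7th October 2003</p>
  </body>
</html>
```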
More on namespaces
A namespace can be declared in the element where it is used or in the root element
Namespaces are identified by the URI, not by the prefix used in a particular document
The parser doesn't look up the URI: it is only there as a unique identifier!
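A short sketch of the point about prefixes. The wrapper element examples and the prefixes a and b are arbitrary choices for illustration. Both set elements are in the same (SVG) namespace because their prefixes map to the same URI, and each declaration sits on the element that uses it rather than on the root.

```xml
<?xml version="1.0"?>
<examples>
  <!-- Prefix "a" bound to the SVG namespace URI on the element itself -->
  <a:set xmlns:a="http://www.w3.org/2000/svg" attributeName="fill" to="red"/>
  <!-- Prefix "b" bound to the same URI: same namespace, same element type -->
  <b:set xmlns:b="http://www.w3.org/2000/svg" attributeName="fill" to="red"/>
</examples>
```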
More on namespaces
Namespaces are completely independent of DTDs
QNames, if used, must be declared as element names in the DTD for the document to be valid
  – parameter entities are used to get round this
  – an ingenious but awkward kludge
  – not required for this module!
Namespaces are important in XSLT documents
Introduction to XSLT
What is XSL?
XML & client/server model
  – XML sits on the server but does not do anything
  – XSL provides client views of the data
XSL: eXtensible Stylesheet Language
  – two separate namespaces
      XSL-FO (Formatting Objects)
      XSLT (Transformations)
  – XPath is used to navigate the XML
Defines rules for transforming a source XML document into a target document
What is XSLT?
Transforms a source tree into a result tree by:
  – Selecting elements
  – Selecting attributes
  – Rearranging elements
  – Sorting elements
  – Applying conditional tests
XML/XSLT is similar to HTML/CSS
The XSLT transformation process
Diagram: XML source + XSLT document → XSLT processor → output document
  – the XSLT document is a set of template rules
  – the processor matches elements and replaces them using the template rules
A simple XSLT example: the source
Alan Turing computer scientist mathematician cryptographer
Richard P Feynman physicist playing the bongoes
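The markup of the source document did not survive on the slide, so the following is only a plausible sketch of it. The person and name elements are mentioned later in the slides; every other element name (people, first_name, middle_initial, last_name, profession, hobby) is an assumption chosen for illustration.

```xml
<?xml version="1.0"?>
<people>
  <person>
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
  <person>
    <name>
      <first_name>Richard</first_name>
      <middle_initial>P</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>playing the bongoes</hobby>
  </person>
</people>
```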
A simple XSLT example: the transforming stylesheet
<xsl:stylesheet version="1.0" xmlns:xsl= http ://
  – the xsl prefix identifies xsl QNames as belonging to the XSLT namespace associated with the given URI
  – the empty stylesheet contains no template rules
  – the processor will apply the default rules (see later)
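Written out in full, an empty stylesheet can be sketched as below. The namespace URI shown is the standard XSLT 1.0 one; it is what tells the processor that the xsl-prefixed elements are XSLT instructions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- No template rules here, so only the built-in default rules apply -->
</xsl:stylesheet>
```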
A simple XSLT example: the output of the transform
Alan Turing computer scientist mathematician cryptographer Richard P Feynman physicist playing the bongoes
  – the default behaviour strips out the markup and returns a text document that reproduces the content of the XML (including whitespace such as tabs and carriage returns)
  – to modify the default behaviour, we add template rules that describe how to transform elements of the source document
Template rules
A template rule is defined by an xsl:template element
  – the match attribute contains a pattern identifying the input to which the rule is applied
  – the content of the element is a template for the output produced for the matched pattern
Example 2
<xsl:stylesheet version="1.0" xmlns:xsl=
  A Person
Alan Turing computer scientist mathematician cryptographer Richard P Feynman physicist playing the bongoes
Example 2 output
A Person
  – Each person element in the original document has been replaced entirely by the template
  – The whitespace outside each person element has been preserved
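A stylesheet with this behaviour can be sketched as follows. It assumes the source wraps each entry in a person element, as the output notes state; the template body is the literal text from the slide.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Each person element, with all its content, is replaced by this text -->
  <xsl:template match="person">A Person</xsl:template>
</xsl:stylesheet>
```

Text outside the person elements is not matched by this rule, so the default rules copy it through, which is why the surrounding whitespace survives.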
Example 3
<xsl:stylesheet version="1.0" xmlns:xsl=
  A Person
  – elements used in a template must preserve the well-formedness of the stylesheet, which is itself an XML document
Example 3 output
A Person
  – The opening and closing tags in the template have also been copied over to the output
  – The whitespace outside each person element has been preserved
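A sketch of a template containing markup. The tag names were lost from the slide, so the p element used here is an assumption; any element would do as long as it is opened and closed inside the template.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- The p element (assumed) is copied to the output along with its text -->
  <xsl:template match="person">
    <p>A Person</p>
  </xsl:template>
</xsl:stylesheet>
```

An unclosed p inside the template would make the stylesheet itself ill-formed, and the processor would reject it before any transformation took place.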
xsl:value-of
An xsl element which extracts the string value of an element in the source XML
  – the string value is the text content after:
      all tags have been removed
      entity and character references have been resolved
  – the select attribute specifies the element whose value is taken
Example 4
<xsl:stylesheet version="1.0" xmlns:xsl=
Alan Turing computer scientist mathematician cryptographer Richard P Feynman physicist playing the bongoes
Example 4 output
Alan Turing Richard P Feynman
  – the output is the full text content of each name element after the tags of its child elements have been stripped out
  – the whitespace inside each name element has been preserved along with the rest of the text content
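A sketch of a stylesheet that would give this kind of output, assuming each person element has a name child as the output notes suggest.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <!-- select is evaluated relative to the matched person element -->
    <xsl:value-of select="name"/>
  </xsl:template>
</xsl:stylesheet>
```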
Example 4a
<xsl:stylesheet version="1.0" xmlns:xsl=
  -
Example 4a output
Alan Turing Richard P Feynman
  – the value of an attribute of the matched element is added to the output
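Attribute values are selected with an @ in front of the attribute name. The attribute used on the slide was lost, so the sketch below assumes a hypothetical born attribute on person. If the source has no such attribute, the second xsl:value-of simply contributes an empty string.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <!-- name is a child element; @born is a (hypothetical) attribute of person -->
    <xsl:value-of select="name"/> - <xsl:value-of select="@born"/>
  </xsl:template>
</xsl:stylesheet>
```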
xsl:apply-templates
An xsl element that can change the default order of processing
  – specify which elements should be processed next
  – process elements in the middle of processing another element
  – prevent particular elements from being processed
The select attribute contains a pattern identifying the elements to be processed at that point
Example 5
<xsl:stylesheet version="1.0" xmlns:xsl=
Example 5 output
Turing, Alan Feynman, Richard
  – The order of processing has been changed
  – The output for each person consists of the full text content of the surname element, followed by a comma and a new line, followed by the full text content of the first-name element
  – The remaining elements, such as the professions, are never processed because xsl:apply-templates bypasses them
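A sketch of how such a stylesheet could be written. The element names last_name and first_name are assumptions, since the slide's markup was lost; person and name come from the earlier examples.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Surname first, then a comma and a line break, then the first name -->
  <xsl:template match="name">
    <xsl:value-of select="last_name"/>,
    <xsl:value-of select="first_name"/>
  </xsl:template>

  <!-- Only the name child is processed; the other children of person are skipped -->
  <xsl:template match="person">
    <xsl:apply-templates select="name"/>
  </xsl:template>
</xsl:stylesheet>
```

Without the select attribute, xsl:apply-templates would process every child of person in document order, and the default rules would output the text of the other children as well.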
XSLT processor
A software component that
  – reads an XML source document and a stylesheet
  – applies the transformation rules
  – outputs the transformed document
Standalone
  – SAXON
  – Apache Xalan (used in NetBeans)
Built into a browser or application server
  – MSXML (built in to IE6)
  – Apache Cocoon (built in to Apache server)
Stylesheet Example – XML (catalog.xml)
Empire Burlesque   Bob Dylan   USA   Columbia
ESSSSSSSSS   Bruce   Uk   Cola
Stylesheet Example – XSL (cdcatalog.xsl)
My CD Collection
Title:
Artist:
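The stylesheet markup is missing from the slide. One way it could be written with the three instructions covered so far is sketched below. The source element names catalog, cd, title and artist and the HTML layout are assumptions, and the original may well have been structured differently.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Root rule: build the HTML page, then process each cd in turn -->
  <xsl:template match="/">
    <html>
      <body>
        <h2>My CD Collection</h2>
        <xsl:apply-templates select="catalog/cd"/>
      </body>
    </html>
  </xsl:template>

  <!-- One paragraph per cd, showing only its title and artist -->
  <xsl:template match="cd">
    <p>Title: <xsl:value-of select="title"/><br/>
       Artist: <xsl:value-of select="artist"/></p>
  </xsl:template>
</xsl:stylesheet>
```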
Stylesheet Example – Linking XML to Stylesheet
Empire Burlesque   Bob Dylan   USA   Columbia
ESSSSSSSSS   Bruce   Uk   Cola
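The link itself is normally made with an xml-stylesheet processing instruction placed before the root element. The sketch below shows the idea: the href matches the stylesheet file name given on the earlier slide, while the element names (catalog, cd, title, artist, country, company) are assumptions because the tags were lost.

```xml
<?xml version="1.0"?>
<!-- Tells a browser or other processor which stylesheet to apply -->
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
  </cd>
  <cd>
    <title>ESSSSSSSSS</title>
    <artist>Bruce</artist>
    <country>Uk</country>
    <company>Cola</company>
  </cd>
</catalog>
```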
Stylesheet Example – HTML Output
My CD Collection
Title: Empire Burlesque
Artist: Bob Dylan
Title: ESSSSSSSSS
Artist: Bruce
Summary
Namespaces
  – allow elements from different XML languages to be included in the same XML document
  – xmlns:xlink=
XSLT
  – Create templates with xsl:template
  – Select content with xsl:value-of
  – Control content with xsl:apply-templates