SDPL 2011XSLT: Additional features & Computing1 4.1 Additional XSLT Features n XPath for arithmetics, cross-references, and string manipulation n Generating text –for content –for attribute values n Repetition, sorting and conditional processing n Numbering document contents 4.2 Computing with XSLT
SDPL 2011XSLT: Additional features & Computing2 XPath: Arithmetics Double-precision floating-point operators +, -, *, div, mod (same as % in Java) Double-precision floating-point operators +, -, *, div, mod (same as % in Java) »e.g. 2.3 mod 1.1 ≈ 0.1 Truncating numbers, and rounding up/to the closest int: floor(x), ceiling(x), round(x) Truncating numbers, and rounding up/to the closest int: floor(x), ceiling(x), round(x) Formatting numbers as strings (e.g.): Formatting numbers as strings (e.g.): »format-number( , " 0.0 " ) = " -1.3 " –XSLT 1.0 function; uses Java DecimalFormat patterns
SDPL 2011XSLT: Additional features & Computing3 Aggregate Functions n Counting nodes »count( node-set ) –and summing them as numbers »sum( node-set ) n Example: –Average of observed temps below current node: div count(.//obs)
SDPL 2011XSLT: Additional features & Computing4 Cross-referencing Function id selects elements by their unique ID Function id selects elements by their unique ID –NB: ID attributes must be declared in DTD (See an example later) n Examples: –id('sec:intro') selects the element with unique ID "sec:intro" –id('sec:intro')/auth[3] selects the third auth of the above element –id('sec1 sec2 sec3') selects 3 sections (with corresponding ID values)
SDPL 2011XSLT: Additional features & Computing5 String manipulation Equality and inequality of strings with operators = and != Equality and inequality of strings with operators = and != –"foo" = 'foo' ; (NB alternative quotes) –"foo" != "Foo" Testing for substrings: Testing for substrings: –starts-with("dogbert", "dog") = true() –contains("dogbert", "gbe") = true() n Concatenation (of two or more strings), –concat("dog", "bert") = "dogbert"
SDPL 2011XSLT: Additional features & Computing6 More XPath string functions –substring-before("ftp://a", "//") = substring-before("ftp://a", "/") = "ftp:" –substring-after("ftp://a", "/")= "/a" –string-length("dogbert")=7 –substring( string, start, length? ): »substring("dogbert", 1, 3) = "dog" »substring("dogbert", 3) = "gbert" –translate( Str, Replaced, Replacing ): »translate("doggy","dgo","Ssi") = "Sissy"
SDPL 2011XSLT: Additional features & Computing7 Generating Text Insert text nodes with a computed value in the result: Insert text nodes with a computed value in the result: –if Expr gives a node-set, the value of the first node in document order is used (in XSLT 2.0 all, space-separated) n Example: Transform elements like Charlie Parker Charlie Parker to the form Charlie ("Bird") Parker
SDPL 2011XSLT: Additional features & Computing8 Computing generated text (2) This can be specified by template rule This can be specified by template rule (" ") (" ") xsl:text is seldom needed, but useful for including whitespace text xsl:text is seldom needed, but useful for including whitespace text
SDPL 2011XSLT: Additional features & Computing9 Attribute value templates Expression string-values can be inserted in attribute values by enclosing expressions in braces { and } Expression string-values can be inserted in attribute values by enclosing expressions in braces { and } Example: Transform source element Example: Transform source element<photo> Mary.jpg Mary.jpg into form into form
SDPL 2011XSLT: Additional features & Computing10 Attribute value templates (2) Attribute value templates (2) This can be specified by template rule This can be specified by template rule Expressions {file} and are evaluated in the context of the current node (the photo element) Expressions {file} and are evaluated in the context of the current node (the photo element)
SDPL 2011XSLT: Additional features & Computing11 XSLT: Repetition Nodes can be "pulled" from source for processing using Template Nodes can be "pulled" from source for processing using Template –Template is applied to the selected nodelist, each node in turn as the current() node »in document order, unless sorted using xsl:sort instructions (see later)
SDPL 2011XSLT: Additional features & Computing12 Example (of for-each ) n Format the below document as HTML: ]> The Joy of XML Getting Started Helen Brown says that processing XML documents is fun. Dave Dobrik agrees. Family affairs Bob Brown is the husband of Helen Brown. Finishing Up As we discussed in, processing XML documents is fun. ]> The Joy of XML Getting Started Helen Brown says that processing XML documents is fun. Dave Dobrik agrees. Family affairs Bob Brown is the husband of Helen Brown. Finishing Up As we discussed in, processing XML documents is fun.
SDPL 2011XSLT: Additional features & Computing13 Example: Table of contents n A table of contents can be formed of section titles: Table of Contents Table of Contents
SDPL 2011XSLT: Additional features & Computing14 Example (cont; Cross references) Cross-refs can also be processed using for-each: Cross-refs can also be processed using for-each: Section (...) Section (...) With this rule the source fragment With this rule the source fragment As we discussed in As we discussed in becomes As we discussed in Section (Getting …)
SDPL 2011XSLT: Additional features & Computing15 XSLT Sorting Nodes can be processed in sorted order, by xsl:for- each or xls:apply-templates, with Nodes can be processed in sorted order, by xsl:for- each or xls:apply-templates, with controlled by attributes of xsl:sort, like controlled by attributes of xsl:sort, like –select : expression for the sort key (default: "." ) –data-type : "text" (default) or "number" –order : "ascending" (default) or "descending" The first xsl:sort specifies the primary sort key, the second one the secondary sort key, and so on. The first xsl:sort specifies the primary sort key, the second one the secondary sort key, and so on.
SDPL 2011XSLT: Additional features & Computing16 Example (cont; Sorted index of names) All names can be collected in a last-name-first-name order using the below template All names can be collected in a last-name-first-name order using the below template Index, Index, n This creates an UL list with items Brown, Bob Brown, Helen Brown, Helen Dobrik, Dave Brown, Bob Brown, Helen Brown, Helen Dobrik, Dave Possible to eliminate duplicates? Yes, but a bit tricky. See next
SDPL 2011XSLT: Additional features & Computing17 Conditional processing n A template can be instantiated or ignored with Template Template Example: a comma-separated list of names: Example: a comma-separated list of names:,,
SDPL 2011XSLT: Additional features & Computing18 An aside: Meaning of position() Evaluation wrt the current node list. The above applied to a b c d by invocation yields" a,b,c,d " (← single node list); Evaluation wrt the current node list. The above applied to a b c d by invocation yields" a,b,c,d " (← single node list); With invocation from we'd get " a,b " and " c,d " (Clever!) With invocation from we'd get " a,b " and " c,d " (Clever!)
SDPL 2011XSLT: Additional features & Computing19 Conditional processing (2) A case construct ( switch in Java): A case construct ( switch in Java): … … … … … </xsl:choose>
SDPL 2011XSLT: Additional features & Computing20 Example (cont; Eliminating duplicate names) Only the current() node accessible in current node list Only the current() node accessible in current node list –but can refer to nodes in the source tree –Process just the first one of duplicate name s:,, </xsl:for-each>
SDPL 2011XSLT: Additional features & Computing21 Numbering Document Contents Formatted numbers can be generated using Formatted numbers can be generated using –by the position of the current node in the source tree »or by an explicit value=" ArithmExpr " –nodes to be counted specified by a count pattern –supports common numbering schemes : single-level, hierarchical, and sequential through levels n Typical cases in following examples »(Complete specification is rather complex) n Example 1: Numbering list items
SDPL 2011XSLT: Additional features & Computing22 Generating numbers: Example 1 n n item itemitemolapricotbananacoconut apricotbananacoconut
SDPL 2011XSLT: Additional features & Computing23 Generating numbers: Example 2 n Hierarchical numbering (1, 1.1, 1.1.1, 1.1.2, …) for titles of chapters, titles of their sections, and titles of subsections:
SDPL 2011XSLT: Additional features & Computing24 Generating numbers: Example chapSweets title title Berries sect subsect Cherry titlechap titleVegetables... Sweets Sweets Berries Berries Cherry Cherry Vegetables Vegetablespart
SDPL 2011XSLT: Additional features & Computing25 Example 2: Variation n As above, but number titles within appendices with A, A.1, A.1.1, B.1 etc:
SDPL 2011XSLT: Additional features & Computing26 Example 2: Variation AA.1A.1.1B appendix Sweetstitletitle Berries sect subsect Cherry title titleVegetables... Sweets Sweets Berries Berries Cherry Cherry Vegetables Vegetablesappendix part
SDPL 2011XSLT: Additional features & Computing27 Generating numbers: Example 3 Sequential numbering of notes within chapters : (more precisely: starting anew at the start of any chapter) Sequential numbering of notes within chapters : (more precisely: starting anew at the start of any chapter)
SDPL 2011XSLT: Additional features & Computing28 Ex 3: Sequential numbering from chap chap Yes!note noteNo! sect Perhaps?note noteOK... (1)(2)(3)(1) Yes! Yes! No! No! Perhaps? Perhaps? OK OK chap
SDPL 2011XSLT: Additional features & Computing Computing with XSLT n XSLT is a declarative rule-based language –for XML transformations –Could we use it for general computing? –What is the exact computational power of XSLT? n We've seen some programming-like features: –iteration over source nodes ( xsl:for-each ) »XSLT 2.0 supports iteration over arbitrary sequences –conditional evaluation ( xsl:if and xsl:choose )
SDPL 2011XSLT: Additional features & Computing30 Computing with XSLT n Further programming-like features: –variables (names bound to non-updatable values): –variables (names bound to non-updatable values): –callable named templates with parameters: –callable named templates with parameters:
SDPL 2011XSLT: Additional features & Computing31 Result Tree Fragments Result tree fragments built by templates can be stored, too: BAR Result tree fragments built by templates can be stored, too: BAR They can only be used as string values substring($fooBar, 2, 2) = "AR" They can only be used as string values substring($fooBar, 2, 2) = "AR" or inserted in the result: or inserted in the result: (XSLT 2.0 allows unlimited accessing of (computed) sequences, e.g., with location paths) (XSLT 2.0 allows unlimited accessing of (computed) sequences, e.g., with location paths)
SDPL 2011XSLT: Additional features & Computing32 Visibility of Variable Bindings The binding is visible in following siblings of xsl:variable, and in their descendants: The binding is visible in following siblings of xsl:variable, and in their descendants:
SDPL 2011XSLT: Additional features & Computing33 A Real-Life Example We used LaTeX to format an XML article. For this, we needed to map source table structures... to corresponding LaTeX environments: \begin{tabular}{lll} %3 left-justified cols... \end{tabular} We used LaTeX to format an XML article. For this, we needed to map source table structures... to corresponding LaTeX environments: \begin{tabular}{lll} %3 left-justified cols... \end{tabular} n How to do this?
SDPL 2011XSLT: Additional features & Computing34 Possible solution (for up to 4 columns) \begin{tabular}{l } \begin{tabular}{l } \end{tabular} \end{tabular}</xsl:template> OK, but inelegant! How to support arbitrarily many columns? 1">l 1">l</xsl:if > 2">l 2">l</xsl:if > 3">l > 3">l
SDPL 2011XSLT: Additional features & Computing35 More General Solution (1/2) Pass the column-count to a named template which generates the requested number of ‘ l ’s: Pass the column-count to a named template which generates the requested number of ‘ l ’s: \begin{tabular}{ } \begin{tabular}{ } \end{tabular} \end{tabular}
SDPL 2011XSLT: Additional features & Computing36 Solution 2/2: Recursive gen-cols 0"> 0"> </xsl:template> formal parameters
SDPL 2011XSLT: Additional features & Computing37 Iteration in XSLT 2.0 XSLT 2.0 is more convenient, by allowing iteration over generated sequences, too: \begin{tabular}{ XSLT 2.0 is more convenient, by allowing iteration over generated sequences, too: \begin{tabular}{ l l } } \end{tabular} \end{tabular}
SDPL 2011XSLT: Additional features & Computing38 Stylesheet Parameters Stylesheet can get params from command line, or through JAXP with Transformer.setParameter(name, value) : Stylesheet can get params from command line, or through JAXP with Transformer.setParameter(name, value) : </xsl:transform> $ java -jar saxon.jar dummy.xml double.xslt In= default value
SDPL 2011XSLT: Additional features & Computing39 Generating XML Test Cases n XSLT is handy for generating XML test cases n An XSLT 2.0 script to generate n sections: <!DOCTYPE xsl:transform [ Lengthy XML fragment (excluded) " > Lengthy XML fragment (excluded) " >]> <xsl:transform version="2.0" <xsl:transform version="2.0" xmlns:xsl=" xmlns:xsl="
SDPL 2011XSLT: Additional features & Computing40 XSLT 2.0 script continues A doc with sections A doc with sections Section number Section number &CONT; &CONT; </xsl:transform> Extensions in XPath 2.0, to generate and access to generate and access non-node sequences
SDPL 2011XSLT: Additional features & Computing41 Invoking the Script n The script can be used as the required (dummy) input document: $ saxon9 genTest-iter.xslt genTest-iter.xslt n=1000 $ saxon9 genTest-iter.xslt genTest-iter.xslt n=1000 n XSLT 1.0 does not support iteration over generated sequences ; In XSLT 1.0 we must recurse: A doc with sections A doc with sections
SDPL 2011XSLT: Additional features & Computing42 Tail-recursive genSecs template Section Section &CONT; &CONT; </xsl:template>
SDPL Problem with Tail-Recursion The depth of tail-recursion is linear. This may lead to problems with large n (say, above a few thousands): The depth of tail-recursion is linear. This may lead to problems with large n (say, above a few thousands): Saxon: StackOverflowError Saxon: StackOverflowError xsltproc: "A potential infinite template recursion was detected. You can adjust xsltMaxDepth (--maxdepth) in order to raise the maximum number of nested template calls and variables/params (currently set to 3000)." xsltproc: "A potential infinite template recursion was detected. You can adjust xsltMaxDepth (--maxdepth) in order to raise the maximum number of nested template calls and variables/params (currently set to 3000)." Solution: Enter Algorithm Design! Solution: Enter Algorithm Design! SDPL XSLT: Additional features & Computing
SDPL Apply Divide-and-Conquer Split the problem/task in (equal-sized) sub- problems, and solve them recursively Split the problem/task in (equal-sized) sub- problems, and solve them recursively To generate secs $first,..., $last, compute their mid-point $mid, and generate the halves $first,..., $mid, and $mid+1,..., $last: To generate secs $first,..., $last, compute their mid-point $mid, and generate the halves $first,..., $mid, and $mid+1,..., $last: SDPL XSLT: Additional features & Computing
SDPL Divide-and-Conquer genSecs template Section number Section number &CONT; &CONT; SDPL XSLT: Additional features & Computing
SDPL Divide&Conquer genSecs (cont.) </xsl:template> SDPL XSLT: Additional features & Computing
SDPL Balanced D&C Recursion Tree (draw it!) SDPL XSLT: Additional features & Computing
SDPL Effect of Divide-and-Conquer The depth of balanced divide-and-conquer recursion is logarithmic only The depth of balanced divide-and-conquer recursion is logarithmic only E.g., log(10^6) ~ 20, log(10^9) ~ 30 E.g., log(10^6) ~ 20, log(10^9) ~ 30 Piece of cake! Piece of cake! Divide-and-conquer doubles the number of recursive calls above. Does this make a big difference? Divide-and-conquer doubles the number of recursive calls above. Does this make a big difference? No; See next No; See next SDPL XSLT: Additional features & Computing
SDPL XSLT 1.0 execution times $ time saxon genTest-tailrec.xslt genTest-tailrec.xslt n= > /dev/null real0m12.808s user0m12.924s sys0m0.104s $ time saxon genTest-tailrec.xslt genTest-dc.xslt n= > /dev/null real0m13.871s (~8% more) user0m14.016s sys0m0.088s SDPL XSLT: Additional features & Computing Div&Conquer script
SDPL Iteration vs Tail-recursion vs Div&Conq Some XSLT 2.0 execution times: $ time saxon9 dummy.xml genTest-iter.xslt n= real0m11.277s user0m10.118s $ time saxon9 dummy.xml genTest-tailrec.xslt n= real0m24.803s (~120% more than iteratively) user0m24.658s $ time saxon9 dummy.xml genTest-dc.xslt n= real0m29.779s (~164% more than iteratively) user0m30.178s SDPL XSLT: Additional features & Computing
SDPL Simulating DB Queries Relational DB queries can be simulated easily, with variables and nested for-each loops Relational DB queries can be simulated easily, with variables and nested for-each loops –(but XQuery is more suitable for such) Example: Example: – Join ”tuples” by matching values of their fields – Assume information of fathers' addresses and children: SDPL XSLT: Additional features & Computing
SDPL Example ”database” Contents of the input document: Contents of the input document: SDPL XSLT: Additional features & Computing
SDPL Requested Result Children joined with their (daddies') addresses: Children joined with their (daddies') addresses: SDPL XSLT: Additional features & Computing
SDPL Solution Join children with their daddies: Join children with their daddies: SDPL XSLT: Additional features & Computing
SDPL 2011XSLT: Additional features & Computing55 Computational Power of XSLT n XSLT seems quite powerful, but how powerful is it? –Implementations provide extension mechanisms, e.g., to call arbitrary Java methods –Are there limits to XSLT processing that we can do without extensions? n We’ll see that any algorithmic computation can be simulated with plain XSLT –shown indirectly, through simulating Turing machines by XSLT
SDPL 2011XSLT: Additional features & Computing56 Turing Machine n Alan Turing 1936/37 n formal model of algorithms n primitive but powerful enough to simulate any computation expressible in any algorithmic model (Church/Turing thesis) n Turing machine –A finite set of states –unlimited tape of cells for symbols, examined by a tape head
SDPL 2011XSLT: Additional features & Computing57 Illustration of a Turing Machine state-basedcontrol
SDPL 2011XSLT: Additional features & Computing58 Control of a Turing machine Control defined by a transition function (q, a) = (q´, b, d), where d {left, right} Control defined by a transition function (q, a) = (q´, b, d), where d {left, right} –Meaning: with current state q and tape symbol a, »move to new state q´ »replace symbol a in the current cell by ‘b’ »move tape-head one step in direction d Such control can be simulated in XSLT with a recursive named-template; Call it transition Such control can be simulated in XSLT with a recursive named-template; Call it transition
SDPL 2011XSLT: Additional features & Computing59 TM for recognizing {a,b} palindromes mark move_a move_btest_b test_a YESNO return a/#, R a/a, R b/b, R b/#, R a/a, R b/b, R #/#, L b/b, L a/a, L a/#, L b/#, L a/a, L b/b, L #/#, R
SDPL 2011XSLT: Additional features & Computing60 The “ transition ” template n Parameters: –state : the current state –left : contents of the tape up to the tape head –right : contents of the tape starting at the cell pointed by the tape head Template transition simulates a single transition step; calls itself with updated parameters Template transition simulates a single transition step; calls itself with updated parameters
SDPL 2011XSLT: Additional features & Computing61 Overall structure of TM simulation </xsl:choose>
SDPL 2011XSLT: Additional features & Computing62 Updating the TM tape representation For each right-move (q, a) = (q´, b, right), concatenate 'b' at the end of $left and drop the first character of $right For each right-move (q, a) = (q´, b, right), concatenate 'b' at the end of $left and drop the first character of $right Left-moves (q 1, a) = (q 2, b, left) in a similar manner: Left-moves (q 1, a) = (q 2, b, left) in a similar manner: –drop the last character of $left, and concatenate it in front of $right whose first character has been replaced by 'b' n Example: a TM for palindromes over alphabet {a, b}; ('#' used for denoting empty cells)
SDPL 2011XSLT: Additional features & Computing63 Simulating a single step (1/2)
SDPL 2011XSLT: Additional features & Computing64 Simulating a single step (2/2) </xsl:when>
SDPL 2011XSLT: Additional features & Computing65 Sample trace of TM simulation $ saxon dummy.xml tm-palindr.xsl input-tape=aba <ACCEPT/> state shown as tape-left[state]tape-right
SDPL 2011XSLT: Additional features & Computing66 Implications of TM simulation n XSLT has full algorithmic power –It is "Turing-complete" –Is this intentional? »Not a general-purpose programming language! –Impossible to recognise non-terminating transformations automatically ( the "halting problem" has no algorithmic solution) »Could try denial-of-service attacks through non- terminating style sheets