Optimizations in XSLT (Programming Language Design and Implementation, 24/06/2004)

1 24/06/2004 Programming Language Design and Implementation 1 Optimizations in XSLT http://www-sato.cc.u-tokyo.ac.jp/schuko/XSLT-opt.ppt 24/June/04

2 Evaluation of Stylesheets
1. Find the template that matches ‘/’.
2. Evaluate the template: evaluate each child in order.
   if (child is LiteralResultElement) {
     make the same node, and evaluate its children as that node's children
     (depends on the recursive structure of trees)
   } else

3 Need for Optimization
Example: Matrix Multiplication (Itanium2 1.5GHz, 6MB 3rd-level cache; Intel Fortran Ver. 8.0)
Option   -O0    -O1    -O2     -O3
MFLOPS   8.53   94.8   140.1   3762.2

4 List of Optimizations
-O2
–Architecture Specific Optimizations such as global code scheduling, software pipelining, predication, speculation.
–Inlining of intrinsics.
–Architecture Independent Optimizations such as

5 List of Optimizations (2)
Higher Level Optimization
–Constant propagation, copy propagation, dead-code elimination, global register allocation, global instruction scheduling, control speculation, loop unrolling, code selection, partial redundancy elimination, strength reduction, induction variable simplification, variable renaming, exception optimization, tail recursion elimination, peephole optimization, structure assignment lowering, dead store elimination

6 List of Optimizations (3)
-O3
–Prefetching, scalar replacement
–Loop transformations
→ Source of performance gain in most technical computing.

7 Points of Optimizations
They are NEVER magic or ad-hoc technologies.
–Program Analysis
–Dataflow Equation based Global Analysis
–Symbolic Evaluation/Partial Evaluation
Semantics Based Optimizations.

8 Points of Optimizations(2)
Architecture Specific Optimizations
–New Features of Architectures: SuperScalar/VLIW, Vector Processing, Speculation, Prefetching, …

9 Points of Optimizations(3)
Source-to-source conversion
Accelerator + meta instruction
Algorithm transformation

10 Source-to-Source Conversion
Before:
  do i
    do j
      do k
        c(i,j) = c(i,j) + a(i,k)*b(k,j)
      end do
    end do
  end do
After:
  do j
    do k
      do i
        c(i,j) = c(i,j) + a(i,k)*b(k,j)
      end do
    end do
  end do
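
The loop interchange on this slide can be sketched in C as follows. This is only an illustration under stated assumptions: the size N, the function names and the self-check are hypothetical and not from the slides. Fortran arrays are column-major, so the slide moves i innermost; C arrays are row-major, so the analogous stride-1 order in C is i, k, j.

```c
#include <assert.h>

#define N 3  /* illustrative matrix size (assumption, not from the slides) */

/* Loop order i, j, k, as in the first loop nest on the slide.
   The innermost loop reads b[k][j] with a stride of N elements. */
void matmul_ijk(double a[N][N], double b[N][N], double c[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Interchanged order i, k, j: the innermost loop now walks c[i][j]
   and b[k][j] contiguously, which suits row-major C arrays. */
void matmul_ikj(double a[N][N], double b[N][N], double c[N][N]) {
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++)
            for (int j = 0; j < N; j++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Self-check: the interchange must not change the result. */
void check_orders_agree(void) {
    double a[N][N], b[N][N], c1[N][N], c2[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = i + j;
            b[i][j] = i - j;
            c1[i][j] = 0.0;
            c2[i][j] = 0.0;
        }
    matmul_ijk(a, b, c1);
    matmul_ikj(a, b, c2);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            assert(c1[i][j] == c2[i][j]);
}
```

Both versions accumulate each c[i][j] over k in the same order, so the results are identical; only the memory access pattern differs.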

11 Accelerator + meta instruction
Before:
  do i
    do j
      do k
        c(i,j) = c(i,j) + a(i,k)*b(k,j)
      end do
    end do
  end do
After:
  !$omp parallel do
  do i
    do j
      do k
        c(i,j) = c(i,j) + a(i,k)*b(k,j)
      end do
    end do
  end do

12 Algorithm Transformation
call bubblesort(a) → call quicksort(a)
The transformation must be shown to preserve program semantics.

13 Optimization and Tuning
Tune: adjust an engine to run smoothly
–Performance tuning
–Human side job
Optimize: make the most effective use of
–Performance or other metrics
–Complicated in general
–Computer side job
–A kind of (automatic) program transformation
They are very similar. Profiling is critical for tuning.

14 Tuning
Find the HOT SPOT
–Most resource consuming part
Profiling tools: profiling by sampling
–cc -p, gcc -pg
–prof, gprof

15 Tuning(2)
Profiling with hardware support
–Most modern processors
Instruction counts
CPU time
Cache miss rate/cache hit rate
Hardware utilization (vector unit etc.)

16 HOT SPOTS in XSLT Evaluation
Tuning of XSLT Engine + Stylesheet Optimization → Performance Improvement
HOT SPOTS in XSLT Evaluation =
–Evaluation of XPath Expressions
–Template Instantiation
(IN GENERAL, WHERE LOOP EXISTS)

17 Template Instantiation
Evaluation of
For each (target node)
1. select matching template (query required)
2. make frame
3. call template

18 Procedure Call Optimization
Interprocedural Optimization
–Dataflow across Calls
Inlining (Inline Expansion, Procedure Integration)
–Save Call Overhead
Tail Call Elimination
–Save Frame Overhead

19 Inlining
Before:
  void b(int x) { printf("%d\n", x+1); }
  void a() { b(2); }
After:
  void b(int x) { printf("%d\n", x+1); }
  void a() { printf("%d\n", 2+1); }

20 Effect of Inlining
Call Overhead Reduction
–Execution of a call (heavy):
Arguments → Stack
Return address → Stack
Address of subroutine → Program Counter
Make frame
Save registers
Execute
Destroy frame (stack unwinding)
Return address → Program Counter
Destroy arguments

21 Effect of Inlining(2)
Very effective for small functions
–Methods
–inline keyword in C++
Similar to macros

22 Effect of Inlining(3)
Further Optimization across Calls
–Most optimizations are done within a procedure.
Code Size Increase (Drawback)
Harder Program Analysis (Drawback)

23 Effect of Inlining(4)
Alias problem in Fortran
–Fortran assumes no aliases among arguments (though they exist in reality)
–If inlined, the compiler must verify that no aliases exist among the arguments (this often fails)
–Then, poor code may be generated.

24 Note on Inlining
Note that you must not do manual inlining. Be sure to write a program for inlining.

25 Inlining in XSLT
… … / … / …

26 Tail Call Elimination
Observation:

27 Tail Call Elimination(2)
Return from template a → immediate return
–Tail Call
Return from template x → immediate return
–Tail Recursion
–Used as LOOP.
In this case, the caller’s frame can be destroyed at the calls of a or x. However, an ordinary call is still performed, and a new frame is allocated for a and x.

28 Tail Call Elimination(3)
Ordinary: …; call of a; make frame of a; execute; destroy frame of a; return to x; destroy frame of x; return from x.
With elimination: …; destroy frame of x; jump to a; make frame of a; execute; destroy frame of a; return from a (= return from x).

29 Tail Call Elimination(4)
[Figure: stack of return addresses and frames. Before Elimination: call. After Elimination: jump.]

30 Tail Call Elimination(5)
This optimization requires a jump to a procedure.
–Low-level stack manipulation, such as frame destruction and return address identification, is required.

31 Tail Recursion Elimination
Tail Recursion ⊆ Tail Call
Tail call to self.
Most programming languages have a goto construct.
Source-to-source conversion is possible.
Frame creation/destruction = rewriting local variables

32 Tail Recursion Elimination(2)
Observation:
Before:
  int f(int n) {
    if (n==0) return 0;
    else return f(n-1);
  }
After:
  int ff(int n) {
  top:
    if (n==0) return 0;
    else {
      n = n-1;     /* frame destruction/creation */
      goto top;    /* jump */
    }
  }

33 Tail Recursion Elimination(3)
[Figure: stack contents. Before Optimization: each call pushes Ret address, n=…. After Optimization: one Ret address, n=n-1.]

34 Effect of Tail Call Elimination
Save frame creation/destruction cost
Save space for frame creation
–Significant when LOOP is implemented as tail recursion (XSLT, most functional languages).

35 Tail Call Elimination in XSLT
No GOTO construct
–Source-to-source conversion is not possible (×)
Optimization of the XSLT engine
–Recognition of tail call/recursion.
–Frame adjustment, and jump.

36 Tail Recursion Elimination in General
  int fact(int n) {
    if (n == 0) return 1;
    else return n * fact(n-1);
  }
Not Tail Recursion

37 Tail Recursion Elimination in General(2)
  int fact2(int n, int res) {
    if (n==0) return res;
    else return fact2(n-1, n * res);
  }
Tail Recursion

38 Tail Recursion Elimination in General(3)
Before:
  int fact2(int n, int res) {
    if (n==0) return res;
    else return fact2(n-1, n*res);
  }
  int fact(int n) { return fact2(n, 1); }
After:
  int fact2(int n, int res) {
  top:
    if (n==0) return res;
    else {
      res = n * res;   /* update res before n is decremented */
      n = n-1;
      goto top;
    }
  }
  int fact(int n) { return fact2(n, 1); }

39 Tail Recursion Elimination in General(4)
How to rewrite non-tail recursion into tail recursion:
–Commutative, associative operations
–Some linearity in calls
–Introduction of intermediate variables.

40 Accumulator type
Before:
  fun f(n)
    …
    call f(n-1);
    some-instructions(n);
    return;
After:
  top:
    …
    r *= some-instruction(n);
    n ← n-1;
    goto top;

41 Accumulator type(2)
If the call graph is of the type f → f → f → f → … (linear), then we can use r as an accumulator:
r ← insn(n); r ← r*insn(n-1); r ← r*insn(n-2); …
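
The accumulator rewrite can be sketched in C as below. The function insn is a hypothetical stand-in for the slide's insn(n), and the names prod_rec and prod_acc are assumptions made for illustration. The rewrite is valid here because multiplication is commutative and associative, which lets the products be collected in the opposite order.

```c
#include <assert.h>

/* Hypothetical per-step computation standing in for the slide's insn(n). */
static long insn(long n) { return n + 1; }

/* Linear, non-tail recursion: the multiplication happens after the
   recursive call returns, so every call needs its own frame. */
long prod_rec(long n) {
    assert(n >= 0);
    if (n == 0) return insn(0);
    return prod_rec(n - 1) * insn(n);
}

/* Accumulator form: r <- insn(n); r <- r*insn(n-1); ... down to insn(0).
   A single frame is reused and the recursion becomes a loop. */
long prod_acc(long n) {
    assert(n >= 0);
    long r = insn(n);
    while (n > 0) {
        n = n - 1;
        r = r * insn(n);
    }
    return r;
}
```

With this choice of insn, both functions compute (n+1)!, so prod_rec(3) and prod_acc(3) both give 24.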

42 XPATH Expression Optimization
Loops are a major source of performance improvement.
In XSLT, loops appear in recursion and in XPath expressions.

43 XML Tree Structure
Not the same as Unix file system.
[Figure: example tree with nodes labeled x, a, a, b, b, b]

44 Simple Evaluator
Evaluate(current, a/b/c):
  S ← φ;
  for-each x (child-node of current) {    (Loop)
    if (name(x) == a) {
      S ← S ∪ Evaluate(x, b/c);
    }
  }
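
A minimal C sketch of this evaluator is given below, assuming a hypothetical Node type with a fixed-size child array and a path passed as an array of step names; none of these names come from the slides. Only child-axis name tests are handled, and because the input is a tree, appending to the output array is enough to form the union.

```c
#include <assert.h>
#include <string.h>

#define MAX_CHILDREN 4  /* illustrative limit (assumption) */

typedef struct Node {
    const char *name;
    struct Node *child[MAX_CHILDREN];
    int nchild;
} Node;

/* Evaluate a child-axis path such as a/b/c, given as steps[0..nsteps-1],
   starting from current. Matching nodes are appended to out (capacity cap)
   in document order; the updated result count is returned. */
int evaluate(const Node *current, const char **steps, int nsteps,
             const Node **out, int count, int cap) {
    assert(nsteps >= 1);
    for (int i = 0; i < current->nchild; i++) {   /* the loop on the slide */
        const Node *x = current->child[i];
        if (strcmp(x->name, steps[0]) != 0)
            continue;                             /* name test failed */
        if (nsteps == 1) {
            if (count < cap)
                out[count++] = x;                 /* last step: x is a result */
        } else {
            /* S <- S U Evaluate(x, remaining steps) */
            count = evaluate(x, steps + 1, nsteps - 1, out, count, cap);
        }
    }
    return count;
}
```

The cost is dominated by the loop over children at every step, which is why the later slides concentrate on removing redundant traversals.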

45 Menu of Optimizations
Partial Evaluation/Symbolic Evaluation
–Statically obtain the result before evaluation.
Dataflow Equation Based Optimization
–Solve equations for optimality in dataflow
Redundancy Elimination

46 Menu of Optimizations(2)
Loop Optimization
Memory Hierarchy Optimization
Hardware Resource Utilization
Semantics Based Optimization

47 Partial Evaluation/Symbolic Evaluation
Definition: specialize code by replacing a part of the code with statically evaluated code.
Static evaluation and specialization are essential.

48 Example of Specialization
Before:
  f(n, t) { if (t == 0) return g(n); else return h(n); }
  p(n) { return f(n, 0); }
After:
  p(n) { return f0(n); }
  f0(n) { return g(n); }

49 Partial Evaluation in General
Strictly, Partial Evaluation is a Specialization. However, together with Symbolic Evaluation, constant propagation and constant folding are also classified as Partial Evaluation.

50 Constant Propagation/Folding
  a = 1; if (a+1 == 1) return 1; else return 2;
→ a = 1; if (1+1 == 1) return 1; else return 2;
→ return 2;

51 PE/SE in XSLT
Evaluation of Variables, or Predicates
–a/b[position() >= 0]
–… $a… → “1”

52 Redundancy Elimination
Eliminate redundant computation in dataflow.
A=x+3+y; B=x+3+y;
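
As a small illustration of the slide's A/B example, the two C functions below show the computation before and after the repeated expression is removed. The function names and the pointer-based interface are assumptions made only so that the sketch compiles on its own.

```c
#include <assert.h>

/* Before: the expression x + 3 + y is evaluated twice. */
void compute_redundant(int x, int y, int *a, int *b) {
    assert(a && b);
    *a = x + 3 + y;
    *b = x + 3 + y;
}

/* After redundancy elimination: the expression is evaluated once
   and its value is reused for both results. */
void compute_once(int x, int y, int *a, int *b) {
    assert(a && b);
    int t = x + 3 + y;
    *a = t;
    *b = t;
}
```

The rewrite is safe because x and y are not modified between the two assignments; establishing that fact in general is what the dataflow analysis is for.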

53 Redundancy Elimination(2)
Redundancy: Dataflow Analysis Required.
–Compute the Same Expression
–Compute the Known Value

54 Redundancy Elimination in XPATH
Common SubExpression Elimination
–A/B|A/C → A/(B|C)
–Interpreted in operational semantics:
  for-each x (node ∈ current)
    if (name(x) == A) {
      for-each y (node ∈ child of x) {
        if (name(y) == B or name(y) == C) {
          Sol = Sol ∪ {y}
        }
      }
    }

55 Redundancy Elimination in XPATH(2)
Loop Invariant Hoisting
–A/B[../@category = ‘fiction’] → A[@category = ‘fiction’]/B
–Interpreted in operational semantics:
  for-each x (node ∈ current)
    if (name(x) == A) {
      for-each y (node ∈ child of x) {
        if (name(y) == B) {
          if (parent(y).@category == ‘fiction’) {
            Sol = Sol ∪ {y}
          }
        }
      }
    }

56 Redundancy Elimination in XPATH(3)
  for-each x (node ∈ current)
    if (name(x) == A) {
      if (x.@category == ‘fiction’) {
        for-each y (node ∈ child of x) {
          if (name(y) == B) {
            Sol = Sol ∪ {y}
          }
        }
      }
    }

57 Redundancy Elimination in XPATH(4)
Value Numbering
–DAG representation of expressions: a + a * (b - c) + (b - c) * a * c
[Figure: DAG of the expression]

58 DAG → Instructions
1 ← mknode(id, a)
2 ← mknode(id, a) = 1
3 ← mknode(id, b)
4 ← mknode(id, c)
5 ← mknode(-, 3, 4)
6 ← mknode(*, 2=1, 5)
7 ← mknode(+, 1, 6)
8 ← mknode(id, b) = 3
9 ← mknode(id, c) = 4
10 ← mknode(-, 8=3, 9=4) = 5
11 ← mknode(id, a) = 1
12 ← mknode(*, 10=5, 11=1)
13 ← mknode(id, c) = 4
14 ← mknode(*, 12, 13=4)
15 ← mknode(+, 7, 14)

59 Instructions → Register Transfer
Instructions 1 to 15 of slide 58 become:
Load a, r1
Load b, r3
Load c, r4
Isub r3, r4, r5
Imul r1, r5, r6
Iadd r1, r6, r7
Imul r5, r1, r12
Imul r12, r4, r14
Iadd r7, r14, r15
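
The mknode step can be sketched in C as follows. This is an illustration under assumptions, not the lecture's implementation: the Dag and DagNode types, the linear search and the helper names are hypothetical. Unlike the slide, which numbers every mknode call from 1 to 15, this sketch numbers only distinct nodes, so the same expression yields value numbers 1 to 9, one per register-transfer instruction.

```c
#include <assert.h>

#define MAX_NODES 64  /* illustrative capacity (assumption) */

typedef struct {
    char op;          /* '+', '-', '*', or 'i' for an identifier leaf */
    int left, right;  /* value numbers of the operands, 0 for a leaf */
    char id;          /* identifier name for a leaf, 0 otherwise */
} DagNode;

typedef struct {
    DagNode node[MAX_NODES];
    int count;
} Dag;

/* Return the value number of the node (op, left, right, id).
   An existing identical node is reused; otherwise a new one is made. */
int mknode(Dag *d, char op, int left, int right, char id) {
    for (int i = 0; i < d->count; i++) {
        const DagNode *n = &d->node[i];
        if (n->op == op && n->left == left && n->right == right && n->id == id)
            return i + 1;                 /* redundant computation found */
    }
    assert(d->count < MAX_NODES);
    d->node[d->count] = (DagNode){op, left, right, id};
    return ++d->count;
}

int leaf(Dag *d, char id) { return mknode(d, 'i', 0, 0, id); }

/* Build a + a*(b-c) + (b-c)*a*c with the same 15 mknode calls as the
   slide and return the value number of the root. */
int build_example(Dag *d) {
    int a1 = leaf(d, 'a');
    int a2 = leaf(d, 'a');                /* reuses a1 */
    int b1 = leaf(d, 'b');
    int c1 = leaf(d, 'c');
    int s1 = mknode(d, '-', b1, c1, 0);
    int m1 = mknode(d, '*', a2, s1, 0);
    int p1 = mknode(d, '+', a1, m1, 0);
    int b2 = leaf(d, 'b');                /* reuses b1 */
    int c2 = leaf(d, 'c');                /* reuses c1 */
    int s2 = mknode(d, '-', b2, c2, 0);   /* reuses s1 */
    int a3 = leaf(d, 'a');                /* reuses a1 */
    int m2 = mknode(d, '*', s2, a3, 0);
    int c3 = leaf(d, 'c');                /* reuses c1 */
    int m3 = mknode(d, '*', m2, c3, 0);
    return mknode(d, '+', p1, m3, 0);
}
```

As on the slides, a*(b-c) and (b-c)*a are kept as separate nodes because operand order is compared literally; treating * as commutative would merge them, which is a design choice this sketch does not make.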

60 Redundancy Elimination in XPATH
Dead Code Elimination

61 Redundancy Elimination in XPATH(4)
There can be many other optimizations for the evaluation of XPath expressions.
Node-set calculation includes loops, which are a major source of performance improvement.

62 Other Optimizations
Dataflow Equations
Type Checking
Semantics Based Optimizations

