Compilation 2007 Optimization Michael I. Schwartzbach BRICS, University of Aarhus.



2 Optimization
- The optimizer aims at:
  - reducing the runtime
  - reducing the code size
- These goals often conflict, since a larger program may in fact be faster
- The best optimizations achieve both goals
- An optimizer may also have more esoteric aims:
  - reducing energy consumption
  - reducing chip area

3 Optimizations for Space
- Were historically important, because memory was small and expensive
- When memory became large and cheap, optimizing compilers traded space for time
- Java compilers do not optimize much, but JVM bytecodes are designed to be small
- When Java is targeted at mobile devices, space optimizations are again important

4 Optimizations for Speed
- Were historically important to gain acceptance for the introduction of high-level languages
- Are still important, since software always strains the limits of the hardware
- Are challenged by ever higher abstractions in programming languages and must constantly adapt to changing microprocessor architectures
- Java compilers do not optimize much, since the JVM kicks in with the JIT compiler

5 Opportunities for Optimization
- At the source code level
- At an intermediate low level
- At the binary machine code level
- At runtime (JIT compilers)
- At the hardware level
- An aggressive optimization requires many small contributions from all levels

6 Optimizers Must Undo Abstractions
- Variables abstract away from registers, so the optimizer must find an efficient mapping
- Control structures abstract away from gotos, so the optimizer must simplify a goto graph
- Data structures abstract away from memory, so the optimizer must find an efficient layout
- Method invocations abstract away from procedure calls, so the optimizer must efficiently determine the intended implementation

7 Difficult Compromises
- A high abstraction level makes development time cheaper, but runtime more expensive
- An optimizing compiler makes runtime more efficient, but compile time less efficient
- Optimizations for speed and size may conflict
- Different applications may require different choices at different times

8 Examples of Optimizations
- Strength reduction
- Loop unrolling
- Common subexpression elimination
- Loop invariant code motion
- Inline expansion
- These may take place either at the source level or at the bytecode level
- Most require information from static analyses

9 Strength Reduction
- Replace expensive operations with cheap ones:

    for (i = 0; i < a.length; i++)
      a[i] = a[i] + i/4;

  becomes

    for (i = 0; i < a.length; i++)
      a[i] += (i >> 2);
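The rewrite above can be checked directly. One caveat worth noting: in Java, `i / 4` and `i >> 2` only agree for non-negative `i` (division truncates toward zero, arithmetic shift rounds toward negative infinity), which is fine here since `i` is a loop index. A minimal sketch, with hypothetical helper names:

```java
// Strength reduction sketch: i/4 replaced by the cheaper i >> 2.
// Only valid because i is non-negative in this loop.
public class StrengthReduction {
    static int[] original(int[] a) {
        for (int i = 0; i < a.length; i++)
            a[i] = a[i] + i / 4;       // division
        return a;
    }

    static int[] reduced(int[] a) {
        for (int i = 0; i < a.length; i++)
            a[i] += (i >> 2);          // cheap shift
        return a;
    }

    public static void main(String[] args) {
        int[] x = {10, 10, 10, 10, 10, 10};
        int[] y = x.clone();
        System.out.println(java.util.Arrays.equals(original(x), reduced(y))); // true
        System.out.println((-7) / 4);   // -1: division truncates toward zero
        System.out.println((-7) >> 2);  // -2: shift rounds toward -infinity
    }
}
```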

10 Loop Unrolling
- Unfold a loop to save condition tests:

    for (i = 0; i < 100; i++)
      g(i);

  becomes

    for (i = 0; i < 100; i += 2) {
      g(i);
      g(i+1);
    }
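Both versions above behave identically; unrolling by 2 is only safe because the trip count (100) is even. A small sketch with `g` as a stand-in workload (the method name is an assumption, not from a real library):

```java
// Loop unrolling sketch: the unrolled loop performs half as many
// condition tests but computes the same result.
public class LoopUnrolling {
    static int sum;
    static void g(int i) { sum += i; }   // stand-in for the loop body

    static int rolled() {
        sum = 0;
        for (int i = 0; i < 100; i++) g(i);
        return sum;
    }

    static int unrolled() {
        sum = 0;
        for (int i = 0; i < 100; i += 2) { g(i); g(i + 1); }  // 50 tests instead of 100
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(rolled() == unrolled());  // true: both compute 0+1+...+99 = 4950
    }
}
```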

11 Common Subexpression Elimination
- Avoid redundant computations:

    double d = a * Math.sqrt(c);
    double e = b * Math.sqrt(c);

  becomes

    double tmp = Math.sqrt(c);
    double d = a * tmp;
    double e = b * tmp;
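A point implicit in the slide: this rewrite is only sound because Math.sqrt is a pure function; hoisting a side-effecting call would change behavior. A quick check of the transformation (method names are illustrative):

```java
// CSE sketch: Math.sqrt(c) is computed once instead of twice.
// Valid because sqrt has no side effects.
public class Cse {
    static double before(double a, double b, double c) {
        double d = a * Math.sqrt(c);
        double e = b * Math.sqrt(c);   // redundant recomputation
        return d + e;
    }

    static double after(double a, double b, double c) {
        double tmp = Math.sqrt(c);     // computed once
        double d = a * tmp;
        double e = b * tmp;
        return d + e;
    }

    public static void main(String[] args) {
        System.out.println(before(2, 3, 16) == after(2, 3, 16)); // true: both 20.0
    }
}
```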

12 Loop Invariant Code Motion
- Move constant valued expressions outside loops:

    for (i = 0; i < a.length; i++)
      b[i] = a[i] + c * d;

  becomes

    int tmp1 = a.length;
    int tmp2 = c * d;
    for (i = 0; i < tmp1; i++)
      b[i] = a[i] + tmp2;
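Hoisting `a.length` and `c * d` is only safe because neither `a`, `c`, nor `d` is assigned inside the loop; an optimizer must establish that with a static analysis. A sketch of the before/after pair (method names are illustrative):

```java
// Loop-invariant code motion sketch: the bound and the invariant
// expression are computed once before the loop.
public class Licm {
    static int[] before(int[] a, int c, int d) {
        int[] b = new int[a.length];
        for (int i = 0; i < a.length; i++)
            b[i] = a[i] + c * d;       // c * d recomputed every iteration
        return b;
    }

    static int[] after(int[] a, int c, int d) {
        int[] b = new int[a.length];
        int tmp1 = a.length;           // hoisted bound
        int tmp2 = c * d;              // hoisted invariant expression
        for (int i = 0; i < tmp1; i++)
            b[i] = a[i] + tmp2;
        return b;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        System.out.println(java.util.Arrays.equals(before(a, 2, 5), after(a, 2, 5))); // true
    }
}
```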

13 Inline Expansion
- Replace method invocations with copies:

    int pred(int x) {
      if (x == 0) return x;
      else return x-1;
    }

    int f(int y) {
      return pred(y) + pred(0) + pred(y+1);
    }

  becomes

    int f(int y) {
      int tmp = 0;
      if (y == 0) tmp += 0; else tmp += y-1;
      if (0 == 0) tmp += 0; else tmp += 0-1;
      if (y+1 == 0) tmp += 0; else tmp += (y+1)-1;
      return tmp;
    }

14 Collaborating Optimizations
- Optimizations may enable other optimizations:

    int f(int y) {
      int tmp = 0;
      if (y == 0) tmp += 0; else tmp += y-1;
      if (0 == 0) tmp += 0; else tmp += 0-1;
      if (y+1 == 0) tmp += 0; else tmp += (y+1)-1;
      return tmp;
    }

  simplifies to

    int f(int y) {
      if (y == 0) return 0;
      else if (y == -1) return -2;
      else return y+y-1;
    }
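The three versions of f from the inline-expansion example can be checked against each other; constant folding and branch simplification collapse the inlined body into the three-case form. A sketch (class and method names are illustrative):

```java
// The original f, its inline-expanded form, and the fully
// simplified form all compute the same function.
public class Collab {
    static int pred(int x) { return x == 0 ? x : x - 1; }

    static int fOriginal(int y) { return pred(y) + pred(0) + pred(y + 1); }

    static int fInlined(int y) {
        int tmp = 0;
        if (y == 0) tmp += 0; else tmp += y - 1;
        if (0 == 0) tmp += 0; else tmp += 0 - 1;   // foldable: condition is constant
        if (y + 1 == 0) tmp += 0; else tmp += (y + 1) - 1;
        return tmp;
    }

    static int fSimplified(int y) {
        if (y == 0) return 0;
        else if (y == -1) return -2;
        else return y + y - 1;
    }

    public static void main(String[] args) {
        for (int y = -5; y <= 5; y++)
            if (fOriginal(y) != fInlined(y) || fOriginal(y) != fSimplified(y))
                throw new AssertionError("mismatch at y=" + y);
        System.out.println("all agree");
    }
}
```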

15 Optimization in Joos

    public int foo(int a, int b, int c) {
      c = a*b+c;
      if (c<a) a = a+b*113;
      while (b>0) { a = a*c; b = b-1; }
      return a;
    }

  Unoptimized (52 bytecodes):

    iload_1
    iload_2
    imul
    iload_3
    iadd
    dup
    istore_3
    pop
    iload_3
    iload_1
    if_icmplt true1
    iconst_0
    goto end2
    true1: iconst_1
    end2: ifeq false0
    iload_1
    iload_2
    bipush 113
    imul
    iadd
    dup
    istore_1
    pop
    false0: goto cond4
    loop3: iload_1
    iload_3
    imul
    dup
    istore_1
    pop
    iload_2
    iconst_1
    isub
    dup
    istore_2
    pop
    cond4: iload_2
    iconst_0
    if_icmpgt true5
    iconst_0
    goto end6
    true5: iconst_1
    end6: ifne loop3
    iload_1
    ireturn

  Optimized (27 bytecodes):

    iload_1
    iload_2
    imul
    iload_3
    iadd
    istore_3
    iload_3
    iload_1
    if_icmpge cond4
    iload_1
    iload_2
    bipush 113
    imul
    iadd
    istore_1
    goto cond4
    loop3: iload_1
    iload_3
    imul
    istore_1
    iinc 2 -1
    cond4: iload_2
    ifgt loop3
    iload_1
    ireturn

16 Peephole Optimizations
- Make local improvements in bytecode sequences
- The optimizer considers only finite windows of the sequence
- When a pattern "clicks", the optimizer rewrites a part of the code using a template:

    dup istore 3 pop   →   istore 3
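The windowed matching described above can be sketched as a loop over a symbolic instruction list. This is a toy model of the idea, not the Joos implementation; instructions are plain strings and only the dup/istore/pop pattern is handled:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy peephole optimizer: slides a 3-instruction window over a
// symbolic bytecode list and rewrites "dup / istore n / pop"
// to "istore n", repeating until no pattern clicks.
public class Peephole {
    static List<String> optimize(List<String> code) {
        List<String> out = new ArrayList<>(code);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i + 2 < out.size(); i++) {
                if (out.get(i).equals("dup")
                        && out.get(i + 1).startsWith("istore")
                        && out.get(i + 2).equals("pop")) {
                    String istore = out.get(i + 1);
                    out.subList(i, i + 3).clear();  // remove the 3 matched instructions
                    out.add(i, istore);             // insert the 1-instruction template
                    changed = true;
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
            "iload_1", "iload_2", "imul", "dup", "istore_3", "pop", "iload_3", "ireturn");
        System.out.println(optimize(code));
        // [iload_1, iload_2, imul, istore_3, iload_3, ireturn]
    }
}
```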

17 Peephole Transitions
- Let P be a collection of peephole patterns
- It defines a transition relation on sequences of bytecodes:

    B1 ⇒p B2

  meaning that p ∈ P clicked at some position in the sequence B1 and produced the sequence B2

18 Termination
- A collection of peephole patterns must terminate
- This means that for the collection P, there must not exist an infinite sequence:

    B0 ⇒p1 B1 ⇒p2 B2 ⇒p3 B3 ⇒p4 ...

  for any B0 and pi ∈ P

19 Soundness (1/2)
- Every peephole pattern must preserve semantics
- Assume the pattern p transforms a bytecode sequence B1 into the sequence B2
- Consider now any bytecode context C
- If C[B1] emits the verifiable code E1, then C[B2] must emit some verifiable code E2 with the same semantics

20 Soundness (2/2)

    ∀C ∀B1:  if  B1 ⇒p B2  and  C[B1] emits E1,
    then  C[B2] emits some E2  with  E1 ≡ E2

21 A Peephole Pattern Language
- Joos has a domain-specific language for specifying peephole patterns
- The Joos compiler contains an interpreter for this peephole language
- It is invoked with the option -O patternfile
- It will try all patterns in an unspecified order until no pattern clicks anywhere

22 Pattern Syntax

    pattern → pattern name var : exp -> intconst templates

- The exp determines whether the pattern clicks
- The intconst tells how many bytecodes to replace
- The templates specify the new bytecodes
- The evaluation of exp produces a set of bindings that may be used inside the templates and later in the expression

23 Expression Types
- The following types are possible results:
  - int
  - label
  - type-signature
  - field-signature
  - method-signature
  - string
  - condition
  - bytecodes
  - boolean
- The notation inst(σ1, ..., σk) means that the given instruction has these arguments in the JVM specification

24 Pattern Expressions

    exp → var
        | degree var
        | target var
        | formals var
        | returns var
        | negate exp
        | commute exp
        | exp intop exp
        | exp intcomp exp
        | exp comp exp
        | exp ~ peepholes
        | ! exp
        | exp && exp
        | exp || exp
        | intconst
        | condconst

25 Peepholes and Templates

    peepholes → peephole*
    peephole  → instruction
              | instruction ( vars )
              | *                       (any single instruction)
              | var :                   (label binder)
    templates → template*
    template  → instruction
              | instruction ( exps )
    condconst → eq | ne | lt | le | gt | ge | aeq | ane
    intop     → + | - | * | / | %
    intcomp   → < | <= | > | >=
    comp      → == | !=

26 Peephole Judgements
- The judgement:

    ⊢ E : σ [β → β']

  means that the expression E:
  - evaluates to a result of type σ
  - consumes the bindings β
  - produces the bindings β'
- The judgement:

    ⊢ X : [β → β']

  similarly describes peepholes, templates, and patterns

27 Expression Well-Formedness (1/5)

    β(x) = σ                  ⟹  ⊢ x : σ [β → β]
    β(x) = label              ⟹  ⊢ degree x : int [β → β]
    β(x) = label              ⟹  ⊢ target x : bytecodes [β → β]
    β(x) = method-signature   ⟹  ⊢ formals x : int [β → β]

28 Expression Well-Formedness (2/5)

    β(x) = method-signature        ⟹  ⊢ returns x : int [β → β]
    ⊢ E : condition [β → β']       ⟹  ⊢ negate E : condition [β → β']
    ⊢ E : condition [β → β']       ⟹  ⊢ commute E : condition [β → β']

29 Expression Well-Formedness (3/5)

    ⊢ E1 : int [β → β']   ⊢ E2 : int [β' → β'']   ⟹  ⊢ E1 intop E2 : int [β → β'']
    ⊢ E1 : int [β → β']   ⊢ E2 : int [β' → β'']   ⟹  ⊢ E1 intcomp E2 : boolean [β → β'']
    ⊢ E1 : σ [β → β']     ⊢ E2 : σ [β' → β'']     ⟹  ⊢ E1 comp E2 : boolean [β → β'']

30 Expression Well-Formedness (4/5)

    ⊢ E : bytecodes [β → β']   ⊢ P : [β' → β'']        ⟹  ⊢ E ~ P : boolean [β → β'']
    ⊢ E : boolean [β → β']                             ⟹  ⊢ ! E : boolean [β → β]
    ⊢ E1 : boolean [β → β']   ⊢ E2 : boolean [β' → β'']  ⟹  ⊢ E1 && E2 : boolean [β → β'']

31 Expression Well-Formedness (5/5)

    ⊢ E1 : boolean [β → β']   ⊢ E2 : boolean [β → β'']
    ∀x: if β'(x) = σ' and β''(x) = σ'' then σ' = σ''
    ⟹  ⊢ E1 || E2 : boolean [β → β' ∪ β'']

    ⊢ k : int [β → β]
    ⊢ cond : condition [β → β]

32 Peephole Well-Formedness (1/2)

    ⊢ Pi : [βi → βi+1]   ⟹  ⊢ P1 P2 ... Pk : [β1 → βk+1]

    ⊢ inst : [β → β]

    xi ≠ xj (for i ≠ j)   inst(σ1, ..., σk)
    ⟹  ⊢ inst(x1, ..., xk) : [β → β[xi ↦ σi]]

33 Peephole Well-Formedness (2/2)

    ⊢ * : [β → β]
    ⊢ x : : [β → β[x ↦ label]]
    ⊢ label (x) : [β → β[x ↦ label]]

34 Template Well-Formedness

    ⊢ Ti : [βi → βi+1]   ⟹  ⊢ T1 T2 ... Tk : [β1 → βk+1]

    ⊢ inst : [β → β]

    ⊢ Ei : σi [βi → βi+1]   inst(σ1, ..., σk)
    ⟹  ⊢ inst(E1, ..., Ek) : [β1 → βk+1]

    ⊢ E : label [β1 → β2]   ⟹  ⊢ E: : [β1 → β2]

35 Pattern Well-Formedness

    ⊢ E : boolean [[x ↦ bytecodes] → β]   ⊢ T : [β → β']
    ⟹  ⊢ pattern n x : E -> k T : [[] → β']

36 Pattern Examples (1/4)

    pattern dup_istore_pop x:
      x ~ dup istore (i0) pop
    -> 3
      istore (i0)

- This pattern is relevant for code like: x = a*b;

37 Pattern Examples (2/4)

    pattern goto_label x:
      x ~ goto (l1) label (l2)
      && l1 == l2
    -> 1

- This pattern arises during optimization of nested control structures

38 Pattern Examples (3/4)

    pattern constant_iadd_residue x:
      x ~ ldc_int (i0) iadd ldc_int (i1) iadd
    -> 4
      ldc_int (i0+i1) iadd

- This pattern is relevant for code like: a+5+7

39 Pattern Examples (4/4)

    pattern goto_goto x:
      x ~ goto (l0)
      && target l0 ~ goto (l1)
      && ! (target l1 ~ goto (l2))
      && ! (target l1 ~ label (l3))
    -> 1
      goto (l1)

- This pattern arises during optimization of nested control structures

40 Proving Termination
- We want to avoid infinite sequences like:

    B0 ⇒p1 B1 ⇒p2 B2 ⇒p3 B3 ⇒p4 ...

- Define an integer valued function Φ such that:
  - ∀B: Φ(B) ≥ 0
  - ∀p ∈ P: B1 ⇒p B2 implies Φ(B2) < Φ(B1)
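The measure idea can be sketched concretely: count the instructions that the example patterns remove, and check that a pattern application strictly decreases the count. This is a toy illustration of the proof obligation, not a general termination checker:

```java
import java.util.Arrays;
import java.util.List;

// Termination-measure sketch: Φ counts dup, goto, and iadd
// instructions. Each pattern application must strictly decrease Φ,
// so rewriting cannot continue forever.
public class Termination {
    static int phi(List<String> code) {
        int n = 0;
        for (String inst : code)
            if (inst.equals("dup") || inst.startsWith("goto") || inst.equals("iadd"))
                n++;
        return n;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("dup", "istore_3", "pop");
        List<String> after  = Arrays.asList("istore_3");   // dup_istore_pop clicked
        System.out.println(phi(after) < phi(before));      // true: 0 < 1
    }
}
```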

41 Termination Function Example
- For our 4 example patterns we define:

    Φ(B) = #dup + #goto + #iadd + ???

- What gets smaller in the goto_goto pattern?

42 Termination Function Example
- For our 4 example patterns we define:

    Φ(B) = #dup + #goto + #iadd + ???

- What gets smaller in the goto_goto pattern?
- For each goto (l1) ∈ B, follow the chain of targets l1 → l2 → ... → lk (with li ≠ lj); that goto contributes k to the ??? component, and goto_goto shortens its chain

43 A Non-Terminating Pattern

    pattern bad_goto_goto x:
      x ~ goto (l0)
      && target l0 ~ goto (l1)
    -> 1
      goto (l1)

- This pattern fails to terminate on code like:

    foo: goto bar
    bar: goto foo

44 Proving Soundness
- A formal proof of soundness for a collection of patterns requires a full formal semantics of:
  - bytecode sequences
  - peephole patterns
  - bytecode contexts
  - code emission
  - the complete JVM
- The pitfall is usually the universal quantification over contexts: does this really always work?

45 An Unsound Pattern

    pattern idiv_pop x:
      x ~ idiv pop
    -> 1
      pop

- This pattern may actually click
- And the resulting bytecode will always verify
- But the semantics is not preserved, since it may remove a java.lang.ArithmeticException
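The unsoundness is easy to demonstrate at the source level: idiv throws java.lang.ArithmeticException on division by zero, so deleting a discarded division changes observable behavior. A sketch (method names are illustrative):

```java
// Why idiv_pop is unsound: the "optimized" version discards the
// division, and with it the ArithmeticException that dividing
// by zero must throw.
public class Unsound {
    static int before(int a, int b) {
        int ignored = a / b;   // corresponds to idiv; throws if b == 0
        return 42;
    }

    static int after(int a, int b) {
        // division removed by the peephole pattern
        return 42;
    }

    public static void main(String[] args) {
        System.out.println(after(1, 0));   // 42: no exception
        try {
            before(1, 0);
            System.out.println("no exception");
        } catch (ArithmeticException e) {
            System.out.println("ArithmeticException");  // original semantics preserved
        }
    }
}
```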