1 Compilation 0368-3133 (Semester A, 2013/14) Lecture 8: Activation Records + Optimized IR. Noam Rinetzky. Slides credit: Roman Manevich, Mooly Sagiv and Eran Yahav

2 What is a Compiler? “A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language). The most common reason for wanting to transform source code is to create an executable program.” --Wikipedia 2

3 Conceptual Structure of a Compiler
Source text (txt) → Compiler → Executable code (exe)
Frontend: Lexical Analysis, Syntax Analysis (Parsing), Semantic Analysis
Semantic Representation: Intermediate Representation (IR)
Backend: Code Generation

4 From scanning to parsing
program text: ((23 + 7) * x)
Lexical Analyzer → token stream: LP LP Num OP Num RP OP Id RP
Parser (Grammar: E → ... | Id; Id → ‘a’ | ... | ‘z’) → either a syntax error or, for valid input, an Abstract Syntax Tree:
    Op(*)( Op(+)( Num(23), Num(7) ), Id(b) )

5 Context Analysis 5 Op(*) Id(b) Num(23)Num(7) Op(+) Abstract Syntax Tree E1 : int E2 : int E1 + E2 : int Type rules Semantic ErrorValid + Symbol Table

6 Code Generation 6 Op(*) Id(b) Num(23)Num(7) Op(+) Valid Abstract Syntax Tree Symbol Table … Verification (possible runtime) Errors/Warnings Intermediate Representation (IR) Executable Codeinputoutput

7 Code Generation: IR Translating from abstract syntax (AST) to intermediate representation (IR) –Three-Address Code … 7 Source code (program) Lexical Analysis Syntax Analysis Parsing ASTSymbol Table etc. Inter. Rep. (IR) Code Generation Source code (executable)

8 Intermediate representation
Diagram: C++, Java, Python → (Lowering) → IR (optimize) → (Code Gen.) → Pentium, Java bytecode, Sparc
A language that is between the source language and the target language – not specific to any machine
Goal 1: retargeting compiler components for different source languages/target machines
Goal 2: machine-independent optimizer
–Narrow interface: small number of node types (instructions)

9 Multiple IRs Some optimizations require high-level structure Others more appropriate on low-level code Solution: use multiple IR stages ASTLIR Pentium Java bytecode Sparc optimize HIR optimize 9

10 Compile Time vs Runtime Compile time: Data structures used during program compilation Runtime: Data structures used during program execution 10

11 Three-Address Code IR A popular form of IR High-level assembly where instructions have at most three operands 11 Chapter 8

12 Variable assignments var = constant ; var 1 = var 2 ; var 1 = var 2 op var 3 ; var 1 = constant op var 2 ; var 1 = var 2 op constant ; var = constant 1 op constant 2 ; Permitted operators are +, -, *, /, % 12

13 Representing 3AC
Quadruple (op, arg1, arg2, result)
Result of every instruction is written into a new temporary variable
Generates many variable names
Can move code fragments without complicated renaming
Alternative representations may be more compact
Code:
    t1 = - c
    t2 = b * t1
    t3 = - c
    t4 = b * t3
    t5 = t2 + t4
    a = t5
Quadruples:
         op       arg1   arg2   result
    (0)  uminus   c             t1
    (1)  *        b      t1     t2
    (2)  uminus   c             t3
    (3)  *        b      t3     t4
    (4)  +        t2     t4     t5
    (5)  =:       t5            a
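The quadruple form can be sketched in Python as a list of 4-tuples. This is only an illustration: the `Quad` type and the `render` helper are hypothetical names, not part of the lecture, and the operator spellings follow the table on the slide.

```python
from collections import namedtuple

# Hypothetical quadruple record: (op, arg1, arg2, result).
Quad = namedtuple("Quad", "op arg1 arg2 result")

quads = [
    Quad("uminus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("uminus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad("=:", "t5", None, "a"),
]

def render(q):
    """Print one quadruple as a three-address instruction."""
    if q.op == "uminus":
        return f"{q.result} = - {q.arg1}"
    if q.op == "=:":
        return f"{q.result} = {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"

for q in quads:
    print(render(q))
```

Because each instruction names its own result, a quadruple can be moved within the list without renumbering references to it.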

14 From AST to TAC 14

15 Generating IR code Option 1 Accumulate code in AST attributes Option 2 Emit IR code to a file during compilation –If for every production the code of the left- hand-side is constructed from a concatenation of the code of the RHS in some fixed order 15

16 TAC generation At this stage in compilation, we have –an AST –annotated with scope information –and annotated with type information To generate TAC for the program, we do recursive tree traversal –Generate TAC for any subexpressions or substatements –Using the result, generate TAC for the overall expression 16

17 TAC generation for expressions Define a function cgen(expr) that generates TAC that computes an expression, stores it in a temporary variable, then hands back the name of that temporary –Define cgen directly for atomic expressions (constants, this, identifiers, etc.) Define cgen recursively for compound expressions (binary operators, function calls, etc.) 17

18 Booleans
Boolean variables are represented as integers that have zero or nonzero values
In addition to the arithmetic operators, TAC supports <, ==, ||, and &&
How might you compile the following?
    b = (x <= y);
One possible translation:
    _t0 = x < y;
    _t1 = x == y;
    b = _t0 || _t1;

19 cgen example
cgen(5 + x) = {
    Choose a new temporary t
    Let t1 = cgen(5)
    Let t2 = cgen(x)
    Emit( t = t1 + t2 )
    Return t
}

20 Num val = 5 AddExpr leftright Ident name = x visit visit (left) visit (right) cgen(5 + x) t1 = 5; t2 = x; t = t1 + t2; t1 = 5t2 = x t = t1 + t2 cgen as recursive AST traversal 20
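The recursive traversal can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the AST classes `Num`, `Ident` and `BinExpr` and the `ExprGen` class are hypothetical names chosen to mirror the slide, and every sub-expression simply receives a fresh temporary.

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Num:
    val: int

@dataclass
class Ident:
    name: str

@dataclass
class BinExpr:
    op: str
    left: object
    right: object

class ExprGen:
    """Emits TAC for an expression and returns the temporary holding its value."""

    def __init__(self):
        self.code = []
        self._next = count()

    def new_temp(self):
        return f"_t{next(self._next)}"

    def cgen(self, e):
        if isinstance(e, Num):          # atomic case: constant
            t = self.new_temp()
            self.code.append(f"{t} = {e.val};")
            return t
        if isinstance(e, Ident):        # atomic case: identifier
            t = self.new_temp()
            self.code.append(f"{t} = {e.name};")
            return t
        # compound case: translate the children, then combine their results
        left = self.cgen(e.left)
        right = self.cgen(e.right)
        t = self.new_temp()
        self.code.append(f"{t} = {left} {e.op} {right};")
        return t

gen = ExprGen()
result = gen.cgen(BinExpr("+", Num(5), Ident("x")))
print("\n".join(gen.code))
```

The temporaries are numbered in the order they are created, so the names differ from the slide (`_t0`, `_t1`, `_t2` rather than `t1`, `t2`, `t`), but the shape of the emitted code is the same.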

21 Naïve cgen for expressions Maintain a counter for temporaries in c Initially: c = 0 cgen(e 1 op e 2 ) = { Let A = cgen(e 1 ) c = c + 1 Let B = cgen(e 2 ) c = c + 1 Emit( _tc = A op B; ) Return _tc } 21

22 Naive cgen for expressions Maintain a counter for temporaries in c Initially: c = 0 cgen(e 1 op e 2 ) = { Let A = cgen(e 1 ) c = c + 1 Let B = cgen(e 2 ) c = c + 1 Emit( _tc = A op B; ) Return _tc } Observation: temporaries in cgen(e 1 ) can be reused in cgen(e 2 ) 22

23 Improving cgen for expressions
Observation – naïve translation needlessly generates temporaries for leaf expressions
Observation – temporaries are used exactly once
–Once a temporary has been read it can be reused for another sub-expression
cgen(e1 op e2) = {
    Let _t1 = cgen(e1)
    Let _t2 = cgen(e2)
    Emit( _t = _t1 op _t2; )
    Return _t
}
Temporaries of cgen(e1) can be reused in cgen(e2)

24 Sethi-Ullman translation Algorithm by Ravi Sethi and Jeffrey D. Ullman to emit optimal TAC –Minimizes number of temporaries Main data structure in algorithm is a stack of temporaries –Stack corresponds to recursive invocations of _t = cgen(e) –All the temporaries on the stack are live Live = contain a value that is needed later on 24

25 Weighted register allocation
Can save registers by re-ordering subtree computations
Label each node with its weight
–Weight = number of registers needed
–Leaf weight known
–Internal node weight:
    if w(left) > w(right) then w = w(left)
    if w(right) > w(left) then w = w(right)
    if w(right) = w(left) then w = w(left) + 1
Choose heavier child as first to be translated
WARNING: have to check that no side-effects exist before attempting to apply this optimization (pre-pass on the tree)

26 Weighted reg. alloc. example b 5c * array access + a w=0 w=1 w=0 w=1 Phase 1: - check absence of side-effects in expression tree - assign weight to each AST node 26 _t0 = cgen( a+b[5*c] ) base index

27 Weighted reg. alloc. example
_t0 = cgen( a+b[5*c] )
Expression tree: + ( a, array access ( base b, index * ( 5, c ) ) ), annotated with weights w=0 and w=1
Phase 2: - use weights to decide on order of translation (heavier sub-tree first)
    _t0 = c
    _t1 = 5
    _t0 = _t1 * _t0
    _t1 = b
    _t0 = _t1[_t0]
    _t1 = a
    _t0 = _t1 + _t0
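The two phases can be sketched in Python as below. This is an illustration under assumptions, not the lecture's implementation: every leaf is given weight 1 (the slide's leaf weights are not fully recoverable from the transcript), the expression is assumed to be free of side effects, and `Leaf`, `Node` and `Translator` are hypothetical names. Weights are recomputed on demand for brevity; a real pass would cache them on the nodes.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    name: str

@dataclass
class Node:
    op: str          # "+", "*", or "[]" for array access
    left: object
    right: object

def weight(n):
    """Phase 1: number of temporaries needed to evaluate n."""
    if isinstance(n, Leaf):
        return 1     # assumption: a leaf needs one temporary
    wl, wr = weight(n.left), weight(n.right)
    return wl + 1 if wl == wr else max(wl, wr)

class Translator:
    """Phase 2: translate the heavier child first, recycling temporaries."""

    def __init__(self):
        self.code = []
        self.free = []      # temporaries whose value has already been consumed
        self.created = 0    # number of distinct temporaries ever created

    def alloc(self):
        if self.free:
            return self.free.pop()
        t = f"_t{self.created}"
        self.created += 1
        return t

    def cgen(self, n):
        if isinstance(n, Leaf):
            t = self.alloc()
            self.code.append(f"{t} = {n.name}")
            return t
        if weight(n.left) >= weight(n.right):
            tl = self.cgen(n.left)
            tr = self.cgen(n.right)
        else:
            tr = self.cgen(n.right)
            tl = self.cgen(n.left)
        rhs = f"{tl}[{tr}]" if n.op == "[]" else f"{tl} {n.op} {tr}"
        self.code.append(f"{tl} = {rhs}")
        self.free.append(tr)    # the right operand is dead after this point
        return tl

# a + b[5*c]
tree = Node("+", Leaf("a"), Node("[]", Leaf("b"), Node("*", Leaf("5"), Leaf("c"))))
gen = Translator()
result = gen.cgen(tree)
print("\n".join(gen.code))
```

With these assumptions the whole expression is translated using two temporaries, whereas a strict left-to-right translation that loads `a`, `b`, `5` and `c` in order would keep four values live at once.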

28 TAC Generation for Memory Access Instructions
Copy instruction: a = b
Load/store instructions: a = *b and *a = b
Address-of instruction: a = &b
Array accesses: a = b[i] and a[i] = b
Field accesses: a = b[f] and a[f] = b
Memory allocation instruction: a = alloc(size)
–Sometimes left out (e.g., malloc is a procedure in C)

29 Array operations
x := y[i]
    t1 := &y ;       t1 = address-of y
    t2 := t1 + i ;   t2 = address of y[i]
    x := *t2 ;       value stored at y[i]
x[i] := y
    t1 := &x ;       t1 = address-of x
    t2 := t1 + i ;   t2 = address of x[i]
    *t2 := y ;       store through pointer

30 TAC Generation for Control Flow Statements
Label introduction: _label_name:
–Indicates a point in the code that can be jumped to
Unconditional jump: go to instruction following label L
    Goto L;
Conditional jump: test condition variable t; if 0, jump to label L
    IfZ t Goto L;
Similarly: test condition variable t; if nonzero, jump to label L
    IfNZ t Goto L;

31 Control-flow example – conditions
Source:
    int x; int y; int z;
    if (x < y) z = x; else z = y;
    z = z * z;
TAC:
    _t0 = x < y;
    IfZ _t0 Goto _L0;
    z = x;
    Goto _L1;
    _L0:
    z = y;
    _L1:
    z = z * z;

32 cgen for if-then-else 32 cgen(if (e) s 1 else s 2 ) Let _t = cgen(e) Let L false be a new label Let L after be a new label Emit( IfZ _t Goto L false ; ) cgen(s 1 ) Emit( Goto L after ; ) Emit( L false : ) cgen(s 2 ) Emit( Goto L after ;) Emit( L after : )

33 Control-flow example – loops
Source:
    int x; int y;
    while (x < y) { x = x * 2; }
    y = x;
TAC:
    _L0:
    _t0 = x < y;
    IfZ _t0 Goto _L1;
    x = x * 2;
    Goto _L0;
    _L1:
    y = x;

34 cgen for while loops 34 cgen(while (expr) stmt) Let L before be a new label. Let L after be a new label. Emit( L before : ) Let t = cgen(expr) Emit( IfZ t Goto L after ; ) cgen(stmt) Emit( Goto L before ; ) Emit( L after : )
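The if-then-else and while templates can be sketched together in Python. This is a simplified illustration: statements are encoded as hypothetical tuples (`"assign"`, `"if"`, `"while"`, `"seq"`), conditions are a single binary comparison over atomic operands, and the right-hand side of an assignment is copied through as text instead of being translated recursively.

```python
from itertools import count

class StmtGen:
    def __init__(self):
        self.code = []
        self._labels = count()
        self._temps = count()

    def new_label(self):
        return f"_L{next(self._labels)}"

    def new_temp(self):
        return f"_t{next(self._temps)}"

    def emit(self, line):
        self.code.append(line)

    def cgen_expr(self, e):
        """e is (op, a, b) with atomic operands; returns the result temporary."""
        op, a, b = e
        t = self.new_temp()
        self.emit(f"{t} = {a} {op} {b};")
        return t

    def cgen(self, s):
        kind = s[0]
        if kind == "assign":                 # ("assign", target, rhs_text)
            self.emit(f"{s[1]} = {s[2]};")
        elif kind == "seq":                  # ("seq", [stmt, ...])
            for stmt in s[1]:
                self.cgen(stmt)
        elif kind == "if":                   # ("if", cond, then_stmt, else_stmt)
            t = self.cgen_expr(s[1])
            l_false, l_after = self.new_label(), self.new_label()
            self.emit(f"IfZ {t} Goto {l_false};")
            self.cgen(s[2])
            self.emit(f"Goto {l_after};")
            self.emit(f"{l_false}:")
            self.cgen(s[3])
            self.emit(f"{l_after}:")
        elif kind == "while":                # ("while", cond, body)
            l_before, l_after = self.new_label(), self.new_label()
            self.emit(f"{l_before}:")
            t = self.cgen_expr(s[1])         # the condition is re-evaluated on every iteration
            self.emit(f"IfZ {t} Goto {l_after};")
            self.cgen(s[2])
            self.emit(f"Goto {l_before};")
            self.emit(f"{l_after}:")
        else:
            raise ValueError(f"unknown statement kind: {kind}")

loop = ("seq", [
    ("while", ("<", "x", "y"), ("assign", "x", "x * 2")),
    ("assign", "y", "x"),
])
gen = StmtGen()
gen.cgen(loop)
print("\n".join(gen.code))
```

The sketch omits the second `Goto L_after` that the if-then-else template emits after the else branch, since control falls through to `L_after` at that point anyway.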

35 Optimization points 35 source code Front end IR Code generator target code User profile program change algorithm Compiler intraprocedural IR Interprocedural IR IR optimizations Compiler register allocation instruction selection peephole transformations today

36 Interprocedural IR Compile time generation of code for procedure invocations Activation Records (aka Frames) 36

37 Intro: Procedures / Functions Store local variables/temporaries in a stack A function call instruction pushes arguments to stack and jumps to the function label A statement x=f(a1,…,an); looks like Push a1; … Push an; Call f; Pop x; // copy returned value Returning a value is done by pushing it to the stack ( return x; ) Push x; Return control to caller (and roll up stack) Return; 37

38 Intro: Memory Layout (popular convention) 38 Global Variables Stack Heap High addresses Low addresses

39 Intro: A Logical Stack Frame 39 Param N Param N-1 … Param 1 _t0 … _tk x … y Parameters (actual arguments) Locals and temporaries Stack frame for function f(a1,…,aN)

40 Intro: Functions Example
Source:
    int SimpleFn(int z) { int x, y; x = x * y * z; return x; }
    void main() { int w; w = SimpleFn(137); }
TAC:
    _SimpleFn:
    _t0 = x * y;
    _t1 = _t0 * z;
    x = _t1;
    Push x;
    Return;
    main:
    _t0 = 137;
    Push _t0;
    Call _SimpleFn;
    Pop w;

41 Supporting Procedures New computing environment –at least temporary memory for local variables Passing information into the new environment –Parameters Transfer of control to/from procedure Handling return values 41

42 What Can We Do with Procedures? Declarations & Definitions Call & Return Jumping out of procedures Passing & Returning procedures as parameters 42

43 Design Decisions Scoping rules –Static scoping vs. dynamic scoping Caller/callee conventions –Parameters –Who saves register values? Allocating space for local variables 43

44 Static (lexical) Scoping
main ( ) {                               /* block B0 */
    int a = 0 ;
    int b = 0 ;
    {                                    /* block B1 */
        int b = 1 ;
        {                                /* block B2 */
            int a = 2 ;
            printf ("%d %d\n", a, b) ;
        }
        {                                /* block B3 */
            int b = 3 ;
            printf ("%d %d\n", a, b) ;
        }
        printf ("%d %d\n", a, b) ;
    }
    printf ("%d %d\n", a, b) ;
}
Declaration   Scopes
a=0           B0, B1, B3
b=0           B0
b=1           B1, B2
a=2           B2
b=3           B3
A name refers to its (closest) enclosing scope, known at compile time

45 Dynamic Scoping Each identifier is associated with a global stack of bindings When entering scope where identifier is declared –push declaration on identifier stack When exiting scope where identifier is declared –pop identifier stack Evaluating the identifier in any context binds to the current top of stack Determined at runtime 45

46 Example What value is returned from main? Static scoping? Dynamic scoping? 46 int x = 42; int f() { return x; } int g() { int x = 1; return f(); } int main() { return g(); }
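The difference between the two rules can be simulated in Python. This is only a sketch: `run_static` and `run_dynamic` are hypothetical names, and dynamic scoping is modelled explicitly with a per-identifier stack of bindings, as described on the previous slide.

```python
def run_static():
    """Static scoping: the x inside f is resolved where f is defined (the global x)."""
    global_env = {"x": 42}

    def f():
        return global_env["x"]

    def g():
        local_env = {"x": 1}   # g's own x; invisible to f under static scoping
        return f()

    return g()

def run_dynamic():
    """Dynamic scoping: each identifier has a global stack of bindings."""
    bindings = {"x": [42]}

    def f():
        return bindings["x"][-1]       # current top of the stack for x

    def g():
        bindings["x"].append(1)        # entering a scope that declares x
        try:
            return f()
        finally:
            bindings["x"].pop()        # leaving that scope

    return g()

print(run_static(), run_dynamic())
```

Under static scoping main returns 42; under dynamic scoping it returns 1, because the binding pushed by g is still on top of the stack when f reads x.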

47 Why do we care? We need to generate code to access variables Static scoping –Identifier binding is known at compile time –Address of the variable is known at compile time –Assigning addresses to variables is part of code generation –No runtime errors of “access to undefined variable” –Can check types of variables 47

48 Variable addresses for static scoping: first attempt 48 int x = 42; int f() { return x; } int g() { int x = 1; return f(); } int main() { return g(); } identifieraddress x (global)0x42 x (inside g)0x73

49 Variable addresses for static scoping: first attempt
    int a [11] ;
    void quicksort(int m, int n) {
        int i;
        if (n > m) {
            i = partition(m, n);
            quicksort (m, i-1) ;
            quicksort (i+1, n) ;
        }
    }
    main() { ... quicksort (1, 9) ; }
What is the address of the variable “i” in the procedure quicksort?

50 Compile-Time Information on Variables Name Type Scope –when is it recognized Duration –Until when does its value exist Size –How many bytes are required at runtime Address –Fixed –Relative –Dynamic 50

51 Activation Record (Stack Frames) separate space for each procedure invocation managed at runtime –code for managing it generated by the compiler desired properties –efficient allocation and deallocation procedures are called frequently –variable size different procedures may require different memory sizes 51

52 Memory Layout (popular convention) 52 Global Variables Stack Heap High addresses Low addresses

53 Memory Hierarchy 53 Global Variables Stack Heap High addresses Low addresses CPUMain Memory Register 00 Register 01 Register xx …

54 A Logical Stack Frame (Simplified) 54 Param N Param N-1 … Param 1 _t0 … _tk x … y Parameters (actual arguments) Locals and temporaries Stack frame for function f(a1,…,aN)

55 Activation Record (frame) 55 parameter k parameter 1 lexical pointer return information dynamic link registers & misc local variables temporaries next frame would be here … administrative part high addresses low addresses frame pointer stack pointer incoming parameters stack grows down

56 Runtime Stack Stack of activation records Call = push new activation record Return = pop activation record Only one “active” activation record – top of stack How do we handle recursion? 56

57 Runtime Stack SP – stack pointer – top of current frame FP – frame pointer – base of current frame –Sometimes called BP (base pointer) 57 Current frame …… Previous frame SP FP stack grows down

58 Pentium Runtime Stack
Pentium stack registers
    Register   Usage
    ESP        Stack pointer
    EBP        Base pointer
Pentium stack and call/ret instructions
    Instruction       Usage
    push, pusha, …    push on runtime stack
    pop, popa, …      pop from runtime stack
    call              transfer control to called routine
    ret               transfer control back to caller

59 Call Sequences The processor does not save the content of registers on procedure calls So who will? –Caller saves and restores registers –Callee saves and restores registers –But can also have both save/restore some registers 59

60 Call Sequences
Caller push code (caller, before the call):
    Push caller-save registers
    Push actual parameters (in reverse order)
call:
    push return address
    Jump to call address
Callee push code (prologue):
    Push current base-pointer
    bp = sp
    Push local variables
    Push callee-save registers
Callee pop code (epilogue):
    Pop callee-save registers
    Pop callee activation record
    Pop old base-pointer
return:
    pop return address
    Jump to address
Caller pop code (caller, after the return):
    Pop parameters
    Pop caller-save registers

61 Call Sequences – Foo(42,21)
Caller push code:
    push %ecx            Push caller-save registers
    push $21             Push actual parameters (in reverse order)
    push $42
call:
    call _foo            push return address; Jump to call address
Callee push code (prologue):
    push %ebp            Push current base-pointer
    mov %esp, %ebp       bp = sp
    sub $8, %esp         Push local variables
    push %ebx            Push callee-save registers
Callee pop code (epilogue):
    pop %ebx             Pop callee-save registers
    mov %ebp, %esp       Pop callee activation record
    pop %ebp             Pop old base-pointer
return:
    ret                  pop return address; Jump to address
Caller pop code:
    add $8, %esp         Pop parameters
    pop %ecx             Pop caller-save registers

62 “To Callee-save or to Caller-save?” Callee-saved registers need only be saved when callee modifies their value some heuristics and conventions are followed 62

63 Caller-Save and Callee-Save Registers
Callee-Save Registers
–Saved by the callee before modification
–Values are automatically preserved across calls
Caller-Save Registers
–Saved (if needed) by the caller before calls
–Values are not automatically preserved across calls
Usually the architecture defines caller-save and callee-save registers
–Separate compilation
–Interoperability between code produced by different compilers/languages
But compiler writers decide when to use caller/callee registers

64 Callee-Save Registers
Saved by the callee before modification
Usually at procedure prolog
Restored at procedure epilog
Hardware support may be available
Values are automatically preserved across calls
Source:
    int foo(int a){ int b=a+1; f1(); g1(b); return(b+2); }
Generated code:
    .global _foo
    Add_Constant -K, SP          // allocate space for foo
    Store_Local R5, -14(FP)      // save R5
    Load_Reg R5, R0; Add_Constant R5, 1
    JSR f1 ; JSR g1;
    Add_Constant R5, 2; Load_Reg R5, R0
    Load_Local -14(FP), R5       // restore R5
    Add_Constant K, SP; RTS      // deallocate

65 Caller-Save Registers
Saved by the caller before calls when needed
Values are not automatically preserved across calls
Source:
    void bar (int y) { int x=y+1; f2(x); g2(2); g2(8); }
Generated code:
    .global _bar
    Add_Constant -K, SP          // allocate space for bar
    Add_Constant R0, 1
    JSR f2
    Load_Constant 2, R0 ; JSR g2;
    Load_Constant 8, R0 ; JSR g2
    Add_Constant K, SP           // deallocate space for bar
    RTS

66 Parameter Passing 1960s – In memory No recursion is allowed 1970s – In stack 1980s – In registers – First k parameters are passed in registers (k=4 or k=6) –Where is time saved? 66 Most procedures are leaf procedures Interprocedural register allocation Many of the registers may be dead before another invocation Register windows are allocated in some architectures per call (e.g., sun Sparc)

67 L-Values of Local Variables
The offset in the stack is known at compile time
L-val(x) = FP+offset(x)
x = 5 is translated to:
    Load_Constant 5, R3
    Store R3, offset(x)(FP)

68 Code Blocks
Programming languages provide code blocks
    void foo() {
        int x = 8 ; y=9;          //1
        { int x = y * y ; }       //2
        { int x = y * 7 ; }       //3
        x = y + 1;
    }
Frame layout: administrative part, x1, y1, x2, x3, …

69 Accessing Stack Variables Use offset from FP (%ebp) Remember – stack grows downwards Above FP = parameters Below FP = locals Examples –%ebp + 4 = return address –%ebp + 8 = first parameter –%ebp – 4 = first local 69 …… SP FP Return address Param n … param1 Local 1 … Local n Previous fp Param n … param1 FP+8 FP-4
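The offsets listed above follow a simple pattern, sketched here in Python. The sketch assumes a 32-bit frame in which every parameter and local occupies one 4-byte word, with the saved frame pointer at 0(%ebp) and the return address at 4(%ebp); the helper names are hypothetical.

```python
WORD = 4  # assumption: 32-bit machine, every slot is one word

def param_offset(i):
    """Offset from %ebp of the i-th parameter (1-based); parameters sit above FP."""
    return 8 + WORD * (i - 1)

def local_offset(j):
    """Offset from %ebp of the j-th local (1-based); locals sit below FP."""
    return -WORD * j

def operand(offset):
    """AT&T-syntax memory operand for an %ebp-relative slot."""
    return f"{offset}(%ebp)"

print(operand(param_offset(1)))   # first parameter
print(operand(local_offset(1)))   # first local
```

Because the offsets depend only on the declaration order and the slot size, the compiler can compute them once per procedure, independently of how deep the call stack is at runtime.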

70 Factorial – fact(int n)
fact:
    pushl %ebp            # save ebp
    movl %esp,%ebp        # ebp=esp
    pushl %ebx            # save ebx
    movl 8(%ebp),%ebx     # ebx = n
    cmpl $1,%ebx          # n <= 1 ?
    jle .lresult          # then done
    leal -1(%ebx),%eax    # eax = n-1
    pushl %eax            #
    call fact             # fact(n-1)
    imull %ebx,%eax       # eax=retv*n
    jmp .lreturn          #
.lresult:
    movl $1,%eax          # retv
.lreturn:
    movl -4(%ebp),%ebx    # restore ebx
    movl %ebp,%esp        # restore esp
    popl %ebp             # restore ebp
    ret                   # return to caller
Stack at an intermediate point, from high to low addresses: n (EBP+8), return address, old %ebp (EBP), old %ebx (EBP-4, ESP)
(disclaimer: a real compiler can do better than that)

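71 Windows Exploit(s): Buffer Overflow
    void foo (char *x) { char buf[2]; strcpy(buf, x); }
    int main (int argc, char *argv[]) { foo(argv[1]); }
    ./a.out abracadabra
    Segmentation fault
Stack (as drawn on the slide, in the direction opposite to stack growth): Previous frame, Return address, Saved FP, char* x, buf[2], …
strcpy writes ab, ra, ca, da, br, … starting at buf[2] and continuing toward higher addresses, overwriting the saved FP and the return address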

72 Buffer overflow
    int check_authentication(char *password) {
        int auth_flag = 0;
        char password_buffer[16];
        strcpy(password_buffer, password);
        if(strcmp(password_buffer, "brillig") == 0) auth_flag = 1;
        if(strcmp(password_buffer, "outgrabe") == 0) auth_flag = 1;
        return auth_flag;
    }
    int main(int argc, char *argv[]) {
        if(argc < 2) { printf("Usage: %s \n", argv[0]); exit(0); }
        if(check_authentication(argv[1])) {
            printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
            printf(" Access Granted.\n");
            printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        } else {
            printf("\nAccess Denied.\n");
        }
    }
(source: “Hacking: The Art of Exploitation, 2nd Ed”)

74 Buffer overflow
Same program with the two local declarations swapped:
    int check_authentication(char *password) {
        char password_buffer[16];
        int auth_flag = 0;
        strcpy(password_buffer, password);
        if(strcmp(password_buffer, "brillig") == 0) auth_flag = 1;
        if(strcmp(password_buffer, "outgrabe") == 0) auth_flag = 1;
        return auth_flag;
    }
    int main(int argc, char *argv[]) {
        if(argc < 2) { printf("Usage: %s \n", argv[0]); exit(0); }
        if(check_authentication(argv[1])) {
            printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
            printf(" Access Granted.\n");
            printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        } else {
            printf("\nAccess Denied.\n");
        }
    }
(source: “Hacking: The Art of Exploitation, 2nd Ed”)

75 Buffer overflow 75 0x08048529 :movl $0x8048647,(%esp) 0x08048530 :call 0x8048394 0x08048535 :movl $0x8048664,(%esp) 0x0804853c :call 0x8048394 0x08048541 :movl $0x804867a,(%esp) 0x08048548 :call 0x8048394 0x0804854d :jmp 0x804855b 0x0804854f :movl $0x8048696,(%esp) 0x08048556 :call 0x8048394

76 Nested Procedures For example – Pascal Any routine can have sub-routines Any sub-routine can access anything that is defined in its containing scope or inside the sub-routine itself –“non-local” variables 76

77 Example: Nested Procedures
    program p;
      var x: Integer;
      procedure a
        var y: Integer;
        procedure b begin… b … end;
        function c
          var z: Integer;
          procedure d begin… d … end;
        begin… c …end;
      begin… a … end;
    begin… p … end.
possible call sequence: p → a → a → c → b → c → d
what is the address of variable “y” in procedure d?

78 Nested Procedures
can call a sibling, ancestor
when “c” uses (non-local) variables from “a”, which instance of “a” is it?
how do you find the right activation record at runtime?
possible call sequence: p → a → a → c → b → c → d

79 Nested Procedures
goal: find the closest routine in the stack from a given nesting level
if we reached the same routine in a sequence of calls
–if a routine of level k uses variables of the same nesting level, it uses its own variables
–if it uses variables of nesting level j < k then it must be the last routine called at level j
If a procedure is last at level j on the stack, then it must be an ancestor of the current routine
possible call sequence: p → a → a → c → b → c → d

80 Nested Procedures problem: a routine may need to access variables of another routine that contains it statically solution: lexical pointer (a.k.a. access link) in the activation record lexical pointer points to the last activation record of the nesting level above it –in our example, lexical pointer of d points to activation records of c lexical pointers created at runtime number of links to be traversed is known at compile time 80

81 Lexical Pointers
    program p;
      var x: Integer;
      procedure a
        var y: Integer;
        procedure b begin… b … end;
        function c
          var z: Integer;
          procedure d begin… d … end;
        begin… c …end;
      begin… a … end;
    begin… p … end.
possible call sequence: p → a → a → c → b → c → d
Stack of activation records: a (y), a (y), c (z), b, c (z), d

82 Procedure Q Calls R 82

83 Nesting Depth The semantic analysis identifies the static nesting hierarchy A possible implementation –Assign integers to functions and variables –Defined inductively The main is at level 0 Updated when new function begins/ends 83

84 Supporting Static Scoping
References to non-local variables
Language rules
–No nesting of functions: C, C++, Java
–Non-local references are bound to the most closely enclosing declared procedure and “die” when the procedure ends: Algol, Pascal, Scheme
Simplest implementation
–Pass the lexical pointer as an extra argument to functions
–Scope rules guarantee that this can be done
–Generate code to traverse the frames

85 Routine Descriptor for Languages with nested scopes 85 Lexical pointer routine address

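86 Calculating L-Values
    int i;                          /* nesting level 0 */
    void level_0(void){
      int j;                        /* nesting level 1 */
      void level_1(void) {
        int k;                      /* nesting level 2 */
        void level_2(void) {
          int l;                    /* nesting level 3 */
          k=l; j =l;
    }}}
L-value of k inside level_2: offset(k) + *( FP + offset(lexical_pointer) )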

87 Code for the k=l 87 int i; void level_0(void){ int j; void level_1(void) { int k; void level_2(void) { int l; k=l; j =l; }}}

88 Code for the j=l 88 int i; void level_0(void){ int j; void level_1(void) { int k; void level_2(void) { int l; k=l; j =l; }}}
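The frame traversal behind `k=l` and `j=l` can be sketched in Python. This is an illustrative model rather than generated code: `Frame` and `frame_at` are hypothetical names, frames are Python objects instead of stack memory, and levels here count function nesting depth (level_0 is 0, level_1 is 1, level_2 is 2).

```python
class Frame:
    """Activation record with a lexical pointer to the enclosing routine's frame."""

    def __init__(self, level, lexical_pointer, local_vars):
        self.level = level
        self.lexical_pointer = lexical_pointer
        self.local_vars = dict(local_vars)

def frame_at(frame, decl_level):
    """Follow (frame.level - decl_level) lexical pointers; the count is known at compile time."""
    hops = frame.level - decl_level
    for _ in range(hops):
        frame = frame.lexical_pointer
    return frame

# Frames for the call chain level_0 -> level_1 -> level_2.
f0 = Frame(0, None, {"j": 0})
f1 = Frame(1, f0, {"k": 0})
f2 = Frame(2, f1, {"l": 7})

# k = l : k is declared one level up, so one link is traversed.
frame_at(f2, 1).local_vars["k"] = f2.local_vars["l"]
# j = l : j is declared two levels up, so two links are traversed.
frame_at(f2, 0).local_vars["j"] = f2.local_vars["l"]

print(f1.local_vars["k"], f0.local_vars["j"])
```

The lexical pointers themselves are set up at runtime when each frame is created, but the number of hops for each variable reference depends only on the nesting levels, so it can be fixed in the generated code.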

89 Other Implementations of Static Scoping Display –An array of lexical pointers –d[i] is lexical pointer nesting level i –Can be stored in the stack lambda-lifting – Pass non-local variables as extra parameters 89

90 Machine Registers Every year –CPUs are improving by 50%-60% –Main memory speed is improving by 10% Machine registers allow efficient accesses –Utilized by the compiler Other memory units exist –Cache 90

91 RISC vs. CISC Machines
Feature               RISC                CISC
Registers             ≥ 32                6, 8, 16
Register Classes      One                 Some
Arithmetic Operands   Registers           Memory+Registers
Instructions          3-addr              2-addr
Addressing Modes      r, M[r+c] (l,s)     several
Instruction Length    32 bits             Variable
Side-effects          None                Some
Instruction-Cost      “Uniform”           Varied

92 Activation Records: Remarks 92

93 Non-Local goto in C syntax 93

94 Non-local gotos in C
setjmp remembers the current location and the stack frame
longjmp jumps to the location remembered by setjmp (popping many activation records)

95 Non-Local Transfer of Control in C 95

96 Stack Frames
Allocate a separate space for every procedure incarnation
Relative addresses
Provide a simple means to achieve modularity
Supports separate code generation of procedures
Naturally supports recursion
Efficient memory allocation policy
–Low overhead
–Hardware support may be available
LIFO policy
Not a pure stack
–Non-local references
–Updated using arithmetic

97 The Frame Pointer The caller –the calling routine The callee –the called routine caller responsibilities: –Calculate arguments and save in the stack –Store lexical pointer call instruction: M[--SP] := RA PC := callee callee responsibilities: –FP := SP –SP := SP - frame-size Why use both SP and FP? 97

98 Variable Length Frame Size
C allows allocating objects of unbounded size in the stack
    void p() {
        int i;
        char *p;
        scanf("%d", &i);
        p = (char *) alloca(i*sizeof(int));
    }
Some versions of Pascal allow conformant array value parameters

99 Limitations The compiler may be forced to store a value on a stack instead of registers The stack may not suffice to handle some language features 99

100 Frame-Resident Variables
A variable x cannot be stored in a register when:
–x is passed by reference
–The address of x is taken (&x)
–x is addressed via pointer arithmetic on the stack-frame (C varargs)
–x is accessed from a nested procedure
–The value is too big to fit into a single register
–The variable is an array
–The register of x is needed for other purposes
–There are too many local variables
An escape variable:
–Passed by reference
–Address is taken
–Addressed via pointer arithmetic on the stack-frame
–Accessed from a nested procedure

101 The Frames in Different Architectures
g(x, y, z) where x escapes
              Pentium           MIPS                Sparc
x             InFrame(8)        InFrame(0)          InFrame(68)
y             InFrame(12)       InReg(X157)         InReg(X157)
z             InFrame(16)       InReg(X158)         InReg(X158)
View Change   M[sp+0] ← fp      M[sp+K+0] ← r2      save %sp, -K, %sp
              fp ← sp           X157 ← r4           M[fp+68] ← i0
              sp ← sp-K         X158 ← r5           X157 ← i1
                                                    X158 ← i2

102 The Need for Register Copies 102 void m(int x, int y) { h(y, y); h(x, x); }

103 Limitations of Stack Frames A local variable of P cannot be stored in the activation record of P if its duration exceeds the duration of P Example 1: Static variables in C (own variables in Algol) void p(int x) { static int y = 6 ; y += x; } Example 2: Features of the C language int * f() { int x ; return &x ; } Example 3: Dynamic allocation int * f() { return (int *) malloc(sizeof(int)); } 103

104 Compiler Implementation Hide machine dependent parts Hide language dependent part Use special modules 104

105 Basic Compiler Phases 105 Source program (string).EXE lexical analysis syntax analysis semantic analysis Code generation Assembler/Linker Tokens Abstract syntax tree Assembly Frame manager Control Flow Graph

106 Hidden in the frame ADT Word size The location of the formals Frame resident variables Machine instructions to implement “shift- of-view” (prologue/epilogue) The number of locals “allocated” so far The label in which the machine code starts 106

107 Invocations to Frame “Allocate” a new frame “Allocate” new local variable Return the L-value of local variable Generate code for procedure invocation Generate prologue/epilogue Generate code for procedure return 107

108 Activation Records: Summary compile time memory management for procedure data works well for data with well-scoped lifetime –deallocation when procedure returns 108

109 IR Optimization Making code better 109

110 IR Optimization Making code “better” 110

111 “Optimized” evaluation b 5c * array access + a base index w=0 w=1 w=0 w=1 111 _t0 = cgen( a+b[5*c] ) Phase 2: - use weights to decide on order of translation _t0 Heavier sub-tree _t0 = _t1 * _t0 _t0 = _t1[_t0] _t0 = _t1 + _t0 _t0_t1 _t0 = c _t1 = 5 _t1 = b _t1 = a

112 But what about… a := 1 + 2; y := a + b; x := a + b + 8; z := b + a; a := a + 1; w:= a + b; 112

113 Overview of IR optimization
Formalisms and Terminology
–Control-flow graphs
–Basic blocks
Local optimizations
–Speeding up small pieces of a procedure
Global optimizations
–Speeding up a procedure as a whole
The dataflow framework
–Defining and implementing a wide class of optimizations

114 Program Analysis In order to optimize a program, the compiler has to be able to reason about the properties of that program An analysis is called sound if it never asserts an incorrect fact about a program All the analyses we will discuss in this class are sound –(Why?) 114

115 Soundness 115 int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, x holds some integer value”

116 Soundness 116 int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, x is either 137 or 42”

117 Soundness 117 int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, x is 137”

118 Soundness 118 int x; int y; if (y < 5) x = 137; else x = 42; Print(x); “At this point in the program, x is either 137, 42, or 271”
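The four claims on the soundness slides can be checked mechanically with a small Python sketch. It is only an illustration: each claim is modelled as a predicate over values of x, and a claim counts as sound if it holds for every value x can actually take at the Print statement. The function and variable names are hypothetical.

```python
def reachable_values():
    """Values x can hold at Print(x): run the fragment once per branch."""
    values = set()
    for y in (0, 10):                  # y < 5 and y >= 5
        x = 137 if y < 5 else 42
        values.add(x)
    return values

claims = {
    "x holds some integer value": lambda v: isinstance(v, int),
    "x is either 137 or 42": lambda v: v in (137, 42),
    "x is 137": lambda v: v == 137,
    "x is either 137, 42, or 271": lambda v: v in (137, 42, 271),
}

def is_sound(claim):
    """A claim is sound if it never asserts something false about a reachable value."""
    return all(claim(v) for v in reachable_values())

verdicts = {text: is_sound(pred) for text, pred in claims.items()}
for text, ok in verdicts.items():
    print(("sound:   " if ok else "UNSOUND: ") + text)
```

Only "x is 137" is unsound, because the else branch can make x equal to 42. The claim that also allows 271 is sound but less precise than "either 137 or 42".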

119 Semantics-preserving optimizations
An optimization is semantics-preserving if it does not alter the semantics of the original program
Examples:
–Eliminating unnecessary temporary variables
–Computing values that are known statically at compile-time instead of runtime
–Evaluating constant expressions outside of a loop instead of inside
Non-examples:
–Replacing bubble sort with quicksort (why?)
The optimizations we will consider in this class are all semantics-preserving

120 A formalism for IR optimization Every phase of the compiler uses some new abstraction: –Scanning uses regular expressions –Parsing uses CFGs –Semantic analysis uses proof systems and symbol tables –IR generation uses ASTs In optimization, we need a formalism that captures the structure of a program in a way amenable to optimization 120

121 Visualizing IR 121 main: _tmp0 = Call _ReadInteger; a = _tmp0; _tmp1 = Call _ReadInteger; b = _tmp1; _L0: _tmp2 = 0; _tmp3 = b == _tmp2; _tmp4 = 0; _tmp5 = _tmp3 == _tmp4; IfZ _tmp5 Goto _L1; c = a; a = b; _tmp6 = c % a; b = _tmp6; Goto _L0; _L1: Push a; Call _PrintInt;

123 Visualizing IR
The same code drawn as a graph of blocks between start and end:
start
Block 1 (main): _tmp0 = Call _ReadInteger; a = _tmp0; _tmp1 = Call _ReadInteger; b = _tmp1;
Block 2 (_L0): _tmp2 = 0; _tmp3 = b == _tmp2; _tmp4 = 0; _tmp5 = _tmp3 == _tmp4; IfZ _tmp5 Goto _L1;
Block 3: c = a; a = b; _tmp6 = c % a; b = _tmp6; Goto _L0;
Block 4 (_L1): Push a; Call _PrintInt;
end

124 Basic blocks A basic block is a sequence of IR instructions where –There is exactly one spot where control enters the sequence, which must be at the start of the sequence –There is exactly one spot where control leaves the sequence, which must be at the end of the sequence Informally, a sequence of instructions that always execute as a group 124
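The definition above can be turned into a small partitioning routine. The following is a minimal sketch, not the course's implementation: it assumes the IR is a list of strings, that every label is a possible jump target, and that any instruction containing "Goto" ends a block. The function name split_basic_blocks is hypothetical, and the START label from the exercise is left out for brevity.

```python
def split_basic_blocks(instrs):
    """Partition a list of TAC strings into basic blocks using 'leaders'."""
    leaders = {0} if instrs else set()
    for i, ins in enumerate(instrs):
        if ins.endswith(":"):                      # a label may be a jump target
            leaders.add(i)
        if "Goto" in ins and i + 1 < len(instrs):  # control may leave here
            leaders.add(i + 1)
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return [instrs[s:e] for s, e in zip(starts, ends)]


code = [
    "_t0 = 137", "y = _t0", "IfZ x Goto _L0",
    "_t1 = y", "z = _t1", "Goto END",
    "_L0:", "_t2 = y", "x = _t2",
    "END:",
]
blocks = split_basic_blocks(code)
for block in blocks:
    print(block)
```

Each printed list is one basic block: control enters only at its first instruction and leaves only after its last.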

125 Control-Flow Graphs A control-flow graph (CFG) is a graph of the basic blocks in a function The term CFG is overloaded – from here on out, we'll mean “control-flow graph” and not “context free grammar” Each edge from one basic block to another indicates that control can flow from the end of the first block to the start of the second block There is a dedicated node for the start and end of a function 125
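As a rough illustration of how the edges of a control-flow graph could be derived, the sketch below takes basic blocks that are already given as lists of TAC strings and computes the successors of each block. The name build_cfg and the string-based instruction format are assumptions made for this example, and the dedicated start and end nodes are omitted.

```python
def build_cfg(blocks):
    """Return a dict mapping each block index to its successor block indices."""
    label_to_block = {
        block[0][:-1]: i for i, block in enumerate(blocks) if block[0].endswith(":")
    }
    successors = {}
    for i, block in enumerate(blocks):
        last = block[-1]
        targets = []
        if "Goto" in last:                       # explicit jump target
            label = last.split("Goto")[1].strip()
            targets.append(label_to_block[label])
        # Fall through unless the block ends in an unconditional jump.
        if not last.startswith("Goto") and i + 1 < len(blocks):
            targets.append(i + 1)
        successors[i] = targets
    return successors


blocks = [
    ["_t0 = 137", "y = _t0", "IfZ x Goto _L0"],
    ["_t1 = y", "z = _t1", "Goto END"],
    ["_L0:", "_t2 = y", "x = _t2"],
    ["END:"],
]
cfg = build_cfg(blocks)
print(cfg)
```

The conditional jump in block 0 yields two outgoing edges (the jump target and the fall-through block), while the unconditional Goto in block 1 yields exactly one.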

126 Types of optimizations An optimization is local if it works on just a single basic block An optimization is global if it works on an entire control-flow graph An optimization is interprocedural if it works across the control-flow graphs of multiple functions –We won't talk about this in this course 126

127 Basic blocks exercise 127 int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; } START: _t0 = 137; y = _t0; IfZ x Goto _L0; _t1 = y; z = _t1; Goto END; _L0: _t2 = y; x = _t2; END: Divide the code into basic blocks

128 Control-flow graph exercise 128 START: _t0 = 137; y = _t0; IfZ x Goto _L0; _t1 = y; z = _t1; Goto END; _L0: _t2 = y; x = _t2; END: Draw the control-flow graph int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

129 Control-flow graph exercise 129 _t0 = 137; y = _t0; IfZ x Goto _L0; start _t1 = y; z = _t1; _t2 = y; x = _t2; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

130 Local optimizations 130 _t0 = 137; y = _t0; IfZ x Goto _L0; start _t1 = y; z = _t1; _t2 = y; x = _t2; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

131 Local optimizations 131 _t0 = 137; y = _t0; IfZ x Goto _L0; start _t1 = y; z = _t1; _t2 = y; x = _t2; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

132 Local optimizations 132 y = 137; IfZ x Goto _L0; start _t1 = y; z = _t1; _t2 = y; x = _t2; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

133 Local optimizations 133 y = 137; IfZ x Goto _L0; start _t1 = y; z = _t1; _t2 = y; x = _t2; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

134 Local optimizations 134 y = 137; IfZ x Goto _L0; start z = y; _t2 = y; x = _t2; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

135 Local optimizations 135 y = 137; IfZ x Goto _L0; start z = y; _t2 = y; x = _t2; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

136 Local optimizations 136 y = 137; IfZ x Goto _L0; start z = y; x = y; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

137 Global optimizations 137 y = 137; IfZ x Goto _L0; start z = y; x = y; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

138 Global optimizations 138 y = 137; IfZ x Goto _L0; start z = y; x = y; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

139 Global optimizations 139 y = 137; IfZ x Goto _L0; start z = 137; x = 137; End int main() { int x; int y; int z; y = 137; if (x == 0) z = y; else x = y; }

140 Local Optimizations 140

141 Optimization path 141 IR -> CFG builder -> Control-Flow Graph -> Program Analysis -> Annotated CFG -> Optimizing Transformation -> IR (repeat IR optimizations); when done with IR optimizations: Code Generation (+optimizations) -> Target Code

142 Common subexpression elimination 142 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = 4; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = a + b; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

143 Common subexpression elimination 143 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = 4; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = a + b; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

144 Common subexpression elimination 144 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = 4; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = _tmp4; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

145 Common subexpression elimination 145 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = 4; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = _tmp4; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

146 Common subexpression elimination 146 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = _tmp4; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

147 Common subexpression elimination 147 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = _tmp4; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

148 Common subexpression elimination 148 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = c; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

149 Common Subexpression Elimination If we have two variable assignments v1 = a op b … v2 = a op b and the values of v1, a, and b have not changed between the assignments, rewrite the code as v1 = a op b … v2 = v1 Eliminates useless recalculation Paves the way for later optimizations 149
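The rule above can be sketched for a single basic block as follows. This is an illustrative sketch under stated assumptions, not the course's implementation: instructions are modeled as tuples (dest, op, arg1, arg2), a plain copy is written with the hypothetical op name "copy", and operand order is compared literally, so a + b and b + a are not recognized as the same expression.

```python
def local_cse(block):
    """Local common subexpression elimination over (dest, op, arg1, arg2) tuples."""
    available = {}   # (op, arg1, arg2) -> variable currently holding that value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if op != "copy" and key in available:
            op, a, b = "copy", available[key], None   # v2 = v1
        out.append((dest, op, a, b))
        # dest was reassigned: forget expressions that read it or are stored in it.
        available = {k: v for k, v in available.items()
                     if v != dest and dest not in (k[1], k[2])}
        if op != "copy" and dest not in (a, b):
            available.setdefault(key, dest)
    return out


block = [
    ("b", "*", "a", "a"),
    ("c", "*", "a", "a"),
    ("d", "+", "b", "c"),
    ("e", "+", "b", "b"),
]
result = local_cse(block)
for instr in result:
    print(instr)
```

The second computation of a * a becomes a copy of b; the other instructions are left alone because their operands differ.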

150 Copy Propagation 150 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = c; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

151 Copy Propagation 151 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = c; _tmp6 = *(x); _tmp7 = *(_tmp6); Push _tmp5; Push x; Call _tmp7;

152 Copy Propagation 152 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = c; _tmp6 = *(_tmp1); _tmp7 = *(_tmp6); Push _tmp5; Push _tmp1; Call _tmp7;

153 Copy Propagation 153 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = a + b; c = _tmp4; _tmp5 = c; _tmp6 = *(_tmp1); _tmp7 = *(_tmp6); Push _tmp5; Push _tmp1; Call _tmp7;

154 Copy Propagation 154 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = *(_tmp1); _tmp7 = *(_tmp6); Push _tmp5; Push _tmp1; Call _tmp7;

155 Copy Propagation 155 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = *(_tmp1); _tmp7 = *(_tmp6); Push _tmp5; Push _tmp1; Call _tmp7;

156 Copy Propagation 156 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = *(_tmp1); _tmp7 = *(_tmp6); Push c; Push _tmp1; Call _tmp7;

157 Copy Propagation 157 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = *(_tmp1); _tmp7 = *(_tmp6); Push c; Push _tmp1; Call _tmp7;

158 Copy Propagation 158 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = _tmp2; _tmp7 = *(_tmp6); Push c; Push _tmp1; Call _tmp7;

159 Copy Propagation Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = _tmp2; _tmp7 = *(_tmp6); Push c; Push _tmp1; Call _tmp7; 159

160 Copy Propagation Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = _tmp2; _tmp7 = *(_tmp2); Push c; Push _tmp1; Call _tmp7; 160

161 Copy Propagation 161 Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp3; _tmp4 = _tmp3 + b; c = _tmp4; _tmp5 = c; _tmp6 = _tmp2; _tmp7 = *(_tmp2); Push c; Push _tmp1; Call _tmp7;

162 Copy Propagation Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp0; _tmp4 = _tmp0 + b; c = _tmp4; _tmp5 = c; _tmp6 = _tmp2; _tmp7 = *(_tmp2); Push c; Push _tmp1; Call _tmp7; 162

163 Copy Propagation If we have a variable assignment v1 = v2 then as long as v1 and v2 are not reassigned, we can rewrite expressions of the form a = … v1 … as a = … v2 … provided that such a rewrite is legal 163
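A minimal sketch of this rule for one basic block is given below. It assumes the same hypothetical tuple form (dest, op, arg1, arg2) with "copy" marking an assignment v1 = v2; the function name local_copy_propagation is an illustration, not something defined in the lecture.

```python
def local_copy_propagation(block):
    """Replace uses of v1 by v2 while the fact 'v1 = v2' still holds."""
    copies = {}   # v1 -> v2
    out = []
    for dest, op, a, b in block:
        a = copies.get(a, a)
        b = copies.get(b, b)
        out.append((dest, op, a, b))
        # dest was reassigned: drop every fact that mentions it on either side.
        copies = {k: v for k, v in copies.items() if dest not in (k, v)}
        if op == "copy" and a != dest:
            copies[dest] = a
    return out


block = [
    ("b", "*", "a", "a"),
    ("c", "copy", "b", None),
    ("d", "+", "b", "c"),
    ("e", "+", "b", "b"),
]
result = local_copy_propagation(block)
for instr in result:
    print(instr)
```

After c = b, the use of c in d = b + c is rewritten to b. If b were reassigned in between, the fact would be discarded and the use of c would be left untouched.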

164 Dead Code Elimination Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp0; _tmp4 = _tmp0 + b; c = _tmp4; _tmp5 = c; _tmp6 = _tmp2; _tmp7 = *(_tmp2); Push c; Push _tmp1; Call _tmp7; 164

165 Dead Code Elimination Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; x = _tmp1; _tmp3 = _tmp0; a = _tmp0; _tmp4 = _tmp0 + b; c = _tmp4; _tmp5 = c; _tmp6 = _tmp2; _tmp7 = *(_tmp2); Push c; Push _tmp1; Call _tmp7; 165 values never read

166 Dead Code Elimination Object x; int a; int b; int c; x = new Object; a = 4; c = a + b; x.fn(a + b); _tmp0 = 4; Push _tmp0; _tmp1 = Call _Alloc; Pop tmp2; *(_tmp1) = _tmp2; _tmp4 = _tmp0 + b; c = _tmp4; _tmp7 = *(_tmp2); Push c; Push _tmp1; Call _tmp7; 166

167 Dead Code Elimination An assignment to a variable v is called dead if the value of that assignment is never read anywhere Dead code elimination removes dead assignments from IR Determining whether an assignment is dead depends on what variable is being assigned to and when it's being assigned 167
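One possible way to find dead assignments inside a single basic block is a backward scan that tracks which variables are still needed. The sketch below assumes that every instruction is a side-effect-free assignment in the hypothetical tuple form (dest, op, arg1, arg2), and that the set of variables read after the block (live_out) is supplied by the caller; in a real compiler that set would come from a program analysis.

```python
def local_dce(block, live_out):
    """Drop assignments whose value is never read afterwards."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(block):
        if dest not in live:
            continue                      # dead: nobody reads this value
        live.discard(dest)                # this definition satisfies later reads
        live.update(x for x in (a, b) if x is not None)
        kept.append((dest, op, a, b))
    kept.reverse()
    return kept


block = [
    ("b", "*", "a", "a"),
    ("c", "copy", "b", None),
    ("d", "+", "b", "b"),
    ("e", "copy", "d", None),
]
result = local_dce(block, live_out={"d", "e"})
for instr in result:
    print(instr)
```

Here c = b is removed because c is neither read later in the block nor listed in live_out. Instructions with side effects (such as Push or Call in the slides) would have to be kept regardless and are outside this sketch.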

168 Applying local optimizations The different optimizations we've seen so far all take care of just a small piece of the optimization Common subexpression elimination eliminates unnecessary statements Copy propagation helps identify dead code Dead code elimination removes statements that are no longer needed To get maximum effect, we may have to apply these optimizations numerous times 168

169 Applying local optimizations example 169 b = a * a; c = a * a; d = b + c; e = b + b;

170 Applying local optimizations example 170 b = a * a; c = a * a; d = b + c; e = b + b; Which optimization should we apply here?

171 Applying local optimizations example 171 b = a * a; c = b; d = b + c; e = b + b; Common sub-expression elimination Which optimization should we apply here?

172 Applying local optimizations example 172 b = a * a; c = b; d = b + c; e = b + b; Which optimization should we apply here?

173 Applying local optimizations example 173 b = a * a; c = b; d = b + b; e = b + b; Which optimization should we apply here? Copy propagation

174 Applying local optimizations example 174 b = a * a; c = b; d = b + b; e = b + b; Which optimization should we apply here?

175 Applying local optimizations example 175 b = a * a; c = b; d = b + b; e = d; Which optimization should we apply here? Common sub-expression elimination (again)
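The walkthrough above alternates common subexpression elimination and copy propagation until nothing changes. A small driver that repeats both passes can reproduce it; the sketch below is an assumption-laden illustration using hypothetical tuple instructions (dest, op, arg1, arg2) with "copy" for v1 = v2, and a bounded number of rounds as a safety net.

```python
def cse(block):
    available, out = {}, []
    for dest, op, a, b in block:
        key = (op, a, b)
        if op != "copy" and key in available:
            op, a, b = "copy", available[key], None
        out.append((dest, op, a, b))
        available = {k: v for k, v in available.items()
                     if v != dest and dest not in (k[1], k[2])}
        if op != "copy" and dest not in (a, b):
            available.setdefault(key, dest)
    return out


def copy_prop(block):
    copies, out = {}, []
    for dest, op, a, b in block:
        a, b = copies.get(a, a), copies.get(b, b)
        out.append((dest, op, a, b))
        copies = {k: v for k, v in copies.items() if dest not in (k, v)}
        if op == "copy" and a != dest:
            copies[dest] = a
    return out


def optimize(block, max_rounds=10):
    """Repeat both passes until the block stops changing (or the bound is hit)."""
    for _ in range(max_rounds):
        new_block = copy_prop(cse(block))
        if new_block == block:
            break
        block = new_block
    return block


block = [
    ("b", "*", "a", "a"),
    ("c", "*", "a", "a"),
    ("d", "+", "b", "c"),
    ("e", "+", "b", "b"),
]
final = optimize(block)
for instr in final:
    print(instr)
```

The first round rewrites c and d; only in the second round does e = b + b become a copy of d, which is why one application of each optimization is not enough.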

176 Other types of local optimizations Arithmetic Simplification –Replace “hard” operations with easier ones –e.g. rewrite x = 4 * a; as x = a << 2; Constant Folding –Evaluate expressions at compile-time if they have a constant value. –e.g. rewrite x = 4 * 5; as x = 20; 176
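Both rewrites on this slide can be sketched as a per-instruction simplifier. The example below is only an illustration: it assumes integer operands, represents constants as Python ints and variables as strings in hypothetical (dest, op, arg1, arg2) tuples, and uses the made-up op name "const" for a folded result.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def simplify(instr):
    dest, op, a, b = instr
    # Constant folding: both operands are known at compile time.
    if op in FOLDABLE and isinstance(a, int) and isinstance(b, int):
        return (dest, "const", FOLDABLE[op](a, b), None)
    # Arithmetic simplification: multiplication by a power of two becomes a shift.
    if op == "*":
        if isinstance(a, int):
            a, b = b, a                      # move the constant to the right
        if isinstance(b, int) and b > 0 and (b & (b - 1)) == 0:
            return (dest, "<<", a, b.bit_length() - 1)
    return (dest, op, a, b)


print(simplify(("x", "*", 4, "a")))
print(simplify(("x", "*", 4, 5)))
print(simplify(("x", "*", "a", 3)))
```

The first call yields the shift form of x = 4 * a, the second folds 4 * 5 into 20, and the third is returned unchanged because 3 is not a power of two.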

177 Optimizations and analyses Most optimizations are only possible given some analysis of the program's behavior In order to implement an optimization, we will talk about the corresponding program analyses 177

178 The End 178

