Compilation 0368-3133 (Semester A, 2013/14) Lecture 9: Activation Records + Register Allocation. Noam Rinetzky. Slides credit: Roman Manevich, Mooly Sagiv and Eran Yahav.

What is a Compiler?
“A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language). The most common reason for wanting to transform source code is to create an executable program.” (Wikipedia)

Conceptual Structure of a Compiler
[Figure: Source text (txt) → Frontend (Lexical Analysis, Syntax Analysis/Parsing, Semantic Analysis) → Semantic Representation / Intermediate Representation (IR) → Backend (Code Generation) → Executable code (exe)]

From scanning to parsing
Program text: ((23 + 7) * x)
Lexical Analyzer → token stream: LP LP Num OP Num RP OP Id RP
Grammar: E → ... | Id; Id → ‘a’ | ... | ‘z’
Parser → syntax error, or valid Abstract Syntax Tree with nodes Op(*), Op(+), Num(23), Num(7), Id(b)

Context Analysis
Input: Abstract Syntax Tree with nodes Op(*), Op(+), Num(23), Num(7), Id(b)
Type rules: if E1 : int and E2 : int, then E1 + E2 : int
Output: Semantic Error, or Valid AST + Symbol Table

Code Generation
[Figure: input is the Valid Abstract Syntax Tree + Symbol Table → cgen (with Frame Manager) → Intermediate Representation (IR) → Executable Code as output; Verification reports (possible runtime) Errors/Warnings]

Code Generation: IR
Translating from abstract syntax (AST) to intermediate representation (IR)
–Three-Address Code
Primitive statements, control flow, procedure calls
[Pipeline: Source code (program) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Target code (executable)]

Intermediate representation
A language that is between the source language and the target language – not specific to any machine
Goal 1: retargeting compiler components for different source languages/target machines
Goal 2: machine-independent optimizer
–Narrow interface: small number of node types (instructions)
[Figure: source languages (C++, Python, Java) → Lowering → IR (optimize) → Code Gen. → targets (Pentium, Java bytecode, Sparc)]

Three-Address Code IR
A popular form of IR
High-level assembly where instructions have at most three operands
Chapter 8

Abstract Register Machine
[Figure: CPU with general-purpose (data) registers (Register 00, Register 01, …, Register xx) and control registers (Register PC, …); main memory, drawn between low and high addresses, with Code, Global Variables, Heap and Stack regions]

Abstract Activation Record Stack
[Figure: stack of activation records …, Proc k, Proc k+1, Proc k+2, …; the highlighted record is the stack frame for procedure Proc k+1(a1,…,aN)]

Abstract Stack Frame
Stack frame for procedure Proc k+1(a1,…,aN), between the frames of Proc k and Proc k+2:
–Parameters (actual arguments): Param N, Param N-1, …, Param 1
–Locals and temporaries: _t0, …, _tk, x, …, y

“Abstract” Code
Memory load/store
–Load: Memory → Register
–Store: Register → Memory
Operation: Between registers*
–R1 = R2 + R3
Fixed number of Registers

TAC generation for expressions
cgen(atomic_expr) directly generates TAC for atomic expressions
–Constants, identifiers, …
cgen(compound_expr) recursively generates TAC for compound expressions
–Binary operators, procedure calls, …
–Use temporary variables (registers) to store values of intermediate expressions

cgen example
cgen(5 + x) = {
    Choose a new temporary t
    Let t1 = cgen(5)
    Let t2 = cgen(x)
    Emit( t = t1 + t2 )
    Return t
}

Naïve cgen for expressions
Maintain a counter for temporaries in c
Initially: c = 0
cgen(e1 op e2) = {
    Let A = cgen(e1)
    c = c + 1
    Let B = cgen(e2)
    c = c + 1
    Emit( _tc = A op B; )
    Return _tc
}
Why Naïve?

TAC Generation for Control Flow Statements
Label introduction: _label_name:
–Indicates a point in the code that can be jumped to
Unconditional jump: go to instruction following label L
–Goto L;
Conditional jump: test condition variable t; if 0, jump to label L
–IfZ t Goto L;
Similarly: test condition variable t; if nonzero, jump to label L
–IfNZ t Goto L;

Control-flow example – conditions
Source:
    int x; int y; int z;
    if (x < y) z = x; else z = y;
    z = z * z;
TAC:
    _t0 = x < y;
    IfZ _t0 Goto _L0;
    z = x;
    Goto _L1;
    _L0: z = y;
    _L1: z = z * z;

cgen for if-then-else
cgen(if (e) s1 else s2)
    Let _t = cgen(e)
    Let L_false be a new label
    Let L_after be a new label
    Emit( IfZ _t Goto L_false; )
    cgen(s1)
    Emit( Goto L_after; )
    Emit( L_false: )
    cgen(s2)
    Emit( Goto L_after; )
    Emit( L_after: )
Is “new label” Naïve?

Interprocedural IR: Using a Stack
Stack of activation records
–One activation record per procedure invocation
Call → push new activation record
Return → pop activation record
Only one “active” activation record – top of stack

Mostly-Abstract Handling Procedures
Store local parameters/variables/temporaries in the stack
Procedure call pushes arguments to the stack and jumps to the function label
–x = f(a1,…,an) → Push a1; … Push an; Push return_address; Jump to first address of f; Pop x; // copy returned value
Procedure return cleans the stack, stores the return value, and returns control to the caller
–return x → Pop temp1,…,tempk, var1,…,varm, param1,…,paramn; Push x; Jump to return_address

Semi-Abstract Register Machine
[Figure: CPU with general-purpose (data) registers (Register 00, Register 01, …, Register xx), control registers (Register PC, …) and stack registers (ebp, esp, …); main memory, drawn between high and low addresses, with Global Variables, Stack and Heap regions]

Runtime Stack
SP – stack pointer – top of current frame
FP – frame pointer – base of current frame
–Sometimes called BP (base pointer)
[Figure: the previous frame followed by the current frame; FP marks the base and SP the top of the current frame; stack grows down]

L-Values of Local Variables
The offset in the stack is known at compile time
L-val(x) = FP + offset(x)
x = 5 → Load_Constant 5, R3; Store R3, offset(x)(FP)

What’s the Problem with This Picture?
[Figure, shown on two consecutive slides: the Semi-Abstract Register Machine, with general-purpose (data) registers, control registers (Register PC, …), stack registers (ebp, esp, …), and main memory with Global Variables, Stack and Heap regions between high and low addresses]

Saving and Restoring Registers
The processor does not save the content of registers on procedure calls
So who will?
–Caller saves and restores registers
–Callee saves and restores registers
–But can also have both save/restore some registers

Activation Record (Frame)
From high addresses to low addresses (stack grows down):
–Incoming parameters: parameter k, …, parameter 1
–Administrative part: lexical pointer, return information, dynamic link, registers & misc
–Local variables
–Temporaries
–(next frame would be here)
[Figure labels: frame pointer, stack pointer]

Pentium Runtime Stack
Pentium stack registers
    Register    Usage
    ESP         Stack pointer
    EBP         Base pointer
Pentium stack and call/ret instructions
    Instruction       Usage
    push, pusha, …    push on runtime stack
    pop, popa, …      pop from runtime stack
    call              transfer control to called routine
    ret               transfer control back to caller

Accessing Stack Variables
Use offset from FP (%ebp)
Remember – stack grows downwards
Above FP = parameters
Below FP = locals
Examples
–%ebp + 4 = return address
–%ebp + 8 = first parameter
–%ebp - 4 = first local
[Figure: frame layout from high to low addresses: Param n, …, param1 (param1 at FP+8), Return address, Previous fp (FP points here), Local 1 (at FP-4), …, Local n (SP points here)]

Call Sequences
Caller push code
–Push caller-save registers
–Push actual parameters (in reverse order)
call
–Push return address
–Jump to call address
Callee push code (prologue)
–Push current base-pointer
–bp = sp
–Push local variables
–Push callee-save registers
Callee pop code (epilogue)
–Pop callee-save registers
–Pop callee activation record
–Pop old base-pointer
–Put return value in register %eax
return
–Pop return address
–Jump to address
Caller pop code
–Get return value from register %eax
–Pop parameters
–Pop caller-save registers

“To Callee-save or to Caller-save?”
Callee-saved registers need only be saved when the callee modifies their value
Some heuristics and conventions are followed

Caller-Save and Callee-Save Registers
Callee-Save Registers
–Saved by the callee before modification
–Values are automatically preserved across calls
Caller-Save Registers
–Saved (if needed) by the caller before calls
–Values are not automatically preserved across calls
Architecture defines caller- and callee-save registers
–Separate compilation
–Interoperability between code produced by different compilers/languages
–But compiler writers decide when to use caller/callee registers

Callee-Save Registers
Saved by the callee before modification
–Usually at procedure prolog
–Restored at procedure epilog
–Hardware support may be available
Values are automatically preserved across calls
Source:
    int foo(int a) {
        int b = a + 1;
        f1();
        g1(b);
        return (b + 2);
    }
Generated code:
    .global _foo
    Add_Constant -K, SP        // allocate space for foo
    Store_Local R5, -14(FP)    // save R5
    Load_Reg R5, R0; Add_Constant R5, 1
    JSR f1; JSR g1;
    Add_Constant R5, 2; Load_Reg R5, R0
    Load_Local -14(FP), R5     // restore R5
    Add_Constant K, SP; RTS    // deallocate

Caller-Save Registers
Saved by the caller before calls when needed
Values are not automatically preserved across calls
Source:
    void bar(int y) {
        int x = y + 1;
        f2(x);
        g2(2);
        g2(8);
    }
Generated code:
    .global _bar
    Add_Constant -K, SP        // allocate space for bar
    Add_Constant R0, 1
    JSR f2
    Load_Constant 2, R0; JSR g2;
    Load_Constant 8, R0; JSR g2
    Add_Constant K, SP         // deallocate space for bar
    RTS

Parameter Passing
1960s
–In memory
–No recursion is allowed
1970s
–In stack
1980s
–In registers
–First k parameters are passed in registers (k=4 or k=6)
–Where is time saved?
Most procedures are leaf procedures
Interprocedural register allocation
Many of the registers may be dead before another invocation
Register windows are allocated in some architectures per call (e.g., Sun SPARC)

Activation Records & Language Design 39

Static (lexical) Scoping
    main() {                              // block B0
        int a = 0;
        int b = 0;
        {                                 // block B1
            int b = 1;
            {                             // block B2
                int a = 2;
                printf("%d %d\n", a, b);
            }
            {                             // block B3
                int b = 3;
                printf("%d %d\n", a, b);
            }
            printf("%d %d\n", a, b);
        }
        printf("%d %d\n", a, b);
    }
    Declaration    Scopes
    a=0            B0, B1, B3
    b=0            B0
    b=1            B1, B2
    a=2            B2
    b=3            B3
A name refers to its (closest) enclosing scope
Known at compile time

Dynamic Scoping
Each identifier is associated with a global stack of bindings
When entering a scope where the identifier is declared
–Push declaration on identifier stack
When exiting a scope where the identifier is declared
–Pop identifier stack
Evaluating the identifier in any context binds to the current top of stack
Determined at runtime

Compile-Time Information on Variables
Name, type, size
Address kind
–Fixed (global)
–Relative (local)
–Dynamic (heap)
Scope
–When is it recognized
Duration
–Until when does its value exist

Scoping
What value is returned from main?
Static scoping?
Dynamic scoping?
    int x = 42;
    int f() { return x; }
    int g() { int x = 1; return f(); }
    int main() { return g(); }

Nested Procedures
For example – Pascal
Any routine can have sub-routines
Any sub-routine can access anything that is defined in its containing scope or inside the sub-routine itself
–“non-local” variables

Lexical Pointers
    program p() {
        int x;
        procedure a() {
            int y;
            procedure b() { c() };
            procedure c() {
                int z;
                procedure d() { y := x + z };
                … b() … d() …
            }
            … a() … c() …
        }
        a()
    }
Possible call sequence: p → a → a → c → b → c → d
[Figure: the stack of frames for this call sequence (p, a, a, c, b, c, d), with the locals y and z and the lexical pointers between frames]

Non-Local goto in C syntax 46

Non-local gotos in C
setjmp remembers the current location and the stack frame
longjmp jumps to the remembered location (popping many activation records)

Non-Local Transfer of Control in C 48

Variable Length Frame Size
C allows allocating objects of unbounded size in the stack
    void p() {
        int i;
        char *p;
        scanf("%d", &i);
        p = (char *) alloca(i * sizeof(int));
    }
Some versions of Pascal allow conformant array value parameters

Limitations
The compiler may be forced to store a value on the stack instead of in registers
The stack may not suffice to handle some language features

Frame-Resident Variables
A variable x cannot be stored in a register when:
–x is passed by reference
–Address of x is taken (&x)
–x is addressed via pointer arithmetic on the stack frame (C varargs)
–x is accessed from a nested procedure
–The value is too big to fit into a single register
–The variable is an array
–The register of x is needed for other purposes
–Too many local variables
An escape variable:
–Passed by reference
–Address is taken
–Addressed via pointer arithmetic on the stack frame
–Accessed from a nested procedure

Limitations of Stack Frames
A local variable of P cannot be stored in the activation record of P if its duration exceeds the duration of P
Example 1: Static variables in C (own variables in Algol)
    void p(int x) { static int y = 6; y += x; }
Example 2: Features of the C language
    int *f() { int x; return &x; }
Example 3: Dynamic allocation
    int *f() { return (int *) malloc(sizeof(int)); }

Compiler Implementation
Hide machine dependent parts
Hide language dependent part
Use special modules

Basic Compiler Phases
[Pipeline: Source program (string) → lexical analysis → Tokens → syntax analysis → Abstract syntax tree → semantic analysis → Code generation (using the Frame manager; Control Flow Graph) → Assembly → Assembler/Linker → .EXE]

Hidden in the frame ADT
Word size
The location of the formals
Frame resident variables
Machine instructions to implement “shift-of-view” (prologue/epilogue)
The number of locals “allocated” so far
The label in which the machine code starts

Invocations to Frame
“Allocate” a new frame
“Allocate” new local variable
Return the L-value of local variable
Generate code for procedure invocation
Generate prologue/epilogue
Generate code for procedure return

The Frames in Different Architectures
g(x, y, z) where x escapes
                 Pentium          MIPS               Sparc
    x            InFrame(8)       InFrame(0)         InFrame(68)
    y            InFrame(12)      InReg(X157)        InReg(X157)
    z            InFrame(16)      InReg(X158)        InReg(X158)
    View Change  M[sp+0] ← fp     sp ← sp-K          save %sp, -K, %sp
                 fp ← sp          M[sp+K+0] ← r2     M[fp+68] ← i0
                 sp ← sp-K        X157 ← r4          X157 ← i1
                                  X158 ← r5          X158 ← i2

Activation Records: Summary
Compile-time memory management for procedure data
Works well for data with well-scoped lifetime
–Deallocation when procedure returns

One More Thing* 59

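Windows Exploit(s): Buffer Overflow
    void foo(char *x) {
        char buf[2];
        strcpy(buf, x);
    }
    int main(int argc, char *argv[]) {
        foo(argv[1]);
    }
    ./a.out abracadabra
    Segmentation fault
[Figure: stack layout with Previous frame, Return address, Saved FP, char* x and buf[2]; the input bytes “ab ra ca da br” are written starting at buf and spill into the neighboring slots; arrows mark the direction of stack growth and of increasing memory addresses]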

Buffer overflow
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    int check_authentication(char *password) {
        int auth_flag = 0;
        char password_buffer[16];
        strcpy(password_buffer, password);
        if (strcmp(password_buffer, "brillig") == 0) auth_flag = 1;
        if (strcmp(password_buffer, "outgrabe") == 0) auth_flag = 1;
        return auth_flag;
    }
    int main(int argc, char *argv[]) {
        if (argc < 2) { printf("Usage: %s \n", argv[0]); exit(0); }
        if (check_authentication(argv[1])) {
            printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
            printf(" Access Granted.\n");
            printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        } else {
            printf("\nAccess Denied.\n");
        }
    }
(source: “Hacking: The Art of Exploitation, 2nd Ed”)

Buffer overflow
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    int check_authentication(char *password) {
        char password_buffer[16];
        int auth_flag = 0;
        strcpy(password_buffer, password);
        if (strcmp(password_buffer, "brillig") == 0) auth_flag = 1;
        if (strcmp(password_buffer, "outgrabe") == 0) auth_flag = 1;
        return auth_flag;
    }
    int main(int argc, char *argv[]) {
        if (argc < 2) { printf("Usage: %s \n", argv[0]); exit(0); }
        if (check_authentication(argv[1])) {
            printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
            printf(" Access Granted.\n");
            printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        } else {
            printf("\nAccess Denied.\n");
        }
    }
(source: “Hacking: The Art of Exploitation, 2nd Ed”)

Buffer overflow 0x :movl $0x ,(%esp) 0x :call 0x x :movl $0x ,(%esp) 0x c :call 0x x :movl $0x804867a,(%esp) 0x :call 0x x d :jmp 0x804855b 0x f :movl $0x ,(%esp) 0x :call 0x

Example: Nested Procedures
    program p;
    var x: Integer;
        procedure a
        var y: Integer;
            procedure b begin … b … end;
            function c
            var z: Integer;
                procedure d begin … d … end;
            begin … c … end;
        begin … a … end;
    begin … p … end.
Possible call sequence: p → a → a → c → b → c → d
What is the address of variable “y” in procedure d?

The End (Activation Records) 66