Compiler Construction Code Generation I Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.

Slides:



Advertisements
Similar presentations
Intermediate Representation III. 2 PAs PA2 deadline is
Advertisements

COMP 2003: Assembly Language and Digital Logic
C Programming and Assembly Language Janakiraman V – NITK Surathkal 2 nd August 2014.
Compiler Construction Intermediate Representation III Activation Records Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Assembly Language for Intel-Based Computers Chapter 8: Advanced Procedures Kip R. Irvine.
Lecture 10 – Activation Records Eran Yahav 1 Reference: Dragon 7.1,7.2. MCD 6.3,
Lecture 11 – Code Generation Eran Yahav 1 Reference: Dragon 8. MCD
University of Washington Last Time For loops  for loop → while loop → do-while loop → goto version  for loop → while loop → goto “jump to middle” version.
Compiler Construction Code Generation II Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Machine-Level Programming III: Procedures Apr. 17, 2006 Topics IA32 stack discipline Register saving conventions Creating pointers to local variables CS213.
Assembly Language for Intel-Based Computers Chapter 5: Procedures Kip R. Irvine.
PC hardware and x86 3/3/08 Frans Kaashoek MIT
Accessing parameters from the stack and calling functions.
– 1 – , F’02 ICS05 Instructor: Peter A. Dinda TA: Bin Lin Recitation 4.
Compiler Construction Recap Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Assembly תרגול 8 פונקציות והתקפת buffer.. Procedures (Functions) A procedure call involves passing both data and control from one part of the code to.
Compiler Construction Activation records Code Generation Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Stack Activation Records Topics IA32 stack discipline Register saving conventions Creating pointers to local variables February 6, 2003 CSCE 212H Computer.
David Evans CS201j: Engineering Software University of Virginia Computer Science Lecture 18: 0xCAFEBABE (Java Byte Codes)
CEG 320/520: Computer Organization and Assembly Language ProgrammingIntel Assembly 1 Intel IA-32 vs Motorola
6.828: PC hardware and x86 Frans Kaashoek
Dr. José M. Reyes Álamo 1.  The 80x86 memory addressing modes provide flexible access to memory, allowing you to easily access ◦ Variables ◦ Arrays ◦
Lecture 10 – Code Generation Eran Yahav 1 Reference: Dragon 8. MCD
Today’s topics Parameter passing on the system stack Parameter passing on the system stack Register indirect and base-indexed addressing modes Register.
University of Washington Today More on procedures, stack etc. Lab 2 due today!  We hope it was fun! What is a stack?  And how about a stack frame? 1.
Code Generation Gülfem Savrun Yeniçeri CS 142 (b) 02/26/2013.
Fabián E. Bustamante, Spring 2007 Machine-Level Programming III - Procedures Today IA32 stack discipline Register saving conventions Creating pointers.
Assembly Language for Intel-Based Computers, 6 th Edition Chapter 8: Advanced Procedures (c) Pearson Education, All rights reserved. You may.
Code Generation III. PAs PA4 5.1 – 7.2 PA5 (bonus) 24.1 –
Winter Compiler Construction T12 – Code Generation Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University.
The x86 Architecture Lecture 15 Fri, Mar 4, 2005.
Recitation 2: Outline Assembly programming Using gdb L2 practice stuff Minglong Shao Office hours: Thursdays 5-6PM Wean Hall.
Assembly תרגול 5 תכנות באסמבלי. Assembly vs. Higher level languages There are NO variables’ type definitions.  All kinds of data are stored in the same.
Winter Compiler Construction T11 – Activation records + Introduction to x86 assembly Mooly Sagiv and Roman Manevich School of Computer Science.
26-Nov-15 (1) CSC Computer Organization Lecture 6: Pentium IA-32.
Machine-level Programming III: Procedures Topics –IA32 stack discipline –Register saving conventions –Creating pointers to local variables.
Compiler Construction Code Generation Activation Records
University of Amsterdam Computer Systems – the instruction set architecture Arnoud Visser 1 Computer Systems The instruction set architecture.
1 Assembly Language: Function Calls Jennifer Rexford.
CSC 221 Computer Organization and Assembly Language Lecture 16: Procedures.
Code Generation II. 2 Compiler IC Program ic x86 executable exe Lexical Analysis Syntax Analysis Parsing ASTSymbol Table etc. Inter. Rep. (IR) Code Generation.
Winter Compiler Construction T10 – IR part 3 + Activation records Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University.
IA32 Stack –Region of memory managed with stack discipline –Grows toward lower addresses –Register %esp indicates lowest stack address address of top element.
A job ad at a game programming company
Assembly function call convention
Reading Condition Codes (Cont.)
Assembly language.
Credits and Disclaimers
C function call conventions and the stack
143A: Principles of Operating Systems Lecture 4: Calling conventions
Aaron Miller David Cohen Spring 2011
Introduction to Compilers Tim Teitelbaum
Assembly IA-32.
Recitation 2 – 2/4/01 Outline Machine Model
Machine-Level Programming 1 Introduction
Machine-Level Programming 4 Procedures
Condition Codes Single Bit Registers
Procedures – Overview Lecture 19 Mon, Mar 28, 2005.
Machine-Level Programming III: Procedures Sept 18, 2001
MIPS Procedure Calls CSE 378 – Section 3.
Machine-Level Programming: Introduction
Multi-modules programming
Miscellaneous Topics.
X86 Assembly Review.
CSC 497/583 Advanced Topics in Computer Security
“Way easier than when we were students”
Credits and Disclaimers
Computer Architecture and System Programming Laboratory
Presentation transcript:

Compiler Construction Code Generation I Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University

2 Compiler IC Program ic x86 executable exe Lexical Analysis Syntax Analysis Parsing ASTSymbol Table etc. Inter. Rep. (IR) Code Generation IC compiler We saw: Activation records Today: X86 assembly Code generation Runtime checks

3 PA4 PA4 is up Submission deadline 09/03/2009

4 x86 assembly AT&T syntax and Intel syntax We’ll be using AT&T syntax Work with GNU Assembler (GAS)GNU Assembler AT&TIntel Order of operandsop a,b means b = a op b (second operand is destination) op a, b means a = a op b (first operand is destination) Memory addressingdisp(base, offset, scale)[base + offset * scale + disp] Size of memory operandsinstruction suffixes (b,w,l) (e.g., movb, movw, movl) operand prefixes (e.g., byte ptr, word ptr, dword ptr) Registers%eax, %ebx, etc.eax, ebx, etc. Constants$4, etc4, etc Summary of differences

5 IA-32 Eight 32-bit general-purpose registers EAX, EBX, ECX, EDX, ESI, EDI EBP – stack frame (base) pointer ESP – stack pointer EFLAGS register info on results of arithmetic operations EIP (instruction pointer) register Machine-instructions add, sub, inc, dec, neg, mul, …

6 Immediate and register operands Immediate Value specified in the instruction itself Preceded by $ Example: add $4,%esp Register Register name is used Preceded by % Example: mov %esp,%ebp

7 Reminder: accessing variables Use offset from frame pointer Above FP = parameters Below FP = locals (and spilled LIR registers) Examples %ebp + 4 = return address %ebp + 8 = first parameter %ebp – 4 = first local …… SP FP Return address local 1 … local n Previous fp param n … param 1 FP+8 FP-4

8 Memory and base displacement operands Memory operands Obtain value at given address Example: mov (%eax), %eax Base displacement Obtain value at computed address Syntax: disp(base,index,scale) offset = base + (index * scale) + displacement Example: mov $42, 2(%eax) Example: mov $42, (%eax,%ecx,4)

9 Reminder: accessing variables Use offset from frame pointer Above FP = parameters Below FP = locals (and spilled LIR registers) Examples %ebp + 8 = first parameter %eax = %ebp + 8 (%eax) = the value 572 8(%ebp) = the value 572 …… SP FP Return address local 1 … local n Previous fp param n … 572 %eax,FP+8 FP-4

10 Representing strings and arrays Array preceded by a word indicating the length of the array Project-wise String literals allocated statically, concatenation using __stringCat __allocateArray allocates arrays Hello world\13 String reference n 1

11 Base displacement addressing mov (%ecx,%ebx,4), %eax 7 Array base reference %ecx = base %ebx = 3 offset = base + (index * scale) + displacement offset = %ecx + (3*4) + 0 = %ecx + 12 (%ecx,%ebx,4)

12 Instruction examples Translate a=p+q into mov 16(%ebp),%ecx (load p) add 8(%ebp),%ecx (arithmetic p + q) mov %ecx,-8(%ebp) (store a) Accessing strings: str:.string “Hello world!” push $str

13 Instruction examples Array access: a[i]=1 mov -4(%ebp),%ebx (load a ) mov -8(%ebp),%ecx (load i ) mov $1,(%ebx,%ecx,4) (store into the heap) Jumps: Unconditional: jmp label2 Conditional: cmp $0, %ecx jnz cmpFailLabel

14 LIR to assembly Need to know how to translate: Function bodies Translation for each kind of LIR instruction Calling sequences Correctly access parameters and variables Compute offsets for parameter and variables Dispatch tables String literals Runtime checks Error handlers

15 Reminder: accessing variables Use offset from frame pointer Above FP = parameters Below FP = locals (and spilled LIR registers) Examples %ebp + 4 = return address %ebp + 8 = first parameter %ebp – 4 = first local …… SP FP Return address local 1 … local n Previous fp param n … param 1 FP+8 FP-4

16 Translating LIR instructions Translate function bodies: 1. Compute offsets for: Local variables (-4,-8,-12,…) LIR registers (considered extra local variables) Function parameters (+8,+12,+16,…) Take this parameter into account 2. Translate instruction list for each function Local translation for each LIR instruction Local (machine) register allocation

17 Memory offsets implementation // MethodLayout instance per function declaration class MethodLayout { // Maps variables/parameters/LIR registers to // offsets relative to frame pointer (BP) Map memoryToOffset; } void foo(int x, int y) { int z = x + y; g = z; // g is a field Library.printi(z); } virtual function takes one extra parameter: this MethodLayout for foo MemoryOffset this +8 x +12 y +16 z -4 R0 -8 R1 -12 _A_foo: Move x,R0 Add y,R0 Move R0,z Move this,R1 MoveField R0,R1.1 Library __printi(R0),Rdummy (manual) LIR translation 1 PA4 PA5

18 Memory offsets example MethodLayout for foo _A_foo: Move x,R0 Add y,R0 Move R0,z Move this,R1 MoveField R0,R1.1 Library __printi(R0),Rdummy _A_foo: push %ebp # prologue mov %esp,%ebp mov 12(%ebp),%eax # Move x,R0 mov %eax,-8(%ebp) mov 16(%ebp),%eax # Add y,R0 add -8(%ebp),%eax mov %eax,-8(%ebp) mov -8(%ebp),%eax # Move R0,z mov %eax,-4(%ebp) mov 8(%ebp),%eax # Move this,R1 mov %eax,-12(%ebp) mov -8(%ebp),%eax # MoveField R0,R1.1 mov -12(%ebp),%ebx mov %eax,8(%ebx) mov -8(%ebp),%eax # Library __printi(R0) push %eax call __printi add $4,%esp _A_foo_epilogoue: mov %ebp,%esp # epilogoue pop %ebp ret LIR translation Translation to x86 assembly MemoryOffset this +8 x +12 y +16 z -4 R0 -8 R

19 Instruction-specific register allocation Non-optimized translation Each non-call instruction has fixed number of variables/registers Naïve (very inefficient) translation Use direct algorithm for register allocation Example: Move x,R1 translates into move x offset (%ebp),%ebx move %ebx,R1 offset (%ebp) Register hard-coded in translation

20 Translating instructions 1 LIR InstructionTranslation MoveArray R1[R2],R3mov -8(%ebp),%ebx # -8(%ebp)=R1 mov -12(%ebp),%ecx # -12(%ebp)=R2 mov (%ebx,%ecx,4),%ebx mov %ebx,-16(%ebp) # -16(%ebp)=R3 MoveField x,R2.3mov -12(%ebp),%ebx # -12(%ebp)=R2 mov -8(%ebp),%eax # -12(%ebp)=x mov %eax,12(%ebx) # 12=3*4 MoveField _DV_A,R1.0movl $_DV_A,(%ebx) # (%ebx)=R1.0 ( movl means move 4 bytes) ArrayLength y,R1mov -8(%ebp),%ebx # -8(%ebp)=y mov -4(%ebx),%ebx # load size mov %ebx,-12(%ebp) # -12(%ebp)=R1 Add R1,R2mov -16(%ebp),%eax # -16(%ebp)=R1 add -20(%ebp),%eax # -20(%ebp)=R2 mov %eax,-20(%ebp) # store in R2

21 Translating instructions 2 LIR InstructionTranslation Mul R1,R2mov -8(%ebp),%eax # -8(%ebp)=R2 imul -4(%ebp),%eax # -4(%ebp)=R1 mov %eax,-8(%ebp) Div R1,R2 (idiv divides EDX:EAX stores quotient in EAX stores remainder in EDX) mov $0,%edx mov -8(%ebp),%eax # -8(%ebp)=R2 mov -4(%ebp),%ebx # -4(%ebp)=R1 idiv %ebx mov %eax,-8(%ebp) # store in R2 Mod R1,R2mov $0,%edx mov -8(%ebp),%eax # -8(%ebp)=R2 mov -4(%ebp),%ebx # -4(%ebp)=R1 idiv %ebx mov %edx,-8(%ebp) Compare R1,xmov -4(%ebp),%eax # -4(%ebp)=x cmp -8(%ebp),%eax # -8(%ebp)=R1 Return R1 (returned value stored in EAX register) mov -8(%ebp),%eax # -8(%ebp)=R1 jmp _A_foo_epilogue Return Rdummy # return; jmp _A_foo_epilogue

22 Calls/returns Direct function call syntax: call name Example: call __println Return instruction: ret

23 Handling functions Need to implement call sequence Caller code: Pre-call code: Push caller-save registers Push parameters Call (special treatment for virtual function calls) Post-call code: Copy returned value (if needed) Pop parameters Pop caller-save registers Callee code Each function has prologue and epilogue