Published by Marvin Riley. Modified over 9 years ago.
1
Lecture 10 – Code Generation
Eran Yahav
Reference: Dragon 8; MCD 4.2.4
2
You are here
Source text (txt) -> Compiler [Lexical Analysis -> Syntax Analysis (Parsing) -> Semantic Analysis -> Inter. Rep. (IR) -> Code Gen.] -> Executable code (exe)
3
Last Week: Runtime Part II
- Nested procedures
- Object layout
- Inheritance
- Multiple inheritance
4
Today
- Runtime checks
- Garbage collection
- Generating assembly code
5
Runtime checks
- Generate code that checks for attempted illegal operations
  - Null pointer check
    - MoveField, MoveArray, ArrayLength, VirtualCall
    - Reference arguments to library functions should not be null
  - Array bounds check
  - Array allocation size check
  - Division by zero
  - …
- If a check fails, jump to error handler code that prints a message and gracefully exits the program
6
Null pointer check
    # null pointer check
    cmp $0,%eax
    je labelNPE

Single generated handler for entire program:
    labelNPE:
        push $strNPE        # error message
        call __println
        push $1             # error code
        call __exit
7
Array bounds check
    # array bounds check
    mov -4(%eax),%ebx       # ebx = length
    mov $0,%ecx             # ecx = index
    cmp %ecx,%ebx
    jle labelABE            # ebx <= ecx ?
    cmp $0,%ecx
    jl labelABE             # ecx < 0 ?

Single generated handler for entire program:
    labelABE:
        push $strABE        # error message
        call __println
        push $1             # error code
        call __exit
8
Array allocation size check
    # array size check
    cmp $0,%eax             # eax == array size
    jle labelASE            # eax <= 0 ?

Single generated handler for entire program:
    labelASE:
        push $strASE        # error message
        call __println
        push $1             # error code
        call __exit
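The three checks above share one shape: a compare, a conditional jump, and a single handler per error kind. A minimal sketch of how a code generator might emit them follows. The function names are hypothetical, and the array length at offset -4 is taken from the bounds-check slide rather than from any particular runtime.

```python
# Hypothetical emitter helpers; names are illustrative, not part of the course compiler.

def emit_null_check(reg):
    """AT&T-syntax lines that jump to the shared handler when reg holds null."""
    return [f"cmp $0,{reg}", "je labelNPE"]

def emit_bounds_check(array_reg, index_reg, scratch_reg="%ebx"):
    """Assumes the array length is stored 4 bytes before the array base."""
    return [
        f"mov -4({array_reg}),{scratch_reg}",  # scratch = length
        f"cmp {index_reg},{scratch_reg}",
        "jle labelABE",                         # length <= index
        f"cmp $0,{index_reg}",
        "jl labelABE",                          # index < 0
    ]

def emit_handler(label, message_symbol):
    """One handler per error kind, emitted once for the entire program."""
    return [f"{label}:", f"push ${message_symbol}", "call __println",
            "push $1", "call __exit"]

null_check = emit_null_check("%eax")
bounds_check = emit_bounds_check("%eax", "%ecx")
handler = emit_handler("labelNPE", "strNPE")
```

Because every check jumps to the same label, the handler text is emitted once no matter how many checks the program contains.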
9
Automatic Memory Management
- Automatically free memory when it is no longer needed
- Not limited to OO programs; we show it here because it is prevalent in OO languages such as Java
  - also in functional languages
- Approximate reasoning about object liveness
  - use reachability to approximate liveness
  - assume reachable objects are live
  - non-reachable objects are dead
- Three classical garbage collection techniques
  - reference counting
  - mark and sweep
  - copying
10
GC using Reference Counting
- Add a reference-count field to every object
  - how many references point to it
- When (rc == 0) the object is non-reachable
  - non-reachable => dead
  - can be collected (deallocated)
11
Managing Reference Counts
- Each object has a reference count o.RC
- A newly allocated object o gets o.RC = 1
  - why?
- Write barrier for reference updates:
    update(x, old, new) {
      old.RC--;
      new.RC++;
      if (old.RC == 0) collect(old);
    }
- collect(old) will decrement RC for all children and recursively collect objects whose RC reached 0
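The write barrier and recursive collect can be sketched in a few lines. This is an illustration under assumptions, not the lecture's implementation: the Obj class, the freed list, and the dec helper are invented here, and the sketch increments the new target before decrementing the old one so that overwriting a field with the same object is safe.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 1        # the reference handed back by the allocator
        self.fields = {}   # field name -> Obj or None

freed = []  # names of collected objects, in collection order (for illustration)

def collect(obj):
    """Deallocate obj and release the references it holds."""
    freed.append(obj.name)
    for child in obj.fields.values():
        if child is not None:
            dec(child)
    obj.fields.clear()

def dec(obj):
    obj.rc -= 1
    if obj.rc == 0:
        collect(obj)

def update(holder, field, new):
    """Write barrier for holder.field = new."""
    old = holder.fields.get(field)
    if new is not None:
        new.rc += 1        # increment first, so old is new cannot drop to 0
    holder.fields[field] = new
    if old is not None:
        dec(old)

a, b = Obj("a"), Obj("b")
update(a, "next", b)   # b is now referenced by the allocator result and by a
rc_b_after_store = b.rc
dec(b)                 # the local reference to b goes away
dec(a)                 # the last reference to a goes away: a, then b, are collected
```

The initial count of 1 stands for the reference returned by the allocation itself, which is one way to answer the "why?" on the slide.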
12
Cycles!
- Cannot identify non-reachable cycles
  - reference counts for nodes on the cycle will never decrement to 0
- Several approaches for dealing with cycles
  - ignore
  - periodically invoke a tracing algorithm to collect cycles
  - specialized algorithms for collecting cycles
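A tiny bookkeeping sketch shows why a cycle defeats reference counting. The object names and the rc table are made up for the illustration; only the counts are modeled, not a real heap.

```python
# rc[x] is the number of references that currently point to object x.
rc = {"a": 0, "b": 0}

def add_ref(target):
    rc[target] += 1

def drop_ref(target):
    rc[target] -= 1

add_ref("a")    # root -> a
add_ref("b")    # a -> b
add_ref("a")    # b -> a, which closes the cycle
drop_ref("a")   # the root reference goes away

# Neither object is reachable from a root any more, yet no count is 0.
stuck = sorted(name for name, count in rc.items() if count > 0)
```

Both counts stay at 1 because each object is kept "alive" only by the other, which is the case a tracing collector is needed for.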
13
GC Using Mark & Sweep
- Marking phase
  - mark roots
  - trace all objects transitively reachable from roots
  - mark every traversed object
- Sweep phase
  - scan all objects in the heap
  - collect all unmarked objects
14
GC Using Mark & Sweep
    mark_sweep() {
      for Ptr in Roots
        mark(Ptr)
      sweep()
    }

    mark(Obj) {
      if mark_bit(Obj) == unmarked {
        mark_bit(Obj) = marked
        for C in Children(Obj)
          mark(C)
      }
    }

    sweep() {
      p = Heap_bottom
      while (p < Heap_top) {
        if (mark_bit(p) == unmarked) then free(p)
        else mark_bit(p) = unmarked
        p = p + size(p)
      }
    }
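The pseudocode above translates almost directly into a runnable sketch. The Node class and the list-based heap are assumptions made for the example, and marking uses an explicit worklist instead of recursion so that deep object graphs do not overflow the call stack.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.marked = False

def mark(obj):
    """Mark everything transitively reachable from obj."""
    worklist = [obj]
    while worklist:
        node = worklist.pop()
        if not node.marked:
            node.marked = True
            worklist.extend(node.children)

def sweep(heap):
    """Keep marked objects (clearing the bit for the next cycle), drop the rest."""
    survivors = []
    for node in heap:
        if node.marked:
            node.marked = False
            survivors.append(node)
    return survivors

def mark_sweep(roots, heap):
    for root in roots:
        mark(root)
    return sweep(heap)

a, b, c, d = (Node(n) for n in "abcd")
a.children.append(b)
c.children.append(d)
d.children.append(c)          # c and d form an unreachable cycle
heap = mark_sweep([a], [a, b, c, d])
survivor_names = [n.name for n in heap]
```

Unlike reference counting, the unreachable cycle between c and d is reclaimed, because liveness is decided by reachability from the roots.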
15
Copying GC
- Partition the heap into two parts: old space, new space
- GC
  - copy all reachable objects from old space to new space
  - swap roles of old/new space
16
Example
[Figure: heap split into old space and new space; Roots point into old space, which holds objects A, B, C, D, E]
17
Example
[Figure: same heap during collection; old space holds A, B, C, D, E, and copies of A and C now appear in new space]
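A copying collector can be sketched with a forwarding table and a scan pointer, in the style of Cheney's algorithm. The Cell class and the object graph below are invented for illustration; the edges in the figure are not recoverable from this transcript, so they are not reproduced here.

```python
class Cell:
    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)

def copy_collect(roots):
    """Copy everything reachable from roots into a fresh new space."""
    forwarding = {}   # id(old object) -> its copy in new space
    new_space = []

    def copy(obj):
        if id(obj) not in forwarding:
            clone = Cell(obj.name, obj.refs)   # refs still point into old space
            forwarding[id(obj)] = clone
            new_space.append(clone)
        return forwarding[id(obj)]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(new_space):               # new_space grows while we scan it
        cell = new_space[scan]
        cell.refs = [copy(target) for target in cell.refs]
        scan += 1
    return new_roots, new_space

# Hypothetical graph: A <-> C are reachable from the root; B, D, E are garbage.
A, B, C, D, E = (Cell(n) for n in "ABCDE")
A.refs.append(C)
C.refs.append(A)
D.refs.append(E)
new_roots, new_space = copy_collect([A])
copied_names = [cell.name for cell in new_space]
```

After the call, the old space can be discarded wholesale and the two spaces swap roles; unreachable objects are never visited at all.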
18
Summary
- How objects are organized in memory
- Automatic management of memory
Coming up…
- Generating assembly code
19
target languages
IR + Symbol Table -> Code Gen. -> one of:
- Absolute machine code
- Relative machine code
- Assembly
20
From IR to ASM: Challenges
- Mapping IR to ASM operations
  - what instruction(s) should be used to implement an IR operation?
  - how do we translate code sequences?
- Call/return of routines
  - managing activation records
- Memory allocation
- Register allocation
- Optimizations
21
Intel IA-32 Assembly
- Going from Assembly to Binary…
  - Assembling
  - Linking
- AT&T syntax vs. Intel syntax
- We will use AT&T syntax
  - matches GNU assembler (GAS)
22
AT&T vs. Intel Syntax

Attribute: Parameter order
  AT&T:  Source comes before the destination
  Intel: Destination comes before the source

Attribute: Parameter size
  AT&T:  Mnemonics are suffixed with a letter indicating the size of the operands (e.g., "q" for qword, "l" for dword, "w" for word, and "b" for byte)
  Intel: Derived from the name of the register that is used

Attribute: Immediate value sigils
  AT&T:  Immediate values are prefixed with a "$", and registers must be prefixed with a "%"
  Intel: The assembler automatically detects the type of symbols, i.e., whether they are registers, constants or something else

Attribute: Effective addresses
  AT&T:  General syntax DISP(BASE,INDEX,SCALE)
         Example: movl mem_location(%ebx,%ecx,4), %eax
  Intel: Use variables, which need to be in square brackets; additionally, size keywords like byte, word, or dword have to be used
         Example: mov eax, dword [ebx + ecx*4 + mem_location]
23
IA-32 Registers
- Eight 32-bit general-purpose registers
  - EAX – accumulator for operands and result data. Used to return value from function calls.
  - EBX – pointer to data. Often used as array-base address
  - ECX – counter for string and loop operations
  - EDX – I/O pointer (GP for us)
  - ESI – GP and source pointer for string operations
  - EDI – GP and destination pointer for string operations
  - EBP – stack frame (base) pointer
  - ESP – stack pointer
- EFLAGS register
- EIP (instruction pointer) register
- Six 16-bit segment registers
- … (ignore the rest for our purposes)
24
Not all registers are born equal
- EAX
  - Required operand of MUL, IMUL, DIV and IDIV instructions
  - Contains the result of these operations
- EDX
  - Stores remainder of a DIV or IDIV instruction (EAX stores quotient)
- ESI, EDI
  - ESI – required source pointer for string instructions
  - EDI – required destination pointer for string instructions
- Destination registers of arithmetic operations
  - EAX, EBX, ECX, EDX
- EBP – stack frame (base) pointer
- ESP – stack pointer
25
IA-32 Addressing Modes
- Machine instructions take zero or more operands
- Source operand
  - Immediate
  - Register
  - Memory location
  - (I/O port)
- Destination operand
  - Register
  - Memory location
  - (I/O port)
26
Immediate and Register Operands
- Immediate
  - Value specified in the instruction itself
  - GAS syntax: immediate values preceded by $
      add $4, %esp
- Register
  - Register name is used
  - GAS syntax: register names preceded with %
      mov %esp,%ebp
27
Memory and Base Displacement Operands
- Memory operands
  - Value at given address
  - GAS syntax: parentheses
      mov (%eax), %eax
- Base displacement
  - Value at computed address
  - Address computed out of base register, index register, scale factor, displacement
  - offset = base + (index*scale) + displacement
  - Syntax: disp(base,index,scale); the displacement is a plain number with no $ prefix
      movl $42, 2(%eax)
      movl $42, 1(%eax,%ecx,4)
28
Base Displacement Addressing
    mov (%ecx,%ebx,4), %eax
[Figure: an array of 4-byte elements starting at the array base reference]
- %ecx = base
- %ebx = 3
- offset = base + (index*scale) + displacement
- offset = base + (3*4) + 0 = base + 12
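The address arithmetic can be checked with a small helper. The function name and the sample base address 0x1000 are assumptions made for the example; the restriction of the scale factor to 1, 2, 4, or 8 and the 32-bit wraparound are properties of IA-32 addressing.

```python
def effective_address(base, index=0, scale=1, disp=0):
    """Compute disp(base,index,scale) as a 32-bit address."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("IA-32 scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF  # wrap to 32 bits

base = 0x1000                                     # hypothetical value of %ecx
addr = effective_address(base, index=3, scale=4)  # (%ecx,%ebx,4) with %ebx = 3
```

With a scale of 4, the index register counts elements rather than bytes, which is why a 4-byte array element at index 3 sits 12 bytes past the base.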
29
How do we generate the code?
- Break the IR into basic blocks
- A basic block is a sequence of instructions with
  - single entry (to first instruction), no jumps to the middle of the block
  - single exit (last instruction)
  - code executes as a sequence from first instruction to last instruction without any jumps
- Edge from one basic block B1 to another block B2 when the last statement of B1 may jump to B2
30
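Example
[Figure: control flow graph with blocks labeled B1 to B4 and branch edges labeled True and False; the block contents are listed below in the order they appear on the slide]

    t1 := 4 * i
    t2 := a[t1]
    if t2 <= 20 goto B3

    t5 := t2 * t4
    t6 := prod + t5
    prod := t6
    goto B4

    t7 := i + 1
    i := t2
    goto B5

    t3 := 4 * i
    t4 := b[t3]
    goto B4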
31
creating basic blocks
- Input: A sequence of three-address statements
- Output: A list of basic blocks with each three-address statement in exactly one block
- Method
  - Determine the set of leaders (first statement of a block)
    - The first statement is a leader
    - Any statement that is the target of a conditional or unconditional jump is a leader
    - Any statement that immediately follows a goto or conditional jump statement is a leader
  - For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the program
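The leader rules can be sketched as follows. The statement encoding (a kind plus an optional 1-based jump target) is an assumption chosen to keep the example short; the sample program mirrors the jump structure of the 17-statement IR example in these slides, with only the jumps spelled out.

```python
def find_leaders(program):
    """program: list of (kind, target), kind in {'stmt', 'goto', 'if'};
    target is a 1-based statement number for jumps, otherwise None."""
    leaders = {1} if program else set()            # rule 1: first statement
    for number, (kind, target) in enumerate(program, start=1):
        if kind in ("goto", "if"):
            leaders.add(target)                    # rule 2: jump target
            if number < len(program):
                leaders.add(number + 1)            # rule 3: statement after a jump
    return sorted(leaders)

def basic_blocks(program):
    """Each block runs from a leader up to, but not including, the next leader."""
    leaders = find_leaders(program)
    ends = leaders[1:] + [len(program) + 1]
    return [list(range(start, end)) for start, end in zip(leaders, ends)]

plain = ("stmt", None)
program = [plain] * 8 + [("if", 3), plain, ("if", 2)] + [plain] * 5 + [("if", 13)]
leaders = find_leaders(program)
blocks = basic_blocks(program)
```

For this program the leaders are statements 1, 2, 3, 10, 12 and 13, which yields six blocks.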
32
control flow graph
- A directed graph G=(V,E)
- nodes V = basic blocks
- edges E = control flow
  - (B1,B2) ∈ E when control from B1 flows to B2

B1:
    prod := 0
    i := 1
B2:
    t1 := 4 * i
    t2 := a[t1]
    t3 := 4 * i
    t4 := b[t3]
    t5 := t2 * t4
    t6 := prod + t5
    prod := t6
    t7 := i + 1
    i := t7
    if i <= 20 goto B2
33
example

source:
    for i from 1 to 10 do
      for j from 1 to 10 do
        a[i, j] = 0.0;
    for i from 1 to 10 do
      a[i, i] = 1.0;

IR:
    1) i = 1
    2) j = 1
    3) t1 = 10*i
    4) t2 = t1 + j
    5) t3 = 8*t2
    6) t4 = t3-88
    7) a[t4] = 0.0
    8) j = j + 1
    9) if j <= 10 goto (3)
    10) i = i+1
    11) if i <= 10 goto (2)
    12) i = 1
    13) t5 = i-1
    14) t6 = 88*t5
    15) a[t6] = 1.0
    16) i = i+1
    17) if i <= 10 goto (13)

CFG:
    B1: i = 1
    B2: j = 1
    B3: t1 = 10*i
        t2 = t1 + j
        t3 = 8*t2
        t4 = t3-88
        a[t4] = 0.0
        j = j + 1
        if j <= 10 goto B3
    B4: i = i+1
        if i <= 10 goto B2
    B5: i = 1
    B6: t5 = i-1
        t6 = 88*t5
        a[t6] = 1.0
        i = i+1
        if i <= 10 goto B6
34
Variable Liveness
- A statement x = y + z
  - defines x
  - uses y and z
- A variable x is live at a program point if its value is used at a later point

    y = 42        x undef, y live, z undef
    z = 73        x undef, y live, z live
    x = y + z     x is live, y dead, z dead
    print(x);     x is dead, y dead, z dead
(showing state after the statement)
35
Computing Liveness Information
- Between basic blocks: dataflow analysis (next lecture)
- Within a single basic block?
- Idea
  - use symbol table to record next-use information
  - scan basic block backwards
  - update next-use for each variable
36
Computing Liveness Information
- INPUT: A basic block B of three-address statements. The symbol table initially shows all non-temporary variables in B as being live on exit.
- OUTPUT: At each statement i: x = y + z in B, liveness and next-use information of x, y, and z at i.
- Start at the last statement in B and scan backwards
- At each statement i: x = y + z in B, we do the following:
  1. Attach to i the information currently found in the symbol table regarding the next use and liveness of x, y, and z.
  2. In the symbol table, set x to "not live" and "no next use."
  3. In the symbol table, set y and z to "live" and the next uses of y and z to i
37
Computing Liveness Information
- Start at the last statement in B and scan backwards
- At each statement i: x = y + z in B, we do the following:
  1. Attach to i the information currently found in the symbol table regarding the next use and liveness of x, y, and z.
  2. In the symbol table, set x to "not live" and "no next use."
  3. In the symbol table, set y and z to "live" and the next uses of y and z to i
- Can we change the order between 2 and 3?
    x = 1
    y = x + 3
    z = x * 3
    x = x * z
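The backward scan can be sketched as below. The tuple encoding of statements and the use of 0-based statement indices are assumptions made for the example; constants are written as numbers so that only strings are treated as variables.

```python
def next_use_info(block, live_on_exit):
    """block: list of (dst, src1, src2); sources may be variable names or constants.
    Returns one dict per statement: variable -> (is_live, next_use_index or None)."""
    def is_var(v):
        return isinstance(v, str)

    # Symbol table: variables are live on exit only if listed in live_on_exit.
    table = {v: (v in live_on_exit, None)
             for stmt in block for v in stmt if is_var(v)}

    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        dst, src1, src2 = block[i]
        info[i] = {v: table[v] for v in (dst, src1, src2) if is_var(v)}  # step 1
        table[dst] = (False, None)                                       # step 2
        for v in (src1, src2):                                           # step 3
            if is_var(v):
                table[v] = (True, i)
    return info

block = [
    ("x", 1, None),    # 0: x = 1
    ("y", "x", 3),     # 1: y = x + 3
    ("z", "x", 3),     # 2: z = x * 3
    ("x", "x", "z"),   # 3: x = x * z
]
info = next_use_info(block, live_on_exit={"x", "y", "z"})
```

Statement 3 defines and uses x, so the order of steps 2 and 3 matters: running step 3 last leaves x marked live with next use 3, which is what statement 2 must see. Swapping the steps would mark x as not live before statement 3 and lose that use.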
38
common-subexpression elimination
Before:
    a = b + c
    b = a - d
    c = b + c
    d = a - d
After:
    a = b + c
    b = a - d
    c = b + c
    d = b
39
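DAG Representation of Basic Blocks
    a = b + c
    b = a - d
    c = b + c
    d = a - d
[Figure: DAG with leaves b0, c0, d0; a "+" node labeled a over b0 and c0; a "-" node labeled b,d over a and d0; a "+" node labeled c over the b,d node and c0]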
40
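DAG Representation of Basic Blocks
    a = b + c
    b = b - d
    c = c + d
    e = b + c
[Figure: DAG with leaves b0, c0, d0; a "+" node labeled a over b0 and c0; a "-" node labeled b over b0 and d0; a "+" node labeled c over c0 and d0; a "+" node labeled e over the b and c nodes]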
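One way to build such a DAG is to give every distinct (operator, left node, right node) triple a single node and to track which node each variable currently names. The sketch below follows that idea; the function and variable names are hypothetical, and it ignores commutativity and anything beyond binary operators.

```python
def build_dag(block):
    """block: list of (dst, op, left, right) with variable names as operands.
    Returns (nodes, labels): nodes[i] is ('leaf', name) or (op, left_id, right_id),
    and labels maps each variable to the node that holds its final value."""
    nodes, index, current = [], {}, {}

    def node_for(key):
        if key not in index:           # reuse an existing node when possible
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def operand(name):
        if name not in current:        # first use: the variable's initial value
            current[name] = node_for(("leaf", name))
        return current[name]

    for dst, op, left, right in block:
        left_id, right_id = operand(left), operand(right)
        current[dst] = node_for((op, left_id, right_id))
    return nodes, current

first = [("a", "+", "b", "c"), ("b", "-", "a", "d"),
         ("c", "+", "b", "c"), ("d", "-", "a", "d")]
second = [("a", "+", "b", "c"), ("b", "-", "b", "d"),
          ("c", "+", "c", "d"), ("e", "+", "b", "c")]
nodes1, labels1 = build_dag(first)
nodes2, labels2 = build_dag(second)
```

In the first block, b and d end up on the same node, so d = a - d can be replaced by d = b. In the second block, e = b + c gets its own node even though it looks like a = b + c, because b and c were redefined in between.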
41
algebraic identities
Before:
    a = x^2
    b = x*2
    c = x/2
    d = 1*x
After:
    a = x*x
    b = x+x
    c = x*0.5
    d = x
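These rewrites can be expressed as a small pattern-matching function over one three-address operation. The function name and the tuple representation are assumptions for the sketch, and the x/2 to x*0.5 rewrite is only valid for floating-point operands, not for integer division.

```python
def simplify(op, left, right):
    """Return a cheaper (op, left, right) triple, or a bare operand when the
    operation disappears entirely. Unmatched operations are returned unchanged."""
    if op == "^" and right == 2:
        return ("*", left, left)       # x^2 -> x*x
    if op == "*" and right == 2:
        return ("+", left, left)       # x*2 -> x+x
    if op == "*" and left == 1:
        return right                   # 1*x -> x
    if op == "*" and right == 1:
        return left                    # x*1 -> x
    if op == "/" and right == 2:
        return ("*", left, 0.5)        # x/2 -> x*0.5 (floating point only)
    return (op, left, right)

results = [simplify("^", "x", 2), simplify("*", "x", 2),
           simplify("/", "x", 2), simplify("*", 1, "x")]
```

The four results correspond to the "after" column of the slide, in order.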
42
coming up next
- register allocation
43
The End