Code Generation: Lecture 12
CS 164, Fall 2004

The remote testing experiment. It works!

Remote testing
In PA1, if you achieved best coverage, you also got the best score!
(Chart of PA1 scores vs. coverage omitted; visible data points: 44, 0.814286, 43, 0.8, 41, 38.)

From the cs164 newsgroup :-)
> I need a small website made. I'm willing to pay for the work too.
> So... if anyone is interested e-mail me at [deleted]@berkeley.edu.
CS164's guide to creating a website:
1) Write a lexer
2) Write a lexer spec for some unknown, obscure language
3) Write a parser
4) Write the grammar for that obscure language
5) Write a code generator that generates HTML
6) ...
7) Profit! Now only you can maintain that website!

The moral
Essentially, the recommended strategy is:
goal: no one can maintain your programs
means: develop an obscure language for your programs
But if this is your goal, why a new language? Tons of unmaintainable Java programs have been written, some even submitted as cs164 projects :-) I am sure you can succeed with just Java, too.
A better road to profit: develop a language. It can be obscure, even horrible, but make sure it's horribly useful, too (ex.: Perl, C++, Visual Basic, LaTeX). Then publish books on this language :-)

Lecture Outline
Stack machines
The MIPS assembly language
The x86 assembly language
A simple source language
Stack-machine implementation of the simple language

Stack Machines
A simple evaluation model:
No variables or registers
A stack of values for intermediate results

Example of a Stack Machine Program
Consider two instructions:
push i - place the integer i on top of the stack
add - pop two elements, add them, and push the result back on the stack
A program to compute 7 + 5:
push 7
push 5
add

Stack Machine. Example
(Diagram: the stack holds 7, … after push 7; then 5, 7, … after push 5; then 12, … after add.)
Each instruction:
Takes its operands from the top of the stack
Removes those operands from the stack
Computes the required operation on them
Pushes the result on the stack

Why Use a Stack Machine?
Each operation takes operands from the same place and puts results in the same place
This means a uniform compilation scheme
And therefore a simpler compiler

Why Use a Stack Machine? (Cont.)
Location of the operands is implicit:
always on the top of the stack
No need to specify operands explicitly
No need to specify the location of the result
Instruction "add" as opposed to "add r1, r2"
⇒ Smaller encoding of instructions
⇒ More compact programs
This is one reason why Java bytecodes use a stack evaluation model

Optimizing the Stack Machine
The add instruction does 3 memory operations:
two reads and one write to the stack
The top of the stack is frequently accessed
Idea: keep the top of the stack in a register (called the accumulator)
Register accesses are faster
The "add" instruction is now acc ← acc + top_of_stack
Only one memory operation!

Stack Machine with Accumulator
Invariants:
The result of computing an expression is always in the accumulator
For an operation op(e1,…,en), push the accumulator on the stack after computing each of e1,…,en-1
The result of en is in the accumulator before op
After the operation, pop n-1 values
After computing an expression, the stack is as before

Stack Machine with Accumulator. Example
Compute 7 + 5 using an accumulator:
Instruction                   acc   stack
acc ← 7                       7     …
push acc                      7     7, …
acc ← 5                       5     7, …
acc ← acc + top_of_stack      12    7, …
pop                           12    …

A Bigger Example: 3 + (7 + 5)
Code                          Acc   Stack
acc ← 3                       3     <init>
push acc                      3     3, <init>
acc ← 7                       7     3, <init>
push acc                      7     7, 3, <init>
acc ← 5                       5     7, 3, <init>
acc ← acc + top_of_stack      12    7, 3, <init>
pop                           12    3, <init>
acc ← acc + top_of_stack      15    3, <init>
pop                           15    <init>

Notes
It is very important that the stack is preserved across the evaluation of a subexpression:
Stack before the evaluation of 7 + 5 is 3, <init>
Stack after the evaluation of 7 + 5 is 3, <init>
The first operand is on top of the stack

From Stack Machines to MIPS
The compiler generates code for a stack machine with accumulator
We want to run the resulting code on an x86 or MIPS processor (or simulator)
We implement stack machine instructions using MIPS instructions and registers

MIPS assembly vs. x86 assembly
In PA4 and PA5, you will generate x86 code
because we have no MIPS machines around, and using a MIPS simulator is less exciting
In this lecture, we will use MIPS assembly
it's somewhat more readable than x86 assembly
e.g., in x86, both store and load are called movl
translation from MIPS to x86 is trivial; see the translation table in a few slides

Simulating a Stack Machine…
The accumulator is kept in MIPS register $a0
in x86, it's in %eax
The stack is kept in memory
The stack grows towards lower addresses
standard convention on both MIPS and x86
The address of the next location on the stack is kept in MIPS register $sp
The top of the stack is at address $sp + 4
in x86, it's %esp

MIPS Assembly
MIPS architecture:
Prototypical Reduced Instruction Set Computer (RISC) architecture
Arithmetic operations use registers for operands and results
Must use load and store instructions to use operands and results in memory
32 general-purpose registers (32 bits each)
We will use $sp, $a0, and $t1 (a temporary register)

A Sample of MIPS Instructions
lw reg1 offset(reg2)     load 32-bit word from address reg2 + offset into reg1
add reg1 reg2 reg3       reg1 ← reg2 + reg3
sw reg1 offset(reg2)     store 32-bit word in reg1 at address reg2 + offset
addiu reg1 reg2 imm      reg1 ← reg2 + imm ("u" means overflow is not checked)
li reg imm               reg ← imm

x86 Assembly
x86 architecture:
Complex Instruction Set Computer (CISC) architecture
Arithmetic operations can use both registers and memory for operands and results
So, you don't have to use separate load and store instructions to operate on values in memory
CISC gives us more freedom in selecting instructions (hence, more powerful optimizations)
but we'll use a simple RISC-like subset of x86, so translation from MIPS to x86 will be easy

x86 assembly
x86 has two-operand instructions:
ex.: ADD dest, src means dest := dest + src
in MIPS: dest := src1 + src2
An annoying fact to remember :-)
different x86 assembly versions exist
one important difference: the order of operands
the manuals assume ADD dest, src
the gcc assembler we'll use takes the opposite order: ADD src, dest

Sample x86 instructions (gcc order of operands)
movl offset(reg2), reg1    load 32-bit word from address reg2 + offset into reg1
add reg2, reg1             reg1 ← reg1 + reg2
movl reg1, offset(reg2)    store 32-bit word in reg1 at address reg2 + offset
add imm, reg1              reg1 ← reg1 + imm (use this for MIPS' addiu)
movl imm, reg              reg ← imm

MIPS to x86 translation
MIPS                     x86
lw reg1 offset(reg2)     movl offset(reg2), reg1
add reg1 reg1 reg2       add reg2, reg1
sw reg1 offset(reg2)     movl reg1, offset(reg2)
addiu reg1 reg1 imm      add imm, reg1
li reg imm               movl imm, reg

x86 vs. MIPS registers
MIPS    x86
$a0     %eax
$sp     %esp
$fp     %ebp
$t      %ebx

MIPS Assembly. Example
The stack-machine code for 7 + 5 in MIPS:
acc ← 7                     li $a0 7
push acc                    sw $a0 0($sp)
                            addiu $sp $sp -4
acc ← 5                     li $a0 5
acc ← acc + top_of_stack    lw $t1 4($sp)
                            add $a0 $a0 $t1
pop                         addiu $sp $sp 4
We now generalize this to a simple language…

Some Useful Macros
We define the following abbreviations:
push $a0     sw $a0 0($sp)
             addiu $sp $sp -4
pop          addiu $sp $sp 4
$t ← top     lw $t 4($sp)

A Small Language
A language with integers and integer operations:
P → D; P | D
D → def id(ARGS) = E;
ARGS → id, ARGS | id
E → int | id
  | if E1 = E2 then E3 else E4
  | E1 + E2
  | E1 - E2
  | id(E1,…,En)

A Small Language (Cont.)
The first function definition f is the "main" routine
Running the program on input i means computing f(i)
Program for computing the Fibonacci numbers:
def fib(x) = if x = 1 then 0
             else if x = 2 then 1
             else fib(x - 1) + fib(x - 2)

Code Generation Strategy
For each expression e we generate MIPS code that:
computes the value of e in $a0
preserves $sp and the contents of the stack
We define a code generation function cgen(e) whose result is the code generated for e

Code Generation for Constants
The code to evaluate a constant simply copies it into the accumulator:
cgen(i) = li $a0 i
Note that this also preserves the stack, as required

Code Generation for Add
cgen(e1 + e2) =
  cgen(e1)
  push $a0
  cgen(e2)
  $t1 ← top
  add $a0 $t1 $a0
  pop
Possible optimization: put the result of e1 directly in register $t1?

Code Generation for Add. Wrong!
Optimization: put the result of e1 directly in $t1?
cgen(e1 + e2) =
  cgen(e1)
  move $t1 $a0
  cgen(e2)
  add $a0 $t1 $a0
Try to generate code for: 3 + (7 + 5)
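Working the suggested exercise (this trace is my addition, not from the original slides), the broken scheme produces the following for 3 + (7 + 5):
  li $a0 3           # cgen(3): acc ← 3
  move $t1 $a0       # outer +: save 3 in $t1
  li $a0 7           # cgen(7): acc ← 7
  move $t1 $a0       # inner +: save 7 in $t1, clobbering the saved 3!
  li $a0 5           # cgen(5): acc ← 5
  add $a0 $t1 $a0    # inner +: acc ← 7 + 5 = 12
  add $a0 $t1 $a0    # outer +: acc ← 7 + 12 = 19, not 15
A single register cannot hold both pending operands of nested additions; a stack (or a per-temporary AR slot, as later in this lecture) is needed.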

Code Generation Notes
The code for + is a template with "holes" for the code that evaluates e1 and e2
Stack-machine code generation is recursive:
the code for e1 + e2 consists of the code for e1 and e2 glued together
Code generation can be written as a recursive descent of the AST
at least for expressions

Code Generation for Sub and Constants
New instruction: sub reg1 reg2 reg3
implements reg1 ← reg2 - reg3
cgen(e1 - e2) =
  cgen(e1)
  push $a0
  cgen(e2)
  $t1 ← top
  sub $a0 $t1 $a0
  pop

Code Generation for Conditional
We need flow-control instructions
New instruction: beq reg1 reg2 label
branch to label if reg1 = reg2
x86: cmpl reg1 reg2 followed by je label
New instruction: b label
unconditional jump to label
x86: jmp label

Code Generation for If (Cont.)
cgen(if e1 = e2 then e3 else e4) =
  cgen(e1)
  push $a0
  cgen(e2)
  $t1 ← top
  pop
  beq $a0 $t1 true_branch
false_branch:
  cgen(e4)
  b end_if
true_branch:
  cgen(e3)
end_if:
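To make the template concrete, here is its output for the constant expression if 1 = 2 then 3 else 4 (a worked expansion I added, with the push, pop, and $t1 ← top macros expanded; a real generator must make the labels fresh for each if):
  li $a0 1                   # cgen(1)
  sw $a0 0($sp)              # push $a0
  addiu $sp $sp -4
  li $a0 2                   # cgen(2)
  lw $t1 4($sp)              # $t1 ← top
  addiu $sp $sp 4            # pop
  beq $a0 $t1 true_branch    # 1 = 2 is false, so fall through
false_branch:
  li $a0 4                   # cgen(4)
  b end_if
true_branch:
  li $a0 3                   # cgen(3)
end_if:                      # $a0 holds 4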

The Activation Record
Code for function calls and function definitions depends on the layout of the activation record
A very simple AR suffices for this language:
The result is always in the accumulator, so there is no need to store the result in the AR
The activation record holds the actual parameters:
for f(x1,…,xn), push xn,…,x1 on the stack
These are the only variables in this language

The Activation Record (Cont.)
The stack discipline guarantees that on function exit $sp is the same as it was on function entry
No need to save $sp
We need the return address
It's handy to have a pointer to the start of the current activation
This pointer lives in register $fp (frame pointer)
The reason for the frame pointer will become clear shortly

The Activation Record
Summary: For this language, an AR with the caller's frame pointer, the actual parameters, and the return address suffices
Consider a call to f(x,y). The AR of f will be (the stack grows downward):
  old FP    (highest address)
  y
  x
  ← SP, FP

Code Generation for Function Call
The calling sequence is the instructions (of both caller and callee) that set up a function invocation
New instruction: jal label
jump to label, save the address of the next instruction in $ra
x86: the return address is stored on the stack by the call label instruction

Code Generation for Function Call (Cont.)
cgen(f(e1,…,en)) =
  push $fp
  cgen(en)
  push $a0
  …
  cgen(e1)
  push $a0
  jal f_entry
The caller saves its value of the frame pointer
Then it saves the actual parameters in reverse order
The caller saves the return address in register $ra (via jal)
The AR so far is 4*n+4 bytes long
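As a concrete instance (my expansion, for a call to a hypothetical function sum), cgen(sum(3, 4)) is:
  sw $fp 0($sp)       # push $fp: save the caller's frame pointer
  addiu $sp $sp -4
  li $a0 4            # cgen(4): the second actual
  sw $a0 0($sp)       # push $a0
  addiu $sp $sp -4
  li $a0 3            # cgen(3): the first actual
  sw $a0 0($sp)       # push $a0
  addiu $sp $sp -4
  jal sum_entry       # $ra ← return address, jump to sum's code
The AR built so far is 4*2+4 = 12 bytes: the saved $fp, then 4, then 3 on top.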

Code Generation for Function Definition
New instruction: jr reg
jump to the address in register reg
cgen(def f(x1,…,xn) = e) =
f_entry:
  move $fp $sp
  push $ra
  cgen(e)
  $ra ← top
  addiu $sp $sp z     (z = 4*n + 8)
  lw $fp 0($sp)
  jr $ra
Note: The frame pointer points to the top, not the bottom, of the frame
The callee pops the return address, the actual arguments, and the saved value of the frame pointer

Calling Sequence. Example for f(x,y)
(Diagram with four stack snapshots:
Before call: SP and FP as in the caller.
On entry: old FP, y, x have been pushed; FP = SP, just below x.
In body: RA has been pushed below the arguments; SP is below RA.
After call: SP and FP are back to their values from before the call.)

Code Generation for Variables
Variable references are the last construct
The "variables" of a function are just its parameters:
they are all in the AR
pushed by the caller
Problem: Because the stack grows when intermediate results are saved, the variables are not at a fixed offset from $sp

Code Generation for Variables (Cont.)
Solution: use a frame pointer
It always points to the return address on the stack
Since it does not move, it can be used to find the variables
Let xi be the i-th (i = 1,…,n) formal parameter of the function for which code is being generated

Code Generation for Variables (Cont.)
Example: For a function def f(x1,x2) = e, the activation record and frame pointer are set up as follows (highest address first): old FP, x2, x1, RA ← FP, … ← SP
x1 is at fp + 4
x2 is at fp + 8
Thus: cgen(xi) = lw $a0 z($fp)   (z = 4*i)
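Putting the call and definition pieces together (a worked example I added, for the hypothetical function sum used above): the complete code for def sum(x1,x2) = x1 + x2 is:
sum_entry:
  move $fp $sp        # FP now points at the slot that will hold RA
  sw $ra 0($sp)       # push $ra
  addiu $sp $sp -4
  lw $a0 4($fp)       # cgen(x1): x1 is at fp + 4
  sw $a0 0($sp)       # push $a0
  addiu $sp $sp -4
  lw $a0 8($fp)       # cgen(x2): x2 is at fp + 8
  lw $t1 4($sp)       # $t1 ← top (the saved x1)
  add $a0 $t1 $a0     # $a0 ← x1 + x2
  addiu $sp $sp 4     # pop
  lw $ra 4($sp)       # $ra ← top (the return address)
  addiu $sp $sp 16    # pop RA, x1, x2, and old FP: z = 4*2 + 8 = 16
  lw $fp 0($sp)       # restore the caller's frame pointer
  jr $ra
Combined with the call sequence for sum(3, 4) shown earlier, this leaves 7 in $a0 with $sp and $fp exactly as they were before the call.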

Summary
The activation record must be designed together with the code generator
Code generation can be done by recursive traversal of the AST
We recommend you use a stack machine for your Decaf compiler (it's simple)

Summary (Cont.)
See the PA4 starter kit for a large code generation example
Production compilers do things differently:
emphasis is on keeping values (especially the current stack frame) in registers
intermediate results are laid out in the AR, not pushed and popped from the stack

Allocating Temporaries in the AR

Review
The stack machine has activation records and intermediate results interleaved on the stack
(Diagram: the stack alternates AR, intermediates, AR, intermediates, …)

Review (Cont.)
Advantage: Very simple code generation
Disadvantage: Very slow code
Storing/loading temporaries requires a store/load and a $sp adjustment

A Better Way
Idea: Keep temporaries in the AR
The code generator must assign a location in the AR for each temporary

Example
def fib(x) = if x = 1 then 0
             else if x = 2 then 1
             else fib(x - 1) + fib(x - 2)
What intermediate values are placed on the stack?
How many slots are needed in the AR to hold these values?

How Many Temporaries?
Let NT(e) = the number of temporaries needed to evaluate e
NT(e1 + e2):
needs at least as many temporaries as NT(e1)
needs at least as many temporaries as NT(e2) + 1
The space used for the temporaries of e1 can be reused for the temporaries of e2

The Equations
NT(e1 + e2) = max(NT(e1), 1 + NT(e2))
NT(e1 - e2) = max(NT(e1), 1 + NT(e2))
NT(if e1 = e2 then e3 else e4) = max(NT(e1), 1 + NT(e2), NT(e3), NT(e4))
NT(id(e1,…,en)) = max(NT(e1),…,NT(en))
NT(int) = 0
NT(id) = 0
Is this bottom-up or top-down?
What is NT(…code for fib…)?
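Working the slide's two questions (my answers, not on the original slide): the computation is bottom-up, since NT of an expression depends only on the NT values of its subexpressions. For the body of fib:
NT(x) = NT(0) = NT(1) = NT(2) = 0
NT(x - 1) = NT(x - 2) = max(0, 1 + 0) = 1
NT(fib(x - 1)) = NT(fib(x - 2)) = 1
NT(fib(x - 1) + fib(x - 2)) = max(1, 1 + 1) = 2
NT(if x = 2 then 1 else …) = max(0, 1 + 0, 0, 2) = 2
NT(if x = 1 then 0 else …) = max(0, 1 + 0, 0, 2) = 2
So fib needs 2 slots for temporaries in its AR.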

The Revised AR
For a function definition f(x1,…,xn) = e, the AR has 2 + n + NT(e) elements:
return address
frame pointer
n arguments
NT(e) locations for intermediate results

Picture
(Diagram of the revised AR, highest address first: Old FP, xn, …, x1, RA ← FP, Temp 1, …, Temp NT(e) ← SP)

Revised Code Generation
Code generation must know how many temporaries are in use at each point
Add a new argument to code generation: the position of the next available temporary

Code Generation for + (original)
cgen(e1 + e2) =
  cgen(e1)
  sw $a0 0($sp)
  addiu $sp $sp -4
  cgen(e2)
  lw $t1 4($sp)
  add $a0 $t1 $a0
  addiu $sp $sp 4

Code Generation for + (revised)
cgen(e1 + e2, nt) =
  cgen(e1, nt)
  sw $a0 -nt($fp)
  cgen(e2, nt + 4)
  lw $t1 -nt($fp)
  add $a0 $t1 $a0
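As a worked instance (my expansion, assuming the first temporary sits at -4($fp), i.e. the initial nt is 4), cgen(3 + (7 + 5), 4) yields:
  li $a0 3
  sw $a0 -4($fp)      # save 3 in Temp 1 (nt = 4); no $sp adjustment
  li $a0 7
  sw $a0 -8($fp)      # save 7 in Temp 2 (nt = 8)
  li $a0 5
  lw $t1 -8($fp)      # reload 7
  add $a0 $t1 $a0     # $a0 ← 7 + 5 = 12
  lw $t1 -4($fp)      # reload 3
  add $a0 $t1 $a0     # $a0 ← 3 + 12 = 15
Compared with the original scheme, the four addiu $sp adjustments are gone, and each temporary is a single sw/lw pair at a fixed frame-pointer offset.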

Notes
The temporary area is used like a small, fixed-size stack
Exercise: Write out cgen for the other constructs

Code Generation for Object-Oriented Languages

Object Layout
OO implementation = stuff from the last lecture + more stuff
OO slogan: If B is a subclass of A, then an object of class B can be used wherever an object of class A is expected
This means that code in class A works unmodified for an object of class B

Two Issues
How are objects represented in memory?
How is dynamic dispatch implemented?

Object Layout Example
class A {
  int a = 0;
  int d = 1;
  int f() { return a = a + d }
}
class B extends A {
  int b = 2;
  int f() { return a }          // override
  int g() { return a = a - b }
}
class C extends A {
  int c = 3;
  int h() { return a = a * c }
}

Object Layout (Cont.)
Attributes a and d are inherited by classes B and C
All methods in all classes refer to a
For A's methods to work correctly in A, B, and C objects, attribute a must be in the same "place" in each object

Object Layout (Cont.)
An object is like a struct in C. The reference foo.field is an index into a foo struct at an offset corresponding to field
Objects are laid out in contiguous memory
Each attribute is stored at a fixed offset in the object
When a method is invoked, the object is this and the fields are the object's attributes

Object Layout
The first 3 words of an object contain header information:
Offset   Contents
0        Class Tag
4        Object Size
8        Dispatch Ptr
12       Attribute 1
16       Attribute 2
…        …

Object Layout (Cont.)
Class tag is an integer
identifies the class of the object
Object size is an integer
size of the object in words
Dispatch ptr is a pointer to a table of methods
more later
Attributes are in the subsequent slots
laid out in contiguous memory

Subclasses
Observation: Given a layout for class A, a layout for subclass B can be defined by extending the layout of A with additional slots for the additional attributes of B
This leaves the layout of A unchanged (B is an extension)

Layout Picture
Class    0      4    8    12   16   20
A        Atag   5    *    a    d
B        Btag   6    *    a    d    b
C        Ctag   6    *    a    d    c
(* marks the dispatch pointer)

Subclasses (Cont.)
The offset for an attribute is the same in a class and all of its subclasses
Any method for an A1 can be used on a subclass A2
Consider the layout for An < … < A3 < A2 < A1:
(Diagram: the header, then A1's attributes, then A2's additional attributes, then A3's, …; the A1 object is a prefix of the A2 object, which is a prefix of the A3 object, and so on.)
What about multiple inheritance?
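Since offsets are stable across subclasses, a compiled attribute access is a single fixed-offset load (a sketch I added, assuming a pointer to an A, B, or C object from the earlier example is in $a0):
  lw $t1 12($a0)    # load attribute a: offset 12 in A, B, and C objects alike
  lw $t2 16($a0)    # load attribute d: offset 16 in all three layouts
The same two instructions work no matter which of the three classes the object actually has.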

Dynamic Dispatch
Consider again our example:
class A {
  int a = 0;
  int d = 1;
  int f() { return a = a + d }
}
class B extends A {
  int b = 2;
  int f() { return a }
  int g() { return a = a - b }
}
class C extends A {
  int c = 3;
  int h() { return a = a * c }
}

Dynamic Dispatch Example
e.g(): g refers to the method in B if e is a B
e.f(): f refers to the method in A if e is an A or a C (inherited in the case of C)
       f refers to the method in B for a B object
The implementation of methods and dynamic dispatch strongly resembles the implementation of attributes

Dispatch Tables
Every class has a fixed set of methods (including inherited methods)
A dispatch table indexes these methods:
an array of method entry points
A method f lives at a fixed offset in the dispatch table for a class and all of its subclasses

Dispatch Table Example
Class    0    4
A        fA
B        fB   g
C        fA   h
The dispatch table for class A has only 1 method
The tables for B and C extend the table for A to the right
Because methods can be overridden, the method for f is not the same in every class, but it is always at the same offset

Using Dispatch Tables
The dispatch pointer in an object of class X points to the dispatch table for class X
Every method f of class X is assigned an offset Of in the dispatch table at compile time

Using Dispatch Tables (Cont.)
Every method must know what object is "this"
"this" is passed as the first argument to all methods
To implement a dynamic dispatch e.f() we:
evaluate e, obtaining an object x
find D by reading the dispatch-table field of x
call D[Of](x)
D is the dispatch table for x
in the call, this is bound to x
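In the MIPS scheme of this lecture, the dispatch itself compiles to a fixed-offset load and an indirect call (a sketch I added; it assumes the object x is in $a0, the dispatch pointer is at offset 8 as in the object layout slide, and f's compile-time offset Of is 0):
  # $a0 holds x, the result of cgen(e); x is also passed along as "this"
  lw $t1 8($a0)    # $t1 ← D, the dispatch table of x
  lw $t1 0($t1)    # $t1 ← D[Of], the entry point of f (here Of = 0)
  jalr $t1         # call f; the return address goes into $ra
Nothing here depends on the static class of e: the dynamic class of x determines which table, and hence which f, is reached.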