Object Model Baojian Hua
What's an Object Model? An object model dictates how an object is represented in memory. A good object model can: make the representation simpler; maximize the efficiency of frequent language operations; minimize storage overhead.
Object Model in Java Two sorts of value formats in Java: Primitive: int, float, etc. Reference: pointers to objects, which are either arrays (with elements) or scalars (with fields).
Object Header An object header may contain: virtual method pointers; hash code; type information pointers; lock; garbage collection information; array length; misc fields, such as profiling information. We sketch a few of them next.
Next Sketch the Java object model step by step. Roadmap: static and dynamic data; static and dynamic methods; inheritance. Give some insights into the general principles. I write Java syntax in Meta Language (ML).
Java#1: Static Data Fields First look at a Java subset with only static data fields. Abstract syntax: // example class C { static int i; static int j; } datatype class = T of {name : string, staticVar : var list}
Two Steps Name mangling: avoid name-space conflicts. Consider: class C1 {static int i;} class C2 {static int i;} Lifting: lift the variables out of the class body, just like C global variables.
To C We write a translation function to show how to compile Java#1 to C: fun addPrefix name x = name ^ "_" ^ x fun trans (Class {name, vars}) = let fun go [] = [] | go (x :: xs) = addPrefix name x :: go xs in go vars end (* or the following, if you're really lazy :-P *) fun trans (Class {name, vars}) = List.map (addPrefix name) vars // an example: class C { static int i, j; } // would be translated into: int C_i; int C_j;
To Pentium Translating to Pentium is not hard either; you only need to change the definition of "prefix". // example code from above again trans (Class {static int i; static int j}) = .section .data .C_i: .int 0 .C_j: .int 0
Member Data Access Every access to a static data member should be systematically rewritten to use the newly created name.
Example class C {static int i;} class Main { public static void main (String [] args) { C.i = 99; } public static void foo () { C.i = 88; } } // the two accesses are translated into: C_i = 99; C_i = 88;
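The two steps above (name mangling, then lifting) can be sketched in C as follows. This is only an illustration: the driver function static_fields_demo and the second class C2 are hypothetical and stand in for arbitrary Java code that touches the static fields.

```c
/* Java source (for reference):
 *   class C1 { static int i; }
 *   class C2 { static int i; }
 */

/* After name mangling and lifting, each static field becomes a C global.
 * The class-name prefix keeps C1.i and C2.i apart. */
int C1_i;
int C2_i;

/* Hypothetical driver, standing in for Java code that does:
 *   C1.i = 99; C2.i = 88; return C1.i + C2.i;              */
int static_fields_demo(void)
{
    C1_i = 99;
    C2_i = 88;
    return C1_i + C2_i;
}
```

Because the two globals have distinct mangled names, writing one never disturbs the other, which is exactly the name-space conflict that mangling is meant to avoid.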
Problem These static fields could also be accessed through objects: C c = new C (); c.i = 99; Should this assignment also be turned into C_i = 99; ?
Java#2: Static Methods (*Abstract syntax: *) datatype class = T of {name : string, staticData : var list, staticMethod : func list } datatype func = T of {name : string, args : …, locals : …, stm : …} // example class C { static int i, j; static int f (int a, int b) {…} static int g (int x) {…} }
Java#2: Static Methods The translation of static methods is very much like that of static data fields. Let's try our previous steps: name mangling, then lifting.
Translation to C fun trans (Class {name, funs, …}) = transFunc (name, funs) and transFunc (name, funs) = List.map addPrefix funs // example code class C {static int foo (int i) {…}} // is translated into: int C_foo (int i) {…} Translating to Pentium code is simple and left to you. But later we may find that this naïve translation scheme does NOT work for dynamic methods…
Static Method Reference class C {public static int foo (int i){…}} class Main { public static void main (String [] args) { C.foo (99); } } // the call is translated into: C_foo (99);
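A minimal C sketch of lifted static methods is given below. The slides elide the method bodies, so the bodies here are invented purely for illustration, as are the second class C2 and the driver Main_main.

```c
/* Java source (for reference; bodies are hypothetical):
 *   class C1 { static int foo (int i) { return i + 1; } }
 *   class C2 { static int foo (int i) { return i * 2; } }
 */

/* Each static method is lifted to a top-level C function whose name
 * carries the class name as a prefix. */
int C1_foo(int i) { return i + 1; }
int C2_foo(int i) { return i * 2; }

/* Stands in for Main.main: C1.foo (99) and C2.foo (99) become direct calls. */
int Main_main(void)
{
    return C1_foo(99) + C2_foo(99);
}
```

Since the callee is fixed at compile time, a static method call costs no more than an ordinary C function call.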
Problem These static methods could also be accessed through objects: C a = new C (); a.foo (99); Should this call be turned into the following? C_foo (99);
Java#3: Instance Variables Abstract Syntax: datatype class = T of { name : string, staticData : var list, dynData : var list, staticMethods : func list } Instance variables are created only when the object is allocated, and destroyed when the object is reclaimed. Every object holds its own copy of the data.
Translation to C fun trans (Class {name, dynData, …}) = struct name {dynData}; // example code class C {int i; int j;} // would be translated into struct C {int i; int j;}; All dynamic data fields in a class are grouped into a C structure of the same name. Translating to Pentium is similar.
Member Data Access fun trans (new C ()) = malloc (sizeof (struct C)); Every object is malloc'ed in memory… Dereferencing an object is NOT different from dereferencing an ordinary pointer; all your C (x86) hacking skills apply…
Example class C {int i;} class Main { public static void main (String [] args) { C a; a = new C (); a.i = 99; } } // translated into: struct C {int i;}; a = malloc (sizeof (struct C)); a -> i = 99;
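The allocation and field access above can be written out as compilable C roughly as follows. The helper C_new and the driver instance_field_demo are hypothetical names, and the explicit free calls merely stand in for Java's garbage collector, which C does not have.

```c
#include <stdlib.h>

/* class C { int i; } : the instance fields become a same-name struct. */
struct C { int i; };

/* Hypothetical helper for "new C ()": one zero-initialised object on the heap. */
struct C *C_new(void)
{
    return calloc(1, sizeof(struct C));
}

/* Two objects each hold their own copy of field i. */
int instance_field_demo(void)
{
    struct C *a = C_new();
    struct C *b = C_new();
    if (a == NULL || b == NULL) {   /* allocation failure */
        free(a);
        free(b);
        return -1;
    }
    a->i = 99;                      /* a.i = 99; */
    b->i = 1;                       /* b.i = 1;  */
    int result = a->i - b->i;
    free(a);                        /* stand-in for garbage collection */
    free(b);
    return result;
}
```

Using calloc rather than malloc mirrors Java's guarantee that fields start out zeroed; that choice is an assumption of this sketch, not something the slides prescribe.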
Java#4: Instance Methods Abstract syntax: datatype class = T of {name : string, staticData : var list, dynData : var list, staticMethods : func list, dynMethods : func list } Instance methods can be invoked only after a concrete object instance has been created…
Java#4: Instance Methods As a first try, we would like to apply our "old" magic: mangling followed by lifting. Then what would the method invocations look like? C c = new C (); c.foo (args); Should it be turned into C_foo (args)?
Access the Instance Data class C{ int i; int set (int i){ this.i = i; return 0; } int get (){ return i; } } // naive translation: struct C {int i;}; int C_set (int i){ this.i = i; return 0; } int C_get (){ return i; } Where do "this" and "i" come from?
"This" To invoke a dynamic method, we also have to tell it which object instance we're using, and pass this object to the method explicitly (recall that an object is a pointer to a malloc-ed structure…). In Java, passing the object itself to a dynamic method is the job of the "this" pointer.
Access the Instance Data --- Revisit class C{ int i; int set (C this, int i){ this.i = i; return 0; } int get (C this){ return this.i; } } // translated into: struct C {int i;}; int C_set (struct C * this, int i){ this -> i = i; return 0; } int C_get (struct C * this){ return this -> i; } "this" turns into a pointer…
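Here is a small compilable version of the translation with the explicit "this" parameter. The driver this_demo is hypothetical; note also that this is an ordinary identifier in C, although it is a keyword in C++.

```c
/* class C { int i; int set (int i) {...} int get () {...} } */
struct C { int i; };

/* Every instance method receives the receiver object as an extra
 * first parameter. */
int C_set(struct C *this, int i)
{
    this->i = i;
    return 0;
}

int C_get(struct C *this)
{
    return this->i;
}

/* Hypothetical driver: the same two functions serve two different objects;
 * only the "this" argument tells them apart. */
int this_demo(void)
{
    struct C c1 = {0};
    struct C c2 = {0};
    C_set(&c1, 7);      /* c1.set (7);  */
    C_set(&c2, 35);     /* c2.set (35); */
    return C_get(&c1) + C_get(&c2);
}
```

The objects live on the stack here only to keep the sketch short; in the translation scheme of the slides they would be heap-allocated.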
Java#5: Class Inheritance Java allows you to write a tree-like class hierarchy. An object may have both: a static "type" ---the declared type, and a dynamic "type" ---the creation type. Class inheritance makes things messy.
Inheritance Syntax Abstract syntax: datatype class = T of { name : string, staticData : var list, dynData : var list, staticMethods : func list, dynMethods : func list, extends : class option }
Example // class hierarchy class C1 { int i; void f () {…} } class C2 extends C1 { int j; void f () {…} void g () {…} } // and the "Main" class class Main{ static void main (…) { …; test (a); } void test (C1 a){ a.f (); } }
Example Continued Notice that object a has static "type" C1 in the declaration: void test (C1 a) {…}; But the actual argument arg may have a subclass as its real type---its dynamic type. Consider: C2 arg = new C2 (); test (arg); It's well-typed, so what would a.f () be at run time: C1_f () or C2_f ()?
Dynamic Method Dispatch In general, we can NOT statically predict the dynamic type of the incoming argument object. So we have to record this information in the object itself, and do method dispatch at run time, at the invocation points.
Virtual Method Table (VMT) We can use a virtual method table. An object is composed of a virtual method table pointer and data fields (just the C-style structure we've seen). The VMT pointer is at offset zero, followed by the other data fields. The VMT itself may be shared by all objects of the same type to reduce space overhead.
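VMT in Graph [Figure: a C1 object (VMT pointer, i) points to C1's VMT (C1_f); a C2 object (VMT pointer, i, j) points to C2's VMT (C2_f, C2_g).]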
Method Invocation fun trans (a.f (…)) = x = lookup (*a, f); x (a, …); First, we fetch the VMT pointer through the object pointer. Then we look up the callee method; note that its position in the table is statically known. Finally, we call through this method pointer, with the additional "this" argument passed as before.
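The dispatch scheme can be sketched in C for the C1/C2 hierarchy from the earlier example. Several things here are assumptions made for illustration: f returns an int (1 for C1_f, 2 for C2_f) so that the dispatched method can be observed, the body of C2_g is invented, and the subclass embeds its superclass as its first member, which is one standard-conforming way to obtain the "VMT pointer at offset zero, shared field prefix" layout.

```c
struct C1;                                   /* forward declaration */

/* VMT layout for C1: one slot, f. */
struct C1_vtable { int (*f)(struct C1 *this); };

/* Object layout for C1: VMT pointer at offset zero, then the fields. */
struct C1 { const struct C1_vtable *vptr; int i; };

/* C2 extends C1: C1's layout comes first, new members are appended. */
struct C2 { struct C1 base; int j; };
struct C2_vtable { struct C1_vtable base; int (*g)(struct C2 *this); };

/* Hypothetical method bodies; the slides elide them. */
int C1_f(struct C1 *this) { (void)this; return 1; }
int C2_f(struct C1 *this) { (void)this; return 2; }
int C2_g(struct C2 *this) { return this->j; }

/* One VMT per class, shared by all objects of that class. */
const struct C1_vtable C1_vmt = { C1_f };
const struct C2_vtable C2_vmt = { { C2_f }, C2_g };

/* void test (C1 a) { a.f (); }
 * Fetch the VMT through the object, read the slot for f, and call it
 * with the object itself as "this". */
int Main_test(struct C1 *a)
{
    return a->vptr->f(a);
}

int dispatch_demo(void)
{
    struct C1 x = { &C1_vmt, 0 };
    struct C2 y = { { &C2_vmt.base, 0 }, 5 };
    /* Same call site, different dynamic types. */
    return Main_test(&x) * 10 + Main_test(&y.base);
}
```

The slot for f sits at the same position in both tables, so Main_test needs no knowledge of the dynamic type: the offset is fixed at compile time and only the table contents differ.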
Java#6: Interface Consider an interface: I = {foo (); bar ();} Any object of a class C that implements methods named foo and bar can be treated as if it has interface type I. Can we use C's vtable? No. In general, C may have defined methods before, between, or after foo and bar, or may have defined them in a different order. So to support interfaces, we need a level of indirection…
Interface [Figure: a wrapper object holds a vtable pointer and a pointer to the actual object. The wrapper's vtable pointer refers to the shared vtable for the interface (address of method 1, address of method 2, …, address of method n). The actual object holds its own vtable pointer followed by instance variables 1 … m.]
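One possible reading of the wrapper-object scheme, sketched in C. All names (struct I, C_to_I, use_I, the adapter functions) and the method bodies are hypothetical, and the class's own vtable pointer is left out to keep the example short.

```c
/* A class C that happens to implement foo and bar, declared in a
 * different order than in the interface. Bodies are hypothetical. */
struct C { int x; };
int C_bar(struct C *this) { return this->x + 1; }
int C_foo(struct C *this) { return this->x * 2; }

/* Shared vtable for interface I = {foo (); bar ();}: the slot order is
 * fixed by the interface, independent of any implementing class. */
struct I_vtable {
    int (*foo)(void *obj);
    int (*bar)(void *obj);
};

/* Wrapper object: interface vtable pointer plus the actual object. */
struct I {
    const struct I_vtable *vptr;
    void *obj;
};

/* Adapters from the untyped interface slots to C's methods. */
int C_as_I_foo(void *obj) { return C_foo(obj); }
int C_as_I_bar(void *obj) { return C_bar(obj); }

/* One table per (class, interface) pair, shared by all wrappers. */
const struct I_vtable C_as_I = { C_as_I_foo, C_as_I_bar };

/* Treating a C object as an I value means wrapping it. */
struct I C_to_I(struct C *c)
{
    struct I w = { &C_as_I, c };
    return w;
}

/* Code written against I only touches the wrapper's table. */
int use_I(struct I i)
{
    return i.vptr->foo(i.obj) + i.vptr->bar(i.obj);
}

int interface_demo(void)
{
    struct C c = { 10 };
    return use_I(C_to_I(&c));
}
```

The extra level of indirection is the price of letting unrelated classes, each with its own method order, be used through the same interface type.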
Backup Slides A Toy Mini-Java Compiler ---Putting It All Together
The Phases The toy Mini-Java compiler is organized into several separate passes: Front-end issues: lexing, parsing, type-checking etc. Name-mangling: make the class property and "this" pointer explicit. De-SubClass: eliminate "sub-class". De-Class: eliminate "class", and output C code. These passes are illustrated with a running example.
A Sample Program class A{ int i; int j; int k; int f (int x, int y){ …} int g (int a){ …} } class B extends A{ int i; int e; int k; int f (int x, int y){ …} }
After Front-end Processing class A{ int i; int j; int k; int f (int x, int y){ …} int g (int a){ …} } class B extends A{ int i; int e; int k; int f (int x, int y){ …} }
After Name Mangling class A{ int A_i; int A_j; int A_k; int A_f (int x, int y){ …} int A_g (int a){ …} } class B extends A{ int B_i; int B_e; int B_k; int B_f (int x, int y){ …} }
After Inserting “ this ” class A{ int A_i; int A_j; int A_k; int A_f (A this, int x, int y){ …} int A_g (A this, int a){ …} } class B extends A{ int B_i; int B_e; int B_k; int B_f (B this, int x, int y){ …} }
After De-SubClass class A{ int A_i; int A_j; int A_k; int A_f (A this, int x, int y){ …} int A_g (A this, int a){ …} } class B{ int B_i; int A_j; int B_k; int B_e; int B_f (B this, int x, int y){ …} int A_g (A this, int a){ …} } // Note the order
What Happened? Instance data fields unification: relative order is important. Insert an extra "this" argument: as the first argument? Methods name mangling. You may or may not want to copy the actual method body; if you don't, you just have to construct the vtable here… After these steps, all classes are closed.
After De-Class (class A) struct Data_A{ struct Vtable_A * vptr; int i; int j; int k; }; struct Vtable_A{ int (*f) (); /* =A_f */ int (*g) (); /* =A_g */ }; int A_f (struct Data_A * this, int x, int y){ … } int A_g (struct Data_A * this, int a){ … }
After De-Class (class B) struct Data_B{ struct Vtable_B * vptr; int i; int j; int k; int e; }; struct Vtable_B{ int (*f) (); /* =B_f */ int (*g) (); /* =A_g */ }; int B_f (struct Data_B * this, int x, int y){ … } int A_g (struct Data_B * this, int a){ … }
What Happened? Object layout selection: the vptr is at offset 0. Vtable layout selection: the vtable fields should be initialized with the corresponding method pointers. Methods lifting: methods go to top level.
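To make the three points concrete, here is a hedged C sketch of the De-Class output for the running example, including vtable initialization and object creation. It departs from the slides in a few labeled ways: Data_B embeds Data_A as its first member (so that an upcast is a plain pointer to that member), both classes share one vtable type because B adds no new methods, the method bodies are invented, and B_new and declass_demo are hypothetical names.

```c
#include <stdlib.h>

struct Data_A;

/* Vtable layout: slot 0 is f, slot 1 is g, for A and for B alike. */
struct Vtable_A {
    int (*f)(struct Data_A *this, int x, int y);
    int (*g)(struct Data_A *this, int a);
};

/* Object layout: vptr at offset 0, then the fields in a fixed order. */
struct Data_A { const struct Vtable_A *vptr; int i; int j; int k; };

/* B keeps A's layout as a prefix and appends its new field e. */
struct Data_B { struct Data_A super; int e; };

/* Lifted methods. The bodies are hypothetical; the slides elide them. */
int A_f(struct Data_A *this, int x, int y) { return this->i + x + y; }
int A_g(struct Data_A *this, int a)        { return this->j + a; }
int B_f(struct Data_A *this, int x, int y)
{
    /* Only ever installed in B's vtable, so "this" is the first member
     * of a Data_B and the downcast is valid. */
    struct Data_B *self = (struct Data_B *)this;
    return self->e + x * y;
}

/* Vtable initialization: B overrides f and inherits g. */
const struct Vtable_A Vtable_A_instance = { A_f, A_g };
const struct Vtable_A Vtable_B_instance = { B_f, A_g };

/* Hypothetical "new B ()": allocate, zero the fields, install the vptr. */
struct Data_B *B_new(void)
{
    struct Data_B *o = calloc(1, sizeof *o);
    if (o != NULL)
        o->super.vptr = &Vtable_B_instance;
    return o;
}

int declass_demo(void)
{
    struct Data_B *b = B_new();
    if (b == NULL)
        return -1;
    b->super.j = 4;
    b->e = 100;
    struct Data_A *a = &b->super;       /* upcast: A a = b; */
    int r = a->vptr->f(a, 2, 3)         /* dispatches to B_f */
          + a->vptr->g(a, 1);           /* dispatches to inherited A_g */
    free(b);                            /* stand-in for garbage collection */
    return r;
}
```

Sharing A_g through B's vtable corresponds to the "do not copy the method body" option mentioned on the De-SubClass slide.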