Lecture 5: Inheritance, Polymorphism, Object Memory Model, The Visitor Pattern
...and a gazillion other tidbits!
Process Memory Layout

Each process consists of three distinct areas:
- The Heap: stores values allocated using the new operator
- The Stack: stores values in activation frames
- The text section (code segment): stores the code of all the classes used in the program executed by the process

Where does an object (an instance of a class) live?
- Variables created using the keyword new are stored on the heap
  - The allocation is done at run time, while the code is being executed!
- Local variables created without using new are stored on the stack
  - Their layout is decided at compile time; the memory itself is reserved at run time, when the activation frame is pushed
  - The compiler provides an offset from the stack pointer (SP) for each variable
  - The compiler provides a contiguous block of memory for each variable
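A minimal sketch of the two kinds of storage described above. The function name sumStackAndHeap is made up for illustration and is not part of the lecture code.

```cpp
// Hypothetical helper illustrating stack vs. heap storage.
int sumStackAndHeap() {
    int onStack = 40;            // local variable: lives in this call's activation frame
    int* onHeap = new int(2);    // the int itself is allocated on the heap at run time
    int sum = onStack + *onHeap;
    delete onHeap;               // heap memory must be released explicitly
    return sum;                  // onStack disappears when the frame is popped
}
```

Note that the pointer onHeap is itself a local variable on the stack; only the int it points to lives on the heap.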
Object Memory Model

An object is an instance of a class that contains:
- State: data members
  - static data members are allocated once and shared among all class instances
  - non-static data members are not shared; each object has its own copy
- Methods: member functions
  - Methods are shared amongst all instances of the same class
  - Their code is stored once in memory for each class
  - They operate on the data members of the object they are called on
At runtime an object is implemented as a region of storage in memory
- A contiguous block of memory
- An object is basically a struct with methods
Object Memory Layout

C++ has no standard way of storing variables in memory!
- Each compiler may store data differently
- The same struct may have different sizes under different compilers
g++ 7.3.0 basic storage methodology: a variable of a fundamental type can only reside at a memory address divisible by its size
- chars can be stored at any memory address
- shorts reside at memory addresses divisible by two
- ints/floats reside at memory addresses divisible by four
- doubles reside at memory addresses divisible by eight
- And so on.
The memory layout example at the website uses a different compiler
- It stores variables at addresses divisible by the word size, regardless of their type
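The alignment rule above can be checked directly. This is a small sketch, assuming a C++11 compiler; isAligned and fundamentalTypesAreAligned are hypothetical helper names, and the actual values of alignof are implementation-defined.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uintptr_t

// True when the address p is a multiple of `alignment`.
inline bool isAligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Checks that ordinary local variables sit at addresses that are
// multiples of the alignment the compiler reports for their type.
inline bool fundamentalTypesAreAligned() {
    char c = 'x';
    short s = 1;
    int i = 2;
    double d = 3.0;
    return isAligned(&c, alignof(char)) && isAligned(&s, alignof(short)) &&
           isAligned(&i, alignof(int)) && isAligned(&d, alignof(double));
}
```

Printing alignof(short), alignof(int) and alignof(double) on your own machine shows the divisors your compiler actually uses.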
Object Memory Layout Example – g++ 7.3.0

    #include <iostream>

    struct structA {
        int a;
        float f;
        char c1;
        char c2;
        char d[4];
        char e[4][4];
        short w;
        short y;
        double z;
    };

    int main(){
        structA strct;
        std::cout << "printing out [beginning relative address, size]:" << std::endl;
        // offset = address of the field minus address of the object;
        // sizeof is applied to the member of the instance
        std::cout << "a=[" << (long)(&(strct.a))-(long)(&(strct)) << "," << sizeof(strct.a)
            << "] f=[" << (long)(&(strct.f))-(long)(&(strct)) << "," << sizeof(strct.f)
            << "] c1=[" << (long)(&(strct.c1))-(long)(&(strct)) << "," << sizeof(strct.c1)
            << "] c2=[" << (long)(&(strct.c2))-(long)(&(strct)) << "," << sizeof(strct.c2)
            << "] d=[" << (long)(&(strct.d))-(long)(&(strct)) << "," << sizeof(strct.d)
            << "] e=[" << (long)(&(strct.e))-(long)(&(strct)) << "," << sizeof(strct.e)
            << "] w=[" << (long)(&(strct.w))-(long)(&(strct)) << "," << sizeof(strct.w)
            << "] y=[" << (long)(&(strct.y))-(long)(&(strct)) << "," << sizeof(strct.y)
            << "] z=[" << (long)(&(strct.z))-(long)(&(strct)) << "," << sizeof(strct.z)
            << "]" << std::endl;
    }

Output:
a=[0,4] f=[4,4] c1=[8,1] c2=[9,1] d=[10,4] e=[14,16] w=[30,2] y=[32,2] z=[40,8]
Oddity – struct sizes?

    #include <iostream>

    struct A{
        char c;
        char d;
        int i;
    };

    // Assumed member order: the body of B is missing from the slide. The same
    // three fields as A, reordered like this, are consistent with the output 12.
    struct B{
        char c;
        int i;
        char d;
    };

    int main() {
        std::cout << sizeof(A) << std::endl;
        std::cout << sizeof(B) << std::endl;
    }

Output?
8
12
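Why member order changes the size can be made explicit by counting padding bytes. The sketch below uses two hypothetical structs, Grouped and Split, with the same fields in different orders; the helper paddingBytes is also an assumption for illustration.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct Grouped { char c; char d; int i; };  // the two chars share one aligned block
struct Split   { char c; int i; char d; };  // each char needs padding around the int

// Padding = total size minus the bytes actually used by the three members.
template <typename T>
constexpr std::size_t paddingBytes() {
    return sizeof(T) - (2 * sizeof(char) + sizeof(int));
}
```

On a typical platform with a 4-byte int, paddingBytes<Grouped>() is 2 and paddingBytes<Split>() is 6, which accounts for the sizes 8 and 12; the exact numbers are implementation-defined.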
Object Fields Alignment – Code Example

    struct structA {
        int a;
        float f;
        char c1;
        char c2;
        char d[4];
    };

This compiler, for instance, uses word-sized blocks to store data
- If the data is smaller than the word size, the rest of the block goes unused
- If the data is larger than the word size, more blocks are used as needed
Field Alignment
- The result of this method is easier access to fields
- This is because all fields are aligned at addresses divisible by four
Memory Layout: Inheritance

Let’s say class B extends class A:
- Fields defined in A also exist in B
- In addition, B has new fields that are not found in class A

    class B : public A {
    public:
        double g;
    };
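A small sketch of the claim that the fields of A exist inside every B. The members of A below are an assumption, since the slide only shows B; baseSubobjectInsideDerived is a hypothetical helper.

```cpp
struct A { int a; float f; };        // assumed members, for illustration only
struct B : public A { double g; };

// Checks that the A part of a B object lies entirely inside the
// storage occupied by the B object itself.
inline bool baseSubobjectInsideDerived() {
    B b{};
    const char* whole = reinterpret_cast<const char*>(&b);
    const char* base  = reinterpret_cast<const char*>(static_cast<A*>(&b));
    return base >= whole && base + sizeof(A) <= whole + sizeof(B);
}
```

Because the A subobject is embedded in B, a B* can be handed to any code that expects an A*.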
The Text Section: “Code Segment”

- It contains the code of all the classes used in the program and executed by the process
- A method is stored in the code region allocated to the process in which the class is used
  - It is encoded as a sequence of processor instructions
  - It is known to the compiler by its start address in memory
Method Execution

Executing a method:
- Method lookup:
  - the compiler maintains a table where it keeps track of the address of each method of the class
  - the compiler emits a call to the method of the class
  - the method has access to the internal state of the object, wherever it may be!
- Method invocation:
  - parameters are pushed on the stack
  - the method is invoked using the CALL instruction
  - the operand of CALL is the address of the first instruction of the method
The Implicit this Parameter

How does a method know where the fields of the object are?
- The compiler passes a hidden parameter to the method call
- The parameter contains the address of the object
- The parameter is called this
Example: this->foo() is compiled roughly as foo(this)
Note: static member functions do not receive this; as a result, they have no access to the non-static members of a particular object.
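The hidden parameter can be imitated by hand. In this sketch, Counter and Counter_add are hypothetical names; the free function shows roughly what the compiler does with a member function, not the exact code it generates.

```cpp
struct Counter {
    int value = 0;
    // Member function: the object's address arrives implicitly as `this`.
    void add(int n) { this->value += n; }
};

// Hand-written equivalent: the object's address is an explicit first parameter.
inline void Counter_add(Counter* self, int n) { self->value += n; }
```

Calling c.add(2) and Counter_add(&c, 2) has the same effect on c.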
Static and Dynamic Function Binding

They are also called Early and Late Binding
Binding
- Converting function calls into addresses
- These addresses are jumped to during code execution
Static Binding
- Done by the compiler (and linker), which directly associates the function with a machine address at compile time!
- This is possible since all functions have unique addresses
- The linker replaces a function call with a machine language instruction
- Once executed, this instruction tells the CPU to jump to the address of the function
Static Binding Example

    #include <iostream>

    class Person{
    public:
        void salute(){
            std::cout << "This is parent class" << std::endl;
        }
    };

    class Man : public Person{
    public:
        void salute(){
            std::cout << "hello" << std::endl;
        }
    };

    int main(){
        Person *p;
        Man m;
        p = &m;
        p -> salute(); // early binding: resolved from the static type Person*
        return 0;
    }

By default, the compiler uses static binding
- Unless told otherwise in the code
Output: This is parent class
Dynamic Binding

In this case, the function call is matched with the correct function definition at runtime
- The dynamic type of the object is identified at runtime
- The function call is then matched with the correct function definition
By default, early binding takes place
- Dynamic binding must be requested explicitly
- This is achieved by declaring a virtual function
The compiler creates a virtual table during compilation
- The table is what lets the generated code apply dynamic binding properly
Dynamic Binding Example

    #include <iostream>

    class Person{
    public:
        virtual void salute(){
            std::cout << "This is parent class" << std::endl;
        }
    };

    class Man : public Person{
    public:
        void salute(){
            std::cout << "hello" << std::endl;
        }
    };

    int main(){
        Person *p;
        Man m;
        p = &m;
        p -> salute(); // late binding: resolved from the dynamic type Man
        return 0;
    }

To force dynamic binding, we prefix the salute function with the keyword virtual
Output: hello
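The two binding modes can be compared side by side in one class. This is a sketch with hypothetical member names (nameStatic, nameVirtual); the functions return strings instead of printing so the result is easy to inspect.

```cpp
#include <string>

struct Person {
    virtual ~Person() = default;
    std::string nameStatic() const { return "Person"; }           // bound at compile time
    virtual std::string nameVirtual() const { return "Person"; }  // bound at run time
};

struct Man : Person {
    std::string nameStatic() const { return "Man"; }              // hides, does not override
    std::string nameVirtual() const override { return "Man"; }    // overrides
};
```

Through a Person* that points at a Man, nameStatic() yields "Person" and nameVirtual() yields "Man".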
Virtual and Pure Virtual Functions

Virtual Function
- A function of the base class which can be overridden in the derived class
- Syntax: prefix the function declaration with the virtual keyword
      virtual void sound();
- This requires an implementation!
Pure Virtual Function
- A virtual function declared without an implementation, i.e. an abstract function
      virtual void sound() = 0;
By default, the vtable of the most derived class stores the most derived implementation of each virtual function
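A short sketch of a pure virtual function in use. Animal and Dog are hypothetical classes; the static_assert lines rely on std::is_abstract from the standard <type_traits> header.

```cpp
#include <string>
#include <type_traits>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const = 0;   // pure virtual: no implementation here
};

struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

// A class with a pure virtual function cannot be instantiated;
// a class that overrides all of them can.
static_assert(std::is_abstract<Animal>::value, "Animal is abstract");
static_assert(!std::is_abstract<Dog>::value, "Dog is concrete");
```

Writing `Animal a;` would be rejected by the compiler, while `Dog d;` is fine.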
Virtual Table: Single Inheritance

To implement virtual functions, C++ uses a special form of late binding known as the virtual table
- The virtual table is a lookup table of functions used to resolve function calls during runtime
- Each class has its own virtual table
- The virtual table is created at compile time
- Each virtual function is assigned a fixed index in the virtual table
- This index remains associated with the particular virtual function throughout the inheritance hierarchy
Other names: vtable, virtual function table, virtual method table, dispatch table
Virtual Table Single Inheritance Base Class Example

    class Point {
    public:
        virtual ~Point();
        virtual Point& mult( float ) = 0;
        // ... other functions...
        float x() const { return _x; }          // not virtual: gets no slot
        virtual float y() const { return 0; }
        virtual float z() const { return 0; }
        // ...
    protected:
        Point( float x = 0.0 );
        float _x;
    };

- Position 0 of every Point object holds the address of the Point class virtual table
- The virtual destructor is assigned slot 1
- The pure virtual function mult() is assigned slot 2
  - There is no mult() definition, so the address of the library function pure_virtual_called() is placed within the slot
  - If that instance should get invoked, generally, it would terminate the program!
- y() is assigned slot 3
- z() is assigned slot 4
- What slot is x() assigned? None, because it is not declared to be virtual
Virtual Table Single Inheritance: Derived Class

There are three possibilities for a derived class:
1. Inheriting an implementation from a higher level
   - It can inherit the instance of the virtual function declared in the base class
   - The function address is copied from the base class slot to the associated derived class slot
2. Using its own implementation, thus overriding higher levels
   - It can override the instance with one of its own
   - The address of its instance is placed within the associated slot
3. Adding a new function that does not exist in the base class
   - It can introduce a new virtual function not present in the base class
   - The virtual table is grown by a slot and the address of the function is placed within that slot
Virtual Table Single Inheritance Derived Class Example

    class Point2d : public Point {
    public:
        Point2d( float x = 0.0, float y = 0.0 ) : Point( x ), _y( y ) {}
        ~Point2d();
        // overridden base class virtual functions
        Point2d& mult( float );
        float y() const { return _y; }
        // ... other functions...
    protected:
        float _y;
    };

- Position 0 of every Point2d object points at the Point2d class virtual table
- The Point2d destructor is assigned to slot 1, replacing Point’s destructor address
- The Point2d mult() instance is assigned to slot 2, replacing the pure virtual instance
- The Point2d y() instance is assigned to slot 3, replacing Point’s y() address
- Slot 4 does not change: it retains Point's inherited instance of z()
Virtual Table Single Inheritance 2nd Derived Class Example

    class Point3d: public Point2d {
    public:
        Point3d( float x = 0.0, float y = 0.0, float z = 0.0 ) : Point2d( x, y ), _z( z ) {}
        ~Point3d();
        // overridden base class virtual functions
        Point3d& mult( float );
        float z() const { return _z; }
        // ... other functions...
    protected:
        float _z;
    };

This is similar to the first derived class mechanics
- Position 0 of every Point3d object points at the Point3d class virtual table
- Point3d's destructor is assigned to slot 1, replacing Point2d’s destructor address
- Point3d's instance of mult() is assigned to slot 2, replacing Point2d’s mult() address
- Point2d's inherited instance of y() is placed in slot 3
- Point3d's instance of z() is placed in slot 4
So how is ptr->z() invoked?
- We don’t know the dynamic type of the object ptr points to
- We do know that z() is at slot 4
- We do know that the vtable address is found at position 0 of the object
- Thus, conceptually: (ptr[0][4])(); where ptr[0] is the vtable and [4] selects the slot
- Regardless of ptr’s type!
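The lookup "vtable at position 0, function at a fixed slot" can be imitated with plain function pointers. This is a simplified sketch and not how any particular compiler lays out its tables: the names Obj, Slot, SLOT_Y, SLOT_Z and call_z are hypothetical, and only two slots are modelled instead of the five discussed above.

```cpp
struct Obj;                                   // forward declaration
using Slot = float (*)(const Obj*);           // each slot holds a function address

struct Obj {
    const Slot* vptr;   // position 0 of the object: address of the class's table
    float y;
    float z;
};

// "Virtual functions": each one receives the object explicitly (the this pointer).
inline float point2d_y(const Obj* self) { return self->y; }
inline float point_z(const Obj*)        { return 0.0f; }      // inherited default
inline float point3d_z(const Obj* self) { return self->z; }   // Point3d's override

// Slot indices stay fixed across the whole hierarchy.
enum { SLOT_Y = 0, SLOT_Z = 1 };

const Slot point2d_vtable[] = { point2d_y, point_z };
const Slot point3d_vtable[] = { point2d_y, point3d_z };

// Equivalent of ptr->z(): fetch the table, index the slot, call with the object.
inline float call_z(const Obj* ptr) { return ptr->vptr[SLOT_Z](ptr); }
```

call_z never inspects the object's type: an Obj set up with point2d_vtable yields 0, and one set up with point3d_vtable yields its z field.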
Virtual Table – Performance

Invoking a virtual method is more expensive than a regular function call:
- It requires two operations: finding the location of the function code in the virtual table, and then invoking the function
Virtual methods generally cannot be compiled inline, because the target is not known until runtime:
- With the inline keyword, the compiler may replace the function call statement with the function code itself and then compile the entire code
- As a result, it speeds up code execution by avoiding function call overhead!
- The cost: each inlined call results in a complete copy of the called function being added to the code
- Too many calls of the same function will result in too big an executable! RAM is expensive
Note: in C++, methods are not virtual by default.
- If a class does not have virtual methods, then it does not contain a vtable!
- In Java, instance methods are virtual by default (unless they are private, static or final)
Multiple Inheritance

The complexity of virtual function support under multiple inheritance revolves around the second and subsequent base classes
- To be able to access them, the this pointer needs to be adjusted at runtime
The derived class object consists of multiple sub-objects
- One sub-object per base class that was extended
- Each sub-object has its own virtual table
- This means that the this pointer needs to be altered dynamically to be able to work on each of the sub-objects as needed!
- After all the sub-objects come the data members of the derived class
this needs to be adjusted for the following cases:
- The virtual destructor: it executes both parent destructors one after another
- Inherited functions from the second base class onwards: to be able to execute the correct function from the correct sub-object block
- Unimplemented functions
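The adjustment of this is visible as a change in the pointer value when a Derived* is converted to its second base. A minimal sketch, assuming classes shaped like the ones in this lecture; subobjectOffset is a hypothetical helper and the concrete offsets depend on the compiler.

```cpp
#include <cstddef>  // std::ptrdiff_t

struct Base1 { virtual ~Base1() = default; float data_Base1 = 1.0f; };
struct Base2 { virtual ~Base2() = default; float data_Base2 = 2.0f; };
struct Derived : Base1, Base2 { float data_Derived = 3.0f; };

// Byte distance from the start of the Derived object to one of its base sub-objects.
template <typename Base>
std::ptrdiff_t subobjectOffset(Derived& d) {
    const char* whole = reinterpret_cast<const char*>(&d);
    const char* part  = reinterpret_cast<const char*>(static_cast<Base*>(&d));
    return part - whole;
}
```

With common compilers the Base1 sub-object sits at offset 0 and the Base2 sub-object at a positive offset, so converting to Base2* moves the pointer; converting back to Derived* undoes the shift.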
Multiple Inheritance: Code Example

    class Base1 {
    public:
        Base1();
        virtual ~Base1();
        virtual void speakClearly();
        virtual Base1 *clone() const;
    protected:
        float data_Base1;
    };

    class Base2 {
    public:
        Base2();
        virtual ~Base2();
        virtual void mumble();
        virtual Base2 *clone() const;
    protected:
        float data_Base2;
    };

    class Derived : public Base1, public Base2 {
    public:
        Derived();
        virtual ~Derived();
        virtual Derived *clone() const;
    protected:
        float data_Derived;
    };
Multiple Inheritance: Function Call Ambiguity

What happens in these cases:
1. Derived, Base1 and Base2 all have a clone method implementation
2. Base1 and Base2 have a clone method implementation, but Derived does not
Which clone implementation is executed by each of the three programs below, for each of the two cases above?

    #include <iostream>

    int main(){
        Derived *d = new Derived;
        d->clone();
        delete d;
    }

    #include <iostream>

    int main(){
        Base1 *d = new Derived;
        d->clone();
        delete d;
    }

    #include <iostream>

    int main(){
        Base2 *d = new Derived;
        d->clone();
        delete d;
    }

Which case out of the six is problematic?
Code Example: Base1 Class Implementation

    #include <iostream>

    class Base1 {
    public:
        Base1(){ std::cout <<"Base1 constructor" << std::endl; }
        Base1(float data_Base1): data_Base1(data_Base1){}
        virtual ~Base1(){ std::cout <<"Base1 destructor" << std::endl; }
        virtual void speakClearly(){ std::cout << "speak!" << std::endl; }
        virtual Base1 *clone() const{
            std::cout << "Base1 clone" << std::endl;
            return new Base1(data_Base1);
        }
    protected:
        float data_Base1;
    };
Code Example: Derived Class Implementation

    #include <iostream>

    class Derived : public Base1, public Base2 {
    public:
        Derived(){ std::cout <<"Derived constructor" << std::endl; }
        Derived(float data_Derived): data_Derived(data_Derived){}
        virtual ~Derived(){ std::cout <<"Derived destructor" << std::endl; }
        virtual Derived *clone() const{
            std::cout << "Derived clone" << std::endl;
            return new Derived(data_Derived);
        }
    protected:
        float data_Derived;
    };
Multiple Inheritance: Function Call Ambiguity

What happens in these cases:
1. Derived, Base1 and Base2 all have a clone method implementation
2. Base1 and Base2 have a clone method implementation, but Derived does not
What is the output of each of the three programs below, for each of the two cases above?

    #include <iostream>

    int main(){
        Derived *d = new Derived;
        d->clone();
        delete d;
    }

    #include <iostream>

    int main(){
        Base1 *d = new Derived;
        d->clone();
        delete d;
    }

    #include <iostream>

    int main(){
        Base2 *d = new Derived;
        d->clone();
        delete d;
    }
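One way to explore the six combinations is to give every class a name() function and ask each clone what it is. The sketch below is an assumption-laden variant of the lecture classes: name(), DerivedNoClone and cloneName are hypothetical additions, and printing is replaced by returned strings.

```cpp
#include <string>

struct Base1 {
    virtual ~Base1() = default;
    virtual Base1* clone() const { return new Base1(*this); }
    virtual const char* name() const { return "Base1"; }
};

struct Base2 {
    virtual ~Base2() = default;
    virtual Base2* clone() const { return new Base2(*this); }
    virtual const char* name() const { return "Base2"; }
};

// Case 1: Derived overrides clone(), so every pointer type reaches Derived::clone.
struct Derived : Base1, Base2 {
    Derived* clone() const override { return new Derived(*this); }
    const char* name() const override { return "Derived"; }
};

// Case 2: no override. Each base pointer reaches its own base's clone().
// Calling clone() through a DerivedNoClone* does not compile: the call is ambiguous.
struct DerivedNoClone : Base1, Base2 {
    const char* name() const override { return "DerivedNoClone"; }
};

// Clones the object through a pointer to BaseT and reports the clone's dynamic type.
template <typename BaseT>
std::string cloneName(const BaseT* p) {
    const BaseT* copy = p->clone();
    std::string result = copy->name();
    delete copy;
    return result;
}
```

In case 2 the clone made through a base pointer is only a Base1 or a Base2: the derived part is sliced away, which is a second reason to override clone() in Derived.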
Multiple Inheritance: The Diamond Problem

    class Top{};
    class Left : public Top{};
    class Right : public Top{};
    class Bottom : public Left, public Right{
    };

    int main(){
        Left* lptr = new Bottom();
        Right* rptr = new Bottom();
        Top* tptr = new Bottom();      // line 11 of diamond_problem.cpp: does not compile
        Bottom* bptr = new Bottom();
    }

What’s the error at line 11 (the Top* tptr line) during compilation?
    diamond_problem.cpp:11:28: error: ‘Top’ is an ambiguous base of ‘Bottom’
Why?
- Bottom will contain two copies of the sub-object Top!
- Which one should be used when we access Top's data members? Ambiguous!
The Diamond Problem Solution: Virtual Inheritance

    class Top{};
    class Left : virtual public Top{};
    class Right : virtual public Top{};
    class Bottom : public Left, public Right{
    };

    int main(){
        Left* lptr = new Bottom();
        Right* rptr = new Bottom();
        Top* tptr = new Bottom();
        Bottom* bptr = new Bottom();
    }

Virtual inheritance ensures that only one copy of Top is added to Bottom
- Done by prefixing the inheritance syntax with the virtual keyword
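That only one Top exists can be checked by reaching it along both paths and comparing the addresses. A minimal sketch; the id member and the helper sharesSingleTop are hypothetical additions to the classes above.

```cpp
struct Top { int id = 0; };
struct Left : virtual public Top {};
struct Right : virtual public Top {};
struct Bottom : public Left, public Right {};

// Walks from Bottom to Top through Left and through Right;
// with virtual inheritance both paths end at the same sub-object.
inline bool sharesSingleTop() {
    Bottom b;
    Top* viaLeft  = static_cast<Left*>(&b);
    Top* viaRight = static_cast<Right*>(&b);
    return viaLeft == viaRight;
}
```

Without the virtual keyword the two pointers would differ, and the direct conversion from Bottom* to Top* would be rejected as ambiguous.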
Casting (Type Conversion) in C++

Casting is the ability to change the type of a value

Implicit Casting:

    short a = 2000;
    int b;
    b = a;

Explicit Casting:

    short a = 2000;
    int b;
    b = (int)a;
C vs C++ Style Casting

C style casting:
- Tries to do a static_cast if possible
- Then tries a reinterpret_cast if the static_cast failed
- It will also apply a const_cast if it must
C++ style casting:
- static_cast: converts the type by preserving the value and altering its binary representation
- dynamic_cast: performs checked polymorphic conversions
- const_cast: adds or removes const; used when a variable is const but the API isn’t
- reinterpret_cast: performs general low-level conversions
  - Can be used when reinterpreting the same bits as another type is meaningful
  - Converts the type but reinterprets the value: the binary representation does not change!!

Example (C style cast):

    int x = 5;
    void* voidPtr = &x;
    int* intPtr = (int*)voidPtr;
C++ static_cast and dynamic_cast

static_cast:
- Converts the type by preserving the value
- It means the CPU needs to recalculate the binary value due to the type conversion
- Example: converting 12 from integer to float:
  - Integer binary representation: 00000000 00000000 00000000 00001100
  - Float binary representation:   01000001 01000000 00000000 00000000
dynamic_cast:
- Used for pointers and references
- Used for handling polymorphism, when casting to a derived class
- Performs type checking for valid casting
- Its primary purpose is to perform type-safe downcasts, down the inheritance tree!
- In case of casting failure:
  - Returns nullptr if the cast fails for pointers
  - Throws a std::bad_cast exception if the cast fails for references
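The float bit pattern quoted above can be reproduced by copying the bytes of the converted value into an integer. This sketch assumes a 32-bit IEEE 754 float, which is what mainstream platforms use; floatBits is a hypothetical helper name.

```cpp
#include <cstdint>  // std::uint32_t
#include <cstring>  // std::memcpy

// Returns the raw bits of a float without changing them.
inline std::uint32_t floatBits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

floatBits(static_cast<float>(12)) gives 0x41400000, i.e. 01000001 01000000 00000000 00000000, while the integer 12 itself is 0x0000000C: same value, different representation.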
static_cast Example

    #include <iostream>

    int main() {
        int j = 41;
        int v = 4;
        float m = j/v;                       // integer division, then conversion
        float d = static_cast<float>(j)/v;   // conversion, then floating-point division
        std::cout << "m = " << m << std::endl;
        std::cout << "d = " << d << std::endl;
    }

Output?
m = 10
d = 10.25
dynamic_cast Example

    #include <iostream>

    struct A { virtual void f() { std::cout << "Class A" << std::endl; } };
    struct B : A { virtual void f() { std::cout << "Class B" << std::endl; } };
    struct C : A { virtual void f() { std::cout << "Class C" << std::endl; } };

    void f(A* arg) {
        B* bp = dynamic_cast<B*>(arg);   // nullptr unless arg points to a B
        C* cp = dynamic_cast<C*>(arg);   // nullptr unless arg points to a C
        if (bp) bp->f();
        else if (cp) cp->f();
        else arg->f();
    }

    int main() {
        A aobj;
        C cobj;
        A* ap = &cobj;
        A* ap2 = &aobj;
        f(ap);
        f(ap2);
    }

Output?
f(ap): Class C
f(ap2): Class A
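The two failure modes of dynamic_cast can be observed directly. A minimal sketch with a hypothetical three-class hierarchy; pointerCastFails and referenceCastThrows are made-up helper names, and std::bad_cast comes from the standard <typeinfo> header.

```cpp
#include <typeinfo>  // std::bad_cast

struct A { virtual ~A() = default; };
struct B : A {};
struct C : A {};

// Pointer form: a failed downcast yields nullptr.
inline bool pointerCastFails(A* a) {
    return dynamic_cast<B*>(a) == nullptr;
}

// Reference form: there is no null reference, so a failed downcast throws.
inline bool referenceCastThrows(A& a) {
    try {
        (void)dynamic_cast<B&>(a);
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

Passing a C object makes both helpers report failure; passing a B object makes both succeed.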
C++ Interfaces: Pure Abstract Class

C++ has no interfaces; instead it has abstract and pure abstract classes:
- A class is considered an abstract class if it has at least one pure virtual function
  - It can define data members
- A class in C++ is “considered” an interface if it is a pure abstract class:
  - All of its methods are pure virtual (constructors cannot be virtual; the destructor should be virtual)
        virtual void function() = 0;
  - It does not define any data members
- Neither pure abstract classes nor abstract classes can be instantiated
The Visitor Pattern

Solves the issue with type-checking by deciding at runtime which piece of logic to execute based on the type of the received object
The type-checking approach it replaces:

    #include <iostream>

    class Base {
    public:
        virtual ~Base() {}          // dynamic_cast requires a polymorphic type
        void handle(Base* obj);
    };
    class Derived1 : public Base {};
    class Derived2 : public Base {};

    // Type-specific logic; called on the slide but not shown there
    void execute(Derived1* derived1);
    void execute(Derived2* derived2);

    void Base::handle(Base* obj) {
        if (Derived1* d1 = dynamic_cast<Derived1*>(obj)) {
            std::cout << "Handling Derived1" << std::endl;
            execute(d1);
        } else if (Derived2* d2 = dynamic_cast<Derived2*>(obj)) {
            std::cout << "Handling Derived2" << std::endl;
            execute(d2);
        }
    }
The Visitor Pattern: Visitor Framework Code

    // Forward declarations: allow using types that are not defined yet!
    class Derived1;
    class Derived2;
    class Visitor;

    class Base {
    public:
        virtual ~Base() {}  // lets objects be deleted safely through a Base*
        // This is for dispatching on Base's concrete type.
        // Any derived class that wants to use double dispatch with visitor must override accept
        virtual void accept(Visitor* visitor) = 0;
    };

    class Visitor {
    public:
        // These are for dispatching on visitor's type
        // add one visit function per new Derived type
        virtual void visit(Derived1* derived1) = 0;
        virtual void visit(Derived2* derived2) = 0;
    };

    class Derived1 : public Base {
    public:
        virtual void accept(Visitor* visitor){ visitor->visit(this); }
    };

    class Derived2 : public Base {
    public:
        virtual void accept(Visitor* visitor){ visitor->visit(this); }
    };
Implementing the Visitor Pattern

    #include <iostream>

    // Implementing a custom visitor
    class PrinterVisitor : public Visitor {
    public:
        virtual void visit(Derived1* derived1) {
            std::cout << "Handling Derived1" << std::endl;
            //then we print the object
        }
        virtual void visit(Derived2* derived2) {
            std::cout << "Handling Derived2" << std::endl;
            //then we print the object
        }
    };

    // Using the visitor implementation
    int main(){
        PrinterVisitor* printer = new PrinterVisitor();

        Base* base = new Derived1();
        base->accept(printer);   // first accept() call
        delete base;

        base = new Derived2();
        base->accept(printer);   // second accept() call
        delete base;

        delete printer;
    }

Follow the code execution: where does the double dispatch happen?
Output of the two accept() calls?
Type-checking problem solved!
Visitor Pattern: Double Dispatch

Definition - Double Dispatch
- Two function calls are used to be able to execute the requested function logic
First Dispatch:
- accept() is a virtual function that dispatches on the type of the derived object
- It then calls the correct overload of visit()
- This works since inside accept() the object type is already known
Second Dispatch:
- visit(), in turn, is the virtual function that dispatches on the type of the visitor
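Both dispatches can be traced in one compact, self-contained sketch. It follows the framework from the previous slides, with one assumption: NameVisitor is a hypothetical visitor that records which overload ran instead of printing.

```cpp
#include <string>

class Derived1;
class Derived2;

class Visitor {
public:
    virtual ~Visitor() = default;
    virtual void visit(Derived1* derived1) = 0;
    virtual void visit(Derived2* derived2) = 0;
};

class Base {
public:
    virtual ~Base() = default;
    virtual void accept(Visitor* visitor) = 0;   // first dispatch: on the object's type
};

class Derived1 : public Base {
public:
    // `this` is a Derived1* here, so the Derived1 overload of visit() is selected
    void accept(Visitor* visitor) override { visitor->visit(this); }
};

class Derived2 : public Base {
public:
    void accept(Visitor* visitor) override { visitor->visit(this); }
};

// Hypothetical visitor: remembers the type it handled last.
class NameVisitor : public Visitor {
public:
    std::string last;
    // second dispatch: visit() is virtual, so the visitor's type decides what runs
    void visit(Derived1*) override { last = "Derived1"; }
    void visit(Derived2*) override { last = "Derived2"; }
};
```

Calling accept() through a Base* that points at a Derived2 leaves "Derived2" in NameVisitor::last, with no dynamic_cast anywhere in the code.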