Why C++’s type system is more powerful than C’s

This lecture is a bit of a departure in that we’ll cover how one of C++’s features, virtual functions, is actually implemented. This implementation will look suspiciously similar to pointers to functions in C, so we’ll take a detour to check this out (for those of you who haven’t seen C pointers to functions before). When we’re done, we’ll see that virtual functions can actually do things in a type-safe way that pointers to functions cannot.

Outline: virtual function implementation, then a detour through pointers to functions in C, then back to C++.
How virtual functions are (typically) implemented

If a class has any virtual functions, an extra (hidden) pointer is added to the beginning of each object of that class. This pointer points to the virtual function table (v-table for short), which holds pointers to the code for each virtual function supported by the object.

[Diagram: an object laid out as a vtable ptr followed by its data; the vtable ptr points to a vtable whose entries are ptrs to code.]
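One rough way to observe the hidden pointer is to compare the sizes of two otherwise identical types, one with a virtual function and one without. This is only a sketch: PlainPoint, VirtualPoint and hiddenBytes are hypothetical names invented for illustration, and the exact size difference is up to the compiler (the language does not promise a v-table at all, it is just the typical implementation).

```cpp
#include <cstddef>

// Hypothetical type with no virtual functions: just two ints.
struct PlainPoint {
    int x, y;
    void move(int newX, int newY) { x = newX; y = newY; }
};

// Same data, but move is virtual, so a typical compiler adds a
// hidden v-table pointer to every object of this type.
struct VirtualPoint {
    int x, y;
    virtual void move(int newX, int newY) { x = newX; y = newY; }
};

// Extra bytes per object caused by having a virtual function.
// On typical compilers this is the size of one pointer, possibly
// plus alignment padding.
std::size_t hiddenBytes() {
    return sizeof(VirtualPoint) - sizeof(PlainPoint);
}
```

On mainstream compilers hiddenBytes() is at least sizeof(void*), which matches the one-extra-pointer-per-object picture above.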
Example:

    class Shape {
        int xCoord, yCoord; // coordinates of center
        ShapeColor color;   // current color
    public:
        void move(int xNew, int yNew);
        virtual void draw();
        virtual void rotate(double angle);
        virtual double area();
    };

    Shape s1;

The object s1 looks like:

    s1                      Shape v-table
    vtable ptr   ------>    ptr to Shape::draw
    xCoord                  ptr to Shape::rotate
    yCoord                  ptr to Shape::area
    color
Multiple objects of type Shape share the same v-table:

    s1                      Shape v-table
    vtable ptr   ------>    ptr to Shape::draw
    xCoord                  ptr to Shape::rotate
    yCoord                  ptr to Shape::area
    color

    s2
    vtable ptr   ------>    (the same Shape v-table)
    xCoord
    yCoord
    color
Each class derived from Shape gets its own v-table, which contains pointers to the code for inherited and overridden member functions. Suppose Circle inherits Shape::rotate (without overriding it), and overrides Shape::draw and Shape::area. To make things interesting, let’s also suppose that Circle adds a new virtual function, circumference.

    class Circle: public Shape {
        int radius;
    public:
        void draw();
        double area();
        virtual double circumference();
    };
    Shape s1;
    Shape s2;
    Circle c1;

Now the objects and v-tables look like:

    Shape v-table              Circle v-table
    ptr to Shape::draw         ptr to Circle::draw
    ptr to Shape::rotate       ptr to Shape::rotate
    ptr to Shape::area         ptr to Circle::area
                               ptr to Circle::circ...

    s1             s2             c1
    vtable ptr     vtable ptr     vtable ptr
    xCoord         xCoord         xCoord
    yCoord         yCoord         yCoord
    color          color          color
                                  radius

The vtable ptrs of s1 and s2 point to the Shape v-table; the vtable ptr of c1 points to the Circle v-table.
When a virtual function is invoked, C++ looks up the right code at run-time (it knows at compile-time what offset in the v-table to look in) and calls it:

    Shape *sPtr = new Circle();
    double a = sPtr->area();

So Circle::area is called, and passed an implicit this pointer:

    Circle::area(sPtr); // sPtr is implicitly passed as “this”

    sPtr ------> vtable ptr   ------>    ptr to Circle::draw
                 xCoord                  ptr to Shape::rotate
                 yCoord                  ptr to Circle::area
                 color                   ptr to Circle::circ...
                 radius
Circle::area can then do its thing, making use of Circle-specific data like radius:

    double Circle::area() {
        return PI * radius * radius;
    }

Remember, there is an implicit this pointer being passed to the function, so the function really works like the following (conceptually; you can’t actually declare a parameter named this):

    double Circle::area(Circle *this) {
        return PI * this->radius * this->radius;
    }
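The dispatch described above can be tried out with a cut-down pair of classes. This is a sketch rather than the lecture’s full Shape: the constructor, the const qualifiers, the default Shape::area body that returns 0.0, the value chosen for PI and the helper areaOf are all assumptions added so the example stands on its own.

```cpp
const double PI = 3.14159265358979; // assumed value; the slides only name PI

class Shape {
public:
    virtual ~Shape() = default;
    // Placeholder body (an assumption) so that a plain Shape can be created.
    virtual double area() const { return 0.0; }
};

class Circle : public Shape {
    int radius;
public:
    explicit Circle(int r) : radius(r) {}
    // Overrides Shape::area; uses Circle-specific data through the
    // implicit this pointer.
    double area() const override { return PI * radius * radius; }
};

// The caller only knows it has a Shape. The v-table lookup picks
// Circle::area at run-time when s actually refers to a Circle.
double areaOf(const Shape &s) {
    return s.area();
}
```

Passing a Circle of radius 2 to areaOf gives roughly 12.57, even though areaOf never mentions Circle.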
This lecture was inspired in part by a conversation about C++ I had with a colleague once. He asked me what C++ could do that C couldn’t do, so I told him about virtual functions. When he asked how virtual functions worked, I told him about the pointer to the v-table in each object, and the corresponding run-time dispatch to functions in the v-table. “No big deal,” he said, “you can do all that in C with pointers to functions.” Was he right?
Pointers to functions

C and C++ provide datatypes for pointers to functions. Suppose there is a function named compareInt:

    int compareInt(int i1, int i2) {
        if(i1 < i2) return -1;
        else if(i1 == i2) return 0;
        else return 1;
    }

We can create a pointer to this function as follows:

    int (*ptrToCompareInt)(int i1, int i2);
    ptrToCompareInt = &compareInt;
ptrToCompareInt is a pointer to a function. Its declaration means:

    int (*ptrToCompareInt)(int i1, int i2);

    int                  return type of function pointed to
    *                    star means “pointer”
    ptrToCompareInt      name of variable
    (int i1, int i2)     argument types of function pointed to
Example:

    int addInt(int i1, int i2) {return i1 + i2;}
    int multiplyInt(int i1, int i2) {return i1 * i2;}
    ...
    int (*ptrToFunc)(int i1, int i2);
    ptrToFunc = &addInt;
    int result1 = (*ptrToFunc)(7, 6);
    ptrToFunc = &multiplyInt;
    int result2 = (*ptrToFunc)(7, 6);
    cout << result1 << " " << result2 << endl;

this prints:

    13 42
A practical example of pointers to functions is the qsort routine, which is provided in stdlib.h in C and C++:

    void qsort(void *base, size_t num, size_t width,
               int (*compare)(const void *elem1, const void *elem2));

qsort takes an array of elements and sorts them:

    base and num specify the array and the number of elements in the array
    width specifies the size, in bytes, of each element in the array
    compare is a pointer to a function which compares two elements from the array
    void qsort(void *base, size_t num, size_t width,
               int (*compare)(const void *elem1, const void *elem2));

What’s all this void* stuff? void* is C’s way of giving up - C is saying, “I don’t have a powerful enough type system to precisely describe what’s going on here”. “void* elem1” means that C says “I have no idea what type elem1 has”. As we’ll see later, the void* in this case stems from a lack of polymorphism - qsort can be implemented in a much nicer way with the polymorphism provided by templates.
    void qsort(void *base, size_t num, size_t width,
               int (*compare)(const void *elem1, const void *elem2));

Instead of writing the nice compareInt function

    int compareInt(int i1, int i2) {
        if(i1 < i2) return -1;
        else if(i1 == i2) return 0;
        else return 1;
    }

to use qsort, we have to write a function that takes void* arguments and does a bunch of casts:

    int compareInt(const void *i1Ptr, const void *i2Ptr) {
        if(*((const int*)i1Ptr) < *((const int*)i2Ptr)) return -1;
        else if(*((const int*)i1Ptr) == *((const int*)i2Ptr)) return 0;
        else return 1;
    }

Ugh! This is the first clue that pointers to functions, without any additional polymorphism, may not solve all our problems.
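To see the cast-heavy comparison function in action, here is a small sketch that hands it to the standard library’s qsort. The names compareIntForQsort and sortInts are made up for this example, and the parameters are const void* because that is how the standard header declares the callback.

```cpp
#include <cstdlib>

// Comparison callback in the shape qsort requires: it receives untyped
// pointers to two elements and has to cast them back to int itself.
int compareIntForQsort(const void *i1Ptr, const void *i2Ptr) {
    int i1 = *static_cast<const int *>(i1Ptr);
    int i2 = *static_cast<const int *>(i2Ptr);
    if (i1 < i2) return -1;
    else if (i1 == i2) return 0;
    else return 1;
}

// Sorts count ints in place, in ascending order.
void sortInts(int *values, std::size_t count) {
    std::qsort(values, count, sizeof(int), &compareIntForQsort);
}
```

Nothing stops a caller from passing this comparison function along with an array of doubles; the compiler cannot catch the mismatch, because every type involved has been erased to void*.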
The next clue is that pointers to functions refer to code but hold no other data. In languages with higher-order functions, function values can hold both data and code. Plain pointers to functions in C and C++ are not like that: they refer to code but contain no data.

So here’s what we get in C:

    structs: data, but no code
    pointers to functions: code, but no data

But we can fix this. Just mix the two together to get both code and data. Based on this, let’s take a crack at implementing v-tables with structs and pointers to functions.
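Before tackling v-tables, here is a minimal sketch of what “mixing the two together” might look like: a struct that carries both a data member and a pointer to the function that operates on it. Counter, bumpByOne, bumpByTen and runCounter are hypothetical names used only for this illustration.

```cpp
// A struct holding data plus a pointer to code that works on that data.
struct Counter {
    int count;                      // data
    void (*bump)(Counter *myself);  // code
};

void bumpByOne(Counter *myself) { myself->count += 1; }
void bumpByTen(Counter *myself) { myself->count += 10; }

// Builds a Counter with the given behavior, calls it the given number of
// times through the stored function pointer, and returns the final count.
int runCounter(void (*bump)(Counter *), int times) {
    Counter c = {0, bump};
    for (int i = 0; i < times; ++i) {
        (*c.bump)(&c); // the object is passed to its own function by hand
    }
    return c.count;
}
```

The same calling code behaves differently depending on which function the struct points to, which is the basic ingredient the v-table imitation needs.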
We’ll start with a slightly simplified version of Shape:

    class Shape {
        int xCoord, yCoord; // coordinates of center
    public:
        virtual void draw();
        virtual void rotate(double angle);
        virtual double area();
    };

We need a struct to hold the data in a Shape object:

    struct Shape {
        ShapeVTable *vTable;
        int xCoord, yCoord;
    };
We also need a struct full of pointers to functions to implement the v-table:

    struct ShapeVTable {
        void (*draw)(Shape *myself);
        void (*rotate)(Shape *myself, double angle);
        double (*area)(Shape *myself);
    };

The myself argument is an imitation of C++’s “this”.
So now we have something that looks like an object with a pointer to a v-table:

    Shape                   ShapeVTable
    vtable ptr   ------>    ptr to Shape’s draw
    xCoord                  ptr to Shape’s rotate
    yCoord                  ptr to Shape’s area
So far, so good. But now let’s try to implement the data and v-table for a simplified version of Circle, which overrides the area function:

    class Circle: public Shape {
        int radius;
    public:
        double area();
    };
First, let’s look at the data part. We run into an immediate problem. We could define a completely new struct:

    struct Circle {
        CircleVTable *vTable;
        int radius;
    };

But this Circle has no relation to Shape, so they can’t be used interchangeably (we don’t get the polymorphism we wanted).
Maybe we could embed a Shape inside a Circle:

    struct Circle {
        Shape shape;
        int radius;
    };

But let’s just punt on this, and assume we can use a simple C++-like inheritance (but we won’t assume that the inheritance mechanism supports virtual functions - the whole point is to implement virtual functions ourselves):

    struct Circle: public Shape {
        int radius;
    };
    struct Circle: public Shape {
        int radius;
    };

We can almost sort of use this now to imitate virtual function calls:

    Circle c;
    Shape *s = &c;
    // imitation of “double a = s->area()”:
    double a = (*(s->vTable->area))(s);
But we have trouble when we try to implement a Circle::area function that conforms to the type in ShapeVTable:

    double circle_area(Shape *myself) {
        return PI * myself->radius * myself->radius;
    }

The problem is that myself has type Shape*, and Shapes don’t have a radius member.
circle_area could do a cast, but this subverts the type system:

    double circle_area(Shape *myself) {
        return PI * ((Circle *) myself)->radius
                  * ((Circle *) myself)->radius;
    }
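Putting the pieces together, the cast-based workaround can be made to run. The following is a sketch under several assumptions: the v-table is cut down to just area, and shape_area, the two table objects, callArea, makeCircle and the value of PI are hypothetical additions that the slides do not define.

```cpp
const double PI = 3.14159265358979; // assumed value; the slides only name PI

struct Shape; // forward declaration so ShapeVTable can mention Shape*

struct ShapeVTable {
    double (*area)(Shape *myself);
};

struct Shape {
    ShapeVTable *vTable;
    int xCoord, yCoord;
};

struct Circle : public Shape { // plain inheritance, no virtual functions
    int radius;
};

// Placeholder area for a plain Shape (an assumption for this sketch).
double shape_area(Shape *) { return 0.0; }

// Conforms to the ShapeVTable signature, so it must take a Shape* and cast.
// The compiler cannot check that myself really points to a Circle.
double circle_area(Shape *myself) {
    Circle *c = static_cast<Circle *>(myself);
    return PI * c->radius * c->radius;
}

ShapeVTable shapeVTable = { &shape_area };
ShapeVTable circleVTable = { &circle_area };

// Imitation of "s->area()": look up the function, pass the object by hand.
double callArea(Shape *s) {
    return (*(s->vTable->area))(s);
}

// Imitation of a constructor: installs the Circle v-table by hand.
Circle makeCircle(int radius) {
    Circle c;
    c.vTable = &circleVTable;
    c.xCoord = 0;
    c.yCoord = 0;
    c.radius = radius;
    return c;
}
```

This works only as long as every object whose vTable points to circleVTable really is a Circle. If a plain Shape were set up with circleVTable by mistake, the cast would still compile and the call would read a radius that does not exist.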
Maybe the solution is that Shape should have a vtable pointing to a ShapeVTable, and Circle should have a vtable pointing to a CircleVTable (which holds functions that take myself arguments of type Circle*, instead of Shape*):

    struct Shape {
        ShapeVTable *vTable;
        ...
    };

    struct Circle {
        CircleVTable *vTable;
        ...
    };

But we’ve already seen that this causes Shapes and Circles to be incompatible.
In particular, the following code thinks that it is looking up a function in a ShapeVTable, when it is actually using a CircleVTable:

    Circle c;
    Shape *s = &c;
    double a = (*(s->vTable->area))(s);

The type system can’t be sure that ShapeVTable and CircleVTable are compatible, so it won’t allow this code to typecheck.
So we’re stuck with a circle_area that takes a Shape* and therefore doesn’t typecheck without a cast:

    double circle_area(Shape *myself) {
        return PI * myself->radius * myself->radius;
    }

Compare this to C++’s virtual functions:

    double Circle::area(Circle *this) {
        return PI * this->radius * this->radius;
    }

With C++’s virtual functions, Circle::area somehow knows that it gets a this pointer of type Circle*, not Shape*.
    double Circle::area(Circle *this) {
        return PI * this->radius * this->radius;
    }

Circle::area knows that it gets a this pointer of type Circle*, even when the caller doesn’t know that it has a Circle:

    Circle c;
    Shape *s = &c;
    double a = s->area();
    // at run-time (after the v-table lookup), this means:
    // double a = Circle::area(s); // s is the “this” ptr

Even though the caller is dealing with a type Shape*, the function that gets called knows that it gets a Circle*, not just a Shape*.
So virtual member functions know more about their own data than the caller knows. This is a very powerful form of data hiding, or abstraction. C++’s support for this abstraction is the reason that its type system is more expressive than C’s or Pascal’s type system, and is why C++ is particularly suitable for object-oriented programming.

Note: languages with higher-order functions share some of this abstraction expressiveness, and higher-order functions are closely related to objects in this respect.
“What is a ‘virtual member function’? From an OO perspective, it is the single most important feature of C++”
- from the C++ FAQ Lite by Marshall Cline, http://www.cerfnet.com/~mpcline/C++-FAQs-Lite/

Object-oriented programming consists of building objects that know how to operate on their own data. Virtual functions are the mechanism that binds code tightly to data, and lets the code know more about the data than anyone else knows about the data.
Remember to turn in homework 5.
Good luck on your projects!