Lecture 3: Pointers and References

Introduction to Pointers in C++
In Java, there are only primitive types and objects. Local primitives are allocated on the stack, and objects are always allocated on the heap.
In C++, both primitives and objects may be allocated on the stack or on the heap.
Anything can be reached by holding the address of its location.
A variable that holds the address of a memory location is called a pointer.
Main memory can be thought of as an array of bytes, and a pointer is an index into this array.

C++ Pointers
Pointers are themselves primitive types. A pointer variable is stored like any other variable: a local pointer lives on the stack, while a pointer that is a member of a heap object lives on the heap.
A pointer holds the memory address of a primitive or an object, which may reside on the heap or on the stack.
A pointer to a type type_a has type type_a*, and type_a* is a primitive type regardless of what type_a is.
Suppose that aPtr is a pointer to the memory of some variable. Reading that variable is done by dereferencing aPtr:
    std::cout << *aPtr;

C++ Pointers
Most types require more than one byte. The pointer holds the address of the first byte in the sequence of bytes holding the value. Typical sizes (they depend on the platform):
    int: 4 bytes
    long: 8 bytes
    double: 8 bytes
    string: variable length
How can the compiler know where the value ends? The pointer's type tells it how many bytes to read starting at the address.
Example: the int value 7 is stored as:
    Hexadecimal: 00 00 00 07 (0x07), where 0xFF = 255
    Binary: 00000000 00000000 00000000 00000111 (0b111), where 0b11111111 = 255
The order of these bytes in memory depends on the machine's endianness.
This value may be found starting at address 51084, or 0xC78C, therefore:
    int *p = (int *)51084;
    int *p = (int *)0xC78C;
    int *p = (int *)0b1100011110001100;
The value at this address is 7, while the address itself is 0xC78C.

C++ Pointer: Definition
Using * (asterisk) we define a pointer variable:
    type* pointerName;
A pointer definition must contain the type of the object it will point at.
Syntax:
    int* iPtr;
iPtr is a pointer because of the asterisk in its definition. It points at an int because of the int in its definition.
    string* sPtr;
sPtr is a pointer because of the asterisk in its definition. It points at a string because of the string in its definition.

C++ Pointer: Initialization and Dereferencing
We initialize a pointer with the address of a value whose type matches the type the pointer was declared to point at.
Compiles:
    int* iPtr = new int(5);
Does not compile:
    string* sPtr = new int(5);
What would this do?
    int* iPtr = 5;
It does not compile either: an int is not implicitly converted to a pointer.
Dereferencing is the action of retrieving the value that the pointer points at. Using * (asterisk) we retrieve the value through the pointer:
    std::cout << *iPtr;
Retrieving a variable's location (address): using & (ampersand) we can retrieve the starting location of any variable. Note that &iPtr is the address of the pointer variable itself, not of the int it points at:
    std::cout << &iPtr;

Pointer Example
    #include <iostream>

    int main() {
        int *i_ptr = new int(10);
        std::cout << i_ptr << std::endl;   // the address of the int on the heap
        std::cout << *i_ptr << std::endl;  // the value 10
        std::cout << &i_ptr << std::endl;  // the address of i_ptr itself, on the stack
        return 0;
    }
The following takes place:
A primitive of type int* is allocated on the activation frame of the function main.
The space allocated is associated with i_ptr.
A primitive int is allocated on the heap and initialized to 10.
The address of the allocated int is the value of i_ptr.
For *i_ptr, the operator << of std::cout is passed (by value) the content that i_ptr points to, so it is called with the integer 10.

Another Example
Arrow notation: objectPtr->objectField
It allows convenient access to the members of an object through a pointer to it.
Initializing members inside the constructor body implicitly calls their default constructors first. The member initialization list removes this requirement.
This is needed because C++ evolved from C and kept backward compatibility.
    #include <iostream>

    class Person {
    private:
        int id;
    public:
        // default constructor
        Person() { this->id = 0; }
        // member initialization list
        Person(int id) : id(id) {}
        // Alternative: member initialization inside the constructor body.
        // Only one of the two Person(int) versions may be defined at a time.
        // Person(int id) { this->id = id; }
        int getId() const { return this->id; }
        void setId(int id) { this->id = id; }
        void speak() const { std::cout << "I am " << this->id << std::endl; }
    };

Member Initialization List
Used to initialize an object's data members in C++.
The member initialization list is executed before the body of the constructor.
It is the correct way to initialize object members in C++.
Issues with initializing data members inside the constructor body:
    An implicit call to the default constructor, when objects contain other objects.
    const members of a class cannot be initialized inside the constructor body.
Members are initialized in the order in which they are declared in the class, not in the order in which they appear in the initialization list.
It is hence a convention to keep the list in the same order as the declarations.

Implicit Default Constructor Call: Example
    #include <iostream>
    #include <string>

    class Man {
    private:
        std::string name;
    public:
        Man() {
            std::cout << "CREATE: Man default constructor" << std::endl;
        }
        Man(std::string name) {
            this->name = name;
            std::cout << "CREATE: Man constructor" << std::endl;
        }
        ~Man() {
            std::cout << "DELETE: Man default destructor" << std::endl;
        }
    };

Implicit Default Constructor Call: Example
    class Person {
    private:
        int id;
        Man man;
    public:
        Person() {
            std::cout << "CREATE: Person default constructor" << std::endl;
        }
        // man is constructed directly in the member initialization list
        Person(int id) : id(id), man(Man("mark")) {
            std::cout << "CREATE: Person MIL constructor" << std::endl;
        }
        // man is default-constructed first, then assigned in the body
        Person(int id, std::string s) {
            this->id = id;
            this->man = Man(s);
            std::cout << "CREATE: Person body constructor" << std::endl;
        }
        ~Person() {
            std::cout << "DELETE: Person destructor" << std::endl;
        }
    };

Implicit Default Constructor Call: Example
    int main() {
        Person *p = new Person(5);
        delete p;
        std::cout << "vs" << std::endl;
        Person *p2 = new Person(5, std::string("mark"));
        delete p2;
    }
Output:
    CREATE: Man constructor
    CREATE: Person MIL constructor
    DELETE: Person destructor
    DELETE: Man default destructor
    vs
    CREATE: Man default constructor
    CREATE: Man constructor
    DELETE: Man default destructor
    CREATE: Person body constructor
    DELETE: Person destructor
    DELETE: Man default destructor
With the body constructor, a Man is default-constructed, and then a second, temporary Man is constructed, assigned and destroyed.

Inability to initialize const data members
When the class's constructor is executed, the data members (m_value1, m_value2, and m_value3 in the example) are created first.
Then the body of the constructor runs, where the data members are assigned values.
This mirrors non-object-oriented C++ code that first declares a variable and only later assigns to it.
A const variable cannot be assigned after it has been created, so a const data member cannot be initialized this way.
The member initialization list fixes this problem.

this, -> and .
"this": a pointer to the currently active object. Example:
    void setX(int x) { this->x = x; }
"->": equivalent to dereferencing a pointer and then using the "." operator.
    a->b is equivalent to (*a).b
    (*p).method() is equivalent to p->method()
".": used to access members through a reference or a regular object variable:
    Person p1;
    Person &p2 = p1;
    Person *p3 = &p1;
    std::cout << p1.getId();
    std::cout << p2.getId();
    std::cout << (*p3).getId();

Back to the example
    #include <iostream>

    class Person {
    private:
        int _id;
    public:
        // member initialization list
        Person(int id) : _id(id) {}
        int getId() const { return this->_id; }
        void setId(int id) { this->_id = id; }
        void speak() const { std::cout << "I am " << this->_id << std::endl; }
    };

    int main() {
        Person bety(482528404);
        Person *ula = new Person(83457934);
        bety.speak();
        ula->speak();
        return 0;
    }
Let's have a look at the main function on the next slide.

Back to the example
    int main() {
        Person bety(482528404);
        Person *ula = new Person(83457934);
        bety.speak();
        ula->speak();
        return 0;
    }
The following takes place:
Space for a Person is allocated on the activation frame of the main function, and the constructor of Person is called with 482528404. Inside the constructor, this points to the address of that space on the stack.
The allocated space is associated with the variable bety.
Space for a pointer Person* is allocated on the activation frame of the main function. It is associated with the variable ula.
Space for a Person is allocated on the heap, and its constructor is called with 83457934. Inside the constructor, this points to the address of that space on the heap.
The address of the newly allocated Person is saved in ula.

Dereferencing a Pointer and the "address of" operator
We saw how to dereference a pointer, using the * operator:
    (*ula).speak();
This is valid, since (*ula) is of type Person and can be accessed using the . (dot) operator.
The same can be done with:
    ula->speak();
We sometimes would like to get the address of something. For this there is the "address of" operator, &:
    int i = 10;
    int *i_ptr = &i;
i_ptr holds the address at which i is stored on the stack.

Pointer arguments for a function
    void inc(int *i_ptr) {
        (*i_ptr)++;
    }

    int i = 0;
    inc(&i);
    std::cout << i << std::endl;
The output is 1.

const pointers
    int *const i_ptr
Any type in C++ can be marked const, which means that its value cannot be changed.
const plays two roles:
    It tells the reader that this value is not going to change.
    It prevents bugs caused by changing the variable by mistake.
A const pointer is declared by adding the const keyword after the asterisk:
    int *const i_ptr    (a const pointer to an int)
The pointer itself cannot be changed, so it must be initialized in its declaration. The int it points at can still be modified through it.

const pointers
    const int * x          x is a pointer to a constant integer
    int const * x          the same, written differently
    int * const x          x is a constant pointer to an integer
    int const * const x    x is a const pointer to a const integer

Example: Text vs Heap vs Stack
    #include <iostream>

    int main() {
        int i = 0;
        int* j = new int(0);
        const char* c = "hi there";
        std::cout << &i << "\t"
                  << j << "\t"
                  << static_cast<const void*>(c) << std::endl;
    }
For each of the three printed addresses, which memory section does it belong to?
    &i: stack
    j: the variable j is on the stack, the int it points at is on the heap
    c: the variable c is on the stack, the string literal is in the text section

lvalues and rvalues
Every C++ expression is either an lvalue or an rvalue.
As a rule of thumb, anything that cannot appear on the left side of an assignment is an rvalue.
lvalue:
    An lvalue refers to an object that persists beyond a single expression.
    An lvalue is an object that has a name.
    All variables, including nonmodifiable (const) variables, are lvalues.
rvalue:
    An rvalue is a temporary value.
    It does not persist beyond the expression that uses it.

lvalues and rvalues: example
    int main() {
        int x = 3 + 4;
        std::cout << x << std::endl;
    }
x is an lvalue because it persists beyond the expression that defines it.
3 + 4 is an rvalue because it evaluates to a temporary value that does not persist beyond the expression that defines it.
lvalues are values that can appear on the left side of an assignment, while rvalues are the rest (you can write x = 3 + 4 but not 3 + 4 = x).

The concept of references
A reference is a pointer with restrictions.
A reference is a way of accessing an object, but is not the object itself.
The syntax of references is simpler than the pointer syntax:
    Upon definition, a reference acts like a pointer variable.
    Once defined, and upon use, it acts like a non-pointer variable.
C++ has two types of references:
    lvalue references
    rvalue references

lvalue references
An lvalue reference is like a const pointer to an lvalue, without any pointer notation.
A reference may only be bound once, when it is declared.
Upon declaration, the reference is initialized with some lvalue expression. A non-const lvalue reference cannot receive an rvalue expression as its initializer.
It may not be altered to reference something else later.
Once declared, it acts as a non-pointer variable.
Any change made via the reference modifies the value it references.

Example:
    int i = 0;
    int &i_ref = i;
    i_ref++;
    std::cout << i << std::endl;
What is the output? 1
We can have a reference to any type by adding & after the type.
However, we cannot have references to references.

const lvalue References
Non-const lvalue references can only accept lvalues.
    int foo() { return 42; }
    int x = 10;
    int& i = 4;             // illegal: 4 is not an lvalue
    int& j = x + 1;         // illegal: again, not an lvalue
    int& k = foo();         // illegal: not an lvalue either
    const int& s = foo();   // this works! Why?

const lvalue References
    const int& s = foo();   // this works!
A const lvalue reference may bind to a temporary: the temporary value returned by foo() lives until s goes out of scope.
Be careful with references to function results in general, since they may create a dangling reference problem: once a function finishes execution, its stack frame is removed along with all its local variables, so a reference to one of them refers to nothing valid.
Function return value best practices:
    Return by value: for variables found on the stack and for small variables.
    Return by pointer: for variables found on the heap and for big variables.
    Return by reference: only for objects found on the heap, never for objects on the stack.

Parameter Passing
There are two types of parameters: "IN" parameters and "OUT" parameters.
IN parameters:
    Variables passed to the function which the function does not need to change.
    If they are sent by value, the function cannot change the originals: any mutating operation on them is not visible outside the function.
OUT parameters:
    They are meant as a side channel through which the function may return information, in addition to the return value.
    Changes made to these parameters are visible outside the scope of the function.

Parameter Passing
In Java there are 2 implicit forms of parameter passing:
    Primitives are passed by value, as 'in' parameters.
    Objects are passed by reference, as possible 'out' parameters.
Java chooses between the two automatically, according to the type. The syntax is the same in both cases, hence the action is implicit.
In C++ there are 3 explicit forms of parameter passing. The programmer states the method explicitly, by using the appropriate type:
    By value: as 'in' parameters
    By pointer: either 'in' or 'out'
    By reference: either 'in' or 'out'

By Value
    void byVal(int i, Person person) {
        person.setId(i);
    }

    Person hemda(20);
    byVal(30, hemda);
    std::cout << hemda.getId() << std::endl;
The output is 20.
When we call byVal, both 30 and the entire content of hemda are copied and placed on the activation frame.
byVal performs all of its operations on these local copies only.

By Pointer
If we do not want to copy objects, or want to change the parameter, we can use pointers.
    void byPointer(int i, Person *person) {
        person->setId(i);
    }

    Person hemda(20);
    byPointer(30, &hemda);
    std::cout << hemda.getId() << std::endl;
The output is 30.
byPointer received a pointer to the location of hemda on the caller's activation frame, and changed its id.

By Reference (lvalue)
When we wish to refrain from using pointers, which are inherently unsafe, we may use references.
    void byReference(int i, Person &person) {
        person.setId(i);
    }

    Person hemda(20);
    byReference(30, hemda);
    std::cout << hemda.getId() << std::endl;
This code produces the same output as before (30), but we did not have to pass pointers.
Moreover, the compiler is allowed to optimize the reference beyond the "const pointer" abstraction from above.

Best Practices for Parameter Passing
For IN parameters:
    If the data object is small, pass it by value.
    If the data object is an array, use a pointer, and make it a pointer to const.
    If the data object is a struct, use a const reference.
    If the data object is a class object, use a const reference.
For OUT parameters:
    If the data object is a built-in data type, use a pointer or a reference.
    If the data object is an array, use a pointer.
    If the data object is a struct or a class object, use a reference.
When receiving a pointer, check whether it is nullptr.
A reference is never null, so there is no need to check references.

Returning Values From Functions
Values can be returned by value, by reference or by pointer.
When returning something by reference or by pointer, care should be taken not to return a reference or a pointer into the soon-to-be-demolished activation frame!
    Person& f(int x) {
        Person c(x);
        return c;   // THIS IS WRONG!
    }
Once the function finishes executing, its stack frame is completely removed.

Another bad example
    #include <iostream>

    int* f() {
        int i = 1;
        std::cout << &i << std::endl;
        return &i;   // WRONG: the address of a local variable
    }

    void g() {
        int k = 2;
        std::cout << &k << std::endl;
    }

    int main() {
        int *i = f();
        std::cout << *i << std::endl;   // undefined behavior
        g();
        std::cout << *i << std::endl;   // undefined behavior
        return 0;
    }
Output with different optimization levels:
    g++ 1.cpp; ./a.out        0xbffff564  134513864  0xbffff564  134513864
    g++ 1.cpp -O1; ./a.out    0xbffff564  1          0xbffff564  2
    g++ 1.cpp -O2; ./a.out    0xbffff560  1          0xbffff564  134519000
    g++ 1.cpp -O3; ./a.out    0xbffff574  1          0xbffff570  1
We cannot predict how our program will behave! No flag is raised for us: no exception, no segmentation fault. It simply behaves differently each time. It is undefined behavior.

C++ arrays
An array in C++ is a block of contiguous memory that contains data of the same defined type.
An array of five integers which holds the number 5 in each cell is defined like this:
    int arr[5] = {5, 5, 5, 5, 5};
Each cell is 4 bytes, which corresponds to the size of int in this example.
Defining an array is done by:
    Denoting the cell type: int
    Denoting its predefined size using square brackets: [5]
    Initializing it using curly brackets: {5, 5, 5, 5, 5}
Supplying more initializers than the declared size is a compilation error.

C++ arrays
By dereferencing a pointer to a specific cell we access that cell.
Array indices begin at 0.
The array variable (for example, arr) denotes the beginning address of the array.
The array variable behaves like a const pointer: in most expressions it converts to a pointer to the first cell, and it cannot be made to refer to a different array.
Using square brackets we access a specified cell:
    arr[3]: array notation
    *(arr+3): explicit pointer arithmetic and pointer dereferencing
Pointer arithmetic: by adding 3 to arr, we add 3 ints to the initial address, that is, 3 times sizeof(int) bytes.

Arrays on the Heap
Arrays may be allocated on the stack or on the heap.
Allocating an array on the heap is done using new[], and deallocating it is done using delete[]:
    int *arr = new int[100];
    std::cout << arr[2] << std::endl;
    delete[] arr;
The output of this code is not guaranteed to be 0. For class types, new[] initializes each element using its default constructor, but for a built-in type such as int the elements are left uninitialized, so reading arr[2] yields an indeterminate value.
To get zeros, request value-initialization with empty parentheses: new int[100]().

Arrays on the Heap
Consider the code:
    Person **Person_arr = new Person*[100];
    for (int i = 0; i < 100; i++)
        Person_arr[i] = new Person(i);
    delete[] Person_arr;
We create new Person objects on the heap and store pointers to them in Person_arr.
When we call delete[], we expect the array to be deallocated.
However, each element in the array is a pointer, so the individual Persons we allocated will not be deleted.
We need to delete each Person manually, before deleting the array.

Arrays on the Stack
To allocate an array on the stack, the array's size must be known in advance:
    Person Person_arr[5];
Each Person will be initialized using its default constructor.
We may also tell the compiler how to initialize the individual Persons:
    Person Person_arr[5] = {Person(1), Person(21), Person(454), Person(8), Person(88)};
Accessing the cells of an array on the stack is done in the same way as through a pointer.
In most expressions, Person_arr converts to a pointer to the beginning of the array of Persons.
