CSE 214 – Computer Science II
Pointers & Memory Management
Image source: http://www.joelneuenhaus.com/blog3/wp-content/uploads/2007/01/homer-simpson-brain-1024.jpg

Ever used C?
Java inherited much from C: similar primitives (though not entirely the same), similar conditional statements, and similar loops.
But lots is different: pointers, structs, C memory management, and much more.

What is a computer’s memory?
A sequential group of storage cells; typically a cell is a byte.
Each cell has an address (a number).
In C, as in Java, some cells contain data and some cells contain references to other memory addresses.
Memory addresses are typically 4 bytes on 32-bit systems and 8 bytes on 64-bit systems.

Pointers
In Java, what do object variables store? Memory addresses. This makes them pointers.
What do they point to? Object data on the heap.
In C, you can have either a pointer to a struct/array/primitive OR the actual struct/array/primitive.

*
Besides multiplication, * is a C operator used for 2 purposes:
1. To declare a variable as a pointer. Ex: int *myIntPointer;
   Right now, myIntPointer holds an indeterminate (effectively random) address. Using it before initialization is undefined behavior and often results in a segmentation fault or bus error.
2. To dereference an existing pointer variable. Huh? That means to get what’s at the address the pointer stores.

&
A C operator used for getting the address of a variable.
This produces an address of the same kind as what a pointer stores.

Think of it this way
When you compile a C program, instructions are added for properly running your program. One thing these instructions do is manipulate declared variables.
Every declared variable is stored at a memory location. So the question is: what’s there?
For a regular int, just a number. For an int*, the memory address of an int.

Example, what output will we get?
    int num = 5;
    int *pNum = &num;
    num = 7;
    printf("num = %d\npNum = %d\n", num, *pNum);
    if (&num == pNum)
        printf("true\n");
    else
        printf("false\n");
OUTPUT:
    num = 7
    pNum = 7
    true

Why do we need pointers? What good are they?
So who cares? In C, pointers can be used to multiple advantages: call-by-reference methods, dynamic arrays, dynamic memory allocation, and pointer arithmetic.
Data at the end of a pointer can be filled in later.

Call-by-value
In Java, when you pass an argument to a method, you are actually passing a copy of that argument. This is called call-by-value. Ex:
    public static void main(String[] args) {
        int x = 5;
        junk(x);
        System.out.println("x is " + x);
    }
    public static void junk(int argument) {
        argument++;
    }
OUTPUT:
    x is 5

C also has call-by-reference
We can pass the address of a variable. So? We can directly change the original variable. Ex:
    void junk(int cbv, int *cbr);
    int main() {
        int x = 5;
        int y = 6;
        junk(x, &y);
        printf("x is %d\ny is %d\n", x, y);
    }
    void junk(int cbv, int *cbr) {
        cbv++;
        (*cbr)++;
    }
OUTPUT:
    x is 5
    y is 7

Is it still call by value?
Some might say it’s still technically call-by-value: we are passing a copy of the address.
So, assigning a new address to the pointer parameter would not change the caller’s original pointer.

What’s the output and why?
    void junk(int *test);
    int main() {
        int x = 5;
        junk(&x);
        printf("x is %d\n", x);
    }
    void junk(int *test) {
        int num = 10;
        test = &num;
    }
OUTPUT:
    x is 5

What’s dynamic memory allocation?
In Java, it is when objects are constructed based on decisions made at runtime.
In C, it is when structs and arrays are constructed based on decisions made at runtime.
For dynamic structs & arrays, we can use pointers.

First things first, what’s a struct?
Like a record in Pascal: a single construct that can store multiple variables.
Sounds like an object, BUT it does not have methods.

Declaring a struct type & struct variables
To declare a struct type:
    struct Point {
        int x;
        int y;
    };
To declare a struct variable:
    struct Point p1;
To reference data in a struct, use the ‘.’ operator, just like an object:
    p1.x = 5;

    struct Point { int x; int y; };
    void changePoint(struct Point p);
    int main() {
        struct Point p1;
        p1.x = 100;
        p1.y = 200;
        changePoint(p1);
        printf("p1.x is %d\np1.y is %d\n", p1.x, p1.y);
    }
    void changePoint(struct Point p) {
        p.x *= 5;
        p.y *= 4;
    }
OUTPUT:
    p1.x is 100
    p1.y is 200
We just changed p, a copy of p1, but not p1.

    struct Point { int x; int y; };
    void changePoint(struct Point *p);
    int main() {
        struct Point p1;
        p1.x = 100;
        p1.y = 200;
        changePoint(&p1);
        printf("p1.x is %d\np1.y is %d\n", p1.x, p1.y);
    }
    void changePoint(struct Point *p) {
        (*p).x *= 5;
        (*p).y *= 4;
    }
OUTPUT:
    p1.x is 500
    p1.y is 800
We just changed what p points to, and so changed p1.

BTW, where is p1 in memory?
    int main() {
        struct Point p1;
        p1.x = 100;
        p1.y = 200;
        changePoint(&p1);
        printf("p1.x is %d\np1.y is %d\n", p1.x, p1.y);
    }
IN THE main method STACK FRAME!

    struct Point { int x; int y; };
    struct Point* makePoint();
    int main() {
        struct Point *p1 = makePoint();
        printf("p1.x is %d\np1.y is %d\n", (*p1).x, (*p1).y);
    }
    struct Point* makePoint() {
        struct Point p1;
        p1.x = 100;
        p1.y = 200;
        return &p1;
    }
What happens? DISASTER! WHY? When makePoint ends, p1 gets popped from the stack, so the returned address no longer refers to valid data.

How can we fix this?
Declare a struct pointer variable. When you want to make one on the heap, use malloc.
What’s malloc? A function for dynamic memory allocation: you give it a size, it gives you that many bytes of contiguous memory cells, and it returns the address of the first byte in the block (or NULL if the request fails).

Note, always free what you malloc
C has no garbage collector. If you malloc something, when you’re done with it you need to free it.
Why? If you don’t, the memory will not be recycled. This is called a memory leak.
Who cares? You should, if you want your program to run efficiently.
What’s free? A function that releases the memory block whose address is passed as its argument.

    struct Point { int x; int y; };
    struct Point* makePoint();
    int main() {
        struct Point *p1 = makePoint();
        printf("p1.x is %d\np1.y is %d\n", p1->x, p1->y);
        free(p1);
    }
    struct Point* makePoint() {
        struct Point *p1;
        p1 = malloc(sizeof(struct Point));
        (*p1).x = 100;
        (*p1).y = 200;
        return p1;
    }
NOW IT WORKS

->
Used to access a member through a pointer to a struct; p->x is shorthand for (*p).x. Ex:
    struct Point { int x; int y; };
    int main() {
        int pointBytes = sizeof(struct Point);
        struct Point *p = malloc(pointBytes);
        p->x = 10;
        p->y = 20;
        printf("x is %d\ny is %d\n", p->x, p->y);
        free(p);
    }

What if we don’t free?
    struct Point *p;
    int pointSize = sizeof(struct Point);
    int i;
    for (i = 0; i < 10000; i++) {
        p = malloc(pointSize);
        p->x = i % 1000;
        p->y = i % 500;
        printf("p->x is %d\tp->y is %d\n", p->x, p->y);
    }
Memory Leak! (that’s a bad thing)
Note: by the time this loop ends, your program will be using up 8 bytes * 10000 = 80000 bytes of memory for one p variable, and all but the last block can no longer be reached.
Where’d I get 8 bytes from? A struct Point holds two ints, typically 4 bytes each.

What if we don’t free?
    struct Point *p;
    int pointSize = sizeof(struct Point);
    int i;
    for (i = 0; i < 10000; i++) {
        p = malloc(pointSize);
        p->x = i % 1000;
        p->y = i % 500;
        printf("p->x is %d\tp->y is %d\n", p->x, p->y);
        free(p);
    }
No memory leak now.
What if we were to use p->x now? That is a dangling reference (that’s bad): the data might still be there, it might not (segmentation fault error possible).

We can dynamically construct arrays too
With malloc, we can request contiguous blocks of memory, right? So a pointer will point to the front of the block, and that can be the first index in an array.
What can we put into arrays? Primitives, pointers to primitives (other arrays of primitives), structs, and pointers to structs.

char* as an array
    char *text;
    text = malloc(sizeof(char) * 4);
    int i;
    for (i = 0; i < 3; i++)
        text[i] = (char)(65 + i);
    text[3] = '\0';
    printf("text is %s\n", text);
Output:
    text is ABC

Pointers to Pointers (for 2D arrays)
    char **text2;
    text2 = malloc(sizeof(char*) * 2);
    text2[0] = malloc(sizeof(char) * 4);
    for (i = 0; i < 3; i++)
        text2[0][i] = (char)(i + 65);
    text2[0][3] = '\0';
    printf("text2[0] is %s\n", text2[0]);
    text2[1] = malloc(sizeof(char) * 6);
    for (i = 0; i < 5; i++)
        text2[1][i] = (char)(i + 68);
    text2[1][5] = '\0';
    printf("text2[1] is %s\n", text2[1]);

Array of structs
    struct Point *points;
    int numPoints = 5;
    points = malloc(sizeof(struct Point) * numPoints);
    int i;
    srand(time(NULL));
    for (i = 0; i < numPoints; i++) {
        points[i].x = rand() % 1000;
        points[i].y = rand() % 1000;
        printf("points[%d] = (%d,%d)\n", i, points[i].x, points[i].y);
    }
Sample run:
    bash-2.05$ ./StructArraysTester
    points[0] = (597,209)
    points[1] = (41,800)
    points[2] = (464,96)
    points[3] = (59,892)
    points[4] = (418,231)

Array of struct pointers
    struct Point **array;
    int numPointers = 100;
    int numPoints = 0;
    array = malloc(sizeof(struct Point*) * numPointers);
    srand(time(NULL));
    int i;
    for (i = 0; i < 2; i++) {
        array[i] = malloc(sizeof(struct Point));
        array[i]->x = rand() % 1000;
        array[i]->y = rand() % 1000;
        numPoints++;
        printf("point%d = (%d,%d)\n", i, array[i]->x, array[i]->y);
    }
OUTPUT:
    point0 = (403,358)
    point1 = (941,453)

Segmentation Fault
One of the most common C errors. It means you are trying to access a place in memory that you do not have permission to access.
Typical problems: accessing an uninitialized pointer, improper pointer arithmetic.

Detecting Memory Leaks
How do we know if we have a memory leak? Is the program’s memory footprint growing when it should not be? If yes, you have a leak.
malloc gets passed the # of bytes to allocate; free does not get passed such info. The library keeps this info internally, but we don’t have access to it.
To track memory allocation, we can define our own memory allocation/deallocation methods.

What are our malloc & free going to do?
Same as before. How? We’ll call C’s malloc and free from them.
What else? When mallocing, add a little extra memory for some header info. When freeing, extract the header info.
What kind of info? The size of the allocation and a checksum for error checking.

Time to practice pointer arithmetic
What’s that? A feature of the C language.
If you have a pointer char *text, it points to the memory address storing text. text + 1 points to 1 byte after the start of text. That might be the second character, or it might not: you need to know what you’re moving your pointer to.
In general, adding 1 to a pointer moves it by the size of the type it points to, which is 1 byte for a char.
Why do we care? We need to stick our info at the front of our object.

malloc_info
Our header: we’ll stick it in front of each piece of data we allocate.
We can total memory usage in global variables totalAlloc, totalFreed, dataAlloc, & dataFreed.
    #define MALLOC_CHECKSUM 123456789
    struct malloc_info {
        int checksum;
        size_t bytes;
    };
    long totalAlloc, totalFreed, dataAlloc, dataFreed;

    void* our_malloc(size_t bytes) {
        void *data;
        struct malloc_info *info;
        /* ask for the requested bytes plus room for our header */
        data = malloc(bytes + sizeof(struct malloc_info));
        if (!data)
            return NULL;
        else {
            size_t headerSize = sizeof(struct malloc_info);
            totalAlloc += bytes + headerSize;
            dataAlloc += bytes;
            /* the header lives at the front of the block */
            info = (struct malloc_info *)data;
            info->checksum = MALLOC_CHECKSUM;
            info->bytes = bytes;
            /* hand back the address just past the header */
            char *data_char = (char*)data;
            data_char += headerSize;
            return (void*)data_char;
        }
    }

    void our_free(void *data) {
        struct malloc_info *info;
        void *data_n_header;
        size_t header_size = sizeof(struct malloc_info);
        /* step back over the header to the true start of the block */
        char *data_char = (char*)data;
        data_char = data_char - header_size;
        data_n_header = (void*)data_char;
        info = (struct malloc_info *)data_n_header;
        if (info->checksum != MALLOC_CHECKSUM) {
            /* C has no exceptions: report the bad header and stop */
            fprintf(stderr, "our_free: bad checksum\n");
            abort();
        }
        totalFreed += info->bytes + sizeof(struct malloc_info);
        dataFreed += info->bytes;
        free(data_n_header);
    }

So what?
So we can monitor the following differences: totalAlloc – totalFreed and dataAlloc – dataFreed.
We can also reset them if we wish. Why? Reset before a method starts, then check the differences after the method completes.
This is only relevant for methods that are supposed to have a 0 net memory allocation. If the differences are > 0, a memory leak may exist.
