Memory Leaks and Valgrind

Malloc'd objects, garbage collection, and memory leaks

The Java language provides "automagic" garbage collection, in which the storage for an object is reclaimed once all references to the object have gone out of existence. C provides no such mechanism.
A memory leak is said to have occurred when:

- the last pointer to a malloc'd object is reset, or
- the last pointer to a malloc'd object is a local variable in a function from which a return is made.

In these cases the malloc'd memory is no longer accessible. Excessive leaking can lead to poor performance and, in the extreme, program failure.
C programmers must recognize when the last pointer to malloc'd storage is about to be lost and use the free() function call to release the storage before it becomes impossible to do so. Several examples of incorrect pointer use and memory leaking have been observed in student programs.
Problem 1: the instant leak

This is an example of an instant leak. The memory is allocated at the time temp is declared and leaked when temp is reassigned the address of the first object in the list.

    obj_t *temp = (obj_t *) malloc(sizeof(obj_t));
    temp = list->head;
    while (temp != NULL)
        -- process list of objects --

Two possible solutions:

Possible solution: Insert free(temp) before temp = list->head; This eliminates the leak, but what benefit is there to allocating storage and instantly freeing it?
Possible solution: Instead of allocating memory for temp, change the declaration to

    obj_t *temp = NULL;

This is the correct solution.

Rules of thumb:

- A rational rule of thumb is never malloc memory unless you are going to write into it!
- Another good rule of thumb is to never declare a pointer without initializing it.
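The corrected declaration can be sketched as follows. The slides do not show obj_t or the list type, so the fields below (value, next, head, tail) and the function name sum_list() are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object and list types; the slides do not define them. */
typedef struct obj_type {
    int value;                  /* assumed payload */
    struct obj_type *next;
} obj_t;

typedef struct list_type {
    obj_t *head;
    obj_t *tail;
} list_t;

/* Walk the list and sum the payloads without allocating anything. */
int sum_list(const list_t *list)
{
    int total = 0;
    obj_t *temp = NULL;         /* initialized, but no malloc: nothing to leak */

    assert(list != NULL);
    temp = list->head;
    while (temp != NULL) {
        total += temp->value;   /* "process" the object */
        temp = temp->next;
    }
    return total;
}
```

Because temp only ever borrows addresses that the list already owns, there is nothing for this function to free.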
Problem 2: The traditional leak

In this traditional leak, storage is also allocated for univec at the time it is declared. The call to vl_unitvec3() writes into that storage. If the storage is not malloc'd, then univec will not point to anything useful and the call to vl_unitvec3() will produce a segfault or will overwrite some other part of the program's data. So this malloc() is necessary.

    {
        double *univec = malloc(3 * sizeof(double));
        vl_unitvec3(dir, univec);
        :
        more stuff involving univec
        return;
    }
Although this malloc is necessary, the instant the return statement is executed, univec goes out of scope, the address it held is no longer accessible, and the memory has been leaked. The correct solution is to add free(univec); just before the return;
A rational rule of thumb is: malloc'd storage must be freed before the last pointer to it is lost.
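A minimal sketch of the allocate, use, free pattern follows. The body of vl_unitvec3() is not shown in these slides, so a hypothetical stand-in, vl_scale3(), is used here; it has the same shape (it writes three doubles into a buffer supplied by the caller). The names scaled_sum() and scratch are also assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for vl_unitvec3(): writes s * in[] into out[]. */
static void vl_scale3(double s, const double *in, double *out)
{
    for (int i = 0; i < 3; i++)
        out[i] = s * in[i];
}

/* Returns the sum of the components of 2 * dir, using a scratch buffer. */
double scaled_sum(const double *dir)
{
    double *scratch = malloc(3 * sizeof(double));
    assert(scratch != NULL);

    vl_scale3(2.0, dir, scratch);   /* writes into the malloc'd storage */
    double sum = scratch[0] + scratch[1] + scratch[2];

    free(scratch);                  /* release before the last pointer is lost */
    return sum;
}
```

Anything still needed after the return (here, sum) is copied out before the call to free().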
Problem 3: Overcompensation

The concern about leakage might lead to an overcompensation. For example, an object loader might do the following:

    obj_t *new_obj;
    :
    new_obj = malloc(sizeof(obj_t));
    if (list->head == NULL) {
        list->head = list->tail = new_obj;
    } else {
        list->tail->next = new_obj;
        list->tail = new_obj;
    }
    free(new_obj);
This problem is the reverse of a memory leak. A live pointer to the object still exists through the list structure, but the storage it points to has been freed (a dangling pointer).
The results of this are:

- Usually, attempts to reference the freed storage will appear to succeed.
- The storage will eventually be assigned to another object in a later call to malloc(). Then “both” objects will occupy the same storage.
Rational rule of thumb: Never free an object while live pointers to the object exist. Any pointers to the freed storage that exist after the return from free() should be set to NULL.

To fix this problem, the free(new_obj) must be deleted from the code. If the objects in the object list are to be freed, it is safe to do so only at the end of processing. It is not imperative to do so at that point because the operating system will reclaim all memory used by the process when the program exits.
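One way to arrange this is sketched below: the loader appends without freeing, and a separate routine frees the whole list at the end of processing. The obj_t and list_t layouts and the names list_append() and list_free() are assumptions; the slides show only fragments of the loader.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types; only head, tail and next appear in the slides. */
typedef struct obj_type {
    int id;                         /* assumed payload */
    struct obj_type *next;
} obj_t;

typedef struct list_type {
    obj_t *head;
    obj_t *tail;
} list_t;

/* Append a new object. The list now owns it, so it is NOT freed here. */
obj_t *list_append(list_t *list, int id)
{
    obj_t *new_obj = malloc(sizeof(obj_t));
    assert(new_obj != NULL);
    new_obj->id = id;
    new_obj->next = NULL;

    if (list->head == NULL)
        list->head = new_obj;
    else
        list->tail->next = new_obj;
    list->tail = new_obj;
    return new_obj;
}

/* Free every object at the end of processing; returns how many were freed. */
int list_free(list_t *list)
{
    int count = 0;
    obj_t *temp = list->head;

    while (temp != NULL) {
        obj_t *next = temp->next;   /* save the link before freeing */
        free(temp);
        temp = next;
        count++;
    }
    list->head = list->tail = NULL; /* leave no pointers to freed storage */
    return count;
}
```

Note that list_free() reads temp->next before calling free(temp); reading it afterwards would be a reference to freed storage.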
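Problem 3b: Overcompensation revisited

The free() function must be used only to free memory previously allocated by malloc() (or its relatives calloc() and realloc()).

    unsigned char buf[256];
    :
    free(buf);

is a fatal error: buf is an array, not malloc'd storage, so the behavior of free(buf) is undefined.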
The free() function must not be used to free the same area twice.

    buf = (unsigned char *)malloc(256);
    :
    free(buf);
    :
    free(buf);

is also a fatal error.
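A common defensive habit, sketched here under the assumption of a hypothetical helper named free_and_clear(), is to reset a pointer to NULL as soon as its storage is freed. The C standard defines free(NULL) as doing nothing, so an accidental second call through the same variable becomes harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free *pp and reset it to NULL. */
void free_and_clear(unsigned char **pp)
{
    assert(pp != NULL);
    free(*pp);      /* free(NULL) is defined to do nothing */
    *pp = NULL;     /* a repeated call now frees NULL, not stale storage */
}
```

This protects only the one pointer variable passed in. Any other copies of the address are still dangling, and the helper does nothing to make free() of a non-malloc'd array safe.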
General Solution to Freeing Memory

Even in fairly complicated programs, it is usually easy for an experienced programmer to know when to free() dynamically allocated storage. In programs as complicated as the Linux kernel it is not. There, a technique known as reference counting is used: each object records how many pointers currently refer to it, and it is freed when that count drops to zero.
    typedef struct obj_type {
        int refcount;
        :
    } obj_t;

At object creation time:

    new_obj = malloc(sizeof(obj_t));
    new_obj->refcount = 1;

When a new reference to the object is created:

    my_new_ptr = new_obj;
    my_new_ptr->refcount += 1;
When a reference is about to be reused or lost:

    my_new_ptr->refcount -= 1;
    if (my_new_ptr->refcount == 0)
        free(my_new_ptr);
    my_new_ptr = NULL;

In a multithreaded environment, such as in an OS kernel, it is mandatory that the testing and update of the reference counter be done atomically.
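The three steps can be gathered into small helper functions. The sketch below uses C11 <stdatomic.h> so that the decrement and the test for zero happen as one atomic operation; this is an illustration in user-space C, not the kernel's own primitives, and the names obj_create(), obj_get() and obj_put() and the value field are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct obj_type {
    atomic_int refcount;
    int value;                      /* assumed payload */
} obj_t;

/* Object creation: the creator holds the first reference. */
obj_t *obj_create(int value)
{
    obj_t *obj = malloc(sizeof(obj_t));
    assert(obj != NULL);
    atomic_init(&obj->refcount, 1);
    obj->value = value;
    return obj;
}

/* A new reference is created: bump the count and hand back the pointer. */
obj_t *obj_get(obj_t *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
    return obj;
}

/* A reference is about to be lost. Returns 1 if the object was freed. */
int obj_put(obj_t *obj)
{
    /* atomic_fetch_sub returns the value held BEFORE the subtraction,
       so a result of 1 means this call dropped the last reference. */
    if (atomic_fetch_sub(&obj->refcount, 1) == 1) {
        free(obj);
        return 1;
    }
    return 0;
}
```

After calling obj_put() the caller should still set its own pointer to NULL, as in the fragment above, since that pointer no longer counts as a reference.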
Valgrind – debugging utility

Memory leaks are such a common problem that utilities have been developed to help detect them. One of the more popular freely available, open-source tools is “Valgrind”. Valgrind can actually be used to detect a number of common problems with using pointers; we will limit the discussion here to memory leaks. You can google “Valgrind” and you will find a host of tutorials and directions on using it.
Consider the following leak.c program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <assert.h>

    #define INFILE 1
    #define ARGS 2

    const int N = 5;

    int main(int argc, char *argv[])
    {
        int i;
        int *iptr = NULL;
        FILE *input = NULL;

        // assert sufficient command-line arguments
        assert(argc >= ARGS);

        // open file and verify success
        input = fopen(argv[INFILE], "r");
        assert(input != NULL);
leak.c (cont'd)

        // dynamically allocate memory for N integers and verify success
        iptr = (int *)malloc(N * sizeof(int));
        assert(iptr != NULL);

        // read data
        for (i = 0; i < N; i++)
            fscanf(input, "%d\n", (iptr + i));

        // print data
        for (i = 0; i < N; i++)
            printf("%d\n", *(iptr + i));

        // x is allocated and then immediately overwritten: an instant leak
        int *x = (int *) malloc(7 * sizeof(int));
        x = &i;
        printf("\nx = %d\n", *x);

        return 0;
    }
Compile the program using

    gcc -Wall -o leak -g leak.c

Run Valgrind using

    valgrind ./leak data.txt
The following output is produced by Valgrind:

    ==31095==
    ==31095== HEAP SUMMARY:
    ==31095==     in use at exit: 636 bytes in 3 blocks
    ==31095==   total heap usage: 4 allocs, 1 frees, 684 bytes allocated
    ==31095== LEAK SUMMARY:
    ==31095==    definitely lost: 68 bytes in 2 blocks
    ==31095==    indirectly lost: 0 bytes in 0 blocks
    ==31095==      possibly lost: 0 bytes in 0 blocks
    ==31095==    still reachable: 568 bytes in 1 blocks
    ==31095==         suppressed: 0 bytes in 0 blocks
    ==31095== Rerun with --leak-check=full to see details of leaked memory
    ==31095== For counts of detected and suppressed errors, rerun with: -v
    ==31095== ERROR SUMMARY: 12 errors from 1 contexts (suppressed: 0 from 0)
The key line I want to look at is the statement “definitely lost: 68 bytes in 2 blocks”. This means 68 bytes were “leaked” during the execution of this program. I can get more detail by rerunning Valgrind with the “--leak-check=full” option, i.e.:

    valgrind --leak-check=full ./leak data.txt

This time I get the following output:
    ==31099==
    ==31099== HEAP SUMMARY:
    ==31099==     in use at exit: 636 bytes in 3 blocks
    ==31099==   total heap usage: 4 allocs, 1 frees, 684 bytes allocated
    ==31099== 20 bytes in 1 blocks are definitely lost in loss record 1 of 3
    ==31099==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31099==    by 0x40092F: main (leak.c:48)
    ==31099== 48 bytes in 1 blocks are definitely lost in loss record 2 of 3
    ==31099==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31099==    by 0x40097D: fill (leak-fns.c:8)
    ==31099==    by 0x4008E4: main (leak.c:41)
    ==31099== LEAK SUMMARY:
    ==31099==    definitely lost: 68 bytes in 2 blocks
    ==31099==    indirectly lost: 0 bytes in 0 blocks
    ==31099==      possibly lost: 0 bytes in 0 blocks
    ==31099==    still reachable: 568 bytes in 1 blocks
    ==31099==         suppressed: 0 bytes in 0 blocks
    ==31099== Reachable blocks (those to which a pointer was found) are not shown.
    ==31099== To see them, rerun with: --leak-check=full --show-leak-kinds=all
The first loss record leads me right to the malloc() in main() at leak.c, line 48:

    ==31099== 20 bytes in 1 blocks are definitely lost in loss record 1 of 3
    ==31099==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31099==    by 0x40092F: main (leak.c:48)

The second loss record leads to the fill() function, called from main() at leak.c, line 41:

    ==31099== 48 bytes in 1 blocks are definitely lost in loss record 2 of 3
    ==31099==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==31099==    by 0x40097D: fill (leak-fns.c:8)
    ==31099==    by 0x4008E4: main (leak.c:41)
The code for fill() is

     1  #include <stdio.h>
     2  #include <stdlib.h>
     3  #include <assert.h>
     4
     5  void fill(FILE *in, int *mydata, int numVals)
     6  {
     7      int i;
     8      int *temp = (int *)malloc(numVals * sizeof(int));
     9
    10      for(i = 0; i < numVals; i++)
    11          fscanf(in, "%d\n", (mydata + i));
    12
    13      temp = mydata;
    14      assert(temp != NULL);
    15      free(mydata);
    16  }

The statement being flagged by Valgrind is the malloc() on line 8 of leak-fns.c. The block it allocates is lost on line 13, when temp is overwritten with mydata. This is the instant leak of Problem 1.
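One possible repair of fill() is sketched below. It assumes that the caller owns mydata and will free it, so both the unused temp allocation and the free(mydata) call are dropped. Returning the number of values read (instead of void) and stopping at the first failed fscanf() are further assumptions that are not part of the original function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads up to numVals integers from in into mydata.
   Returns the number of values actually read.
   Allocates nothing and frees nothing: the caller owns mydata. */
int fill(FILE *in, int *mydata, int numVals)
{
    int i;

    assert(in != NULL);
    assert(mydata != NULL);

    for (i = 0; i < numVals; i++) {
        if (fscanf(in, "%d", mydata + i) != 1)
            break;                  /* stop on end of file or bad input */
    }
    return i;
}
```

With this version the caller that called malloc() for mydata is also the one that calls free() on it, which keeps allocation and release in the same place.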
Memory leaks are easy bugs to create. Two cautions about Valgrind:

- It isn’t perfect: it won’t catch everything.
- It is SLOW. Your program may run 10-20 times slower under Valgrind.

But you don’t need to run your program under Valgrind every time. You can use it periodically during the debugging stage and discontinue its use when things have stabilized.