CS 240 – Lecture 16 Structs, Struct Arrays, Struct Pointers, sizeof, malloc, free, void pointer
Types – struct types The C language predates widespread object orientation. Hence, C doesn't have built-in Object functionality. Though it's something that can and has been added to the language; see C++ You should already be familiar from previous classes that it's useful to group multiple related data values together as a first-class citizen. A first-class citizens are any part of a language which can be passed in as an argument to functions, returned from a function, assigned to a variable, etc. So far, we only know of integers, chars, addresses, and enums. C facilitates this for multiple values with what are called structs.
Types – structs type definition Using structs, you can define a new datatype within the C programming language for use with your program. struct name { type1 member1; … }; A struct has three pieces to it: the struct keyword, the name of the new struct type, and a block of variable declarations called members. The block must have at least one member but can have any number of them.
Types – struct types Below is an example of a struct type definition; struct point { int x; int y; }; To make use of the type, we need to declare a variable of that type. struct point origin; The type is not just point but struct point. Address Value Variable 0x82ff6f80 *junk* origin.x 0x82ff6f81 0x82ff6f82 0x82ff6f83 0x82ff6f84 origin.y 0x82ff6f85 0x82ff6f86 0x82ff6f87
Types – struct members As you can saw, the rules for variables apply normally. Values in automatic variables that are not explicitly set are junk. Also, make a note that the order of members in memory match the order of declaration. struct point origin; To access a member variable of a struct, use the dot (.) operator. origin.x = origin.y = 0; Address Value Variable 0x82ff6f80 0x00 origin.x 0x82ff6f81 0x82ff6f82 0x82ff6f83 0x82ff6f84 origin.y 0x82ff6f85 0x82ff6f86 0x82ff6f87
Types – struct initialize and assign It's also possible to initialize a struct during declaration of a struct variable. struct point origin = {0, 0}; The block in this case is called a compound literal, and will be implicitly interpreted as a struct point value. The block must have the right amount of values for the struct for this assignment to be meaningful, but you can set fewer or more (dangerous!) After declaration, assignment with compound literals must be done with explicit casting. origin = (struct point) {0, 0};
structs – Arrays of struct types If you recall, arrays are defined as follows: type name[number]; Now that we have struct types, you can declare arrays of struct types in the same way you would with ints or chars. struct point arrayname[number]; You can also declare the contents of a struct array literally like so: struct point points[] = { {0, 0}, {1, 0}, {0, 1}, {1, 1} }; If the number of elements of the array isn't specified, it will allocate enough space for the literal elements.
structs – Pointers to structs In memory, a struct is stored together with it's members. The address of the struct variable itself is the same address as its first member. Again, structs work like other types for pointers: struct point *pointptr; As do the address-of and value-at operators. *pointptr = (struct point) {0, 0}; pointptr = &origin; Address Value Variable 0x82ff6f80 0x00 origin.x, origin 0x82ff6f81 0x82ff6f82 0x82ff6f83 0x82ff6f84 origin.y 0x82ff6f85 0x82ff6f86 0x82ff6f87
structs – Pointers to structs When you have a pointer to a struct, it can be rather tedious to access members of the struct through it's address. (*pointptr).x += dx; In general, the fewer parenthesis that an expression has, the better. There is a syntactic sugar for this operation of accessing the members of a struct at an address. pointptr->x += dx; The arrow operator (->) simplifies indirectly referencing members. Fewer characters, cleaner, less holding the shift key.
structs – Nested structs Let's say that we wanted to use the point struct we defined earlier to describe a rectangle. In general, a rectangle can be uniquely determined by the following properties: Top-left vertex point. Height Width Rotation about top-left point (let's pretend all rectangles sit flat to ignore this) struct rectangle { struct point topleft; int height; int width; }; As you can see, structs can be members of other struct types without any issue. Can a struct be a member of it's own definition?
structs – Self-referential structs Let's consider mapping out a family tree with structs. Every person has a biological mother and father (generally) struct person { struct person mom; struct person dad; } What's wrong with this? Address Value Variable 0x82ff6f80 *junk* ??? 0x82ff6f81 0x82ff6f82 0x82ff6f83 0x82ff6f84 0x82ff6f85 0x82ff6f86 0x82ff6f87
structs – Self-referential structs A struct can't have an explicit reference to itself because of what's called infinite descent. You can't define a person until you know what a person is. struct person { struct person *mom; struct person *dad; } Instead, a struct can make indirect references to another person through pointers. Address Value Variable 0x82ff6f80 0x8c mom 0x82ff6f81 0x6f 0x82ff6f82 0xff 0x82ff6f83 0x82 0x82ff6f84 0x88 dad 0x82ff6f85 0x82ff6f86 0x82ff6f87
Language Built-in – sizeof "function" The sizeof function is actually an operator that works on types and expressions. sizeof(type) sizeof expression In the first case, sizeof(type) will tell you how many bytes a variable of that type takes up in memory. In the case of array types (int[10]) it will tell you the full allocation size for an array of that size (40); In the second case, sizeof expression will give you the size of the type of the expressions value but will not evaluate the expression. Since C is statically-typed, an expression will only ever result in the same type.
Memory – void pointer Of all the things we learned about pointers, an interesting thing about them is that we can declare pointers to be void. void *nothingptr; A pointer is a variable that houses an address which is just a location in memory, regardless of the type of value that resides there. Why is this useful? In many cases, a function may need to return a generic location in memory, where the surrounding memory layout may be a mosaic of varying types. int char char int char int int char
Memory – Dynamic Allocation with malloc So far, the only way we know of to allocate memory is through declaration statements. type name; type array[100]; struct type; However, the big flaw with declaration statements is that they only exist within their scope. Once you leave, that memory is deallocated. One alternative to this is static and global variables, but the allocation size of those variables must be decided before runtime. To dynamically allocate persistent blocks of memory for us to use, we need something else.
Memory – Dynamic Allocation with malloc In C, part of the built-in library includes a function called malloc which will reserve memory for the caller elsewhere. void *malloc(size_t size); // from stdlib.h The size argument is an unsigned integer specifying how many bytes malloc should allocate. The function returns a void pointer to the first byte in a block of exactly that many bytes. If malloc fails to reserve such a block from its allocation buffer, it returns NULL.
Memory – Memory Model of malloc malloc makes use of an allocation buffer somewhere in memory. This is implementation specific how it actually manages the memory of the allocation buffer, but it has to, at some level have storage. allocation buffer Every call to malloc reserves some chunk of memory in this buffer. This memory is persistently reserved and will stay allocated after the scope of the malloc call is left. What do we do when we're done with that memory? In use Free
Memory – Deallocating malloc Memory To release memory that has been allocated through malloc, we need to use the free function. void free(void* ptr); free takes an arbitrary address as input and unreserves memory allocated by malloc. After this is done, access to that memory is NOT sanctioned and CAN cause serious problems. Attempting to free memory that was not allocated by malloc will result in undefined behavior. Failing to call free for when you're done malloc allocated memory can cause a memory leak, when memory overtime is wastefully occupied.
Memory – Fragmentation If you stagger calls to malloc and free in random ways, there's an interesting and difficult problem that arises. alloc buffer These allocation buffer will eventually have numerous gaps. Although there is probably enough net total memory available for another allocation, there is no contiguous block of memory with enough for a larger allocation. This is called fragmentation. Why can't we defrag the allocation buffer like we could with a hard drive? In use Free
Memory – Allocating Multiple Typed Units Since you're able to ask for a certain number of bytes with malloc, it's often useful to ask for "enough room for X values of Y type." Y *name = (Y *) malloc(X * sizeof(Y)); This is architecture independent! Whereas we often hope that we're on a modern system with 4-byte integers, this works even for architectures where integers are 2-byte, 8- byte, or other! int *name = (int *) malloc(100 * sizeof(int)); The above gives us 400 bytes on our Unix system or enough space for 100 integers on any archtecture.