C and Data Structures Baojian Hua bjhua@ustc.edu.cn
Why Do We Need “Structure”?
So far, we have discussed two kinds of data:
    Simple: char, int, long, double
    Large: array, pointer
These are inconvenient in some applications; the next slide gives an example.

Example
// We want to represent year/month/date
int f ()
{
    int year1, year2, month, date;
    year1 = 2050;
    year2 = 2020;
    month = 12;
    date = 31;
    date++;
    // ??? The date rolls over: should we increase year1 or year2?
    return 0;
}
// The problem is that there is no logical
// connection between these variables. We need “structure”!

Structure Declaration
// A structure declaration starts with the keyword
// “struct”, followed by an optional struct
// tag and a list of one or more fields.
// For instance, to represent a two-dimensional
// point:
struct point2d
{
    int x; // the x coordinate
    int y; // the y coordinate
}; // note the semicolon
// struct point2d contains two fields, x and y, both of
// type int.
// struct point2d is a new (user-defined) type.

Variable Definition
// Given the declaration above, we may write
struct point2d pt;
// to define a variable pt of type struct point2d,
// just as we write
int i;

Structure Fields
// Given the variable definition
struct point2d pt;
// the following syntax accesses its fields x and y:
pt.x;
pt.y;
// They act as ordinary variables, for example:
pt.x = 9;
pt.y = 10;
// or:
printf ("%d\n", pt.x + pt.y);

Structure in Structure
// Since structures are types, we can describe
// their layout in memory. A structure occupies
// one contiguous piece of memory (the compiler may
// insert padding between fields).
// So we may even nest structures
// in other structures, as in:
struct rect
{
    struct point2d pt1;
    struct point2d pt2;
};
// Layout of a struct rect in memory: pt1.x, pt1.y, pt2.x, pt2.y

Structures and Functions
// A function creating a point2d structure:
struct point2d create (int x, int y)
{
    struct point2d pt;
    pt.x = x;
    pt.y = y;
    return pt;
}
// A sample call:
struct point2d mypt = create (3, 4);
// The local pt (x=3, y=4) is copied into mypt,
// so after the call mypt holds x=3, y=4.

Structures as Function Arguments
// Like returning a structure, passing a
// structure to a function copies the whole
// structure (call-by-value):
struct point2d doublee (struct point2d pt)
{
    pt.x *= 2;
    pt.y *= 2;
    return pt;
}
// A sample call, with initPt holding x=3, y=4:
struct point2d mypt = doublee (initPt);
// The parameter pt is a copy of initPt. After the call,
// mypt holds x=6, y=8 and initPt still holds x=3, y=4.

Moral
Structures passed to functions and returned from functions are copied (call-by-value).
Pros:
    a simple, functional style of programming
    results in elegant code that is easy to reason about
    the idea comes from functional languages (Lisp, ML), but is also useful in imperative languages (C, Java)
Cons:
    may be inefficient: a big value is copied on every call
    inconsistent with arrays, which are passed as a pointer to their first element
Next, we will see a more imperative style: update in place.

Pointers to Structures
// Pointers to structures are just like pointers
// to any other kind of data:
struct point2d *pt;
// declares pt to be a pointer to a struct point2d.
// In the picture, pt points to a structure holding x=3, y=4.
// To reference x and y, we use:
(*pt).x;
(*pt).y;
// or a more concise notation:
pt->x;
pt->y;

Structure Pointers as Function Arguments
// Address passing:
void doublee (struct point2d *pt)
{
    (*pt).x *= 2;
    (*pt).y *= 2;
    return;
}
// A sample call (no return value):
doublee (&initPt);
// pt points to initPt, so the update happens in place:
// initPt changes from x=3, y=4 to x=6, y=8.

Or
// Address passing, using the -> notation:
void doublee (struct point2d *pt)
{
    pt->x *= 2;
    pt->y *= 2;
    return;
}
// A sample call:
doublee (&initPt);
// With initPt holding x=3, y=4 before the call,
// it holds x=6, y=8 afterwards.

Self-referential Structures
// With pointers to structures, we may
// write self-referential structures (structures
// with a field that points to a structure of the same type):
struct node
{
    int data;
    struct node *next;
};
// Each node holds its data and a pointer to the next node,
// forming a chain: node -> node -> node

Union
A union is a variable that may hold (at different times) objects of different types and sizes.
The compiler takes care of space allocation, alignment, etc.
Unions provide a way to manipulate different kinds of data in a single area of storage.

Union
// To declare a union type, we use:
union intOrArray
{
    int i;
    int a[2];
};
// which declares intOrArray to have two fields:
// an integer i and an int array a of length 2.
// i and a[0] share the same storage; a[1] follows.

Union
// Using the union intOrArray declared above
// (fields: int i and int a[2], sharing storage):
union intOrArray u;
u.a[0] = 88;
u.a[1] = 99;
// u.i = ???

Union
// Using the union intOrArray declared above
// (fields: int i and int a[2], sharing storage):
union intOrArray u;
u.i = 77;
// u.a[0] = ???  u.a[1] = ???

Union
A union gives you a magical bag that lets you bypass C’s type system.
As we have seen, we may store an int but take out an array.
It is the programmer’s responsibility to ensure that union data is used consistently.
But what if the union value is written by others, or comes from a library that we know nothing about?

Tagged Union
// In order to distinguish the union's state, we
// annotate the union with a tag:
struct ss
{
    enum {INT, ARRAY} tag;
    union intOrArray
    {
        int i;
        int a[2];
    } u;
};
// Layout: the tag, followed by the union
// (i and a[0] share storage, then a[1]).

Tagged Union
// For a variable temp:
struct ss temp;
// we may assign temp’s fields (note the tag):
temp.u.i = 99;
temp.tag = INT;
// data reading is guarded by the tag:
if (INT == temp.tag)
    printf ("%d", temp.u.i);
else if (ARRAY == temp.tag)
    printf ("%d, %d", temp.u.a[0], temp.u.a[1]);
else
    printf ("impossible\n");

typedef---Define Our Own Types
It is tedious to always write declarations like these:
    struct point2d pt;
    struct point2d *pp;
And some types are hard to read:
    int (*f[10])(int, int);
    int (*f(char))(int, int);
Is there a better way? Yes: “typedef”.

typedef---Define Our Own Types
// C has a mechanism called “typedef” that allows us
// to define our own type names (abbreviations).
// For instance:
typedef struct point2d pt2d;
// defines “pt2d” to be a type name equivalent to
// struct point2d. From then on, “pt2d”
// can be used just like any other type:
pt2d pt;
pt.x = 3;
pt.y = 4;

typedef---Define Our Own Types
// Not only structures can be given typedef names,
// but any type, even the
// predefined ones in C, as in
typedef int size;
size i;
i = 99;
typedef size size2;
size2 j;
j = i;

typedef---Define Our Own Types
// More examples of typedef:
typedef int *ptrToInt;
typedef int (*ptrToFunc)(int);
typedef int func(int);
typedef double a[5];
typedef int b[];
// Exercises: explain the types of these variables:
ptrToInt p;
func f;
ptrToFunc *g;
a *arr[10];

Summary of Typedefs
typedef doesn’t create a new type;
    it just adds a shorthand name for an existing type.
typedef is an important mechanism:
    it makes modular programming easier
    it enables information hiding
But the physical representation is still visible to clients, so the hiding it provides is weak.