More on Data Structures in C
CS-2301 System Programming, D-term 2009
(Slides include materials from The C Programming Language, 2nd edition, by Kernighan and Ritchie and from C: How to Program, 5th and 6th editions, by Deitel and Deitel)

Linked List Review
  Linear data structure
  Easy to grow and shrink
  Easy to add and delete items
  Time to search for an item: O(n)

Linked List (continued)
  [Diagram: head points to a chain of four items, each holding a payload and a next pointer]
  struct listItem *head;

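The properties listed for linked lists can be sketched in a few lines of C. This is a minimal illustration, not code from the slides: the int payload and the helper names pushFront, find, freeList, and listDemo are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: the payload is an int; the slides leave its type open. */
struct listItem {
    int payload;
    struct listItem *next;
};

/* Add a new item at the front of the list. Returns the new head,
   or the unchanged head if malloc fails. */
struct listItem *pushFront(struct listItem *head, int value) {
    struct listItem *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->payload = value;
    n->next = head;
    return n;
}

/* Linear search: visits each item at most once, hence O(n). */
struct listItem *find(struct listItem *head, int value) {
    for (struct listItem *p = head; p != NULL; p = p->next)
        if (p->payload == value)
            return p;
    return NULL;
}

/* Release every item in the list. */
void freeList(struct listItem *head) {
    while (head != NULL) {
        struct listItem *next = head->next;
        free(head);
        head = next;
    }
}

/* Usage check: build 3 -> 2 -> 1 and search it. */
void listDemo(void) {
    struct listItem *head = NULL;
    for (int i = 1; i <= 3; i++)
        head = pushFront(head, i);
    assert(head != NULL && head->payload == 3);
    assert(find(head, 1) != NULL);
    assert(find(head, 42) == NULL);
    freeList(head);
}
```

Growing the list costs one malloc and two pointer assignments, while finding an item may walk the whole chain.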
Doubly-Linked List (review)
  [Diagram: four items, each holding prev, next, and a payload; head points to the first item and tail to the last]
  struct listItem *head, *tail;

AddAfter(item *p, item *new)
  Simple linked list:
    { new->next = p->next;
      p->next = new; }
  Doubly-linked list:
    { new->next = p->next;
      if (p->next) p->next->prev = new;
      new->prev = p;
      p->next = new; }

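The doubly-linked AddAfter can be written as a compilable function along the following lines. This is a sketch under stated assumptions: the int payload and the list descriptor holding head and tail are not in the slides (which keep head and tail as two separate pointers), and the parameter is called newItem rather than new so that the code also compiles as C++.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed types: int payload, plus a descriptor for head and tail. */
typedef struct item {
    int payload;
    struct item *prev, *next;
} item;

typedef struct {
    item *head, *tail;
} list;

/* Insert newItem immediately after p in a doubly-linked list. */
void addAfter(list *l, item *p, item *newItem) {
    assert(l != NULL && p != NULL && newItem != NULL);
    newItem->next = p->next;
    newItem->prev = p;
    if (p->next != NULL)
        p->next->prev = newItem;
    else
        l->tail = newItem;   /* p was the last item, so the tail moves */
    p->next = newItem;
}
```

The only step the slide version leaves to the caller is the tail update when p is the last item.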
deleteNext(item *p)
  Simple linked list:
    { if (p->next != NULL)
          p->next = p->next->next; }
    (Save p->next first if the removed item must be freed or reused.)
  Doubly-linked list:
    Complicated; it is easier to use deleteItem

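A variant of deleteNext for the simple linked list that hands the unlinked item back to the caller might look like this. The struct node type and the decision to return the removed item are assumptions for illustration; the slide version simply drops the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed item type with an int payload. */
struct node {
    int payload;
    struct node *next;
};

/* Unlink the item after p and return it (NULL if p is the last item).
   The caller decides whether to free or reuse the returned item. */
struct node *deleteNext(struct node *p) {
    assert(p != NULL);
    struct node *removed = p->next;
    if (removed != NULL) {
        p->next = removed->next;
        removed->next = NULL;
    }
    return removed;
}
```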
deleteItem(item *p)
  Simple linked list:
    Not possible without having a pointer to the previous item!
  Doubly-linked list:
    { if (p->next != NULL) p->next->prev = p->prev;
      if (p->prev != NULL) p->prev->next = p->next; }
  [Diagram: three items linked by prev and next pointers]

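When the list also keeps head and tail pointers, deleting the first or last item has to update them. The sketch below adds that bookkeeping; the list descriptor and the int payload are assumptions, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

typedef struct item {
    int payload;                 /* assumed payload type */
    struct item *prev, *next;
} item;

typedef struct {
    item *head, *tail;           /* assumed list descriptor */
} list;

/* Unlink p from the list, updating head and tail when p is at either end.
   p itself is not freed; the caller still owns it. */
void deleteItem(list *l, item *p) {
    assert(l != NULL && p != NULL);
    if (p->next != NULL)
        p->next->prev = p->prev;
    else
        l->tail = p->prev;       /* p was the last item */
    if (p->prev != NULL)
        p->prev->next = p->next;
    else
        l->head = p->next;       /* p was the first item */
    p->prev = p->next = NULL;
}
```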
Special Cases of Linked Lists
  Queue:
    Items always added to tail
    Items always removed from head
    Singly-linked list works okay
    Need pointers to head and tail
  Stack:
    Items always added to head
    Items always removed from head
    Singly-linked list works okay
    Only need pointer to head

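Both special cases fit on a singly-linked list, as the following sketch suggests. The names push, pop, enqueue, dequeue, struct node, and struct queue are chosen for this example, and the payload is assumed to be an int.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int payload;
    struct node *next;
};

/* Stack: items are added to and removed from the head only. */
int push(struct node **head, int value) {
    assert(head != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;                    /* out of memory */
    n->payload = value;
    n->next = *head;
    *head = n;
    return 1;
}

int pop(struct node **head, int *out) {
    assert(head != NULL && out != NULL);
    struct node *n = *head;
    if (n == NULL)
        return 0;                    /* stack is empty */
    *out = n->payload;
    *head = n->next;
    free(n);
    return 1;
}

/* Queue: items are added at the tail and removed from the head. */
struct queue {
    struct node *head, *tail;
};

int enqueue(struct queue *q, int value) {
    assert(q != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->payload = value;
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;                 /* queue was empty */
    q->tail = n;
    return 1;
}

int dequeue(struct queue *q, int *out) {
    assert(q != NULL && out != NULL);
    struct node *n = q->head;
    if (n == NULL)
        return 0;                    /* queue is empty */
    *out = n->payload;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;              /* removed the last item */
    free(n);
    return 1;
}
```

Every operation here is constant time, which is why the queue needs the extra tail pointer: without it, each enqueue would have to walk the list.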
Bubble Sort a Linked List
  item *BubbleSort(item *p) {       /* p: head of (sub)list being sorted */
      if (p->next != NULL) {
          item *q = p->next,        /* q: pointer to step through (sub)list */
               *qq = p;             /* qq: pointer to item previous to q in (sub)list */
          for (; q != NULL; qq = q, q = q->next)
              if (p->payload > q->payload) {
                  /* swap p and q */
                  item *temp = p->next;
                  p->next = q->next;
                  q->next = temp;
                  qq->next = p;
                  p = q;
              }
          p->next = BubbleSort(p->next);
      }
      return p;
  }

Potential Exam Questions
  Analyze BubbleSort to determine if it is correct, and fix it if incorrect.
    Hint: you need to define “correct”
    Hint 2: you need to define a loop invariant to convince yourself
  Draw a diagram showing the nodes, pointers, and actions of the algorithm

Observations
  What is the order (Big-O notation) of the Bubble Sort algorithm?
    Answer: O(n^2)
  Note that Quicksort is faster: O(n log n) on average
    Pages 87 & 110 in Kernighan and Ritchie
  Potential exam question: why?

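To see where the O(n^2) comes from, and to have something to compare a candidate BubbleSort fix against, a simpler reference can help. The sketch below is not the slides' algorithm: it exchanges payloads and leaves the links untouched. The names bubbleSortPayloads, isSorted, and sortDemo, and the int payload, are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct item {
    int payload;                 /* assumed payload type */
    struct item *next;
} item;

/* Bubble sort that swaps payloads rather than relinking items.
   Each pass moves the largest remaining payload to the end of the
   unsorted part, so n items need at most n-1 passes of at most
   n-1 comparisons each: O(n^2). */
item *bubbleSortPayloads(item *head) {
    item *sortedFrom = NULL;     /* first item of the already-sorted tail */
    int swapped = 1;
    while (swapped && head != NULL && head->next != sortedFrom) {
        swapped = 0;
        item *p = head;
        while (p->next != sortedFrom) {
            if (p->payload > p->next->payload) {
                int tmp = p->payload;
                p->payload = p->next->payload;
                p->next->payload = tmp;
                swapped = 1;
            }
            p = p->next;
        }
        sortedFrom = p;          /* p now holds the largest payload of this pass */
    }
    return head;
}

/* Returns 1 if the payloads are in non-decreasing order. */
int isSorted(const item *head) {
    for (; head != NULL && head->next != NULL; head = head->next)
        if (head->payload > head->next->payload)
            return 0;
    return 1;
}

/* Usage check on the list 3 -> 1 -> 2. */
void sortDemo(void) {
    item c = {2, NULL}, b = {1, &c}, a = {3, &b};
    assert(!isSorted(&a));
    bubbleSortPayloads(&a);
    assert(isSorted(&a));
}
```

A helper such as isSorted is one possible way to make "correct" concrete, although a full definition would also require that the output contain exactly the items of the input.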
Questions?

Binary Tree (review)
  A linked list but with two links per item
  struct treeItem {
      type payload;
      struct treeItem *left;
      struct treeItem *right;
  };
  [Diagram: seven items arranged as a tree, each holding left, right, and a payload]

Binary Trees (continued)
  Two-dimensional data structure
  Easy to grow and shrink
  Easy to add and delete items at leaves
  More work needed to insert or delete branch nodes
  Search time is O(log n)
    If tree is reasonably balanced
    Degenerates to O(n) in worst case if unbalanced

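The balanced versus unbalanced search times can be made concrete with a small ordered tree. The sketch assumes a binary search tree (smaller payloads to the left), which the O(log n) claim relies on but the slide does not spell out. The int payload and the names insert, search, height, freeTree, and treeDemo are choices made for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct treeItem {
    int payload;                 /* assumed payload type */
    struct treeItem *left, *right;
};

/* Insert value as a new leaf, keeping smaller payloads to the left.
   Returns the (possibly new) root; duplicates are ignored. */
struct treeItem *insert(struct treeItem *root, int value) {
    if (root == NULL) {
        struct treeItem *n = malloc(sizeof *n);
        if (n != NULL) {
            n->payload = value;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (value < root->payload)
        root->left = insert(root->left, value);
    else if (value > root->payload)
        root->right = insert(root->right, value);
    return root;
}

/* Each step discards one sub-tree, so the cost is the tree height. */
struct treeItem *search(struct treeItem *root, int value) {
    while (root != NULL && root->payload != value)
        root = (value < root->payload) ? root->left : root->right;
    return root;
}

/* Number of levels in the tree (0 for an empty tree). */
int height(const struct treeItem *root) {
    if (root == NULL)
        return 0;
    int l = height(root->left), r = height(root->right);
    return 1 + (l > r ? l : r);
}

void freeTree(struct treeItem *root) {
    if (root != NULL) {
        freeTree(root->left);
        freeTree(root->right);
        free(root);
    }
}

/* The same seven values give 3 levels or 7, depending on insertion order. */
void treeDemo(void) {
    int balancedOrder[] = {4, 2, 6, 1, 3, 5, 7};
    struct treeItem *balanced = NULL, *chain = NULL;
    for (int i = 0; i < 7; i++) {
        balanced = insert(balanced, balancedOrder[i]);
        chain = insert(chain, i + 1);      /* sorted input: 1, 2, ..., 7 */
    }
    assert(height(balanced) == 3);         /* roughly log2(n) levels */
    assert(height(chain) == 7);            /* degenerates to a linked list */
    freeTree(balanced);
    freeTree(chain);
}
```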
Order of Traversing Binary Trees
  In-order
    Traverse left sub-tree (in-order)
    Visit node itself
    Traverse right sub-tree (in-order)
  Pre-order
    Visit node first
    Traverse left sub-tree
    Traverse right sub-tree
  Post-order
    Traverse left sub-tree
    Traverse right sub-tree
    Visit node last
  (Programming Assignment #6)

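The three orders differ only in where the visit sits relative to the two recursive calls. In this sketch, "visiting" a node means appending its one-character payload to a caller-supplied buffer; that convention, the char payload, and the function names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct treeItem {
    char payload;                /* assumed: one character per node */
    struct treeItem *left, *right;
};

/* Each function appends payloads to buf and returns the next free
   position. The caller must supply a buffer with enough room. */
char *inOrder(const struct treeItem *t, char *buf) {
    if (t == NULL)
        return buf;
    buf = inOrder(t->left, buf);
    *buf++ = t->payload;         /* visit between the sub-trees */
    return inOrder(t->right, buf);
}

char *preOrder(const struct treeItem *t, char *buf) {
    if (t == NULL)
        return buf;
    *buf++ = t->payload;         /* visit first */
    buf = preOrder(t->left, buf);
    return preOrder(t->right, buf);
}

char *postOrder(const struct treeItem *t, char *buf) {
    if (t == NULL)
        return buf;
    buf = postOrder(t->left, buf);
    buf = postOrder(t->right, buf);
    *buf++ = t->payload;         /* visit last */
    return buf;
}

/* Tree with root B, left child A, right child C. */
void traversalDemo(void) {
    struct treeItem a = {'A', NULL, NULL}, c = {'C', NULL, NULL};
    struct treeItem b = {'B', &a, &c};
    char buf[4];
    *inOrder(&b, buf) = '\0';
    assert(strcmp(buf, "ABC") == 0);
    *preOrder(&b, buf) = '\0';
    assert(strcmp(buf, "BAC") == 0);
    *postOrder(&b, buf) = '\0';
    assert(strcmp(buf, "ACB") == 0);
}
```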
Example of Binary Tree
  x = (a.real*b.imag - b.real*a.imag) / sqrt(a.real*b.real - a.imag*b.imag)

  [Diagram: expression tree, children indented under their parent]
  =
    x
    /
      -
        *
          .  (a, real)
          .  (b, imag)
        *
          .  (b, real)
          .  (a, imag)
      sqrt
        -
          …

Question
  What kind of traversal order is required for this expression?
    In-order?
    Pre-order?
    Post-order?

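If the goal is to evaluate the expression, one natural answer is post-order: both operands must be computed before the operator can be applied. The sketch below illustrates this on a simplified tree. The struct exprNode layout (an op field that is 0 for leaves) and the restriction to the four arithmetic operators are assumptions for this example, not the representation used in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: op is '+', '-', '*' or '/' for an operator,
   or 0 for a leaf that holds a number in value. */
struct exprNode {
    char op;
    double value;
    struct exprNode *left, *right;
};

/* Post-order evaluation: left operand, right operand, then the operator. */
double eval(const struct exprNode *e) {
    assert(e != NULL);
    if (e->op == 0)
        return e->value;
    double l = eval(e->left);
    double r = eval(e->right);
    switch (e->op) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    default:  return l / r;      /* '/' */
    }
}
```

For example, a tree for (2*3 - 4*1) / 2 evaluates the two products first, then the difference, and the division last.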
Binary Trees in Compilers
  Used to represent the structure of the compiled program
  Optimizations
    Common sub-expression detection
    Code simplification
    Loop unrolling
    Parallelization
    Reductions in strength, e.g., substituting additions for multiplications
    Many others

Questions about Trees?
  or about Programming Assignment 6?

New Challenge
  What if we require a data structure that has to be accessed by value in constant time?
    I.e., O(log n) is not good enough!
  Need to be able to add or delete items
  Total number of items unknown
    But an approximate maximum might be known

Examples
  Anti-virus scanner
  Symbol table of compiler
  Virtual memory tables in operating system
  Bank account for an individual

Observation
  Arrays provide constant time access …
    … but you have to know which element you want!
  We only know the contents of the item we want!
  Also
    Not easy to grow or shrink
    Not open-ended
  Can we do better?

Answer: Hash Table
  Definition: Hash Table
    A data structure comprising
      an array (for constant time access)
      a set of linked lists (one list for each array element)
      a hashing function to convert search key to array index

Definition
  Search key: a value stored as (part of) the payload of the item you are looking for
    Need to find the item containing that value (i.e., key)

Answer: Hash Table (continued)
  Definition: Hashing function (or simply hash function)
    A function that takes the search key in question and “randomizes” it to produce an index
    So that non-randomness in the keys does not concentrate too many elements around a few indices in the array
    See §6.6 in Kernighan & Ritchie

Hash Table Structure
  [Diagram: an array of items; each array element heads a linked list of nodes, each holding data and a next pointer]

Guidelines for Hash Tables
  The list hanging from each array element should be short
    I.e., with short search time (approximately constant)
  Size of array should be based on expected # of entries
    Err on large side if possible
  Hashing function
    Should “spread out” the values relatively uniformly
    Multiplication and division by prime numbers usually work well

Example Hashing Function
  P. 144 of K & R
  #define HASHSIZE 101
  unsigned int hash(char *s) {
      unsigned int hashval;
      for (hashval = 0; *s != '\0'; s++)
          hashval = *s + 31 * hashval;
      return hashval % HASHSIZE;
  }
  Note choice of prime numbers to “mix it up”

Using a Hash Table
  struct item *lookup(char *s) {
      struct item *np;
      /* hash table is indexed by hash value of s */
      for (np = hashtab[hash(s)]; np != NULL; np = np->next)
          /* traverse the linked list to find item s */
          if (strcmp(s, np->data) == 0)
              return np;   /* found */
      return NULL;         /* not found */
  }

Using a Hash Table (continued)
  struct item *addItem(char *s, …) {
      struct item *np;
      unsigned int hv;
      if ((np = lookup(s)) == NULL) {
          np = malloc(sizeof(struct item));
          if (np == NULL)
              return NULL;   /* out of memory */
          /* fill in s and data */
          /* insert new item at head of the list indexed by hash value */
          np->next = hashtab[hv = hash(s)];
          hashtab[hv] = np;
      }
      return np;
  }

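Put together, the hash function, lookup, and addItem form a small working table. The version below is a sketch under stated assumptions: each item stores only a copy of its key in data, addItem takes just the key (the slide's additional parameters are elided and are left out here), and the parameters are declared const char *, which the slides do not do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASHSIZE 101

struct item {
    char *data;                  /* assumed: a private copy of the search key */
    struct item *next;
};

static struct item *hashtab[HASHSIZE];   /* static storage: all NULL at start */

unsigned int hash(const char *s) {
    assert(s != NULL);
    unsigned int hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* Walk the one list selected by the hash value of s. */
struct item *lookup(const char *s) {
    for (struct item *np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->data) == 0)
            return np;           /* found */
    return NULL;                 /* not found */
}

/* Return the item for s, creating it at the head of its list if needed.
   Returns NULL only if memory runs out. */
struct item *addItem(const char *s) {
    struct item *np = lookup(s);
    if (np == NULL) {
        np = malloc(sizeof *np);
        if (np == NULL)
            return NULL;
        np->data = malloc(strlen(s) + 1);
        if (np->data == NULL) {
            free(np);
            return NULL;
        }
        strcpy(np->data, s);
        unsigned int hv = hash(s);
        np->next = hashtab[hv];
        hashtab[hv] = np;
    }
    return np;
}
```

Copying the key means the table does not depend on the caller keeping the original string alive. With about 101 lists, lookups stay close to constant time as long as the number of items is not much larger than the array.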
Hash Table Summary
  Widely used for constant time access
  Easy to build and maintain
  There is both an art and a science to the choice of hashing functions
    Consult textbooks, web, etc.

Questions?