Maps and Dictionaries Data Structures and Algorithms CS 244 Brent M. Dingle, Ph.D. Department of Mathematics, Statistics, and Computer Science University of Wisconsin – Stout Based on the book: Data Structures and Algorithms in C++ (Goodrich, Tamassia, Mount) Some content from Data Structures Using C++ (D.S. Malik)
Previously Priority Queues nodes have keys and data Min PQs and Max PQs PQ Sorting Implemented using an unsorted list (selection sort) Implemented using a sorted list (insertion sort) Implemented using a Heap (heap sort) Heaps 2 Major properties: Structure and Order Complete Binary Tree Min Heaps and Max Heaps Huffman Encoding Uses Priority Queues and Trees Trees Binary Search Trees (BSTs) AVL Trees (height-balanced trees) Decision Trees
Ponder In October of 1976 I observed that a certain algorithm – parallel reduction – was associated with monoids: collections of elements with an associative operation. That observation led me to believe that it is possible to associate every useful algorithm with a mathematical theory and that such association allows for both widest possible use and meaningful taxonomy. As mathematicians learned to lift theorems into their most general settings, so I wanted to lift algorithms and data structures. – Alex Stepanov, inventor of the STL. [Ste07] Sounds complicated Probably difficult But it is not It’s just an abstract thought Sounds complicated Probably difficult But it is not It’s just an abstract thought
Context Take things you have seen done Relate them to abstract data types We are Generalizing Generalize the algorithm to work with any data type Save coding time Implement the algorithm so it works with (most) any data type Save execution time Apply the right data type for the right problem
Context Take things you have seen done Relate them to abstract data types We are Generalizing Generalize the algorithm to work with any data type Save coding time Implement the algorithm so it works with (most) any data type Save execution time Apply the right data type for the right problem General Algorithm: Hammer Nail (hammering device, n = depth) Pick up [hammering device] Position Nail While nail depth < n Hit nail with [hammering device] General Algorithm: Hammer Nail (hammering device, n = depth) Pick up [hammering device] Position Nail While nail depth < n Hit nail with [hammering device] Algorithm: Hammer Nail (wrench, n) Pick up wrench Position Nail While nail depth < n Hit nail with wrench Algorithm: Hammer Nail (wrench, n) Pick up wrench Position Nail While nail depth < n Hit nail with wrench O(1) O(n) O(n 2 ) Takes n swings of wrench to drive nail 1 unit deep, So to achieve depth n requires n*n time Takes n swings of wrench to drive nail 1 unit deep, So to achieve depth n requires n*n time Algorithm: Hammer Nail (hammer, n) Pick up hammer Position Nail While nail depth < n Hit nail with hammer Algorithm: Hammer Nail (hammer, n) Pick up hammer Position Nail While nail depth < n Hit nail with hammer O(1) O(n) Takes 1 swing of hammer to drive nail 1 unit deep, So to achieve depth n requires 1*n time Takes 1 swing of hammer to drive nail 1 unit deep, So to achieve depth n requires 1*n time O(n) O(n 2 ) O(n)
More Context To generalize algorithms we generalize data types Abstract Data Types Example We have 2 data types: linked list and arrays We want to make an algorithm to work with either Can we? Design an algorithm to work with a Sequence ADT Turns out linked lists and arrays both fit the criteria of sequence
Searching Continues into the Future Searching is a common problem/task to solve/perform The efficiency of the algorithm depends on the data structure As seen with priority queue SORTING (unordered, ordered, heap) We will look at Maps and Dictionaries and Hash Tables each is a “list of keys” with data each has a relation to a SEARCH method We will then consider 3 types of searching Linear Search (seen this before) Binary Search (seen this before) Hashing (new)
Marker Slide General Questions? Next Maps Dictionaries and eventually Linear Searching Binary Searching Hash Tables Relation to the book Map ADT (§9.1) Dictionary ADT (§9.5) Hash Tables (§9.2) Ordered Maps (§9.3)
Map ADT The map ADT models a searchable collection of key-element items The main operations of a map are searching, inserting, and deleting items Multiple items with the same key are not allowed Applications: – address book – mapping host names (e.g., cs16.net) to internet addresses (e.g., ) Map ADT methods: – find(k): if M has an entry with key k, return an iterator p referring to this element, else, return special end iterator. – put(k, v): if M has no entry with key k, then add entry (k, v) to M, otherwise replace the value of the entry with v; return iterator to the inserted/modified entry – erase(k) or erase(p): remove from M entry with key k or iterator p; An error occurs if there is no such element. – size(), isEmpty()
Map Example: Direct Address Table A direct address table is a map in which The keys are in the range {0, 1, 2, …, N-1} Stored in an array of size N: T[0,N-1] Item with key k stored in T[k] Performance: insertItem, find, and removeElement all take O(1) time requires space O(N), independent of n = the number of items stored in the map The direct address table is not space efficient unless the range of the keys is close to the number of elements to be stored in the map: i.e. unless n is close to N. Think ARRAY data type The index is now the value of the key associated with the data Really fast search times O(1) but every piece of data must have an index permanently associated with it so lots of wasted space… if lots of data items/keys not feasible to use Think ARRAY data type The index is now the value of the key associated with the data Really fast search times O(1) but every piece of data must have an index permanently associated with it so lots of wasted space… if lots of data items/keys not feasible to use
Marker Slide Questions on: Maps Next Dictionaries
Dictionary ADT The dictionary ADT models a searchable collection of key- element items The main difference from a map is that dictionaries allow multiple items with the same key Any data structure that supports a dictionary also supports a map Applications: – Dictionary that has multiple definitions for the same word Dictionary ADT methods: – find(k): if the dictionary has an entry with key k, returns an iterator p to an arbitrary element – findAll(k): Return iterators (b,e) s.t. that all entries with key k are between them – insert(k, v): insert entry (k, v) into D, return iterator to it – erase(k), erase(p): remove arbitrary entry with key k or entry referenced by iterator p. Error occurs if there is no such entry – begin(), end(): return iterator to first or just beyond last entry of D – size(), isEmpty()
Dictionary/Map Example: Log File (unordered sequence implementation) A log file is a dictionary implemented by means of an unsorted sequence We store the items of the dictionary in a sequence (based on doubly-linked lists or circular arrays), in arbitrary order Performance: insert takes O(1) time since we can insert the new item at the beginning or at the end of the sequence find and erase take O(n) time since in the worst case (item is not found) we traverse the entire sequence to find the item with the given key Space - can be O(n), where n is the number of elements in the dictionary The log file is effective only for dictionaries of small size or for dictionaries on which insertions are the most common operations, while searches and removals are rarely performed Example: historical record of logins to a workstation Think doubly-linked list data type using unsorted data This is where LINEAR search comes into play. O(n) time, but no huge memory waste If keys are mapped to only one data item, this can be a ‘map’ example too Think doubly-linked list data type using unsorted data This is where LINEAR search comes into play. O(n) time, but no huge memory waste If keys are mapped to only one data item, this can be a ‘map’ example too
Ordered Map/Dictionary In an ordered Map, we wish to perform the usual map operations, but also maintain an order relation for the keys in the dictionary. Naturally supports – Look-Up Tables - store dictionary in a vector by non-decreasing order of the keys – Binary Search Ordered Dictionary ADT: – In addition to the generic dictionary ADT, the ordered dictionary ADT supports the following functions: – closestBefore(k): return the position of an item with the largest key less than or equal to k – closestAfter(k): return the position of an item with the smallest key greater than or equal to k
Dictionary/Map Example: Lookup Table (ordered/sorted sequence implementation) A lookup table is a dictionary implemented by means of a sorted sequence – We store the items of the dictionary in an array-based sequence, sorted by key – We use an external comparator for the keys Performance: – find takes O(log n) time, using binary search – insertItem takes O(n) time since in the worst case we have to shift n 2 items to make room for the new item – removeElement take O(n) time since in the worst case we have to shift n 2 items to compact the items after the removal The lookup table is effective only for dictionaries of small size or for dictionaries on which searches are the most common operations, while insertions and removals are rarely performed – Example: credit card authorizations Think dynamic array data type using sorted data This is where BINARY search comes into play. O(lg n) time, and no huge memory waste If keys are mapped to only one data item, this can be a ‘map’ example too Think dynamic array data type using sorted data This is where BINARY search comes into play. O(lg n) time, and no huge memory waste If keys are mapped to only one data item, this can be a ‘map’ example too
Example of Ordered Map: Binary Search Binary search performs operation find(k) on a dictionary implemented by means of an array-based sequence, sorted by key similar to the high-low game at each step, the number of candidate items is halved terminates after a logarithmic number of steps Example: find(7) m l h m l h m l h l m h
Summary If you can do something using ONE data type and want to do the same thing with a different data type Are the two data types similar enough to generalize? generalize the data type and thus generalize the algorithm? or generalize the algorithm for a generalized data type? Algorithm: Drive Car Algorithm: Drive Pickup Truck How different are the two algorithms? How different are Car and Truck? Algorithm: Drive Car Algorithm: Drive Pickup Truck How different are the two algorithms? How different are Car and Truck?
The End of This Part End