David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

1 David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures

2 Administrivia
- Midterm is postponed until Thursday, Oct 26
- Reminder: homework 3 due today
  - In the CS front office
  - Due at 5 PM (but don’t risk being there at 4:59!)
  - Check your e-mail for some clarifications & hints

3 Review: Hash Tables
- More formally:
  - Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
    - Insert(T, x)
    - Delete(T, x)
    - Search(T, x)
  - Don’t care about sorting the records
- Hash tables support all the above in O(1) expected time

4 Review: Direct Addressing
- Suppose:
  - The range of keys is 0..m-1
  - Keys are distinct
- The idea:
  - Use key itself as the address into the table
  - Set up an array T[0..m-1] in which
    - T[i] = x if x ∈ T and key[x] = i
    - T[i] = NULL otherwise
  - This is called a direct-address table
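A minimal Python sketch of a direct-address table, assuming distinct integer keys in 0..m-1. The class and method names are illustrative, not from the slides, and `None` stands in for NULL.

```python
class DirectAddressTable:
    """Direct-address table for distinct integer keys in 0..m-1."""

    def __init__(self, m):
        # T[i] = None plays the role of NULL on the slide.
        self.slots = [None] * m

    def insert(self, key, value):
        # The key itself is the address into the table.
        self.slots[key] = (key, value)

    def delete(self, key):
        self.slots[key] = None

    def search(self, key):
        return self.slots[key]


table = DirectAddressTable(8)
table.insert(3, "three")
```

Every operation is a single array access, which is why this only works when the key range m is small enough to allocate.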

5 Review: Hash Functions
- Next problem: collision
[Figure: the actual keys K = {k1, …, k5}, a subset of the universe U of keys, are mapped by h into table T with slots 0..m-1; h(k2) = h(k5) is a collision]

6 Review: Resolving Collisions
- How can we solve the problem of collisions?
- Open addressing
  - To insert: if slot is full, try another slot, and another, until an open slot is found (probing)
  - To search, follow same sequence of probes as would be used when inserting the element
- Chaining
  - Keep linked list of elements in slots
  - Upon collision, just add new element to list
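The open-addressing bullets can be sketched as follows. This is an illustration only: it assumes non-negative integer keys, uses linear probing (the slide does not name a probe sequence), takes k mod m as the starting slot, and omits deletion. All names are hypothetical.

```python
class LinearProbingTable:
    """Open addressing with linear probing; integer keys, no deletion."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m

    def _probe(self, key):
        # Same probe sequence for insert and search: start, start+1, ...
        start = key % self.m
        for j in range(self.m):
            yield (start + j) % self.m

    def insert(self, key):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i] == key:
                self.slots[i] = key
                return i
        raise RuntimeError("table is full")

    def search(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                return None  # hit an open slot: key was never inserted
            if self.slots[i] == key:
                return i
        return None


t = LinearProbingTable(5)
first = t.insert(1)   # lands in its home slot
second = t.insert(6)  # 6 mod 5 == 1 collides, so it moves to the next slot
```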

7 Review: Chaining
- Chaining puts elements that hash to the same slot in a linked list:
[Figure: the actual keys K = {k1, …, k8}, a subset of the universe U of keys, are hashed into table T; each slot of T holds a linked list of the keys that hash to it]
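A small chaining sketch, assuming integer keys and k mod m as the hash function. A Python list stands in for each linked list here, which is a simplification of what the slide describes; the names are illustrative.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of (key, value)."""

    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _chain(self, key):
        return self.slots[key % self.m]

    def insert(self, key, value):
        chain = self._chain(key)
        for idx, (k, _) in enumerate(chain):
            if k == key:
                chain[idx] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))  # collision or empty slot: extend chain

    def search(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        chain[:] = [(k, v) for k, v in chain if k != key]


t = ChainedHashTable(4)
t.insert(1, "a")
t.insert(5, "b")  # 5 mod 4 == 1, so both keys share one chain
```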

8 Review: Analysis Of Hash Tables
- Simple uniform hashing: each key in table is equally likely to be hashed to any slot
- Load factor α = n/m = average # keys per slot
  - Average cost of unsuccessful search = O(1 + α)
  - Successful search: O(1 + α/2) = O(1 + α)
  - If n is proportional to m, α = O(1)
- So the cost of searching = O(1) if we size our table appropriately

9 Review: Choosing A Hash Function
- Choosing the hash function well is crucial
  - Bad hash function puts all elements in same slot
  - A good hash function:
    - Should distribute keys uniformly into slots
    - Should not depend on patterns in the data
- We discussed three methods:
  - Division method
  - Multiplication method
  - Universal hashing

10 Review: The Division Method
- h(k) = k mod m
  - In words: hash k into a table with m slots using the slot given by the remainder of k divided by m
- Elements with adjacent keys hashed to different slots: good
- If keys bear relation to m: bad
- Upshot: pick table size m = prime number not too close to a power of 2 (or 10)
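A short illustration of why keys that "bear relation to m" are a problem. The table sizes 16 and 701 and the sample keys are chosen here for the example; they are not from the slide.

```python
def division_hash(k, m):
    """Division method: the slot is the remainder of k divided by m."""
    return k % m


keys = [16, 32, 48]

# m = 16 is a power of 2, so only the low 4 bits of each key matter:
# every multiple of 16 lands in the same slot.
bad = [division_hash(k, 16) for k in keys]

# m = 701 is prime and not close to a power of 2: the keys spread out.
good = [division_hash(k, 701) for k in keys]
```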

11 Review: The Multiplication Method
- For a constant A, 0 < A < 1:
- h(k) = ⌊m (kA − ⌊kA⌋)⌋, where kA − ⌊kA⌋ is the fractional part of kA
- Upshot:
  - Choose m = 2^p
  - Choose A not too close to 0 or 1
  - Knuth: Good choice for A = (√5 − 1)/2
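A floating-point sketch of the multiplication method with Knuth's suggested constant. Real implementations usually use fixed-point integer arithmetic instead; the function name and the sample key 123456 are assumptions made for this example.

```python
import math

# Knuth's suggested constant, (sqrt(5) - 1) / 2, roughly 0.618.
A = (math.sqrt(5) - 1) / 2


def multiplication_hash(k, m):
    """h(k) = floor(m * (kA - floor(kA)))."""
    frac = (k * A) % 1.0  # fractional part of kA
    return math.floor(m * frac)


m = 2 ** 14  # table size chosen as a power of 2
slot = multiplication_hash(123456, m)
```

Unlike the division method, the value of m is not critical here, which is why a power of 2 is a convenient choice.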

12 Review: Universal Hashing
- When attempting to foil a malicious adversary, randomize the algorithm
- Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!)
  - Guarantees good performance on average, no matter what keys adversary chooses
  - Need a family of hash functions to choose from

13 Review: Universal Hashing
- Let ℋ be a (finite) collection of hash functions
  - …that map a given universe U of keys…
  - …into the range {0, 1, …, m - 1}.
- ℋ is universal if:
  - for each pair of distinct keys x, y ∈ U, the number of hash functions h ∈ ℋ for which h(x) = h(y) is |ℋ|/m
  - In other words:
    - With a random hash function from ℋ, the chance of a collision between x and y (x ≠ y) is exactly 1/m

14 Review: A Universal Hash Function
- Choose table size m to be prime
- Decompose key x into r+1 bytes, so that x = {x_0, x_1, …, x_r}
  - Only requirement is that max value of byte < m
  - Let a = {a_0, a_1, …, a_r} denote a sequence of r+1 elements chosen randomly from {0, 1, …, m - 1}
  - Define corresponding hash function h_a ∈ ℋ: h_a(x) = (a_0 x_0 + a_1 x_1 + … + a_r x_r) mod m
  - With this definition, ℋ has m^(r+1) members
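A sketch of drawing one function from this family, assuming the usual textbook definition h_a(x) = (a_0 x_0 + … + a_r x_r) mod m. The factory name, the choice m = 257 (a prime larger than any byte value), and the 4-byte key width are assumptions made for the example.

```python
import random


def make_universal_hash(m, r, rng=random):
    """Pick h_a at random: a is r+1 values drawn from {0, ..., m-1}."""
    a = [rng.randrange(m) for _ in range(r + 1)]

    def h(x_bytes):
        # x_bytes is the key decomposed into r+1 bytes x_0 .. x_r.
        return sum(ai * xi for ai, xi in zip(a, x_bytes)) % m

    return h


m = 257                   # prime, and every byte value (0..255) is < m
rng = random.Random(0)    # seeded only to make the example repeatable
h = make_universal_hash(m, r=3, rng=rng)

key = (123456).to_bytes(4, "little")  # decompose the key into 4 bytes
slot = h(key)
```

The function is chosen once, when the table is created, and then reused for every insert and search.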

15 Augmenting Data Structures
- This course is supposed to be about design and analysis of algorithms
- So far, we’ve only looked at one design technique (What is it?)

16 Augmenting Data Structures
- This course is supposed to be about design and analysis of algorithms
- So far, we’ve only looked at one design technique: divide and conquer
- Next up: augmenting data structures
  - Or, “One good thief is worth ten good scholars”

17 Dynamic Order Statistics
- We’ve seen algorithms for finding the ith element of an unordered set in O(n) time
- Next, a structure to support finding the ith element of a dynamic set in O(lg n) time
  - What operations do dynamic sets usually support?
  - What structure works well for these?
  - How could we use this structure for order statistics?
  - How might we augment it to support efficient extraction of order statistics?

18 Order Statistic Trees
- OS Trees augment red-black trees:
  - Associate a size field with each node in the tree
  - x->size records the size of subtree rooted at x, including x itself:
[Figure: OS tree, nodes labeled key/size: root M/8 with children C/5 and P/2; C has children A/1 and F/3; F has children D/1 and H/1; P has right child Q/1]

19 Selection On OS Trees
[Figure: OS tree, nodes labeled key/size: M/8, C/5, P/2, Q/1, A/1, F/3, D/1, H/1]
- How can we use this property to select the ith element of the set?

20 OS-Select
    OS-Select(x, i)
    {
        r = x->left->size + 1;
        if (i == r)
            return x;
        else if (i < r)
            return OS-Select(x->left, i);
        else
            return OS-Select(x->right, i-r);
    }

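25 OS-Select Example
- Example: show OS-Select(root, 5):
[Figure: OS tree, nodes labeled key/size: root M/8 with children C/5 and P/2; C has children A/1 and F/3; F has children D/1 and H/1; P has right child Q/1]
- Trace of the calls, using the OS-Select code from slide 20:
  - At M: i = 5, r = 6; i < r, so recurse on the left child C
  - At C: i = 5, r = 2; i > r, so recurse on the right child F with i = 3
  - At F: i = 3, r = 2; i > r, so recurse on the right child H with i = 1
  - At H: i = 1, r = 1; i == r, so return H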

26 OS-Select: A Subtlety
    OS-Select(x, i)
    {
        r = x->left->size + 1;
        if (i == r)
            return x;
        else if (i < r)
            return OS-Select(x->left, i);
        else
            return OS-Select(x->right, i-r);
    }
- What happens at the leaves?
- How can we deal elegantly with this?
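One possible answer to the question about leaves, sketched in Python: use a sentinel node with size 0 in place of every missing child, so `x.left.size` is always defined. The slide does not state its answer, so the sentinel is an assumption here, as are the names `NIL`, `Node` and `os_select`. The tree is the one from the slides, built by hand with no red-black balancing.

```python
class _Nil:
    """Sentinel standing in for every missing child; its size is 0."""
    size = 0
    key = None


NIL = _Nil()


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left if left is not None else NIL
        self.right = right if right is not None else NIL
        self.size = self.left.size + self.right.size + 1


def os_select(x, i):
    """Return the node holding the i-th smallest key in x's subtree."""
    if x is NIL:
        return None  # i is out of range for this subtree
    r = x.left.size + 1  # rank of x within its own subtree
    if i == r:
        return x
    if i < r:
        return os_select(x.left, i)
    return os_select(x.right, i - r)


# The tree from the slides: M/8, C/5, P/2, A/1, F/3, D/1, H/1, Q/1.
root = Node("M",
            Node("C", Node("A"), Node("F", Node("D"), Node("H"))),
            Node("P", None, Node("Q")))
fifth = os_select(root, 5)
```

Each call descends one level, so the running time is proportional to the height of the tree.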

27 OS-Select
    OS-Select(x, i)
    {
        r = x->left->size + 1;
        if (i == r)
            return x;
        else if (i < r)
            return OS-Select(x->left, i);
        else
            return OS-Select(x->right, i-r);
    }
- What will be the running time?

28 Determining The Rank Of An Element
[Figure: OS tree, nodes labeled key/size: M/8, C/5, P/2, Q/1, A/1, F/3, D/1, H/1]
- What is the rank of this element?

29 Determining The Rank Of An Element
[Figure: the same OS tree]
- Of this one? Why?

30 Determining The Rank Of An Element
[Figure: the same OS tree]
- Of the root? What’s the pattern here?

31 Determining The Rank Of An Element
[Figure: the same OS tree]
- What about the rank of this element?

32 Determining The Rank Of An Element
[Figure: the same OS tree]
- This one? What’s the pattern here?

33 OS-Rank
    OS-Rank(T, x)
    {
        r = x->left->size + 1;
        y = x;
        while (y != T->root)
        {
            if (y == y->p->right)
                r = r + y->p->left->size + 1;
            y = y->p;
        }
        return r;
    }
- What will be the running time?
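A runnable Python rendering of OS-Rank on the tree from the slides. It assumes each node stores a parent pointer `p`, and it uses a helper `size()` that treats a missing child as size 0 instead of a sentinel; both the helper and the hand-built tree are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.p = key, left, right, None
        self.size = 1
        for child in (left, right):
            if child is not None:
                child.p = self          # set the child's parent pointer
                self.size += child.size


def size(x):
    """Subtree size, treating a missing child as size 0."""
    return x.size if x is not None else 0


def os_rank(root, x):
    """Rank of x = 1 + number of keys that precede x in sorted order."""
    r = size(x.left) + 1
    y = x
    while y is not root:
        if y is y.p.right:
            # Everything in the parent's left subtree, plus the parent
            # itself, comes before y.
            r += size(y.p.left) + 1
        y = y.p
    return r


a, d, h, q = Node("A"), Node("D"), Node("H"), Node("Q")
f = Node("F", d, h)
c = Node("C", a, f)
p = Node("P", None, q)
root = Node("M", c, p)
```

The loop climbs one level per iteration, so the cost is again bounded by the height of the tree.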

36 OS-Trees: Maintaining Sizes
- So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time
- Next step: maintain sizes during Insert() and Delete() operations
  - How would we adjust the size fields during insertion on a plain binary search tree?
  - A: increment sizes of nodes traversed during search
  - Why won’t this work on red-black trees?
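The "increment sizes of nodes traversed" answer can be sketched for a plain (unbalanced) binary search tree as below. The function name is illustrative, keys are assumed distinct, and no rotations happen here, which is exactly the part a red-black tree adds.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1


def bst_insert(root, key):
    """Insert key into a plain BST, keeping every size field correct."""
    new = Node(key)
    if root is None:
        return new
    x = root
    while True:
        x.size += 1  # the new node will end up somewhere below x
        if key < x.key:
            if x.left is None:
                x.left = new
                break
            x = x.left
        else:
            if x.right is None:
                x.right = new
                break
            x = x.right
    return root


root = None
for k in "MCPAFQDH":  # this order reproduces the tree from the slides
    root = bst_insert(root, k)
```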

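37 Maintaining Size Through Rotation
- Salient point: rotation invalidates only x and y
- Can recalculate their sizes in constant time
  - Why?
[Figure: rightRotate(y) and its inverse leftRotate(x). Before rightRotate(y): y (size 19) has left child x (size 11) and a right subtree of size 7; x has subtrees of sizes 6 and 4. After: x (size 19) has a left subtree of size 6 and right child y (size 12); y has subtrees of sizes 4 and 7.]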
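A sketch of the constant-time size fix-up, using subtree sizes 6, 4 and 7 to reproduce the sizes 19, 11 and 12 shown on the slide. For brevity the rotations here return the new subtree root instead of updating parent pointers, and `stub` is a hypothetical helper that fakes a subtree of a given size.

```python
def size(x):
    return x.size if x is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = size(left) + size(right) + 1


def right_rotate(y):
    x = y.left
    y.left = x.right
    x.right = y
    x.size = y.size  # x now roots exactly the nodes y used to root
    y.size = size(y.left) + size(y.right) + 1  # recompute from children
    return x


def left_rotate(x):
    y = x.right
    x.right = y.left
    y.left = x
    y.size = x.size
    x.size = size(x.left) + size(x.right) + 1
    return y


def stub(n):
    """Stand-in for a whole subtree containing n nodes."""
    node = Node(None)
    node.size = n
    return node


x = Node("x", stub(6), stub(4))  # size 6 + 4 + 1 = 11
y = Node("y", x, stub(7))        # size 11 + 7 + 1 = 19
top = right_rotate(y)
sizes_after_right = (x.size, y.size)
back = left_rotate(top)
sizes_after_left = (x.size, y.size)
```

Only two size fields change per rotation because the three hanging subtrees keep their own contents.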

38 Augmenting Data Structures: Methodology
- Choose underlying data structure
  - E.g., red-black trees
- Determine additional information to maintain
  - E.g., subtree sizes
- Verify that information can be maintained for operations that modify the structure
  - E.g., Insert(), Delete() (don’t forget rotations!)
- Develop new operations
  - E.g., OS-Rank(), OS-Select()

39 The End
- Up next:
  - Interval trees
  - Review for midterm

