David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures.

Presentation transcript:

David Luebke 1 3/19/2016 CS 332: Algorithms Augmenting Data Structures

David Luebke 2 3/19/2016 Administrivia
● Midterm is postponed until Thursday, Oct 26
● Reminder: homework 3 due today
  ■ In the CS front office
  ■ Due at 5 PM (but don’t risk being there at 4:59!)
  ■ Check your for some clarifications & hints

David Luebke 3 3/19/2016 Review: Hash Tables
● More formally:
  ■ Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
    ◆ Insert(T, x)
    ◆ Delete(T, x)
    ◆ Search(T, x)
  ■ Don’t care about sorting the records
● Hash tables support all the above in O(1) expected time

David Luebke 4 3/19/2016 Review: Direct Addressing
● Suppose:
  ■ The range of keys is 0..m-1
  ■ Keys are distinct
● The idea:
  ■ Use key itself as the address into the table
  ■ Set up an array T[0..m-1] in which
    ◆ T[i] = x if x ∈ T and key[x] = i
    ◆ T[i] = NULL otherwise
  ■ This is called a direct-address table
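
The direct-address idea above can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (distinct integer keys in 0..m-1); the class name `DirectAddressTable` and its method names are hypothetical, not taken from the lecture.

```python
class DirectAddressTable:
    """Direct-address table for distinct integer keys in 0..m-1."""

    def __init__(self, m):
        # T[0..m-1]; None plays the role of NULL.
        self.slots = [None] * m

    def insert(self, key, value):
        # The key itself is the address: T[key[x]] = x.
        self.slots[key] = (key, value)

    def delete(self, key):
        self.slots[key] = None

    def search(self, key):
        # Returns the stored record, or None if no record has this key.
        return self.slots[key]


table = DirectAddressTable(8)
table.insert(3, "three")
table.insert(5, "five")
table.delete(5)
```

Every operation is a single array access, which is why this works well only when the key range m is small enough to allocate.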

David Luebke 5 3/19/2016 Review: Hash Functions
● Next problem: collision
[Figure: a hash function h maps the actual keys K = {k1, …, k5}, drawn from the universe of keys U, to slots of the table T[0..m-1]; h(k2) = h(k5), so k2 and k5 collide.]

David Luebke 6 3/19/2016 Review: Resolving Collisions
● How can we solve the problem of collisions?
● Open addressing
  ■ To insert: if slot is full, try another slot, and another, until an open slot is found (probing)
  ■ To search, follow same sequence of probes as would be used when inserting the element
● Chaining
  ■ Keep linked list of elements in slots
  ■ Upon collision, just add new element to list

David Luebke 7 3/19/2016 Review: Chaining
● Chaining puts elements that hash to the same slot in a linked list:
[Figure: the actual keys K = {k1, …, k8}, drawn from the universe of keys U, hashed into table T; each occupied slot points to a linked list of the keys that hash to it, and empty slots hold a null pointer.]
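
A minimal sketch of chaining, assuming integer keys and the simple hash h(k) = k mod m. The class name `ChainedHashTable` is hypothetical, and Python lists stand in for the linked lists on the slide.

```python
class ChainedHashTable:
    """Hash table with chaining; integer keys assumed, h(k) = k mod m."""

    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]  # one chain per slot

    def _chain(self, key):
        return self.slots[key % self.m]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                 # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))       # empty slot or collision: extend chain

    def search(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        chain[:] = [(k, v) for (k, v) in chain if k != key]


t = ChainedHashTable(4)
t.insert(1, "a")
t.insert(5, "b")   # 5 mod 4 == 1, so this collides with key 1
```

Both keys end up in the chain of slot 1, and a search walks that chain until it finds a matching key.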

David Luebke 8 3/19/2016 Review: Analysis Of Hash Tables
● Simple uniform hashing: each key in table is equally likely to be hashed to any slot
● Load factor α = n/m = average # keys per slot
  ■ Average cost of unsuccessful search = O(1 + α)
  ■ Successful search: O(1 + α/2) = O(1 + α)
  ■ If n is proportional to m, α = O(1)
● So the cost of searching = O(1) if we size our table appropriately

David Luebke 9 3/19/2016 Review: Choosing A Hash Function
● Choosing the hash function well is crucial
  ■ Bad hash function puts all elements in same slot
  ■ A good hash function:
    ◆ Should distribute keys uniformly into slots
    ◆ Should not depend on patterns in the data
● We discussed three methods:
  ■ Division method
  ■ Multiplication method
  ■ Universal hashing

David Luebke 10 3/19/2016 Review: The Division Method
● h(k) = k mod m
  ■ In words: hash k into a table with m slots using the slot given by the remainder of k divided by m
● Elements with adjacent keys hashed to different slots: good
● If keys bear relation to m: bad
● Upshot: pick table size m = prime number not too close to a power of 2 (or 10)
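
The "keys bear relation to m" problem can be seen in a small experiment. The key set below (multiples of 16) and the two table sizes are made-up values chosen for illustration.

```python
def division_hash(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


# Hypothetical key set: every key is a multiple of 16.
keys = [16 * i for i in range(8)]

# With m = 16 (a power of 2 related to the keys), compare with prime m = 13.
slots_pow2 = {division_hash(k, 16) for k in keys}
slots_prime = {division_hash(k, 13) for k in keys}
```

With m = 16 every key lands in the same slot, while the prime table size spreads these eight keys over eight different slots.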

David Luebke 11 3/19/2016 Review: The Multiplication Method
● For a constant A, 0 < A < 1:
● h(k) = ⌊m (kA − ⌊kA⌋)⌋, where kA − ⌊kA⌋ is the fractional part of kA
● Upshot:
  ■ Choose m = 2^P
  ■ Choose A not too close to 0 or 1
  ■ Knuth: Good choice for A = (√5 − 1)/2
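
A floating-point sketch of the multiplication method, using Knuth's suggested constant. The function name `multiplication_hash` and the sample key are assumptions for illustration; a production version would typically use fixed-point integer arithmetic instead of floats.

```python
import math

A = (math.sqrt(5) - 1) / 2   # Knuth's suggested constant, about 0.618


def multiplication_hash(k, m, a=A):
    """h(k) = floor(m * (k*a - floor(k*a)))."""
    frac = (k * a) % 1.0      # fractional part of k*a
    return math.floor(m * frac)


m = 2 ** 14                   # table size chosen as a power of 2
slot = multiplication_hash(123456, m)
```

Because only the fractional part of kA is used, the choice of m is not critical here, which is why a power of 2 is acceptable for this method.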

David Luebke 12 3/19/2016 Review: Universal Hashing
● When attempting to foil a malicious adversary, randomize the algorithm
● Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!)
  ■ Guarantees good performance on average, no matter what keys adversary chooses
  ■ Need a family of hash functions to choose from

David Luebke 13 3/19/2016 Review: Universal Hashing
● Let ℋ be a (finite) collection of hash functions
  ■ …that map a given universe U of keys…
  ■ …into the range {0, 1, …, m - 1}.
● ℋ is universal if:
  ■ for each pair of distinct keys x, y ∈ U, the number of hash functions h ∈ ℋ for which h(x) = h(y) is |ℋ|/m
  ■ In other words:
    ◆ With a random hash function from ℋ, the chance of a collision between x and y (x ≠ y) is exactly 1/m

David Luebke 14 3/19/2016 Review: A Universal Hash Function
● Choose table size m to be prime
● Decompose key x into r+1 bytes, so that x = {x_0, x_1, …, x_r}
  ■ Only requirement is that max value of byte < m
  ■ Let a = {a_0, a_1, …, a_r} denote a sequence of r+1 elements chosen randomly from {0, 1, …, m - 1}
  ■ Define corresponding hash function h_a ∈ ℋ: h_a(x) = (a_0 x_0 + a_1 x_1 + … + a_r x_r) mod m
  ■ With this definition, ℋ has m^(r+1) members
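
The construction can be sketched as follows, assuming the standard textbook definition h_a(x) = (Σ a_i x_i) mod m for this family. The function names are hypothetical. The last helper counts colliding functions by brute force for one pair of keys in a tiny family (m = 5, two "bytes" per key), which should match |ℋ|/m.

```python
import random
from itertools import product


def hash_with(a, x, m):
    """h_a(x) = (sum of a_i * x_i) mod m, for equal-length sequences a and x."""
    return sum(ai * xi for ai, xi in zip(a, x)) % m


def random_hash_function(m, r, rng=random):
    """Pick a = (a_0, ..., a_r) at random once and return the function h_a."""
    a = [rng.randrange(m) for _ in range(r + 1)]
    return lambda x: hash_with(a, x, m)


def count_collisions(x, y, m):
    """Number of functions h_a in the family with h_a(x) == h_a(y)."""
    return sum(1 for a in product(range(m), repeat=len(x))
               if hash_with(a, x, m) == hash_with(a, y, m))


m = 5
# The family has m**2 = 25 members for two-component keys.
collisions = count_collisions((1, 2), (3, 4), m)
```

The random choice of `a` happens once, when the table is created, not on every insert.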

David Luebke 15 3/19/2016 Augmenting Data Structures
● This course is supposed to be about design and analysis of algorithms
● So far, we’ve only looked at one design technique (What is it?)

David Luebke 16 3/19/2016 Augmenting Data Structures
● This course is supposed to be about design and analysis of algorithms
● So far, we’ve only looked at one design technique: divide and conquer
● Next up: augmenting data structures
  ■ Or, “One good thief is worth ten good scholars”

David Luebke 17 3/19/2016 Dynamic Order Statistics
● We’ve seen algorithms for finding the ith element of an unordered set in O(n) time
● Next, a structure to support finding the ith element of a dynamic set in O(lg n) time
  ■ What operations do dynamic sets usually support?
  ■ What structure works well for these?
  ■ How could we use this structure for order statistics?
  ■ How might we augment it to support efficient extraction of order statistics?

David Luebke 18 3/19/2016 Order Statistic Trees
● OS Trees augment red-black trees:
  ■ Associate a size field with each node in the tree
  ■ x->size records the size of subtree rooted at x, including x itself:
[Figure: example tree, each node shown as key (size)]
              M (8)
             /     \
         C (5)     P (2)
        /     \        \
    A (1)    F (3)     Q (1)
            /     \
        D (1)    H (1)

David Luebke 19 3/19/2016 Selection On OS Trees
[Figure: order-statistic tree with key (size) nodes M (8), C (5), P (2), Q (1), A (1), F (3), D (1), H (1)]
How can we use this property to select the ith element of the set?

David Luebke 20 3/19/2016 OS-Select
    OS-Select(x, i)
    {
        r = x->left->size + 1;
        if (i == r)
            return x;
        else if (i < r)
            return OS-Select(x->left, i);
        else
            return OS-Select(x->right, i-r);
    }

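David Luebke 25 3/19/2016 OS-Select Example
● Example: show OS-Select(root, 5):
[Figure: order-statistic tree with key (size) nodes M (8), C (5), P (2), Q (1), A (1), F (3), D (1), H (1)]
    OS-Select(x, i)
    {
        r = x->left->size + 1;
        if (i == r)
            return x;
        else if (i < r)
            return OS-Select(x->left, i);
        else
            return OS-Select(x->right, i-r);
    }
Trace of the recursive calls:
  at M: i = 5, r = 6 (recurse left)
  at C: i = 5, r = 2 (recurse right with i - r = 3)
  at F: i = 3, r = 2 (recurse right with i - r = 1)
  at H: i = 1, r = 1 (return H)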

David Luebke 26 3/19/2016 OS-Select: A Subtlety
    OS-Select(x, i)
    {
        r = x->left->size + 1;
        if (i == r)
            return x;
        else if (i < r)
            return OS-Select(x->left, i);
        else
            return OS-Select(x->right, i-r);
    }
● What happens at the leaves?
● How can we deal elegantly with this?

David Luebke 27 3/19/2016 OS-Select
    OS-Select(x, i)
    {
        r = x->left->size + 1;
        if (i == r)
            return x;
        else if (i < r)
            return OS-Select(x->left, i);
        else
            return OS-Select(x->right, i-r);
    }
● What will be the running time?
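
One common way to handle the leaf case is a sentinel nil node whose size is 0, so that x->left->size is always defined. The Python sketch below assumes that approach. The names `NIL`, `Node`, and `os_select` are illustrative, and the tree is built by hand rather than by red-black insertion.

```python
class _Nil:
    """Sentinel standing in for every missing child; its size is 0."""
    size = 0


NIL = _Nil()


class Node:
    def __init__(self, key, left=NIL, right=NIL):
        self.key = key
        self.left = left
        self.right = right
        # Size of the subtree rooted here, including this node.
        self.size = left.size + right.size + 1


def os_select(x, i):
    """Return the node holding the i-th smallest key (1 <= i <= x.size)."""
    r = x.left.size + 1          # rank of x within its own subtree
    if i == r:
        return x
    elif i < r:
        return os_select(x.left, i)
    else:
        return os_select(x.right, i - r)


# The example tree from the slides, built by hand.
root = Node("M",
            Node("C", Node("A"), Node("F", Node("D"), Node("H"))),
            Node("P", right=Node("Q")))
fifth = os_select(root, 5)
```

Each call descends one level, so the work is proportional to the height of the tree.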

David Luebke 28 3/19/2016 Determining The Rank Of An Element
[Figure: order-statistic tree with key (size) nodes M (8), C (5), P (2), Q (1), A (1), F (3), D (1), H (1); one node is highlighted]
What is the rank of this element?

David Luebke 29 3/19/2016 Determining The Rank Of An Element
[Figure: the same order-statistic tree; a different node is highlighted]
Of this one? Why?

David Luebke 30 3/19/2016 Determining The Rank Of An Element
[Figure: the same order-statistic tree]
Of the root? What’s the pattern here?

David Luebke 31 3/19/2016 Determining The Rank Of An Element
[Figure: the same order-statistic tree; a different node is highlighted]
What about the rank of this element?

David Luebke 32 3/19/2016 Determining The Rank Of An Element
[Figure: the same order-statistic tree; a different node is highlighted]
This one? What’s the pattern here?

David Luebke 33 3/19/2016 OS-Rank
    OS-Rank(T, x)
    {
        r = x->left->size + 1;
        y = x;
        while (y != T->root)
        {
            if (y == y->p->right)
                r = r + y->p->left->size + 1;
            y = y->p;
        }
        return r;
    }
● What will be the running time?
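
A runnable sketch of OS-Rank, assuming parent pointers (`p`) and a size-0 sentinel for missing children. The names `NIL`, `Node`, and `os_rank` are illustrative, and the tree is the hand-built example from the earlier slides.

```python
class _Nil:
    """Sentinel for missing children; contributes size 0."""
    size = 0


NIL = _Nil()


class Node:
    def __init__(self, key, left=NIL, right=NIL):
        self.key = key
        self.left = left
        self.right = right
        self.p = None                         # parent pointer
        self.size = left.size + right.size + 1
        for child in (left, right):
            if child is not NIL:
                child.p = self


def os_rank(root, x):
    """Return the rank of node x (1-based position in sorted order)."""
    r = x.left.size + 1                       # rank of x within its own subtree
    y = x
    while y is not root:
        if y is y.p.right:
            # Everything in the parent's left subtree, plus the parent,
            # precedes y's subtree.
            r += y.p.left.size + 1
        y = y.p
    return r


h = Node("H")
q = Node("Q")
c = Node("C", Node("A"), Node("F", Node("D"), h))
root = Node("M", c, Node("P", right=q))
```

The loop climbs from x to the root, so the cost is again proportional to the height of the tree.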

David Luebke 36 3/19/2016 OS-Trees: Maintaining Sizes
● So we’ve shown that with subtree sizes, order statistic operations can be done in O(lg n) time
● Next step: maintain sizes during Insert() and Delete() operations
  ■ How would we adjust the size fields during insertion on a plain binary search tree?
  ■ A: increment sizes of nodes traversed during search
  ■ Why won’t this work on red-black trees?
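
For a plain (unbalanced) binary search tree, the "increment sizes of nodes traversed" answer can be sketched as below. The names `Node` and `bst_insert` are illustrative, keys are assumed distinct, and no rebalancing is performed.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1                # a new leaf is a subtree of size 1


def bst_insert(root, key):
    """Insert key (assumed distinct) into a plain BST and return the root."""
    new = Node(key)
    if root is None:
        return new
    x = root
    while True:
        x.size += 1                  # the new key ends up in x's subtree
        if key < x.key:
            if x.left is None:
                x.left = new
                break
            x = x.left
        else:
            if x.right is None:
                x.right = new
                break
            x = x.right
    return root


root = None
for k in "MCPAFQDH":                 # insertion order that rebuilds the example tree
    root = bst_insert(root, k)
```

Inserting the keys in this order reproduces the shape and the size fields of the example tree from the earlier slides.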

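David Luebke 37 3/19/2016 Maintaining Size Through Rotation
● Salient point: rotation invalidates only x and y
● Can recalculate their sizes in constant time
  ■ Why?
[Figure: rightRotate(y) transforms a subtree rooted at y (size 19) with left child x (size 11) into a subtree rooted at x (size 19) with right child y (size 12); leftRotate(x) performs the inverse transformation.]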
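
A sketch of the constant-time size fix-up during rotation. To keep it short, the rotations return the new subtree root instead of updating parent pointers, which is a simplification. The three untouched subtrees are given assumed sizes 6, 4, and 7, chosen so that the sizes 19, 11, and 12 from the figure come out.

```python
class Node:
    def __init__(self, key, left=None, right=None, size=None):
        self.key = key
        self.left = left
        self.right = right
        # An explicit size lets a single stub node stand in for a whole subtree.
        self.size = size if size is not None else subtree_size(left) + subtree_size(right) + 1


def subtree_size(n):
    return n.size if n is not None else 0


def left_rotate(x):
    """Rotate x down to the left; return y, the new root of this subtree."""
    y = x.right
    x.right = y.left
    y.left = x
    y.size = x.size                  # y now roots what x used to root
    x.size = subtree_size(x.left) + subtree_size(x.right) + 1
    return y


def right_rotate(y):
    """Rotate y down to the right; return x, the new root of this subtree."""
    x = y.left
    y.left = x.right
    x.right = y
    x.size = y.size                  # x now roots what y used to root
    y.size = subtree_size(y.left) + subtree_size(y.right) + 1
    return x


# Stub subtrees with assumed sizes 6, 4 and 7.
x = Node("x", Node("alpha", size=6), Node("beta", size=4))
y = Node("y", x, Node("gamma", size=7))
sizes_before = (y.size, x.size)
top = right_rotate(y)
sizes_after = (x.size, y.size)
```

Only the two rotated nodes change size, because every other subtree keeps exactly the same set of descendants.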

David Luebke 38 3/19/2016 Augmenting Data Structures: Methodology
● Choose underlying data structure
  ■ E.g., red-black trees
● Determine additional information to maintain
  ■ E.g., subtree sizes
● Verify that information can be maintained for operations that modify the structure
  ■ E.g., Insert(), Delete() (don’t forget rotations!)
● Develop new operations
  ■ E.g., OS-Rank(), OS-Select()

David Luebke 39 3/19/2016 The End
● Up next:
  ■ Interval trees
  ■ Review for midterm