CSE 326: Data Structures Lecture #14 Whoa… Good Hash, Man Steve Wolfman Winter Quarter 2000.


1 CSE 326: Data Structures Lecture #14 Whoa… Good Hash, Man Steve Wolfman Winter Quarter 2000

2 Today’s Outline
Discuss the midterm
Hashing and Hash Tables

3 Reminder: Dictionary ADT
Dictionary operations
–create
–destroy
–insert
–find
–delete
Stores values associated with user-specified keys
–values may be any (homogeneous) type
–keys may be any (homogeneous) comparable type
Example entries (key - value):
Zasha - interesting ID, but not enough oomph!
Bone - more oomph, less high-scoring Scrabble action
Wolf - the perfect mix of oomph and Scrabble value
Example operations:
insert: Darth - formidable
find(Wolf): Wolf - the perfect mix of oomph and Scrabble value

4 Implementations So Far
                                 insert     delete     find
Unsorted list                    O(1)       O(n)       O(n)
Trees                            O(log n)   O(log n)   O(log n)
Special case: integer keys       O(1)       O(1)       O(1)
  between 0 and k
How about O(1) insert/find/delete for any key type?

5 Hash Table Goal
We can do: a[2] = “Andrew” (an array indexed by the integers 0, 1, 2, 3, …, k-1)
We want to do: a[“Steve”] = “Andrew” (an array indexed by keys such as “Ed”, “Brad”, “Steve”, “Nic”, “Zasha”)

6 Hash Table Approach
[Figure: a hash function f(x) maps the keys Zasha, Steve, Nic, Brad, and Ed to table slots]
But… is there a problem in this pipe-dream?

7 Hash Table Dictionary Data Structure
Hash function: maps keys to integers
–result: can quickly find the right spot for a given entry
Unordered and sparse table
–result: cannot efficiently list all entries
[Figure: f(x) maps the keys Zasha, Steve, Nic, Brad, and Ed to table slots]

8 Hash Table Terminology
[Figure labels: keys (Zasha, Steve, Nic, Brad, Ed), hash function f(x), collision]
load factor = (# of entries in table) / tableSize

9 Hash Table Code First Pass
Value & find(Key & key) {
  int index = hash(key) % tableSize;
  return Table[index];  // look up the slot the key hashes to
}
What should the hash function be?
What should the table size be?
How should we resolve collisions?

10 A Good Hash Function…
…is easy (fast) to compute (O(1) and practically fast).
…distributes the data evenly (hash(a) % size ≠ hash(b) % size).
…uses the whole hash table (for all 0 ≤ k < size, there’s an i such that hash(i) % size = k).

11 Good Hash Function for Integers
Choose
–tableSize is prime
–hash(n) = n
Example:
–tableSize = 7
–operations: insert(4), insert(17), find(12), insert(9), delete(17)
–table slots: 0 1 2 3 4 5 6
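
The example on this slide can be traced with a small C++ sketch. The class name `IntHashSet` and its methods are hypothetical, not from the lecture, and collision handling is deliberately left out because the slides have not introduced it yet: `insert` simply reports failure if a different key already occupies the slot.

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch only: hash(n) = n, slot = hash(n) % tableSize.
// No collision resolution; a slot holds at most one key.
class IntHashSet {
public:
    explicit IntHashSet(int tableSize)
        : keys_(tableSize, 0), used_(tableSize, false) {
        assert(tableSize > 0);
    }

    // Slot for a key; the extra "+ size" keeps negative keys in range.
    int indexOf(int key) const {
        int size = static_cast<int>(keys_.size());
        return ((key % size) + size) % size;
    }

    // Returns false if a different key already occupies the slot.
    bool insert(int key) {
        int i = indexOf(key);
        if (used_[i] && keys_[i] != key) return false;
        keys_[i] = key;
        used_[i] = true;
        return true;
    }

    bool find(int key) const {
        int i = indexOf(key);
        return used_[i] && keys_[i] == key;
    }

    void remove(int key) {
        int i = indexOf(key);
        if (used_[i] && keys_[i] == key) used_[i] = false;
    }

private:
    std::vector<int> keys_;
    std::vector<bool> used_;
};
```

With `tableSize = 7`, the keys from the slide land in slots 4 (key 4), 3 (key 17), and 2 (key 9), while `find(12)` probes slot 5 and finds nothing, so this particular sequence never collides.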

12 Good Hash Function for Strings?
Let s = s_1 s_2 s_3 s_4 … s_n: choose
–hash(s) = s_1 + s_2*128 + s_3*128^2 + s_4*128^3 + … + s_n*128^(n-1)
Think of the string as a base 128 number.
Problems:
–hash(“really, really big”) = well… something really, really big
–hash(“one thing”) % 128 = hash(“other thing”) % 128 (every term after s_1 is a multiple of 128, so only the first character survives the mod)
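
The second problem can be checked directly. The sketch below evaluates the base-128 polynomial term by term, reducing modulo `m` at every step so that nothing overflows. The function name `polyHashMod` is hypothetical, and the code assumes `m` is at least 1 and small enough that `m * m` fits in 64 bits.

```cpp
#include <cassert>
#include <string>

// Illustrative sketch: s_1 + s_2*128 + ... + s_n*128^(n-1), reduced mod m.
// Assumes 1 <= m < 2^32 so the intermediate products cannot overflow.
unsigned long long polyHashMod(const std::string& s, unsigned long long m) {
    assert(m >= 1);
    unsigned long long sum = 0;
    unsigned long long power = 1 % m;  // 128^0 mod m
    for (unsigned char c : s) {
        sum = (sum + (c % m) * power) % m;
        power = (power * 128) % m;     // next power of 128, still reduced
    }
    return sum;
}
```

Calling `polyHashMod("one thing", 128)` and `polyHashMod("other thing", 128)` both return the character code of `'o'`, because `power` becomes 0 after the first character. With a table size that shares no factor with 128, such as a prime, the later characters contribute to the result.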

13 Making the String Hash Easy to Compute
Use Horner’s Rule
int hash(String s) {
  int h = 0;
  for (int i = s.length() - 1; i >= 0; i--) {
    h = (s[i] + 128*h) % tableSize;  // fold in one character, keep h below tableSize
  }
  return h;
}

14 Making the String Hash Cause Few Conflicts Ideas?

15 Good Hashing: Multiplication Method
Hash function is defined by size plus a parameter A
h_A(k) = ⌊size * (k*A mod 1)⌋ where 0 < A < 1
Example: size = 10, A = 0.485
h_A(50) = ⌊10 * (50*0.485 mod 1)⌋ = ⌊10 * (24.25 mod 1)⌋ = ⌊10 * 0.25⌋ = 2
–no restriction on size!
–if we’re building a static table, we can try several As
–more computationally intensive than a single mod
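
A minimal C++ sketch of the multiplication method, assuming non-negative integer keys and double-precision arithmetic. The function name `multiplicationHash` is hypothetical, and `std::fmod(x, 1.0)` stands in for the slide's "mod 1" (the fractional part).

```cpp
#include <cassert>
#include <cmath>

// Illustrative sketch: h_A(k) = floor(size * (k*A mod 1)), with 0 < A < 1.
// Floating-point rounding makes this approximate for very large keys.
int multiplicationHash(long long k, double A, int size) {
    assert(size > 0 && A > 0.0 && A < 1.0 && k >= 0);
    double frac = std::fmod(static_cast<double>(k) * A, 1.0);  // fractional part of k*A
    return static_cast<int>(std::floor(size * frac));
}
```

For the slide's example, `multiplicationHash(50, 0.485, 10)` evaluates to 2. Nothing in the function depends on `size` being prime, which is the "no restriction on size" point above.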

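16 Good Hashing: Universal Hash Function
Parameterized by prime size and vector: a = <a_0, a_1, …, a_r> where 0 <= a_i < size
Represent each key as r + 1 integers <k_0, k_1, …, k_r> where k_i < size
–size = 11, key = 39752 ==> [vector missing from transcript]
–size = 29, key = “hello world” ==> [vector missing from transcript]
h_a(k) = (a_0*k_0 + a_1*k_1 + … + a_r*k_r) % size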

17 Universal Hash Function: Example
Context: hash strings of length 3 in a table of size 131
let a = <35, 100, 21>
h_a(“xyz”) = (35*120 + 100*121 + 21*122) % 131 = 129
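
The arithmetic in this example can be reproduced with a short C++ sketch. The function name `universalHash` is hypothetical; the key is assumed to be already split into integers k_i, one per coefficient a_i, and the character codes 120, 121, 122 for "xyz" are taken from the slide's own computation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch: h_a(k) = (sum of a_i * k_i) % size.
// Reducing after each term keeps the running sum small.
int universalHash(const std::vector<int>& a, const std::vector<int>& k, int size) {
    assert(a.size() == k.size() && size > 0);
    long long sum = 0;
    for (std::size_t i = 0; i < k.size(); ++i) {
        sum = (sum + static_cast<long long>(a[i]) * k[i]) % size;
    }
    return static_cast<int>(sum);
}
```

Passing the coefficients {35, 100, 21}, the key pieces {120, 121, 122}, and size 131 gives 129, matching the slide. Trying a different random vector only means passing a different first argument.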

18 Universal Hash Function
Strengths:
–works on any type as long as you can form k_i’s
–if we’re building a static table, we can try many a’s
–a random a has guaranteed good properties no matter what we’re hashing
Weaknesses:
–must choose prime table size larger than any k_i

19 Alternate Universal Hash Function
Parameterized by k, a, and b:
–k * size should fit into an int
–a and b must be less than size
h_{k,a,b}(x) = ((a*x + b) % (k*size)) / k

20 Alternate Universe Hash Function: Example
Context: hash integers in a table of size 16
let k = 32, a = 100, b = 200
h_{k,a,b}(1000) = ((100*1000 + 200) % (32*16)) / 32 = (100200 % 512) / 32 = 360 / 32 = 11 (integer division)
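
The same computation as a C++ sketch, assuming non-negative keys and integer division for the final step. The function name `altUniversalHash` is hypothetical, and the formula is the one implied by the worked numbers on this slide.

```cpp
#include <cassert>

// Illustrative sketch: h_{k,a,b}(x) = ((a*x + b) % (k*size)) / k.
// The intermediate value lies in [0, k*size), so dividing by k lands in [0, size).
int altUniversalHash(long long x, int k, int a, int b, int size) {
    assert(k > 0 && size > 0 && x >= 0 && a >= 0 && b >= 0);
    long long modulus = static_cast<long long>(k) * size;
    return static_cast<int>(((a * x + b) % modulus) / k);
}
```

Here `altUniversalHash(1000, 32, 100, 200, 16)` evaluates to 11, as on the slide. When `k` and `size` are both powers of 2, a compiler can typically turn the `%` and `/` into a bit mask and a shift.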

21 Universal Hash Function
Strengths:
–if we’re building a static table, we can try many a’s
–random a,b has guaranteed good properties no matter what we’re hashing
–can choose any size table
–very efficient if k and size are powers of 2
Weaknesses:
–still need to turn non-integer keys into integers

22 Collisions
Pigeonhole principle says we can’t avoid all collisions
–try to hash without collision m keys into n slots with m > n
–try to put 10 pigeons into 5 holes
What do we do when two keys hash to the same entry?
–open hashing: put little dictionaries in each entry (shove extra pigeons in one hole!)
–closed hashing: pick a next entry to try
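
As a rough illustration of open hashing, the sketch below keeps a small linked list (the "little dictionary") in each table entry. The class name `ChainedTable`, the `int` keys, and the `std::string` values are all assumptions made for the example; the lecture does not specify an implementation here.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Illustrative sketch of open hashing (separate chaining) with hash(n) = n.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t tableSize) : buckets_(tableSize) {
        assert(tableSize > 0);
    }

    // Insert a new entry, or overwrite the value if the key is already present.
    void insert(int key, const std::string& value) {
        auto& chain = buckets_[indexOf(key)];
        for (auto& entry : chain) {
            if (entry.first == key) { entry.second = value; return; }
        }
        chain.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string* find(int key) const {
        for (const auto& entry : buckets_[indexOf(key)]) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

    // Returns true if the key was present and has been removed.
    bool remove(int key) {
        auto& chain = buckets_[indexOf(key)];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (it->first == key) { chain.erase(it); return true; }
        }
        return false;
    }

private:
    // Cast through unsigned so negative keys still map to a valid bucket.
    std::size_t indexOf(int key) const {
        return static_cast<std::size_t>(static_cast<unsigned int>(key)) % buckets_.size();
    }

    std::vector<std::list<std::pair<int, std::string>>> buckets_;
};
```

With a table of size 5 and keys 0 through 9 (ten pigeons, five holes), every bucket ends up holding two entries, and each operation only walks the one chain its key hashes to.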

23 To Do
Form your team and start Project III
Read chapter 5 in the book

24 Coming Up
More hash tables
Disjoint-set union-find ADT
Fourth Quiz (February 10th)

