Grokking Hash Tables. A hash table is… Just a data structure that’s used to establish a mapping between arbitrary objects Not to be confused with Map.


1 Grokking Hash Tables

2 A hash table is…
Just a data structure that’s used to establish a mapping between arbitrary objects
Not to be confused with Map:
    An interface that specifies methods that a program could call to get/set/query key/value mappings
    Basically, defines what mappings are
A hash table is one way (but not the only one) to make a Map
And your MondoHashTable is just an implementation of that hash table data structure that happens to match the Map interface

3 Example:
    import java.util.Map;
    /**
     * MondoHashTable is a hash table based
     * implementation of the Map interface
     */
    public class MondoHashTable implements Map {
        …
    }
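Because Map is only an interface, calling code can stay the same whichever implementation sits behind it. The sketch below illustrates that with two JDK classes: HashMap (a hash table) and TreeMap (a tree, not a hash table). The class and method names are made up for illustration and are not part of MondoHashTable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class MapInterfaceDemo {
    // Written against the Map interface only; it never learns which
    // data structure actually stores the pairs.
    static void fill(Map<String, Integer> counts) {
        counts.put("nigeria", 114);
        counts.put("cs35", 12);
    }

    // Returns the stored count, or 0 when the key has no mapping.
    static int countOf(Map<String, Integer> counts, String word) {
        Integer n = counts.get(word);
        return n == null ? 0 : n;
    }

    static Map<String, Integer> hashBacked() {
        Map<String, Integer> m = new HashMap<>();
        fill(m);
        return m;
    }

    static Map<String, Integer> treeBacked() {
        Map<String, Integer> m = new TreeMap<>();
        fill(m);
        return m;
    }
}
```

Both maps answer countOf the same way; only the storage behind the interface differs.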

4 So what’s a mapping?
A mapping is just a pair relationship between two things
In math, we write a mapping m: x → y to denote that m establishes relationships between things of type x and things of type y
Big restriction: every unique x must map to a single y
Essentially: every left hand side has one and only one right hand side

5 Examples of mappings
    37 → 6.0827 (abs(sqrt(37)))
    79609 → “Prof Lane’s office phone”
    123456789 → studentRecord(“J. Student”)
    -6.0827 → 36.999
    studentRecord(“J. Student”) → gradeSet()
    largeFunkyDataObject() → otherObject()
    “nigeria” → 114
    “cs35” → 12
Left side is the key; right side is the value
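The "one left hand side, one right hand side" restriction shows up directly in java.util.Map: putting a second value under an existing key replaces the first, while two different keys are free to share a value. A small sketch, where MappingDemo and the replacement value 13 are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

final class MappingDemo {
    static Map<Object, Object> examples() {
        Map<Object, Object> m = new HashMap<>();
        m.put(37, Math.sqrt(37));   // integer key, double value
        m.put("nigeria", 114);      // string key, integer value
        m.put("cs35", 12);
        // Same key again: the old value 12 is replaced, not kept alongside.
        m.put("cs35", 13);
        // A different key may map to a value that is already in use.
        m.put("cs36", 13);
        return m;
    }
}
```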

6 Small integers are easy…
37 → 6.0827
    double[] mapArray = new double[200];
    mapArray[37] = Math.sqrt(37);

7 What about non-integers?
Desire: have a table that gives quick lookup for arbitrary objects
    Doesn’t require vast space
Answer: introduce an intermediate step
    Hash function turns key into hash code
    Use hash code to look up key/val pair in table
    table[h(key)]=

8 The big picture…
[Diagram: “nigeria” → MondoHashTable.get(“nigeria”) → h(“nigeria”) = 1945462417 → table cell holding nigeria, 114]
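The two steps in the picture (key to hash code, hash code to table cell) can be sketched as follows. The number on the slide is what String.hashCode() returns for "nigeria"; the table length of 101 and the helper name indexFor are assumptions made for this example.

```java
final class HashIndexDemo {
    // Step 1: ask the key for its hash code.
    static int hashOf(Object key) {
        return key.hashCode();
    }

    // Step 2: fold the (possibly huge, possibly negative) hash code
    // into a valid array index for a table of the given length.
    static int indexFor(Object key, int tableLength) {
        return Math.floorMod(hashOf(key), tableLength);
    }
}
```

For example, HashIndexDemo.hashOf("nigeria") is 1945462417, and HashIndexDemo.indexFor("nigeria", 101) is always some cell between 0 and 100.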

16 Some practical questions
1.9 billion? Isn’t that a little much?
    A: reduce mod table size
    int h = Math.floorMod(hashFunction(a), table.length);  // floorMod rather than %, because Java hash codes can be negative
Where do the hash functions come from?
    A: java.lang.Object.hashCode()
How big should the table be, initially?
    Good choice: pick a prime #
    Ask the user (arg to constructor)
What happens if the table gets too full?
    A: resize it!
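Two of these answers interact: a cell index depends on the table length, so resizing means every key has to be placed again. The sketch below shows that, plus why a plain % is risky with negative hash codes. It is only an illustration: the 0.75 load threshold, the list-of-lists table layout, and all names are assumptions, not the MondoHashTable design.

```java
import java.util.ArrayList;
import java.util.List;

final class ResizeSketch {
    // Assumed threshold: call the table "too full" above 75% occupancy.
    static final double MAX_LOAD = 0.75;

    // floorMod keeps the index non-negative; in Java, -7 % 5 is -2.
    static int indexFor(Object key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    static boolean tooFull(int size, int capacity) {
        return size > MAX_LOAD * capacity;
    }

    // Each cell is a small list so this sketch can ignore collisions.
    static List<List<String>> emptyTable(int capacity) {
        List<List<String>> table = new ArrayList<>();
        for (int i = 0; i < capacity; i++) {
            table.add(new ArrayList<>());
        }
        return table;
    }

    static void add(List<List<String>> table, String key) {
        table.get(indexFor(key, table.size())).add(key);
    }

    // Resizing = build a bigger table and re-insert every key,
    // because each key's index is computed from the new capacity.
    static List<List<String>> resize(List<List<String>> old, int newCapacity) {
        List<List<String>> bigger = emptyTable(newCapacity);
        for (List<String> cell : old) {
            for (String key : cell) {
                add(bigger, key);
            }
        }
        return bigger;
    }
}
```

A caller would typically check tooFull after each insertion and, when it returns true, call resize with a larger (ideally prime) capacity.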

17 #1 killer question: Collisions
What happens if (a.hashCode()%tSz)==(b.hashCode()%tSz)
Depends…
    If a.equals(b), then these are the same key
    If not… This is a hash collision
Basically, you have two different keys pointing at the same location in the hash table
Have to resolve this somehow -- find unique storage for every key and don’t lose anything
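A concrete collision is easy to produce with Java strings: "Aa" and "BB" are different keys with the same hashCode() (2112), so they land in the same cell for any table size. The sketch below encodes the slide's test; the class name and the table size of 11 in the usage are illustrative.

```java
final class CollisionDemo {
    // True when a and b are different keys that map to the same cell
    // of a table with tSz cells, i.e. a genuine hash collision.
    static boolean collide(Object a, Object b, int tSz) {
        boolean sameCell =
            Math.floorMod(a.hashCode(), tSz) == Math.floorMod(b.hashCode(), tSz);
        return sameCell && !a.equals(b);
    }
}
```

CollisionDemo.collide("Aa", "BB", 11) is true, while CollisionDemo.collide("Aa", "Aa", 11) is false because equal keys are the same key, not a collision.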

18 Collision strategy 1: Chaining
Make each cell in the hash table a “bucket” containing multiple key/value pairs
[Diagram: h(“nigeria”) → bucket entry (nigeria, 114); h(“viagra”) → bucket entry (viagra, 29)]
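A minimal sketch of chaining, assuming each bucket is a LinkedList of entries and that keys are never null. ChainedTable and its Entry class are hypothetical names; the sketch omits resizing and removal and is not meant as the MondoHashTable solution.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class ChainedTable<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    // One bucket per cell; colliding keys simply share a bucket.
    private final List<LinkedList<Entry<K, V>>> buckets = new ArrayList<>();

    ChainedTable(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    private LinkedList<Entry<K, V>> bucketFor(K key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    // Returns the previous value for key, or null if there was none.
    V put(K key, V value) {
        LinkedList<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> e : bucket) {
            if (e.key.equals(key)) {   // same key: overwrite in place
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        bucket.add(new Entry<>(key, value));  // new key: extend the chain
        return null;
    }

    // Returns the value for key, or null if the key is absent.
    V get(K key) {
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

With a capacity of 1 every key collides, yet put and get still work; lookups just degrade to a linear scan of the single chain.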

19 Collision strategy 2: Open addressing
Each cell of the table actually holds a key/value pair
When you have a collision, rehash to find a new location for the new pair
    Linear probing: try next cell in line
    Quadratic probing: try cell h+1, then h+4, h+9, h+16, h+25 …
    Double hashing: h(k, i) = (h1(k) + i*h2(k)) mod table.size(), for probe number i = 0, 1, 2, …
Repeat probes until you find an empty spot
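The probing rules can be sketched as below: a small fixed-size table that uses linear probing, plus two helper functions that compute the i-th probe position for quadratic probing and double hashing. Everything here (names, String keys, int values, no removal, no resizing) is an assumption made to keep the example short.

```java
final class LinearProbeTable {
    private final String[] keys;
    private final int[] values;

    LinearProbeTable(int capacity) {
        keys = new String[capacity];
        values = new int[capacity];
    }

    // Walks home, home+1, home+2, ... (wrapping around) and returns the
    // cell that holds key or the first empty cell; -1 if neither exists.
    private int findSlot(String key) {
        int home = Math.floorMod(key.hashCode(), keys.length);
        for (int i = 0; i < keys.length; i++) {
            int slot = (home + i) % keys.length;
            if (keys[slot] == null || keys[slot].equals(key)) {
                return slot;
            }
        }
        return -1;
    }

    // Returns false only when the table is full and key is not in it.
    boolean put(String key, int value) {
        int slot = findSlot(key);
        if (slot < 0) {
            return false;
        }
        keys[slot] = key;
        values[slot] = value;
        return true;
    }

    // Returns the value for key, or null if the key is absent.
    Integer get(String key) {
        int slot = findSlot(key);
        if (slot < 0 || keys[slot] == null) {
            return null;
        }
        return values[slot];
    }

    // i-th quadratic probe: home + i*i, wrapped to the table.
    static int quadraticProbe(int home, int i, int capacity) {
        return Math.floorMod(home + i * i, capacity);
    }

    // i-th double-hashing probe: h1 + i*h2, wrapped to the table.
    static int doubleHashProbe(int h1, int h2, int i, int capacity) {
        return Math.floorMod(h1 + i * h2, capacity);
    }
}
```

Removal is deliberately left out: in open addressing, clearing a cell can break the probe path to keys stored after it, so real implementations need extra bookkeeping such as "deleted" markers.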

20 Map.keySet()
A common operation on hash tables (Maps): get all of the keys
You’ll probably use this in the “dump” functionality of SpamBGon
Map requires:
    Set keySet(): Returns a set view of the keys contained in this map.
What’s a view?

21 Views of data
Different interface to the same underlying data
Doesn’t copy the data -- just provides new methods for accessing it
A “set view of the keys”, then, is an object that behaves like (i.e., implements) a set, but gives you access to all of the keys of the hash table.
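The JDK's own HashMap shows what "doesn't copy the data" means in practice: the set returned by keySet() reflects later changes to the map, and removing a key through the set removes the mapping from the map. A short demonstration, with class and method names invented for the example:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class KeySetViewDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> m = new HashMap<>();
        m.put("nigeria", 114);
        m.put("cs35", 12);
        return m;
    }

    // The view is obtained first, the map is changed afterwards,
    // and the view still sees the new key.
    static boolean viewSeesLaterPut() {
        Map<String, Integer> m = sample();
        Set<String> keys = m.keySet();
        m.put("viagra", 29);
        return keys.contains("viagra");
    }

    // Removing through the view changes the underlying map.
    static boolean removalThroughViewHitsMap() {
        Map<String, Integer> m = sample();
        Set<String> keys = m.keySet();
        keys.remove("cs35");
        return !m.containsKey("cs35");
    }
}
```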

22 The set view picture
[Diagram: MondoHashTable.keySet() returns a Set view offering size(), iterator(), and contains()]
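One possible way to build such a view is to extend java.util.AbstractSet, which supplies contains() and most other Set methods once size() and iterator() are defined. The sketch below assumes, purely for illustration, that the table stores its keys in a String[] with null marking an empty cell; the real MondoHashTable layout may differ. The view is read-only because the iterator does not override remove().

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

final class KeyViewSketch {
    // Returns a Set that reads straight from cells; nothing is copied.
    static Set<String> keyView(final String[] cells) {
        return new AbstractSet<String>() {
            @Override
            public int size() {
                int n = 0;
                for (String c : cells) {
                    if (c != null) {
                        n++;
                    }
                }
                return n;
            }

            @Override
            public Iterator<String> iterator() {
                return new Iterator<String>() {
                    // Index of the next occupied cell, or cells.length.
                    private int cursor = advance(0);

                    private int advance(int from) {
                        while (from < cells.length && cells[from] == null) {
                            from++;
                        }
                        return from;
                    }

                    @Override
                    public boolean hasNext() {
                        return cursor < cells.length;
                    }

                    @Override
                    public String next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        String key = cells[cursor];
                        cursor = advance(cursor + 1);
                        return key;
                    }
                };
            }
        };
    }
}
```

Because the view holds a reference to the array rather than a copy, a key written into cells after the view was created is visible through the view immediately.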

