Download presentation
Presentation is loading. Please wait.
1
Grokking Hash Tables
2
A hash table is… Just a data structure that’s used to establish a mapping between arbitrary objects Not to be confused with Map : An interface that specifies methods that a program could call to get/set/query key/value mappings Basically, defines what mappings are A hash table is one way (but not the only) to make a Map And your MondoHashTable is just implementation of that hash table data struct that happens to match the Map interface
3
Example: /** * MondoHashTable is a hash table based * implementation of the Map interface */ import java.util.Map; public class MondoHashTable implements Map { … }
4
So what’s a mapping? A mapping is just a pair relationship between two things In math, we write a mapping m: x y to denote that m establishes relationships between things of type x and things of type y Big restriction: every unique x must map to a single y Essentially: every left hand side has one and only one right hand side
5
Examples of mappings 37 6.0827(abs(sqrt(37))) 79609 “Prof Lane’s office phone” 123456789 studentRecord (“J. Student”) -6.0827 36.999 studentRecord (“J. Student”) gradeSet() largeFunkyDataObject () otherObject () “nigeria” 114 “cs35” 12 Left side is the key; right side is the value
6
Small integers are easy… 37 6.0827 double[] mapArray=new double[200]; mapArray[37]=Math.sqrt(37);
7
What about non-integers? Desire: have a table that gives quick lookup for arbitrary objects Doesn’t require vast space Answer: introduce an intermediate step Hash function turns key into hash code Use hash code to look up key/val pair in table table[h(key)]=
8
The big picture… “nigeria” MondoHashTable get(“nigeria”) h(“nigeria”) 1945462417 nigeria 114
9
Some practical questions 1.9 billion? Isn’t that a little much?
10
Some practical questions 1.9 billion? Isn’t that a little much? A: reduce mod table size int h=hashFunction(Object a) % table.length;
11
Some practical questions 1.9 billion? Isn’t that a little much? A: reduce mod table size int h=hashFunction(Object a) % table.length; Where do the hash functions come from?
12
Some practical questions 1.9 billion? Isn’t that a little much? A: reduce mod table size int h=hashFunction(Object a) % table.length; Where do the hash functions come from? A: java.lang.Object.hashCode()
13
Some practical questions 1.9 billion? Isn’t that a little much? A: reduce mod table size int h=hashFunction(Object a) % table.length; Where do the hash functions come from? A: java.lang.Object.hashCode() How big should the table be, initially?
14
Some practical questions 1.9 billion? Isn’t that a little much? A: reduce mod table size int h=hashFunction(Object a) % table.length; Where do the hash functions come from? A: java.lang.Object.hashCode() How big should the table be, initially? Good choice: pick a prime # Ask the user (arg to constructor)
15
Some practical questions 1.9 billion? Isn’t that a little much? A: reduce mod table size int h=hashFunction(Object a) % table.length; Where do the hash functions come from? A: java.lang.Object.hashCode() How big should the table be, initially? Good choice: pick a prime # Ask the user (arg to constructor) What happens if the table gets too full?
16
Some practical questions 1.9 billion? Isn’t that a little much? A: reduce mod table size int h=hashFunction(Object a) % table.length; Where do the hash functions come from? A: java.lang.Object.hashCode() How big should the table be, initially? Good choice: pick a prime # Ask the user (arg to constructor) What happens if the table gets too full? A: resize it!
17
#1 killer question: Collisions What happens if (a.hashCode()%tSz)==(b.hashCode()%tSz) Depends… If a.equals(b), then these are the same key If not… This is a hash collision Basically, you have two different keys pointing at the same location in the hash table Have to resolve this somehow -- find unique storage for every key and don’t lose anything
18
Collision strategy 1: Chaining Make each cell in the hash table a “bucket” containing multiple key/value pairs h(“nigeria”) nigeria 114 h(“viagra”) viagra 29
19
Collision strategy 2: Open addressing Each cell of the table actually holds a key/value pair When you have a collision, rehash to find a new location for the new pair Linear probing: try next cell in line Quadratic probing: try cell h+1, then h+4, h+9, h+16, h+15 … Double hashing: h(k)=(h 1 (k)+i*h 2 (k)) mod table.size() Repeat probes until you find an empty spot
20
Map.keySet() A common operation on hash tables ( Map s): get all of the keys You’ll probably use this in the “dump” functionality of SpamBGon Map requires: Set keySet(): Returns a set view of the keys contained in this map. What’s a view?
21
Views of data Different interface to the same underlying data Doesn’t copy the data -- just provides new methods for accessing it A “set view of the keys”, then, is an object that behaves like (i.e., implements ) a set, but gives you access to all of the keys of the hash table.
22
The set view picture MondoHashTable keySet() Set size() iterator() contains()
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.