Presentation is loading. Please wait.

Presentation is loading. Please wait.

Hashing Course: Data Structures Lecturer: Uri Zwick March 2008

Similar presentations


Presentation on theme: "Hashing Course: Data Structures Lecturer: Uri Zwick March 2008"— Presentation transcript:

1 Hashing Course: Data Structures Lecturer: Uri Zwick March 2008
TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AA

2 Hashing Huge universe U Hash function h Collisions 1 m-1 Hash table

3 Hashing with chaining We use a hash table to maintain a dictionary, a data structure that supports insert, delete and find operations. 1 m-1 i

4 What makes a hash function good?
Should behave like a “random function” Should have a succinct representation Should be easy to compute Usually interested in families of hash functions Allows rehashing, changing table size, …

5 Two possible views of hashing
The hash function h is fixed. The keys are random. The hash function h is chosen randomly from a family of hash functions. The keys are fixed. Typical assumption (in both scenarios):

6 Universal families of hash functions
A family H of hash functions from U to [m] is said to be universal if and only if

7 A simple universal family
To represent a function from the family we only need two numbers, a and b. The size m of the hash table is arbitrary.

8 Probabilistic analysis of chaining
n – number of elements in dictionary D m – size of hash table Assume that h is randomly chosen from a universal family H Expected Worst-case Insert O(1) Delete/Find O(1+n/m) O(n)

9 Perfect hashing Suppose that D is static.
We want to implement Find is O(1) worst case time. Perfect hashing: No collisions Can we achieve it?

10 Expected no. of collisions

11 Expected no. of collisions
If we are willing to use m=n2, then any universal family contains a perfect hash function. No collisions!

12 2-level hashing

13 2-level hashing

14 Assume that each hi can be represented using 2 words
2-level hashing Assume that each hi can be represented using 2 words Total size:

15 A randomized algorithm for constructing a perfect 2-level hash table:
Choose a random h from H(n) and compute the number of collisions. If there are more than n collisions, repeat. For each cell i,if ni>1, choose a random hash function from H(ni2). If there are any collisions, repeat. Expected construction time – O(n) Worst case find time – O(1)

16 Hashing with open addressing
Hashing without pointers Insert key k in the first free position among Assumed to be a permutation No room found  Table is full To search, follow the same order

17 Hashing with open addressing

18 Hashing with open addressing
Caution: When we delete elements, do not set the corresponding cells to nil!

19 Probabilistic analysis of open addressing
n – number of elements in dictionary D m – size of hash table =n/m – load factor (Note: 1) Assume that for every k, h(k,0),…,h(k,m-1) is random permutation Expected time of Insert or Find is:

20 Probabilistic analysis of open addressing
Expected time of Insert or Find is: Non rigorous proof: If we choose a random element in the table, the probability that it is full is . The probability that the first i locations in the probe sequence are all occupied is therefore i.

21 Implementation of open addressing
How do we define h(k,i) ? Linear probing: Quadratic probing: Double hashing:


Download ppt "Hashing Course: Data Structures Lecturer: Uri Zwick March 2008"

Similar presentations


Ads by Google