Hashing: is an alternative search technique (earlier we had BST) Motivation: Try to access directly each possible keys! Suggestion: Enumerate possible.


1 Hashing is an alternative search technique (earlier we had binary search trees, BST). Motivation: try to access each possible key directly. Suggestion: enumerate the possible keys, which means imposing an ordering on them.

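2 Example: key space of 3 binary digits, so there are N = 2³ = 8 possible keys, which can be enumerated 1, 2, ..., 8.
[Figure: a vector with one position per possible key; some positions are marked 1.]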

3 Problem: in real life this is not possible.
1. Names: ~20 characters long needs 26²⁰ (26 letters). 2. Decimal numbers: 20 digits needs 10²⁰ (0,..,9: 10 symbols). 3. Binary numbers: 20 bits needs 2²⁰. => Too many => we cannot consider all of them.

4 We cannot create vectors as long as 2²⁰ or ... 26²⁰!
But: we don't need such long vectors, because not all combinations occur in practice! For example: a vocabulary contains maybe ~100,000 words. Personal names (Memphis) in the white pages: 300 pp * 300 per page ~ 100,000, or ~ ½ million.

5 Idea: assume that approximately N keys actually occur. Define a vector of length ~2N. Assign integers in [1,..,2N] to the keys! Look up a key according to its serial number (integer).
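To put the numbers from the previous slides side by side, here is a minimal sketch that compares the three complete key spaces with a table of length 2N. The value N = 100,000 is taken from the vocabulary estimate above; the variable names are illustrative only.

```python
# Sizes of the complete key spaces quoted on the slides.
key_space_sizes = {
    "20-letter names (26 letters)": 26 ** 20,
    "20-digit decimal numbers": 10 ** 20,
    "20-bit binary numbers": 2 ** 20,
}

n_keys = 100_000            # assumed number of keys that occur in practice
table_length = 2 * n_keys   # the vector of length ~2N from the idea above

for label, size in key_space_sizes.items():
    ratio = size / table_length
    print(f"{label}: {size:.3e} possible keys, {ratio:.1e} times the table length")
```

The point of the comparison is only the order of magnitude: the table is sized by the keys that occur, not by the keys that are possible.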

6 A Dynamical System Perspective:
Hashing, chopping up, granulating, coarse graining information! This is done continuously, autonomously, reliably… in bio-systems (worms, ants, ..., humans alike).

7 The Major Challenge Of Life:
How can delicately defined living substances exist in an infinitely complex world? How can animals survive and succeed? How do they separate the important from the useless? => There are mechanisms that complete 'hashing' very efficiently and promptly. => This course is far from that, but it indicates a few main principles.

8 Major issues in Hashing:
1. How to assign the hash codes? By using the hashing function. 2. Sometimes the hashing function gives the same number to different keys (a collision). We have to resolve this.

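9 Complete space => Hashing => HASH TABLE
[Figure: hashing maps the complete key space (e.g. 26²⁰ elements) onto a hash table of length 2N; the figure labels are K and L.]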

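10 Example: dates: 1055, 1492, 1776, 1812, 1918, 1945. Q: What is the complete space? Hash function: HashCode(x) = 5x mod 8. Keys by hash code: 0: 1776; 3: 1055; 4: 1492, 1812; 5: 1945; 6: 1918 (cells 1, 2 and 7 stay empty, and 1492 and 1812 collide in cell 4).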
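The example can be reproduced with a short sketch. The function names `hash_code` and `build_chained_table` are illustrative, not from the slides; storing colliding keys in a per-cell list anticipates the chained scheme evaluated on the next slide.

```python
def hash_code(x, a=5, h=8):
    """HashCode(x) = (a * x) mod h, with the slide's constants a = 5, h = 8."""
    return (a * x) % h


def build_chained_table(keys, h=8):
    """One list per cell; keys with the same hash code share a list."""
    table = [[] for _ in range(h)]
    for key in keys:
        table[hash_code(key, h=h)].append(key)
    return table


dates = [1055, 1492, 1776, 1812, 1918, 1945]
table = build_chained_table(dates)
for cell, chain in enumerate(table):
    print(cell, chain)
```

Cell 4 ends up holding both 1492 and 1812, which is the collision that the following slides deal with.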

11 Evaluation of closed address or chained hashing
Costs of search: 1. Compute the hash code i: costs 'a'. 2. Search through the linked list H[i].
[Figure: hashing maps the keys onto linked lists H[1], ..., H[h] of lengths L1, L2, L3, ..., Lh.]

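12 Average total cost of a successful search for a key k: $T(n) = a + \frac{1}{n}\sum_{i=0}^{h-1} \frac{L_i (L_i + 1)}{2}$, where $L_i$ is the length of list H[i] and the cells are numbered 0,..,h-1, as the hash codes are.
Worst case (bad distribution): all keys are in the same bucket. A search needs about n/2 comparisons on average, the same as searching an unordered array. Better (good distribution): the keys are equally distributed among the cells. With a constant load factor α = n/h, each list holds about α keys, so a search needs about (α + 1)/2 comparisons, i.e. O(1) computations on average.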

13 Hashing evaluation continued:
For a uniform distribution there is very good performance. But this requires a hashing function that gives a uniform distribution independently of the actual data! Randomization: use a computer pseudo-random generator, e.g. a multiplicative congruential one.

14 HashCode(K) = (aK) mod h.
Strategy: multiply the key K by a constant a and take the modulus (i.e. the remainder after division by h).

15 Open Address Hashing: this is really dynamic. Colliding keys are not kept in linked lists (as before in closed address hashing); every key is stored in the array itself. Load factor: α = n/h < 1 (if α >= 0.5, array doubling). If there is a collision: rehashing. 1. Linear probing. Simple: Rehash(j) = (j+1) mod h, where j is the most recently probed location; start with j = i and go on until an empty cell is found (e.g. 6.10).
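A minimal sketch of open addressing with linear probing, under stated assumptions: the class and method names are hypothetical, empty cells are marked with `None` (so `None` cannot be a key), the hash code is (a * K) mod h as on the earlier slide, and the array is doubled as soon as the load factor reaches 0.5. Deletion is not covered.

```python
class LinearProbingTable:
    """Open-address table; collisions use Rehash(j) = (j + 1) mod h."""

    def __init__(self, h=8, a=5):
        self.h = h                  # number of cells
        self.a = a                  # multiplier in (a * K) mod h
        self.n = 0                  # number of stored keys
        self.cells = [None] * h     # None marks an empty cell

    def _hash_code(self, key):
        return (self.a * key) % self.h

    def _probe(self, key):
        """Index of key, or of the first empty cell on its probe path."""
        j = self._hash_code(key)
        while self.cells[j] is not None and self.cells[j] != key:
            j = (j + 1) % self.h    # linear probing step
        return j

    def insert(self, key):
        j = self._probe(key)
        if self.cells[j] is None:   # key not present yet
            self.cells[j] = key
            self.n += 1
            if self.n / self.h >= 0.5:
                self._double()

    def contains(self, key):
        return self.cells[self._probe(key)] == key

    def _double(self):
        """Double the array and re-insert every stored key."""
        old_keys = [k for k in self.cells if k is not None]
        self.h *= 2
        self.cells = [None] * self.h
        self.n = 0
        for k in old_keys:
            self.insert(k)


table = LinearProbingTable(h=8, a=5)
for year in [1055, 1492, 1776, 1812, 1918, 1945]:
    table.insert(year)
print(table.h, table.contains(1812), table.contains(2000))  # 16 True False
```

Keeping the load factor below 0.5 guarantees that every probe path reaches an empty cell, so the loop in `_probe` always terminates.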

16 Rehashing: 2. Double hashing: Rehash(j, d) = (j + d) mod h. Here d is the increment of rehashing. If d = 1, this is linear rehashing; otherwise d is determined separately.
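The effect of the increment d can be seen by listing the cells a search would visit. In this sketch d is passed in as a parameter, since the slide leaves open how it is computed; the generator name `probe_sequence` is illustrative.

```python
def probe_sequence(i, d, h):
    """Yield h probe locations starting at hash code i,
    applying Rehash(j, d) = (j + d) mod h at each step."""
    j = i
    for _ in range(h):
        yield j
        j = (j + d) % h


print(list(probe_sequence(4, 1, 8)))  # [4, 5, 6, 7, 0, 1, 2, 3]
print(list(probe_sequence(4, 3, 8)))  # [4, 7, 2, 5, 0, 3, 6, 1]
print(list(probe_sequence(4, 2, 8)))  # [4, 6, 0, 2, 4, 6, 0, 2]
```

With d = 1 the sequence is exactly linear probing. The last line shows why d should share no common factor with h: with d = 2 and h = 8 only half of the cells are ever probed.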


