Collision Resolution: Open Addressing


Topics:
- Quadratic Probing
- Double Hashing
- Random Probing
- Informal Analysis of Hashing

Open Addressing: Quadratic Probing
Quadratic probing attempts to avoid cluster buildup. In this method, c(i) is a quadratic function of i:
c(i) = a*i^2 + b*i + c
Quadratic probing is usually done using c(i) = i^2. We use the following slight modification to c(i):
c(i) = i^2 and c(i) = -i^2, for i = 1, 2, …, (n-1)/2
Thus, after the home position h(r), the probe sequence is h(r) + i^2, h(r) - i^2 (taken mod n), for i = 1, 2, …, (n-1)/2.
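The probe order described above can be sketched in a few lines of Python. This is only an illustration: the function name `quadratic_probe_sequence` and the parameter `home` (standing for h(r)) are assumptions, not names from the slides.

```python
def quadratic_probe_sequence(home, n):
    """Yield the slots probed for a record whose home position is `home`.

    Order: home, home + 1, home - 1, home + 4, home - 4, ... (all mod n),
    for i = 1 .. (n - 1) // 2, as in the modified c(i) above.
    """
    yield home % n
    for i in range(1, (n - 1) // 2 + 1):
        yield (home + i * i) % n
        yield (home - i * i) % n


# Example: the first few slots probed from home position 5 in a table of size 13.
first_probes = list(quadratic_probe_sequence(5, 13))[:5]
print(first_probes)
```

The sequence has 1 + 2 * ((n - 1) // 2) entries, but some of them may repeat a slot, depending on n.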

Example 1: Quadratic Probing
Use the hash function h(r) = r.id % 13 to load the following records into an array of size 13 (the last field of each record is its id):
Al-Otaibi, Ziyad 1.73 985926
Al-Turki, Musab Ahmad Bakeer 1.60 970876
Al-Saegh, Radha Mahdi 1.58 980962
Al-Shahrani, Adel Saad 1.80 986074
Al-Awami, Louai Adnan Muhammad 1.73 970728
Al-Amer, Yousuf Jauwad 1.66 994593
Al-Helal, Husain Ali AbdulMohsen 1.70 996321
Then insert the following records, using quadratic probing to resolve collisions, if any:
Al-Najjar, Khaled Ziyad 1.69 987615
Al-Ali, Amr Ali Zaid 1.79 987630
Al-Ramadi, Husam Yahya 1.58 987602

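Example 1: Quadratic Probing (cont'd)
Index  Record
0      (empty)
1      Husain
2      Yousuf
3      (empty)
4      Khaled
5      Louai
6      Ziyad
7      Amr
8      Radha
9      Husam
10     Musab
11     Adel
12     (empty)
Khaled hashes to 5 (occupied) and probes 6 (occupied), then 4 (free). Amr hashes to 7, which is free. Husam hashes to 5 and probes 6 and 4 (both occupied), then 9 (free).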
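The table above can be reproduced with a short script. This is a minimal sketch: the helper name `insert_quadratic` is hypothetical, records are reduced to (id, first name) pairs, and deletion is not handled.

```python
def insert_quadratic(table, key, name):
    """Insert `name` using h = key % n and the +i^2 / -i^2 probe order."""
    n = len(table)
    home = key % n
    candidates = [home]
    for i in range(1, (n - 1) // 2 + 1):
        candidates.append((home + i * i) % n)
        candidates.append((home - i * i) % n)
    for slot in candidates:
        if table[slot] is None:
            table[slot] = name
            return slot
    raise RuntimeError("no free slot found on the probe sequence")


# (id, first name) pairs in the order the example inserts them.
records = [
    (985926, "Ziyad"), (970876, "Musab"), (980962, "Radha"),
    (986074, "Adel"), (970728, "Louai"), (994593, "Yousuf"),
    (996321, "Husain"), (987615, "Khaled"), (987630, "Amr"),
    (987602, "Husam"),
]

table = [None] * 13
for key, name in records:
    insert_quadratic(table, key, name)

for index, name in enumerate(table):
    print(index, name if name is not None else "(empty)")
```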

Quadratic Probing: Concluding Notes
In general, quadratic probing improves on linear probing but does not avoid cluster buildup entirely. It gives rise to secondary clusters, which are less harmful than the primary clusters of linear probing. The hash table size should not be an even number; otherwise Property 2 will not be satisfied. Ideally, the table size should be a prime of the form 4j+3, for an integer j, which guarantees Property 2.
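The effect of the table size can be checked numerically. The sketch below assumes that Property 2 (defined earlier in the course, not in this section) means the probe sequence reaches every slot of the table; the function name `cells_reached` is hypothetical.

```python
def cells_reached(n):
    """Return the set of offsets (mod n) covered by 0, +i^2 and -i^2
    for i = 1 .. (n - 1) // 2."""
    reached = {0}
    for i in range(1, (n - 1) // 2 + 1):
        reached.add((i * i) % n)
        reached.add((-(i * i)) % n)
    return reached


# Primes of the form 4j+3 (7, 11, 19) versus a prime of the form 4j+1 (13)
# and an even size (16).
for n in (7, 11, 13, 16, 19):
    covered = len(cells_reached(n))
    print(f"n = {n:2d}: {covered} of {n} slots reachable")
```

Under this reading, sizes 7, 11 and 19 reach every slot, while 13 and 16 do not. The examples in this section use n = 13 and still succeed because the table is far from full.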

Quadratic Probing: Concluding Notes (cont'd)
A disadvantage: keys that collide at a given address follow the same probe sequence. This sequence of locations is called a secondary cluster. Unlike primary clusters, secondary clusters cannot combine into larger secondary clusters. To eliminate secondary clustering, synonyms must have different probe sequences. Double hashing achieves this by using two hash functions that both depend on the key.

Open Addressing: Double Hashing
Double hashing is the method that best addresses secondary clustering. It uses two hash functions, h and hp, with the usual probe sequence:
h_i(r) = (h(r) + i*hp(r)) mod n
We see that c(i) = i*hp(r) satisfies Property 2 as well, provided hp(r) and n are relatively prime. A simple way to guarantee this is to make n a prime number and keep hp(r) in the range 1, 2, …, n-1.

Open Addressing: Double Hashing (cont'd)
Using two hash functions can be expensive. In practice, a common definition for hp is:
hp(r) = 1 + (r mod (n-1))
Thus, the probe sequence for r hashing to position j is (taken mod n):
j, j + 1*hp(r), j + 2*hp(r), j + 3*hp(r), …
Notice that if hp(r) = 1, the probe sequence for r is the same as in linear probing. But if hp(r) = 2, the probe sequence examines every other array location.
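As a rough illustration of the probe sequence above, the following sketch lists all n probes for an integer key. The function name `double_hash_probe_sequence` is an assumption, and the key is used directly in place of the record r.

```python
def double_hash_probe_sequence(key, n):
    """Return the n slots probed for `key`, using
    h(key) = key % n and hp(key) = 1 + (key % (n - 1))."""
    home = key % n
    step = 1 + (key % (n - 1))
    return [(home + i * step) % n for i in range(n)]


# Example: key 987602 in a table of size 13 (home 5, step 3).
sequence = double_hash_probe_sequence(987602, 13)
print(sequence)

# With n prime, the step is relatively prime to n, so every slot appears once.
print(sorted(sequence) == list(range(13)))
```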

Example 2: Illustrating Double Hashing
Use double hashing to load the following records into a hash table of size n = 13:
Al-Otaibi, Ziyad 1.73 985926
Al-Turki, Musab Ahmad Bakeer 1.60 970876
Al-Saegh, Radha Mahdi 1.58 980962
Al-Shahrani, Adel Saad 1.80 986074
Al-Awami, Louai Adnan Muhammad 1.73 970728
Al-Amer, Yousuf Jauwad 1.66 994593
Al-Helal, Husain Ali AbdulMohsen 1.70 996321
Al-Najjar, Khaled Ziyad 1.69 987615
Al-Ali, Amr Ali Zaid 1.79 987630
Al-Ramadi, Husam Yahya 1.58 987602
Use the following pair of hash functions:
h(r) = r.id % 13
hp(r) = 1 + (r.id % (n-1))

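Example 2: Animating Double Hashing
Index  Record
0      (empty)
1      Husain
2      Yousuf
3      (empty)
4      Husam
5      Louai
6      Ziyad
7      Amr
8      Radha
9      Khaled
10     Musab
11     Adel
12     (empty)
Khaled hashes to 5 (occupied) with hp = 4, so he lands in slot 9. Husam hashes to 5 with hp = 3 and probes 8, 11 and 1 (all occupied) before landing in slot 4.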
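The table above can be reproduced with the sketch below. The helper name `insert_double_hash` is hypothetical, records are reduced to (id, first name) pairs, and the table size n = 13 is taken from h(r) = r.id % 13.

```python
def insert_double_hash(table, key, name):
    """Insert `name` using h = key % n and step hp = 1 + (key % (n - 1))."""
    n = len(table)
    home = key % n
    step = 1 + (key % (n - 1))
    for i in range(n):
        slot = (home + i * step) % n
        if table[slot] is None:
            table[slot] = name
            return slot
    raise RuntimeError("table is full")


# (id, first name) pairs in the order the example lists them.
records = [
    (985926, "Ziyad"), (970876, "Musab"), (980962, "Radha"),
    (986074, "Adel"), (970728, "Louai"), (994593, "Yousuf"),
    (996321, "Husain"), (987615, "Khaled"), (987630, "Amr"),
    (987602, "Husam"),
]

table = [None] * 13
for key, name in records:
    insert_double_hash(table, key, name)

for index, name in enumerate(table):
    print(index, name if name is not None else "(empty)")
```

Compared with the quadratic probing example, Khaled and Husam (both with home position 5) now follow different probe sequences because their step sizes differ.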

Open Addressing: Random Probing
Random probing uses a pseudo-random function to "jump around" in the hash table. Here, c(i) is defined so that it always produces a "random" integer in the range 0, 1, ..., n - 1. The function c(i) can be defined recursively as follows:
c(0) = 0
c(i+1) = (a*c(i) + 1) mod n
We choose a to ensure Property 2, too. The values c(0), c(1), ..., c(n-1) produced by this recursive definition form a permutation of 0, 1, ..., n - 1 iff a - 1 is a multiple of every prime divisor of n, where 4 is also treated as a prime divisor whenever 4 divides n.

Example 3: Random Probing
Let n = 16, a = 3. We get the sequence for c(i):
0 1 4 13 8 9 12 5 0 1 4 13 8 9 12 5 0 ...
which is not a permutation of 0, 1, 2, …, 15. With 4 counted as a prime, the divisors of 16 to consider are 2 and 4, so a - 1 must be a multiple of 4. If we choose a = 5, we get the sequence
0 1 6 15 12 13 2 11 8 9 14 7 4 5 10 3 0 1 6 ...
which is a permutation of 0, 1, 2, …, 15. We could also select a to be 9, 13, 17, etc.
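The two sequences in this example can be regenerated and checked with a short script. This is a sketch only: the function names `random_probe_offsets` and `is_permutation` are assumptions.

```python
def random_probe_offsets(a, n):
    """Return c(0), c(1), ..., c(n-1) for c(0) = 0, c(i+1) = (a*c(i) + 1) mod n."""
    offsets = [0]
    for _ in range(n - 1):
        offsets.append((a * offsets[-1] + 1) % n)
    return offsets


def is_permutation(offsets, n):
    """True if the offsets contain each of 0 .. n-1 exactly once."""
    return sorted(offsets) == list(range(n))


for a in (3, 5):
    offsets = random_probe_offsets(a, 16)
    print(f"a = {a}: {offsets} permutation: {is_permutation(offsets, 16)}")
```

A record r would then be probed at (h(r) + c(i)) mod n for i = 0, 1, 2, …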

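Random Probing: Concluding Remarks
For our student records example, n = 13, which has itself as its only prime divisor, so a - 1 must be a multiple of 13. We can therefore select a = 14. The sequence for c(i) turns out to be:
0 1 2 3 4 5 6 7 8 9 10 11 12 0 1 ...
which is the same as in linear probing. Note that the name "random probing" is something of an exaggeration. Random probing eliminates the problem of secondary clusters only if synonyms are given different probe sequences; with the single, key-independent c(i) defined here, keys that share a home address still follow the same sequence.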

Hashing: Informal Analysis
In the best case, a successful hash search requires one probe. In the worst case, hash search becomes a sequential search. For the average case, the number of probes depends on the load factor (the number of stored records divided by the table size). The load factor in open addressing cannot exceed 1, while it can be greater than 1 in chaining. Note that some methods are worse than others at high load factors. The performance of the open addressing methods is about the same at low load factors.
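The dependence on the load factor can be observed with a rough simulation. Everything in this sketch is an assumption made for illustration: the table size 1009 (a prime), random integer keys, the fixed seed, and the names `linear_step`, `double_hash_step` and `average_probes`. It counts the probes needed to place each key, which is also the cost of finding that key later if nothing is deleted.

```python
import random


def linear_step(key, n):
    """Linear probing: always step by 1."""
    return 1


def double_hash_step(key, n):
    """Double hashing: step hp = 1 + (key % (n - 1))."""
    return 1 + (key % (n - 1))


def average_probes(n, load, step_of, seed=0):
    """Fill a table of size n up to `load` and return the mean probes per insert."""
    rng = random.Random(seed)
    occupied = [False] * n
    inserted = int(load * n)
    total_probes = 0
    for _ in range(inserted):
        key = rng.randrange(10**9)
        slot = key % n
        step = step_of(key, n)
        probes = 1
        while occupied[slot]:          # terminates: n is prime and load < 1
            slot = (slot + step) % n
            probes += 1
        occupied[slot] = True
        total_probes += probes
    return total_probes / inserted


for load in (0.25, 0.5, 0.9):
    linear = average_probes(1009, load, linear_step)
    double = average_probes(1009, load, double_hash_step)
    print(f"load {load}: linear {linear:.2f}, double hashing {double:.2f}")
```

The exact numbers depend on the seed and the keys, so they should be read as indicative only.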

Exercises
1. If a hash table is 25% full, what is its load factor?
2. We discussed that c(i) = i^2 in quadratic probing does not, in general, satisfy Property 2. What cells are missed by this probing formula for a hash table of size 17? Characterise, using a formula if possible, the cells that are not examined by this function for a hash table of size n.
3. It was mentioned in this session that secondary clusters are less harmful than primary clusters because the former cannot combine to form larger secondary clusters. Use an appropriate hash table of records to exemplify this situation.
4. In random probing, what value would you select for a, given a hash table of size 100?