Data Structures and Algorithms Hashing First Year M. B. Fayek CUFE 2010.


Hashing
1. What is Hashing?
2. Problems in Hashing
3. Collision Resolution Strategies

1. What is Hashing?
- Hashing is a quick and efficient searching technique.
- So far, the efficiency of a search has depended on the number of comparisons.
- In hashing, the keys themselves point directly to records by applying a hashing function.
- All possible key values are mapped into slots in the hash table.
- The hashing function is used for searching as well as for storing.

1. What is Hashing?
- The hash table is sequential and contiguous.
- Each slot is called a bucket.
- Buckets may hold more than one key.

1. What is Hashing?
Hashing methods:
- Direct and subtraction
- Modulo-division (or division remainder) using the list size (prime, why?)
- Digit extraction
- Midsquare
- Folding (fold shift, fold boundary)
- Pseudo-random (seed)
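Three of the methods listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the choice of two middle digits for midsquare, and the three-digit parts for fold shift are not from the slides. On the "prime, why?" prompt, a prime list size is commonly chosen because it tends to spread keys that share regular patterns more evenly across the table.

```python
def modulo_division(key, list_size):
    """Division-remainder hashing: address = key mod list size."""
    return key % list_size


def midsquare(key, list_size, digits=2):
    """Square the key, take `digits` digits from the middle, then reduce.

    The number of middle digits is an assumption made for this sketch.
    """
    squared = str(key * key)
    start = max((len(squared) - digits) // 2, 0)
    middle = int(squared[start:start + digits])
    return middle % list_size


def fold_shift(key, list_size, part_len=3):
    """Fold shift: split the key's digits into equal parts and add them.

    The part length of 3 is an assumption made for this sketch.
    """
    text = str(key)
    parts = [int(text[i:i + part_len]) for i in range(0, len(text), part_len)]
    return sum(parts) % list_size
```

For example, with a list size of 307 the key 121267 lands in bucket `modulo_division(121267, 307)`, which is 2.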

2. Problems in Hashing
- A collision occurs whenever a hash function maps two distinct keys to the same bucket.
- The hashing function must generate bucket addresses quickly and efficiently, with a minimum of collisions.
- Because the domain of keys is usually larger than the number of buckets, collisions are very likely to happen no matter how good the hashing function is.

Definitions:
- Load factor = number of elements in list / list size
- Clustering (primary, secondary)

3. Collision Resolution Strategies
- Open Addressing (using prime area):
  - Probing (linear, quadratic)
  - Double hashing
  - Pseudo-random
  - Key offset
- Linked Lists (Separate Chaining)
- Bucket Hashing
- Re-hashing

3. Collision Resolution Strategies
Open Addressing: Probing
- Linear probing: search at constant intervals from the collision (typically 1).
- Quadratic probing: search at quadratically increasing intervals, i.e. collision function f(i) = i^2, so on a collision the 1st, 4th, 9th, ... locations after the collision are searched.
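The two probing rules above can be sketched as follows. The function names and the use of `key % size` as the hashing function are assumptions made for illustration, not part of the slides.

```python
def probe_sequence(home, list_size, attempts, quadratic=False):
    """Return the addresses tried after a collision at `home`.

    Linear probing steps by 1, 2, 3, ...; quadratic probing steps by
    1, 4, 9, ... (f(i) = i^2). Addresses wrap around the table.
    """
    addresses = []
    for i in range(1, attempts + 1):
        step = i * i if quadratic else i
        addresses.append((home + step) % list_size)
    return addresses


def insert_linear(table, key):
    """Insert `key` using linear probing; return the slot that was used.

    Assumes integer keys and the hashing function key % len(table).
    """
    size = len(table)
    home = key % size
    for i in range(size):
        slot = (home + i) % size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")
```

With a table of 7 empty slots, the keys 10, 17 and 24 all hash to slot 3, so `insert_linear` places them in slots 3, 4 and 5. That run of adjacent occupied slots is the kind of clustering that linear probing tends to produce.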

Linear Probing

3. Collision Resolution Strategies
Open Addressing: Double Hashing
- Apply a second hashing function and probe at offsets hash2(x), 2*hash2(x), 3*hash2(x), ... from the collision address.
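A minimal sketch of the double hashing probe sequence is given below. Both hashing functions are assumptions chosen for illustration; the slides do not specify them. The second function is written so that it never returns 0, since a zero step would probe the same slot repeatedly.

```python
def hash1(key, size):
    """Primary hashing function (assumed): division remainder."""
    return key % size


def hash2(key, size):
    """Secondary hashing function (assumed): always in 1 .. size - 1."""
    return 1 + key % (size - 1)


def double_hash_probes(key, size, attempts):
    """Addresses probed after a collision: home + i * hash2(key), wrapped."""
    home = hash1(key, size)
    step = hash2(key, size)
    return [(home + i * step) % size for i in range(1, attempts + 1)]
```

For key 27 in a table of size 11, the home address is 5 and the step is 8, so the first three probes go to slots 2, 10 and 7.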

3. Collision Resolution Strategies
Linked Lists (Separate Chaining):
- Separate chaining (may be modified by keeping the chain sorted).
- Modified hash table: eliminate the first probe, so the hash table becomes an array of records instead of an array of pointers to records.
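Separate chaining can be sketched as a table whose buckets each hold a chain of (key, record) pairs. In this illustration Python lists stand in for linked lists, and the class name, method names and the `key % size` hashing function are all assumptions rather than anything defined in the slides.

```python
class ChainedHashTable:
    """Hash table with separate chaining (lists stand in for linked lists)."""

    def __init__(self, size=7):
        self.buckets = [[] for _ in range(size)]

    def _chain(self, key):
        # Assumed hashing function: division remainder on integer keys.
        return self.buckets[key % len(self.buckets)]

    def insert(self, key, record):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, record)  # overwrite an existing key
                return
        chain.append((key, record))

    def find(self, key):
        for existing_key, record in self._chain(key):
            if existing_key == key:
                return record
        return None

    def remove(self, key):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                del chain[i]
                return True
        return False
```

Colliding keys simply share a chain: after inserting keys 10 and 17 into a table of size 7, bucket 3 holds both entries and each can still be found.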

Linked List (Separate Chaining)

3. Collision Resolution Strategies
Rehashing:
- When the table becomes too full, operations start taking too long.
- Solution: build another hash table of about double the size, with an associated hashing function, then scan down the entire original hash table and insert each element into the new one.

3. Collision Resolution Strategies
Rehashing: When is the table too full?
- Rehash when the table is half full.
- Rehash when an insertion fails.
- Rehash when the table reaches a certain load factor (the best option).
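The load-factor policy can be sketched with a small linear-probing table that rebuilds itself once the load factor passes a threshold. The class name, the 0.5 threshold, and the choice of the next prime at or above double the old size are assumptions made for this illustration.

```python
def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


class RehashingTable:
    """Linear-probing table that rehashes when the load factor exceeds max_load."""

    def __init__(self, size=7, max_load=0.5):
        self.slots = [None] * size
        self.count = 0
        self.max_load = max_load

    def load_factor(self):
        return self.count / len(self.slots)

    @staticmethod
    def _place(slots, key):
        """Put key into slots by linear probing; False if already present."""
        size = len(slots)
        i = key % size
        while slots[i] is not None:
            if slots[i] == key:
                return False
            i = (i + 1) % size
        slots[i] = key
        return True

    def insert(self, key):
        if self._place(self.slots, key):
            self.count += 1
            if self.load_factor() > self.max_load:
                self._rehash()

    def _rehash(self):
        # Build a table of about double size and re-insert every key.
        new_slots = [None] * next_prime(2 * len(self.slots))
        for key in self.slots:
            if key is not None:
                self._place(new_slots, key)
        self.slots = new_slots

    def contains(self, key):
        size = len(self.slots)
        i = key % size
        while self.slots[i] is not None:
            if self.slots[i] == key:
                return True
            i = (i + 1) % size
        return False
```

Because the table rehashes as soon as the threshold is crossed, it always keeps some empty slots, so the probing loops terminate.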

End of Hashing

Probing
Definition: Each calculation of an address and test for success is known as a probe.

Key offset collision resolution
- Offset = key / list size (integer quotient)
- Address = (Offset + old address) % list size
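The two formulas above translate directly into code. The function name is an assumption, and the example key and list size are chosen only for illustration.

```python
def key_offset_address(key, old_address, list_size):
    """Next address to try after a collision at old_address.

    Offset is the integer quotient key // list_size, so two keys that
    collide at the same home address usually follow different paths.
    """
    offset = key // list_size
    return (offset + old_address) % list_size
```

For key 166702 and a list size of 307, the home address is 166702 % 307 = 1 and the offset is 166702 // 307 = 543, so the next address tried is (543 + 1) % 307 = 237.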