© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.

Slides:



Advertisements
Similar presentations
Hash Tables CSC220 Winter What is strength of b-tree? Can we make an array to be as fast search and insert as B-tree and LL?
Advertisements

The ADT Hash Table What is a table?
Hash Tables CS 310 – Professor Roch Weiss Chapter 20 All figures marked with a chapter and section number are copyrighted © 2006 by Pearson Addison-Wesley.
Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved Hash Tables,
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Data Structures Using C++ 2E
Hashing as a Dictionary Implementation
CS202 - Fundamental Structures of Computer Science II
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013.
Hashing Chapters What is Hashing? A technique that determines an index or location for storage of an item in a data structure The hash function.
Log Files. O(n) Data Structure Exercises 16.1.
Hashing Techniques.
Dictionaries and Their Implementations
1 Hashing (Walls & Mirrors - end of Chapter 12). 2 I hate quotations. Tell me what you know. – Ralph Waldo Emerson.
Liang, Introduction to Java Programming, Eighth Edition, (c) 2011 Pearson Education, Inc. All rights reserved Chapter 48 Hashing.
1 CSE 326: Data Structures Hash Tables Autumn 2007 Lecture 14.
Data Structures Using Java1 Chapter 8 Search Algorithms.
COMP 171 Data Structures and Algorithms Tutorial 10 Hash Tables.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (continued) Advanced Implementation of Tables.
Hashing General idea: Get a large array
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (excerpts) Advanced Implementation of Tables CS102 Sections 51 and 52 Marc Smith and.
Data Structures Using Java1 Chapter 8 Search Algorithms.
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture8.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (continued) Advanced Implementation of Tables.
Hash Tables.  What can we do if we want rapid access to individual data items?  Looking up data for a flight in an air traffic control system  Looking.
Comp 335 File Structures Hashing.
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
Hashing Hashing is another method for sorting and searching data.
HASHING PROJECT 1. SEARCHING DATA STRUCTURES Consider a set of data with N data items stored in some data structure We must be able to insert, delete.
Hashing as a Dictionary Implementation Chapter 19.
Data Structures and Algorithms Hashing First Year M. B. Fayek CUFE 2010.
Lecture 12COMPSCI.220.FS.T Symbol Table and Hashing A ( symbol) table is a set of table entries, ( K,V) Each entry contains: –a unique key, K,
Chapter 10 Hashing. The search time of each algorithm depend on the number n of elements of the collection S of the data. A searching technique called.
Chapter 11 Hash Tables © John Urrutia 2014, All Rights Reserved1.
Hashing Basis Ideas A data structure that allows insertion, deletion and search in O(1) in average. A data structure that allows insertion, deletion and.
Hashing Chapter 7 Section 3. What is hashing? Hashing is using a 1-D array to implement a dictionary o This implementation is called a "hash table" Items.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Chapter 13 C Advanced Implementations of Tables – Hash Tables.
Data Structure & Algorithm Lecture 8 – Hashing JJCAO Most materials are stolen from Prof. Yoram Moses’s course.
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Frank Carrano, © 2012.
Hash Tables ADT Data Dictionary, with two operations – Insert an item, – Search for (and retrieve) an item How should we implement a data dictionary? –
Hash Tables Ellen Walker CPSC 201 Data Structures Hiram College.
TOPIC 5 ASSIGNMENT SORTING, HASH TABLES & LINKED LISTS Yerusha Nuh & Ivan Yu.
Chapter 11 (Lafore’s Book) Hash Tables Hwajung Lee.
Hashing (part 2) CSE 2011 Winter March 2018.
Hashing.
Data Structures Using C++ 2E
Data Abstraction & Problem Solving with C++
School of Computer Science and Engineering
Slides by Steve Armstrong LeTourneau University Longview, TX
Data Structures Using C++ 2E
Hash functions Open addressing
Advanced Associative Structures
Hash Table.
Hash Tables.
Chapter 21 Hashing: Implementing Dictionaries and Sets
Dictionaries and Their Implementations
Resolving collisions: Open addressing
Hash Tables Chapter 12.7 Wherein we throw all the data into random array slots and somehow obtain O(1) retrieval time Nyhoff, ADTs, Data Structures and.
CS202 - Fundamental Structures of Computer Science II
Advanced Implementation of Tables
Advanced Implementation of Tables
Ch Hash Tables Array or linked list Binary search trees
Ch. 13 Hash Tables  .
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Presentation transcript:

© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables

© 2006 Pearson Addison-Wesley. All rights reserved13 A-2 Balanced Search Trees The efficiency of the binary search tree implementation of the ADT table is related to the tree’s height –Height of a binary search tree of n items Maximum: n Minimum:  log 2 (n + 1)  Height of a binary search tree is sensitive to the order of insertions and deletions Variations of the binary search tree –Can retain their balance despite insertions and deletions

© 2006 Pearson Addison-Wesley. All rights reserved13 A-3 Hashing –Enables access to table items in time that is relatively constant and independent of the items Hash function –Maps the search key of a table item into a location that will contain the item Hash table –An array that contains the table items, as assigned by a hash function

© 2006 Pearson Addison-Wesley. All rights reserved13 A-4 Hashing A perfect hash function –Maps each search key into a unique location of the hash table –Possible if all the search keys are known Collisions –Occur when the hash function maps more than one item into the same array location Collision-resolution schemes –Assign locations in the hash table to items with different search keys when the items are involved in a collision Requirements for a hash function –Be easy and fast to compute –Place items evenly throughout the hash table

© 2006 Pearson Addison-Wesley. All rights reserved13 A-5 Hash Functions It is sufficient for hash functions to operate on integers Simple hash functions that operate on positive integers –Selecting digits –Folding –Modulo arithmetic Converting a character string to an integer –If the search key is a character string, it can be converted into an integer before the hash function is applied

© 2006 Pearson Addison-Wesley. All rights reserved13 A-6 Resolving Collisions Two approaches to collision resolution –Approach 1: Open addressing A category of collision resolution schemes that probe for an empty, or open, location in the hash table –The sequence of locations that are examined is the probe sequence Linear probing –Searches the hash table sequentially, starting from the original location specified by the hash function –Possible problem »Primary clustering

© 2006 Pearson Addison-Wesley. All rights reserved13 A-7 Linear Probing Example h(k, i) = (h(k) + i ) mod m (i is probe number, initially, i = 0) Insert keys: (in that order) If a collision occurs, when j = h(k), we try next at A[(j+1)mod m], then A[(j+2)mod m], and so on. When an empty position is found the item is inserted. Linear probing is easy to implement, but leads to clustering (long run of occupied slots). Clusters can lead to poor performance, both for inserting and finding keys. How many collisions occur in this case? h(k) = k mod 13 m = 13 11

© 2006 Pearson Addison-Wesley. All rights reserved13 A-8 Resolving Collisions Approach 1: Open addressing (Continued) –Quadratic probing Searches the hash table beginning with the original location that the hash function specifies and continues at increments of 1 2, 2 2, 3 2, and so on Possible problem –Secondary clustering –Double hashing Uses two hash functions Searches the hash table starting from the location that one hash function determines and considers every n th location, where n is determined from a second hash function Increasing the size of the hash table –The hash function must be applied to every item in the old hash table before the item is placed into the new hash table

© 2006 Pearson Addison-Wesley. All rights reserved13 A-9 Quadratic Probing Insert keys: (in that order) How many collisions occur in this case? % 13 = 5 (collision), next try: (5 + 2   1 2 ) % 13 = 10 h(k) = k mod 13 m = 13 c 1 = 2, c 2 = % 13 = 5 (collision), next try: (5 + 2   1 2 ) % 13 = 10 % 13 = 10 (collision) next try: (5 + 2   2 2 ) % 13 = 21 % 13 = 8 73 % 13 = 8 (collision), next try: (8 + 2  1+ 3  1 2 ) % 13 = 0 h(k,i) = (h(k) + c 1  i + c 2  i 2 ) mod m

© 2006 Pearson Addison-Wesley. All rights reserved13 A-10 Double Hashing Example h 1 (K) = K mod m h 2 (K) = K mod (m – 1) The i th probe is h(k, i) = (h 1 (k) + h 2 (k)  i) mod m we want h 2 to be an offset to add Insert keys: (in that order) How many collisions occur in this case? 44 % 13 = 5 (collision), next try: (5 + (44 % 12)) % 13 = 13 % 13 = % 13 = 5 (collision), next try: (5 + (31 % 12)) % 13 = 12 % 13 = 12 m = 13

© 2006 Pearson Addison-Wesley. All rights reserved13 A-11 Resolving Collisions Approach 2: Restructuring the hash table –Changes the structure of the hash table so that it can accommodate more than one item in the same location –Buckets Each location in the hash table is itself an array called a bucket –Separate chaining Each hash table location is a linked list

© 2006 Pearson Addison-Wesley. All rights reserved13 A-12 The Efficiency of Hashing An analysis of the average-case efficiency of hashing involves the load factor –Load factor  Ratio of the current number of items in the table to the maximum size of the array table Measures how full a hash table is Should not exceed 2/3 –Hashing efficiency for a particular search also depends on whether the search is successful Unsuccessful searches generally require more time than successful searches

© 2006 Pearson Addison-Wesley. All rights reserved13 A-13 The Efficiency of Hashing Figure The relative efficiency of four collision-resolution methods

© 2006 Pearson Addison-Wesley. All rights reserved13 A-14 What Constitutes a Good Hash Function? A good hash function should –Be easy and fast to compute –Scatter the data evenly throughout the hash table Issues to consider with regard to how evenly a hash function scatters the search keys –How well does the hash function scatter random data? –How well does the hash function scatter nonrandom data? General requirements of a hash function –The calculation of the hash function should involve the entire search key –If a hash function uses modulo arithmetic, the base should be prime

© 2006 Pearson Addison-Wesley. All rights reserved13 A-15 Summary Hashing as a table implementation calculates where the data item should be rather than search for it