CSC 2300 Data Structures & Algorithms February 27, 2007 Chapter 5. Hashing.


Today – Splay Trees and Hashing
  Splay Trees – Delete a Node
  Hashing – Overview
  Hashing – Separate Chaining

Splay Tree – Delete a Node
How do we delete a node? First, access the node. This puts the node at the root. After we delete the root, we get two subtrees, T_L and T_R. Which node should we select as the new root? The largest element in T_L. Find this element (easy), and rotate it to the root position of T_L. Will this new root of T_L have a right child? If not, what can we do?
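The join step after the root is removed can be sketched as follows. This is a minimal sketch, not the textbook's code: the `Node` type and the names `rotateLeft` and `join` are hypothetical, and plain left rotations at the root stand in for a full splay. It assumes every key in T_L is smaller than every key in T_R.

```cpp
#include <cassert>

// Hypothetical minimal node type, for illustration only.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Single left rotation: the right child of t becomes the new subtree root.
Node* rotateLeft(Node* t) {
    Node* r = t->right;
    t->right = r->left;
    r->left = t;
    return r;
}

// Join T_L and T_R into one tree, assuming all keys in T_L are smaller
// than all keys in T_R.
Node* join(Node* tl, Node* tr) {
    if (tl == nullptr) return tr;
    // The largest key sits at the end of the right spine. Rotating left at
    // the root until no right child remains brings it to the root.
    while (tl->right != nullptr) tl = rotateLeft(tl);
    assert(tl->right == nullptr);  // the maximum has no right child
    tl->right = tr;                // so T_R can be attached there
    return tl;
}
```

Because the largest element of T_L has no right child once it is at the root, T_R fits there without disturbing the search-tree order.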

Examples in Class
We will illustrate the idea with a few examples in class.

Zig-zag and Zag-zig
The web site uses a different (and perhaps more intuitive) way to describe the actions of zig-zag and zag-zig. We will use the definitions in the text. What the text calls zig-zag, the web site calls zag-zig: we do a zag first, and then a zig.

Hashing – Overview

Hash Functions

Integer Input Keys
If the input keys are integers, then a good strategy is to compute key mod tableSize. When may this strategy become a bad choice? As an example, let tableSize = 10. Can you suggest a sequence of input keys that will all be mapped to the same cell? What is then a good choice for tableSize?
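A small sketch of the integer case, assuming non-negative keys. The function name `hashInt` is made up for this illustration.

```cpp
#include <cassert>

// Map a non-negative integer key to a cell in [0, tableSize).
int hashInt(int key, int tableSize) {
    assert(key >= 0 && tableSize > 0);
    return key % tableSize;
}
```

With tableSize = 10, the keys 10, 20, 30 all land in cell 0, because every multiple of 10 does. With a prime size such as 11, the same keys go to cells 10, 9, and 8.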

String Input Keys
Choose tableSize as a prime number, a good choice according to the previous slide. Since memory is cheap, construct a large table. Let tableSize = 10,007. Suppose that all keys are eight or fewer characters long. Use this hash function:
What can happen to the hash table? Hint: what is the maximum integer value of an ASCII character?
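The hash function from this slide is not reproduced in the transcript. The hint suggests it adds up the character codes, so here is a minimal sketch under that assumption. The name `hashSum` is hypothetical.

```cpp
#include <string>

// Assumed form of the slide's function: sum the character codes,
// then reduce modulo the table size.
int hashSum(const std::string& key, int tableSize) {
    int hashVal = 0;
    for (char ch : key)
        hashVal += static_cast<unsigned char>(ch);
    return hashVal % tableSize;
}
```

An ASCII character is at most 127, so an eight-character key sums to at most 8 × 127 = 1,016. Under this assumption only cells 0 through 1,016 of the 10,007 can ever be used, and keys that are rearrangements of each other always collide.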

String Input Keys
So, we want hashVal to be large, greater than 10,007. Multiplication may be more appropriate than addition. Try this hash function:
This function uses the first three characters of the input string. We can check that 26 × 27^2 > 10,007. Why do we use 26 × 27^2 and not 26^3? What may go wrong with distribution in this hash table?
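The function itself is missing from the transcript. A sketch consistent with the slide's description (first three characters, powers of 27) might look like this. The name `hashFirstThree` and the exact weights are assumptions.

```cpp
#include <cassert>
#include <string>

// Assumed form: weight the first three characters by 1, 27, and 27^2 = 729.
// Requires a key of at least three characters.
int hashFirstThree(const std::string& key, int tableSize) {
    assert(key.size() >= 3 && tableSize > 0);
    int c0 = static_cast<unsigned char>(key[0]);
    int c1 = static_cast<unsigned char>(key[1]);
    int c2 = static_cast<unsigned char>(key[2]);
    return (c0 + 27 * c1 + 729 * c2) % tableSize;
}
```

Any two keys that share their first three characters collide, whatever follows. Since words in a natural language reuse a limited set of three-letter openings, many cells may stay empty.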

String Input Keys
Include also the ten digits 0 to 9. Get this hash function:
This function uses Horner’s rule. What is it?
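This function is also missing from the transcript. The sketch below assumes a multiplier of 37 (the 27 used earlier plus the ten digits) and processes every character with Horner's rule. The name `hashHorner` and the use of unsigned arithmetic are choices made for this illustration.

```cpp
#include <cassert>
#include <string>

// Treat the key as a polynomial in 37 and evaluate it with Horner's rule.
// Unsigned arithmetic wraps around on overflow instead of being undefined.
int hashHorner(const std::string& key, int tableSize) {
    assert(tableSize > 0);
    unsigned int hashVal = 0;
    for (char ch : key)
        hashVal = 37 * hashVal + static_cast<unsigned char>(ch);
    return static_cast<int>(hashVal % static_cast<unsigned int>(tableSize));
}
```

Unlike the three-character version, every character of the key influences the result, at the cost of one multiplication and one addition per character.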

Horner’s Rule
Problem 2.14 in text. Evaluate
    f(x) = a_n x^n + a_{n-1} x^{n-1} + … + a_2 x^2 + a_1 x + a_0
Code:
    poly = 0;
    for( i = n; i >= 0; i-- )
        poly = x * poly + a[i];
Why does this algorithm work? What is its running time?
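A self-contained version of the loop, as a sketch. The function name `horner` and the choice to store the coefficients in a `std::vector` with `a[i]` multiplying x^i are assumptions made here.

```cpp
#include <vector>

// Evaluate a[0] + a[1]*x + ... + a[n]*x^n, where n = a.size() - 1.
double horner(const std::vector<double>& a, double x) {
    double poly = 0;
    for (int i = static_cast<int>(a.size()) - 1; i >= 0; --i)
        poly = x * poly + a[i];  // one multiplication and one addition per coefficient
    return poly;
}
```

For f(x) = 3x^2 + 2x + 1, the loop computes ((0·x + 3)·x + 2)·x + 1, which expands to the original polynomial. It makes one pass over the n + 1 coefficients.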

Compromises
A hash function need not be the best with respect to table distribution, but it should be simple and reasonably fast. If the keys are very long, the hash function will take too long to compute. A common practice is not to use all the characters. As an example, consider a complete street address. The hash function may include only a couple of characters from the street address, and a couple of characters from the city name and the zip code. The idea is that the time saved in computing the hash function will make up for a slightly less evenly distributed function.
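The street-address idea could be sketched like this. Everything here is hypothetical: the `Address` struct, the names `hashPrefix` and `hashAddress`, the choice of two characters per field, and the multiplier 37.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record type for the street-address example.
struct Address {
    std::string street;
    std::string city;
    std::string zip;
};

// Fold at most `count` leading characters of s into a running hash value.
unsigned int hashPrefix(const std::string& s, std::size_t count, unsigned int hashVal) {
    for (std::size_t i = 0; i < count && i < s.size(); ++i)
        hashVal = 37 * hashVal + static_cast<unsigned char>(s[i]);
    return hashVal;
}

// Hash an address from two characters of each field instead of the whole text.
int hashAddress(const Address& a, int tableSize) {
    unsigned int h = 0;
    h = hashPrefix(a.street, 2, h);
    h = hashPrefix(a.city, 2, h);
    h = hashPrefix(a.zip, 2, h);
    return static_cast<int>(h % static_cast<unsigned int>(tableSize));
}
```

The cost is fixed at six characters no matter how long the address is. The trade-off is that addresses agreeing on those six characters collide.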

Collision Resolution
If an element being inserted hashes to the same value as an already inserted element, then we have a collision and need to resolve it. There are several methods for dealing with collisions; we will discuss the simplest approach: separate chaining.

How to Handle Collisions

Separate Chaining
Keep a list of all elements that hash to the same value. Example: the keys are the first 10 perfect squares and the hash function is hash(x) = x mod 10.
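A compact sketch of separate chaining for integer keys. The class name `ChainedHashTable` and its interface are invented for this illustration, and keys are assumed to be non-negative.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Separate chaining: one linked list per table cell.
class ChainedHashTable {
public:
    explicit ChainedHashTable(std::size_t tableSize) : lists(tableSize) {}

    bool contains(int x) const {
        const std::list<int>& chain = lists[hash(x)];
        return std::find(chain.begin(), chain.end(), x) != chain.end();
    }

    // Returns false if x was already present.
    bool insert(int x) {
        if (contains(x)) return false;
        lists[hash(x)].push_front(x);
        return true;
    }

    // Returns false if x was not found.
    bool remove(int x) {
        std::list<int>& chain = lists[hash(x)];
        auto it = std::find(chain.begin(), chain.end(), x);
        if (it == chain.end()) return false;
        chain.erase(it);
        return true;
    }

    std::size_t chainLength(std::size_t cell) const { return lists[cell].size(); }

private:
    std::vector<std::list<int>> lists;

    // hash(x) = x mod tableSize, assuming x >= 0.
    std::size_t hash(int x) const { return static_cast<std::size_t>(x) % lists.size(); }
};
```

Taking the ten squares to be 0, 1, 4, ..., 81 and using a table of size 10, cells 1, 4, 6, and 9 each hold two keys, cells 0 and 5 hold one key each, and the other four cells are empty.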

Hash Table with Separate Chaining

Example

Discussion
Another data structure could be used to resolve the collisions; for example, binary search trees. Why do we use linked lists instead?
We define the load factor, λ, of a hash table to be the ratio of the number of elements in the table to the table size. The average length of a list is λ.
The effort to perform a search is the constant time required to evaluate the hash function plus the time to traverse the list. In an unsuccessful search, what is the average number of nodes to examine? In a successful search, what is the average number of nodes to examine? Why 1 + (λ/2) and not (λ/2)?
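One way to see where 1 + (λ/2) comes from is sketched below. It assumes the N stored elements are spread uniformly over the M lists. The symbols N and M are introduced here only for this derivation, with λ = N/M.

```latex
\begin{align*}
\text{unsuccessful search:}\quad & \text{the whole list is traversed, on average } \lambda = \frac{N}{M} \text{ nodes} \\
\text{successful search:}\quad & 1 + \frac{N-1}{2M} \;\approx\; 1 + \frac{\lambda}{2}
\end{align*}
```

The leading 1 is the node that holds the match, which is always examined. The other N − 1 elements contribute (N − 1)/M nodes to that list on average, and about half of them are visited before the match is found.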

Hashing Applet
In class, we will run some hashing examples using this web site: