Nov 12, 2009IAT 8001 Hash Table Bucket Sort. Nov 12, 2009IAT 8002  An array in which items are not stored consecutively - their place of storage is calculated.

Slides:



Advertisements
Similar presentations
Preliminaries Advantages –Hash tables can insert(), remove(), and find() with complexity close to O(1). –Relatively easy to program Disadvantages –There.
Advertisements

Hash Tables.
CSE 1302 Lecture 23 Hashing and Hash Tables Richard Gesick.
Hashing as a Dictionary Implementation
What we learn with pleasure we never forget. Alfred Mercier Smitha N Pai.
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013.
Hashing COMP171. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hashing Techniques.
Hashing CS 3358 Data Structures.
1 Chapter 9 Maps and Dictionaries. 2 A basic problem We have to store some records and perform the following: add new record add new record delete record.
© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.
CS 206 Introduction to Computer Science II 11 / 17 / 2008 Instructor: Michael Eckmann.
Hashing COMP171 Fall Hashing 2 Hash table * Support the following operations n Find n Insert n Delete. (deletions may be unnecessary in some applications)
Hash Tables1 Part E Hash Tables  
CS 206 Introduction to Computer Science II 11 / 12 / 2008 Instructor: Michael Eckmann.
Hashing General idea: Get a large array
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (excerpts) Advanced Implementation of Tables CS102 Sections 51 and 52 Marc Smith and.
CS 206 Introduction to Computer Science II 04 / 06 / 2009 Instructor: Michael Eckmann.
ICS220 – Data Structures and Algorithms Lecture 10 Dr. Ken Cosh.
Symbol Tables Symbol tables are used by compilers to keep track of information about variables functions class names type names temporary variables etc.
1 Chapter 5 Hashing General ideas Methods of implementing the hash table Comparison among these methods Applications of hashing Compare hash tables with.
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture8.
CHAPTER 09 Compiled by: Dr. Mohammad Omar Alhawarat Sorting & Searching.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (continued) Advanced Implementation of Tables.
1 CSE 326: Data Structures: Hash Tables Lecture 12: Monday, Feb 3, 2003.
Hashing COMP171. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
1 5. Abstract Data Structures & Algorithms 5.2 Static Data Structures.
1 HASHING Course teacher: Moona Kanwal. 2 Hashing Mathematical concept –To define any number as set of numbers in given interval –To cut down part of.
Hashing Hashing is another method for sorting and searching data.
HASHING PROJECT 1. SEARCHING DATA STRUCTURES Consider a set of data with N data items stored in some data structure We must be able to insert, delete.
Hashing as a Dictionary Implementation Chapter 19.
Hash Tables - Motivation
CS201: Data Structures and Discrete Mathematics I Hash Table.
Lecture 12COMPSCI.220.FS.T Symbol Table and Hashing A ( symbol) table is a set of table entries, ( K,V) Each entry contains: –a unique key, K,
Chapter 10 Hashing. The search time of each algorithm depend on the number n of elements of the collection S of the data. A searching technique called.
Hashing is a method to store data in an array so that sorting, searching, inserting and deleting data is fast. For this every record needs unique key.
CHAPTER 8 SEARCHING CSEB324 DATA STRUCTURES & ALGORITHM.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Hashing 1 Hashing. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
CPSC 252 Hashing Page 1 Hashing We have already seen that we can search for a key item in an array using either linear or binary search. It would be better.
Chapter 13 C Advanced Implementations of Tables – Hash Tables.
1 Hashing by Adlane Habed School of Computer Science University of Windsor May 6, 2005.
Hashing COMP171. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
CS6045: Advanced Algorithms Data Structures. Hashing Tables Motivation: symbol tables –A compiler uses a symbol table to relate symbols to associated.
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Frank Carrano, © 2012.
Nov 22, 2010IAT 2651 Java Collections. Nov 22, 2010IAT 2652 Data Structures  With a collection of data, we often want to do many things –Organize –Iterate.
1 Data Structures CSCI 132, Spring 2014 Lecture 33 Hash Tables.
Hash Tables Ellen Walker CPSC 201 Data Structures Hiram College.
1 Resolving Collision Although collisions should be avoided as much as possible, they are inevitable Need a strategy for resolving collisions. We look.
CSC 143T 1 CSC 143 Highlights of Tables and Hashing [Chapter 11 p (Tables)] [Chapter 12 p (Hashing)]
Hashing.
Data Structures Using C++ 2E
Hashing CSE 2011 Winter July 2018.
Lecture No.43 Data Structures Dr. Sohail Aslam.
Data Structures Using C++ 2E
Review Graph Directed Graph Undirected Graph Sub-Graph
Hash Table.
CS202 - Fundamental Structures of Computer Science II
Advanced Implementation of Tables
EE 312 Software Design and Implementation I
Ch Hash Tables Array or linked list Binary search trees
Ch. 13 Hash Tables  .
What we learn with pleasure we never forget. Alfred Mercier
EE 312 Software Design and Implementation I
Presentation transcript:

Nov 12, 2009IAT 8001 Hash Table Bucket Sort

Nov 12, 2009IAT 8002  An array in which items are not stored consecutively - their place of storage is calculated using the key and a hash function  Hashed key: the result of applying a hash function to a key  Keys and entries are scattered throughout the array Hash Table keyentry Key hash function array index

Nov 12, 2009IAT 8003  insert: calculate place of storage, insert TableNode; O(1)  find: calculate place of storage, retrieve entry; O(1)  remove: calculate place of storage, set it to null; O(1) Hashing keyentry

Nov 12, 2009IAT 8004 Hashing example  10 stock details, 10 table positions keyentry  Stock numbers between 0 and 1000  Use hash function: stock no. / 100  What if we now insert stock no. 350? –Position 3 is occupied: there is a collision  Collision resolution strategy: insert in the next free position (linear probing) 85 85, apples , pears , papaya , guava , oranges  Given a stock number, we find stock by using the hash function again, and use the collision resolution strategy if necessary

Nov 12, 2009IAT 8005 Hashing performance  The hash function –Ideally, it should distribute keys and entries evenly throughout the table –It should minimize collisions, where the position given by the hash function is already occupied  The collision resolution strategy –Separate chaining: chain together several keys/entries in each position –Open addressing: store the key/entry in a different position  The size of the table –Too big will waste memory; too small will increase collisions and may eventually force rehashing (copying into a larger table) –Should be appropriate for the hash function used – and a prime number is best

Nov 12, 2009IAT 8006 Hash function  Truncation –Ignore part of the key and use the rest as the array index (converting non-numeric parts) –A fast technique, but check for an even distribution throughout the table  Folding –Partition the key into several parts and then combine them in any convenient way –Unlike truncation, uses information from the whole key  Modular arithmetic (used by truncation & folding, and on its own) –To keep the calculated table position within the table, divide the position by the size of the table, and take the remainder as the new position

Nov 12, 2009IAT 8007 Hash Function Examples  Truncation: If students have an 9-digit identification number, take the last 3 digits as the table position –e.g becomes 622  Folding: Split a 9-digit number into three 3-digit numbers, and add them –e.g becomes = 1923  Modular arithmetic: If the table size is 1000, the first example always keeps within the table range, but the second example does not (it should be mod 1000) –e.g mod 1000 = 923 (in Java: 1923 % 1000)

Nov 12, 2009IAT 8008 Choosing the table size to minimize collisions  As the number of elements in the table increases, the likelihood of a collision increases - so make the table as large as practical  If the table size is 100, and all the hashed keys are divisible by 10, there will be many collisions! –Particularly bad if table size is a power of a small integer such as 2 or 10  More generally, collisions may be more frequent if: –greatest common divisor (hashed keys, table size) > 1  Therefore, make the table size a prime number (gcd = 1) Collisions may still happen, so we need a collision resolution strategy

Nov 12, 2009IAT 8009 Collision resolution: chaining  Each table position is a linked list  Add the keys and entries anywhere in the list (front easiest)  Advantages over open addressing: –Simpler insertion and removal –Array size is not a limitation (but should still minimize collisions: make table size roughly equal to expected number of keys and entries)  Disadvantage –Memory overhead is large if entries are small key entry No need to change position!

Nov 12, 2009IAT Applications of Hashing  Compilers use hash tables to keep track of declared variables  A hash table can be used for on-line spelling checkers — if misspelling detection (rather than correction) is important, an entire dictionary can be hashed and words checked in constant time  Hash functions can be used to quickly check for inequality — if two elements hash to different values they must be different  Storing sparse data

Nov 12, 2009IAT When not to use hashing?  Hash tables are very good if there is a need for many searches in a reasonably stable table  Hash tables are not so good if there are many insertions and deletions, or if table traversals are needed  If there are more data than available memory then use a tree  Also, hashing is very slow for any operations which require the entries to be sorted –e.g. Find the minimum key

Nov 12, 2009IAT Bucket Sort  For Each item to be sorted, compute –entryIndex = key / tableSize –Chain entries on collision  Result: Each table entry has all the entries in a range of key values  For some problems, this is enough  Collision Detection key entry