HASHING CSC 172 SPRING 2002 LECTURE 22



Hashing: a cool way to get from an element x to the place where x can be found. An array [0..B-1] of buckets; each bucket contains a list of set elements. B = number of buckets. A hash function takes potential set elements and produces a “random” integer in [0..B-1].

Example: if the set elements are integers, then the simplest/best hash function is usually h(x) = x % B. Suppose B = 6 and we wish to store the integers {70, 53, 99, 94, 83, 76, 64, 30}. They belong in buckets 4, 5, 3, 4, 5, 4, 4, and 0. Note: if B = 7, the buckets are 0, 4, 1, 3, 6, 6, 1, and 2.
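The bucket numbers in this example can be checked with a few lines of Java. This is a minimal sketch: the class name `ModHash` and the helpers `h` and `buckets` are hypothetical, and the use of `Math.floorMod` (so that negative integers also map into [0..B-1]) is an assumption that goes beyond the slide, which only considers non-negative values.

```java
import java.util.Arrays;

// Sketch of the slide's hash function h(x) = x % B (names are illustrative only).
final class ModHash {
    // Bucket index of x among B buckets. floorMod agrees with x % B for x >= 0
    // and keeps the result in [0, B-1] for negative x as well (an added assumption).
    static int h(int x, int B) {
        return Math.floorMod(x, B);
    }

    // Bucket index of every element of xs, in order.
    static int[] buckets(int[] xs, int B) {
        int[] out = new int[xs.length];
        for (int i = 0; i < xs.length; i++) {
            out[i] = h(xs[i], B);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] xs = {70, 53, 99, 94, 83, 76, 64, 30};
        System.out.println(Arrays.toString(buckets(xs, 6))); // [4, 5, 3, 4, 5, 4, 4, 0]
        System.out.println(Arrays.toString(buckets(xs, 7))); // [0, 4, 1, 3, 6, 6, 1, 2]
    }
}
```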

Pitfalls of Hash Function Selection: we want a uniform distribution of elements into buckets. Beware of data patterns that cause a non-uniform distribution.

Example: if the integers were all even, then B = 6 would cause only buckets 0, 2, and 4 to fill. If we hashed the words in the UNIX dictionary into 10 buckets by length of word, then 20% would go into bucket 7.
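The even-integer pitfall is easy to see by counting bucket occupancy. The sketch below is illustrative only: `BucketSpread` and `histogram` are hypothetical names, and the comparison against B = 7 is an added assumption chosen because 7 shares no factor with the step of 2.

```java
import java.util.Arrays;

// Counts how evenly a patterned key set spreads over B buckets under h(x) = x % B.
final class BucketSpread {
    // Histogram of bucket indices for the n keys 0, step, 2*step, ..., (n-1)*step.
    static int[] histogram(int n, int step, int B) {
        int[] counts = new int[B];
        for (int i = 0; i < n; i++) {
            counts[(i * step) % B]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 60 even keys, B = 6: only buckets 0, 2 and 4 are ever used.
        System.out.println(Arrays.toString(histogram(60, 2, 6))); // [20, 0, 20, 0, 20, 0]
        // 70 even keys, B = 7: every bucket receives the same share.
        System.out.println(Arrays.toString(histogram(70, 2, 7))); // [10, 10, 10, 10, 10, 10, 10]
    }
}
```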

Dictionary Operations. Lookup: go to the head of bucket h(x) and search the bucket list to see whether x is in the bucket. Insertion: append x to the bucket list if it is not found. Delete: ordinary list deletion from the bucket list.
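The three operations can be sketched as a small set of integers with one list per bucket. This is a sketch under stated assumptions, not the lecture's code: the class `ChainedIntSet` and its method names are hypothetical, elements are restricted to `int`, and the hash function is h(x) = x % B computed with `Math.floorMod`.

```java
import java.util.LinkedList;
import java.util.List;

// A set of ints stored in B buckets, each bucket being a linked list (chaining).
final class ChainedIntSet {
    private final List<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedIntSet(int B) {
        buckets = new List[B];
        for (int i = 0; i < B; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // The bucket list that x belongs to: bucket h(x) = x mod B.
    private List<Integer> bucketOf(int x) {
        return buckets[Math.floorMod(x, buckets.length)];
    }

    // Lookup: search the list of bucket h(x).
    boolean lookup(int x) {
        return bucketOf(x).contains(x);
    }

    // Insertion: append to the bucket list only if x is not already there.
    boolean insert(int x) {
        List<Integer> b = bucketOf(x);
        if (b.contains(x)) {
            return false;
        }
        b.add(x);
        return true;
    }

    // Delete: ordinary list deletion; Integer.valueOf selects remove(Object), not remove(int index).
    boolean delete(int x) {
        return bucketOf(x).remove(Integer.valueOf(x));
    }
}
```

Each method returns whether it changed (or found) anything, which is a design choice of this sketch rather than something the slide specifies.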

Analysis: if we pick B to be near n, the number of elements in the set, then the average list is O(1) long. Thus, dictionary operations take O(1) time on average. In the worst case all elements go into one bucket, and an operation takes O(n) time.

Managing Hash Table Size: if n gets as high as 2B, create a new hash table with 2B buckets and “rehash” every element into the new table, which takes O(n) time in total. There were at least n/2 inserts since the last rehash, because the table held at most n/2 elements when it was built, and those inserts already took time proportional to n. So we “amortize” the cost of rehashing over the inserts since the last rehash: a constant factor, at worst. Even with rehashing we get O(1) average time per operation.
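The amortization argument can be checked empirically by counting how many elements all rehashes move in total. The sketch below assumes the policy stated on the slide (double the bucket count when n reaches 2B); the class `GrowingIntSet`, the starting size of one bucket, and the `moved` counter are hypothetical details added for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Chained set of ints that doubles its bucket count whenever n reaches 2B.
final class GrowingIntSet {
    private List<List<Integer>> buckets = newBuckets(1);
    private int n = 0;   // number of elements stored
    long moved = 0;      // total elements re-inserted by all rehashes so far

    private static List<List<Integer>> newBuckets(int B) {
        List<List<Integer>> bs = new ArrayList<>(B);
        for (int i = 0; i < B; i++) {
            bs.add(new ArrayList<>());
        }
        return bs;
    }

    private static List<Integer> bucketOf(List<List<Integer>> bs, int x) {
        return bs.get(Math.floorMod(x, bs.size()));
    }

    int bucketCount() {
        return buckets.size();
    }

    boolean insert(int x) {
        List<Integer> b = bucketOf(buckets, x);
        if (b.contains(x)) {
            return false;
        }
        b.add(x);
        n++;
        if (n >= 2 * buckets.size()) {
            rehash();
        }
        return true;
    }

    // Move every element into a table with twice as many buckets: O(n) work.
    private void rehash() {
        List<List<Integer>> bigger = newBuckets(2 * buckets.size());
        for (List<Integer> old : buckets) {
            for (int x : old) {
                bucketOf(bigger, x).add(x);
                moved++;
            }
        }
        buckets = bigger;
    }
}
```

Inserting the 1000 distinct keys 0..999 into this sketch triggers rehashes at n = 2, 4, ..., 512, so `moved` ends at 1022: fewer than two moves per insert, which is the constant factor the slide refers to.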

Collisions: a collision occurs when two values in the set hash to the same bucket. There are several ways to deal with this: chaining (using a linked list or some secondary structure) and open addressing (linear probing or double hashing).

Chaining [Figure: buckets with chained lists; the recoverable chains are 99 → 64, 83 → 76, 94, 53, and 30.] Very efficient time-wise. Other approaches use less space.

Open Addressing: when a collision occurs and the table is not full, find another available slot in the table itself. Two strategies: linear probing and double hashing.

Linear Probing: if the current location is occupied, try the next table location (M is the table size).

    LinearProbingInsert(K) {
        if (table is full) error;
        probe = h(K);
        while (table[probe] is occupied)
            probe = (probe + 1) % M;
        table[probe] = K;
    }

Walk along the table until an empty spot is found. Uses less memory than chaining (no links). Takes more time than chaining (long walks). Deleting is a pain (mark a slot as having been deleted).
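The remark about deletion can be made concrete with a "deleted" marker, often called a tombstone. The following is a minimal sketch, assuming `int` keys and h(K) = K % M; the class `ProbingIntSet`, the three slot states, and the method names are all hypothetical. Searches walk past DELETED slots and stop only at an EMPTY one, while inserts may reuse a DELETED slot.

```java
// Open-addressing set of ints with linear probing and lazy deletion (tombstones).
final class ProbingIntSet {
    private static final byte EMPTY = 0, FULL = 1, DELETED = 2;
    private final int[] keys;
    private final byte[] state;
    private int used = 0; // number of FULL slots

    ProbingIntSet(int M) {
        keys = new int[M];
        state = new byte[M];
    }

    private int h(int k) {
        return Math.floorMod(k, keys.length);
    }

    // Slot holding k, or -1. Skips DELETED slots; stops at the first EMPTY slot.
    private int find(int k) {
        int probe = h(k);
        for (int i = 0; i < keys.length; i++) {
            if (state[probe] == EMPTY) return -1;
            if (state[probe] == FULL && keys[probe] == k) return probe;
            probe = (probe + 1) % keys.length;
        }
        return -1;
    }

    boolean contains(int k) {
        return find(k) >= 0;
    }

    boolean insert(int k) {
        if (find(k) >= 0) return false;
        if (used == keys.length) throw new IllegalStateException("table is full");
        int probe = h(k);
        while (state[probe] == FULL) {
            probe = (probe + 1) % keys.length; // EMPTY and DELETED slots are both reusable
        }
        keys[probe] = k;
        state[probe] = FULL;
        used++;
        return true;
    }

    // Mark the slot instead of emptying it, so later keys in the same walk stay reachable.
    boolean delete(int k) {
        int i = find(k);
        if (i < 0) return false;
        state[i] = DELETED;
        used--;
        return true;
    }
}
```

If `delete` simply reset the slot to EMPTY, a key that had walked past that slot during its own insertion would become unfindable, which is why the marker is needed.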

Linear Probing example: h(K) = K % 13. Insert: 18, 41, 22, 59, 32, 31, 73. h(K): 5, 2, 9, 7, 6, 5, 8. The first five keys land in their home slots. 31 collides at slot 5 and walks past 6 and 7 to slot 8; 73 then collides at slot 8 and walks past 9 to slot 10.
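The insertion sequence of this example can be replayed in code. The sketch assumes a table of M = 13 slots, which is inferred from the listed hash values (18 % 13 = 5, 41 % 13 = 2, and so on); the class `LinearProbingTrace` and the method `slots` are hypothetical names.

```java
import java.util.Arrays;

// Replays linear-probing insertions and reports the slot each key ends up in.
final class LinearProbingTrace {
    // Inserts keys in order into a table of M slots; assumes non-negative keys.
    static int[] slots(int[] keys, int M) {
        if (keys.length > M) {
            throw new IllegalStateException("table is full");
        }
        boolean[] occupied = new boolean[M];
        int[] where = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            int probe = keys[i] % M;          // home slot h(K)
            while (occupied[probe]) {
                probe = (probe + 1) % M;      // walk to the next slot, wrapping around
            }
            occupied[probe] = true;
            where[i] = probe;
        }
        return where;
    }

    public static void main(String[] args) {
        int[] keys = {18, 41, 22, 59, 32, 31, 73};
        System.out.println(Arrays.toString(slots(keys, 13))); // [5, 2, 9, 7, 6, 8, 10]
    }
}
```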

Double Hashing: if the current location is occupied, try another table location. Use two hash functions. If M is prime, the probe sequence eventually examines every location.

    DoubleHashInsert(K) {
        if (table is full) error;
        probe = h1(K);
        offset = h2(K);
        while (table[probe] is occupied)
            probe = (probe + offset) % M;
        table[probe] = K;
    }

Many of the same (dis)advantages as linear probing. Distributes keys more evenly than linear probing.

Double Hashing example: h1(K) = K % 13, h2(K) = 8 - K % 8. Insert: 18, 41, 22, 59, 32, 31, 73. h1(K): 5, 2, 9, 7, 6, 5, 8. h2(K): 6, 7, 2, 5, 8, 1, 7. 31 collides at slot 5 and, stepping by h2(31) = 1, lands in slot 8. 73 collides at slot 8 and, stepping by h2(73) = 7, tries slots 2 and 9 before landing in slot 3.
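The same replay works for double hashing. The sketch assumes h1(K) = K % 13 and h2(K) = 8 - K % 8; the second function is inferred from the listed h2 values (for example 8 - 18 % 8 = 6), since the slide text is garbled at that point. `DoubleHashingTrace` and `slots` are hypothetical names.

```java
import java.util.Arrays;

// Replays double-hashing insertions for a table of 13 slots.
final class DoubleHashingTrace {
    static final int M = 13; // prime table size, so any offset in [1, 12] visits every slot

    static int h1(int k) { return k % M; }       // home slot
    static int h2(int k) { return 8 - k % 8; }   // step size, always in [1, 8], never 0

    // Inserts keys in order and returns the slot each one ends up in (non-negative keys).
    static int[] slots(int[] keys) {
        if (keys.length > M) {
            throw new IllegalStateException("table is full");
        }
        boolean[] occupied = new boolean[M];
        int[] where = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            int probe = h1(keys[i]);
            int offset = h2(keys[i]);
            while (occupied[probe]) {
                probe = (probe + offset) % M;
            }
            occupied[probe] = true;
            where[i] = probe;
        }
        return where;
    }

    public static void main(String[] args) {
        int[] keys = {18, 41, 22, 59, 32, 31, 73};
        System.out.println(Arrays.toString(slots(keys))); // [5, 2, 9, 7, 6, 8, 3]
    }
}
```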

Implementing Hash Tables:

    public class HashMap implements Map {
        private transient Entry table[];
        private transient int count;
        ...
        public HashMap(int initialCapacity, float loadFactor)
        public HashMap(int initialCapacity)
        public boolean containsValue(Object value)
        public boolean containsKey(Object key)
        public Object get(Object key)
        public Object put(Object key, Object value)
        public Object remove(Object key)
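The method list above matches the public interface of `java.util.HashMap`, so the behaviour of each call can be tried against the library class. This usage sketch assumes the modern generic `java.util.HashMap` rather than the pre-generics listing on the slides; the class `HashMapUsage` and the method `trace` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Exercises the operations listed on the slide using java.util.HashMap.
final class HashMapUsage {
    // Returns the result of each call, in order, so the behaviour can be inspected.
    static List<Object> trace() {
        Map<String, Integer> m = new HashMap<>(16, 0.75f); // initialCapacity, loadFactor
        List<Object> out = new ArrayList<>();
        out.add(m.put("x", 70));        // null: the key had no previous value
        out.add(m.put("x", 53));        // 70: put returns the value it replaced
        out.add(m.containsKey("x"));    // true
        out.add(m.containsValue(70));   // false: 70 was overwritten
        out.add(m.get("x"));            // 53
        out.add(m.remove("x"));         // 53: remove returns the removed value
        out.add(m.get("x"));            // null: the key is gone
        return out;
    }

    public static void main(String[] args) {
        System.out.println(trace()); // [null, 70, true, false, 53, 53, null]
    }
}
```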

Constructor:

    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException(
                "Illegal InitialCapacity " + initialCapacity);
        if (loadFactor <= 0)
            throw new IllegalArgumentException(
                "Illegal loadFactor " + loadFactor);
        if (initialCapacity == 0)
            initialCapacity = 1;
        this.loadFactor = loadFactor;
        table = new Entry[initialCapacity];
        // Rehash once count reaches this many entries.
        threshold = (int)(initialCapacity * loadFactor);
    } // constructor

containsKey():

    public boolean containsKey(Object key) {
        Entry tab[] = table;
        if (key != null) {
            int hash = key.hashCode();
            // Clear the sign bit so the index is never negative.
            int index = (hash & 0x7FFFFFFF) % tab.length;
            for (Entry e = tab[index]; e != null; e = e.next)
                if (e.hash == hash && key.equals(e.key))
                    return true;
        } else {
            // A null key always lives in bucket 0.
            for (Entry e = tab[0]; e != null; e = e.next)
                if (e.key == null)
                    return true;
        }
        return false;
    } // method containsKey
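The expression `(hash & 0x7FFFFFFF) % tab.length` deserves a note: `hashCode()` may be negative, and Java's `%` keeps the sign of the dividend, so a plain `hash % length` could produce a negative index. The sketch below shows the masking trick in isolation; the class `IndexFor` and the method `index` are hypothetical names, and the table length 11 is an arbitrary choice.

```java
// Demonstrates why the sign bit is masked off before taking the remainder.
final class IndexFor {
    // Same computation as in the listing: clear the sign bit, then reduce modulo length.
    static int index(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    public static void main(String[] args) {
        int hash = Integer.MIN_VALUE; // a legal hashCode() value

        // Plain remainder keeps the sign of the dividend, so this is negative.
        System.out.println(hash % 11 < 0);                 // true
        // Math.abs does not help here: abs(Integer.MIN_VALUE) overflows and stays negative.
        System.out.println(Math.abs(hash) < 0);            // true
        // Masking always yields a value in [0, length - 1].
        System.out.println(index(hash, 11));               // 0
        System.out.println(index(-1, 11) >= 0);            // true
    }
}
```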

put():

    public Object put(Object key, Object value) {
        Entry tab[] = table;
        int hash = 0;
        int index = 0;
        if (key != null) {
            hash = key.hashCode();
            index = (hash & 0x7FFFFFFF) % tab.length;
            // Key already present: replace the value and return the old one.
            for (Entry e = tab[index]; e != null; e = e.next) {
                if (e.hash == hash && key.equals(e.key)) {
                    Object old = e.value;
                    e.value = value;
                    return old;
                }
            }
        } else {
            // A null key always lives in bucket 0.
            for (Entry e = tab[0]; e != null; e = e.next) {
                if (e.key == null) {
                    Object old = e.value;
                    e.value = value;
                    return old;
                }
            }
        } // key == null
        modCount++;
        if (count >= threshold) {
            // Table is too full: grow it, then recompute the index for the new length.
            rehash();
            tab = table;
            index = (hash & 0x7FFFFFFF) % tab.length;
        }
        // Key not present: prepend a new entry to the bucket's chain.
        Entry e = new Entry(hash, key, value, tab[index]);
        tab[index] = e;
        count++;
        return null;
    } // method put

rehash():

    private void rehash() {
        int oldCapacity = table.length;
        Entry oldMap[] = table;
        int newCapacity = oldCapacity * 2 + 1;
        Entry newMap[] = new Entry[newCapacity];
        modCount++;
        threshold = (int)(newCapacity * loadFactor);
        table = newMap;
        // Walk the old buckets from last to first and relink every entry into the new table.
        for (int i = oldCapacity; i-- > 0;) {
            for (Entry old = oldMap[i]; old != null;) {
                Entry e = old;
                old = old.next;
                int index = (e.hash & 0x7FFFFFFF) % newCapacity;
                e.next = newMap[index];
                newMap[index] = e;
            }
        }
    }

remove():

    public Object remove(Object key) {
        Entry tab[] = table;
        if (key != null) {
            int hash = key.hashCode();
            int index = (hash & 0x7FFFFFFF) % tab.length;
            for (Entry e = tab[index], prev = null; e != null; prev = e, e = e.next) {
                if (e.hash == hash && key.equals(e.key)) {
                    modCount++;
                    // Unlink e from the chain (it may be the head of the bucket).
                    if (prev != null) prev.next = e.next;
                    else tab[index] = e.next;
                    count--;
                    Object oldValue = e.value;
                    e.value = null;
                    return oldValue;
                }
            }
        } else {
            // A null key always lives in bucket 0.
            for (Entry e = tab[0], prev = null; e != null; prev = e, e = e.next) {
                if (e.key == null) {
                    modCount++;
                    if (prev != null) prev.next = e.next;
                    else tab[0] = e.next;
                    count--;
                    Object oldValue = e.value;
                    e.value = null;
                    return oldValue;
                }
            }
        }
        return null;
    }

Theoretical Results [Table: rows Chaining, Linear Probing, Double Hashing; columns Not Found, Found.]

Expected Probes [Figure: expected probes for Linear Probing, Double Hashing, and Chaining.]