Hashing

Hash Tables - Introduction
• A structure that offers fast insertion and searching
• Insertion and searching are almost O(1)
• Hashing - a range of key values is transformed into a range of array index values

Hashing - Introduction
• In a dictionary, if the main key were the array index, searching and inserting items would be very fast.
• Example: Empdata[1000], an employee database where index = employee number
  - search for the employee with emp. number = 500
  - Answer: Empdata[500]
  - Running Time: O(1)
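
The direct-indexing idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming employee numbers are integers from 0 to 999; the names `emp_data`, `add_employee`, `find_employee` and the sample record are hypothetical and not from the slides.

```python
# Direct addressing: the employee number itself is the array index.
# Assumption: employee numbers are integers in the range 0..999.
emp_data = [None] * 1000


def add_employee(number, record):
    """Store a record in the slot given by the employee number: O(1)."""
    emp_data[number] = record


def find_employee(number):
    """Return the record for an employee number, or None if absent: O(1)."""
    return emp_data[number]


add_employee(500, "hypothetical record for employee 500")
print(find_employee(500))
```

No searching takes place: the key is used directly as the position, which is why the running time does not depend on how many records are stored.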

Hash Tables
• In the previous example, it was easy since the employee number is an integer.
• What if the main key is a word in the English alphabet (e.g. last names)?
• How can the main key be mapped into an array index?

Hash Tables
• Sum of Digits Method
  - map the alphabet A-Z to the numbers 1 to 26 (a=1, b=2, c=3, etc.)
  - add up the values of the letters
  - For example, “cats” (c=3, a=1, t=20, s=19): 3 + 1 + 20 + 19 = 43
  - “cats” will be stored using index = 43

Hash Tables
• Problem
  - Too many words with the same index
  - “was”, “tin”, “give”, “tend”, “moan”, “tick” and several other words add up to 43
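
A short Python sketch of the sum-of-digits method makes the problem easy to check. It assumes lowercase words made only of the letters a to z; the function name `letter_sum` is illustrative.

```python
def letter_sum(word):
    """Sum of letter values with a=1, b=2, ..., z=26 (letters a-z only)."""
    return sum(ord(ch) - ord("a") + 1 for ch in word.lower())


words = ["cats", "was", "tin", "give", "tend", "moan", "tick"]
for w in words:
    print(w, letter_sum(w))  # every word in this list maps to 43
```

All seven words would compete for the same array cell, so this mapping spreads keys far too thinly.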

Hashing
• Another Method (Multiply by Powers)
  - in the decimal number system, an integer is a sum of its digits multiplied by powers of 10: d_n*10^n + ... + d_1*10^1 + d_0*10^0
• Can do the same thing with words
  - Use 27 as the base (26 letters + blank)

Hashing
• “cats” = 3*27^3 + 1*27^2 + 20*27^1 + 19*27^0 = 60,337
• unique index for every word
• main drawback: takes too much space
  - For words of up to 10 letters (27^9), over 7,000,000,000,000 array cells (7000 gigabytes)
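
The multiply-by-powers arithmetic can be checked with a small Python sketch. It assumes lowercase words made of a to z, with the value 0 reserved for the blank; the function name `word_to_number` is illustrative.

```python
def word_to_number(word, base=27):
    """Treat a word as a base-27 number with a=1 ... z=26 (0 is the blank)."""
    total = 0
    for ch in word.lower():
        # Horner's rule: shift the running total one place, then add the letter.
        total = total * base + (ord(ch) - ord("a") + 1)
    return total


print(word_to_number("cats"))  # 3*27**3 + 1*27**2 + 20*27 + 19 = 60337
print(27 ** 9)                 # 7625597484987
```

Unlike the sum of letters, this value is different for every word, but the range of possible values grows by a factor of 27 with each extra letter.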

Hashing
• While the scheme is able to generate unique keys, it assigns space to non-words (aaaaaa, zzzzzzz, aaacccc, etc.)
• Goal: compress the huge range of numbers from the multiply-by-powers system into a smaller (reasonably sized) array

Hashing
• Hashing function - the process of converting a number in a large range into a number in a smaller range.
• Size of smaller range
  - twice the size of the data set (2s)
  - for 50,000 words, an array of 100,000 elements

Hashing
• Hash Function
  - achieved by using the modulo function (returns the remainder)
  - for example, 33 mod 10 = 3
  - LargeNumber mod SmallRange

Hashing
• Hugenumber = C_0*27^9 + C_1*27^8 + C_2*27^7 + ... + C_9*27^0
• arraysize = numberofwords * 2
• arrayindex = Hugenumber mod arraysize
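
Put together, the three formulas on this slide amount to the following Python sketch. It assumes lowercase words made of a to z; the names `hash_word`, `number_of_words` and `array_size` are illustrative.

```python
def hash_word(word, array_size):
    """Map a word to an index in range(array_size): base-27 value, then mod."""
    huge_number = 0
    for ch in word.lower():
        huge_number = huge_number * 27 + (ord(ch) - ord("a") + 1)
    return huge_number % array_size


number_of_words = 50_000
array_size = number_of_words * 2      # the slides' rule: twice the data set size
print(hash_word("cats", array_size))  # 60337 mod 100000 = 60337
print(33 % 10)                        # 3
```

Python integers never overflow, so the huge number can be built first and reduced at the end. In a language with fixed-width integers such as C or Java, the usual approach is to apply the modulo inside the loop so the running value stays small.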

Hashing - Collisions
• Hashing presents the risk of two elements getting the same index (although the risk is lower than with the sum-of-digits method).
• Collision - two elements with the same array index after hashing

Collisions
• Two approaches to handle collisions
  - Open Addressing
  - Separate Chaining
• Open Addressing - find the next available free cell
• Separate Chaining - install a linked list at each index

Open Addressing
• Three Types
  - Linear Probing, Quadratic Probing, and Double Hashing
• Linear Probing
  - finds the next available cell by stepping one cell at a time (x+1, x+2, etc.)
  - leads to clustering
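
Linear probing can be sketched in Python as follows. This is a minimal illustration, assuming non-negative integer keys hashed with `key % size`, and it leaves out deletion and resizing; the class name `LinearProbingTable` and its methods are hypothetical.

```python
class LinearProbingTable:
    """Open addressing with linear probing (no deletion, no resizing)."""

    def __init__(self, size):
        self.slots = [None] * size

    def _probe(self, key):
        """Yield the home index, then home+1, home+2, ..., wrapping around."""
        home = key % len(self.slots)
        for step in range(len(self.slots)):
            yield (home + step) % len(self.slots)

    def insert(self, key):
        """Place the key in the first free cell of its probe sequence."""
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i] == key:
                self.slots[i] = key
                return i
        raise RuntimeError("table is full")

    def contains(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                return False  # reached an empty cell: the key is not stored
            if self.slots[i] == key:
                return True
        return False


table = LinearProbingTable(10)
for key in (13, 23, 33, 4):
    print(key, "->", table.insert(key))
```

Keys 13, 23 and 33 all hash to cell 3 and end up in cells 3, 4 and 5. Key 4 hashes to cell 4 but is pushed to cell 6, which shows how a run of filled cells (a cluster) slows down keys that did not collide in the first place.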

Clustering
• Quadratic Probing
  - finds the next available cell using the squares as the step sizes (x+1, x+4, x+9, etc.)
• Double Hashing
  - hash again using a different hash function to find the next free cell
  - the 2nd hash gives the step size
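
The two probe sequences can be compared with a short Python sketch. It assumes non-negative integer keys and a table of size 23. The second hash function `5 - (key % 5)` is an assumption for illustration (one common textbook choice that never yields a step of zero), not a formula given on the slides, and the function names are hypothetical.

```python
def quadratic_probes(home, size, count):
    """First `count` cells tried after a collision: home+1, home+4, home+9, ..."""
    return [(home + step * step) % size for step in range(1, count + 1)]


def double_hash_probes(key, size, count):
    """First `count` cells tried after a collision when using double hashing.

    The second hash gives the step size. 5 - (key % 5) is an assumed,
    illustrative choice; it always lies between 1 and 5, so it is never zero.
    """
    home = key % size
    step = 5 - (key % 5)
    return [(home + n * step) % size for n in range(1, count + 1)]


print(quadratic_probes(3, 23, 4))     # [4, 7, 12, 19]
print(double_hash_probes(26, 23, 4))  # home 3, step 4 -> [7, 11, 15, 19]
print(double_hash_probes(72, 23, 4))  # home 3, step 3 -> [6, 9, 12, 15]
```

Keys 26 and 72 share the same home cell but follow different probe sequences, because the step depends on the key. A prime table size such as 23 lets any step size eventually visit every cell.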

Separate Chaining
• A linked list is installed at each array index, so that entries that hash to the same index are attached to that index's linked list
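
A minimal Python sketch of separate chaining follows. Python lists stand in for the linked lists described on the slide, and Python's built-in `hash` stands in for the hash function; the class name `ChainedHashTable` and its methods are hypothetical.

```python
class ChainedHashTable:
    """Separate chaining: each array cell holds a list of (key, value) pairs."""

    def __init__(self, size):
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))       # new key: attach to the chain

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default


table = ChainedHashTable(10)
table.put(13, "a")
table.put(23, "b")  # 13 and 23 both land in bucket 3
print(table.buckets[3])
```

A collision here costs only a longer chain in one cell, so the array never fills up the way an open addressing table can.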

Hashing
• Read Chapter 5 (Data Structures and Algorithms in C by Weiss)
• Chapter 7 (in Goodrich and Tamassia Book)
• Hash function implementations are presented in these chapters

Summary Notes

Data Structures and Algorithms
• Conceptual Approach
• Java and C implementations are presented in both books
• For further studies, focus more on the mathematical aspects (look at theorems & propositions)
  - Proving

When to use what
• General Purpose Data Structures
  - arrays, linked lists, trees, and hash tables
  - used to store and retrieve data using key values
  - applications: can be used for storing personnel records, inventories, contact lists, etc.

General Purpose Data Structures
• Arrays - best used:
  - when the amount of data is reasonably small
  - when the amount of data is predictable in advance

General Purpose Data Structures
• Linked Lists
  - when the amount of data to be stored cannot be predicted
  - when data will be frequently inserted and deleted
• Binary Search Trees
  - used when arrays or linked lists are too slow
  - O(log N) insertion, searching, and deletion when the tree stays balanced

General Purpose Data Structures
• Hash Tables
  - fastest of these structures for insertion and searching by key (almost O(1))
  - used in spell checkers and as symbol tables in compilers
  - may require additional memory for open addressing implementations

Special Purpose Data Structures
• Stacks, Queues (Priority Queues)
• used by a computer program to aid in carrying out some algorithm
• For example, stacks and queues were used in the graph algorithms
• Abstract Data Types
  - implemented by a more fundamental data structure (array, linked list)
  - conceptual aids

Special Purpose Data Structures
• Stacks
  - used when you want to access the last data item inserted (LIFO structure)
  - implemented using arrays or linked lists depending on size
• Queues
  - used when you want to access the first data item inserted (FIFO structure)
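
The LIFO and FIFO access orders can be seen in a few lines of Python. This sketch uses a plain list as the stack and `collections.deque` from the standard library as the queue; the variable names are illustrative.

```python
from collections import deque

stack = []               # a Python list works as an array-based stack
for item in (1, 2, 3):
    stack.append(item)   # push
last_in = stack.pop()    # removes the most recently inserted item
print(last_in)           # 3: last in, first out

queue = deque()          # deque supports O(1) operations at both ends
for item in (1, 2, 3):
    queue.append(item)   # enqueue at the back
first_in = queue.popleft()  # removes the oldest item
print(first_in)          # 1: first in, first out
```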

Graphs
• Unique data structure
• Directly model real-world situations (maps, flights between airports, etc.)
• the structure of the graph reflects the structure of the problem
• the main choice is the representation: adjacency list or adjacency matrix
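
The two representations can be compared on a small example. This Python sketch builds both for the same undirected graph; the vertex labels and edges are made up for illustration.

```python
# A small undirected graph; vertices could stand for hypothetical airports.
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("C", "D")]

# Adjacency list: each vertex maps to the list of its neighbours.
adj_list = {v: [] for v in vertices}
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)

# Adjacency matrix: matrix[i][j] is 1 when vertices i and j share an edge.
index = {v: i for i, v in enumerate(vertices)}
adj_matrix = [[0] * len(vertices) for _ in vertices]
for u, v in edges:
    adj_matrix[index[u]][index[v]] = 1
    adj_matrix[index[v]][index[u]] = 1

print(adj_list["A"])  # ['B', 'C']
print(adj_matrix[0])  # [0, 1, 1, 0]
```

The list uses memory proportional to the number of vertices plus edges, which suits sparse graphs. The matrix always uses a cell for every pair of vertices but answers "is there an edge between i and j?" with a single lookup.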

Sorting
• For a limited number of data elements, insertion sort may be sufficient
• When insertion sort gets bogged down, use merge sort or quicksort (merge sort, however, requires additional memory)

Sorting - Running Times