Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Frank Carrano, © 2012.

Contents
- The ADT Dictionary
- Possible Implementations
- Selecting an Implementation
- Hashing

The ADT Dictionary
- Recall the concept of a sort key from Chapter 11
- Applications often must search a collection of data for specific information
  - To do so, they use a search key
- Applications that require value-oriented operations are frequent

The ADT Dictionary
FIGURE 18-1 A collection of data about certain cities
- Consider the need to search this data by something other than the name of the city

ADT Dictionary Operations
- Test whether dictionary is empty.
- Get number of items in dictionary.
- Insert new item into dictionary.
- Remove item with given search key from dictionary.
- Remove all items from dictionary.

ADT Dictionary Operations
- Get item with a given search key from dictionary.
- Test whether dictionary contains an item with given search key.
- Traverse items in dictionary in sorted search-key order.
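
The eight operations listed above could be captured in an abstract class along the following lines. This is only a sketch: the method names, the `bool` return conventions, and the `MapDictionary` stand-in (backed by `std::map`) are assumptions made for illustration and are not taken from Listing 18-1.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>

// Hypothetical interface covering the operations listed on the slides.
template <class KeyType, class ItemType>
class DictionaryInterface {
public:
    virtual ~DictionaryInterface() = default;
    virtual bool isEmpty() const = 0;
    virtual std::size_t getNumberOfItems() const = 0;
    virtual bool add(const KeyType& key, const ItemType& item) = 0;
    virtual bool remove(const KeyType& key) = 0;
    virtual void clear() = 0;
    virtual ItemType getItem(const KeyType& key) const = 0;
    virtual bool contains(const KeyType& key) const = 0;
    virtual void traverse(const std::function<void(const ItemType&)>& visit) const = 0;
};

// Stand-in implementation, used only to exercise the interface.
template <class KeyType, class ItemType>
class MapDictionary : public DictionaryInterface<KeyType, ItemType> {
public:
    bool isEmpty() const override { return items.empty(); }
    std::size_t getNumberOfItems() const override { return items.size(); }
    bool add(const KeyType& key, const ItemType& item) override {
        return items.emplace(key, item).second;  // false if the key is already present
    }
    bool remove(const KeyType& key) override { return items.erase(key) > 0; }
    void clear() override { items.clear(); }
    ItemType getItem(const KeyType& key) const override {
        auto it = items.find(key);
        assert(it != items.end());  // precondition: the key must be present
        return it->second;
    }
    bool contains(const KeyType& key) const override { return items.count(key) > 0; }
    void traverse(const std::function<void(const ItemType&)>& visit) const override {
        for (const auto& entry : items) visit(entry.second);  // std::map iterates in key order
    }
private:
    std::map<KeyType, ItemType> items;
};
```

Because `std::map` keeps its keys ordered, `traverse` visits items in sorted search-key order, which matches the last operation in the list.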

ADT Dictionary
- View interface, Listing 18-1
FIGURE 18-2 UML diagram for a class of dictionaries

Possible Implementations
- Sorted (by search key), array-based
- Sorted (by search key), link-based
- Unsorted, array-based
- Unsorted, link-based

Possible Implementations
FIGURE 18-3 A dictionary entry

Possible Implementations
FIGURE 18-4 The data members for two sorted linear implementations of the ADT dictionary for the data in Figure 18-1: (a) array based; (b) link based
- View header file for class of dictionary entries, Listing 18-2

Possible Implementations
FIGURE 18-5 The data members for a binary search tree implementation of the ADT dictionary for the data in Figure 18-1

Sorted Array-Based Implementation of ADT Dictionary
- Consider header file for the class ArrayDictionary, Listing 18-3
- Note definition of method add, Listing 18-A
  - Bears responsibility for keeping the array items sorted
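
As a rough illustration of what "keeping the array items sorted" means for add, the sketch below finds the insertion position and shifts later entries over. It is not Listing 18-A: the `Entry` struct, the free function `sortedAdd`, the use of `std::vector`, and the choice to reject duplicate keys are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical entry type: a search key paired with an item.
struct Entry {
    std::string key;
    int item;
};

// Inserts newEntry so that entries stay sorted by key.
// Returns false (and changes nothing) if the key is already present.
bool sortedAdd(std::vector<Entry>& entries, const Entry& newEntry) {
    // Find the first position whose key is not less than the new key.
    std::size_t index = 0;
    while (index < entries.size() && entries[index].key < newEntry.key) ++index;
    if (index < entries.size() && entries[index].key == newEntry.key) return false;
    // vector::insert shifts the later entries one slot toward the end.
    entries.insert(entries.begin() + index, newEntry);
    return true;
}
```

The shift is what makes insertion into a sorted array O(n) even when the position itself is found quickly.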

Binary Search Tree Implementation of ADT Dictionary
- Dictionary class will use composition
  - Will have a binary search tree as one of its data members
  - Reuses the class BinarySearchTree from Chapter 16
- View header file, Listing 18-4

Selecting an Implementation
- Reasons for considering linear implementations
  - Perspective
  - Efficiency
  - Motivation
- Questions to ask
  - What operations are needed?
  - How often is each operation required?

Selecting an Implementation
Three scenarios
- Insertion and traversal in no particular order
- Retrieval; consider:
  - Is a binary search of a linked chain possible?
  - How much more efficient is a binary search of an array than a sequential search of a linked chain?
- Insertion, removal, retrieval, and traversal in sorted order; add and remove must:
  - Find the appropriate position in the dictionary.
  - Insert into (or remove from) this position.
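
To get a feel for the retrieval question, the sketch below counts how many entries each search strategy examines. A linked chain has no constant-time access to its middle node, so in practice it is limited to the sequential strategy. The function names are hypothetical, and a `std::vector<int>` of sorted keys stands in for both the array and the chain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the entries examined by a sequential search (array or linked chain).
std::size_t sequentialComparisons(const std::vector<int>& sorted, int target) {
    std::size_t count = 0;
    for (int value : sorted) {
        ++count;
        if (value == target) break;
    }
    return count;
}

// Counts the entries examined by a binary search of a sorted array.
std::size_t binaryComparisons(const std::vector<int>& sorted, int target) {
    std::size_t count = 0;
    std::size_t low = 0, high = sorted.size();  // search the half-open range [low, high)
    while (low < high) {
        std::size_t mid = low + (high - low) / 2;
        ++count;
        if (sorted[mid] == target) break;
        if (sorted[mid] < target) low = mid + 1;
        else high = mid;
    }
    return count;
}
```

For 1,024 sorted keys, finding the last key examines all 1,024 entries sequentially but at most 11 with binary search, which is the O(n) versus O(log n) gap the slide is pointing at.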

Selecting an Implementation
FIGURE 18-6 Insertion for unsorted linear implementations: (a) array based; (b) link based

Selecting an Implementation
FIGURE 18-7 Insertion for sorted linear implementations: (a) array based; (b) link based

Selecting an Implementation
FIGURE 18-8 The average-case order of the ADT dictionary operations for various implementations

Hashing
- Binary search tree retrieval has order O(log₂ n)
- Need a different strategy to locate an item
- Consider a “magic box” as an address calculator
  - Place an item into, or retrieve it from, that address in an array
  - Ideally, the calculator maps each search key to a unique address

Hashing
FIGURE 18-9 Address calculator

Hashing
- Pseudocode for getItem

Hashing
- Pseudocode for remove
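
The pseudocode on these two slides did not survive in the transcript. The sketch below shows one plausible shape for getItem and remove under the idealized assumption that the address calculator never sends two keys to the same location. The class name `SimpleHashTable`, the table size of 101, the integer keys, and the use of `std::optional` are assumptions, not the book's code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

constexpr std::size_t TABLE_SIZE = 101;  // hypothetical table size

struct Entry {
    int key;
    std::string item;
};

class SimpleHashTable {
public:
    // Address calculator: maps a search key to an array index.
    static std::size_t getHashIndex(int key) {
        assert(key >= 0);
        return static_cast<std::size_t>(key) % TABLE_SIZE;
    }
    // Stores the entry at its computed address; returns false if the slot is taken.
    bool add(int key, const std::string& item) {
        std::size_t i = getHashIndex(key);
        if (table[i].has_value()) return false;
        table[i] = Entry{key, item};
        return true;
    }
    // Looks only at the computed address; no searching is involved.
    std::optional<std::string> getItem(int key) const {
        const auto& slot = table[getHashIndex(key)];
        if (slot && slot->key == key) return slot->item;
        return std::nullopt;
    }
    // Empties the computed address if it holds the given key.
    bool remove(int key) {
        auto& slot = table[getHashIndex(key)];
        bool isFound = slot && slot->key == key;
        if (isFound) slot.reset();
        return isFound;
    }
private:
    std::array<std::optional<Entry>, TABLE_SIZE> table{};
};
```

Each operation touches exactly one array location, which is where the O(1) promise of hashing comes from. The `add` here simply refuses a second key at an occupied address; handling that case properly is the subject of the collision-resolution slides.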

Hash Functions
Possible algorithms
- Selecting digits
- Folding
- Modulo arithmetic
- Converting a character string to an integer
  - Use ASCII values
  - Factor the results using Horner’s rule
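
Each of the four approaches above can be sketched in a few lines. The function names, the particular digits selected, and the base of 128 for strings are illustrative assumptions; the slides do not fix any of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Selecting digits: keep only some decimal digits of the key.
// Here (an arbitrary choice) the hundred-thousands digit and the units digit.
unsigned selectDigits(std::uint64_t key) {
    return static_cast<unsigned>((key / 100000) % 10 * 10 + key % 10);
}

// Folding: add all the decimal digits of the key together.
unsigned foldDigits(std::uint64_t key) {
    unsigned sum = 0;
    while (key > 0) {
        sum += static_cast<unsigned>(key % 10);
        key /= 10;
    }
    return sum;
}

// Modulo arithmetic: h(x) = x mod tableSize.
std::size_t moduloHash(std::uint64_t key, std::size_t tableSize) {
    assert(tableSize > 0);
    return static_cast<std::size_t>(key % tableSize);
}

// String to integer: treat the characters as digits in base 128 (assumes 7-bit ASCII)
// and evaluate the resulting polynomial with Horner's rule.
std::size_t stringHash(const std::string& s, std::size_t tableSize) {
    assert(tableSize > 0);
    const std::uint64_t base = 128;
    std::uint64_t h = 0;
    for (unsigned char c : s) {
        h = (h * base + c) % tableSize;  // reduce at every step to avoid overflow
    }
    return static_cast<std::size_t>(h);
}
```

For the hypothetical key 1364825, `selectDigits` gives 35, `foldDigits` gives 29, and `moduloHash` with a table size of 101 gives 12. Applying the modulo inside the Horner loop keeps intermediate values small without changing the final result.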

Resolving Collisions
FIGURE A collision

Resolving Collisions
Approach 1: Open addressing
- Probe for another available location
- Probing can be done linearly or quadratically
- Removal requires specifying the state of each location: occupied, empty, or removed
- Clustering is a problem
- Double hashing can reduce clustering
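
The need for three states is easiest to see in code. In the sketch below, a search stops at a location that was never used but continues past one whose item was removed; otherwise a removal could hide items that were placed further along the probe sequence. The class name `LinearProbingSet`, the integer keys, and the choice to store keys only are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class SlotState { Empty, Occupied, Removed };

struct Slot {
    SlotState state = SlotState::Empty;
    int key = 0;
};

class LinearProbingSet {
public:
    explicit LinearProbingSet(std::size_t tableSize) : table(tableSize) {
        assert(tableSize > 0);
    }

    bool contains(int key) const { return findIndex(key) != table.size(); }

    bool add(int key) {
        if (contains(key)) return false;
        std::size_t start = hash(key);
        for (std::size_t step = 0; step < table.size(); ++step) {
            Slot& slot = table[(start + step) % table.size()];
            if (slot.state != SlotState::Occupied) {  // Empty and Removed slots can be reused
                slot.state = SlotState::Occupied;
                slot.key = key;
                return true;
            }
        }
        return false;  // table is full
    }

    bool remove(int key) {
        std::size_t i = findIndex(key);
        if (i == table.size()) return false;
        table[i].state = SlotState::Removed;  // keeps later probe sequences intact
        return true;
    }

private:
    std::size_t hash(int key) const {
        assert(key >= 0);
        return static_cast<std::size_t>(key) % table.size();
    }
    // Returns the index holding key, or table.size() if the key is absent.
    std::size_t findIndex(int key) const {
        std::size_t start = hash(key);
        for (std::size_t step = 0; step < table.size(); ++step) {
            std::size_t i = (start + step) % table.size();
            const Slot& slot = table[i];
            if (slot.state == SlotState::Empty) break;  // a never-used slot ends the search
            if (slot.state == SlotState::Occupied && slot.key == key) return i;
        }
        return table.size();
    }
    std::vector<Slot> table;
};
```

With a table of size 101, the keys 7, 108, and 209 all hash to index 7 and end up in locations 7, 8, and 9. Removing 108 marks location 8 as removed, so a later search for 209 still probes past it and succeeds.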

Resolving Collisions
FIGURE Linear probing with h(x) = x mod 101

Resolving Collisions
FIGURE Quadratic probing with h(x) = x mod 101
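
The difference between the two probing schemes is only in how far each successive probe lands from the home location h(x). The sketch below lists the first few locations each scheme examines; the function names and the sample key are hypothetical, and only h(x) = x mod 101 comes from the figure captions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// First `count` locations examined by linear probing: h, h+1, h+2, ...
std::vector<std::size_t> linearProbes(std::size_t key, std::size_t tableSize, std::size_t count) {
    assert(tableSize > 0);
    const std::size_t home = key % tableSize;
    std::vector<std::size_t> probes;
    for (std::size_t i = 0; i < count; ++i) probes.push_back((home + i) % tableSize);
    return probes;
}

// First `count` locations examined by quadratic probing: h, h+1, h+4, h+9, ...
std::vector<std::size_t> quadraticProbes(std::size_t key, std::size_t tableSize, std::size_t count) {
    assert(tableSize > 0);
    const std::size_t home = key % tableSize;
    std::vector<std::size_t> probes;
    for (std::size_t i = 0; i < count; ++i) probes.push_back((home + i * i) % tableSize);
    return probes;
}
```

For a sample key of 7597, whose home location is 22, linear probing examines 22, 23, 24, 25 while quadratic probing examines 22, 23, 26, 31. Spreading the probes out is what lets quadratic probing avoid the long runs of occupied locations that linear probing tends to build.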

Resolving Collisions
FIGURE Double hashing during the insertion of 58, 14, and 91
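
The figure itself is not in the transcript, so the hash functions it uses are unknown here. As a sketch, assume a table of size 11 with a primary hash h1(key) = key mod 11 and a secondary hash h2(key) = 7 - (key mod 7) that supplies the step size; both functions and the table size are assumptions chosen so that the three keys collide.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Assumed hash functions (not stated in the transcript).
std::size_t h1(std::size_t key, std::size_t tableSize) { return key % tableSize; }
std::size_t h2(std::size_t key) { return 7 - (key % 7); }  // always in 1..7, never 0

// Inserts key and returns the index used, or std::nullopt if no free slot is found.
std::optional<std::size_t> insertDoubleHash(std::vector<std::optional<std::size_t>>& table,
                                            std::size_t key) {
    assert(!table.empty());
    std::size_t index = h1(key, table.size());
    const std::size_t step = h2(key);  // key-dependent step size
    for (std::size_t tries = 0; tries < table.size(); ++tries) {
        if (!table[index].has_value()) {
            table[index] = key;
            return index;
        }
        index = (index + step) % table.size();
    }
    return std::nullopt;
}
```

Under these assumptions all three keys start at index 3. Key 58 stays there; 14 steps by 7 to index 10; 91 steps by 7 to index 10, finds it occupied, and steps again to index 6. The step must never be 0, and a prime table size ensures every location is eventually reachable.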

Resolving Collisions
Approach 2: Restructuring the hash table
- Each hash location can accommodate more than one item
- Each location is a “bucket” or an array itself
- Alternatively, design the hash table as an array of linked chains, called “separate chaining”

Resolving Collisions
FIGURE Separate chaining
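
A minimal sketch of separate chaining follows, using `std::forward_list` as the linked chain at each hash location. The class name `ChainedDictionary`, the integer keys, and the decision to add new entries at the front of a chain are assumptions; the book's own HashedEntry-based version may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <string>
#include <utility>
#include <vector>

class ChainedDictionary {
public:
    explicit ChainedDictionary(std::size_t tableSize) : table(tableSize) {
        assert(tableSize > 0);
    }

    void add(int key, const std::string& item) {
        // Adding at the front of the chain takes constant time.
        table[index(key)].emplace_front(key, item);
    }

    // Returns a pointer to the item, or nullptr if the key is absent.
    const std::string* getItem(int key) const {
        for (const auto& entry : table[index(key)])
            if (entry.first == key) return &entry.second;
        return nullptr;
    }

    bool remove(int key) {
        auto& chain = table[index(key)];
        auto prev = chain.before_begin();
        for (auto it = chain.begin(); it != chain.end(); prev = it, ++it) {
            if (it->first == key) {
                chain.erase_after(prev);  // unlink the node that follows prev
                return true;
            }
        }
        return false;
    }

private:
    std::size_t index(int key) const {
        assert(key >= 0);
        return static_cast<std::size_t>(key) % table.size();
    }
    std::vector<std::forward_list<std::pair<int, std::string>>> table;
};
```

Colliding keys simply share a chain, so no location states are needed and the table can hold more items than it has locations.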

The Efficiency of Hashing
- The efficiency of hashing depends on the load factor alpha (α)

The Efficiency of Hashing
- Linear probing: average number of comparisons for a given α

The Efficiency of Hashing
- Quadratic probing and double hashing: efficiency for a given α

The Efficiency of Hashing
- Separate chaining: efficiency for a given α
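
The formulas on these four slides were images and are missing from the transcript. The sketch below uses the standard textbook approximations for the average number of comparisons per search, with α defined as the number of items divided by the table size; treat them as the usual results rather than a verbatim copy of the slides. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Load factor: alpha = (current number of items) / (table size).
double loadFactor(double itemCount, double tableSize) {
    assert(tableSize > 0);
    return itemCount / tableSize;
}

// Linear probing (requires alpha < 1).
double linearProbingSuccessful(double a) { return 0.5 * (1.0 + 1.0 / (1.0 - a)); }
double linearProbingUnsuccessful(double a) { return 0.5 * (1.0 + 1.0 / ((1.0 - a) * (1.0 - a))); }

// Quadratic probing and double hashing (requires 0 < alpha < 1).
double doubleHashingSuccessful(double a) { return -std::log(1.0 - a) / a; }
double doubleHashingUnsuccessful(double a) { return 1.0 / (1.0 - a); }

// Separate chaining (alpha may exceed 1).
double chainingSuccessful(double a) { return 1.0 + a / 2.0; }
double chainingUnsuccessful(double a) { return a; }
```

At α = 0.5 these give 1.5 and 2.5 comparisons for linear probing, about 1.39 and 2.0 for quadratic probing and double hashing, and 1.25 and 0.5 for separate chaining. All of the open-addressing costs grow without bound as α approaches 1, while the chaining costs grow only linearly.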

The Efficiency of Hashing
FIGURE The relative efficiency of four collision-resolution methods

Maintaining Hashing Performance
- Collisions and their resolution become more costly as the load factor α increases
- To maintain efficiency, restrict the size of α
  - α ≤ 0.5 for open addressing
  - α ≤ 1.0 for separate chaining
- If the load factor exceeds these limits
  - Increase the size of the hash table
  - Rehash with a new hash function
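
The two remedies go together: once the table grows, h(x) = x mod tableSize is effectively a new hash function, so every stored key has to be placed again. The sketch below does this for an open-addressing table with linear probing. The class name `GrowingProbingSet`, the initial size of 11, and the policy of growing to the next prime at least twice the old size are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

class GrowingProbingSet {
public:
    explicit GrowingProbingSet(std::size_t tableSize = 11) : table(tableSize) {
        assert(tableSize > 0);
    }

    void add(std::size_t key) {
        if (contains(key)) return;
        // Grow first if this insertion would push the load factor above 0.5.
        if (2 * (count + 1) > table.size()) rehash(nextPrime(2 * table.size()));
        place(table, key);
        ++count;
    }

    bool contains(std::size_t key) const {
        const std::size_t home = key % table.size();
        for (std::size_t step = 0; step < table.size(); ++step) {
            const auto& slot = table[(home + step) % table.size()];
            if (!slot) return false;  // an empty slot ends the probe sequence
            if (*slot == key) return true;
        }
        return false;
    }

    std::size_t size() const { return count; }
    std::size_t capacity() const { return table.size(); }
    double loadFactor() const { return static_cast<double>(count) / table.size(); }

private:
    using Table = std::vector<std::optional<std::size_t>>;

    // Linear probing; a free slot always exists because alpha stays at or below 0.5.
    static void place(Table& t, std::size_t key) {
        std::size_t i = key % t.size();
        while (t[i]) i = (i + 1) % t.size();
        t[i] = key;
    }
    // Moves every key into a larger table, recomputing each index as key mod newSize.
    void rehash(std::size_t newSize) {
        Table bigger(newSize);
        for (const auto& slot : table)
            if (slot) place(bigger, *slot);
        table.swap(bigger);
    }
    static bool isPrime(std::size_t n) {
        if (n < 2) return false;
        for (std::size_t d = 2; d * d <= n; ++d)
            if (n % d == 0) return false;
        return true;
    }
    static std::size_t nextPrime(std::size_t n) {
        while (!isPrime(n)) ++n;
        return n;
    }

    Table table;
    std::size_t count = 0;
};
```

Starting from 11 locations, the sixth insertion would raise α above 0.5, so the table grows to 23 locations before the key is placed. This sketch omits removal, which would need the occupied, empty, and removed states described for open addressing.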

What Constitutes a Good Hash Function?
- Is it easy and fast to compute?
- Does it scatter data evenly throughout the hash table?
- How well does it scatter random data?
- How well does it scatter non-random data?
Note: traversal in sorted order is inefficient when using hashing

Hashing and Separate Chaining for ADT Dictionary
FIGURE A dictionary entry when separate chaining is used

Hashing and Separate Chaining for ADT Dictionary
- View Listing 18-5, the class HashedEntry
- Note the definitions of the add and remove functions, Listing 18-B

End Chapter 18