CS 221 Analysis of Algorithms Data Structures Dictionaries, Hash Tables, Ordered Dictionary and Binary Search Trees.

Slides:



Advertisements
Similar presentations
© 2004 Goodrich, Tamassia Hash Tables
Advertisements

The Dictionary ADT Definition A dictionary is an ordered or unordered list of key-element pairs, where keys are used to locate elements in the list. Example:
© 2004 Goodrich, Tamassia Hash Tables1  
© 2004 Goodrich, Tamassia Dictionaries   
Chapter 9: Maps, Dictionaries, Hashing Nancy Amato Parasol Lab, Dept. CSE, Texas A&M University Acknowledgement: These slides are adapted from slides provided.
9.3 The Dictionary ADT    … 01/25/98, Taipei, … 03/04/98, Yunlin, … 02/15/98, Douliou, … 01/20/98,
Dictionaries1 © 2010 Goodrich, Tamassia m l h m l h m l.
Maps. Hash Tables. Dictionaries. 2 CPSC 3200 University of Tennessee at Chattanooga – Summer 2013 © 2010 Goodrich, Tamassia.
Data Structures Lecture 12 Fang Yu Department of Management Information Systems National Chengchi University Fall 2010.
Log Files. O(n) Data Structure Exercises 16.1.
Dictionaries and Their Implementations
© 2004 Goodrich, Tamassia Dictionaries   
CSC401 – Analysis of Algorithms Lecture Notes 5 Heaps and Hash Tables Objectives: Introduce Heaps, Heap-sorting, and Heap- construction Analyze the performance.
Dictionaries and Hash Tables1  
1 Chapter 9 Maps and Dictionaries. 2 A basic problem We have to store some records and perform the following: add new record add new record delete record.
© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.
Hash Tables1 Part E Hash Tables  
Hash Tables1 Part E Hash Tables  
Hash Tables1 Part E Hash Tables  
COMP 171 Data Structures and Algorithms Tutorial 10 Hash Tables.
Hashing General idea: Get a large array
Dictionaries 4/17/2017 3:23 PM Hash Tables  
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
Maps, Dictionaries, Hashing
CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.
Dictionaries and Hash Tables. Dictionary A dictionary, in computer science, implies a container that stores key-element pairs called items, and allows.
Hash Tables1   © 2010 Goodrich, Tamassia.
Dictionaries and Hash Tables1 Hash Tables  
CS 221 Analysis of Algorithms Ordered Dictionaries and Search Trees.
Prof. Amr Goneid, AUC1 CSCI 210 Data Structures and Algorithms Prof. Amr Goneid AUC Part 5. Dictionaries(2): Hash Tables.
© 2004 Goodrich, Tamassia Hash Tables1  
CS201: Data Structures and Discrete Mathematics I Hash Table.
1 Hashing - Introduction Dictionary = a dynamic set that supports the operations INSERT, DELETE, SEARCH Dictionary = a dynamic set that supports the operations.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
Chapter 13 C Advanced Implementations of Tables – Hash Tables.
CHAPTER 9 HASH TABLES, MAPS, AND SKIP LISTS ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS IN C++,
Chapter 11 (Lafore’s Book) Hash Tables Hwajung Lee.
Prof. Amr Goneid, AUC1 CSCI 210 Data Structures and Algorithms Prof. Amr Goneid AUC Part 5. Dictionaries(2): Hash Tables.
Algorithms Design Fall 2016 Week 6 Hash Collusion Algorithms and Binary Search Trees.
Hash Tables 1/28/2018 Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and M.
Hashing (part 2) CSE 2011 Winter March 2018.
CSCI 210 Data Structures and Algorithms
Hashing CSE 2011 Winter July 2018.
Dictionaries Dictionaries 07/27/16 16:46 07/27/16 16:46 Hash Tables 
© 2013 Goodrich, Tamassia, Goldwasser
Dictionaries 9/14/ :35 AM Hash Tables   4
Hash Tables 3/25/15 Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia, and M.
Data Structures and Database Applications Hashing and Hashtables in C#
Hash Table.
Dictionaries < > = Dictionaries Dictionaries
Data Structures and Database Applications Hashing and Hashtables in C#
Hash Tables 11/22/2018 3:15 AM Hash Tables  1 2  3  4
Dictionaries and Hash Tables
Chapter 21 Hashing: Implementing Dictionaries and Sets
Dictionaries and Their Implementations
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
Dictionaries < > = /9/2018 3:06 AM Dictionaries
Dictionaries and Hash Tables
Dictionaries 1/17/2019 7:55 AM Hash Tables   4
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
Dictionaries < > = /17/2019 4:20 PM Dictionaries
Hash Tables Computer Science and Engineering
Hash Tables Computer Science and Engineering
Hash Tables Computer Science and Engineering
Dictionaries 4/5/2019 1:49 AM Hash Tables  
CS210- Lecture 17 July 12, 2005 Agenda Collision Handling
CS210- Lecture 16 July 11, 2005 Agenda Maps and Dictionaries Map ADT
Dictionaries < > = Dictionaries Dictionaries
Dictionaries and Hash Tables
Presentation transcript:

CS 221 Analysis of Algorithms Data Structures Dictionaries, Hash Tables, Ordered Dictionary and Binary Search Trees

Reading material  Goodrich and Tamassia, 2002 Chapter 2, section 2.5,pages see also section 2.6 Chapter 3, section 3.1 pages

Dictionaries  Remember Stacks and queues  LIFO, FIFO vectors and lists  Ranks and positions

Dictionaries  Dictionaries containers for storing, retrieving, modifying collections of objects We call the objects – items Items = (key, element) pair Access to item based on key –D(k,e) Can the element by the key?  D(e,e)

Dictionary ADT  The dictionary ADT models a searchable collection of key-element items  The main operations of a dictionary are searching, inserting, and deleting items  Multiple items with the same key are allowed  Applications: address book credit card authorization mapping host names (e.g., cs16.net) to internet addresses (e.g., )

Dictionary ADT  Dictionary ADT methods: findElement(k): if the dictionary has an item with key k, returns its element, else, returns the special element NO_SUCH_KEY insertItem(k, o): inserts item (k, o) into the dictionary removeElement(k): if the dictionary has an item with key k, removes it from the dictionary and returns its element, else returns the special element NO_SUCH_KEY size(), isEmpty() keys(), Elements()

Dictionaries  Unordered Dictionary Log files (or audit trails) Hash tables

Dictionaries  Log files vector or list store items as unsorted sequence Think of –  error log  text messages  …

Dictionaries  Log files – Performance insertItem(k,e)  very good – always insert items at end (or beginning of sequence)  O(1) findElement(k)  must scan sequence to look for key  worst case – unsuccessful O(n) removeElement(k)?

Dictionaries  Logfiles Good for applications like –  small unstructured sequences findElement(k), removeItem(k) times are trivial  Almost always insertItem rarely findElement, removeItem

Hash Tables  It is convenient to think of keys to be address of items in some sense.  In an sequence implemented as a vector- it would be nice if key = rank …but this often does not work  …consider student ID # as a key, as a rank

Hash Tables  Another strategy is to create a hash table  Essentially a hash table is a container where you can compute the address of the item based on the key

Hash Tables  We need two things a vector for holding items a hash function to compute the index of the vector based on key So, suppose we have a  vector container A  hash function on k h(k)  items are not stored at A[k], but  at A[h(k)]

Hash Functions and Hash Tables (§2.5.2)  A hash function h maps keys of a given type to integers in a fixed interval [0, N  1]  Example: h(x)  x mod N is a hash function for integer keys  The integer h(x) is called the hash value of key x  A hash table for a given key type consists of Hash function h Array (called table) of size N  When implementing a dictionary with a hash table, the goal is to store item (k, o) at index i  h(k)

Example  We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer  Our hash table uses an array of size N  10,000 and the hash function h(x)  last four digits of x     …

Hash Function  Hash function then is a function that maps a set of keys to a set of integers that represent the set of ranks (addresses) in our vector …but this usually requires to steps

Hash Functions (§ 2.5.3)  A hash function is usually specified as the composition of two functions: Hash code map: h 1 : keys  integers Compression map: h 2 : integers  [0, N  1]  The hash code map is applied first, and the compression map is applied next on the result, i.e., h(x) = h 2 (h 1 (x))  The goal of the hash function is to “disperse” the keys in an apparently random way

Hash functions  Hash code maps there are many – see Goodrich and Tamassia

Hash functions  Hash compression map to map the output of the hash code map (some set of integers) to integers in the range of [0,…n-1]  corresponds to set of ranks in vector container

Hash Tables  What happens with hash function values are not unique collisions collisions mean that two item has the same hash code which means that two items will have the same location in the vector Can we do that?

Hash Tables  Collision handling Chaining Open addressing

Hash Tables – Collision Handing  Chaining with a vector of simple items we can insert an item where one exists without destroying the existing one So, what can we do?

Hash Tables – Collision Handing  Chaining Suppose that each element of the dictionary D[i] is itself an unordered sequence The algorithm is  calc hash code  check hash table entry D[h(k)] if not occupied insertItem if occupied insertItem anyway, but after existing in same location of D Called Separate Chaining Rule

Collision Handling (§ 2.5.5)  Collisions occur when different elements are mapped to the same cell  Chaining: let each cell in the table point to a linked list of elements that map there  Chaining is simple, but requires additional memory outside the table   

Example  You create a vector for storing WVU student SSN data— vector has 100 slots (rank 0 – 99) you have 100 students with 9 digit SSNs Your hash code mapping function is  h1(SSN) = int(left3Characters(SSN)) Your hash compression mapping function is  h2(h1(SSN) = h1(SSN) mod 100

Hash Tables – collision handling  So what happens if a lot of keys  generate the same hash code?  …they pile up  Loading Factor want to keep loading factor low

Hash Tables – Collision Handing Open Addressing  Linear Probing Vector holds one item per rank  if rank at D[h(k)] is occupied, …  check D[h(k)+1], if unoccupied insertItem  if not, check D[h(k)+2], …

Linear Probing (§2.5.5)  Open addressing: the colliding item is placed in a different cell of the table  Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell  Each table cell inspected is referred to as a “probe”  Colliding items lump together, causing future collisions to cause a longer sequence of probes  Example: h(x)  x mod 13 Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

Hash Tables – Collision Handing Open Addressing  Linear Probing causes lumping  alternatives quadratic probing double hashing

Performance of Hashing  In the worst case, searches, insertions and removals on a hash table take O(n) time  The worst case occurs when all the keys inserted into the dictionary collide  The load factor   n  N affects the performance of a hash table  Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1  (1   )  The expected running time of all the dictionary ADT operations in a hash table is O(1)  In practice, hashing is very fast provided the load factor is not close to 100%  Applications of hash tables: small databases compilers browser caches