COSC 2007 Data Structures II Chapter 12 Advanced Implementation of Tables III.



2 Topics
Hashing: definition, hash function, key, hash value, collision
Open hashing

3 Common Problem
A common pattern in many programs is to store and look up data:
  Find a student record, given an ID number.
  Find a person's address, given a phone number.
Because it is so common, many data structures for it have been investigated.

4 Phone Number Problem
Problem: a phone company wants to implement caller ID.
  Given a phone number (the key), look up the person's name or address (the data).
  There are lots of phone numbers (P= ) in a given area code, but only a small fraction of them are in use.
  Nobody has a phone number : or

5 Comparison of Time Complexity (average)
Operation            Insertion   Deletion    Search
Unsorted array       O(1)        O(n)        O(n)
Unsorted reference   O(1)        O(n)        O(n)
Sorted array         O(n)        O(n)        O(log n)
Sorted reference     O(n)        O(n)        O(n)
BST                  O(log n)    O(log n)    O(log n)
Can we do better than O(log n)?

6 Can we do better than O(log N)?
All previous searching techniques require a specified amount of time (O(log n) or O(n)).
The time usually depends on the number of elements (n) stored in the table.
In some situations, searching should be almost instantaneous. Examples:
  911 emergency system
  Air-traffic control system

7 Can we do better than O(log N)?
Answer: Yes … sort of, if we're lucky.
General idea: take the key of the data record you're inserting, and use that number directly as the item number (index) in a list (array).
Search is O(1), but a huge amount of space is wasted.
(Figure: an array in which most slots are Null and only a few hold records.)
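The direct-indexing idea on this slide can be sketched in a few lines of Java. This is only an illustration: the class name DirectAddressTable, the key range of 10,000, and the sample record are assumptions, not part of the deck.

```java
// Direct addressing: the key itself is the array index (no hash function).
class DirectAddressTable {
    private final String[] slots;   // slots[key] holds the data, or null if unused

    // keySpace is the number of possible keys, not the number of stored records.
    DirectAddressTable(int keySpace) {
        slots = new String[keySpace];
    }

    // O(1): the key is used directly as the index.
    void insert(int key, String data) {
        slots[key] = data;
    }

    // O(1): returns null when no record is stored under this key.
    String search(int key) {
        return slots[key];
    }

    // Counts the slots that hold no record, i.e. the wasted space.
    int emptySlots() {
        int count = 0;
        for (String s : slots) {
            if (s == null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        DirectAddressTable table = new DirectAddressTable(10_000); // hypothetical key range
        table.insert(4321, "Xu");
        System.out.println(table.search(4321));
        System.out.println(table.emptySlots());
    }
}
```

With one record stored, every other slot in the array is empty, which is the space cost the slide warns about.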

8 Hashing
Basic idea: don't use the data value directly. Given an array of size B, use a hash function, h(x), which maps the given data record x to some (hopefully) unique index (bucket) in the array.
(Figure: h maps x to bucket h(x) in an array indexed 0 to B-1.)

9 What is a Hash Table?
The simplest kind of hash table is an array of records. This example has 101 records, indexed [0] to [100].

10 What is a Hash Table?
Each record has a special field, called its key. In this example, the key is a long integer field called Number.
(Figure: the record at [4], with a Number field and the entries "Queen St." and "Linda Kim".)

11 What is a Hash Table?
The number is a person's phone number, and the rest is the person's name or address.

12 What is a Hash Table?
When a hash table is in use, some spots contain valid records, and other spots are "empty".

13 Inserting a New Record
In order to insert a new record, the key must somehow be converted to an array index. The index is called the hash value of the key.

14 Inserting a New Record
Typical way to create a hash value: (Number mod 101).
What is ( mod 101)?

15 Inserting a New Record
Typical way to create a hash value: (Number mod 101).
What is ( mod 101)? 3

16 Inserting a New Record
The hash value is used for the location of the new record: [3].

17 Inserting a New Record
The hash value is used for the location of the new record.
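As a rough illustration of the (Number mod 101) step, the sketch below converts a key to an index for the 101-slot table. The key 1013 is a made-up value chosen so the arithmetic is easy to check; it is not the phone number from the slide, and the names ModHash and hashValue are assumptions.

```java
// Key-to-index conversion with the mod operator, for a table of 101 records.
class ModHash {
    static final int TABLE_SIZE = 101;  // matches the 101-record example

    // Converts a non-negative key to an index in 0..TABLE_SIZE-1.
    static int hashValue(long number) {
        return (int) (number % TABLE_SIZE);
    }

    public static void main(String[] args) {
        long number = 1013L;  // hypothetical key, not the number from the slide
        // 1013 = 10 * 101 + 3, so the hash value is 3.
        System.out.println(hashValue(number));
    }
}
```

The record with this key would then be stored at index [3], as in the slides.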

18 What is Hashing?
Each item has a unique key.
Use a large array called a hash table.
Use a hash function.
Hashing is like indexing in that it involves associating a key with a relative record address. Hashing, however, is different from indexing in two important ways:
  With hashing, there is no obvious connection between the key and the location.
  With hashing, two different keys may be transformed to the same address.
A hash function is a function h(K) which transforms a key K into an address.

19 What is Hashing?
An address calculator (hashing function) is used to determine the location of the item.
(Figure: a search key goes into the address calculator (hash function), which yields a position in the array (hash table), indexed 0 to N-1.)

20 What Can Be Hashed?
Anything! We can hash on numbers, strings, structures, etc.
Java defines a hashing method for general objects, hashCode(), which returns an integer value.
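A small sketch of how Java's general-purpose hashCode() might be turned into a table index. The helper name bucketFor and the table size of 101 are assumptions for illustration. Math.floorMod is used because hashCode() may return a negative integer.

```java
// Turning any object's hashCode() into a bucket index.
class ObjectHashing {
    // Maps the key's hashCode() to an index in 0..buckets-1.
    // Math.floorMod keeps the result non-negative even when hashCode() is negative.
    static int bucketFor(Object key, int buckets) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    public static void main(String[] args) {
        int buckets = 101;  // assumed table size
        System.out.println(bucketFor("Linda Kim", buckets)); // a String key
        System.out.println(bucketFor(42, buckets));          // an Integer key (autoboxed)
        System.out.println(bucketFor(3.14, buckets));        // a Double key (autoboxed)
    }
}
```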

21 Where do we use Hashing? Databases (phone book, student name list). Spell checkers. Computer chess games. Compilers.

22 Hashing and Tables
Hashing gives us another implementation of the Table ADT.
Hashing operations:
  Initialize: all locations in the hash table are empty.
  Insert
  Search
  Delete
Hash the key; this gives an index; use it to find the value stored in the table in O(1). This is a great improvement over O(log N).

23 Hashing
Insert pseudocode:
  tableInsert(newItem)
    i = the array index that the address calculator gives you for the new item's search key
    table[i] = newItem
Retrieval pseudocode:
  tableRetrieve(searchKey)
    i = array index for searchKey given by the hash function
    if (table[i] is not empty and table[i].getKey() equals searchKey)
      return table[i]
    else
      return null

24 Hashing
Deletion pseudocode:
  tableDelete(searchKey)
    i = array index for searchKey given by the hash function
    success = (table[i] is not empty and table[i].getKey() equals searchKey)
    if (success)
      delete the item from table[i]
    return success
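The insert, retrieval, and deletion pseudocode can be sketched as runnable Java as follows. This is a minimal sketch under stated assumptions: keys are long integers, the data is a String, the table has 101 slots, and collisions are deliberately ignored (a second key with the same index overwrites the first). The class name SimpleHashTable and the parallel-array layout are illustrative choices, not the deck's design.

```java
// Hash table without collision handling, following the slide pseudocode.
class SimpleHashTable {
    private static final int TABLE_SIZE = 101;             // assumed table size
    private final long[] keys = new long[TABLE_SIZE];
    private final String[] values = new String[TABLE_SIZE];
    private final boolean[] used = new boolean[TABLE_SIZE]; // false means "empty"

    // The address calculator: key mod table size, kept non-negative.
    private int hashIndex(long key) {
        return (int) Math.floorMod(key, (long) TABLE_SIZE);
    }

    // Stores the item at its hash index. An item already there is overwritten,
    // because this sketch ignores collisions.
    void tableInsert(long key, String value) {
        int i = hashIndex(key);
        keys[i] = key;
        values[i] = value;
        used[i] = true;
    }

    // Returns the stored value, or null if the slot is empty or holds another key.
    String tableRetrieve(long key) {
        int i = hashIndex(key);
        return (used[i] && keys[i] == key) ? values[i] : null;
    }

    // Empties the slot only if it holds exactly this key.
    boolean tableDelete(long key) {
        int i = hashIndex(key);
        boolean success = used[i] && keys[i] == key;
        if (success) {
            used[i] = false;
            values[i] = null;
        }
        return success;
    }
}
```

Checking the stored key before returning or deleting matters because two different keys can map to the same index.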

25 Hash Tables
Table size
  Entries are numbered 0 to TSIZE-1.
Mapping
  Simple to compute
  Ideally 1-1: not possible
  Even distribution
Main problems
  Choosing the table size
  Choosing a good hash function
  What to do on collisions

26 How to choose the Table Size?
H(Key) = Key mod TSIZE
TSIZE = , TSIZE = 11

27 How to choose a Hashing Function?
The hash function we choose depends on the type of the key field (the key we use to do our lookup). Finding a good one can be hard.
Rules: a good hash function should
  Be easy to calculate.
  Use all of the key.
  Spread the keys uniformly.

28 How to choose a Hashing Function?
Examples:
  Student IDs (integers): h(idNumber) = idNumber % B
    e.g. h(678921) = 678921 % 100 = 21
  Names (character strings): h(name) = (sum over the ASCII values) % B
    e.g. h(Bill) = (66 + 105 + 108 + 108) % 101 = 387 % 101 = 84
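The two example hash functions might look like this in Java. The method names hashId and hashName are assumptions. The bucket count B is passed as a parameter so both slide examples (B = 100 and B = 101) can be tried.

```java
// The two example hash functions: integer IDs and character strings.
class HashFunctions {
    // h(idNumber) = idNumber % B, for non-negative IDs.
    static int hashId(int idNumber, int buckets) {
        return idNumber % buckets;
    }

    // h(name) = (sum of the character codes) % B.
    static int hashName(String name, int buckets) {
        int sum = 0;
        for (int i = 0; i < name.length(); i++) {
            sum += name.charAt(i);  // for ASCII text this is the ASCII value
        }
        return sum % buckets;
    }

    public static void main(String[] args) {
        System.out.println(hashId(678921, 100));   // 678921 % 100
        System.out.println(hashName("Bill", 101)); // (66 + 105 + 108 + 108) % 101
    }
}
```

Because addition ignores order, any rearrangement of the same letters produces the same sum and therefore the same bucket. That is one reason a plain character sum does not spread keys very uniformly.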

29 Collision
Here is another new record to insert, with a hash value of 2.
(Figure: the array of records and the new record, labeled "My hash value is [2]".)

30 What to do on collisions?
Open hashing (separate chaining)
Closed hashing (open addressing)
  Linear probing
  Quadratic probing
  Double hashing

31 Open hashing (separate chaining)
Keep a list of all elements that hash to the same value.

32 Open hashing (separate chaining)
Secondary data structure:
  List
  Search tree
  Another hash table
We expect few collisions, so a list is enough:
  Simple
  Small overhead

33 Operations with Chaining
Insert with chaining:
  Apply the hash function to get a position.
  Insert the key into the linked list at this position.
Search with chaining:
  Apply the hash function to get a position.
  Search the linked list at this position.

34 Open hashing (separate chaining)
public class ChainNode {
    private KeyedItem item;
    private ChainNode next;

    public ChainNode(KeyedItem newItem, ChainNode nextNode) {
        item = newItem;
        next = nextNode;
    }

    // set and get methods
} // end of ChainNode

35 Open hashing (separate chaining)
public class HashTable {
    private final int HASH_TABLE_SIZE = 101; // number of buckets in the hash table
    private ChainNode[] table;               // hash table
    private int size;                        // number of items in the table

    public HashTable() {
        table = new ChainNode[HASH_TABLE_SIZE];
        size = 0;
    }

    public boolean tableIsEmpty() { return size == 0; }
    public int tableLength() { return size; }

    // Method bodies are omitted on this slide.
    public void tableInsert(KeyedItem newItem) throws HashException {}
    public boolean tableDelete(Comparable searchKey) {}
    public KeyedItem tableRetrieve(Comparable searchKey) {}
} // end of HashTable

36 Open hashing (separate chaining)
tableInsert(newItem)
  if (table is not full) {
    searchKey = the search key of newItem
    i = hashIndex(searchKey)
    node = reference to a new node containing newItem
    node.setNext(table[i])
    table[i] = node
  }
  else // table full
    throw new HashException()

37 Open hashing (separate chaining)
tableRetrieve(searchKey)
  i = hashIndex(searchKey)
  node = table[i]
  while ((node != null) && (node.getItem().getKey().compareTo(searchKey) != 0))
    node = node.getNext()
  if (node != null)
    return node.getItem()
  else
    return null
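Putting the chaining slides together, a self-contained sketch might look like the following. It is not the course's HashTable class: to stay runnable on its own it uses String keys and values in place of KeyedItem, a private Node class in place of ChainNode, and hashCode() with Math.floorMod as an assumed hashIndex. It does not check for duplicate keys and never throws HashException.

```java
// Separate chaining: each table slot holds the head of a linked list.
class ChainedHashTable {
    // One node of a bucket's linked list (plays the role of ChainNode).
    private static class Node {
        final String key;
        final String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private static final int HASH_TABLE_SIZE = 101;  // number of buckets
    private final Node[] table = new Node[HASH_TABLE_SIZE];
    private int size = 0;                            // number of items stored

    // Assumed hash function: hashCode() reduced to a non-negative bucket index.
    private int hashIndex(String key) {
        return Math.floorMod(key.hashCode(), HASH_TABLE_SIZE);
    }

    // Inserts at the head of the chain, so insertion is O(1).
    void tableInsert(String key, String value) {
        int i = hashIndex(key);
        table[i] = new Node(key, value, table[i]);
        size++;
    }

    // Walks the chain for the key's bucket; returns null if the key is absent.
    String tableRetrieve(String key) {
        Node node = table[hashIndex(key)];
        while (node != null && !node.key.equals(key)) {
            node = node.next;
        }
        return (node != null) ? node.value : null;
    }

    // Unlinks the first node with this key; returns false if none is found.
    boolean tableDelete(String key) {
        int i = hashIndex(key);
        Node prev = null;
        Node node = table[i];
        while (node != null && !node.key.equals(key)) {
            prev = node;
            node = node.next;
        }
        if (node == null) {
            return false;
        }
        if (prev == null) {
            table[i] = node.next;   // removing the head of the chain
        } else {
            prev.next = node.next;  // bypassing a node further down the chain
        }
        size--;
        return true;
    }

    boolean tableIsEmpty() { return size == 0; }

    int tableLength() { return size; }
}
```

In Java the strings "Aa" and "BB" have the same hashCode(), so inserting both puts them on the same chain. That gives a quick way to exercise the collision path.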

38 Evaluation of Chaining
Disadvantages of chaining:
  More complex to implement.
  Search and delete are harder.
  We need to know: the number of elements in the table (N), the number of buckets (B), and the quality of the hash function.
  Worst case for searching is O(n).
Advantages of chaining:
  Insertion is easy and quick.
  Allows more records to be stored.
  The size of the table is dynamic.