Published by Donald Mason. Modified over 9 years ago.
1
308-203A Introduction to Computing II Lecture 11: Hashtables Fall Session 2000
2
Dictionary
An abstract class that defines data structures supporting:
void put(Object key, Object value)
Object get(Object key)
void remove(Object key)
3
Implementations for Dictionary?
If we use an unsorted linked list:
put: O(1)
get: O(n)
remove: O(n)
Naïve solution: must search all possibilities
4
Implementations for Dictionary?
If we use a binary search tree (assume depth = d):
put: O(d)
get: O(d)
remove: O(d)
Good when the tree is balanced (d is about log n), but an unbalanced tree can have d as large as n…
5
Implementations for Dictionary?
If we use a heap:
put: O(log n)
get: O(n)
remove: O(n)
Insert is easy, but finding arbitrary elements is hard…
6
Implementations for Dictionary?
If we use a sorted array:
put: O(n)
get: O(log n)
remove: O(n)
Binary search is easy, but lots of copying is needed
7
Implementations for Dictionary?
If we use an array with enough space for every possible key (not realistic):
put: O(1)
get: O(1)
remove: O(1)
All operations are quick and easy, but this requires enormous (i.e. infinite) memory
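The one-slot-per-key idea can be sketched as follows, assuming the keys are integers in a small known range 0..maxKey so that the array stays finite. The class name DirectAddressTable and its methods are illustrative and do not come from the lecture.

```java
// Hypothetical illustration: one array slot for every possible key.
class DirectAddressTable {
    private final Object[] slots;

    DirectAddressTable(int maxKey) {
        slots = new Object[maxKey + 1]; // one slot for each key in 0..maxKey
    }

    void put(int key, Object value) { slots[key] = value; } // O(1)

    Object get(int key) { return slots[key]; }              // O(1); null if absent

    void remove(int key) { slots[key] = null; }             // O(1)
}
```

Every operation is a single array access, but the array must be as large as the key range, which is what makes the approach impractical for general keys.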
8
Hashtables We can try to patch this “perfect solution” so that it is feasible.
9
The “Perfect” Solution
If we had an array that was infinitely large and each key had its own slot, every access would be O(1) [and we would waste a lot of space on null pointers]
[Figure: an unbounded array with slots 1, 2, 3, 4, …, j-1, j, j+1, j+2, …; Key = 3 sits in slot 3 and Key = j in slot j]
10
Hash Function
Definition: A hash function is a function that maps keys to a finite range of integers, called hashcodes:
f: keys → [0, m-1]
11
Example
Let the keys be non-negative integers: { 0, 1, … }
Let the hash function be f(x) = x mod 7
For the keys (4, 15, 26):
f(4) = 4
f(15) = 1
f(26) = 5
12
Example (continued)
The keys (4, 15, 26), hashed with f(x) = x mod 7, fit in an array of size 7:
Slot:  0   1   2   3   4   5   6
Key:   -   15  -   -   4   26  -
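The example can be reproduced with a few lines of Java. This is a minimal sketch that assumes non-negative int keys; the class name ModHash and the helper hash are illustrative names, not part of the lecture.

```java
import java.util.Arrays;

class ModHash {
    // f(x) = x mod m, for a non-negative key x
    static int hash(int key, int m) {
        return key % m;
    }

    public static void main(String[] args) {
        int m = 7;
        Integer[] table = new Integer[m]; // null marks an empty slot
        for (int key : new int[] {4, 15, 26}) {
            table[hash(key, m)] = key;
        }
        System.out.println(Arrays.toString(table));
        // prints [null, 15, null, null, 4, 26, null]
    }
}
```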
13
Collisions
Problem: Two or more keys may hash to the same slot. This is called a collision.
14
Open-Addressing
A simple way to handle collisions
When a collision occurs, look for an empty slot elsewhere
Some elements may end up in the slot corresponding to a different hashcode
15
Linear Probing
Find an alternative slot after a collision by stepping sequentially through the slots, for example:
Slot:  0   1   2   3   4   5   6
Key:   -   15  -   -   4   26  -
Insert 18: f(18) = 18 mod 7 = 4
Collision in slot 4!
16
Linear Probing (continued)
Insert 18: f(18) = 18 mod 7 = 4
Slot 5 is also taken (by 26)
17
Linear Probing (continued)
Insert 18: f(18) = 18 mod 7 = 4
Slot 6 is free, so 18 is stored there
Slot:  0   1   2   3   4   5   6
Key:   -   15  -   -   4   26  18
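The probing sequence in this example can be sketched in Java as follows. It is a minimal illustration under stated assumptions: keys are non-negative ints, the hash function is key mod m, duplicates are not checked, and the class name LinearProbingTable is hypothetical.

```java
class LinearProbingTable {
    private final Integer[] slots; // null marks an empty slot

    LinearProbingTable(int m) {
        slots = new Integer[m];
    }

    // Inserts a non-negative key and returns the slot where it ended up.
    int insert(int key) {
        int m = slots.length;
        int home = key % m;
        for (int i = 0; i < m; i++) {
            int slot = (home + i) % m; // step sequentially, wrapping around
            if (slots[slot] == null) {
                slots[slot] = key;
                return slot;
            }
        }
        throw new IllegalStateException("table is full");
    }
}
```

With m = 7, inserting 4, 15 and 26 places them in slots 4, 1 and 5. Inserting 18 then probes slots 4 and 5, finds both taken, and returns 6.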
18
Disadvantages
In open addressing, the table can fill up: must have n < m
Linear probing leads to “primary clustering”: a run of filled slots is more likely to receive further collisions
Although best-case access is O(1), worst-case access is O(m)
19
Chaining
A (better) solution to collisions: use the flexibility of the linked list, but only when needed, i.e. within a single slot where collisions may occur.
20
Example (chaining)
Insert 39 into the previous hashtable:
Slot:  0   1   2   3   4   5   6
Key:   -   15  -   -   4   26  -
21
Example (chaining)
f(39) = 39 mod 7 = 4: collision with key 4, so 39 is chained in slot 4
Slot:  0   1   2   3   4         5   6
Key:   -   15  -   -   4 -> 39   26  -
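A minimal sketch of chaining, assuming non-negative int keys, the hash function key mod m, and java.util.LinkedList for the per-slot chains. The class name ChainedTable and its methods are illustrative, and new keys are appended to the end of a chain, which is only one of several reasonable choices.

```java
import java.util.LinkedList;

class ChainedTable {
    private final LinkedList<Integer>[] slots; // one chain per slot

    @SuppressWarnings("unchecked")
    ChainedTable(int m) {
        slots = new LinkedList[m];
        for (int i = 0; i < m; i++) {
            slots[i] = new LinkedList<>();
        }
    }

    // Appends the key to the chain of its home slot.
    void insert(int key) {
        slots[key % slots.length].add(key);
    }

    // Searches only the chain of the key's home slot.
    boolean contains(int key) {
        return slots[key % slots.length].contains(key);
    }

    int chainLength(int slot) {
        return slots[slot].size();
    }
}
```

After inserting 4, 15, 26 and 39 into a table with 7 slots, the chain in slot 4 holds two keys (4 and 39) and every other occupied slot holds one.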
22
Worst-Case
If all elements hash to the same slot, the table degenerates into a single linked list.
Therefore put, get and remove are O(n) in the worst case.
23
Best-Case
Equal distribution to each slot
[Figure: a table with slots 0 to 6, the elements spread equally across the slots]
24
Best Case
Definition: The load factor α for a hashtable with n elements hashed into m slots is the average number of elements per slot:
α = n / m
25
Best Case
If every slot contains α = n / m elements (uniformly distributed hashing):
put, get and remove are O(α)
26
Best Case (continued)
If every slot contains α = n / m elements (uniformly distributed hashing), put, get and remove are O(α)
If the number of slots is allowed to grow in proportion to n, i.e. m = Θ(n):
α = n / m = n / Θ(n) = O(1)
put, get and remove are O(1)
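The effect of letting the number of slots grow with the number of elements can be checked numerically. The sketch below uses an illustrative class LoadFactor (not from the lecture) and doubles m every time n doubles, so the load factor n / m stays constant.

```java
class LoadFactor {
    // Load factor: average number of elements per slot, n / m.
    static double loadFactor(int n, int m) {
        return (double) n / m;
    }

    public static void main(String[] args) {
        // m grows in proportion to n, so the ratio never changes.
        for (int n = 100, m = 50; n <= 800; n *= 2, m *= 2) {
            System.out.println("n=" + n + " m=" + m + " load=" + loadFactor(n, m));
        }
    }
}
```

Each line printed reports a load of 2.0, which is the sense in which the expected chain length, and hence the cost of put, get and remove, stays O(1).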
27
Average-Case
A more realistic analysis involves determining the statistics of the data and how well it will be hashed.
Example: hashing Summer Olympic years by f(x) = x mod 4 would be a bad idea: every such year is a multiple of 4, so all keys hash to slot 0
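A small experiment makes the Olympic-years example concrete. The helper below is an illustrative sketch (the class OlympicHash and the sample years are assumptions chosen for the demonstration); it counts how many keys land in each slot under f(x) = x mod m.

```java
class OlympicHash {
    // Returns, for each slot 0..m-1, how many keys hash to it under x mod m.
    static int[] slotCounts(int[] keys, int m) {
        int[] counts = new int[m];
        for (int key : keys) {
            counts[key % m]++;
        }
        return counts;
    }
}
```

For the years 1988, 1992, 1996 and 2000, calling slotCounts with m = 4 puts all four keys in slot 0, while m = 7 spreads them over slots 0, 4, 1 and 5.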
28
Java Hashtable Class
Constructor: Hashtable(int initialCapacity, float loadFactor)
Default: initialCapacity = 101, loadFactor = 0.75f (current JDK releases default to an initial capacity of 11)
Collision resolution with chaining
29
Java Hashtable Class
Keys can be objects of any class, provided the following are appropriately defined:
hashCode(): defined in java.lang.Object
equals(): assumed defined for the entries
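As an illustration of a key class that defines both methods consistently, here is a minimal sketch. The class Point is a made-up example, and java.util.Objects.hash is just one convenient way to combine fields into a hashcode.

```java
import java.util.Hashtable;
import java.util.Objects;

// Hypothetical key class: two points are equal when their coordinates match.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    // Equal points must return equal hashcodes, or lookups will miss.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class PointDemo {
    public static void main(String[] args) {
        Hashtable<Point, String> table = new Hashtable<>();
        table.put(new Point(1, 2), "a");
        // A different but equal Point object finds the same entry.
        System.out.println(table.get(new Point(1, 2))); // prints a
    }
}
```

If hashCode() were left at the java.lang.Object default while equals() compared coordinates, the second Point would usually hash to a different slot and the lookup would return null.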
30
Java Hashtable Class
Hashtables grow multiplicatively: put() checks whether the hashtable contains more than α·m elements (α = load factor, m = number of slots) and, if so, grows it: m → 2m + 1
Hashtables only grow, never shrink, no matter how many elements you delete
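The growth rule can be traced with a short sketch. It follows the rule as stated on this slide rather than the library source, and the class Growth and its methods are illustrative names.

```java
class Growth {
    // Capacities visited when a table with m slots grows `steps` times by m -> 2m + 1.
    static int[] capacities(int m, int steps) {
        int[] result = new int[steps + 1];
        result[0] = m;
        for (int i = 1; i <= steps; i++) {
            m = 2 * m + 1;
            result[i] = m;
        }
        return result;
    }

    // The slide's growth test: more than loadFactor * m elements in the table.
    static boolean needsGrowth(int n, int m, float loadFactor) {
        return n > m * loadFactor;
    }
}
```

Starting from 101 slots, three growth steps give capacities 101, 203, 407 and 815. With a load factor of 0.75, a table of 101 slots is due to grow once it holds 76 elements, since 0.75 × 101 = 75.75.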
31
Java Hashtable Class Other Features: elements() returns an enumeration of everything in the table. This works by keeping references into the table rather than by copying the table itself.
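A short usage sketch of elements() together with the two-argument constructor, using only java.util.Hashtable and java.util.Enumeration. The class ElementsDemo and the helper sumValues are illustrative names.

```java
import java.util.Enumeration;
import java.util.Hashtable;

class ElementsDemo {
    // Walks the enumeration returned by elements() and adds up the values.
    static int sumValues(Hashtable<String, Integer> table) {
        int sum = 0;
        Enumeration<Integer> values = table.elements();
        while (values.hasMoreElements()) {
            sum += values.nextElement();
        }
        return sum;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>(101, 0.75f);
        table.put("a", 1);
        table.put("b", 2);
        table.put("c", 3);
        System.out.println(sumValues(table)); // prints 6
    }
}
```

The order in which elements() visits the values depends on the hashcodes and the table size, so code using it should not rely on any particular order.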
32
Any questions?