308-203A Introduction to Computing II Lecture 11: Hashtables Fall Session 2000.


1 308-203A Introduction to Computing II Lecture 11: Hashtables Fall Session 2000

2 Dictionary: An abstract class that defines data structures supporting: void put(Object key, Object value); Object get(Object key); void remove(Object key)

3 Implementations for Dictionary? If we use an unsorted linked list: put: O(1), get: O(n), remove: O(n). Naïve solution: must search all possibilities
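The costs on this slide can be seen in a minimal sketch of a linked-list dictionary. The class name ListDictionary and its Node helper are hypothetical, and the sketch assumes that put simply prepends without checking for an existing key (which is what makes it O(1)); a newer entry then shadows an older one with the same key.

```java
// Sketch of a dictionary backed by an unsorted singly linked list.
// ListDictionary and Node are illustrative names, not from the lecture.
class ListDictionary {
    private static final class Node {
        final Object key;
        final Object value;
        Node next;

        Node(Object key, Object value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node head;

    // O(1): prepend without searching for an existing entry.
    void put(Object key, Object value) {
        head = new Node(key, value, head);
    }

    // O(n): walk the list until the key is found; the newest entry wins.
    Object get(Object key) {
        for (Node n = head; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    // O(n): unlink every node that carries this key.
    void remove(Object key) {
        while (head != null && head.key.equals(key)) {
            head = head.next;
        }
        Node n = head;
        while (n != null && n.next != null) {
            if (n.next.key.equals(key)) {
                n.next = n.next.next;
            } else {
                n = n.next;
            }
        }
    }
}
```

A put that first searched for the key would avoid shadowed duplicates, at the price of making put O(n) as well.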

4 Implementations for Dictionary? If we use a binary search tree (assume depth = d): put: O(d), get: O(d), remove: O(d), where d is about log n when the tree is balanced. Good, unless the tree is unbalanced…

5 Implementations for Dictionary? If we use a heap: put: O(log n), get: O(n), remove: O(n). Insert is easy, but finding arbitrary elements is hard…

6 Implementations for Dictionary? If we use a sorted array: put: O(n), get: O(log n), remove: O(n). Binary search is easy, but lots of copying is needed

7 Implementations for Dictionary? If we use an array with enough space for every possible key (not realistic): put: O(1), get: O(1), remove: O(1). All operations are quick and easy, but this requires enormous (i.e. infinite) memory

8 Hashtables We can try to patch this “perfect solution” so that it is feasible.

9 The “Perfect” Solution: If we had an array that was infinitely large and each key had its own slot, every access would be O(1) [and we would waste a lot of space on null pointers]. [Figure: an unbounded array with slots 1, 2, 3, 4, …, j-1, j, j+1, j+2, …; Key = 3 occupies slot 3 and Key = j occupies slot j.]

10 Hash Function Definition: A hash function is a function which maps keys to a finite range of integers, called hashcodes: f: keys → [0, m-1]

11 Example Let the keys be non-negative integers: { 0, 1, … } Let the hash function be f(x) = x mod 7 For the keys (4, 15, 26): f(4) = 4 f(15) = 1 f(26) = 5

12 Example (continued): With f(x) = x mod 7, the keys (4, 15, 26) fit in an array of size 7 with slots 0 to 6: slot 1 holds 15, slot 4 holds 4, slot 5 holds 26, and the other slots are empty.
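The mapping in this example can be checked with a few lines of Java. This is only a minimal sketch: the class name ModHashDemo and the helper place are made up for illustration, keys are assumed non-negative, and collisions are deliberately ignored at this point.

```java
import java.util.Arrays;

// Sketch: place keys into a 7-slot array using f(x) = x mod 7.
// ModHashDemo and place are illustrative names, not from the lecture.
class ModHashDemo {
    static final int M = 7; // number of slots

    // The hash function from the example; assumes key >= 0.
    static int hash(int key) {
        return key % M;
    }

    // Builds the table; null marks an empty slot. No collision handling.
    static Integer[] place(int... keys) {
        Integer[] table = new Integer[M];
        for (int key : keys) {
            table[hash(key)] = key;
        }
        return table;
    }

    public static void main(String[] args) {
        // prints [null, 15, null, null, 4, 26, null]
        System.out.println(Arrays.toString(place(4, 15, 26)));
    }
}
```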

13 Collisions Problem: Two or more keys may hash to the same slot. When that happens we have a collision.

14 Open-Addressing: A simple way to handle collisions. When a collision occurs, look for an empty slot elsewhere. Some elements may end up in the slot corresponding to a different hashcode

15 Linear Probing: Find an alternative slot after a collision by stepping sequentially through the slots. For example, with slot 1 holding 15, slot 4 holding 4 and slot 5 holding 26: Insert 18: f(18) = 18 mod 7 = 4. Collision in slot 4!

16 Linear Probing (continued): Insert 18: f(18) = 18 mod 7 = 4. Slot 5 is also taken

17 Linear Probing (continued): Insert 18: f(18) = 18 mod 7 = 4. Slot 6 is free, so 18 is stored in slot 6
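The probing steps above can be sketched as follows. LinearProbingTable is a hypothetical class written for this example; it assumes non-negative integer keys, wraps around at the end of the array, and supports no removal (removal in open addressing needs extra care that the slides do not cover).

```java
// Sketch of open addressing with linear probing for non-negative int keys.
// LinearProbingTable is an illustrative name, not from the lecture.
class LinearProbingTable {
    private final Integer[] slots; // null marks an empty slot

    LinearProbingTable(int m) {
        slots = new Integer[m];
    }

    // Returns the slot where the key was stored, or -1 if the table is full.
    int insert(int key) {
        int m = slots.length;
        int start = key % m;
        for (int i = 0; i < m; i++) {
            int slot = (start + i) % m; // step sequentially, wrapping around
            if (slots[slot] == null) {
                slots[slot] = key;
                return slot;
            }
        }
        return -1;
    }

    // Returns the slot holding the key, or -1 if it is absent.
    int find(int key) {
        int m = slots.length;
        int start = key % m;
        for (int i = 0; i < m; i++) {
            int slot = (start + i) % m;
            if (slots[slot] == null) {
                return -1; // an empty slot ends the probe sequence
            }
            if (slots[slot] == key) {
                return slot;
            }
        }
        return -1;
    }
}
```

Inserting 4, 15 and 26 into a table of size 7 and then inserting 18 reproduces the walk above: slot 4 is taken, slot 5 is taken, and 18 lands in slot 6.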

18 Disadvantages: In open-addressing, the table can fill up, so we must have n < m. Linear probing leads to “primary clustering”: a run of filled slots is more likely to receive further collisions. Although best-case access is O(1), worst-case access is O(m)

19 Chaining A (Better) Solution to Collisions: Use the flexibility of the linked-list, but only when needed, i.e. within a single slot where collisions may occur.

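20 Example (chaining): Insert 39 into the previous hashtable (slots 0 to 6, with slot 1 holding 15, slot 4 holding 4 and slot 5 holding 26).

21 Example (chaining): f(39) = 39 mod 7 = 4, a collision with key 4. Key 39 is added to the linked list in slot 4, which now holds both 4 and 39.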
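A minimal sketch of chaining, assuming non-negative integer keys and one java.util.LinkedList per slot. ChainedTable and its method names are hypothetical, and appending new keys to the end of the chain is an arbitrary choice made here, not something the slides specify.

```java
import java.util.LinkedList;

// Sketch of collision resolution by chaining.
// ChainedTable is an illustrative name, not from the lecture.
class ChainedTable {
    private final LinkedList<Integer>[] slots;

    @SuppressWarnings("unchecked")
    ChainedTable(int m) {
        slots = new LinkedList[m];
        for (int i = 0; i < m; i++) {
            slots[i] = new LinkedList<>();
        }
    }

    // Adds the key to the chain of its slot; collisions just lengthen the chain.
    void insert(int key) {
        slots[key % slots.length].add(key);
    }

    // Searches only the chain of the key's own slot.
    boolean contains(int key) {
        return slots[key % slots.length].contains(key);
    }

    // Number of keys currently stored in the given slot.
    int chainLength(int slot) {
        return slots[slot].size();
    }
}
```

After inserting 4, 15, 26 and 39 into a table of size 7, slot 4 holds a chain of length 2 while slots 1 and 5 each hold one key.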

22 Worst-Case: If all elements hash to the same entry, we get a single linked list. Therefore put, get and remove are O(n) worst-case.

23 Best-Case: Equal distribution of the elements across the slots 0 to 6

24 Best Case Definition: The load factor for a hashtable with n elements hashed into m slots is the average number of elements per slot: α = n / m

25 Best Case: If every slot contains α elements (uniformly distributed hashing): put, get and remove are O(α)

26 Best Case: If every slot contains α elements (uniformly distributed hashing): put, get and remove are O(α). If the number of slots is allowed to grow as O(n): α = n/m = n / O(n) = O(1), so put, get and remove are O(1)
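The last step can be written out a little more carefully. The constant c below is an assumption introduced for this derivation: it says the number of slots is kept at least proportional to the number of elements.

```latex
\alpha = \frac{n}{m}, \qquad m \ge c\,n \ \text{for some constant } c > 0
\;\Longrightarrow\;
\alpha = \frac{n}{m} \le \frac{n}{c\,n} = \frac{1}{c} = O(1)
```

For example, with n = 75 elements in m = 100 slots, α = 0.75, and it stays bounded as long as m grows along with n.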

27 Average-Case: A more realistic analysis involves determining the statistics of the data and how well it will be hashed. Example: hashing Olympic years by f(x) = x mod 4 would be a bad idea (they always hash to the same slot)
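The Olympic-years example can be tried out directly. This sketch counts how many keys fall into each slot; SlotCountDemo and slotCounts are hypothetical names, and the four sample years are simply multiples of 4 chosen for illustration.

```java
// Sketch: count how many keys land in each slot under f(x) = x mod m.
// SlotCountDemo and slotCounts are illustrative names, not from the lecture.
class SlotCountDemo {
    // Assumes non-negative keys and m > 0.
    static int[] slotCounts(int[] keys, int m) {
        int[] counts = new int[m];
        for (int key : keys) {
            counts[key % m]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] years = {1988, 1992, 1996, 2000};
        // With m = 4 every year lands in slot 0.
        System.out.println(java.util.Arrays.toString(slotCounts(years, 4)));
        // With m = 7 the same years land in four different slots.
        System.out.println(java.util.Arrays.toString(slotCounts(years, 7)));
    }
}
```

The point is that the quality of a hash function depends on the keys it is given: x mod 4 is poor for these keys, while x mod 7 happens to spread them out.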

28 Java Hashtable Class: Constructor: Hashtable(int initialCapacity, float loadFactor). Default: initialCapacity = 101, loadFactor = 0.75f (current JDK documentation lists a default capacity of 11). Collision resolution with chaining

29 Java Hashtable Class: Keys can be objects of any class provided the following are appropriately defined: hashCode(), defined in java.lang.Object; equals(), assumed defined for the entries
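As an illustration of what “appropriately defined” means, here is a minimal sketch of a key class that overrides both methods consistently, so that two equal keys always produce the same hashcode. StudentId and its fields are hypothetical and exist only for this example.

```java
import java.util.Hashtable;
import java.util.Objects;

// Sketch of a key class usable with java.util.Hashtable.
// StudentId and its fields are illustrative, not from the lecture.
class StudentId {
    private final String faculty;
    private final int number;

    StudentId(String faculty, int number) {
        this.faculty = faculty;
        this.number = number;
    }

    // Two ids are equal when both fields match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof StudentId)) {
            return false;
        }
        StudentId that = (StudentId) other;
        return number == that.number && faculty.equals(that.faculty);
    }

    // Must agree with equals: equal ids yield the same hashcode.
    @Override
    public int hashCode() {
        return Objects.hash(faculty, number);
    }

    public static void main(String[] args) {
        Hashtable<StudentId, String> names = new Hashtable<>();
        names.put(new StudentId("sci", 42), "Ada");
        // A distinct but equal key object finds the entry.
        System.out.println(names.get(new StudentId("sci", 42)));
    }
}
```

Without the hashCode override, the second StudentId object would inherit the default from java.lang.Object and would most likely be looked up in the wrong slot.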

30 Java Hashtable Class: Hashtables grow multiplicatively: put() checks if the hashtable contains more than (m α) elements and if so m ← 2m+1. Hashtables only grow, never shrink, no matter how many elements you delete
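The growth rule as stated on this slide can be simulated in a few lines. This sketch follows the slide's wording (grow when the element count exceeds m α) and is not taken from the JDK source; GrowthDemo and capacityAfter are hypothetical names.

```java
// Sketch: simulate the capacity of a table that grows by m <- 2m + 1
// whenever the element count exceeds m * loadFactor.
// GrowthDemo and capacityAfter are illustrative names, not from the lecture.
class GrowthDemo {
    // Capacity after inserting n elements one at a time.
    static int capacityAfter(int n, int initialCapacity, float loadFactor) {
        int m = initialCapacity;
        for (int count = 1; count <= n; count++) {
            if (count > m * loadFactor) {
                m = 2 * m + 1; // grow multiplicatively
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Starting from m = 101 and a load factor of 0.75:
        System.out.println(capacityAfter(75, 101, 0.75f));  // prints 101
        System.out.println(capacityAfter(76, 101, 0.75f));  // prints 203
        System.out.println(capacityAfter(153, 101, 0.75f)); // prints 407
    }
}
```

Because the capacity roughly doubles each time, m stays proportional to n, which keeps the load factor bounded.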

31 Java Hashtable Class Other Features: elements() returns an enumeration of everything in the table. This works by keeping references into the table rather than by copying the table itself.
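A short usage sketch of elements(), assuming a table that maps String keys to Integer values. ElementsDemo and sumValues are hypothetical names; Hashtable.elements(), hasMoreElements() and nextElement() are the standard java.util calls.

```java
import java.util.Enumeration;
import java.util.Hashtable;

// Sketch: walk over every value in a Hashtable through its Enumeration.
// ElementsDemo and sumValues are illustrative names, not from the lecture.
class ElementsDemo {
    static int sumValues(Hashtable<String, Integer> table) {
        int sum = 0;
        Enumeration<Integer> values = table.elements();
        while (values.hasMoreElements()) {
            sum += values.nextElement();
        }
        return sum;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> marks = new Hashtable<>();
        marks.put("assignment", 30);
        marks.put("midterm", 25);
        marks.put("final", 40);
        System.out.println(sumValues(marks)); // prints 95
    }
}
```

Since the enumeration refers into the table rather than to a copy, it is safest not to modify the table while walking through it.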

32 Any questions?

