Published by Gloria Alexander. Modified over 9 years ago.
1
CS 171: Introduction to Computer Science II Hashing and Priority Queues
2
Prepare to be aMAZEd! Find a path through a maze Open path Wall path, just use the walls Hand-in on Gimle Tuesday March 31 at 11:59pm.
3
A bag is a collection where removing items is not supported (a.k.a. multi-set) Useful if you don’t need removal Provides a client the ability to collect items and then to iterate through the collected items.
4
An abstract data type like a regular queue, but each element is associated with a priority value
The element with the highest priority is removed first
▪ Queue: first in, first out
▪ Stack: last in, first out
▪ Priority queue: highest priority first out
Priority queue applications
▪ Task scheduling
▪ Search and optimization
5
Implementing Priority Queue
Assumes an element with a smaller value has higher priority
Implementation using ordered array
▪ Insert: insert an element into the correct position
▪ Remove: delete the front element
Implementation using unordered array
▪ Insert?
▪ Remove?
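The ordered-array variant above could be sketched as follows. This is a minimal illustration, not the course's implementation: the class name `OrderedArrayPQ` and its methods are hypothetical, and it assumes `int` elements where a smaller value means higher priority. The array is kept in descending order so that the "front" (smallest) element sits at the end and can be removed without shifting.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch of a priority queue backed by an ordered array.
// Assumption: smaller value = higher priority.
class OrderedArrayPQ {
    private final int[] items; // sorted in descending order; smallest is last
    private int size = 0;

    OrderedArrayPQ(int capacity) {
        items = new int[capacity];
    }

    // O(N): shift smaller elements right to open the correct slot.
    void insert(int value) {
        if (size == items.length) {
            throw new IllegalStateException("queue is full");
        }
        int i = size - 1;
        while (i >= 0 && items[i] < value) {
            items[i + 1] = items[i];
            i--;
        }
        items[i + 1] = value;
        size++;
    }

    // O(1): the highest-priority (smallest) element is at the end.
    int remove() {
        if (size == 0) {
            throw new NoSuchElementException("queue is empty");
        }
        return items[--size];
    }

    boolean isEmpty() {
        return size == 0;
    }
}
```

With an unordered array the costs flip: insert can append in O(1), while remove has to scan all N elements for the smallest one.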
6
Priority Queue vs. Ordered Array
So a priority queue looks much the same as an ordered array. What are the differences?
– You can only remove elements at the front, one by one. You can’t remove arbitrary elements.
– A priority queue only needs to be able to return the highest-priority element. There are efficient implementations (such as heaps) that do not require all elements to be sorted at all times.
7
Using Priority Queues
Example:
PriorityQueue s = new PriorityQueue(10);
s.insert(25);
s.insert(35);
System.out.println(s.remove()); // prints 25, assuming smaller values have higher priority
s.insert(45);
s.insert(15);
System.out.println(s.remove()); // prints 15
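The `PriorityQueue` on the slide appears to be a course-provided class with `insert` and `remove`. As a runnable stand-in, the same sequence can be written against the standard library's `java.util.PriorityQueue`, which uses `add` and `poll` instead and removes the smallest element first by default. The class name `PriorityQueueDemo` is made up for this sketch.

```java
import java.util.PriorityQueue;

// Sketch: the slide's sequence using java.util.PriorityQueue (a min-first heap).
class PriorityQueueDemo {
    // Returns the two removed values, in removal order.
    static int[] run() {
        PriorityQueue<Integer> s = new PriorityQueue<>(10); // 10 = initial capacity
        s.add(25);
        s.add(35);
        int first = s.poll();   // smallest of {25, 35}
        s.add(45);
        s.add(15);
        int second = s.poll();  // smallest of {35, 45, 15}
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] removed = run();
        System.out.println(removed[0]); // 25
        System.out.println(removed[1]); // 15
    }
}
```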
8
Set membership: Have I seen “x” before?
One of the most common problems in practice
Same interface as BST:
search(“x”)
▪ Either returns the data or an error
insert(“y”)
▪ Inserts into the data structure if not there
▪ Up to you what to do with duplicates…
We don’t need to preserve order
9
A BST takes O(log N) time on average to search.
But sometimes we have ample space, much more than N, and really want to answer searches in nearly O(1) time.
▪ It’s okay if insertions take longer
Also, we sometimes have long and crazy keys
▪ E-mail addresses
▪ Full names of Icelanders
▪ URLs
▪ UUIDs (universally unique identifiers)
How do we do that?
10
Suppose we have an array arr of length M
For now, suppose all of our N keys are tiny
▪ In fact, they’re distinct integers in the range [0, M-1]
What could we do to quickly look up whether we have a given key?
11
Aha! We can use the key as an index into arr
When somebody searches for key i
▪ We look in arr[i]
▪ If there is nothing there, we return a Not Found error
▪ Otherwise, we return the value in arr[i]
So that’s easy. But what if keys are long?
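The key-as-index idea could look like the sketch below. The class name `DirectAddressTable` is hypothetical, values are assumed to be strings, and the "Not Found error" is modelled here as a `NoSuchElementException`, which is one possible choice among several.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch: keys are integers in [0, M-1] and index arr directly.
class DirectAddressTable {
    private final String[] arr; // arr[i] holds the value for key i, or null

    DirectAddressTable(int m) {
        arr = new String[m];
    }

    void insert(int key, String value) {
        arr[key] = value;
    }

    // O(1): one array access, no comparisons between keys.
    String search(int key) {
        if (arr[key] == null) {
            throw new NoSuchElementException("Not Found: " + key);
        }
        return arr[key];
    }
}
```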
12
Can we “convert” each key into a short, distinct number?
Needs to be fast
[Figure: the keys “Barbie” and “Turtles” pass through a “Magic!” step and are mapped to array slots 0 to 4]
13
Can we “convert” each key into a short number?
Needs to be fast
[Figure: the keys “Barbie” and “Turtles” are mapped to array slots 0 to 4]
14
A hash function converts any given key into a number in the range [0, M-1].
It should be consistent
▪ The same key should always return the same number
Moreover, we want it to distribute keys evenly across the possible numbers!
▪ That helps them be distinct
15
Suppose all the keys were large integers
What would be a good hash function? num % M
Caveat: Ideally M should be prime
[Figure: large integer keys such as 31337 and 3141592 are mapped by num % M to array slots 0 to 4]
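A minimal sketch of the modulo hash for integer keys follows. The class and method names are made up for illustration. It uses `Math.floorMod` rather than `%` because Java's `%` can return a negative result for negative keys, which would not be a valid array index.

```java
// Hypothetical sketch: hash a (possibly large) integer key into [0, M-1].
class IntHash {
    static int hash(long num, int m) {
        // floorMod always returns a value in [0, m-1] for positive m.
        return (int) Math.floorMod(num, (long) m);
    }

    public static void main(String[] args) {
        int m = 5; // table size; prime, as the slide recommends
        System.out.println(hash(31337L, m));   // 2
        System.out.println(hash(3141592L, m)); // 2
    }
}
```

With M = 5, the two sample keys both land in slot 2, which is exactly the collision problem the later slides deal with.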
16
What if keys are general strings?
“Flugeldufel”, “Ausgeschnitzel”, …
We could loop through the string, add up its character values into a large integer, and then apply the modulo operation
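The add-up-then-modulo idea might be sketched like this. The names `StringHash` and `hash` are hypothetical, and character values are taken to be Java `char` codes.

```java
// Hypothetical sketch: sum the character codes, then reduce modulo M.
class StringHash {
    static int hash(String key, int m) {
        long sum = 0;
        for (int i = 0; i < key.length(); i++) {
            sum += key.charAt(i); // char widens to its numeric code
        }
        return (int) (sum % m);   // sum is non-negative, so % is safe here
    }
}
```

One weakness of a plain sum is that it ignores character order, so anagrams such as "ab" and "ba" always share a slot. A common refinement is to multiply the running total by a constant before adding each character, which is what Java's own `String.hashCode` does with the constant 31.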
17
But what if different keys produce the same hash value?
18
23 people in a room
How likely is it that two of them share the same birthday?
Answer: roughly 50.7%!
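The 50.7% figure can be checked numerically. The sketch below assumes 365 equally likely birthdays and no leap days. It computes the probability that all n birthdays are distinct and subtracts that from 1. The class and method names are made up for this example.

```java
// Hypothetical sketch: probability that at least two of n people share a
// birthday, assuming `days` equally likely birthdays.
class Birthday {
    static double sharedProbability(int n, int days) {
        double allDistinct = 1.0;
        for (int k = 0; k < n; k++) {
            // Person k+1 must avoid the k birthdays already taken.
            allDistinct *= (double) (days - k) / days;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        System.out.printf("%.3f%n", sharedProbability(23, 365));
    }
}
```

In hashing terms, the people are keys and the days are table slots: 23 keys in a table of 365 slots are already more likely than not to collide.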
19
Birthday paradox: Can’t avoid collisions unless you have a ridiculous amount of memory
How many collisions do we expect to see?
20
Need to deal with these hash collisions Several ways to deal with them Hashing with separate chaining:
21
Put keys that collide into a list associated with an index
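A minimal sketch of separate chaining for string keys is shown below. The class name `ChainedHashSet` is hypothetical, the table size is fixed at construction, and Java's built-in `String.hashCode` stands in for the hash function. Duplicates are silently ignored, which is one of several reasonable policies.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch: each table index owns a list (chain) of colliding keys.
class ChainedHashSet {
    private final List<LinkedList<String>> table = new ArrayList<>();
    private final int m;

    ChainedHashSet(int m) {
        this.m = m;
        for (int i = 0; i < m; i++) {
            table.add(new LinkedList<>());
        }
    }

    private int index(String key) {
        return Math.floorMod(key.hashCode(), m); // always in [0, m-1]
    }

    void insert(String key) {
        LinkedList<String> chain = table.get(index(key));
        if (!chain.contains(key)) { // skip duplicates
            chain.add(key);
        }
    }

    boolean search(String key) {
        // Only the one chain for this key's index is scanned.
        return table.get(index(key)).contains(key);
    }
}
```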
24
Hashing with linear probing
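In linear probing, a key that hashes to an occupied slot is placed in the next free slot, wrapping around at the end of the array. A search follows the same path until it finds the key or hits an empty slot. The sketch below is illustrative only: the class name `LinearProbingSet` is made up, the table has a fixed size, deletion is not supported, and `String.hashCode` stands in for the hash function.

```java
// Hypothetical sketch: open addressing with linear probing, fixed table size.
class LinearProbingSet {
    private final String[] slots; // null marks an empty slot

    LinearProbingSet(int m) {
        slots = new String[m];
    }

    private int index(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    void insert(String key) {
        int i = index(key);
        int probes = 0;
        while (slots[i] != null) {
            if (slots[i].equals(key)) {
                return; // already present
            }
            if (++probes == slots.length) {
                throw new IllegalStateException("table is full");
            }
            i = (i + 1) % slots.length; // step to the next slot, wrapping around
        }
        slots[i] = key;
    }

    boolean search(String key) {
        int i = index(key);
        // Stop at the first empty slot, or after checking every slot once.
        for (int probes = 0; probes < slots.length && slots[i] != null; probes++) {
            if (slots[i].equals(key)) {
                return true;
            }
            i = (i + 1) % slots.length;
        }
        return false;
    }
}
```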
31
Average case lookup (without resize): O(1 + N/M), where N/M is the average number of keys per slot
Worst case lookup: O(N)
If the hash table is close to filling up, we could resize it and rehash every element
▪ Usually one would double its size, or grow it by x%
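Resizing means more than copying the array: because the slot of a key depends on M, every key has to be hashed again against the new size. The sketch below shows this for a chained table of strings. All names (`Rehash`, `emptyTable`, `index`, `doubled`) are hypothetical, and doubling is just one of the growth policies mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: grow a chained hash table by doubling and rehashing.
class Rehash {
    static List<List<String>> emptyTable(int m) {
        List<List<String>> table = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            table.add(new ArrayList<>());
        }
        return table;
    }

    static int index(String key, int m) {
        return Math.floorMod(key.hashCode(), m);
    }

    // Builds a table of twice the size and re-inserts every key, O(N + M).
    static List<List<String>> doubled(List<List<String>> old) {
        int newM = old.size() * 2;
        List<List<String>> bigger = emptyTable(newM);
        for (List<String> chain : old) {
            for (String key : chain) {
                // The slot must be recomputed because it depends on the table size.
                bigger.get(index(key, newM)).add(key);
            }
        }
        return bigger;
    }
}
```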
36
Hashing
In practice, most often you’ll want to use hashing
▪ Best to allocate about 150% of the space you’ll need to the table
Fast search, O(1) average case (unless close to full)
Dead simple to use for standard data types

Binary search trees
Don’t require building efficient uniform hash functions for your data
Don’t need to worry about fullness of the table or resizing
Guaranteed worst-case performance (“red-black” BSTs)