
1 CSC317 Selection problem: Randomized-Select(A, p, r, i)
[Figure: array A[p..r] partitioned around the pivot at index q; elements < pivot to its left, elements > pivot to its right]

Randomized-Select(A, p, r, i)
    if p == r                                       // base case
        return A[p]
    q = Randomized-Partition(A, p, r)
    k = q - p + 1                                   // number of elements from left up to pivot
    if i == k                                       // pivot is the ith smallest
        return A[q]
    elseif i < k
        return Randomized-Select(A, p, q-1, i)      // ith smallest is on the left
    else
        return Randomized-Select(A, q+1, r, i-k)    // on the right

Example: A = [ ], find the 3rd smallest value. CSC317
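A minimal runnable Python sketch of the same procedure, with a Lomuto-style Randomized-Partition included for completeness (the function names and example list are illustrative, not from the slides):

import random

def randomized_partition(A, p, r):
    """Partition A[p..r] around a randomly chosen pivot; return the pivot's final index."""
    s = random.randint(p, r)
    A[s], A[r] = A[r], A[s]              # move the random pivot to the end
    pivot = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_select(A, p, r, i):
    """Return the ith smallest element (1-indexed) of A[p..r]."""
    if p == r:                           # base case: one element left
        return A[p]
    q = randomized_partition(A, p, r)
    k = q - p + 1                        # number of elements from p up to and including the pivot
    if i == k:                           # pivot is the ith smallest
        return A[q]
    elif i < k:
        return randomized_select(A, p, q - 1, i)      # ith smallest lies left of the pivot
    else:
        return randomized_select(A, q + 1, r, i - k)  # it lies to the right; adjust the rank

# Illustrative example: the 3rd smallest value of this list is 5
print(randomized_select([9, 2, 7, 5, 12, 1], 0, 5, 3))   # -> 5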

2 CSC317 Selection problem
How is this different from the randomized version of Quicksort? Answer: only one recursive call (on the left or the right side), not two. CSC317

3 CSC317 Selection problem
Analysis: Worst case: when we always partition around the largest remaining element, we recurse on an array of size n-1, and each partition step takes Θ(n), so the total is Θ(n²). Worse than a good sorting scheme (Θ(n log n)). CSC317
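The slide leaves the worst-case recurrence implicit; a standard unrolling (a sketch, not spelled out on the slide) is:

T(n) = T(n-1) + \Theta(n) = \Theta\!\Bigl(\sum_{j=1}^{n} j\Bigr) = \Theta(n^2)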

4 CSC317 Selection problem
Analysis: However, a 1/10 and 9/10 partitioning scheme isn't bad … Remember the master theorem (see the recurrence sketched below): therefore the run time is Θ(n). The average solution isn't bad either (Θ(n); proof similar to Quicksort). CSC317
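The formulas the slide refers to did not survive the transcript; a hedged reconstruction of the standard 1/10–9/10 analysis:

T(n) \le T(9n/10) + \Theta(n), \qquad a = 1,\; b = \tfrac{10}{9},\; n^{\log_b a} = n^0 = 1,\; f(n) = \Theta(n)

so case 3 of the master theorem gives T(n) = \Theta(n).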

5 We’ve been talking a lot about efficiency
in computing and run time … but we haven't said anything yet about data structures. CSC317

6 Dynamic sets … CSC317 Set size changes over time
Elements could have identifying keys, and could also have satellite data. Example: the key corresponds to a friend's name, with satellite data such as phone number, favorite hobbies, etc. CSC317

7 Dynamic sets CSC317 What about operations on dynamic sets …?
Search, Insert, Delete, Min/Max, Successor/Predecessor. Which data structure? Depends on what you want to do … (hash table, stack, queue, linked list, trees, ...) CSC317

8 Data structures CSC317 Hash table: Insert, Delete, Search/lookup.
We don't maintain order information. Applications? (Later …) We'll see that the operations are on average O(1). CSC317

9 Data structures CSC317 Stack: last-in-first-out. Insert = push,
Delete = pop. Applications? Run time of push and pop? O(1). But limited operations … (e.g., if you want to search, it's not efficient). CSC317
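A minimal Python sketch of these stack operations, using a built-in list as the underlying storage (the class name and values are illustrative):

class Stack:
    """Last-in-first-out container backed by a Python list."""
    def __init__(self):
        self._items = []

    def push(self, x):           # insert: O(1) amortized
        self._items.append(x)

    def pop(self):               # delete: O(1); removes the most recently pushed item
        return self._items.pop()

s = Stack()
s.push(1); s.push(2); s.push(3)
print(s.pop())                   # -> 3 (last in, first out)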

10 CSC317 Data structures Queue: first-in-first-out. Insert = enqueue,
Delete = dequeue. Run time for enqueue/dequeue: O(1). Fast, but limited operations. CSC317
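A minimal sketch using collections.deque, which supports O(1) appends and pops at both ends (the values are illustrative):

from collections import deque

queue = deque()
queue.append("a")            # enqueue at the back: O(1)
queue.append("b")
queue.append("c")
print(queue.popleft())       # dequeue from the front: O(1) -> "a" (first in, first out)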

11 CSC317 Data structures Linked List: Search, Insert, Delete
Linked lists (example of a doubly linked list). Run time? • Search O(n) [a limitation if there are lots of searches] • Insert O(1) • Delete O(1) [unless we first have to search for the key] CSC317
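A minimal sketch of a doubly linked list with these operations (class and method names are illustrative, not from the slides):

class Node:
    """Node of a doubly linked list: a key plus prev/next pointers."""
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None

class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def insert(self, node):          # O(1): splice in at the front
        node.next = self.head
        node.prev = None
        if self.head is not None:
            self.head.prev = node
        self.head = node

    def search(self, key):           # O(n): walk the list until the key is found
        x = self.head
        while x is not None and x.key != key:
            x = x.next
        return x

    def delete(self, node):          # O(1) once we already hold a reference to the node
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev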

12 CSC317 Data structures Binary tree: Search, Min/Max,
Predecessor/Successor, Insert/Delete. Later; the basic operations take time proportional to the height of the tree, which for a complete binary tree is Θ(log n). CSC317
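The slides defer binary trees to later; as a preview, a minimal sketch of search in a binary search tree, whose cost is proportional to the tree's height (the names and the tiny example tree are illustrative):

class TreeNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def tree_search(node, key):
    """Walk down from the root; at each node go left or right, so cost = tree height."""
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node

# Illustrative tree:   5
#                     / \
#                    3   8
root = TreeNode(5, TreeNode(3), TreeNode(8))
print(tree_search(root, 8).key)   # -> 8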

13 Data structures Heap: main operations (discussed in the sorting chapter). A heap is a specialized tree structure that satisfies the heap property: in a min-heap the parent is always smaller than (or equal to) its children; in a max-heap the parent is always larger than (or equal to) its children. Insert: Θ(log n). Remove the object that is the min (or the max, but not both): Θ(log n). Technically, it can be implemented via a complete binary tree. Application: Heapsort (and, next, we'll discuss finding the median dynamically). CSC317
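A quick sketch of these heap operations using Python's heapq module, which maintains a binary min-heap inside a plain list (the values are illustrative):

import heapq

h = []                         # heapq treats a plain list as a min-heap
for x in [7, 2, 9, 4]:
    heapq.heappush(h, x)       # insert: O(log n)
print(heapq.heappop(h))        # remove the minimum: O(log n) -> 2

# For a max-heap, a common trick is to push negated keys:
maxh = []
for x in [7, 2, 9, 4]:
    heapq.heappush(maxh, -x)
print(-heapq.heappop(maxh))    # maximum -> 9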

14 Finding median dynamically
Input: numbers presented one by one: x1, x2, x3, …, xn. Output: at each time step, the median of the numbers seen so far. Run time? We know we can do O(n), but dynamically, each time we add a number, we would like to do better and not have to recompute in O(n). Using two heaps, one max-heap and one min-heap: O(log k) in each step. CSC317

15 Finding median dynamically
Low (max) heap: holds the smaller numbers and performs its max operations in O(log k) time. High (min) heap: holds the larger numbers and performs its min operations in O(log k) time. Invariant: the smaller half of the elements seen so far is in the Low heap; the larger half is in the High heap. With this arrangement we can find out in O(1) time whether a new number belongs in the lower half or the upper half: all we need to do is compare the new number with the heads of the two heaps, and then insert it in O(log k) time. CSC317

16 Finding median dynamically
What about if the heaps become unbalanced? If Low (the max-heap) has 6 elements and High (the min-heap) has 5 elements, and the next element is less than the max of Low, insert it into Low and then move the max of Low over to High, so each heap holds 6 elements. CSC317
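A minimal Python sketch of the two-heap scheme; heapq only provides a min-heap, so the Low max-heap is simulated by storing negated values (the example numbers are illustrative):

import heapq

low = []    # max-heap of the smaller half (stored as negated values)
high = []   # min-heap of the larger half

def add_number(x):
    """Insert x and rebalance so the heap sizes differ by at most one: O(log k)."""
    if not low or x <= -low[0]:
        heapq.heappush(low, -x)        # x belongs in the lower half
    else:
        heapq.heappush(high, x)        # x belongs in the upper half
    # Rebalance: move one element across if either heap gets two ahead
    if len(low) > len(high) + 1:
        heapq.heappush(high, -heapq.heappop(low))
    elif len(high) > len(low) + 1:
        heapq.heappush(low, -heapq.heappop(high))

def median():
    """O(1): just peek at the heap tops."""
    if len(low) == len(high):
        return (-low[0] + high[0]) / 2
    return -low[0] if len(low) > len(high) else high[0]

for x in [5, 2, 8, 1, 9]:
    add_number(x)
    print(median())    # prints 5, 3.5, 5, 3.5, 5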

17 Hash table CSC317 We have elements with key and satellite data
Operations performed: Insert, Delete, Search/lookup. We don't maintain order information. We'll see that all operations are on average O(1), but the worst case can be O(n). CSC317

18 Hash table Simple implementation: if the universe of keys comes from a small set of integers [0..9], we can store elements directly in an array, using the keys as indices into the slots. This is also called a direct-address table. Search time is just like in an array – O(1)! CSC317
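A minimal sketch of a direct-address table for integer keys drawn from [0..9] (the class name and stored values are illustrative):

class DirectAddressTable:
    """Direct addressing: slot k holds the element whose key is k."""
    def __init__(self, universe_size=10):    # keys come from [0..9]
        self.slots = [None] * universe_size

    def insert(self, key, data):   # O(1)
        self.slots[key] = data

    def search(self, key):         # O(1)
        return self.slots[key]

    def delete(self, key):         # O(1)
        self.slots[key] = None

t = DirectAddressTable()
t.insert(3, "John")
print(t.search(3))                 # -> "John"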

19 CSC317 Example: Array versus Hash table
Imagine we have keys corresponding to friends that we want to store. We could use a huge array, with each friend's name mapped to some slot in the array (e.g., one slot for every possible name; each letter is one of 26 characters and there are n letters in each name, so 26^n possible names). We could insert, find a key, and delete an element in O(1) time – very fast! But this is a huge waste of memory, with many slots empty in many applications! John = A[34], Jane = A[33334]. CSC317

20 CSC317 Example: versus Linked List
An alternative might be to use a linked list with all the friend names linked (e.g., John -> Jane -> Bill). Pro: this is not wasteful, because we only store the names that we want. Con: search takes O(n). Can we have an approach that is both fast and not wasteful (the best of both worlds)? CSC317

21 Hash table An extremely useful alternative to a static array, with insert, search, and delete in O(1) time (on average) – VERY FAST. Useful when the universe of keys is large, but at any given time the number of keys stored is small relative to the total number of possible keys (not wasteful like a huge static array). We usually don't store the key directly as an index into the array, but rather compute a hash function of the key k, h(k), and use that as the index. Problem: collisions (i.e., two keys map to the same slot). CSC317

22 CSC317 Collisions Guaranteed to happen when there are more keys than slots.
Or if we use a “bad” hashing function – e.g., all keys hash to just one slot of the hash table (more later …). Even by chance, collisions are likely to happen. Consider keys that are birthdays, and recall the birthday paradox: in a room of just 23 people, there is about a 50 percent chance that 2 people share a birthday. Resolution? Chaining: each slot holds a linked list of all the keys that hash to it. CSC317
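A minimal sketch of collision resolution by chaining, where each slot of the table holds a list acting as the chain (the table size and hash function are illustrative, using Python's built-in hash):

class ChainedHashTable:
    """Collision resolution by chaining: each slot is a list of (key, value) pairs."""
    def __init__(self, m=8):
        self.m = m                       # number of slots
        self.slots = [[] for _ in range(m)]

    def _h(self, key):                   # illustrative hash function
        return hash(key) % self.m

    def insert(self, key, value):        # O(1): append to the slot's chain
        self.slots[self._h(key)].append((key, value))

    def search(self, key):               # O(1 + alpha) on average: scan one chain
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None                      # unsuccessful search

    def delete(self, key):
        chain = self.slots[self._h(key)]
        self.slots[self._h(key)] = [(k, v) for k, v in chain if k != key]

t = ChainedHashTable()
t.insert("Tom", "555-1234")
print(t.search("Tom"))     # -> "555-1234" (successful search)
print(t.search("Sarah"))   # -> None (unsuccessful search)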

23 Analysis Worst case: all n elements map to one slot (one big linked list …): O(n). Average case: we need to define m = number of slots, n = number of keys in the hash table, and the load factor α = n/m. Intuitively, α is the average number of elements per linked list. CSC317

24 CSC317 Hash table analyses Example: let's take n = m, so α = 1.
Good hash function: each slot of the hash table has a linked list of about one element. Bad hash function: the hash function always maps to the first slot of the hash table, giving one linked list of size n while all other slots are empty. Define: Unsuccessful search: the key searched for does not exist in the hash table (we are searching for a new friend, Sarah, who is not yet in the hash table). Successful search: the key we are searching for already exists in the hash table (we are searching for Tom, whom we have already stored in the hash table). CSC317

25 CSC317 Hash table analyses Theorem:
Assuming uniform hashing, an unsuccessful search takes Θ(1 + α) time on average. Here the α is the actual search time through the chain, and the added 1 is the constant time to compute the hash function on the key being searched. Interpretation: n = m: Θ(1 + 1) = Θ(1); n = 2m: Θ(2 + 1) = Θ(1); n = m³ (so α = m²): Θ(m² + 1) ≠ Θ(1). We say constant time on average when n and m are of similar order, but this is not generally guaranteed. CSC317

26 CSC317 Hash table analyses Theorem:
Intuitively: to search for key k, the hash function maps it onto slot h(k). We need to search through the linked list in that slot, all the way to the end of the list (because the key is not found – an unsuccessful search). For n = 2m, the linked list has average length 2. More generally, the average length is α, our load factor. CSC317

