LECTURE 40: SELECTION
CSC 212 – Data Structures
Selection Problem
- A Sequence of Comparable elements is available
  - We only care that the implementation has O(1) access time
  - Elements are unordered within the Sequence
- Finding the smallest & largest elements is easy
- What if we want the k-th smallest (or, symmetrically, the k-th largest) element?
- A real function that is used surprisingly often
  - Statistical analyses
  - Database lookups
  - Ranking teams in a league
Selection Problem
- Could sort the Collection as a first step
  - Then just return the k-th element
- Sorting is slow
  - Comparison-based sorting needs Ω(n * log n) time in the worst case
- An ordered Collection is faster at selection time
  - Selection becomes a simple O(1) process
  - But every insertion would then take O(n) time
  - Usually this is not a winning tradeoff
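The sort-first approach can be sketched as follows. This is only an illustrative baseline, not code from the course: the class name SortSelect and the method select are hypothetical, and k is assumed to count from 1 with rank 1 being the smallest element.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortSelect {
    // Hypothetical helper: returns the k-th smallest element (k counts from 1).
    // Sorting dominates the cost, so this takes O(n log n) time.
    static <E extends Comparable<E>> E select(List<E> seq, int k) {
        if (k < 1 || k > seq.size()) {
            throw new IllegalArgumentException("k must be between 1 and " + seq.size());
        }
        List<E> copy = new ArrayList<>(seq); // leave the caller's Sequence unordered
        Collections.sort(copy);
        return copy.get(k - 1);              // O(1) access once the copy is sorted
    }
}
```

For example, SortSelect.select(Arrays.asList(7, 6, 5), 1) returns 5. All of the work goes into the sort, which is the cost Quick-Select tries to avoid.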
Quick-Select
- Works the way binary search finds a specific value
  - Both come from the same family of algorithms
- Divide-and-conquer algorithms work similarly
  - Divide into easier sub-problems to be solved
  - Recursively solve the sub-problems
  - Conquer by combining solutions to the sub-problems
Quick-Select
- Also known as prune-and-search
  - Read this as divide-and-conquer that solves only one sub-problem
- Prune: pick a pivot & split into 2 Collections
  - L will get all elements less than the pivot
  - Equal and larger elements go into G
  - Keep the pivot element off to the side
- Search: solve only for the Collection with the solution (k counts from 1)
  - When k <= L.size(), continue the search in L
  - The pivot is the solution when k = L.size() + 1
  - Else, an element in G solves this problem: look for rank k - (L.size() + 1) in G
Quick-Select In Pictures
[Figure: pivot x splits the Collection into L (left of x) and G (right of x)]
  if k <= L.size() then
      return quickSelect(k, L)
  if k == L.size() + 1 then
      return x
  if k > L.size() + 1 then
      k = k - (L.size() + 1)
      return quickSelect(k, G)
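The prune and search steps can be put together in a short Java sketch. It is a minimal illustration under stated assumptions, not the course's reference implementation: the class name QuickSelect is hypothetical, k counts from 1 (rank 1 is the smallest element), the pivot is chosen uniformly at random, and L and G are built as new lists instead of being split in place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class QuickSelect {
    private static final Random RNG = new Random();

    // Returns the k-th smallest element of c, where k counts from 1.
    static <E extends Comparable<E>> E quickSelect(int k, List<E> c) {
        if (k < 1 || k > c.size()) {
            throw new IllegalArgumentException("k must be between 1 and " + c.size());
        }
        // Prune: pick a pivot (assumption: uniformly at random) and keep it aside.
        int pivotIndex = RNG.nextInt(c.size());
        E x = c.get(pivotIndex);
        List<E> l = new ArrayList<>(); // elements less than the pivot
        List<E> g = new ArrayList<>(); // equal and larger elements, without the pivot itself
        for (int i = 0; i < c.size(); i++) {
            if (i == pivotIndex) {
                continue;
            }
            E e = c.get(i);
            if (e.compareTo(x) < 0) {
                l.add(e);
            } else {
                g.add(e);
            }
        }
        // Search: recurse only into the Collection that holds the answer.
        if (k <= l.size()) {
            return quickSelect(k, l);
        }
        if (k == l.size() + 1) {
            return x;
        }
        return quickSelect(k - (l.size() + 1), g);
    }
}
```

With the sample data Arrays.asList(7, 6, 5), the call QuickSelect.quickSelect(1, data) returns 5 whichever pivot is drawn. Keeping the pivot out of both L and G guarantees that every recursive call works on a strictly smaller Collection, so the recursion always terminates.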
Quick-Select Visualization
- Draw the Collection and k for each recursive call
  k = 5, C = ( )
  k = 2, C = ( )
  k = 2, C = ( )
  k = 1, C = (7 6 5)
  Result: 5
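Quick-Select Running Time
- Each call of the algorithm would take time: O(n), to split the Collection around the pivot
- So the worst-case running time: O(n²), when every pivot is the smallest or largest element
- We would expect the running time to be: O(n)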
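A short worked bound for the worst case, as a sketch. It assumes a call on a Collection of size i costs at most b * i for some constant b, and that every pivot turns out to be the smallest or largest remaining element, so each call removes only the pivot:

```latex
T_{\text{worst}}(n) \;\le\; \sum_{i=1}^{n} b \cdot i \;=\; b \cdot \frac{n(n+1)}{2} \;=\; O(n^2)
```

This is the same bad case that hurts quick-sort: the Collection shrinks by a single element per call instead of by a constant fraction.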
Expected Running Time
- Consider a recursive call on a Collection of size s
  - Good call: L & G each have less than ¾ * s elements
  - Bad call: L or G has ¾ * s or more elements
    (Note: they cannot both be larger than ¾ * s)
- [Figure: a good call and a bad call]
Expected Running Time
- How often can we expect to make a good call?
  - Ultimately, it all depends on the selection of the pivot
  - ½ of the possible pivots would create a good split
  - So with a random guess, we get a good call 50% of the time
- [Figure: good pivots and bad pivots]
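The "½ of the possible pivots" claim can be checked by brute force. The sketch below is purely illustrative: the class name GoodPivots and the method countGoodPivots are hypothetical, and it assumes s distinct elements, so a pivot of rank r (1 = smallest) leaves r - 1 elements in L and s - r elements in G.

```java
class GoodPivots {
    // Counts the pivot ranks for which both L and G hold fewer than 3/4 * s elements.
    static int countGoodPivots(int s) {
        int good = 0;
        for (int r = 1; r <= s; r++) {
            int sizeL = r - 1; // elements smaller than the pivot
            int sizeG = s - r; // elements larger than the pivot
            // Integer form of sizeL < 3/4 * s and sizeG < 3/4 * s.
            if (4 * sizeL < 3 * s && 4 * sizeG < 3 * s) {
                good++;
            }
        }
        return good;
    }
}
```

For s = 100 this counts the ranks 26 through 75, that is 50 good pivots out of 100. The good pivots are exactly the middle half of the sorted order.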
Probabilities
- Probability Fact #1: Expect to need 2 tosses of a fair coin to see heads once
- Probability Fact #2: Expectations are additive
  - E(X + Y) = E(X) + E(Y)
  - E(c * X) = c * E(X)
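A brief justification of the coin-toss fact, as a sketch. N is a symbol introduced here for the number of tosses of a fair coin up to and including the first heads. The first toss is always made; with probability ½ it is tails and the process starts over:

```latex
E(N) = 1 + \tfrac{1}{2}\,E(N) \quad\Longrightarrow\quad E(N) = 2
```

For Quick-Select, "heads" plays the role of a good call: since a random pivot is good half of the time, we expect 2 calls for each good call.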
More Big-Oh
- Let T(n) = expected execution time
  T(n) ≤ b * n * (expected # calls before good call) + T(¾ * n)
  T(n) ≤ b * n * 2 + T(¾ * n)
  T(n) ≤ b * n * 2 + b * ¾ * n * 2 + T((¾)² * n)
  T(n) ≤ b * n * 2 + b * ¾ * n * 2 + b * (¾)² * n * 2 + …
  T(n) ≤ O(n) + O(n) + O(n) + …
- … then a mathematical miracle occurs…
  T(n) = O(n)
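The "miracle" is a geometric series. As a sketch, using the same constant b and the same ¾ shrink factor as the expansion above, the terms can be summed directly:

```latex
T(n) \;\le\; 2bn \sum_{i=0}^{\infty} \left(\tfrac{3}{4}\right)^{i}
      \;=\; 2bn \cdot \frac{1}{1 - \tfrac{3}{4}}
      \;=\; 8bn
      \;=\; O(n)
```

Each term is O(n), but the terms shrink by a constant factor, so the whole sum stays within a constant multiple of the first term.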
Before Next Lecture…
- Finish week #15 assignment
  - Due on Friday at 5PM