Lecture 9: Lower Bounds for Sorting, Searching and Selection
Plan: (1) Finish heaps. (2) Lower bounds. Selection (Find Min): adversary arguments. Sorting: information-theoretic lower bound (ITLB) for comparison-based algorithms. Searching: ITLB.
Finish Heaps: building a heap from an array in O(n) time.
[Figure: a heap with its levels labeled by height h = 3, 2, 1; the number of vertices at height h is labeled $n/2^{h+1}$.]
Finish Heaps: building a heap from an array in O(n) time. Idea: the leaves are already heaps. To join two adjacent (sub)heaps under a common root, it suffices to heapify (trickle the root down). This takes O(h) time, where h is the distance from the local root to the leaves, and there are at most $\lceil n/2^{h+1} \rceil$ nodes at height h. Total time: $\sum_{h=0}^{\lfloor \log_2 n \rfloor} h \cdot n/2^{h+1} = O(n)$, because $\sum_{h=0}^{\lfloor \log_2 n \rfloor} h/2^{h} < 2$.
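The bottom-up construction described on this slide can be sketched as follows. This is a minimal illustration, not the lecture's own code: it assumes a min-heap stored in a 0-indexed Python list, and the names `sift_down`, `build_heap` and `is_min_heap` are chosen here for the example.

```python
def sift_down(a, i, n):
    """Trickle a[i] down until the subtree rooted at index i is a min-heap."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


def build_heap(a):
    """Turn the list a into a min-heap in place, working bottom-up."""
    n = len(a)
    # Indices n // 2 .. n - 1 are leaves, which are already heaps,
    # so start at the last internal node and move toward the root.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    return a


def is_min_heap(a):
    """Check that every node is at least as large as its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))
```

Each call to `sift_down` costs time proportional to the height of the node it starts from, which is what the sum on the slide adds up.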
Recap. Linear time algorithms: compute the sum or product of n numbers; find the min/max of n numbers; merge 2 arrays of n elements (total); partition an array into 2 around a pivot. O(n log n) time algorithms for sorting: Merge Sort, Heap Sort, Quick Sort (on average). O(log n) time algorithms: binary search.
Lower Bounds. Can we do better? Why not? Lower bounds prove that we cannot hope for a better algorithm, no matter how smart we are. Only very few lower bound proofs are known. The most notorious open problems in theoretical computer science are related to proving lower bounds for very important problems. Reading: Ch. 13 of the textbook.
Input Lower Bound. Compute the sum of n numbers: all numbers must be looked at, otherwise the answer might not be correct. Adversary argument: assume there is a smart algorithm that computes the sum without looking at all n inputs. An adversary modifies an input that was not looked at, then runs the algorithm again. The algorithm gives the same answer (because it never looked at the modified input), but this is no longer the correct answer.
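The argument above can be played out in a few lines. This is only a sketch under stated assumptions: `lazy_sum` is a hypothetical "smart" algorithm invented for the example (it skips the last entry), and `adversary_input` is a hypothetical helper that changes one entry the algorithm never read.

```python
def lazy_sum(a, looked_at):
    """Hypothetical 'smart' algorithm: sums every entry except the last,
    recording in looked_at the indices it actually reads."""
    total = 0
    for i in range(len(a) - 1):
        looked_at.add(i)
        total += a[i]
    return total


def adversary_input(a, looked_at):
    """Return a copy of a that differs only at an index the algorithm never
    read, or None if every index was read."""
    for i in range(len(a)):
        if i not in looked_at:
            b = list(a)
            b[i] += 1  # any change to an unread entry changes the true sum
            return b
    return None
```

Running `lazy_sum` on the original input and on the adversary's input gives the same output, although the two true sums differ, so the algorithm is wrong on at least one of them.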
Adversary Arguments. Mr. Algorithm thinks he has a fast way of solving the problem: "Is a7 < a9?" Ms. Adversary forces the algorithm to work hard by giving the worst possible answer: "Hmmm… Yes!"
Adversary Arguments. Algorithm: "The answer is 3." Adversary: "This was the input: 4 3 5 2 6. Now try again." Algorithm: "The answer is 3."
Adversary Arguments. Algorithm: "The answer is 3." Adversary: "Wrong! The input was 4 7 5 2 6 this time!" If some questions were not asked, the Adversary tricks the poor Algorithm into trying again on different input data, with the same answers to the same questions but with a different correct final answer.
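Adversary Argument for FindMin: need n-1 questions. Algorithm: "Is a1 < a2?" Adversary: "No." Algorithm: "Is a3 < a4?" Adversary: "No." (a3=7, a4=6) Algorithm: "Is a4 > a5?" Adversary: "Yes." (a5=5) Algorithm: "Minimum is a5!" Adversary: "Wrong! It is a2!"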
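If fewer than n-1 questions are asked, the graph of comparisons (one vertex per element, one edge per comparison) is disconnected. The adversary can rearrange the data so that every question gets the same answer but the minimum is different.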
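One way to make the disconnected-graph argument concrete is sketched below. It is an illustration, not the lecture's code: the helper names are hypothetical, indices are 0-based, and the values chosen for a1 and a2 in the usage note are assumptions (the slide only fixes a3, a4 and a5). The adversary shifts every value in one component that does not contain the claimed minimum; since all compared pairs lie inside a single component, every answer stays the same.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def components(n, comparisons):
    """Label each of n elements with the root of its component in the
    graph whose edges are the compared pairs (i, j)."""
    parent = list(range(n))
    for i, j in comparisons:
        parent[find(parent, i)] = find(parent, j)
    return [find(parent, i) for i in range(n)]


def fool(values, comparisons, claimed_min):
    """Return an input that answers every comparison exactly as values does
    but has its minimum elsewhere; None if the graph is connected."""
    comp = components(len(values), comparisons)
    others = [i for i in range(len(values)) if comp[i] != comp[claimed_min]]
    if not others:
        return None
    target = comp[others[0]]
    shift = max(values) - min(values) + 1
    # Lower a whole component by the same amount: order inside it is kept.
    return [v - shift if comp[i] == target else v for i, v in enumerate(values)]
```

With `values = [9, 8, 7, 6, 5]` (a1=9 and a2=8 are assumed), comparisons `[(0, 1), (2, 3), (3, 4)]` and claimed minimum index 4, the call returns `[4, 3, 7, 6, 5]`, whose minimum is a2, as in the dialogue above.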
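Binary Decision Trees: model algorithms based on successive answers to yes/no questions.
[Figure: a binary decision tree of depth 2. Each internal node is a question with a Yes branch and a No branch; the four leaves are Answer 1, Answer 2, Answer 3 and Answer 4.]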
Worst case time: depth of the tree. A binary tree of depth h has at most $2^h$ leaves. A binary tree with N leaves must therefore have depth at least $\log_2 N$. This gives a lower bound on the worst case time to find an answer: if the number of possible answers is N, then the algorithm MUST ask at least $\lceil \log_2 N \rceil$ questions in the worst case.
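The bound on this slide is easy to compute exactly. A minimal sketch, with `min_questions` as a name chosen for the example; it uses integer arithmetic rather than floating-point logarithms to avoid rounding issues.

```python
def min_questions(num_answers):
    """Smallest depth h with 2**h >= num_answers, i.e. ceil(log2(num_answers)).

    Assumes num_answers >= 1.
    """
    return (num_answers - 1).bit_length()
```

For instance, `min_questions(4)` is 2 (the four-answer tree in the figure), while `min_questions(5)` is 3, because a tree of depth 2 has only four leaves.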
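Lower Bound for Sorting. Number of possible sorted orders = number of all possible permutations of n elements = n!. Hence any comparison-based algorithm for sorting must make at least $\log_2 n! = \Omega(n \log n)$ comparisons in the worst case. (The bound follows from $n! \ge (n/2)^{n/2}$, which gives $\log_2 n! \ge (n/2) \log_2 (n/2)$.)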
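As a sanity check of this bound, the sketch below compares the worst-case number of comparisons made by a plain top-down merge sort with the decision-tree bound for small n. The function names are chosen for the example, and merge sort is used only as one convenient comparison-based algorithm; the bound itself applies to all of them.

```python
import math
from itertools import permutations


def sorting_lower_bound(n):
    """ceil(log2(n!)): minimum worst-case comparisons to sort n distinct keys."""
    return (math.factorial(n) - 1).bit_length()


def merge_sort_comparisons(a):
    """Return (sorted list, number of key comparisons) for top-down merge sort."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, count_left = merge_sort_comparisons(a[:mid])
    right, count_right = merge_sort_comparisons(a[mid:])
    merged, count = [], count_left + count_right
    i = j = 0
    while i < len(left) and j < len(right):
        count += 1  # one comparison between keys
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, count


def worst_case(n):
    """Maximum comparison count of merge sort over all permutations of n keys."""
    return max(merge_sort_comparisons(p)[1] for p in permutations(range(n)))
```

For every small n tried, `worst_case(n)` is at least `sorting_lower_bound(n)`, as the decision-tree argument requires.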
Lower Bound for Searching. In-class exercise: How many possible answers are there for the searching question? What is the log of that? What is the lower bound for searching?