CSE 5311 Advanced Algorithms Instructor: Dr. Gautam Das Week 3 Class Notes Submitted By: Abhishek Hemrajani & Pinky Dewani
Topics Covered in Week 3:
  Median Finding Algorithm (Description and Analysis)
  Binary Search Tree
Given a set of n numbers:
  Mean: the average of the n numbers.
  Median: the value in the middle of the sorted list, such that half the numbers are higher than it and half are lower.
Thus the problem of finding the median generalizes to finding the kth largest number; the median is the case k = n/2 (rounded up when n is odd).
The kth largest number can be found using:
  Scan approach: k passes over the data, each extracting the next element in order, with time complexity T(n) = O(kn)
  Sort approach: sort once and read off position k, with time complexity T(n) = O(n log n)
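The two baseline approaches can be sketched as follows. This is a minimal illustration, not code from the lecture: the function names `kth_by_scan` and `kth_by_sort` are hypothetical, and rank k is assumed to be counted from the smallest element, which matches the partition steps used later in these notes.

```python
def kth_by_scan(values, k):
    """Scan approach: k passes, each removing the current minimum. O(k*n)."""
    remaining = list(values)          # work on a copy
    smallest = None
    for _ in range(k):
        smallest = min(remaining)     # one O(n) scan
        remaining.remove(smallest)    # drop a single copy of it
    return smallest


def kth_by_sort(values, k):
    """Sort approach: sort once, then index. O(n log n)."""
    return sorted(values)[k - 1]


sample = [2, 5, 9, 19, 24, 54, 5, 87, 9, 10]
print(kth_by_scan(sample, 5), kth_by_sort(sample, 5))
```

Both functions assume 1 ≤ k ≤ n; the scan approach only beats sorting when k is small compared with log n.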
Linear Time Median Finding Algorithm
Input: S = {a1, a2, ..., an} and k such that 1 ≤ k ≤ n
Output: the kth largest number of S (the steps that follow count rank from the smallest element, so k = 1 returns the minimum)
Algorithm: kth_largest (S, k)
Steps:
1) Group the numbers into sets of 5
2) Sort each group and find its median
3) Let M be the set of group medians; find the median of M recursively: MedianOfMedians (MOM) = kth_largest(M, |M|/2)
4) Partition the original data around the MOM such that values less than it are in set L and values greater than it are in set R
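Steps Continued:
5) If |L| = k-1, then return MOM
   else if |L| > k-1, then return kth_largest(L, k)
   else return kth_largest(R, k-|L|-1)
The last case subtracts |L|+1 because the |L| elements of L and the MOM itself all rank below every element of R (assuming distinct values).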
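The five steps can be sketched in Python as below. This is an illustrative sketch under stated assumptions rather than the lecture's own code: rank k is counted from the smallest element, inputs of at most 5 elements are sorted directly as a base case, and a three-way partition is used so that repeated values are handled (the steps above implicitly assume distinct values). Note that the recursion into R subtracts everything discarded on the left, including the MOM itself.

```python
def kth_largest(s, k):
    """Return the element of rank k in s (1-based, counted from the smallest)."""
    if not 1 <= k <= len(s):
        raise ValueError("k must satisfy 1 <= k <= len(s)")
    if len(s) <= 5:                       # assumed base case: sort directly
        return sorted(s)[k - 1]

    # Steps 1-2: groups of 5 and the median of each group
    groups = [s[i:i + 5] for i in range(0, len(s), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 3: median of medians, found recursively
    mom = kth_largest(medians, (len(medians) + 1) // 2)

    # Step 4: partition around the MOM
    left = [x for x in s if x < mom]
    right = [x for x in s if x > mom]
    equal = len(s) - len(left) - len(right)   # copies of the MOM

    # Step 5: recurse into the side that contains rank k
    if k <= len(left):
        return kth_largest(left, k)
    if k <= len(left) + equal:
        return mom
    return kth_largest(right, k - len(left) - equal)


data = [2, 5, 9, 19, 24, 54, 5, 87, 9, 10, 44, 32, 21, 13,
        24, 18, 26, 16, 19, 25, 39, 47, 56, 71, 91, 61, 44, 28]
print(kth_largest(data, len(data) // 2))
```

The list comprehensions copy the data for clarity; an in-place partition would avoid the extra memory without changing the analysis.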
Example:
(..., 2, 5, 9, 19, 24, 54, 5, 87, 9, 10, 44, 32, 21, 13, 24, 18, 26, 16, 19, 25, 39, 47, 56, 71, 91, 61, 44, 28, ...) is a set of n numbers
Step 1: Group numbers in sets of 5 (vertically)
[Figure: the input numbers arranged in columns of 5, one column per group]
Step 2: Find the median of each group
[Figure: each column sorted from top to bottom; the middle row holds the median of each group]
Step 3: Find the MedianOfMedians
[Figure: the groups with their medians, with a region labeled 3n/10]
Find m, the median of medians
Step 4: Partition the original data around the MOM (M) into L and R
  3n/10 ≤ |L| ≤ 7n/10
  3n/10 ≤ |R| ≤ 7n/10
Step 5: If |L| = k-1, then return MOM
   else if |L| > k-1, then return kth_largest(L, k)
   else return kth_largest(R, k-|L|-1)
Time Analysis:
Step  Task                        Complexity
1     Group into sets of 5        O(n)
2     Find median of each group   O(n)
3     Find MOM                    T(n/5)
4     Partition around MOM        O(n)
5     Condition                   T(7n/10) {worst case}
Time Complexity of Algorithm:
T(n) = O(n) + T(n/5) + T(7n/10), T(1) = 1
Write the O(n) term as C1·n and assume inductively that T(m) ≤ Cm for all m < n (this is what linear time means). Then
T(n) ≤ C1·n + Cn/5 + 7Cn/10 = (C1 + 9C/10)·n
This is at most Cn whenever C1 + 9C/10 ≤ C, i.e. whenever C ≥ 10·C1.
Thus it is a linear time algorithm.
Why sets of 5?
Assuming sets of 3:
T(n) = O(n) + T(n/3) + T(2n/3), T(1) = 1
Assume T(m) ≤ Cm for all m < n (for it to be linear time). Then
T(n) ≤ C1·n + Cn/3 + 2Cn/3 = (C1 + C)·n
This is at most Cn only if C1 = 0, which is impossible because grouping and partitioning are not free. The linear-time argument therefore fails, so we do not use groups of 3.
Assuming sets of 7:
T(n) = O(n) + T(n/7) + T(5n/7), T(1) = 1
Assume T(m) ≤ Cm for all m < n (for it to be linear time). Then
T(n) ≤ C1·n + Cn/7 + 5Cn/7 = (C1 + 6C/7)·n
This is at most Cn whenever C ≥ 7·C1, so groups of 7 also give linear time. However, the constant C1 of the O(n) term grows with the group size (each group costs more to sort), so larger groups lead to a sub-optimal constant factor.
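The effect of the group size can also be checked numerically. The sketch below evaluates the three recurrences directly, under the assumptions that the O(n) term is exactly n (C1 = 1), that fractions are rounded down, and that T(0) = 0 and T(1) = 1. The names `cost`, `GROUPS_OF_3`, `GROUPS_OF_5` and `GROUPS_OF_7` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def cost(n, parts):
    """T(n) = n + sum of T(num*n // den) over the (num, den) pairs in parts."""
    if n <= 1:
        return n
    return n + sum(cost(num * n // den, parts) for num, den in parts)


GROUPS_OF_3 = ((1, 3), (2, 3))    # T(n/3) + T(2n/3)
GROUPS_OF_5 = ((1, 5), (7, 10))   # T(n/5) + T(7n/10)
GROUPS_OF_7 = ((1, 7), (5, 7))    # T(n/7) + T(5n/7)

for n in (10**3, 10**4, 10**5, 10**6):
    ratios = [cost(n, p) / n for p in (GROUPS_OF_3, GROUPS_OF_5, GROUPS_OF_7)]
    print(n, [round(r, 2) for r in ratios])
```

With these assumptions T(n)/n stays below 10 for groups of 5 and below 7 for groups of 7, matching the bounds C ≥ 10·C1 and C ≥ 7·C1, while for groups of 3 the ratio keeps growing with n.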
Points to remember:
  This algorithm uses the Divide and Conquer strategy
  The technique is recursion
  Most practical implementations select the partitioning element at random instead of computing the MOM
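The random-selection variant mentioned in the last point is commonly known as quickselect. A minimal sketch follows, assuming rank k is counted from the smallest element; the function name `quickselect` and the optional `rng` parameter are illustrative choices, not part of the lecture. Its running time is linear in expectation, but unlike the MOM version it has no linear worst-case guarantee.

```python
import random


def quickselect(s, k, rng=random):
    """Return the element of rank k in list s (1-based, from the smallest)."""
    if not 1 <= k <= len(s):
        raise ValueError("k must satisfy 1 <= k <= len(s)")
    pivot = rng.choice(s)                     # random pivot replaces the MOM
    left = [x for x in s if x < pivot]
    right = [x for x in s if x > pivot]
    equal = len(s) - len(left) - len(right)   # copies of the pivot
    if k <= len(left):
        return quickselect(left, k, rng)
    if k <= len(left) + equal:
        return pivot
    return quickselect(right, k - len(left) - equal, rng)


numbers = [44, 32, 21, 13, 24, 18, 26, 16, 19, 25]
print(quickselect(numbers, 5))
```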
Maintenance of Dynamic Sets:
Given a set S and a value X, we need to perform the following three operations:
  Insert (X, S)
  Delete (X, S)
  Find (X, S)
Examples: customer database, Yellow Pages, etc.
Data Structures used for Dynamic Sets:
Operation       Unordered Array   Sorted Array   Unordered List   Goal
Insert (X, S)   O(1)              O(n)                            O(log n)
Delete (X, S)
Find (X, S)
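As one concrete point of comparison, the sorted-array column can be sketched with Python's standard `bisect` module. This is an illustration only, assuming the set is stored as a Python list without duplicates; the function names simply mirror the three operations. Find costs O(log n) by binary search, while Insert and Delete cost O(n) because elements must be shifted.

```python
import bisect


def insert(x, s):
    """Insert x into sorted list s; O(n) because later elements shift right."""
    i = bisect.bisect_left(s, x)
    if i == len(s) or s[i] != x:
        s.insert(i, x)


def delete(x, s):
    """Remove x from sorted list s if present; O(n) because of shifting."""
    i = bisect.bisect_left(s, x)
    if i < len(s) and s[i] == x:
        s.pop(i)


def find(x, s):
    """Binary search in sorted list s; O(log n)."""
    i = bisect.bisect_left(s, x)
    return i < len(s) and s[i] == x


s = []
for value in (3, 2, 5, 4, 7, 6, 1):
    insert(value, s)
delete(4, s)
print(s, find(6, s), find(4, s))
```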
Binary Search Tree:
A Binary Search Tree is a data structure such that at any node, the values in the left subtree are smaller than the node's value and the values in the right subtree are greater.
Example: Given S = {3, 2, 5, 4, 7, 6, 1}, inserted in that order, the Binary Search Tree is:
        3
       / \
      2   5
     /   / \
    1   4   7
           /
          6
All operations on a Binary Search Tree take time proportional to the length of the path from the root to a leaf, i.e. O(height of the tree). This is O(log n) in a favorable (balanced) case and O(n) in the worst case (for example, keys inserted in sorted order).
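The dependence on the root-to-leaf path can be seen in a small sketch. The code below is a minimal illustration with hypothetical names (`Node`, `insert`, `find`, `height`, `build`), assuming distinct keys and no rebalancing. It builds the tree for S = {3, 2, 5, 4, 7, 6, 1} in the order given and compares its height with the tree produced by sorted input.

```python
class Node:
    """One BST node; left and right are child pointers (None when absent)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key and return the root; a key already present is ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:                       # walk one root-to-leaf path
        if key == node.key:
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def find(root, key):
    """Return True if key is in the tree; follows a single path downward."""
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node is not None


def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


print(height(build([3, 2, 5, 4, 7, 6, 1])), height(build([1, 2, 3, 4, 5, 6, 7])))
```

Sorted input makes every node a right child, so the tree degenerates into a linked list and each operation walks all n nodes.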
Sorting using In-Order Traversal:
The keys of a Binary Search Tree can be output in sorted order using an In-Order Traversal as follows:
Sort (T)
{
    if T is empty, return
    Sort (LeftChild (T))
    Write (T)
    Sort (RightChild (T))
}
Here, T(n) = T(m) + T(n-m-1) + O(1), where m is the number of nodes in the left subtree.
For a Balanced Binary Search Tree:
T(n) = T(left subtree) + T(right subtree) + O(1)
     = 2T(n/2) + 1
     = O(n)
The traversal visits each node exactly once, so the O(n) bound also holds for an unbalanced tree.
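The traversal can be sketched in Python as below. This is a minimal, self-contained illustration: `Node`, `insert`, `in_order` and `tree_sort` are hypothetical names, the tree is built by plain unbalanced insertion, and duplicate keys are sent to the right subtree as an assumption of this sketch.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """Unbalanced BST insertion; returns the root of the updated subtree."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)   # duplicates go right
    return node


def in_order(node, out):
    """Sort(T) from the notes: left subtree, then the node, then right subtree."""
    if node is None:                  # empty subtree: nothing to write
        return
    in_order(node.left, out)
    out.append(node.key)
    in_order(node.right, out)


def tree_sort(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    result = []
    in_order(root, result)
    return result


print(tree_sort([3, 2, 5, 4, 7, 6, 1]))
```

The traversal itself is O(n); the total cost of sorting this way is dominated by building the tree, which depends on its height.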
Points to remember:
  Pointers do not give the advantage of random access, but they make the data structure dynamic.
  B-Trees and B+ Trees are balanced search trees that generalize the Binary Search Tree (a node may hold several keys and have more than two children) and guarantee O(log n) performance.