Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Dr. Sumanta Guha Slide Sources: CLRS “Intro.

Slides:



Advertisements
Similar presentations
Comp 122, Spring 2004 Order Statistics. order - 2 Lin / Devi Comp 122 Order Statistic i th order statistic: i th smallest element of a set of n elements.
Advertisements

1 More Sorting; Searching Dan Barrish-Flood. 2 Bucket Sort Put keys into n buckets, then sort each bucket, then concatenate. If keys are uniformly distributed.
Quicksort Lecture 3 Prof. Dr. Aydın Öztürk. Quicksort.
Order Statistics(Selection Problem) A more interesting problem is selection:  finding the i th smallest element of a set We will show: –A practical randomized.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Dr. Sumanta Guha Slide Sources: CLRS “Intro.
CS 3343: Analysis of Algorithms Lecture 14: Order Statistics.
Medians and Order Statistics
1 Selection --Medians and Order Statistics (Chap. 9) The ith order statistic of n elements S={a 1, a 2,…, a n } : ith smallest elements Also called selection.
Introduction to Algorithms
Introduction to Algorithms Jiafen Liu Sept
1 Today’s Material Medians & Order Statistics – Ch. 9.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Dr. Sumanta Guha Slide Sources: CLRS “Intro.
Spring 2015 Lecture 5: QuickSort & Selection
Analysis of Algorithms CS 477/677 Randomizing Quicksort Instructor: George Bebis (Appendix C.2, Appendix C.3) (Chapter 5, Chapter 7)
Median/Order Statistics Algorithms
25 May Quick Sort (11.2) CSE 2011 Winter 2011.
1 Introduction to Randomized Algorithms Md. Aashikur Rahman Azim.
Ch. 7 - QuickSort Quick but not Guaranteed. Ch.7 - QuickSort Another Divide-and-Conquer sorting algorithm… As it turns out, MERGESORT and HEAPSORT, although.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Linear Time Selection These lecture slides are adapted from CLRS 10.3.
Median, order statistics. Problem Find the i-th smallest of n elements.  i=1: minimum  i=n: maximum  i= or i= : median Sol: sort and index the i-th.
Selection: Find the ith number
Analysis of Algorithms CS 477/677
1 QuickSort Worst time:  (n 2 ) Expected time:  (nlgn) – Constants in the expected time are small Sorts in place.
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Nattee Niparnan. Recall  Complexity Analysis  Comparison of Two Algos  Big O  Simplification  From source code  Recursive.
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Analysis of Algorithms CS 477/677
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro.
Chapter 9: Selection Order Statistics What are an order statistic? min, max median, i th smallest, etc. Selection means finding a particular order statistic.
Order Statistics ● The ith order statistic in a set of n elements is the ith smallest element ● The minimum is thus the 1st order statistic ● The maximum.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro.
Order Statistics(Selection Problem)
Deterministic and Randomized Quicksort Andreas Klappenecker.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 7.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro.
1 Medians and Order Statistics CLRS Chapter 9. upper median lower median The lower median is the -th order statistic The upper median.
1 Algorithms CSCI 235, Fall 2015 Lecture 19 Order Statistics II.
COSC 3101A - Design and Analysis of Algorithms 4 Quicksort Medians and Order Statistics Many of these slides are taken from Monica Nicolescu, Univ. of.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro.
CSC317 1 Quicksort on average run time We’ll prove that average run time with random pivots for any input array is O(n log n) Randomness is in choosing.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro.
Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Prof. Sumanta Guha Slide Sources: CLRS “Intro.
David Luebke 1 6/26/2016 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
1 Chapter 7 Quicksort. 2 About this lecture Introduce Quicksort Running time of Quicksort – Worst-Case – Average-Case.
David Luebke 1 7/2/2016 CS 332: Algorithms Linear-Time Sorting: Review + Bucket Sort Medians and Order Statistics.
Chapter 9: Selection of Order Statistics What are an order statistic? min, max median, i th smallest, etc. Selection means finding a particular order statistic.
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Order Statistics.
Order Statistics Comp 122, Spring 2004.
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
ENEE641: Mathematical Foundations for Computer Engineering Home page:
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Order Statistics Comp 550, Spring 2015.
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Chapter 9: Medians and Order Statistics
Order Statistics Comp 122, Spring 2004.
Chapter 9: Selection of Order Statistics
The Selection Problem.
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
CS200: Algorithm Analysis
Medians and Order Statistics
Presentation transcript:

Data Structures and Algorithms (AT70.02) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Dr. Sumanta Guha Slide Sources: CLRS “Intro. To Algorithms” book website (copyright McGraw Hill) adapted and supplemented

CLRS “Intro. To Algorithms” Ch. 9: Median and Order Statistics

Exactly n-1 comparisons are made by MINIMUM. Is this best possible? Can the minimum element be found with fewer than n-1 comparisons? Why? How to find both the maximum and minimum of n elements? Can one do better than separately finding the maximum and minimum, which requires 2n-2 comparisons?

Selection Problem: Given input a set A of n numbers and an integer i, with 1 ≤ i ≤ n, find the i th smallest element x of A, i.e., the element x that is greater than exactly i -1 other elements of A. Note : When i = floor( (n +1)/2 ), x is called the median. RANDOMIZED-SELECT returns the i th smallest element of the array A [p..r].

Analysis of RANDOMIZED-SELECT Worst-case:  (n 2 ). Why? Expected-case: Let the random variable T(n) represent the time taken by RANDOMIZED-SELECT on input array A[p..r] containing n elements. If q is returned by RANDOMIZED-PARTITION, for 1 ≤ k ≤ n, let the indicator random variable X k = I{the subarray A[p..q] has exactly k elements} As the pivot is chosen randomly from A[p..r] E[X k ] = 1/n Therefore, T(n) ≤ ∑ k=1..n X k * T( max(k-1, n-k) ) + O(n) (why?)

E[ T(n) ] ≤ E[ ∑ k=1..n X k * T( max(k-1, n-k) ) + O(n) ] = E[ ∑ k=1..n X k * T( max(k-1, n-k) ) ] + O(n) = ∑ k=1..n E[X k ] * E[ T( max(k-1, n-k) ) ] + O(n) = ∑ k=1..n 1/n * E[ T( max(k-1, n-k) ) ] + O(n) ≤ 2/n ∑ k=floor(n/2)..n-1 E[ T(k) ] + O(n) (counting the number of times T(k) occurs in the summation of the previous line) Therefore, we have a recurrence relation E[ T(n) ] ≤ 2/n ∑ k=floor(n/2)..n-1 E[ T(k) ] + αn where α is a constant. Inductively assuming that T(k) ≤ ck for all k < n, for some constant c to be chosen later, we’ll show that indeed T(n) ≤ cn, proving T(n) = O(n).

Proof E[ T(n) ] ≤ 2/n ∑ k=floor(n/2)..n-1 E[ T(k) ] + αn ≤ 2/n ∑ k=floor(n/2)..n-1 ck + αn (inductive hypothesis) = 2c/n (  k=1..n-1 k –  k=1..floor(n/2)-1 k ) + αn = 2c/n ( (n – 1)n/2 – (floor(n/2) – 1)floor(n/2)/2 ) + αn ≤ 2c/n ( (n – 1)n/2 – (n/2 – 2)(n/2 – 1)/2 ) + αn … = c (3n/4 + 1/2 - 2/n) + αn ≤ 3cn/4 + c/2 + αn = cn – (cn/4 – c/2 – αn) If cn/4 – c/2 – αn ≥ 0, then E[ T(n) ] ≤ cn and the inductive proof is complete. Now, indeed cn/4 – c/2 – αn ≥ 0 if n ≥ 2c/(c – 4α), presuming c > 4α.

Selection in Worst-Case Linear Time SELECT 1. Divide the n elements into floor(n/5) groups of 5 elements each and at most one more group of the remaining n mod 5 elements. 2. Find the median of each of the ceil(n/5) groups simply by sorting (e.g., insertion sort) and picking the middle element. 3. Use SELECT recursively to find the median x of the ceil(n/5) medians found in Step Partition the input array using the median-of-medians x as pivot. Let k be one more than the number of elements on the low side of the partition, so that x is the k th smallest element and there are n-k elements on the high side of the partition. 5. If i = k, return x. Otherwise, use SELECT recursively to find the i th smallest element on the low side if i k.

Analyzing SELECT At least half of the medians found in step 2 are greater than or equal to the median-of-medians x. Thus, at least half of the ceil(n/5) groups contribute 3 elements greater than x, except, possibly, for one group containing less than 5 elements and the group containing x itself. Therefore, the number of elements > x is at least 3(1/2 * ceil(n/5) – 2) ≥ 3n/10 – 6 Similarly, the number of elements < x is at least 3n/10 – 6. Therefore, in the worst case SELECT is called recursively on at most n – (3n/10 – 6) = 7n/ elements Steps 1, 2 and 4 of SELECT take O(n) time each. Step 3 takes T(ceil(n/5)) time and step 5 at most T(7n/10 + 6) time. The following recurrence obtains therefore:

T(n) ≤  (1) if n ≤ 140 T(ceil(n/5)) + T(7n/10 + 6) + O(n) if n > 140 The first line is valid because T(n) is at most a constant for a bounded number of elements. The number 140 is chosen to simplify future calculation – nothing special about it! Assume inductively that T(n) ≤ cn for some constant c and all n less than some fixed n (assume this fixed n is > 140). Then T(n) ≤ c(ceil(n/5)) + c(7n/10 + 6) + αn (where α is a constant) ≤ cn/5 + c + 7cn/10 + 6c + αn = cn + (-cn/10 + 7c + αn) which is at most cn if -cn/10 + 7c + αn ≤ 0, which is true if c ≥ 10α(n/(n – 70)). Since n > 140, n/(n – 70) < 2. Therefore, if c ≥ 20α, then indeed c ≥ 10α(n/(n – 70)), and so T(n) ≤ cn completing the induction. This proves that T(n) = O(n), so SELECT is worst-case linear time.

Problems Ex Ex Ex (think tail recursion) Ex Ex Ex Ex Prob. 9-2