Median/Order Statistics Algorithms

Presentation transcript:

Median/Order Statistics Algorithms
Minimum and Maximum
Selection in expected linear time
Selection in worst-case linear time

Minimum and Maximum
How many comparisons are sufficient to find the minimum (or the maximum)?
How many comparisons are sufficient to find both the minimum AND the maximum?
Show that n + log n - 2 comparisons are sufficient to find the second minimum (and the minimum).
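As a hedged illustration of the second question: a single scan finds the minimum alone with n - 1 comparisons, and processing the items in pairs finds both extremes with roughly 3n/2 comparisons instead of 2n - 2. The sketch below is one way to do this; the function name `min_and_max` and the comparison counter are assumptions made for illustration and do not come from the slides.

```python
def min_and_max(items):
    """Return (minimum, maximum, comparisons) using pairwise processing.

    Illustrative sketch: each pair costs 3 comparisons (one inside the
    pair, one against the running minimum, one against the running maximum).
    """
    n = len(items)
    if n == 0:
        raise ValueError("items must be non-empty")
    comparisons = 0
    if n % 2 == 1:
        # Odd length: the first item starts as both minimum and maximum.
        lo = hi = items[0]
        start = 1
    else:
        # Even length: one comparison orders the first pair.
        comparisons += 1
        if items[0] < items[1]:
            lo, hi = items[0], items[1]
        else:
            lo, hi = items[1], items[0]
        start = 2
    for i in range(start, n, 2):
        a, b = items[i], items[i + 1]
        comparisons += 3
        if a < b:
            small, large = a, b
        else:
            small, large = b, a
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons


print(min_and_max([17, 12, 6, 23, 19, 8, 5, 10]))  # (5, 23, 10)
```

For n = 8 this uses 1 + 3·3 = 10 comparisons, compared with 14 for two independent scans.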

Median Problem
How quickly can we find the median (or, in general, the kth largest element) of an unsorted list of numbers?
Two approaches:
Quicksort partition algorithm: expected Θ(n) time, but Ω(n²) time in the worst case
Deterministic algorithm: Θ(n) time in the worst case

Quicksort Approach
int Select(int A[], k, low, high)
Choose a pivot item
Determine the rank of the pivot element in the current partition
Compare all items to this pivot element
If the pivot is the kth item, return the pivot
Else update low and high and recurse on the partition that contains the kth item
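The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the slides' own code: the pivot is chosen uniformly at random, the partition is a Lomuto-style pass, k counts from 1 with k = 1 meaning the smallest item, and the recursion is written as a loop. The names `select`, `store`, and `rng` are hypothetical.

```python
import random


def select(a, k, rng=random):
    """Return the k-th smallest item of list a (k = 1 is the minimum).

    Rearranges a in place. Assumes a random pivot choice; the slides
    leave the pivot rule unspecified.
    """
    if not 1 <= k <= len(a):
        raise ValueError("k must be between 1 and len(a)")
    low, high = 0, len(a) - 1
    while True:
        # Choose a pivot item and move it to the end of the partition.
        p = rng.randint(low, high)
        a[p], a[high] = a[high], a[p]
        pivot = a[high]
        # Compare all other items in the partition to the pivot.
        store = low
        for i in range(low, high):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[high] = a[high], a[store]
        # Items left of `low` are all smaller, so this rank is global.
        rank = store + 1
        if rank == k:
            return pivot
        if k < rank:
            high = store - 1   # continue in the left partition
        else:
            low = store + 1    # continue in the right partition


data = [17, 12, 6, 23, 19, 8, 5, 10]
print(select(list(data), 5))  # 12
```

Because only one side of each partition is explored, the work shrinks geometrically on average, which is where the expected linear running time comes from.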

Example (k = 5)
Input: 17 12 6 23 19 8 5 10

Array contents                low  high  rank
17 12  6 23 19  8  5 10        1     8
 6  8  5 10 17 12 23 19        5     8     4
17 12 19 23                    5     6     7
12 17                                      5  (found)

Probabilistic Analysis
Assume each of the n! permutations is equally likely.
Modify the earlier indicator-variable analysis of quicksort to handle this k-selection problem.
What is the probability that the ith smallest item is compared to the jth smallest item (i < j)?
If k is contained in (i..j): 2/(j-i+1)
If k ≤ i: 2/(j-k+1)
If k ≥ j: 2/(k-i+1)

Cases where (i..j) does not contain k

Case k ≥ j:
$$\sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \frac{2}{k-i+1} = \sum_{i=1}^{k-1} (k-i)\,\frac{2}{k-i+1} = \sum_{i=1}^{k-1} \frac{2i}{i+1} \quad \text{[replace } k-i \text{ with } i\text{]} \quad = 2\sum_{i=1}^{k-1} \frac{i}{i+1} \le 2(k-1)$$

Case k ≤ i:
$$\sum_{j=k+1}^{n} \sum_{i=k}^{j-1} \frac{2}{j-k+1} = \sum_{j=k+1}^{n} (j-k)\,\frac{2}{j-k+1} = \sum_{j=1}^{n-k} \frac{2j}{j+1} \quad \text{[replace } j-k \text{ with } j \text{ and change bounds]} \quad = 2\sum_{j=1}^{n-k} \frac{j}{j+1} \le 2(n-k)$$

Total for both cases is ≤ 2(k-1) + 2(n-k) = 2n-2

Case where (i..j) contains k
At most 1 interval of size 3 contains k: i=k-1, j=k+1
At most 2 intervals of size 4 contain k: i=k-1, j=k+2 and i=k-2, j=k+1
In general, at most q-2 intervals of size q contain k
Thus we get
$$\sum_{q=3}^{n} (q-2)\,\frac{2}{q} \le \sum_{q=3}^{n} 2 = 2(n-2)$$
Summing together all cases, (2n-2) + 2(n-2) = 4n-6, so the expected number of comparisons is less than 4n
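A quick empirical check of the "less than 4n" bound can be sketched as follows. This is an illustrative simulation under stated assumptions, not part of the original analysis: the items are distinct, the input is a random permutation and the pivot is also chosen at random, and each partitioning step is charged one comparison per non-pivot item. The names `count_comparisons` and `average_comparisons` are hypothetical.

```python
import random


def count_comparisons(items, k, rng):
    """Return (k-th smallest item, number of item-vs-pivot comparisons).

    Assumes distinct items and 1 <= k <= len(items).
    """
    items = list(items)
    comparisons = 0
    while True:
        pivot = items[rng.randrange(len(items))]
        smaller = [x for x in items if x < pivot]
        larger = [x for x in items if x > pivot]
        # Charge one comparison for every non-pivot item in this partition.
        comparisons += len(items) - 1
        rank = len(smaller) + 1
        if k == rank:
            return pivot, comparisons
        if k < rank:
            items = smaller
        else:
            items = larger
            k -= rank


def average_comparisons(n, k, trials, seed=0):
    """Average comparison count over random permutations of 1..n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(1, n + 1))
        rng.shuffle(perm)
        value, comparisons = count_comparisons(perm, k, rng)
        assert value == k  # the k-th smallest of 1..n is k itself
        total += comparisons
    return total / trials


n = 200
print(average_comparisons(n, n // 2, trials=300), "versus bound", 4 * n)
```

The printed average should fall below 4n; individual runs can exceed it, since the bound is on the expectation only.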

Best case, Worst case
Best-case running time?
What happens in the worst case?
The pivot element chosen is always what?
This leads to comparing all possible pairs
This leads to Θ(n²) comparisons

Deterministic O(n) approach
Need to guarantee a good pivot element while doing O(n) work to find the pivot element
int Select(int A[], k, low, high)
Choosing the pivot element:
Divide into groups of 5
For each group of 5, find that group's median
Use the median of the medians as the pivot element
Determine the rank of the pivot element:
Compare some remaining items directly to the median
Update low and high and recurse on the partition that contains the kth item (or return the kth item if it is the pivot)
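The deterministic approach can be sketched in Python as shown below. It is a simplified illustration rather than the slides' implementation: it builds new lists instead of updating low and high in place, it compares every item to the pivot rather than only the items whose position is still unknown, and it finds each group's median by sorting the group. The name `mom_select` is hypothetical, and k counts from 1 with k = 1 meaning the smallest item.

```python
def mom_select(items, k):
    """Return the k-th smallest item using a median-of-medians pivot."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # Divide into groups of 5 and take each group's median.
    medians = []
    for start in range(0, len(items), 5):
        group = sorted(items[start:start + 5])
        medians.append(group[(len(group) - 1) // 2])
    # Use the median of the medians as the pivot (recursive call on n/5 items).
    pivot = mom_select(medians, (len(medians) + 1) // 2)
    # Determine the pivot's rank by partitioning around it.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    if k <= len(smaller):
        return mom_select(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return mom_select(larger, k - len(smaller) - len(equal))


print(mom_select([17, 12, 6, 23, 19, 8, 5, 10], 5))  # 12
```

Keeping a separate `equal` list is a design choice made here so that duplicate values cannot stall the recursion; the slides do not discuss duplicates.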

Guarantees on the pivot element
The median of medians is guaranteed to be smaller than all the red-colored items. Why?
How many red items are there?
Likewise, the median of medians is guaranteed to be larger than the blue-colored items.
Thus the median of medians is in the range:
What elements do we need to compare to the pivot to determine its rank?
How many of these are there?
3n/10 (ignoring non-perfect division issues)

Analysis of number of comparisons
int Select(int A[], k, low, high)
Choosing the pivot element:
For each group of 5, find that group's median: c1 n/5 comparisons (c1 for the median of 5)
Find the median of the medians: recurse on a problem of size n/5
Compare remaining items directly to the median: c2 n comparisons
Recurse on the correct partition: recurse on a problem of size 7n/10
T(n) =

Solving recurrence relation
T(n) = T(7n/10) + T(n/5) + O(n)
Key observation: 7/10 + 1/5 = 9/10 < 1
Prove T(n) ≤ cn for some constant c by induction on n, writing the O(n) term as dn:
T(n) ≤ 7cn/10 + cn/5 + dn = 9cn/10 + dn
Need 9cn/10 + dn ≤ cn
Thus c/10 ≥ d, i.e., c ≥ 10d
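The bound c ≥ 10d can be sanity-checked numerically. The sketch below evaluates the recurrence with integer floors and an assumed base case T(0) = 0; the constant `D` stands in for d and is set to 1 purely for illustration. None of these choices come from the slides.

```python
from functools import lru_cache

D = 1  # hypothetical constant for the linear term dn


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(floor(7n/10)) + T(floor(n/5)) + D*n, with T(0) = 0."""
    if n == 0:
        return 0
    return T((7 * n) // 10) + T(n // 5) + D * n


for n in (10, 1000, 100000, 10**7):
    # The ratio T(n) / n should stay at or below 10 * D.
    print(n, T(n), T(n) / n)
```

If the two fractions summed to 1 or more, the ratio T(n)/n would keep growing with n instead of staying bounded.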