Order Statistics.

Order Statistic
The ith order statistic of a set of n elements is the ith smallest element.
Minimum: the 1st order statistic. Maximum: the nth order statistic.
Median: the “half-way point” of the set.
When n is odd the median is unique and occurs at i = (n+1)/2.
When n is even there are two medians: the lower median at i = n/2 and the upper median at i = n/2 + 1.
For consistency, “median” will refer to the lower median.
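The median positions are easy to compute directly (a minimal Python sketch; the helper name is ours, not from the slides):

    def median_indices(n):
        """Return the 1-based positions of the lower and upper medians of n elements."""
        if n % 2 == 1:
            return ((n + 1) // 2, (n + 1) // 2)   # odd n: the median is unique
        return (n // 2, n // 2 + 1)               # even n: lower and upper median

    # median_indices(5) == (3, 3);  median_indices(6) == (3, 4)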

Selection Problem
Input: a set A of n distinct numbers and a number i, with 1 ≤ i ≤ n.
Output: the element x ∈ A that is larger than exactly i - 1 other elements of A.
Can be solved in O(n lg n) time. How? Sort A, then return the ith element.
We will study linear-time algorithms: for the special cases i = 1 and i = n, and for the general problem.
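For reference, the O(n lg n) sort-based solution is a one-liner (an illustrative Python sketch; names are ours):

    def select_by_sorting(A, i):
        """Return the ith smallest element (1-based i) of the list A in O(n lg n) time."""
        return sorted(A)[i - 1]

    # select_by_sorting([8, 1, 5, 3, 4], 3) == 4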

Minimum (Maximum)

    Minimum(A)
        min ← A[1]
        for i ← 2 to length[A]
            do if min > A[i]
                then min ← A[i]
        return min

Maximum can be determined similarly. T(n) = Θ(n). Number of comparisons: n - 1.
Can we do better? No: every element other than the minimum must lose at least one comparison, so n - 1 comparisons are worst-case optimal, and Minimum(A) achieves that bound.
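The same loop in Python (a minimal sketch, assuming a non-empty list and 0-based indexing):

    def minimum(A):
        """Return the smallest element of A using exactly len(A) - 1 comparisons."""
        smallest = A[0]
        for x in A[1:]:
            if x < smallest:      # one comparison per remaining element
                smallest = x
        return smallest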

Selection Problem (revisited)
Same input and output as before: given n distinct numbers and 1 ≤ i ≤ n, return the element larger than exactly i - 1 others.
Selection seems harder than finding the minimum or maximum, but it has the same asymptotic running time.
QuickMedian (Randomized-Select): expected linear time (randomized).
BFPRT73 (Blum, Floyd, Pratt, Rivest, Tarjan, 1973): worst-case linear time (deterministic).
Key idea: the work shrinks geometrically from level to level, and a geometric series sums to a linear total.

Randomized Quicksort: review

    Quicksort(A, p, r)
        if p < r then
            q := Rnd-Partition(A, p, r)
            Quicksort(A, p, q - 1)
            Quicksort(A, q + 1, r)
        fi

    Rnd-Partition(A, p, r)
        i := Random(p, r)
        A[r] ↔ A[i]
        x, i := A[r], p - 1
        for j := p to r - 1 do
            if A[j] ≤ x then
                i := i + 1
                A[i] ↔ A[j]
            fi
        od
        A[i + 1] ↔ A[r]
        return i + 1

(Figure: Partition splits A[p..r] around the pivot x, leaving A[p..q - 1] with elements ≤ x and A[q+1..r] with elements ≥ x.)
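A Python rendering of the same routines (a minimal sketch, assuming 0-based indexing; the function names are ours):

    import random

    def rnd_partition(A, p, r):
        """Partition A[p..r] in place around a randomly chosen pivot; return the pivot's final index."""
        i = random.randint(p, r)
        A[r], A[i] = A[i], A[r]          # move the random pivot to position r
        x = A[r]
        i = p - 1
        for j in range(p, r):
            if A[j] <= x:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[i + 1], A[r] = A[r], A[i + 1]  # place the pivot between the two regions
        return i + 1

    def quicksort(A, p, r):
        """Sort A[p..r] in place."""
        if p < r:
            q = rnd_partition(A, p, r)
            quicksort(A, p, q - 1)
            quicksort(A, q + 1, r)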

Selection in Expected Linear Time
Key idea: run Quicksort's partition step, but recurse only on the side that contains the ith element.
Exploit two abilities of Randomized-Partition (RP):
RP reveals the rank k, within the current subarray, of the randomly chosen pivot. If the order statistic i = k, then we are done.
Otherwise, reduce the problem size using its other ability: RP rearranges all other elements around the pivot, so if i < k the selection narrows to A[1..k - 1], and otherwise we select the (i - k)th element from A[k+1..n].

Randomized-Select

    Randomized-Select(A, p, r, i)    // select the ith order statistic of A[p..r]
        if p = r
            then return A[p]
        q ← Randomized-Partition(A, p, r)
        k ← q - p + 1
        if i = k
            then return A[q]
        elseif i < k
            then return Randomized-Select(A, p, q - 1, i)
        else return Randomized-Select(A, q + 1, r, i - k)
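A direct Python translation, reusing rnd_partition from the earlier sketch (again 0-based and illustrative rather than the slides' exact code):

    def randomized_select(A, p, r, i):
        """Return the ith smallest element (1-based i) of A[p..r]; reorders A as a side effect."""
        if p == r:
            return A[p]
        q = rnd_partition(A, p, r)
        k = q - p + 1                 # rank of the pivot within A[p..r]
        if i == k:
            return A[q]
        elif i < k:
            return randomized_select(A, p, q - 1, i)
        else:
            return randomized_select(A, q + 1, r, i - k)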

Randomized-Select Example
Goal: find the 3rd smallest element of 8 1 5 3 4.
The first Randomized-Partition happens to pick 3 as the pivot and rearranges the array to 1 3 8 5 4, so the pivot has rank k = 2.
Since i = 3 > k = 2, recurse on the right part 8 5 4, now looking for its (i - k) = 1st smallest element.
The next partition picks 4 as the pivot, giving 4 5 8 with the pivot at rank 1, which is exactly the rank we want.
Return 4: the 3rd smallest element of the original array.
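Running the earlier sketch on this input reproduces the result (the random pivots may differ from the trace above, but the return value is always the same):

    A = [8, 1, 5, 3, 4]
    print(randomized_select(A, 0, len(A) - 1, 3))   # prints 4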

Analysis
Worst-case complexity: Θ(n²). We could get unlucky and always recurse on a subarray that is only one element smaller than the previous one, giving T(n) = T(n - 1) + Θ(n).
Average-case complexity: Θ(n). Intuition: because the pivot is chosen at random, we expect to get rid of about half of the list each time we choose a random pivot q.
Why Θ(n) and not Θ(n lg n)?

Analysis
Average-case complexity, more intuition: even if we only get rid of 10% of the list each time we choose a random pivot q, the recurrence is
    T(n) = T(9n/10) + Θ(n).
Recall the master theorem: let a ≥ 1 and b > 1, and T(n) = a T(n/b) + c n^k for n ≥ 0.
    1. If a > b^k, then T(n) = Θ(n^(log_b a)).
    2. If a = b^k, then T(n) = Θ(n^k lg n).
    3. If a < b^k, then T(n) = Θ(n^k).
Here a = 1, b = 10/9 and k = 1, so a < b^k and case 3 gives T(n) = Θ(n).
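The geometric-series view mentioned earlier gives the same answer directly; unrolling T(n) = T(9n/10) + cn is a short worked check (LaTeX notation):

    T(n) \le cn\left(1 + \tfrac{9}{10} + \left(\tfrac{9}{10}\right)^{2} + \cdots\right)
         = cn \sum_{j \ge 0} \left(\tfrac{9}{10}\right)^{j} = 10\,cn,

so T(n) = O(n); since T(n) ≥ cn trivially, T(n) = Θ(n).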

Selection in Worst-Case Linear Time
Algorithm Select:
Like Randomized-Select, it finds the desired element by recursively partitioning the input array.
Unlike Randomized-Select, it is deterministic: it uses a variant of the deterministic Partition routine that is told which element to use as the pivot.
It achieves linear-time complexity in the worst case by guaranteeing that the split is always “good” at each Partition.
How can a good split be guaranteed?

Choosing a Pivot
Median-of-Medians:
Divide the n elements into ⌈n/5⌉ groups: ⌊n/5⌋ groups contain 5 elements each, and 1 group may contain the remaining n mod 5 < 5 elements.
Determine the median of each of the groups.
Recursively find the median x of the ⌈n/5⌉ group medians.

Algorithm Select
Determine the median-of-medians x (as on the previous slide).
Partition the input array around x (using Partition from Quicksort, with x as the pivot).
Let k be the index of x that Partition returns.
If k = i, then return x.
Else if i < k, apply Select recursively to A[1..k-1] to find the ith smallest element.
Else (i > k), apply Select recursively to A[k+1..n] to find the (i - k)th smallest element.
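A compact Python sketch of this algorithm (illustrative only: it builds new sublists instead of partitioning in place, which keeps the code short but differs from the in-place version the slides describe; elements are assumed distinct, as in the slides):

    def select(A, i):
        """Return the ith smallest element (1-based i) of A in worst-case linear time."""
        if len(A) <= 5:
            return sorted(A)[i - 1]
        # Median of each group of 5 (the last group may be smaller).
        groups = [A[j:j + 5] for j in range(0, len(A), 5)]
        medians = [sorted(g)[len(g) // 2] for g in groups]
        x = select(medians, (len(medians) + 1) // 2)   # median of the medians (lower median)
        # Partition around x.
        less = [a for a in A if a < x]
        greater = [a for a in A if a > x]
        k = len(less) + 1                               # rank of x in A
        if i == k:
            return x
        elif i < k:
            return select(less, i)
        else:
            return select(greater, i - k)

    # select([8, 1, 5, 3, 4], 3) == 4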

Worst-case Split
(Figure: the ⌈n/5⌉ groups drawn as columns of 5 elements, the ⌈n/5⌉th group holding only n mod 5 elements, with the median-of-medians x in the center; arrows point from larger to smaller elements, so the elements < x lie on one side of x and the elements > x on the other.)

Worst-case Split
Assumption: elements are distinct. (Why? Ties would complicate the counting below.)
At least ⌈⌈n/5⌉/2⌉ of the groups have 3 of their 5 elements ≥ x; ignore the last group if it has fewer than 5 elements.
Hence, the number of elements ≥ x is at least 3(n - 4)/10.
Likewise, the number of elements ≤ x is at least 3(n - 4)/10.
Thus, in the worst case, Select is called recursively on at most n - 3(n - 4)/10 = (7n + 12)/10 elements.

Recurrence for worst-case running time
T(Select) ≤ T(Median-of-medians) + T(Partition) + T(recursive call to Select):
    T(n) ≤ O(n) + T(⌈n/5⌉) + O(n) + T((7n + 12)/10)
         = T(⌈n/5⌉) + T(7n/10 + 1.2) + O(n)
(the first O(n) + T(⌈n/5⌉) is the median-of-medians step, the second O(n) is Partition, and T((7n + 12)/10) is the recursive call).
Base: for n ≤ 24, assume we just use Insertion sort, so T(n) ≤ 24n for all n ≤ 24.

Solving the recurrence
Base: for all n ≤ 24, T(n) ≤ 24n.
For n > 24, T(n) ≤ an + T(n/5) + T(7n/10 + 1.2), treating ⌈n/5⌉ as n/5 for simplicity.
We want to find c > 0 such that T(n) ≤ cn for all n > 0. The base case forces c ≥ 24.
    T(n) ≤ an + T(n/5) + T(7n/10 + 1.2)
         ≤ an + cn/5 + 7cn/10 + 1.2c        (by the inductive hypothesis)
         = cn - (cn/10 - an - 1.2c)
         = cn - ((c/20 - a)n + (n/20 - 1.2)c)
         ≤ cn,   as long as c ≥ 20a (the second term is nonnegative because n > 24).
So c = max(24, 20a) works.
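A tiny numerical spot check of these constants, iterating the simplified recurrence exactly as stated above (purely illustrative):

    def T(n, a=1.0):
        """Iterate T(n) <= a*n + T(n/5) + T(7n/10 + 1.2), with base case T(n) = 24n for n <= 24."""
        if n <= 24:
            return 24.0 * n
        return a * n + T(n / 5, a) + T(0.7 * n + 1.2, a)

    # The proof gives c = max(24, 20a); with a = 1 that is c = 24.
    for n in [25, 100, 1000, 10**5]:
        assert T(n) <= 24 * n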

Conclusions
We can find the ith smallest element of an unordered list in Θ(n) worst-case time.
This lets us do Quicksort in worst-case Θ(n lg n): always partition around the exact median.
That constant, roughly 20 times the partition cost, is high, so use Randomized-Select (a.k.a. QuickMedian) in practice.
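As an illustration of the Quicksort remark, here is a sketch that always partitions around the exact median found by the select function above (building new lists for brevity; elements assumed distinct, as elsewhere in these slides):

    def median_sort(A):
        """Worst-case O(n lg n) sort: partition around the true median at every level."""
        if len(A) <= 1:
            return A
        m = select(A, (len(A) + 1) // 2)   # lower median, found in linear time
        less = [a for a in A if a < m]
        greater = [a for a in A if a > m]
        return median_sort(less) + [m] + median_sort(greater)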