CS 3343: Analysis of Algorithms


CS 3343: Analysis of Algorithms Order Statistics

Order statistics
The ith order statistic in a set of n elements is the ith smallest element.
The minimum is thus the 1st order statistic.
The maximum is the nth order statistic.
The median is the (n+1)/2-th order statistic when n is odd. If n is even, there are 2 medians: the n/2-th and the (n/2 + 1)-th order statistics.
How can we calculate order statistics? What is the running time?

Order statistics: the selection problem
Select the ith smallest of n elements.
Naive algorithm: sort, then index the ith element. Worst-case running time Θ(n log n) using merge sort or heapsort (not quicksort, whose worst case is Θ(n²)).
We will show:
    A practical randomized algorithm with Θ(n) expected running time.
    A cool algorithm, of theoretical interest only, with Θ(n) worst-case running time.
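The naive algorithm can be sketched in a few lines of Python. This is only an illustration: the function name `select_by_sorting` is an assumption, not something from the slides, and it leans on Python's built-in `sorted`, which runs in O(n log n) time in the worst case.

```python
def select_by_sorting(items, i):
    """Return the i-th smallest element of items (i is 1-indexed).

    Naive approach: sort a copy, then index. The cost is dominated
    by the sort, so it is O(n log n) rather than linear.
    """
    if not 1 <= i <= len(items):
        raise ValueError("i must satisfy 1 <= i <= n")
    return sorted(items)[i - 1]
```

With the array used later in the example slides, `select_by_sorting([7, 10, 5, 8, 11, 3, 2, 13], 6)` returns 10.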

Recall: Quicksort
[Figure: array A[p..q] partitioned around the pivot x at index r, with elements ≤ x to its left and elements ≥ x to its right; k is the rank of x.]
The function Partition gives us the rank k of the pivot.
If we are lucky, k = i: done!
If not, we at least get a smaller subarray to work with:
    k > i: the ith smallest is in the left subarray.
    k < i: the ith smallest is in the right subarray.
Divide and conquer:
    If we are lucky, k is close to n/2, or the desired element is in the smaller subarray.
    If unlucky, the desired element is in the larger subarray (possible size n – 1).

Randomized divide-and-conquer algorithm
RAND-SELECT(A, p, q, i)    ⊳ ith smallest of A[p . . q]
    if p = q & i > 1 then error!
    r ← RAND-PARTITION(A, p, q)
    k ← r – p + 1    ⊳ k = rank(A[r])
    if i = k then return A[r]
    if i < k
        then return RAND-SELECT(A, p, r – 1, i)
        else return RAND-SELECT(A, r + 1, q, i – k)
[Figure: A[p..q] after partitioning: elements ≤ A[r], then A[r] at index r (rank k), then elements ≥ A[r].]

Randomized Partition
Randomly choose an element as the pivot.
Every time we need to do a partition, throw a die to decide which element to use as the pivot.
Each of the n elements has probability 1/n of being selected.
Rand-Partition(A, p, q) {
    d = random();                     // draw a random number d with 0 <= d < 1
    index = p + floor((q-p+1) * d);   // p <= index <= q
    swap(A[p], A[index]);
    return Partition(A, p, q);        // now use A[p] as pivot; return its final index
}
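A runnable Python sketch of randomized partition and randomized selection, under some assumptions: arrays are 0-indexed with inclusive bounds p and q, the helper `partition` uses one common first-element-pivot scheme (the slides do not say which partition routine they use), and the tail recursion of RAND-SELECT is written as a loop. The function names are illustrative.

```python
import random


def partition(a, p, q):
    """Partition a[p..q] (inclusive) around the pivot a[p].

    Returns the final index of the pivot. Elements <= pivot end up
    to its left, larger elements to its right.
    """
    pivot = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


def rand_partition(a, p, q):
    """Move a uniformly random element of a[p..q] to a[p], then partition."""
    index = random.randint(p, q)  # randint is inclusive on both ends
    a[p], a[index] = a[index], a[p]
    return partition(a, p, q)


def rand_select(a, p, q, i):
    """Return the i-th smallest (1-indexed) element of a[p..q].

    Rearranges a in place. Expected linear time, quadratic worst case.
    """
    if not 1 <= i <= q - p + 1:
        raise ValueError("i is out of range for a[p..q]")
    while p < q:
        r = rand_partition(a, p, q)
        k = r - p + 1  # rank of the pivot within a[p..q]
        if i == k:
            return a[r]
        if i < k:
            q = r - 1  # continue in the left part
        else:
            p = r + 1  # continue in the right part
            i -= k
    return a[p]
```

For example, `rand_select([7, 10, 5, 8, 11, 3, 2, 13], 0, 7, 6)` returns 10 regardless of which pivots are drawn.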

Example
Select the i = 6th smallest:
    7 10 5 8 11 3 2 13    (pivot: 7, i = 6)
Partition:
    3 2 5 7 11 8 10 13    (k = 4)
Select the 6 – 4 = 2nd smallest recursively.

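Complete example: select the 6th smallest element.
    7 10 5 8 11 3 2 13    i = 6 (input)
    3 2 5 7 11 8 10 13    after partition: k = 4; recurse on the right part with i = 6 – 4 = 2
    10 8 11 13            after partition: k = 3; i = 2 < k, so recurse on the left part
    8 10                  after partition: k = 2; i = 2 = k
    10                    the answer
Note: here we always used the first element as the pivot to do the partition (instead of rand-partition).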

Intuition for analysis
(All our analyses today assume that all elements are distinct.)
Lucky: T(n) = T(9n/10) + Θ(n) = Θ(n)    (master method, CASE 3)
Unlucky: T(n) = T(n – 1) + Θ(n) = Θ(n²)    (arithmetic series)
Worse than sorting!
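Both recurrences can also be checked by unrolling. The sketch below assumes the Θ(n) term is exactly cn for some constant c > 0 and that T(0) = 0; these are simplifying assumptions, not part of the slides.

```latex
% Lucky case: the problem shrinks by a constant factor each time,
% so the costs form a geometric series.
T(n) = T\!\left(\tfrac{9n}{10}\right) + cn
     \le cn \sum_{j=0}^{\infty} \left(\tfrac{9}{10}\right)^{j}
     = 10\,cn = \Theta(n)

% Unlucky case: the problem shrinks by one element each time,
% so the costs form an arithmetic series.
T(n) = T(n-1) + cn
     = c \sum_{k=1}^{n} k
     = \frac{c\,n(n+1)}{2} = \Theta(n^{2})
```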

Running time of randomized selection
For an upper bound, assume the ith element always falls in the larger side of the partition:
T(n) ≤ T(max(0, n–1)) + n    if 0 : n–1 split,
       T(max(1, n–2)) + n    if 1 : n–2 split,
       ⋮
       T(max(n–1, 0)) + n    if n–1 : 0 split.
The expected running time is an average of all cases (expectation).
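The expectation formula itself did not survive in this transcript. A standard way to write it, given here as a reconstruction under the assumption that each of the n splits occurs with probability 1/n and that T(n) now denotes the expected running time, is:

```latex
T(n) \le \frac{1}{n} \sum_{k=0}^{n-1} T\bigl(\max(k,\; n-k-1)\bigr) + n
     \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} T(k) + n
```

The second inequality holds because max(k, n–k–1) takes each value between ⌊n/2⌋ and n–1 at most twice as k ranges over 0, ..., n–1.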

Substitution method
Want to show T(n) = O(n), so we need to prove T(n) ≤ cn for n > n0.
Assume: T(k) ≤ ck for all k < n.
The inductive step gives T(n) ≤ cn if c ≥ 4.
Therefore, T(n) = O(n).
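The inductive step is not preserved in this transcript. One derivation consistent with the condition c ≥ 4 is sketched below. It assumes the expected-time recurrence T(n) ≤ (2/n) Σ T(k) + n with k running from ⌊n/2⌋ to n–1, and uses the fact that the sum of k over that range is at most 3n²/8.

```latex
\begin{align*}
T(n) &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + n
        && \text{(inductive hypothesis)} \\
     &\le \frac{2c}{n} \cdot \frac{3n^{2}}{8} + n
        && \text{(bound on the sum)} \\
     &= \frac{3cn}{4} + n
      = cn - \left( \frac{cn}{4} - n \right) \\
     &\le cn
        && \text{if } c \ge 4
\end{align*}
```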

Summary of randomized selection
Works fast: linear expected time.
Excellent algorithm in practice.
But the worst case is very bad: Θ(n²).
Q. Is there an algorithm that runs in linear time in the worst case?
A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
IDEA: Generate a good pivot recursively.

Worst-case linear-time selection
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k
       then recursively SELECT the ith smallest element in the lower part
       else recursively SELECT the (i–k)th smallest element in the upper part
   (Step 4 is the same as in RAND-SELECT.)
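The four steps can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation: it assumes all elements are distinct (as the slides do), builds new lists instead of partitioning in place, sorts each small group to find its median, and handles inputs of at most 5 elements directly. The name `select` and the base-case threshold are assumptions.

```python
def select(items, i):
    """Return the i-th smallest (1-indexed) of distinct items.

    Median-of-medians selection: worst-case linear time.
    """
    a = list(items)
    n = len(a)
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    if n <= 5:
        return sorted(a)[i - 1]  # small base case, solved by rote

    # Step 1: split into floor(n/5) full groups of 5 and take each median.
    full = n - n % 5
    medians = [sorted(a[j:j + 5])[2] for j in range(0, full, 5)]

    # Step 2: recursively select the median of the group medians as pivot x.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x; k is the rank of x.
    lower = [v for v in a if v < x]
    upper = [v for v in a if v > x]
    k = len(lower) + 1

    # Step 4: return x or recurse into the side that holds the answer.
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)
```

For example, `select([7, 10, 5, 8, 11, 3, 2, 13], 6)` returns 10, matching the earlier worked example.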

Choosing the pivot
Divide the n elements into groups of 5.
Find the median of each 5-element group by rote.
Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
[Figure: the 5-element groups and their medians, with the pivot x marked; labels: lesser, greater.]

Analysis
(Assume all elements are distinct.)
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3⌊n/10⌋ elements are ≤ x: each of those group medians, plus the 2 smaller elements in its group.
Similarly, at least 3⌊n/10⌋ elements are ≥ x.
[Figure: the 5-element groups with the pivot x marked; labels: lesser, greater.]

Analysis
Need "at most" for worst-case runtime.
At least 3⌊n/10⌋ elements are ≤ x ⇒ at most n – 3⌊n/10⌋ elements are > x.
At least 3⌊n/10⌋ elements are ≥ x ⇒ at most n – 3⌊n/10⌋ elements are < x.
The recursive call to SELECT in Step 4 is executed on at most n – 3⌊n/10⌋ elements.
[Figure: possible positions for the pivot, with 3⌊n/10⌋ marked.]

Analysis
Use the fact that ⌊a/b⌋ > a/b – 1:
n – 3⌊n/10⌋ < n – 3(n/10 – 1) = 7n/10 + 3 ≤ 3n/4 if n ≥ 60.
The recursive call to SELECT in Step 4 is executed on at most 7n/10 + 3 elements.
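The chain of inequalities can be spot-checked numerically. The short Python sketch below uses integer arithmetic only (each inequality is multiplied through to clear fractions); the function name `bounds_hold` is illustrative.

```python
def bounds_hold(n):
    """Check n - 3*floor(n/10) < 7n/10 + 3 <= 3n/4 for a given n."""
    size = n - 3 * (n // 10)             # worst-case size of the Step 4 call
    first = 10 * size < 7 * n + 30       # size < 7n/10 + 3, scaled by 10
    second = 4 * (7 * n + 30) <= 30 * n  # 7n/10 + 3 <= 3n/4, scaled by 40
    return first and second
```

The first inequality holds for every n, while the second starts to hold exactly at n = 60, so `bounds_hold(59)` is false and `bounds_hold(60)` is true.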

Developing the recurrence
T(n)          SELECT(i, n)
Θ(n)          1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
T(n/5)        2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
Θ(n)          3. Partition around the pivot x. Let k = rank(x).
T(7n/10+3)    4. if i = k then return x
                 elseif i < k
                     then recursively SELECT the ith smallest element in the lower part
                     else recursively SELECT the (i–k)th smallest element in the upper part
Summing the costs: T(n) ≤ T(n/5) + T(7n/10 + 3) + Θ(n).

Solving the recurrence
Assumption: T(k) ≤ ck for all k < n.
The bound 7n/10 + 3 ≤ 3n/4 holds if n ≥ 60.
The inductive step gives T(n) ≤ cn if c ≥ 20 and n ≥ 60.
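The derivation itself is missing from this transcript. A reconstruction consistent with the stated conditions (n ≥ 60, c ≥ 20) and with the 19/20 factor mentioned in the conclusions is sketched below. It assumes the Θ(n) term is exactly n and ignores floors, which are simplifications.

```latex
\begin{align*}
T(n) &\le T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{7n}{10} + 3\right) + n \\
     &\le \frac{cn}{5} + c\left(\frac{7n}{10} + 3\right) + n
        && \text{(inductive hypothesis)} \\
     &\le \frac{cn}{5} + \frac{3cn}{4} + n
        && \text{if } n \ge 60 \\
     &= \frac{19cn}{20} + n
      = cn - \left( \frac{cn}{20} - n \right) \\
     &\le cn
        && \text{if } c \ge 20
\end{align*}
```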

Conclusions
Since the work at each level of recursion shrinks by a constant factor (19/20), the work per level is a geometric series dominated by the linear work at the root.
In practice, this algorithm runs slowly, because the constant in front of n is large.
The randomized algorithm is far more practical.
Exercise: Try to divide into groups of 3 or 7.
Exercise: Think about an application in sorting.