Introduction to Algorithms Jiafen Liu Sept. 2013

Today’s Tasks
Order Statistics
–Randomized divide and conquer
–Analysis of expected time
–Worst-case linear-time order statistics
–Analysis

Order Statistics
Given n elements in an array, select the ith smallest element (the element with rank i).
This has various applications:
–i = 1: find the minimum element.
–i = n: find the maximum element.
–Find the median: i = (n+1)/2 when n is odd, or i = n/2 and n/2 + 1 when n is even.
–This is useful in statistics.

How to find the ith element?
Naïve algorithm:
–Sort array A, then return the element A[i].
–Using merge sort or randomized quicksort,
–worst-case running time = Θ(n lg n) + Θ(1) = Θ(n lg n).
Can we do better than that?
–The problem is related to sorting, but different.
–Our goal is expected time Θ(n).

Randomized divide-and-conquer algorithm
The subroutine RAND-PARTITION(A, p, q) used here should look familiar.

Partitioning subroutine
PARTITION(A, p, q)        // A[p..q]
  x ← A[p]                // pivot = A[p]
  i ← p
  for j ← p+1 to q
      do if A[j] ≤ x
            then i ← i+1
                 exchange A[i] ↔ A[j]
  exchange A[p] ↔ A[i]
  return i

Randomized divide-and-conquer algorithm
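The RAND-SELECT pseudocode itself is not reproduced in this transcript. A minimal Python sketch of the algorithm and its partition subroutine, following the structure described on these slides (the function and variable names are ours), might look like this:

    import random

    def rand_partition(A, p, q):
        """Partition A[p..q] around a randomly chosen pivot; return the pivot's final index."""
        r = random.randint(p, q)
        A[p], A[r] = A[r], A[p]          # move the random pivot to the front
        x = A[p]                          # pivot value
        i = p
        for j in range(p + 1, q + 1):
            if A[j] <= x:
                i += 1
                A[i], A[j] = A[j], A[i]
        A[p], A[i] = A[i], A[p]           # place the pivot between the two sides
        return i

    def rand_select(A, p, q, i):
        """Return the ith smallest element (1-indexed) of A[p..q]."""
        if p == q:
            return A[p]
        r = rand_partition(A, p, q)
        k = r - p + 1                     # rank of the pivot within A[p..q]
        if i == k:
            return A[r]
        elif i < k:
            return rand_select(A, p, r - 1, i)     # ith smallest is in the lower part
        else:
            return rand_select(A, r + 1, q, i - k) # (i-k)th smallest in the upper part

For instance, rand_select(A, 0, len(A) - 1, 7) would return the 7th smallest element, as in the example on the next slide.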

Example
Select the i = 7th smallest element of the input array: partition around the pivot.

Algorithm Analysis
(All our analyses today assume that all elements are distinct.)
Like quicksort, our algorithm depends on how well PARTITION splits the array.
Recall: what is the lucky case of PARTITION?
–A median split?
–A 1/10 : 9/10 split?
–Every case is lucky except the 0 : n−1 or n−1 : 0 splits.

Lucky or Unlucky?
Lucky:
–Take the 1/10 : 9/10 partition as an example.
–T(n) = T(9n/10) + Θ(n)
–How to solve it?
–Master Method (case 3):
–T(n) = Θ(n)
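As a quick check with the Master Method:
\[
T(n) = T\!\left(\tfrac{9}{10}n\right) + \Theta(n), \qquad
a = 1,\; b = \tfrac{10}{9},\; n^{\log_b a} = n^0 = 1,
\]
\[
f(n) = \Theta(n) = \Omega\!\left(n^{0+\varepsilon}\right)
\;\Longrightarrow\; T(n) = \Theta(n) \quad\text{(Master Theorem, case 3)}.
\]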

Lucky or Unlucky?
Unlucky:
–The 0 : n−1 or n−1 : 0 partition.
–T(n) = T(n−1) + Θ(n)
–T(n) = Θ(n²)
–That is an arithmetic series.
–Even worse than sorting first and then selecting!

Analysis of Expected Time
We dealt with the expected running time of the quicksort algorithm in Lecture 4.
–Recall how we handled that?
–There are n possible splits of the partition; how do we express them all in one expression?
–Indicator random variables.

Analysis of Expected Time
Let T(n) = the running time of RAND-SELECT on an input of size n, assuming the random numbers are independent.
To obtain an upper bound, assume that the ith element always falls in the larger side of the partition:
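The recurrence the slide shows (following the standard CLRS analysis) bounds each split by its larger side:
\[
T(n) \le
\begin{cases}
T(\max(0,\, n-1)) + \Theta(n) & \text{if the split is } 0 : n-1,\\
T(\max(1,\, n-2)) + \Theta(n) & \text{if the split is } 1 : n-2,\\
\quad\vdots & \\
T(\max(n-1,\, 0)) + \Theta(n) & \text{if the split is } n-1 : 0.
\end{cases}
\]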

Analysis of Expected Time
For k = 0, 1, …, n−1, define the indicator random variable:
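The definition shown on the slide, in the standard CLRS form, is:
\[
X_k =
\begin{cases}
1 & \text{if PARTITION generates a } k : n-k-1 \text{ split},\\
0 & \text{otherwise},
\end{cases}
\qquad
\Pr\{X_k = 1\} = \mathrm{E}[X_k] = \frac{1}{n}.
\]
Combining the cases with the indicators gives
\[
T(n) \le \sum_{k=0}^{n-1} X_k \bigl( T(\max(k,\, n-k-1)) + \Theta(n) \bigr).
\]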

Computing the Expected Time
The key step uses the independence of X_k from the recursive running time T(max(k, n−k−1)).
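The derivation the slide walks through, reconstructed here along the standard CLRS lines, is roughly:
\[
\mathrm{E}[T(n)]
\le \mathrm{E}\!\left[\sum_{k=0}^{n-1} X_k \bigl(T(\max(k,\, n-k-1)) + \Theta(n)\bigr)\right]
= \sum_{k=0}^{n-1} \mathrm{E}\bigl[X_k \, T(\max(k,\, n-k-1))\bigr] + \Theta(n)
\]
\[
= \sum_{k=0}^{n-1} \mathrm{E}[X_k]\;\mathrm{E}[T(\max(k,\, n-k-1))] + \Theta(n)
\quad\text{(independence)}
\]
\[
= \frac{1}{n}\sum_{k=0}^{n-1} \mathrm{E}[T(\max(k,\, n-k-1))] + \Theta(n)
\le \frac{2}{n}\sum_{k=\lfloor n/2\rfloor}^{n-1} \mathrm{E}[T(k)] + \Theta(n),
\]
since each value T(k) with k ≥ ⌊n/2⌋ appears at most twice in the sum.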

Computing the Expected Time
How to solve it? The Substitution Method:
–We guess the answer is Θ(n).
–Prove: E[T(n)] ≤ cn for some constant c.
–Try to do the rest of this by yourself.
The bound holds if c is chosen large enough so that cn/4 dominates the Θ(n) term.
Is that the end of the proof? Don’t forget the base case.
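A sketch of the omitted calculation, assuming the inductive hypothesis E[T(k)] ≤ ck for k < n and using the fact that the sum of k from ⌊n/2⌋ to n−1 is at most 3n²/8:
\[
\mathrm{E}[T(n)]
\le \frac{2}{n}\sum_{k=\lfloor n/2\rfloor}^{n-1} ck + \Theta(n)
\le \frac{2c}{n}\cdot\frac{3n^2}{8} + \Theta(n)
= \frac{3}{4}cn + \Theta(n)
= cn - \Bigl(\frac{cn}{4} - \Theta(n)\Bigr)
\le cn,
\]
provided c is large enough that cn/4 dominates the Θ(n) term.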

Summary of randomized order-statistic selection
–Works fast: linear expected time.
–But the worst case is bad: Θ(n²).
–Still an excellent algorithm in practice.
Question: Is there an algorithm that runs in linear time even in the worst case?
Picking the pivot randomly is simple, but it gives no worst-case guarantee.

Improvement of randomized selection
Due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
IDEA: Generate a really good pivot recursively.
How can we make this extra recursion cheap enough that the whole algorithm still runs in Θ(n)?

Worst-case linear-time order statistics
SELECT(i, n)
–Divide the n elements into ⌊n/5⌋ groups of 5 elements. Find the median of each 5-element group by rote.
–Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
–Partition around the pivot x. Let k = rank(x).
–If i = k then return x.
 Else if i < k then recursively select the ith smallest element in the lower part.
 Else recursively select the (i−k)th smallest element in the upper part.
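A compact Python sketch of this procedure, written for illustration (a value-based version rather than an in-place one; the 50-element cutoff matches the base case discussed later, and the last, shorter group is handled in a simplified way):

    def select(A, i):
        """Return the ith smallest element (1-indexed) of list A in worst-case linear time."""
        if len(A) <= 50:                       # base case: constant-size input, just sort
            return sorted(A)[i - 1]
        # Step 1: medians of the ⌊n/5⌋ full groups of 5 (the leftover partial group is
        # not used for choosing the pivot; this does not change the asymptotics).
        groups = [A[j:j + 5] for j in range(0, len(A) - len(A) % 5, 5)]
        medians = [sorted(g)[2] for g in groups]
        # Step 2: recursively select the median of the group medians as the pivot.
        x = select(medians, (len(medians) + 1) // 2)
        # Step 3: partition around x (elements assumed distinct, as on the earlier slide).
        lower = [a for a in A if a < x]
        upper = [a for a in A if a > x]
        k = len(lower) + 1                     # rank of x
        # Step 4: recurse on one side only.
        if i == k:
            return x
        elif i < k:
            return select(lower, i)
        else:
            return select(upper, i - k)

For example, select(list(data), 7) would return the 7th smallest element of data.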

Choosing the pivot
Divide the n elements into ⌊n/5⌋ groups of 5 elements.
Rearrange the five elements in each group so that
–the middle one is the median,
–the two above it are less than the median,
–the two below it are bigger.
(The figure shows the elements as a 5 × ⌊n/5⌋ grid, one group per column.)

Choosing the pivot
How much time does this step take? Θ(n): finding the median of 5 elements takes constant time per group.

Choosing the pivot
Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
In the figure, the groups (columns) are rearranged in order of their medians.

Choosing the pivot
Suppose that the whole SELECT(i, n) algorithm takes time T(n). What is the running time of this step? T(⌊n/5⌋) = T(n/5).
Now, what do we know about all these elements?

Analysis
Rest of the algorithm:
–Partition around the pivot x. Let k = rank(x).
–If i = k then return x.
 Else if i < k then recursively select the ith smallest element in the lower part.
 Else recursively select the (i−k)th smallest element in the upper part.
The total cost we are aiming for is Θ(n), so the cost of this final recursion must be strictly less than T(4n/5). Why?
–Because we have already spent a recursive call of T(n/5); the two fractions must sum to less than 1.

Analysis
Look at this figure carefully:
–the directed paths in it give us more information than we had before.

Analysis
Look at this figure carefully:
–the directed paths give us more information than we had before.
–All the elements in the highlighted block are ≤ x.
–How many elements are there?

Analysis
At least half of the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3⌊n/10⌋ elements are ≤ x.

Analysis
Look at this figure carefully:
–the directed paths give us more information than we had before.
–Now all the elements in the highlighted block are ≥ x.
–How many elements are there?

Analysis
At least half of the group medians are ≥ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3⌊n/10⌋ elements are ≥ x.
Similarly, at least 3⌊n/10⌋ elements are ≤ x.

Analysis
Then, what is the cost of the three-case recursion in the last step?
–One side has at least 3⌊n/10⌋ elements,
–so the other side has at most n − 3⌊n/10⌋ elements, roughly 7n/10;
–thus the cost of that recursion is at most T(n − 3⌊n/10⌋).
For n ≥ 50, we have 3⌊n/10⌋ ≥ n/4.
–This means that for n ≥ 50 the recursion is on at most 3n/4 elements, i.e. its cost is at most T(3n/4).
–T(3n/4) is even better than our target of 4n/5.
For n ≤ 50, we have T(n) = Θ(1).

Total Running Time
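Collecting the pieces, the recurrence (in the standard form for this algorithm, presumably what the slide shows) is, for n ≥ 50:
\[
T(n) \le T\!\left(\frac{n}{5}\right) + T\!\left(\frac{3n}{4}\right) + \Theta(n),
\qquad T(n) = \Theta(1) \text{ for } n \le 50,
\]
where T(n/5) is the recursive pivot selection, T(3n/4) is the final recursive call, and Θ(n) covers computing the group medians and partitioning.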

Solving the recurrence
How? The Substitution Method: guess T(n) ≤ cn and substitute into the recurrence. The "desired" cn term appears, minus a positive "residual", if c is chosen large enough to handle the Θ(n) term.
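A sketch of the substitution, assuming the inductive hypothesis T(k) ≤ ck for k < n:
\[
T(n) \le c\,\frac{n}{5} + c\,\frac{3n}{4} + \Theta(n)
= \frac{19}{20}cn + \Theta(n)
= \underbrace{cn}_{\text{desired}} - \underbrace{\Bigl(\frac{cn}{20} - \Theta(n)\Bigr)}_{\text{residual}}
\le cn,
\]
if c is chosen large enough that cn/20 dominates the Θ(n) term.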

Conclusions
–Since the work shrinks by a constant factor (19/20) at each level of recursion, the per-level work forms a geometric series, and the total is Θ(n).
–In practice, this algorithm runs slowly, because the constant in front of n is large.
–The randomized algorithm is far more practical.

Further Thought
Why did we use groups of five? Why not groups of three? How about seven?