Medians and Order Statistics


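i-th order statistic: the i-th smallest element of a set of n elements.
Median of n elements:
  n odd: the element of rank (n+1)/2.
  n even: the element of rank n/2 or n/2+1.
Assume the numbers are distinct.
Input: array A of n numbers and an integer i with 1 ≤ i ≤ n.
Output: the element x of A that is larger than exactly i-1 other elements of A.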

Solutions
O(n log n) time, based on …
O(n) time on average.
O(n) time in the worst case.

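Minimum and Maximum
How many comparisons are needed to find the minimum? n-1.
Upper bound: examine each element in turn and keep track of the smallest one seen so far.
Lower bound (comparison-based algorithms): every element must be compared, and every element except the winner must lose at least once.
What about finding the minimum and maximum simultaneously?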

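Min & Max
Can be done with 2n-2 comparisons (find the minimum and the maximum independently).
Can do better:
  Form pairs of elements.
  Compare the two elements in each pair.
  For a pair (ai, ai+1) with ai < ai+1, compare (min, ai) and (ai+1, max).
  3 comparisons for each pair, so about 3n/2 comparisons in total.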
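The pairing idea can be sketched in Python as follows. This is an illustrative sketch, not code from the slides: the function name `min_max` and the explicit comparison counter are assumptions added here so the 3-comparisons-per-pair claim can be checked.

```python
def min_max(a):
    """Return (minimum, maximum, comparisons) for a non-empty list a.

    Hypothetical helper: processes elements in pairs and counts
    element comparisons so the total can be inspected.
    """
    n = len(a)
    comparisons = 0
    if n % 2 == 1:
        # Odd length: the first element starts as both min and max.
        lo = hi = a[0]
        start = 1
    else:
        # Even length: one comparison initializes min and max.
        comparisons += 1
        lo, hi = (a[0], a[1]) if a[0] < a[1] else (a[1], a[0])
        start = 2
    for j in range(start, n, 2):
        # One comparison inside the pair ...
        comparisons += 1
        small, large = (a[j], a[j + 1]) if a[j] < a[j + 1] else (a[j + 1], a[j])
        # ... then one against the current min and one against the current max.
        comparisons += 2
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons
```

For 8 elements this uses 1 + 3·3 = 10 comparisons, compared with 2n-2 = 14 for two independent scans.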

Average Time Median Selection
Divide-and-conquer (prune-and-search).
Randomized: behavior is determined by the output of a random number generator.
Based on QuickSort: partition the input array recursively, but work on only one side.

Randomized Selection
QuickSort(A,p,r):
  if p < r then
    q = partition(A,p,r)
    QuickSort(A,p,q)
    QuickSort(A,q+1,r)
First call: QuickSort(A,1,n).
After partition(A,p,r): A[i] < A[j] for all p ≤ i ≤ q < j ≤ r.
RandSelect(A,p,r,i):
  if p == r then return A[p]
  q = RandPartition(A,p,r)
  k = q-p+1    /* size of A[p..q] */
  if i ≤ k then return RandSelect(A,p,q,i)
  else return RandSelect(A,q+1,r,i-k)
First call: RandSelect(A,1,n,i).
Returns the i-th smallest element in A[p..r].
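A minimal Python sketch of RandSelect, under some assumptions: indices are 0-based, and the partition routine is a Hoare-style partition around a randomly chosen pivot. That choice is inferred from the recursive calls on A[p..q] and A[q+1..r]; the slides themselves do not spell out RandPartition. The function names are illustrative.

```python
import random


def rand_partition(a, p, r):
    """Hoare-style partition of a[p..r] (inclusive, p < r) around a random pivot.

    Returns q with p <= q < r such that every element of a[p..q]
    is <= every element of a[q+1..r].
    """
    k = random.randint(p, r)      # random pivot position, inclusive bounds
    a[p], a[k] = a[k], a[p]
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:           # scan from the right for an element <= x
            j -= 1
        i += 1
        while a[i] < x:           # scan from the left for an element >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def rand_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = rand_partition(a, p, r)
    k = q - p + 1                 # size of the low side a[p..q]
    if i <= k:
        return rand_select(a, p, q, i)
    return rand_select(a, q + 1, r, i - k)
```

A call such as `rand_select(values, 0, len(values) - 1, i)` mirrors the first call RandSelect(A,1,n,i). The recursion is kept to match the pseudocode; an iterative loop would avoid deep recursion on unlucky pivot sequences.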

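Selection (cont.)
RandPartition (see 8.3, 8.4 textbook) gives a partition whose low side has:
  1 element with probability 2/n;
  j elements with probability 1/n, for j = 2, 3, …, n-1.
Assume the i-th element always falls on the larger side:
$T(n) \le \frac{1}{n}\left( T(\max(1, n-1)) + \sum_{k=1}^{n-1} T(\max(k, n-k)) \right) + O(n)$
$\le \frac{1}{n}\left( T(n-1) + 2 \sum_{k=\lceil n/2 \rceil}^{n-1} T(k) \right) + O(n)$
$= \frac{2}{n} \sum_{k=\lceil n/2 \rceil}^{n-1} T(k) + O(n)$, since $T(n-1) = O(n^2)$ and so $T(n-1)/n = O(n)$.
Then T(n) = O(n) (proof by substitution).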
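The substitution step can be sketched as follows. The symbols $a$ (a constant bounding the $O(n)$ term by $an$) and $c$ (the constant in the guess $T(k) \le ck$) are names introduced here for illustration; the slides only state the result.

```latex
\begin{align*}
T(n) &\le \frac{2}{n} \sum_{k=\lceil n/2 \rceil}^{n-1} ck + an \\
     &= \frac{2c}{n} \left( \sum_{k=1}^{n-1} k - \sum_{k=1}^{\lceil n/2 \rceil - 1} k \right) + an \\
     &\le \frac{2c}{n} \left( \frac{(n-1)n}{2} - \frac{1}{2}\left(\frac{n}{2} - 1\right)\frac{n}{2} \right) + an \\
     &= c(n-1) - c\left(\frac{n}{4} - \frac{1}{2}\right) + an \\
     &= c\left(\frac{3n}{4} - \frac{1}{2}\right) + an \\
     &\le cn \quad \text{whenever } c \ge 4a.
\end{align*}
```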

Worst Case Linear Time Selection
O(n) worst-case algorithm.
Works in a similar way: recursively partition the input array.
Idea: guarantee a good split.
  E.g., if QuickSort has T(n) = T(9n/10) + T(n/10) + O(n) at each recursion level, then T(n) = O(n log n).
Use deterministic partitioning: compute the element to partition around.
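A brief recursion-tree sketch of why a 9-to-1 split still gives O(n log n). The constant $c$, which bounds the $O(n)$ term by $cn$, is introduced here for illustration only.

```latex
\begin{align*}
\text{cost per level of the recursion tree} &\le cn, \\
\text{depth of the tree} &\le \log_{10/9} n = \Theta(\log n), \\
T(n) &\le cn \log_{10/9} n + O(n) = O(n \log n).
\end{align*}
```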

Steps to find the i-th smallest element: Algorithm Select
1. Divide the elements into ⌊n/5⌋ groups of 5 elements, plus at most one group with the remaining (n mod 5) elements.
2. Find the median of each group:
   Insertion sort: O(1) time per group (at most 5 elements).
   Take the middle element (the larger of the two medians if the group has an even number of elements).
3. Use Select recursively to find the median x of the medians.

Algorithm Select (cont.)
4. Partition the input array around the median-of-medians x. Let k be the number of elements on the low side and n-k the number on the high side:
   a1,a2,…,ak | ak+1,ak+2,…,an, with ai < aj for 1 ≤ i ≤ k, k+1 ≤ j ≤ n.
5. Use Select recursively to:
   find the i-th smallest element on the low side, if i ≤ k;
   find the (i-k)-th smallest element on the high side, if i > k.
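The steps of Select can be sketched in Python as below. This is a simplified illustration rather than the slides' exact procedure: it builds new lists instead of partitioning in place, and it sets x aside during the partition (a three-way split), which also tolerates duplicate values. The function name `select` and the base-case threshold of 5 are assumptions.

```python
def select(a, i):
    """Return the i-th smallest (1-based) element of the list a."""
    if len(a) <= 5:
        # Small input: sort directly, O(1) work.
        return sorted(a)[i - 1]
    # Steps 1-2: split into groups of at most 5 and take each group's median
    # (index len(g) // 2 picks the larger median for even-sized groups).
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Step 3: recursively find the median of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x (x itself is kept out of both sides).
    low = [v for v in a if v < x]
    high = [v for v in a if v > x]
    # Step 5: recurse into the side that contains the i-th smallest element.
    if i <= len(low):
        return select(low, i)
    if i > len(a) - len(high):
        return select(high, i - (len(a) - len(high)))
    return x
```

For example, `select(values, (len(values) + 1) // 2)` returns the (lower) median of `values`.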

Analysis
Find a lower bound on the number of elements greater than x.
At least half of the medians found in step 2 are greater than or equal to x.
So at least half of the groups contribute 3 elements that are greater than x, except:
  the last group (if it has fewer than 5 elements);
  x's own group.
Discarding those two groups, the number of elements greater than x is at least $3\left(\left\lceil \tfrac{1}{2} \lceil n/5 \rceil \right\rceil - 2\right) \ge 3n/10 - 6$.
Similarly, the number of elements smaller than x is at least $3n/10 - 6$.
So, in the worst case, Select is called recursively in step 5 on at most $n - (3n/10 - 6) = 7n/10 + 6$ elements (upper bound).

Analysis (cont.)
Steps 1, 2 and 4: O(n) time.
Step 3: $T(\lceil n/5 \rceil)$.
Step 5: at most $T(7n/10 + 6)$; note that $7n/10 + 6 < n$ for $n > 20$.
$T(n) \le T(\lceil n/5 \rceil) + T(7n/10 + 6) + O(n)$ for $n > n_1$.
Use substitution to solve: assume $T(n) \le cn$ for $n > n_1$; find $n_1$ and $c$.

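Analysis (cont.)
$T(n) \le c \lceil n/5 \rceil + c(7n/10 + 6) + O(n)$
$\le cn/5 + c + 7cn/10 + 6c + O(n)$
$= 9cn/10 + 7c + O(n)$.
Want $T(n) \le cn$: pick $c$ such that $c(n/10 - 7) \ge c_1 n$, where $c_1$ is the constant from the $O(n)$ term above.
With $n_1 = 80$: the condition is $c \ge 10 c_1 n/(n - 70)$, and $n/(n - 70) \le 8$ for $n \ge 80$, so any $c \ge 80 c_1$ works.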

Questions
Why not groups of 7 elements?
Why not groups of 3 elements? T(n) = O(?)