Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms.

Slides:



Advertisements
Similar presentations
Quantum Lower Bounds You probably Havent Seen Before (which doesnt imply that you dont know OF them) Scott Aaronson, UC Berkeley 9/24/2002.
Advertisements

Randomized Algorithms Introduction Rom Aschner & Michal Shemesh.
Comp 122, Spring 2004 Order Statistics. order - 2 Lin / Devi Comp 122 Order Statistic i th order statistic: i th smallest element of a set of n elements.
Models of Computation Prepared by John Reif, Ph.D. Distinguished Professor of Computer Science Duke University Analysis of Algorithms Week 1, Lecture 2.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 6.
Heapsort O(n lg n) worst case Another design paradigm –Use of a data structure (heap) to manage information during execution of algorithm Comparision-based.
Theory of Computing Lecture 3 MAS 714 Hartmut Klauck.
CSE 3101: Introduction to the Design and Analysis of Algorithms
Medians and Order Statistics
Median Finding, Order Statistics & Quick Sort
Deterministic Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms.
Using Divide and Conquer for Sorting
Algorithms Lecture 10 Lecturer: Moni Naor. Linear Programming in Small Dimension Canonical form of linear programming Maximize: c 1 ¢ x 1 + c 2 ¢ x 2.
Advanced Topics in Algorithms and Data Structures Lecture pg 1 Recursion.
Analysis of Algorithms
CSC 2300 Data Structures & Algorithms March 23, 2007 Chapter 7. Sorting.
CS38 Introduction to Algorithms Lecture 7 April 22, 2014.
1 Introduction to Randomized Algorithms Md. Aashikur Rahman Azim.
Quick-Sort     29  9.
Design & Analysis of Algorithms COMP 482 / ELEC 420 John Greiner.
Advanced Topics in Algorithms and Data Structures Lecture 6.1 – pg 1 An overview of lecture 6 A parallel search algorithm A parallel merging algorithm.
Probabilistic (Average-Case) Analysis and Randomized Algorithms Two different but similar analyses –Probabilistic analysis of a deterministic algorithm.
CSC 2300 Data Structures & Algorithms March 27, 2007 Chapter 7. Sorting.
General Computer Science for Engineers CISC 106 James Atlas Computer and Information Sciences 10/23/2009.
2 -1 Analysis of algorithms Best case: easiest Worst case Average case: hardest.
CHAPTER 11 Sorting.
Tutorial 4 The Quicksort Algorithm. QuickSort  Divide: Choose a pivot, P Form a subarray with all elements ≤ P Form a subarray with all elements > P.
Sorting Lower Bound Andreas Klappenecker based on slides by Prof. Welch 1.
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
Sorting Lower Bound1. 2 Comparison-Based Sorting (§ 4.4) Many sorting algorithms are comparison based. They sort by making comparisons between pairs of.
Sorting in Linear Time Lower bound for comparison-based sorting
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Chapter 14 Randomized algorithms Introduction Las Vegas and Monte Carlo algorithms Randomized Quicksort Randomized selection Testing String Equality Pattern.
Chapter 7 Quicksort Ack: This presentation is based on the lecture slides from Hsu, Lih- Hsing, as well as various materials from the web.
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
Computer Science 101 Fast Searching and Sorting. Improving Efficiency We got a better best case by tweaking the selection sort and the bubble sort We.
Order Statistics. Order statistics Given an input of n values and an integer i, we wish to find the i’th largest value. There are i-1 elements smaller.
Search Algorithms Prepared by John Reif, Ph.D. Analysis of Algorithms.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Analysis of Algorithms CS 477/677
Lecture 2 Sorting. Sorting Problem Insertion Sort, Merge Sort e.g.,
Probability Theory Overview and Analysis of Randomized Algorithms Prepared by John Reif, Ph.D. Analysis of Algorithms.
Princeton University COS 423 Theory of Algorithms Spring 2001 Kevin Wayne Average Case Analysis.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 7.
Computer Science 101 Fast Algorithms. What Is Really Fast? n O(log 2 n) O(n) O(n 2 )O(2 n )
Sorting 1. Insertion Sort
Data Structures Haim Kaplan & Uri Zwick December 2013 Sorting 1.
CSC317 1 Quicksort on average run time We’ll prove that average run time with random pivots for any input array is O(n log n) Randomness is in choosing.
Review 1 Insertion Sort Insertion Sort Algorithm Time Complexity Best case Average case Worst case Examples.
Sorting Lower Bounds n Beating Them. Recap Divide and Conquer –Know how to break a problem into smaller problems, such that –Given a solution to the smaller.
Unit-8 Sorting Algorithms Prepared By:-H.M.PATEL.
1 Chapter 8-1: Lower Bound of Comparison Sorts. 2 About this lecture Lower bound of any comparison sorting algorithm – applies to insertion sort, selection.
Lecture 2 Sorting.
Probability Theory Overview and Analysis of Randomized Algorithms
Randomized Algorithms
Sorting atomic items Chapter 5 Distribution based sorting paradigms.
Analysis of Algorithms
Dr. Yingwu Zhu Chapter 9, p Linear Time Selection Dr. Yingwu Zhu Chapter 9, p
Randomized Algorithms
Medians and Order Statistics
Data Structures Sorting Haim Kaplan & Uri Zwick December 2014.
Chapter 4.
CS 3343: Analysis of Algorithms
Order Statistics Def: Let A be an ordered set containing n elements. The i-th order statistic is the i-th smallest element. Minimum: 1st order statistic.
Tomado del libro Cormen
Compact routing schemes with improved stretch
The Selection Problem.
Algorithms CSCI 235, Spring 2019 Lecture 17 Quick Sort II
CS200: Algorithm Analysis
Medians and Order Statistics
Presentation transcript:

Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms

Randomized Algorithms for Selection and Sorting a)Randomized Sampling b)Selection by Randomized Sampling c)Sorting by Randomized Splitting: Quicksort and Multisample Sorts

Readings Main Reading Selections: CLR, Chapters 9

Comparison Problems Input set X of N distinct keys total ordering < over X Problems 1)For each key x X rank(x,X) = |{x’ X| x’ < x}| + 1 2)For each index i {1, …, N) select (i,X) = the key x X where i = rank(x,X) 3)Sort (X) = (x 1, x 2, …, x n ) where x i = select (i,X)

Randomized Comparison Tree Model

Algorithm Samplerank s (x,X) begin Let S be a random sample of X-{x} of size s output 1+ N/s [rank(x,S)-1] end

Algorithm Samplerank s (x,X) (cont’d) Lemma 1 –The expected value of samplerank s (x,X) is rank (x,X)

Algorithm Samplerank s (x,X) (cont’d)

S is Random Sample of X of Size s

More Precise Bounds on Randomized Sampling (cont’d)

More Precise Bounds on Randomized Sampling Let S be a random sampling of X Let r i = rank(select(i,S),X)

More Precise Bounds on Randomized Sampling (cont’d) Proof We can bound r i by a Beta distribution, implying Weak bounds follow from Chebychev inequality The Tighter bounds follow from Chernoff Bounds

Subdivision by Random Sampling Let S be a random sample of X of size s Let k 1, k 2, …, k s be the elements of S in sorted order These elements subdivide X into

Subdivision by Random Sampling (cont’d) How even are these subdivisions?

Subdivision by Random Sampling (cont’d) Lemma 3 If random sample S in X is of size s and X is of size N, then S divides X into subsets each of

Subdivision by Random Sampling (cont’d) Proof The number of (s+1) partitions of X is The number of partitions of X with one block of size  v is

Subdivision by Random Sampling (cont’d) So the probability of a random (s+1) partition having a block size  v is

Randomized Algorithms for Selection “canonical selection algorithm” Algorithm can select (i,X) input set X of N keys index i  {1, …, N} [0] if N=1 then output X [1] select a bracket B of X, so that select (i,X) B with high prob. [2] Let i 1 be the number of keys less than any element of B [3] output can select (i-i 1, B) B found by random sampling

Hoar’s Selection Algorithm Algorithm Hselect (i,X) where 1  i  N begin if X = {x} then output x else choose a random splitter k  X let B = {x  X|x < k} if |B| i then output Hselect(i,B) else output Hselect(i-|B|, X-B) end

Hoar’s Selection Algorithm (cont’d) Sequential time bound T(i,N) has mean

Hoar’s Selection Algorithm (cont’d) Random splitter k  X Hselect (i,X) has two cases

Hoar’s Selection Algorithm (cont’d) Inefficient: each recursive call requires N comparisons, but only reduces average problem size to ½ N

Improved Randomized Selection By Floyd and Rivest Algorithm FRselect(i,X) begin if X = {x} then output x else choose k 1, k 2  X such that k 1 < k 2 let r 1 = rank(k 1, X), r 2 = rank(k 2, X) if r 1 > i then FRselect(i, {x  X|x < k 1 }) else if r 2 > i then FRselect(i-r 1, {x  X|k 1 x k 2 }) else FRselect(i-r 2, {x  X|x > k 2 }) end

Choosing k 1, k 2 We must choose k 1, k 2 so that with high likelihood, k 1  select(i,X)  k 2

Choosing k 1, k 2 (cont’d) Choose random sample S  X size s Define

Choosing k 1, k 2 (cont’d) Lemma 2 implies

Expected Time Bound

Small Cost of Recursions Note With prob  1 - 2N - each recursive call costs only O(s) = o(N) rather than N in previous algorithm

Randomized Sorting Algorithms “canonical sorting algorithm” Algorithm cansort(X) begin if x={x} then output X else choose a random sample S of X of size s Sort S S subdivides X into s+1 subsets X 1, X 2, …, X s+1 output cansort (X 1 ) cansort (X 2 )… cansort(X s+1 ) end

Using Random Sampling to Aid Sorting Problem: –Must subdivide X into subsets of nearly equal size to minimize number of comparisons –Solution: random sampling!

Hoar’s Randomized Sorting Algorithm Uses sample size s=1 Algorithm quicksort(X) begin if |X|=1 then output X else choose a random splitter k  X output quicksort ({x  X|x k}) end

Expected Time Cost of Hoar’s Sort Inefficient: Better to divide problem size by ½ with high likelihood!

Improved Sort using Better choice of splitter is k = sample select s (N/2, N) Algorithm samplesort s (X) begin if |X|=1 then output X choose a random subset S of X size s = N/log N k  select (S/2,S) cost time o(N) output samplesort s ({x  X|x k}) end

Randomized Approximant of Mean By Lemma 2, rank(k,X) is very nearly the mean:

Improved Expected Time Bounds Is optimal for comparison trees!

Open Problems in Selection and Sorting 1)Improve Randomized Algorithms to exactly match lower bounds on number of comparisons 2)Can we de randomize these algorithms – i.e., give deterministic algorithms with the same bounds?

Randomized Algorithms for Selection and Sorting Prepared by John Reif, Ph.D. Analysis of Algorithms