Lecture 10. Paradigm #8: Randomized Algorithms

Back to the "majority problem": finding the majority element in an array A (an element that occupies more than half the positions, if one exists).

FIND-MAJORITY(A, n)
  while (true) do
    randomly choose an index i with 1 ≤ i ≤ n;
    if A[i] is the majority element then return A[i];   /* one linear scan counting occurrences of A[i] */

If there is a majority element, each round finds it with probability > 1/2, so the expected number of rounds is less than 2.
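A minimal executable sketch of this idea (Python, written for this transcript; the function name and the count-based majority check are my rendering of the pseudocode above, not part of the original slides):

```python
import random

def find_majority(A):
    """Repeatedly guess a random index and check, by a linear scan,
    whether the guessed value is a majority element of A.
    Terminates only if a majority element actually exists."""
    n = len(A)
    while True:
        i = random.randrange(n)          # randomly choose 0 <= i < n
        if A.count(A[i]) > n // 2:       # O(n) check: is A[i] the majority?
            return A[i]

# Example: 7 appears 4 times out of 7 elements, so 7 is the majority.
print(find_majority([7, 3, 7, 7, 1, 7, 2]))  # -> 7
```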

Computing the expected number of rounds, E[X] = ∑_{s in S} Pr[s]·s. Each time a new index of the array is guessed, we find the majority element with probability p > 1/2. So the expected number of guesses is x = ∑_{i≥1} (1-p)^{i-1}·p·i. To evaluate this, multiply x by (1-p) and add y = ∑_{i≥1} (1-p)^i·p; the result is ∑_{i≥1} (1-p)^i·p·(i+1), which is just x - p. By the sum of a geometric series, ∑_{i≥0} a·r^i = a/(1-r) for |r| < 1, so y = 1-p. Therefore (1-p)x + (1-p) = x - p, which gives x = 1/p < 2.
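For readers who prefer it spelled out, here is the same calculation as a display (my typesetting of the slide's argument in LaTeX, using the standard closed form for the geometric series):

```latex
\begin{align*}
x &= \sum_{i\ge 1} (1-p)^{i-1}\, p\, i,
  & y &= \sum_{i\ge 1} (1-p)^{i}\, p \;=\; \frac{p(1-p)}{1-(1-p)} \;=\; 1-p,\\
(1-p)x + y &= \sum_{i\ge 1} (1-p)^{i}\, p\,(i+1)
  \;=\; \sum_{j\ge 2} (1-p)^{j-1}\, p\, j \;=\; x - p,\\
\text{so}\quad (1-p)x + (1-p) &= x - p
  \;\Longrightarrow\; px = 1
  \;\Longrightarrow\; x = \tfrac{1}{p} < 2 \quad (\text{since } p > \tfrac12).
\end{align*}
```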

Another example: order statistics. Problem: given a list of numbers of length n and an integer i (with 1 ≤ i ≤ n), determine the i-th smallest member of the list. This problem is also called "computing order statistics". Special cases: computing the minimum corresponds to i = 1, computing the maximum corresponds to i = n, computing the 2nd largest corresponds to i = n-1, and computing the median corresponds to i = ceil(n/2). Obviously you could compute the i-th smallest by sorting (see the sketch below), but that takes O(n log n) time, which is not optimal. We already know how to solve the cases i = 1, n-1, n in linear time. We now give a randomized algorithm that does selection in expected linear time. By expected (or average) linear time, we mean the average over the random choices made by the algorithm, not over inputs: the bound does not depend on the particular input.
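As a point of comparison, the sorting-based baseline is a one-liner (a sketch added for this write-up; selection is 1-indexed to match the slide's convention):

```python
def select_by_sorting(A, i):
    """Return the i-th smallest element (1-indexed) in O(n log n) time."""
    return sorted(A)[i - 1]

print(select_by_sorting([9, 2, 7, 4, 5], 3))  # -> 5, the median of 5 elements
```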

Randomized selection. Remember the PARTITION algorithm for Quicksort from CS 240; RANDOMIZED-PARTITION chooses a random index and partitions around that element.

RANDOMIZED-SELECT(A, p, r, i)   /* find the i-th smallest element of the array A[p..r] */
  if p = r then return A[p];
  q := RANDOMIZED-PARTITION(A, p, r);
  k := q - p + 1;               /* number of elements in A[p..q], i.e., the left side including the pivot */
  if i = k then return A[q];
  else if i < k then return RANDOMIZED-SELECT(A, p, q-1, i);
  else return RANDOMIZED-SELECT(A, q+1, r, i-k);
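A runnable version of the same algorithm (a Python sketch; the in-place Lomuto-style partition and the helper names are mine, under the assumption that RANDOMIZED-PARTITION behaves as in CLRS):

```python
import random

def randomized_partition(A, p, r):
    """Swap a random element of A[p..r] into position r, then partition
    around it (Lomuto scheme). Returns the pivot's final index."""
    j = random.randint(p, r)
    A[j], A[r] = A[r], A[j]
    pivot = A[r]
    s = p - 1
    for t in range(p, r):
        if A[t] <= pivot:
            s += 1
            A[s], A[t] = A[t], A[s]
    A[s + 1], A[r] = A[r], A[s + 1]
    return s + 1

def randomized_select(A, p, r, i):
    """Return the i-th smallest element (1-indexed) of A[p..r]."""
    if p == r:
        return A[p]
    q = randomized_partition(A, p, r)
    k = q - p + 1                      # size of A[p..q]: the left side plus the pivot
    if i == k:
        return A[q]
    elif i < k:
        return randomized_select(A, p, q - 1, i)
    else:
        return randomized_select(A, q + 1, r, i - k)

A = [9, 2, 7, 4, 5, 11, 3]
print(randomized_select(A, 0, len(A) - 1, 4))  # -> 5, the 4th smallest (the median)
```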

Expected running time: O(n). The random pivot is equally likely to land at each rank, and in the worst case the recursion continues on the larger side, so

T(n) ≤ (1/n) · ∑_{1 ≤ k ≤ n} T(max(k-1, n-k)) + O(n).

We prove T(n) ≤ cn by induction (the substitution method). Assume n ≥ 3 and that the bound holds for all n' < n. Each value T(k) with ⌊n/2⌋ ≤ k ≤ n-1 appears at most twice among the terms T(max(k-1, n-k)), so

T(n) ≤ (1/n) · ∑_{1 ≤ k ≤ n} T(max(k-1, n-k)) + O(n)
     ≤ (2/n) · ∑_{⌊n/2⌋ ≤ k ≤ n-1} T(k) + O(n)
     ≤ (2/n) · ∑_{⌊n/2⌋ ≤ k ≤ n-1} ck + an        (for some constant a ≥ 1)
     = (2c/n) · ( ∑_{1 ≤ k ≤ n-1} k − ∑_{1 ≤ k ≤ ⌊n/2⌋-1} k ) + an
     ≤ (2c/n) · ( n(n-1)/2 − (n/2 − 1)(n/2 − 2)/2 ) + an
     = (c/n) · ( 3n²/4 + n/2 − 2 ) + an
     ≤ 3cn/4 + c/2 + an
     < cn,

for c chosen sufficiently large relative to a (the remaining small values of n are covered by the base case).

Example: matrix multiplication verification. Given n×n matrices A, B, C, test whether C = A×B. Trivially, recomputing A×B takes O(n³) time; Strassen's algorithm takes O(n^{log₂ 7}) ≈ O(n^{2.81}); the best known deterministic algorithms take roughly O(n^{2.37}). Consider the following randomized algorithm:

repeat k times:
  generate a random n×1 vector x in {-1, 1}^n;
  if A(Bx) ≠ Cx then return "AB ≠ C";
return "AB = C";

Each round costs only O(n²) time, since it uses three matrix–vector products rather than a matrix–matrix product.
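A sketch of this verifier in Python (my implementation of the slide's pseudocode, essentially Freivalds' verification idea; NumPy is used only for convenience in forming the matrix–vector products):

```python
import numpy as np

def probably_equal(A, B, C, k=20):
    """Return False if A @ B != C is detected; otherwise return True.
    Each of the k rounds costs O(n^2) via three matrix-vector products,
    and a wrong C survives a single round with probability at most 1/2."""
    n = A.shape[0]
    for _ in range(k):
        x = np.random.choice([-1, 1], size=n)    # random vector in {-1, 1}^n
        if not np.array_equal(A @ (B @ x), C @ x):
            return False                          # definitely AB != C
    return True                                   # AB = C, up to error prob <= 2^-k

A1 = np.array([[1, 2], [3, 4]])
B1 = np.array([[5, 6], [7, 8]])
print(probably_equal(A1, B1, A1 @ B1))        # True
print(probably_equal(A1, B1, A1 @ B1 + 1))    # almost surely False
```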

Error bound. Theorem: this algorithm errs with probability ≤ 2^{-k}. Proof: if C' = AB ≠ C, then for some entry (i, j) we have c_{ij} ≠ ∑_{l=1..n} a_{il}·b_{lj} = c'_{ij}. Fix all coordinates of x other than x_j: the i-th entry of (C' − C)x is then a linear function of x_j with nonzero coefficient c'_{ij} − c_{ij}, so at most one of the two choices x_j = 1 or x_j = -1 can make it zero. Hence C'x ≠ Cx with probability ≥ 1/2, i.e., one round detects AB ≠ C with probability at least 1/2. After k independent rounds, the error probability is ≤ 2^{-k}. (If AB = C, the algorithm is always correct.)

Some comments about randomness. You might object: in practice we do not use truly random numbers, but rather "pseudorandom" numbers, and the difference may be crucial. For example, one common way pseudorandom numbers are generated is the linear congruential method: X_{n+1} = (a·X_n + c) mod m. As von Neumann said in 1951: "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."
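A tiny sketch of such a generator (the constants below are illustrative choices, not a recommendation; the quality of the stream depends heavily on how a, c, and m are picked):

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: X_{n+1} = (a*X_n + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(seed=42)
print([next(gen) for _ in range(5)])   # fully deterministic, yet "looks" random
```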

Which sequence is random? Suppose you go to a casino, flip a coin they provided, and get 100 heads in a row. You complain to the casino: "Your coin is biased; H^100 (100 heads in a row) is not a random sequence!" The casino owner demands: "Then give me a random sequence." You confidently flip your own coin 100 times, obtain a sequence S, and tell the casino that you would trust S to be random. But the casino can object: the two sequences have exactly the same probability, 2^{-100}. So why is S any more random than H^100?

How to define randomness. Laplace once observed that a sequence is "extraordinary" (non-random) when it contains regularity. Solomonoff, Kolmogorov, and Chaitin (1960s): a random sequence is one that is not (algorithmically) compressible by a computer / Turing machine. Recall that we have already used such incompressible sequences to analyze the average-case complexity of algorithms. In Assignment 2, Problem 2, you showed that random sequences exist; in fact, most sequences are random.