Study Group Randomized Algorithms Jun 7, 2003 Jun 14, 2003
Randomized Algorithms A randomized algorithm is an algorithm that is allowed to access a source of independent, unbiased random bits and to use these random bits to influence its computation. [Figure: block diagram with the labels Input, Random bits, Algorithm, Output]
Monte Carlo and Las Vegas There are two kinds of randomized algorithms: –Monte Carlo: A Monte Carlo algorithm runs for a fixed number of steps for each input and produces an answer that is correct with a bounded probability –Las Vegas: A Las Vegas algorithm always produces the correct answer, but its runtime for each input is a random variable whose expectation is bounded.
Question Is the max-cut algorithm that we discussed previously a Monte Carlo or Las Vegas algorithm? We will see two other examples today.
Randomized Quick Sort In traditional Quick Sort, we always pick the first element as the pivot for partitioning. The worst-case runtime is $O(n^2)$, while the expected runtime is $O(n \log n)$ when averaged over the set of all inputs. Therefore, some inputs always have a long runtime, e.g., a reverse-sorted list.
Randomized Quick Sort In randomized Quick Sort, we pick an element uniformly at random as the pivot for partitioning. The expected runtime on any input is $O(n \log n)$.
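The random pivot rule can be sketched in a few lines of Python. This is a minimal illustration, not an in-place implementation; the function name `randomized_quicksort` and the `rng` parameter are assumptions made for this example.

```python
import random

def randomized_quicksort(items, rng=random):
    """Return a sorted copy of items, choosing every pivot uniformly at random."""
    if len(items) <= 1:
        return list(items)
    # The only random choice: which element becomes the pivot.
    pivot = items[rng.randrange(len(items))]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)
```

A reverse-sorted input such as `list(range(100, 0, -1))` is handled like any other input, because the pivot no longer depends on the input order.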
Analysis of Randomized QS Let s(i) be the i-th smallest element in the input list S, where n = |S|. $X_{ij}$ is a random variable such that $X_{ij} = 1$ if s(i) is compared with s(j), and $X_{ij} = 0$ otherwise. The expected runtime t of randomized QS is:
$t = E\left[\sum_{i=1}^{n} \sum_{j>i} X_{ij}\right] = \sum_{i=1}^{n} \sum_{j>i} E[X_{ij}]$
$E[X_{ij}]$ is the expected value of $X_{ij}$ over the set of all random choices of the pivots, which is equal to the probability $p_{ij}$ that s(i) will be compared with s(j).
Analysis of Randomized QS We can represent the whole sorting process by a binary tree T: [Figure: binary tree T with nodes labeled 1st pivot through 5th pivot] Notice that s(i) will be compared with s(j), where i < j, if and only if s(i) or s(j) is the first one among the set {s(i), s(i+1), …, s(j)} to be selected as the pivot. Note that $p_{ij} = 2/(j-i+1)$. Why?
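Analysis of Randomized QS Therefore, the expected runtime t:
$t = \sum_{i=1}^{n} \sum_{j>i} \frac{2}{j-i+1} \le 2 \sum_{i=1}^{n} \sum_{k=1}^{n} \frac{1}{k} = 2 n H_n = O(n \log n)$
where $H_n$ is the n-th harmonic number. Note that Randomized QS is a Las Vegas algorithm.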
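As a sanity check on the analysis, the sketch below counts element-pivot comparisons in simulated runs and compares the average with the exact sum of the probabilities 2/(j-i+1) and with the bound 2nH_n. The function names are illustrative assumptions, and the counter charges one comparison per non-pivot element per partitioning step, which is the cost model of the analysis (distinct keys assumed).

```python
import random

def count_comparisons(items, rng):
    """Element-pivot comparisons made by randomized quicksort on distinct keys."""
    if len(items) <= 1:
        return 0
    pivot = items[rng.randrange(len(items))]
    smaller = [x for x in items if x < pivot]
    larger = [x for x in items if x > pivot]
    # Each of the other len(items) - 1 elements is compared with the pivot once.
    return (len(items) - 1
            + count_comparisons(smaller, rng)
            + count_comparisons(larger, rng))

def expected_comparisons(n):
    """Exact value of the sum over all pairs i < j of 2 / (j - i + 1)."""
    # There are n - d pairs (i, j) with gap d = j - i.
    return sum((n - d) * 2.0 / (d + 1) for d in range(1, n))

def harmonic_bound(n):
    """The upper bound 2 * n * H_n."""
    return 2 * n * sum(1.0 / k for k in range(1, n + 1))

if __name__ == "__main__":
    rng = random.Random(0)
    n, trials = 50, 500
    avg = sum(count_comparisons(list(range(n)), rng) for _ in range(trials)) / trials
    print(avg, expected_comparisons(n), harmonic_bound(n))
```

The simulated average should land close to `expected_comparisons(n)`, and both stay below `harmonic_bound(n)`.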
Randomized Min-cut Given an undirected, connected multigraph G(V,E), we want to find a cut $(V_1, V_2)$ such that the number of edges between $V_1$ and $V_2$ is minimum. This problem can be solved optimally by running a max-flow min-cut algorithm $O(n^2)$ times, where n = |V|, once for each pair of source and destination.
Randomized Min-cut In randomized Min-cut, we repeatedly do the following until there are only 2 vertices left: Pick an edge e(u,v) uniformly at random. Merge u and v, and remove all the edges between u and v. [Figure: example in which u and v are merged into a single vertex u,v in a graph that also contains the vertices x, y, z] We then report the cut between the 2 remaining vertices as the min-cut.
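The merging procedure can be sketched as follows. This is a simple, unoptimized illustration that assumes the multigraph is given as a list of (u, v) pairs with parallel edges repeated; the names `contract_once` and `min_cut_estimate` are made up for this example.

```python
import random

def contract_once(edges, rng):
    """One run of random merging on a connected multigraph.

    edges: list of (u, v) pairs; parallel edges appear as repeated pairs.
    Returns the number of edges crossing the final 2-vertex cut.
    """
    remaining = len({x for edge in edges for x in edge})
    live = [(u, v) for u, v in edges if u != v]  # ignore self-loops in the input
    while remaining > 2:
        # Picking a list entry uniformly picks an edge of the multigraph uniformly.
        keep, drop = live[rng.randrange(len(live))]
        # Merge: relabel every endpoint `drop` as `keep`.
        live = [(keep if a == drop else a, keep if b == drop else b) for a, b in live]
        # Remove the edges between the two merged vertices (now self-loops).
        live = [(a, b) for a, b in live if a != b]
        remaining -= 1
    return len(live)

def min_cut_estimate(edges, runs, rng):
    """Smallest cut size seen over several independent runs."""
    return min(contract_once(edges, rng) for _ in range(runs))
```

For example, on two triangles joined by a single edge, `min_cut_estimate(edges, 200, random.Random(1))` is expected to report 1 with overwhelming probability; any single run may report a larger cut.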
Analysis of Randomized Min-cut Let k be the size of the min-cut of the given graph G(V,E), where |V| = n. Every vertex has degree at least k, so $|E| \ge kn/2$. The probability $q_1$ of picking one of those k edges in the first merging step is $\le k/(kn/2) = 2/n$. The probability $p_1$ of not picking any of those k edges in the first merging step is $\ge 1 - 2/n$. Repeat the same argument for each of the n-2 merging steps (a merged graph with m vertices still has min-cut at least k, hence at least km/2 edges). The probability p of not picking any of those k edges in all the merging steps is $\ge (1 - 2/n)(1 - 2/(n-1))(1 - 2/(n-2)) \cdots (1 - 2/3)$.
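Analysis of Randomized Min-cut Therefore, the probability of finding the min-cut:
$p \ge \prod_{m=3}^{n} \left(1 - \frac{2}{m}\right) = \prod_{m=3}^{n} \frac{m-2}{m} = \frac{2}{n(n-1)} \ge \frac{2}{n^2}$
If we repeat the whole procedure $n^2/2$ times, the probability of not finding the min-cut is at most
$\left(1 - \frac{2}{n^2}\right)^{n^2/2} < \frac{1}{e}$
Randomized Min-cut is a Monte Carlo algorithm.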
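The product of the per-step bounds and the effect of repetition can be checked numerically. The sketch below uses exact rational arithmetic; the function names are illustrative assumptions.

```python
from fractions import Fraction
import math

def survival_lower_bound(n):
    """Exact value of (1 - 2/n)(1 - 2/(n-1)) ... (1 - 2/3)."""
    p = Fraction(1)
    for m in range(n, 2, -1):
        p *= 1 - Fraction(2, m)
    return p

def failure_after_repeats(n, repeats):
    """Upper bound on missing the min-cut in every one of `repeats` runs."""
    return (1 - float(survival_lower_bound(n))) ** repeats

if __name__ == "__main__":
    n = 10
    print(survival_lower_bound(n))               # telescopes to 2 / (n (n - 1))
    print(failure_after_repeats(n, n * n // 2))  # stays below 1/e
    print(1 / math.e)
```

The product telescopes, so a single run succeeds with probability at least 2/(n(n-1)), and n^2/2 repetitions push the failure bound below 1/e.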
Question What will happen if we apply a similar approach to find the max-cut instead? Will it be better or worse than the previous method of random assignment?
Complexity Classes There are some interesting complexity classes involving randomized algorithms: –Randomized Polynomial time (RP) –Zero-error Probabilistic Polynomial time (ZPP) –Probabilistic Polynomial time (PP) –Bounded-error Probabilistic Polynomial time (BPP)
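RP Definition: The class RP consists of all languages L that have a randomized algorithm A running in worst-case polynomial time such that for any input x in $\Sigma^*$:
$x \in L \Rightarrow \Pr[A(x) \text{ accepts}] \ge 1/2$
$x \notin L \Rightarrow \Pr[A(x) \text{ accepts}] = 0$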
RP Independent repetitions of the algorithm can be used to reduce the probability of error to an exponentially small value. Notice that the success probability can be changed to an inverse polynomial function of the input size without affecting the definition of RP. Why?
ZPP Definition: The class ZPP is the class of languages which have Las Vegas algorithms running in expected polynomial time. ZPP = RP ∩ co-RP. Why? (Note that a language L is in co-X, where X is a complexity class, if and only if its complement Σ* - L is in X.)
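PP Definition: The class PP consists of all languages L that have a randomized algorithm A running in worst-case polynomial time such that for any input x in $\Sigma^*$:
$x \in L \Rightarrow \Pr[A(x) \text{ accepts}] > 1/2$
$x \notin L \Rightarrow \Pr[A(x) \text{ accepts}] < 1/2$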
PP To reduce the error probability, we can repeat the algorithm several times on the same input and output the answer that occurs in the majority of those trials. However, the definition of PP is quite weak since we have no bound on how far from 1/2 the probabilities are. It may not be possible to use a small number (e.g., a polynomial number) of repetitions to obtain a significantly small error probability.
Question Consider a randomized algorithm with 2-sided error as in the definition of PP. Show that a polynomial number of independent repetitions of this algorithm need not suffice to reduce the error probability to 1/4. (Hint: Consider the case where the error probability is $1/2 - 1/2^n$.)
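BPP Definition: The class BPP consists of all languages L that have a randomized algorithm A running in worst-case polynomial time such that for any input x in $\Sigma^*$:
$x \in L \Rightarrow \Pr[A(x) \text{ accepts}] \ge 3/4$
$x \notin L \Rightarrow \Pr[A(x) \text{ accepts}] \le 1/4$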
BPP For this class of algorithms, the error probability can be reduced to $1/2^n$ with only a polynomial number of iterations. In fact, the probability bounds 3/4 and 1/4 can be changed to $1/2 + 1/p(n)$ and $1/2 - 1/p(n)$, respectively, where p(n) is a polynomial function of the input size n, without affecting the definition of BPP. Why?
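To get a feel for how quickly repetition helps when the error is bounded away from 1/2, the sketch below computes the exact probability that a majority vote over independent runs is wrong. It assumes an odd number of runs and the same error probability on every run; the function name `majority_error` is an assumption of this example.

```python
from math import comb

def majority_error(single_error, trials):
    """Probability that more than half of `trials` independent runs are wrong.

    Assumes `trials` is odd and each run errs independently with
    probability `single_error`.
    """
    need = trials // 2 + 1  # number of wrong runs that makes the majority wrong
    return sum(
        comb(trials, k) * single_error**k * (1 - single_error) ** (trials - k)
        for k in range(need, trials + 1)
    )

if __name__ == "__main__":
    for t in (1, 11, 51, 101):
        print(t, majority_error(0.25, t))
```

With a single-run error of 1/4, the printed values shrink roughly geometrically in the number of runs, which is the behavior the amplification claim relies on.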