1
CS 3343: Analysis of Algorithms
Order Statistics
2
Order statistics
The ith order statistic in a set of n elements is the ith smallest element.
The minimum is thus the 1st order statistic.
The maximum is the nth order statistic.
The median is the n/2 order statistic. If n is even, there are 2 medians.
How can we calculate order statistics? What is the running time?
3
Order statistics – selection problem
Select the ith smallest of n elements.
Naive algorithm: Sort. Worst-case running time Θ(n log n) using merge sort or heapsort (not quicksort).
We will show:
A practical randomized algorithm with Θ(n) expected running time.
A cool algorithm of theoretical interest only, with Θ(n) worst-case running time.
4
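Recall: Quicksort
[Figure: subarray A[p . . q] partitioned around pivot x at index r; elements ≤ x to the left, elements ≥ x to the right; k is the rank of x within the subarray.]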
The function Partition gives us the rank k of the pivot.
If we are lucky, k = i: done!
If not, we at least get a smaller subarray to work with:
k > i: the ith smallest is in the left subarray
k < i: the ith smallest is in the right subarray
Divide and conquer:
If we are lucky, k is close to n/2, or the desired element is in the smaller subarray.
If unlucky, the desired element is in the larger subarray (possible size n-1).
5
Randomized divide-and-conquer algorithm
RAND-SELECT(A, p, q, i)    ⊳ ith smallest of A[p . . q]
    if p = q & i > 1 then error!
    r ← RAND-PARTITION(A, p, q)
    k ← r – p + 1    ⊳ k = rank(A[r])
    if i = k then return A[r]
    if i < k
        then return RAND-SELECT(A, p, r – 1, i)
        else return RAND-SELECT(A, r + 1, q, i – k)
[Figure: A[p . . q] after partitioning; elements ≤ A[r] to the left of index r, elements ≥ A[r] to the right; k is the rank of A[r].]
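The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the lecture's own code: the names `partition`, `rand_partition`, and `rand_select` are chosen here, indices are 0-based while the rank i stays 1-based, and the partition scheme is an assumed first-element-pivot variant that may arrange elements differently from the one drawn on the slides.

```python
import random


def partition(a, p, q):
    """Partition a[p..q] around the pivot a[p]; return the pivot's final index."""
    pivot = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]  # put the pivot between the two parts
    return i


def rand_partition(a, p, q):
    """Move a uniformly random element of a[p..q] to the front, then partition."""
    index = random.randint(p, q)  # both endpoints are inclusive
    a[p], a[index] = a[index], a[p]
    return partition(a, p, q)


def rand_select(a, p, q, i):
    """Return the i-th smallest (1-based) element of a[p..q]; reorders a in place."""
    if not 1 <= i <= q - p + 1:
        raise ValueError("rank out of range")
    if p == q:
        return a[p]
    r = rand_partition(a, p, q)
    k = r - p + 1  # rank of the pivot within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i)
    return rand_select(a, r + 1, q, i - k)
```

For example, `rand_select([7, 10, 5, 8, 11, 3, 2, 13], 0, 7, 6)` returns the 6th smallest element regardless of which pivots are drawn; only the running time depends on the random choices.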
6
Randomized Partition Randomly choose an element as pivot
Every time we need to do a partition, throw a die to decide which element to use as the pivot.
Each element has probability 1/n of being selected.
Rand-Partition(A, p, q){
    d = random();                     // draw a random number in [0, 1)
    index = p + floor((q-p+1) * d);   // p <= index <= q
    swap(A[p], A[index]);
    return Partition(A, p, q);        // now use A[p] as pivot; returns the pivot's final index
}
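The index computation in Rand-Partition can be tried out on its own. The sketch below uses a hypothetical helper name, `random_pivot_index`, and relies on Python's `random.random()` returning a value in [0.0, 1.0), so the computed index should stay within p..q.

```python
import math
import random
from collections import Counter


def random_pivot_index(p, q):
    """Map a random number d in [0, 1) to an index in p..q, as in Rand-Partition."""
    d = random.random()
    return p + math.floor((q - p + 1) * d)


def draw_counts(p, q, trials):
    """Count how often each index is drawn over the given number of trials."""
    return Counter(random_pivot_index(p, q) for _ in range(trials))
```

In everyday Python code, `random.randint(p, q)` or `random.randrange(p, q + 1)` expresses the same draw more directly.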
7
Example
Select the i = 6th smallest of: 7 10 5 8 11 3 2 13 (pivot = 7)
Partition: 3 2 5 7 11 8 10 13, so k = 4.
Select the 6 – 4 = 2nd smallest of the right part recursively.
8
Complete example: select the 6th smallest element.
7 10 5 8 11 3 2 13      i = 6
3 2 5 7 11 8 10 13      k = 4, so i = 6 – 4 = 2 in the right part
10 8 11 13              k = 3, i = 2 < k, so continue in the left part
8 10                    k = 2, i = 2 = k
Answer: 10
Note: here we always used the first element as pivot to do the partition (instead of rand-partition).
9
Intuition for analysis
(All our analyses today assume that all elements are distinct.)
Lucky: T(n) = T(9n/10) + Θ(n) = Θ(n)    (master theorem, CASE 3)
Unlucky: T(n) = T(n – 1) + Θ(n) = Θ(n²)    (arithmetic series)
Worse than sorting!
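The two recurrences can be evaluated numerically to see the gap. The sketch below makes two assumptions for illustration: the Θ(n) term is taken to be exactly n, and T(0) = 0. The function names are made up here.

```python
def lucky_cost(n):
    """Total work for T(n) = T(floor(9n/10)) + n with T(0) = 0."""
    total = 0
    while n > 0:
        total += n
        n = (9 * n) // 10
    return total


def unlucky_cost(n):
    """Total work for T(n) = T(n - 1) + n with T(0) = 0."""
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total
```

The lucky total is a geometric series bounded by 10n, while the unlucky total is the arithmetic series n(n + 1)/2.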
10
Running time of randomized selection
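For an upper bound, assume the ith element always falls in the larger side of the partition:

$$
T(n) \le
\begin{cases}
T(\max(0, n-1)) + n & \text{if } 0 : n-1 \text{ split}, \\
T(\max(1, n-2)) + n & \text{if } 1 : n-2 \text{ split}, \\
\quad\vdots \\
T(\max(n-1, 0)) + n & \text{if } n-1 : 0 \text{ split}.
\end{cases}
$$

The expected running time is an average of all cases, each split occurring with probability 1/n.
Expectation:

$$
T(n) \le \frac{1}{n} \sum_{k=0}^{n-1} T(\max(k, n-k-1)) + n \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} T(k) + n
$$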
11
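Substitution method
Want to show T(n) = O(n), so we need to prove T(n) ≤ cn for n > n0.
Recurrence (average over all splits, with the ith element in the larger side):

$$
T(n) \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} T(k) + n
$$

Assume: T(k) ≤ ck for all k < n. Since the sum of k from ⌊n/2⌋ to n – 1 is at most 3n²/8,

$$
T(n) \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + n \le \frac{2c}{n} \cdot \frac{3n^2}{8} + n = cn - \left(\frac{cn}{4} - n\right) \le cn \quad \text{if } c \ge 4
$$

Therefore, T(n) = O(n).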
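The bound T(n) ≤ 4n can be checked numerically. The sketch below assumes the averaged recurrence T(n) = (2/n) · (T(⌊n/2⌋) + ... + T(n – 1)) + n taken with equality, and T(0) = 0 as the base case; the function name is made up for illustration.

```python
def expected_cost_table(n_max):
    """Return t where t[n] = (2/n) * sum(t[k] for k in floor(n/2)..n-1) + n, t[0] = 0."""
    t = [0.0] * (n_max + 1)
    prefix = [0.0] * (n_max + 1)  # prefix[m] = t[0] + ... + t[m-1]
    for n in range(1, n_max + 1):
        prefix[n] = prefix[n - 1] + t[n - 1]
        window = prefix[n] - prefix[n // 2]  # t[floor(n/2)] + ... + t[n-1]
        t[n] = 2.0 * window / n + n
    return t
```

Comparing `t[n]` with `4 * n` for a range of n is one way to see that c = 4 suffices under these assumptions.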
12
Summary of randomized selection
Works fast: linear expected time.
Excellent algorithm in practice.
But the worst case is very bad: Θ(n²).
Q. Is there an algorithm that runs in linear time in the worst case?
A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
IDEA: Generate a good pivot recursively.
13
Worst-case linear-time selection
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k
       then recursively SELECT the ith smallest element in the lower part
       else recursively SELECT the (i–k)th smallest element in the upper part
   (Step 4 is the same as RAND-SELECT.)
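The steps of SELECT can be sketched in Python as below. This is an illustrative version that departs from the slide in a few labeled ways: it builds new lists instead of partitioning in place, it also takes a median from a final group of fewer than 5 elements, it sorts arrays of at most 5 elements directly as the base case, and it tolerates duplicate values by counting elements equal to the pivot.

```python
def select(items, i):
    """Return the i-th smallest (1-based) element of items."""
    if not 1 <= i <= len(items):
        raise ValueError("rank out of range")
    if len(items) <= 5:
        return sorted(items)[i - 1]  # base case: solve small inputs by rote

    # Step 1: groups of 5 (the last group may be shorter) and their medians.
    groups = [items[j:j + 5] for j in range(0, len(items), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 2: recursively find the median of the group medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x; k is the smallest rank held by x.
    lower = [v for v in items if v < x]
    upper = [v for v in items if v > x]
    n_equal = len(items) - len(lower) - len(upper)
    k = len(lower) + 1

    # Step 4: return x or recurse into the part that contains rank i.
    if i < k:
        return select(lower, i)
    if i < k + n_equal:
        return x
    return select(upper, i - (k - 1) - n_equal)
```

With distinct elements `n_equal` is 1, so the last line recurses for the (i – k)th smallest element of the upper part, as in Step 4.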
14
Choosing the pivot
15
Choosing the pivot Divide the n elements into groups of 5.
16
Choosing the pivot
Divide the n elements into groups of 5.
Find the median of each 5-element group by rote.
[Figure: groups of 5, each ordered from lesser to greater.]
17
Choosing the pivot
Divide the n elements into groups of 5.
Find the median of each 5-element group by rote.
Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
[Figure: groups of 5, each ordered from lesser to greater, with x the median of the group medians.]
18
Analysis
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
19
Analysis
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3⌊n/10⌋ elements are ≤ x. (Assume all elements are distinct.)
20
Analysis
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3⌊n/10⌋ elements are ≤ x.
Similarly, at least 3⌊n/10⌋ elements are ≥ x.
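The counting argument can be checked empirically on concrete inputs. The sketch below uses hypothetical helper names, forms only the ⌊n/5⌋ full groups, takes the lower median of the group medians as x, and finds medians by sorting, since it is meant to test the counts and not the running time.

```python
def median_of_medians_pivot(items):
    """Return x, the (lower) median of the medians of the full groups of 5."""
    full_groups = len(items) // 5
    medians = sorted(
        sorted(items[5 * g:5 * g + 5])[2] for g in range(full_groups)
    )
    return medians[(full_groups - 1) // 2]


def counts_hold(items):
    """Check that at least 3*floor(n/10) elements are <= x and as many are >= x."""
    x = median_of_medians_pivot(items)
    bound = 3 * (len(items) // 10)
    at_most_x = sum(1 for v in items if v <= x)
    at_least_x = sum(1 for v in items if v >= x)
    return at_most_x >= bound and at_least_x >= bound
```

Both helpers need at least 5 elements, so that at least one full group exists.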
21
Analysis
Need “at most” for worst-case runtime.
At least 3⌊n/10⌋ elements are ≤ x ⇒ at most n – 3⌊n/10⌋ elements are > x.
At least 3⌊n/10⌋ elements are ≥ x ⇒ at most n – 3⌊n/10⌋ elements are < x.
The recursive call to SELECT in Step 4 is executed on at most n – 3⌊n/10⌋ elements.
[Figure: possible positions for the pivot, at least 3⌊n/10⌋ elements from either end.]
22
Analysis
Use the fact that ⌊a/b⌋ > a/b – 1:
n – 3⌊n/10⌋ < n – 3(n/10 – 1) = 7n/10 + 3 ≤ 3n/4 if n ≥ 60.
The recursive call to SELECT in Step 4 is executed on at most 7n/10 + 3 elements.
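Both inequalities can be confirmed with integer arithmetic, avoiding floating-point rounding. The function names below are made up for illustration; each inequality is multiplied through by a constant so that only integers are compared.

```python
def step4_size_bound_holds(n):
    """Check n - 3*floor(n/10) < 7n/10 + 3, multiplied through by 10."""
    size = n - 3 * (n // 10)
    return 10 * size < 7 * n + 30


def three_quarters_bound_holds(n):
    """Check 7n/10 + 3 <= 3n/4, multiplied through by 20."""
    return 14 * n + 60 <= 15 * n
```

The second check first succeeds at n = 60, which is where the threshold in the analysis comes from.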
23
Developing the recurrence
SELECT(i, n), with the cost of each step (T(n) is the total):
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.    Θ(n)
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.    T(n/5)
3. Partition around the pivot x. Let k = rank(x).    Θ(n)
4. if i = k then return x
   elseif i < k
       then recursively SELECT the ith smallest element in the lower part
       else recursively SELECT the (i–k)th smallest element in the upper part    T(7n/10 + 3)
24
Solving the recurrence
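Recurrence: T(n) ≤ T(n/5) + T(7n/10 + 3) + n
Assumption: T(k) ≤ ck for all k < n

$$
T(n) \le \frac{cn}{5} + c\left(\frac{7n}{10} + 3\right) + n \le \frac{cn}{5} + \frac{3cn}{4} + n \quad \text{if } n \ge 60
$$

$$
= \frac{19cn}{20} + n = cn - \left(\frac{cn}{20} - n\right) \le cn \quad \text{if } c \ge 20 \text{ and } n \ge 60
$$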
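The bound T(n) ≤ 20n can also be checked by evaluating the recurrence directly. The sketch below assumes the recurrence T(n) = T(⌊n/5⌋) + T(⌊7n/10⌋ + 3) + n for n ≥ 60 and, as an assumed base case, T(n) = n for n < 60; the function name is made up for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def worst_case_cost(n):
    """Evaluate T(n) = T(n // 5) + T(7 * n // 10 + 3) + n, with T(n) = n for n < 60."""
    if n < 60:
        return n  # assumed base case: small inputs cost linear time
    return worst_case_cost(n // 5) + worst_case_cost(7 * n // 10 + 3) + n
```

The memoization from `lru_cache` keeps repeated subproblem sizes from being recomputed when many values of n are checked in a row.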
25
Conclusions
Since the work at each level of recursion is a constant fraction (19/20) smaller, the work per level is a geometric series dominated by the linear work at the root.
In practice, this algorithm runs slowly, because the constant in front of n is large.
The randomized algorithm is far more practical.
Exercise: Try to divide into groups of 3 or 7.
Exercise: Think about an application in sorting.