4/17/2003CSE 202 - More on Sorting CSE 202 - Algorithms Sorting-related topics 1.Lower bound on comparison sorting 2.Beating the lower bound 3.Finding.

Slides:



Advertisements
Similar presentations
Lower bound: Decision tree and adversary argument
Advertisements

Theory of Computing Lecture 3 MAS 714 Hartmut Klauck.
Analysis of Algorithms CS 477/677 Linear Sorting Instructor: George Bebis ( Chapter 8 )
CSE 3101: Introduction to the Design and Analysis of Algorithms
Sorting Comparison-based algorithm review –You should know most of the algorithms –We will concentrate on their analyses –Special emphasis: Heapsort Lower.
MS 101: Algorithms Instructor Neelima Gupta
1 Sorting in Linear Time How can we do better?  CountingSort  RadixSort  BucketSort.
1 Selection --Medians and Order Statistics (Chap. 9) The ith order statistic of n elements S={a 1, a 2,…, a n } : ith smallest elements Also called selection.
Introduction to Algorithms Jiafen Liu Sept
1 SORTING Dan Barrish-Flood. 2 heapsort made file “3-Sorting-Intro-Heapsort.ppt”
Sorting Heapsort Quick review of basic sorting methods Lower bounds for comparison-based methods Non-comparison based sorting.
Comp 122, Spring 2004 Lower Bounds & Sorting in Linear Time.
CSC 2300 Data Structures & Algorithms March 27, 2007 Chapter 7. Sorting.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
Divide and Conquer Sorting
CSE 326: Data Structures Sorting Ben Lerner Summer 2007.
Analysis of Algorithms CS 477/677
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
DAST 2005 Week 4 – Some Helpful Material Randomized Quick Sort & Lower bound & General remarks…
Data Structures, Spring 2004 © L. Joskowicz 1 Data Structures – LECTURE 5 Linear-time sorting Can we do better than comparison sorting? Linear-time sorting.
David Luebke 1 7/2/2015 Linear-Time Sorting Algorithms.
10/15/2002CSE More on Sorting CSE Algorithms Sorting-related topics 1.Lower bound on comparison sorting 2.Beating the lower bound 3.Finding.
Lower Bounds for Comparison-Based Sorting Algorithms (Ch. 8)
David Luebke 1 8/17/2015 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8
Sorting in Linear Time Lower bound for comparison-based sorting
CSE 373 Data Structures Lecture 15
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
David Luebke 1 10/13/2015 CS 332: Algorithms Linear-Time Sorting Algorithms.
CSC 41/513: Intro to Algorithms Linear-Time Sorting Algorithms.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
CS 61B Data Structures and Programming Methodology July 28, 2008 David Sun.
Analysis of Algorithms CS 477/677
Order Statistics ● The ith order statistic in a set of n elements is the ith smallest element ● The minimum is thus the 1st order statistic ● The maximum.
Mudasser Naseer 1 11/5/2015 CSC 201: Design and Analysis of Algorithms Lecture # 8 Some Examples of Recursion Linear-Time Sorting Algorithms.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 7.
COSC 3101A - Design and Analysis of Algorithms 6 Lower Bounds for Sorting Counting / Radix / Bucket Sort Many of these slides are taken from Monica Nicolescu,
CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan
CS6045: Advanced Algorithms Sorting Algorithms. Sorting So Far Insertion sort: –Easy to code –Fast on small inputs (less than ~50 elements) –Fast on nearly-sorted.
David Luebke 1 6/26/2016 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
David Luebke 1 7/2/2016 CS 332: Algorithms Linear-Time Sorting: Review + Bucket Sort Medians and Order Statistics.
Lower Bounds & Sorting in Linear Time
Sorting.
Order Statistics.
Order Statistics Comp 122, Spring 2004.
Linear-Time Sorting Continued Medians and Order Statistics
Randomized Algorithms
MCA 301: Design and Analysis of Algorithms
Introduction to Algorithms
Hash Functions/Review
Linear Sorting Sections 10.4
Randomized Algorithms
CS 3343: Analysis of Algorithms
Order Statistics Comp 550, Spring 2015.
Lower Bounds & Sorting in Linear Time
Linear Sorting Section 10.4
Linear-Time Sorting Algorithms
CSE 326: Data Structures Sorting
CSE 332: Data Abstractions Sorting I
CS 3343: Analysis of Algorithms
Order Statistics Def: Let A be an ordered set containing n elements. The i-th order statistic is the i-th smallest element. Minimum: 1st order statistic.
Order Statistics Comp 122, Spring 2004.
Chapter 9: Selection of Order Statistics
Chapter 8: Overview Comparison sorts: algorithms that sort sequences by comparing the value of elements Prove that the number of comparison required to.
CS 583 Analysis of Algorithms
The Selection Problem.
Algorithms CSCI 235, Spring 2019 Lecture 19 Order Statistics
CS200: Algorithm Analysis
Presentation transcript:

4/17/2003CSE More on Sorting CSE Algorithms Sorting-related topics 1.Lower bound on comparison sorting 2.Beating the lower bound 3.Finding medians and order statistics (chapters 8 & 9)

CSE More on Sorting2 The game of “20 questions” Suppose I choose one of k objects. –We both know the set of objects, e.g. {1,2,...,k}. You ask me yes-no questions. –I answer truthfully. How many questions do you need to ask (worst case)? A binary decision tree for {1,2,3,4,5}... odd? 2? 3? 5? y n

CSE More on Sorting3 How many comparisons for sorting? Comparison sorts asks only yes-no questions. –“Is x(i) > x(j)” A sorting algorithm must get a different sequence of answers on each distinct input. For n elements, there are n! possible inputs. Thus, we need at least lg (n!) comparisons.

CSE More on Sorting4 Estimating lg(n!) Direct computation: –For n>1, n! < n n, so lg(n!) < n lg n. so lg (n!) is O(n lg n). –For n>1, n! > (n/2) n/2. Obvious for n even. Hand waving for n odd. Thus, lg(n!) > (n/2) lg (n/2) = ½ n (lg n – 1). For n>4, (lg n – 1) > lg n - (lg n)/2 = lg n /2. Thus, lg(n!) > ¼ n lg n, proving lg(n!) is  (n lg n). Or use Stirling’s formula: n!  (2  n) ½ (n/e) n. Yadda, yadda, yadda... (Gives a tighter bound). This is third time I’ve used this trick!

CSE More on Sorting5 Best known comparison sort n  lg n!  Merge sort Best possible ? Source: Sloan’s “Encyclopedia of Integer Sequences” (try Google on “sloane sequence”)

CSE More on Sorting6 Radix Sort (not a comparison sort) Given a list of n k-digit numbers, For i = 1 to k { partition data into bins according to the i-th digit; reassemble bins into one list; } At each iteration, keep the data in each bin in the same order as it was in the list. Result: you’ll sort the entire list. Practical considerations: How do you manage storage? How do you “reassemble”? Important! “First digit” means the low-order one.

CSE More on Sorting7 Analysis of Radix Sort Assuming “digit” means “base 10 digit”... –What is the complexity? –Have we accomplished anything? What if one used some other base?? Is this a linear time algorithm??? –One “random access” step (with b possible choices) may be worth lg b Yes-No questions. –If you can arrange things right.

CSE More on Sorting8 Bucket Sort Given N data items, uniformly distributed in [0,1]. –A “reason 2” scenario. Initialize N Buckets to “empty”; For I = 1 to N Put A[I] into Bucket  N A[I]  ; For I = 1 to N Sort Bucket I; /* N^2 method is OK */ Concatenate Buckets; Analysis Let Xij = 1 if A[i] and A[j] end up in same bucket, 0 otherwise. Xij is a random variable. (What is the sample space??) Let T(N) =   Xij. T(N) is upper bound on comparisons needed. E(Xij) = 1/N, so E(T(N)) =   1/N = N. (Other steps are  (N).) i=1j=1 NN i=1 j=1 NN why ?? Fine print: This is all messed up for i=j. Can you fix it?

CSE More on Sorting9 Sorting small numbers Suppose you have N numbers in the range of 0 to K-1. –N may be larger than K (repeats are possible) What does Radix Sort look like? What does Bucket Sort look like?

CSE More on Sorting10 Sorting small numbers Suppose you have N numbers in the range of 0 to K-1. –N may be larger than K (repeats are possible) What does Radix Sort look like? What does Bucket Sort look like? Both are closely related to Counting Sort for i = 1 to N, Count[A[i]]++; for j = 2 to K, Count[j] += Count[j-1]; for i = 1 to N, B[Count[A[i]]--] = A[i];

CSE More on Sorting11 Summary Radix sort and bucket sort are linear time under certain assumptions –Radix sort – numbers aren’t too long. For instance, n numbers in {1, 2,..., n 2 } –Bucket sort – average case; must know distribution. Sorting n n-bit long numbers in linear time is an open problem. There’s a O(n lg lg n lg lg lg n) technique know. “Linear for all reasonable values of n”, but unlikely to be used in practice. consider n = 2 100

CSE More on Sorting12 Stability A sorting algorithm is stable if elements with equal keys stay in the same order. If Sort({5, 8, 3, 10, 8}) returned {10, 8, 8, 5, 3}, it wouldn’t be stable (since 8 and 8 got swapped). Stable sorting is useful; e.g. you might want to first sort by one key field, then by another. Is Heapsort stable? Is Quicksort stable?

CSE More on Sorting13 Order statistics Select(A,k) – returns k th smallest from n-element set A. Median(A) = Select (A,  n/2  ). Consider only comparison-based methods. Select(A,1) – needs exactly n-1 comparisons. Tree-based tournament or single pass needs only n-1. Can’t do better - every element except minimum must “lose”. Select(A,2) – can be done with n +  lg n  comparisons. Double elimination tournament. Select(A,k) – can be done with n + k 2 lg n

CSE More on Sorting14 What about linear-time Select? (from now on, assume no duplicates in A) Given x, in n-1 comparisons, you can find its rank and partition A into A lo (items smaller than x) and A hi. If rank of x is i, and A = A lo  {x}  A hi, then –if j<i, Select(A, j) = Select(A lo, j)... or... –if j>i Select(A, j) = Select(A hi, j-i). This suggests using divide and conquer –Find some x “near” the median “quickly”. –Partition A into A lo  {x}  A hi using n-1 comparisons. –Reduce problem to “about” half the size. –“Almost” gives recurrence T(n) < T(n/2) + c n. which implies T(n) is O(n).

CSE More on Sorting15 Does this really work?? 1.Let B = half of A; free 2.Let x = Median(B); T(n/2) 3.Find i=rank(x), A = A lo  {x}  A hi ; < n 4.If (k<i) Select (A lo, k); T(3n/4) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < T(n/2) + T(3n/4) + cn Hmmm... need to try something different

CSE More on Sorting16 Does this really work (attempt #2) 1.Let B1, B2, B3 be thirds of A; free 2.Let x j = Median(Bj); x= Median({x j }); 3T(n/3)+3 3.Find i=rank(x), A = A lo  {x}  A hi ; < n 4.If (k<i) Select (A lo, k); T( ?? ) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < 3T(n/3) + T( ?? ) + cn Not particularly better... need to try something different

CSE More on Sorting17 Does this really work (attempt #3) 1.Let B1, B2,..., Bn/3 each have size 3; free 2.Let x j = Median(Bj); n/3 x 3 = n 3.x = Median({xi}); T(n/3) 4.i = rank(x), A = A lo  {x}  A hi ; < n 5.If (k<i) Select (A lo, k); T( ?? ) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < T(n/3) + T( ?? ) + cn Are we getting anywhere?? Don’t give up !! One more idea and it can be done.

CSE More on Sorting18 Does this really work (attempt #4) 1.Let B1, B2,..., B(n/5) each have size 5; free 2.Let xi = Median(Bi); n/5 x 7 < 2n 3.x= Median({xi}); T(n/5) 4.i = rank(x), A = A lo  {x}  A hi ; < n 5.If (k<i) Select (A lo, k); T( 7n/10) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < T(n/5) + T(7n/10 ) + cn Yes!! Best known results: can find median in 3n comparisons, lower bound is 2n.

CSE More on Sorting19 Proof that recursion for median algorithm is O(n) Given T(n) = T(  n/5  ) + T(  7n/10  ) + f(n), T(0)=0, and f(n) is O(n). We know  n 0, c 0 s.t.  n>n 0, f(n)  c 0 n. (Call this equation [1].) Let c = max ( 10c 0, max {T(n)/n} ). So c 0  c/10 [2] and  n  n 0, cn  T(n). [3] Claim:  n>0, T(n)  c n. Proof by induction on n. Let P(n) be the statement, “T(n)  c n”. Bases cases (P(n) for n = 1,..., n 0 ) : These all follow from [3]. Inductive step: Given n>n 0, assume  k<n P(k) (i.e. assume  k<n T(k)  c k). In particular, since  n/5  < n, T(  n/5  )  c  n/5 , which is  cn/5. [4] Similarly, T(  7n/10  )  c  7n/10   7cn/10, [5] Then T(n) = T(  n/5  ) + T(  7n/10  ) + f(n) (definition of T(n).)  cn/5 + 7cn/10 + c 0 n (from [4], [5], and [1],)  cn/5 + 7cn/10 + cn/10 (from [2].)  cn(1/5 + 7/10 + 1/10) = cn. Q.E.D. 0<n  n 0

CSE More on Sorting20 What happens if we change floors to ceilings?? Given T(n) = T(  n/5  ) + T(  7n/10  ) + f(n), T(0)=0, and f(n) is O(n). We could argue that for n>100,  n/5  <.21n and  7n/10  <.71n. We’d also can change definition of c to ensure c 0 .08c. To do so, we’d say, “Let c = max ( c 0 /.08, max {T(n)/n} ).” Then, when we get to... “Then T(n) = T(  n/5  ) + T(  7n/10  ) + f(n)” we’ll be able to argue that “T(n) .21cn +.71cn +.08cn = cn.” and be done. 0<n  n 0