10/15/2002CSE 202 - More on Sorting CSE 202 - Algorithms Sorting-related topics 1.Lower bound on comparison sorting 2.Beating the lower bound 3.Finding.

Slides:



Advertisements
Similar presentations
Analysis of Algorithms
Advertisements

Lower bound: Decision tree and adversary argument
Theory of Computing Lecture 3 MAS 714 Hartmut Klauck.
1 More Sorting; Searching Dan Barrish-Flood. 2 Bucket Sort Put keys into n buckets, then sort each bucket, then concatenate. If keys are uniformly distributed.
Analysis of Algorithms CS 477/677 Linear Sorting Instructor: George Bebis ( Chapter 8 )
CSE 3101: Introduction to the Design and Analysis of Algorithms
Sorting Comparison-based algorithm review –You should know most of the algorithms –We will concentrate on their analyses –Special emphasis: Heapsort Lower.
MS 101: Algorithms Instructor Neelima Gupta
1 Sorting in Linear Time How can we do better?  CountingSort  RadixSort  BucketSort.
CS 3343: Analysis of Algorithms Lecture 14: Order Statistics.
1 Selection --Medians and Order Statistics (Chap. 9) The ith order statistic of n elements S={a 1, a 2,…, a n } : ith smallest elements Also called selection.
Introduction to Algorithms
Introduction to Algorithms Jiafen Liu Sept
Lectures on Recursive Algorithms1 COMP 523: Advanced Algorithmic Techniques Lecturer: Dariusz Kowalski.
Using Divide and Conquer for Sorting
Algorithms Recurrences. Definition – a recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs Example.
Spring 2015 Lecture 5: QuickSort & Selection
Sorting Heapsort Quick review of basic sorting methods Lower bounds for comparison-based methods Non-comparison based sorting.
Comp 122, Spring 2004 Lower Bounds & Sorting in Linear Time.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
Algorithm Design Techniques: Induction Chapter 5 (Except Section 5.6)
Median, order statistics. Problem Find the i-th smallest of n elements.  i=1: minimum  i=n: maximum  i= or i= : median Sol: sort and index the i-th.
Divide and Conquer Sorting
Analysis of Algorithms CS 477/677
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
CS Main Questions Given that the computer is the Great Symbol Manipulator, there are three main questions in the field of computer science: What kinds.
Tirgul 4 Order Statistics Heaps minimum/maximum Selection Overview
DAST 2005 Week 4 – Some Helpful Material Randomized Quick Sort & Lower bound & General remarks…
Data Structures, Spring 2004 © L. Joskowicz 1 Data Structures – LECTURE 5 Linear-time sorting Can we do better than comparison sorting? Linear-time sorting.
David Luebke 1 7/2/2015 Linear-Time Sorting Algorithms.
David Luebke 1 8/17/2015 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8
Sorting in Linear Time Lower bound for comparison-based sorting
CSE 373 Data Structures Lecture 15
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
David Luebke 1 10/13/2015 CS 332: Algorithms Linear-Time Sorting Algorithms.
CSC 41/513: Intro to Algorithms Linear-Time Sorting Algorithms.
Introduction to Algorithms Jiafen Liu Sept
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Analysis of Algorithms CS 477/677
Order Statistics ● The ith order statistic in a set of n elements is the ith smallest element ● The minimum is thus the 1st order statistic ● The maximum.
Mudasser Naseer 1 11/5/2015 CSC 201: Design and Analysis of Algorithms Lecture # 8 Some Examples of Recursion Linear-Time Sorting Algorithms.
CS 361 – Chapters 8-9 Sorting algorithms –Selection, insertion, bubble, “swap” –Merge, quick, stooge –Counting, bucket, radix How to select the n-th largest/smallest.
Instructor Neelima Gupta Table of Contents Review of Lower Bounding Techniques Decision Trees Linear Sorting Selection Problems.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 7.
1 Medians and Order Statistics CLRS Chapter 9. upper median lower median The lower median is the -th order statistic The upper median.
COSC 3101A - Design and Analysis of Algorithms 6 Lower Bounds for Sorting Counting / Radix / Bucket Sort Many of these slides are taken from Monica Nicolescu,
Young CS 331 D&A of Algo. Topic: Divide and Conquer1 Divide-and-Conquer General idea: Divide a problem into subprograms of the same kind; solve subprograms.
CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan
Sorting & Lower Bounds Jeff Edmonds York University COSC 3101 Lecture 5.
CS6045: Advanced Algorithms Sorting Algorithms. Sorting So Far Insertion sort: –Easy to code –Fast on small inputs (less than ~50 elements) –Fast on nearly-sorted.
David Luebke 1 6/26/2016 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
David Luebke 1 7/2/2016 CS 332: Algorithms Linear-Time Sorting: Review + Bucket Sort Medians and Order Statistics.
Chapter 9: Selection of Order Statistics What are an order statistic? min, max median, i th smallest, etc. Selection means finding a particular order statistic.
4/17/2003CSE More on Sorting CSE Algorithms Sorting-related topics 1.Lower bound on comparison sorting 2.Beating the lower bound 3.Finding.
Lower Bounds & Sorting in Linear Time
Linear-Time Sorting Continued Medians and Order Statistics
Randomized Algorithms
Introduction to Algorithms
Dr. Yingwu Zhu Chapter 9, p Linear Time Selection Dr. Yingwu Zhu Chapter 9, p
Randomized Algorithms
Topic: Divide and Conquer
Lower Bounds & Sorting in Linear Time
Linear-Time Sorting Algorithms
CSE 326: Data Structures Sorting
The Selection Problem.
CSE 332: Sorting II Spring 2016.
CS200: Algorithm Analysis
Presentation transcript:

10/15/2002CSE More on Sorting CSE Algorithms Sorting-related topics 1.Lower bound on comparison sorting 2.Beating the lower bound 3.Finding medians and order statistics (chapters 8 & 9)

CSE More on Sorting2 The game of “20 questions” Suppose I choose one of k objects. –We both know the set of objects, e.g. {1,2,...,k}. You ask me yes-no questions. –I answer truthfully. How many questions do you need to ask (worst case)? A binary decision tree for {1,2,3,4,5}... odd? 2? 3? 5? y n

CSE More on Sorting3 How many comparisons for sorting? Comparison sorts asks only yes-no questions. –“Is x(i) > x(j)” A sorting algorithm must get a different sequence of answers on each distinct input. For n elements, there are n! possible inputs. Thus, we need at least lg (n!) comparisons.

CSE More on Sorting4 Estimating lg(n!) Direct computation: –For n>1, n! < n n, so lg(n!) < n lg n. so lg (n!) is O(n lg n). –For n>1, n! > (n/2) n/2. Obvious for n even. Hand waving for n odd. Thus, lg(n!) > (n/2) lg (n/2) = ½ n (lg n – 1). For n>4, (lg n – 1) > lg n - (lg n /2) = lg n /2. Thus, lg(n!) > ¼ n lg n, proving lg(n!) is  (n lg n). Using Stirling’s formula: n!  (2  n) ½ (n/e) n. Yadda, yadda, yadda... (Gives a tighter bound).

CSE More on Sorting5 Best known comparison sort n  lg n!  Merge sort Best known ? Source: Sloan’s “Encyclopedia of Integer Sequences” (try Google on “sloane sequence”)

CSE More on Sorting6 Radix Sort (not a comparison sort) Given a list of n k-digit numbers, For i = 1 to k { partition data into bins according to the i-th digit; reassemble bins into one list; } At each iteration, keep the data in each bin in the same order as it was in the list. Result: you’ll sort the entire list. Practical considerations: How do you manage storage? How do you “reassemble”? Important! “First digit” means the low-order one.

CSE More on Sorting7 Analysis of Radix Sort Assuming “digit” means “base 10 digit”... –What is the complexity? –Have we accomplished anything? What if one used some other base?? Is this a linear time algorithm??? –One “random access” step (with b possible choices) may be worth lg b Yes-No questions. –If you can arrange things right.

CSE More on Sorting8 Bucket Sort Given N data items, uniformly distributed in [0,1]. –A “reason 2” scenario. Initialize N Buckets to “empty”; For I = 1 to N Put A[I] into Bucket  N A[I]  ; For I = 1 to N Sort Bucket I; /* N^2 method is OK */ Concatenate Buckets; Analysis Let Xij = 1 if A[i] and A[j] end up in same bucket, 0 otherwise. Xij is a random variable. (What is the sample space??) Let T(N) =   Xij. T(N) is upper bound on comparisons needed. E(Xij) = 1/N, so E(T(N)) =   1/N = N. (Other steps are  (N).) i=1j=1 NN i=1 j=1 NN why ??

CSE More on Sorting9 Summary Radix sort and bucket sort are linear time under certain assumptions –Radix sort – numbers aren’t too long. For instance, n numbers in {1, 2,..., n 2 } –Bucket sort – expected time, must know distribution. Sorting n n-bit long numbers in linear time is an open problem. There’s a O(n lg lg n lg lg lg n) technique know. “Linear for all reasonable values of n”, but unlikely to be used in practice. consider n = 2 100

CSE More on Sorting10 Order statistics Select(A,k) – returns k th smallest from n-element set A. Median(A) = Select (A,  n/2  ). Consider only comparison-based methods. Select(A,1) – needs exactly n-1 comparisons. Tree-based tournament or single pass needs only n-1. Can’t do better - every element except minimum must “lose”. Select(A,2) – can be done with n +  lg n  comparisons. Double elimination tournament. Select(A,k) – can be done with n + k 2 lg n

CSE More on Sorting11 What about linear-time Select? (from now on, assume no duplicates in A) Given x, in n-1 comparisons, you can find its rank and partition A into A lo (items smaller than x) and A hi. If rank of x is i, and A = A lo  {x}  A hi, then –if j<i, Select(A, j) = Select(A lo, j)... or... –if j>i Select(A, j) = Select(A hi, j-i). This suggests using divide and conquer –Find some x “near” the median “quickly”. –Partition A into A lo  {x}  A hi using n-1 comparisons. –Reduce problem to “about” half the size. –“Almost” gives recurrence T(n) < T(n/2) + c n. which implies T(n) is O(n).

CSE More on Sorting12 Does this really work?? 1.Let B = half of A; free 2.Let x = Median(B); T(n/2) 3.Find i=rank(x), A = A lo  {x}  A hi ; < n 4.If (k<i) Select (A lo, k); T(3n/4) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < T(n/2) + T(3n/4) + cn Hmmm... need to try something different

CSE More on Sorting13 Does this really work (attempt #2) 1.Let B1, B2, B3 be thirds of A; free 2.Let x j = Median(Bj); x= Median({x j }); 3T(n/3)+3 3.Find i=rank(x), A = A lo  {x}  A hi ; < n 4.If (k<i) Select (A lo, k); T( ?? ) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < 3T(n/3) + T( ?? ) + cn Not particularly better... need to try something different

CSE More on Sorting14 Does this really work (attempt #3) 1.Let B1, B2,..., Bn/3 each have size 3; free 2.Let x j = Median(Bj); n/3 x 3 = n 3.x = Median({xi}); T(n/3) 4.i = rank(x), A = A lo  {x}  A hi ; < n 5.If (k<i) Select (A lo, k); T( ?? ) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < T(n/3) + T( ?? ) + cn Are we getting anywhere?? Don’t give up !! One more idea and it can be done.

CSE More on Sorting15 Does this really work (attempt #4) 1.Let B1, B2,..., B(n/5) each have size 5; free 2.Let xi = Median(Bi); n/5 x 7 < 2n 3.x= Median({xi}); T(n/5) 4.i = rank(x), A = A lo  {x}  A hi ; < n 5.If (k<i) Select (A lo, k); T( 7n/10) else Select (A hi, k-i); (in worst case) Gives recurrence, T(n) < T(n/5) + T(7n/10 ) + cn Yes!! Best known results: can find median in 3n comparisons, lower bound is 2n.

CSE More on Sorting16 Proof that recursion for median algorithm is O(n) Given T(n) = T(  n/5  ) + T(  7n/10  ) + f(n), T(0)=0, and f(n) is O(n). We know  n 0, c 0 s.t.  n  n 0, f(n)  c 0 n. (Call this equation [1].) Let c = max ( 10c 0, max {T(n)/n} ). So c 0  c/10 [2] and  n  n 0, cn  T(n). [3] Claim:  n>0, T(n)  c n. Proof by induction on n. Bases cases (n = 0, 1,..., n 0 ) : These all follow from [3]. Inductive step: Assume n>n 0 and  k<n, T(k)  c k. In particular, since  n/5  < n, T(  n/5  )  c  n/5 , which is  cn/5, [4] Similarly, T(  7n/10  )  c  7n/10   7cn/10, [5] Then T(n) = T(  n/5  ) + T(  7n/10  ) + f(n) (definition of T(n).)  cn/5 + 7cn/10 + c 0 n (from [4], [5], and [1],)  cn/5 + 7cn/10 + cn/10 (from [2].)  cn(1/5 + 7/10 + 1/10) = cn. Q.E.D. 0<n  n 0

CSE More on Sorting17 What happens if we change floors to ceilings?? Given T(n) = T(  n/5  ) + T(  7n/10  ) + f(n), T(0)=0, and f(n) is O(n). We could argue that for n>100,  n/5  <.21n and  7n/10  <.71n. We’d also can change definition of c to ensure c 0 .08c. To do so, we’d say, “Let c = max ( c 0 /.08, max {T(n)/n} ).” Then, when we get to... “Then T(n) = T(  n/5  ) + T(  7n/10  ) + f(n)” we’ll be able to argue that “T(n) .21cn +.71cn +.08cn = cn.” and be done. THERE ARE SEVERAL HOLES IN THIS REVISED PROOF! They are small detail that needs to be handled. EXTRA CREDIT TO ANY PERSON OR GROUP FOR A PERFERCTED PROOF!! 0<n  n 0