Sorting atomic items Chapter 5 Distribution based sorting paradigms.



Distribution-based sorting. QuickSort is an in-place algorithm, but consider the stack used by the recursive calls. For balanced partitions it takes O(log n) space; in the worst case of unbalanced partitions there are Ω(n) nested calls and hence Θ(n) space. QuickSort can be modified to bound the recursion depth: Algorithm Bounded. It is based on the fact that the correctness of QuickSort does not depend on the order in which the recursive calls are executed. In addition, small arrays are better sorted with InsertionSort (typically when n is on the order of tens). The modified version mixes one recursive call with an iterative while loop.

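Algorithm Bounded(S, i, j) [Pseudocode figure not preserved in the transcript. Surviving labels: n > n0 and n ≤ n0, the two cases on the size n of S[i, j] relative to the threshold n0.]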

Algorithm Bounded(S, i, j). The recursive call is executed only on the smaller part of the partition. The recursive call on the larger part is dropped in favor of another iteration of the while loop. Example (Partition): 10 3 7 15 21 18 2 11 with pivot = S[2] = 3 becomes 2 3 10 7 15 21 18 11, so p = 2. There is only one recursive call, on S[i, p-1]; for the larger part S[p+1, j] we iterate the while loop. Technique: elimination of tail recursion. Bounded takes O(n log n) time on average and O(log n) space.
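The scheme described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the slide's original pseudocode: the cutoff `N0 = 16`, the random pivot choice, the Lomuto-style `partition`, and all function names are hypothetical choices made for the example.

```python
import random

N0 = 16  # hypothetical cutoff below which InsertionSort is used


def insertion_sort(s, i, j):
    """Sort s[i..j] (inclusive) in place."""
    for a in range(i + 1, j + 1):
        x = s[a]
        b = a - 1
        while b >= i and s[b] > x:
            s[b + 1] = s[b]
            b -= 1
        s[b + 1] = x


def partition(s, i, j):
    """Partition s[i..j] around a random pivot; return the pivot's final index."""
    r = random.randint(i, j)
    s[r], s[j] = s[j], s[r]
    pivot = s[j]
    p = i
    for a in range(i, j):
        if s[a] < pivot:
            s[a], s[p] = s[p], s[a]
            p += 1
    s[p], s[j] = s[j], s[p]
    return p


def bounded_quicksort(s, i=0, j=None):
    """Recurse only on the smaller part; loop on the larger part."""
    if j is None:
        j = len(s) - 1
    while j - i + 1 > N0:
        p = partition(s, i, j)
        if p - i < j - p:
            bounded_quicksort(s, i, p - 1)  # smaller part: recursive call
            i = p + 1                       # larger part: next loop iteration
        else:
            bounded_quicksort(s, p + 1, j)
            j = p - 1
    insertion_sort(s, i, j)  # at most N0 items remain
```

Each recursive call receives at most half of the current range, so the stack depth stays logarithmic even when the partitions are badly unbalanced.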

Multi-way QuickSort. Two-level memory model: M = internal memory size, B = block size. Split the sequence S into k = Θ(M/B) sub-sequences using k-1 pivots. We would like balanced partitions, that is, of Θ(n/k) items each. Selecting k-1 "good pivots" $s_1, s_2, \ldots, s_{k-1}$ is not a trivial task! Let bucket $B_i$ be the portion of S between pivots $s_{i-1}$ and $s_i$. We want to guarantee that $|B_i| = \Theta(n/k)$ for all buckets. Then the portions have size $n/k$ after the first step of Multi-way QuickSort, $n/k^2$ after the second step, $n/k^3$ after the third step, and so on. Stop when $n/k^i \le M$, that is, $n/M \le k^i$, where i is the number of steps: $i \ge \log_k (n/M) = \log_{M/B} (n/M)$. This number of steps is enough to obtain portions no longer than M, which can be sorted in internal memory. Partition takes O(n/B) I/Os (dual to MergeSort): 1 input block, k output blocks.
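As a small worked example of the step count above, the following Python sketch counts how many partitioning steps are needed before the portions fit in internal memory. It assumes perfectly balanced buckets of size n/k and takes k = M/B exactly (the slide only requires k = Θ(M/B)); the function name is hypothetical.

```python
def partition_steps(n, M, B):
    """Number of k-way partitioning steps until portions have at most M items.

    Assumes k = M // B and perfectly balanced buckets (illustrative only).
    """
    k = M // B
    if k < 2:
        raise ValueError("need M >= 2B so that k >= 2")
    steps, size = 0, n
    while size > M:
        size = -(-size // k)  # ceiling of size / k
        steps += 1
    return steps


# Example parameters (hypothetical): 2^40 items, M = 2^20, B = 2^10, so k = 1024.
steps = partition_steps(2**40, 2**20, 2**10)
```

With these parameters n/M = 2^20 and k = 2^10, so the loop stops after the second step, in line with i ≥ log_k (n/M).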


Multi-way QuickSort. Goal: find k good pivots efficiently. We use a randomized strategy called oversampling: Θ(ak) items are sampled, where a ≥ 0 is the oversampling parameter. The Θ(ak) candidates are sorted in Θ(ak log(ak)) time into an array A, and the k-1 pivots are selected evenly spaced, $s_i = A[(a+1)i]$; this balanced selection should provide good pivots.

Multi-way QuickSort [Figure: the sorted sample for k = 5, with consecutive pivots a+1 positions apart.]
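The oversampling strategy can be sketched in Python as below. This is a minimal in-memory illustration, not the external-memory procedure itself: the sample size (a+1)k - 1, sampling with replacement via `random.choices`, and the helper names `select_pivots` and `bucket_sizes` are assumptions made for the example.

```python
import bisect
import random


def select_pivots(s, k, a):
    """Return k-1 pivots of s chosen by oversampling with parameter a."""
    sample_size = (a + 1) * k - 1
    # Assumption: sample uniformly at random with replacement.
    sample = sorted(random.choices(s, k=sample_size))
    # The 1-based position (a+1)*i corresponds to the 0-based index (a+1)*i - 1.
    return [sample[(a + 1) * i - 1] for i in range(1, k)]


def bucket_sizes(s, pivots):
    """Count the items of s in each of the len(pivots) + 1 buckets."""
    sizes = [0] * (len(pivots) + 1)
    for x in s:
        # Bucket i holds the items x with pivots[i-1] < x <= pivots[i].
        sizes[bisect.bisect_left(pivots, x)] += 1
    return sizes


# Usage with hypothetical parameters: k = 5 buckets, a = 20.
items = list(range(10000))
random.shuffle(items)
pivots = select_pivots(items, k=5, a=20)
sizes = bucket_sizes(items, pivots)
```

Inspecting `sizes` for different values of `a` gives a feel for how the oversampling parameter affects the balance of the buckets.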

Multi-way QuickSort. The larger a is, the closer the bucket sizes are to Θ(n/k). If a = n/k, the sample has about n elements and the sample array A cannot be sorted in internal memory M! If a = 0 the selection is fast, but unbalanced partitions are more probable. A good choice for a is Θ(log k): pivot selection then costs $\Theta(k \log^2 k)$.
Lemma. Let k ≥ 2 and a + 1 = 12 ln k. A sample of (a+1)k - 1 items suffices to ensure that all buckets receive fewer than 4n/k elements, with probability at least 1/2.
Proof. We find an upper bound on the probability of the complement event (failure sampling): we show that there exists a bucket containing more than 4n/k elements with probability at most 1/2. Consider the sorted version S' of S. Logically split S' into k/2 segments $(t_1, t_2, \ldots, t_{k/2})$ of 2n/k elements each.

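Multi-way QuickSort. The failure event is that there exists a bucket $B_i$ with more than 4n/k items. Such a bucket spans more than one segment and fully covers at least one of them, say $t_2$: pivots $s_{i-1}$ and $s_i$ fall outside $t_2$. Hence fewer than a+1 samples fall in $t_2$ (by the selection algorithm, consecutive pivots are a+1 samples apart, so a segment lying between them contains fewer). Therefore:
$\Pr(\exists B_i : |B_i| \ge 4n/k) \le \Pr(\exists t_j \text{ containing} < a+1 \text{ samples}) \le (k/2) \cdot \Pr(\text{a specific segment contains} < a+1 \text{ samples})$,
since k/2 is the number of segments (union bound). [Figure: a bucket between pivots $s_{i-1}$ and $s_i$.]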

Multi-way QuickSort. Pr(a sample falls in a given segment) = (2n/k)/n = 2/k, if the samples are drawn uniformly at random from S (and hence from S'). Let X be the number of samples falling in a given segment; we want to bound Pr(X < a+1). Observe that
$E[X] = ((a+1)k - 1) \cdot 2/k = 2(a+1) - 2/k \ge 2(a+1) - 1$ for $k \ge 2$,
$2(a+1) - 1 \ge \tfrac{3}{2}(a+1)$ for all $a \ge 1$, and therefore
$a+1 \le \tfrac{2}{3} E[X] = (1 - \tfrac{1}{3}) E[X]$.
By the Chernoff bound, $\Pr(X < (1-\delta) E[X]) \le e^{-(\delta^2/2) E[X]}$. Setting $\delta = 1/3$ and assuming $a+1 = 12 \ln k$:
$\Pr(X < a+1) \le \Pr(X \le (1 - \tfrac{1}{3}) E[X]) \le e^{-E[X]/18} \le e^{-(a+1)/12} = e^{-\ln k} = 1/k$.

Multi-way QuickSort. We have shown that $\Pr(X < a+1) \le 1/k$, and we had already derived $\Pr(\exists B_i : |B_i| \ge 4n/k) \le (k/2) \cdot \Pr(\text{a segment contains} < a+1 \text{ samples})$. Combining the two: $\Pr(\exists B_i : |B_i| \ge 4n/k) \le (k/2)(1/k) = 1/2$. This is the complement event of the lemma, so all buckets receive fewer than 4n/k elements with probability at least 1/2.
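The chain of inequalities in the proof can be checked numerically. The Python sketch below evaluates the union bound (k/2) · exp(-(δ²/2) E[X]) with δ = 1/3 and E[X] = ((a+1)k - 1) · 2/k. It assumes a real-valued a + 1 = 12 ln k (in practice a would be rounded to an integer), and the function name is hypothetical.

```python
import math


def failure_bound(k):
    """Upper bound on Pr(some bucket has >= 4n/k items) from the proof."""
    a_plus_1 = 12 * math.log(k)               # a + 1 = 12 ln k (not rounded)
    expected = (a_plus_1 * k - 1) * 2 / k     # E[X] for one segment
    delta = 1 / 3
    per_segment = math.exp(-(delta ** 2 / 2) * expected)  # Chernoff bound
    return (k / 2) * per_segment              # union bound over k/2 segments


# The bound should stay below 1/2 for every k >= 2 in this range.
worst = max(failure_bound(k) for k in range(2, 1001))
```

The value computed this way is somewhat smaller than 1/2 because it uses E[X] directly instead of the relaxation E[X] ≥ (3/2)(a+1) used in the proof.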

Dual Pivot QuickSort. Two pivots p and q and three indices l, k, g split the array into 4 pieces: items smaller than p; items greater than or equal to p and smaller than or equal to q; items not yet considered; items greater than q. It is a good strategy in practice: no theoretical result is given here, but empirical results are good on average. [Figure: array layout | < p | l | p ≤ · ≤ q | k | ? | g | > q |]

Dual Pivot QuickSort. Similar to the 3-way Partition: it maintains the invariants. Items equal to a pivot are not treated separately. Two indices, l and k, move rightward, while g moves leftward. Termination: k ≥ g. For item k, compare S[k] with p: if S[k] < p, exchange S[k] and S[l] and increment the pointers; else if S[k] > q, decrease g while S[g] > q and g ≠ k, so that the last value of g satisfies S[g] ≤ q, then exchange S[k] and S[g] …… The comparison on S[k] drives the phases, possibly including a long shift of g to the left. The nesting of the comparisons is the key to the efficiency of the algorithm.
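The partition steps described above can be sketched in Python as follows. This follows the commonly used dual-pivot scheme with the pivots taken from the two ends of the range; the pivot placement, the exact index conventions (the loop here runs while k ≤ g), and the function names are assumptions for the example and may differ from the code the slides refer to.

```python
def dual_pivot_partition(s, lo, hi):
    """Partition s[lo..hi] around two pivots; return their final positions."""
    if s[lo] > s[hi]:
        s[lo], s[hi] = s[hi], s[lo]
    p, q = s[lo], s[hi]      # pivots, with p <= q
    l = lo + 1               # s[lo+1 .. l-1] holds items < p
    g = hi - 1               # s[g+1 .. hi-1] holds items > q
    k = l                    # s[l .. k-1] holds items in [p, q]
    while k <= g:
        if s[k] < p:
            s[k], s[l] = s[l], s[k]
            l += 1
        elif s[k] > q:
            while s[g] > q and k < g:   # shift g left past items > q
                g -= 1
            s[k], s[g] = s[g], s[k]
            g -= 1
            if s[k] < p:                # the item swapped in may be < p
                s[k], s[l] = s[l], s[k]
                l += 1
        k += 1
    l -= 1
    g += 1
    s[lo], s[l] = s[l], s[lo]   # move p to its final position
    s[hi], s[g] = s[g], s[hi]   # move q to its final position
    return l, g


def dual_pivot_quicksort(s, lo=0, hi=None):
    """Sort s[lo..hi] in place using the dual-pivot partition."""
    if hi is None:
        hi = len(s) - 1
    if lo >= hi:
        return
    lp, rp = dual_pivot_partition(s, lo, hi)
    dual_pivot_quicksort(s, lo, lp - 1)
    dual_pivot_quicksort(s, lp + 1, rp - 1)
    dual_pivot_quicksort(s, rp + 1, hi)
```

Note the nested comparison in the `elif` branch: an item is compared with q only after it has failed the test against p, which is the nesting of comparisons mentioned above.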

Dual Pivot Partition [Figure: three snapshots of the partition with indices l, k, g and pivots p = 12, q = 17, on the items 5 12 9 12 13 15 17 19 12 26 18 22 20. Step labels: "exchange 19 and 17, k++, g--"; "l++, exchange 12 and 13, k++"; "no exchange". The positions of the indices are not preserved in the transcript.]

Dual Pivot QuickSort. The complete code description and a visualization of the algorithm can be found on YouTube by searching for "Dual pivot QuickSort". Conclusions: even a very old, classic algorithm such as QuickSort can still be sped up and improved!