Using Divide and Conquer for Sorting

QuickSort Algorithm: Using Divide and Conquer for Sorting

Topics Covered
- QuickSort algorithm and analysis
- Randomized QuickSort
- A lower bound on comparison-based sorting

Quick Sort
Divide-and-conquer idea: divide the problem into two smaller sorting problems.
Divide:
- Select a splitting element (pivot)
- Rearrange the array (sequence/list) around the pivot

Quick Sort
Result of the rearrangement:
- All elements to the left of the pivot are less than or equal to the pivot, and
- All elements to the right of the pivot are greater than or equal to the pivot
- The pivot is in its correct place in the sorted array/list
Need: a clever split procedure (Hoare)

Quick Sort
Divide: partition into two subarrays (sub-lists)
Conquer: recursively sort the two subarrays
Combine: trivial

QuickSort (Hoare 1962)
Problem: sort n keys in nondecreasing order.
Inputs: positive integer n; array of keys S indexed from 1 to n.
Output: the array S containing the keys in nondecreasing order.

quicksort(low, high)
1. if high > low
2.   then partition(low, high, pivotIndex)
3.        quicksort(low, pivotIndex - 1)
4.        quicksort(pivotIndex + 1, high)

Partition array for Quicksort
partition(low, high, pivot)
1. pivotitem = S[low]
2. k = low
3. for j = low + 1 to high
4.   do if S[j] < pivotitem
5.     then k = k + 1
6.          exchange S[j] and S[k]
7. pivot = k
8. exchange S[low] and S[pivot]
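
For reference, here is a direct Python translation of the two procedures above, using 0-based indexing instead of the slides' 1-based indexing. This is an illustrative sketch, not the slides' own code.

```python
# A direct Python translation of quicksort/partition above (0-based indexing).

def partition(S, low, high):
    """Partition S[low..high] around the pivot S[low]; return its final index."""
    pivotitem = S[low]
    k = low                          # right edge of the "less than pivot" region
    for j in range(low + 1, high + 1):
        if S[j] < pivotitem:
            k += 1
            S[j], S[k] = S[k], S[j]  # exchange S[j] and S[k]
    S[low], S[k] = S[k], S[low]      # move the pivot into its final place
    return k

def quicksort(S, low, high):
    """Sort S[low..high] in place in nondecreasing order."""
    if high > low:
        pivot_index = partition(S, low, high)
        quicksort(S, low, pivot_index - 1)
        quicksort(S, pivot_index + 1, high)

S = [5, 3, 6, 2]                     # the example traced on the next slide
quicksort(S, 0, len(S) - 1)
print(S)                             # [2, 3, 5, 6]
```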

Example: partition with low = 1, high = 4 on S = 5 3 6 2; pivotitem = S[1] = 5
  j = 2:           5 3 6 2   (3 < 5, so k = 2; exchange S[2] and S[2], no change)
  j = 3:           5 3 6 2   (6 is not < 5)
  j = 4:           5 3 2 6   (2 < 5, so k = 3; exchange S[4] and S[3])
  after the loop:  2 3 5 6   (pivot = 3; exchange S[1] and S[3])

Partition on a sorted list: S = 3 4 6
  No exchange is ever triggered in the loop (k stays at 1), so after the loop S is still 3 4 6 with pivot = k = 1.
  On sorted input the pivot stays leftmost, so the split is 0 : n-1, the worst possible.
Exercise: how does partition work for S = 7,5,3,1? For S = 4,2,3,1,6,7,5? (One way to check is sketched below.)
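
Using the partition() sketch above, the exercise can be checked directly (a quick demo, not from the slides; the output in the comments was computed by hand):

```python
# Checking the exercise with the partition() sketch defined earlier.
for S in ([7, 5, 3, 1], [4, 2, 3, 1, 6, 7, 5]):
    p = partition(S, 0, len(S) - 1)
    print(S, "pivot index:", p)
# [1, 5, 3, 7] pivot index: 3          -- decreasing input: the split is (n-1) : 0
# [1, 2, 3, 4, 6, 7, 5] pivot index: 3
```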

Worst Case Call Tree (n = 4), S = [1, 3, 5, 7]
  Q(1,4): pivotitem = 1; splits into Q(1,0) (empty) and Q(2,4) on [3, 5, 7]
  Q(2,4): pivotitem = 3; splits into Q(2,1) (empty) and Q(3,4) on [5, 7]
  Q(3,4): pivotitem = 5; splits into Q(3,2) (empty) and Q(4,4) on [7]
  Q(4,4): single element; Q(4,3) and Q(5,4) are empty
Each call removes only its pivot, so the recursion depth is n.

Worst Case Intuition
Each successive call partitions a subarray only one element shorter, so the comparison counts add up to
t(n) = (n-1) + (n-2) + (n-3) + ... + 1 = Σ_{k=1}^{n-1} k = n(n-1)/2 = Θ(n²)

Recursion Tree for Best Case Partition
Nodes contain the problem size; every level sums to n:
  level 0: n
  level 1: n/2 + n/2
  level 2: n/4 + n/4 + n/4 + n/4
  level 3: n/8, taken 8 times
  ...
There are Θ(lg n) levels of n comparisons each, so Sum = Θ(n lg n)

Another Example of O(n lg n) Comparisons
Assume each application of partition() splits the list so that n/9 elements remain on the left side of the pivot and 8n/9 elements remain on the right side.
We will show that the longest path of calls to Quicksort is proportional to lg n, not n.
The longest path has k + 1 calls to Quicksort, where k = log_{9/8} n = lg n / lg(9/8) ≈ 6 lg n.
Let n = 1,000,000. The longest path has about 1 + 6 lg n ≈ 1 + 6 × 20 = 121 << 1,000,000 calls to Quicksort.
Note: the shortest path has about 1 + log_9 n ≈ 1 + 7 = 8 calls.

Recursion Tree for a Magic Pivot Function that Partitions a List into 1/9 and 8/9 Lists
[Tree figure: root n; children n/9 and 8n/9; grandchildren n/81, 8n/81, 8n/81, 64n/81; and so on. Every full level sums to n, and partial levels to at most n. The shortest, leftmost path has Θ(log_9 n) levels; the longest, rightmost path has Θ(log_{9/8} n) levels, so the total work is O(n lg n).]

Intuition for the Average Case
Compare the worst partition followed by the best partition against a single best partition:
  a 1 : (n-1) split followed by a best split of the n-1 part leaves subproblems of size about (n-1)/2, the same as one best split of n, at only Θ(n) extra cost.
This shows a bad split can be "absorbed" by a good split. Therefore we expect the running time for the average case to be O(n lg n).

Recurrence equations
Worst case:
  T(n) = max_{1 ≤ q ≤ n} ( T(q-1) + T(n-q) ) + Θ(n)
Average case:
  A(n) = (1/n) Σ_{q=1}^{n} ( A(q-1) + A(n-q) ) + Θ(n)
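
For the worst case, a standard substitution argument (sketched here for completeness; not necessarily the slides' own proof) confirms the recurrence solves to Θ(n²):

```latex
% Guess T(n) <= c n^2. Since (q-1) + (n-q) = n-1 is fixed, the sum of
% squares (q-1)^2 + (n-q)^2 is maximized at the endpoints q = 1 or q = n:
\[
T(n) \le \max_{1 \le q \le n} c\bigl((q-1)^2 + (n-q)^2\bigr) + \Theta(n)
     = c(n-1)^2 + \Theta(n) \le c\,n^2
\]
% for sufficiently large c.
```

The max is attained when one subproblem is empty, which is exactly what the sorted-input trace above produces, so the bound is tight: T(n) = Θ(n²) in the worst case.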

Sorts and extra memory
When a sorting algorithm does not require more than Θ(1) extra memory, we say that the algorithm sorts in place.
- The textbook implementation of Mergesort requires Θ(n) extra space.
- The textbook implementation of Heapsort is in-place.
- Our implementation of Quicksort is in-place except for the recursion stack.

Quicksort enhancements
- Choose a "good" pivot: a random element, or the median of the first, last, and middle elements (median-of-three)
- When the remaining subarray is small, switch to insertion sort
A sketch of both enhancements is given below.
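
One possible implementation of the two enhancements, reusing the partition() above. The cutoff value 16 and the helper names are illustrative choices, not prescribed by the slides.

```python
# Illustrative sketch of both enhancements; cutoff value is a tuning choice.

CUTOFF = 16  # below this size, insertion sort tends to beat quicksort

def insertion_sort(S, low, high):
    for i in range(low + 1, high + 1):
        key = S[i]
        j = i - 1
        while j >= low and S[j] > key:
            S[j + 1] = S[j]
            j -= 1
        S[j + 1] = key

def median_of_three(S, low, high):
    """Move the median of S[low], S[mid], S[high] into S[low] as the pivot."""
    mid = (low + high) // 2
    if S[mid] < S[low]:
        S[mid], S[low] = S[low], S[mid]
    if S[high] < S[low]:
        S[high], S[low] = S[low], S[high]
    if S[high] < S[mid]:
        S[high], S[mid] = S[mid], S[high]
    S[low], S[mid] = S[mid], S[low]   # median of the three now sits at S[low]

def enhanced_quicksort(S, low, high):
    if high - low + 1 <= CUTOFF:
        insertion_sort(S, low, high)  # also handles empty/one-element ranges
    else:
        median_of_three(S, low, high)
        pivot_index = partition(S, low, high)  # partition() as defined earlier
        enhanced_quicksort(S, low, pivot_index - 1)
        enhanced_quicksort(S, pivot_index + 1, high)
```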

Randomized algorithms
- Use a randomizer (such as a random number generator)
- Some of the decisions made in the algorithm are based on the output of the randomizer
- The output of a randomized algorithm can change from run to run for the same input
- The execution time of the algorithm can also vary from run to run for the same input

Randomized Quicksort
Choose the pivot randomly (or randomly permute the input array before sorting).
- The running time of the algorithm is independent of the input ordering.
- No specific input elicits worst-case behavior; the worst case depends only on the random number generator.
We assume a random number generator Random; a call to Random(a, b) returns a random number between a and b.

RQuicksort: main procedure
// S is an instance (array/sequence)
quicksort(low, high)
1. if high > low                    // otherwise the recursion terminates
2a.   then i = random(low, high)
2b.        swap(S[low], S[i])       // move the random element into S[low], where partition takes its pivot
2c.        partition(low, high, pivotIndex)
3.    quicksort(low, pivotIndex - 1)
4.    quicksort(pivotIndex + 1, high)
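
The same procedure in Python, reusing the earlier partition(); a sketch under the assumption that partition takes its pivot from S[low].

```python
import random

# Sketch of randomized quicksort, reusing the partition() defined earlier.

def randomized_quicksort(S, low, high):
    if high > low:
        i = random.randint(low, high)   # inclusive on both ends, like random(low, high)
        S[low], S[i] = S[i], S[low]     # move the randomly chosen pivot into S[low]
        pivot_index = partition(S, low, high)
        randomized_quicksort(S, low, pivot_index - 1)
        randomized_quicksort(S, pivot_index + 1, high)
```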

Randomized Quicksort Analysis
- We assume that all elements are distinct (to make the analysis simpler).
- Since we partition around a random element, all splits from 0 : n-1 through n-1 : 0 are equally likely.
- The probability of each split is 1/n.

Average case time complexity
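
A standard sketch of how the average-case recurrence above solves to Θ(n lg n) (a textbook-style argument, not necessarily the slide's own derivation):

```latex
% Both sums in A(n) = (1/n) * sum_{q=1}^{n} (A(q-1) + A(n-q)) + Theta(n)
% range over the same values A(0), ..., A(n-1), so
\[
A(n) = \frac{2}{n}\sum_{k=0}^{n-1} A(k) + \Theta(n).
\]
% Guess A(k) <= c k lg k and substitute, using the standard bound
% sum_{k=1}^{n-1} k lg k <= (1/2) n^2 lg n - (1/8) n^2:
\[
A(n) \le \frac{2c}{n}\Bigl(\tfrac{1}{2}n^2\lg n - \tfrac{1}{8}n^2\Bigr) + \Theta(n)
     = c\,n\lg n - \tfrac{c}{4}\,n + \Theta(n) \le c\,n\lg n
\]
% for c large enough, so A(n) = O(n lg n); together with the Omega(n lg n)
% lower bound for comparison sorts (proved later), A(n) = Theta(n lg n),
% and more precisely A(n) is about 2 n ln n = 1.39 n lg n.
```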

Summary of worst-case running times
- exchange/insertion/selection sort: Θ(n²)
- mergesort: Θ(n lg n)
- quicksort: Θ(n²), but Θ(n lg n) in the average case
- heapsort: Θ(n lg n)

Sorting
So far, our best sorting algorithms run in Θ(n lg n) in the worst case. CAN WE DO BETTER??

Goal
Show that any correct sorting algorithm based only on comparisons of keys needs at least n lg n comparisons in the worst case.
Note: there are linear-time sorting algorithms that do arithmetic on keys rather than comparing them (see the sketch below).
Outline:
1) Represent a sorting algorithm with a decision tree.
2) Cover the properties of these decision trees.
3) Prove that any correct sorting algorithm based on comparisons needs at least n lg n comparisons.
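
Counting sort is one such non-comparison, linear-time sort (Chapter 8 material). A minimal sketch, assuming keys are integers in a small range 0..k:

```python
# Minimal counting sort sketch: no key comparisons, just arithmetic on keys.
# Assumes keys are integers in the range 0..k; runs in Theta(n + k) time.

def counting_sort(S, k):
    """Return a stably sorted copy of S, whose keys must lie in 0..k."""
    count = [0] * (k + 1)
    for key in S:                 # count occurrences of each key
        count[key] += 1
    for i in range(1, k + 1):     # prefix sums: count[i] = number of keys <= i
        count[i] += count[i - 1]
    out = [None] * len(S)
    for key in reversed(S):       # place keys back to front, keeping stability
        count[key] -= 1
        out[count[key]] = key
    return out

print(counting_sort([5, 3, 6, 2], k=6))   # [2, 3, 5, 6]
```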

Decision Trees
- A decision tree is a way to represent the working of an algorithm on all possible inputs of a given size.
- There are different decision trees for each algorithm, one tree for each input size n.
- Each internal node contains a test of some sort on the data.
- Each leaf contains an output.
- The tree models only the comparisons and ignores all other aspects of the algorithm.

For a particular sorting algorithm
- One decision tree for each input size n.
- We can view the tree paths as an unwinding of the actual execution of the algorithm.
- It is a tree of all possible execution traces.

Decision tree for sortThree
sortThree: a <- S[1]; b <- S[2]; c <- S[3]
  if a < b then
    if b < c then S is a,b,c
    else if a < c then S is a,c,b
    else S is c,a,b
  else if b < c then
    if a < c then S is b,a,c
    else S is b,c,a
  else S is c,b,a
[Decision tree figure: the root tests a < b; the next level tests b < c; the level below tests a < c; the leaves are the six orderings.]
Note: 3! leaves, representing the 6 permutations of 3 distinct numbers.
2 paths with 2 comparisons, 4 paths with 3 comparisons; the tree has 5 comparison nodes in total.
A runnable version is sketched below.
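
A runnable Python version of sortThree, following the decision tree above; each if-test corresponds to one internal (comparison) node and each return to one leaf. A sketch for illustration.

```python
# Runnable sortThree: 2 comparisons on 2 of the 6 input orders, 3 on the
# other 4, matching the short and long root-to-leaf paths in the tree.

def sort_three(S):
    """Return the three elements of S in nondecreasing order."""
    a, b, c = S[0], S[1], S[2]
    if a < b:
        if b < c:
            return [a, b, c]      # leaf reached after only 2 comparisons
        elif a < c:
            return [a, c, b]
        else:
            return [c, a, b]
    elif b < c:
        if a < c:
            return [b, a, c]
        else:
            return [b, c, a]
    else:
        return [c, b, a]          # leaf reached after only 2 comparisons

print(sort_three([7, 3, 5]))      # [3, 5, 7]
```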

Exchange Sort
1. for (i = 1; i <= n-1; i++)
2.   for (j = i+1; j <= n; j++)
3.     if (S[j] < S[i])
4.       swap(S[i], S[j])
At the end of i = 1: S[1] = min of S[1..n]
At the end of i = 2: S[2] = min of S[2..n]
At the end of i = 3: S[3] = min of S[3..n]
The inner loop does n - i comparisons for each i, so the total is always n(n-1)/2.
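
A runnable version with a comparison counter (an illustrative sketch): the count comes out to n(n-1)/2 regardless of the input, matching the analysis above and the 3-comparison paths on the next slide.

```python
# Exchange sort with a comparison counter; the counter is illustrative.

def exchange_sort(S):
    n = len(S)
    comparisons = 0
    for i in range(n - 1):            # i = 1 .. n-1 in the slide's 1-based pseudocode
        for j in range(i + 1, n):     # j = i+1 .. n
            comparisons += 1
            if S[j] < S[i]:
                S[i], S[j] = S[j], S[i]
    return comparisons

S = [7, 3, 5]
print(exchange_sort(S), S)            # 3 comparisons for n = 3; S == [3, 5, 7]
```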

Decision Tree for Exchange Sort, n = 3 (example input: (7,3,5))
[Decision tree figure: for i = 1 the tests are S[2] < S[1] and then S[3] < S[1]; for i = 2 the test is S[3] < S[2]; a swap occurs whenever a test succeeds. The example (7,3,5) follows the path (7,3,5) -> (3,7,5) -> (3,7,5) -> (3,5,7).]
Every path has exactly 3 comparisons; the tree has 7 comparison nodes and 8 leaves ((c,b,a) and (c,a,b) each appear twice).

Questions about the Decision Tree for a Correct Sorting Algorithm Based ONLY on Comparison of Keys
- What is the length of the longest path in an insertion sort decision tree? In a merge sort decision tree?
- How many different permutations of a sequence of n elements are there?
- How many leaves must a decision tree for a correct sorting algorithm have? Number of leaves ≥ n!
- What does it mean if there are more than n! leaves?

Proposition: any decision tree that sorts n elements has depth Ω(n lg n).
Consider a decision tree for the best comparison-based sorting algorithm. It has exactly n! leaves: if it had more than n! leaves, there would be more than one path from the root to some permutation, so you could find a better algorithm whose tree has exactly n! leaves.
We will show there is a path from the root to a leaf in the decision tree with Ω(n lg n) comparison nodes. The best sorting algorithm will have the "shallowest" tree.

Proposition: Any Decision Tree that Sorts n Elements has Depth Ω(n lg n)
- The depth of the root is 0.
- Assume that the depth of the "shallowest" tree is d (i.e., there are d comparisons on the longest path from the root to a leaf).
- A binary tree of depth d can have at most 2^d leaves.
- Thus we have n! ≤ 2^d; taking lg of both sides, d ≥ lg(n!).
- It can be shown that lg(n!) = Θ(n lg n). QED
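
The "it can be shown" step follows from a standard elementary bound (sketched here; not the slides' own argument): keep only the largest half of the factors of n!.

```latex
\[
\lg(n!) \;=\; \sum_{k=1}^{n} \lg k \;\ge\; \sum_{k=\lceil n/2\rceil}^{n} \lg k
\;\ge\; \frac{n}{2}\,\lg\frac{n}{2} \;=\; \Omega(n \lg n),
\]
\[
\text{and trivially } \lg(n!) \le \lg(n^n) = n \lg n,
\text{ so } \lg(n!) = \Theta(n \lg n).
\]
```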

Implications
The running time of any sorting algorithm based on whole-key comparisons is Ω(n lg n) in the worst case for an n-element sequence.
Are there other kinds of sorting algorithms that can run asymptotically faster than comparison-based algorithms?