Peter Andreae Computer Science Victoria University of Wellington Copyright: Peter Andreae, Victoria University of Wellington Fast Sorting COMP 103 2012.

Presentation transcript:

Peter Andreae Computer Science Victoria University of Wellington Copyright: Peter Andreae, Victoria University of Wellington Fast Sorting COMP T2 #16

COMP103 16:2 Menu
    Sorting
    Design by Divide and Conquer
    Merge Sort
    QuickSort
Notes:
    No lecture Friday.
    Terms Test 1 available.
    Tutorial changes this week.

COMP103 16:3 Slow Sorts:
Insertion Sort, Selection Sort, Bubble Sort:
    All slow: O(n^2) (except Insertion Sort on almost sorted lists)
Problem:
    Insertion and Bubble only compare adjacent items, and only move items one step at a time
    Selection compares every pair of items, ignoring the results of previous comparisons
Solution:
    Must compare and swap items at a distance
    Must not perform redundant comparisons

COMP103 16:4 Divide and Conquer Sorts
To Sort:
    Split
    Sort each part (recursive)
    Combine
Where does the work happen?
    MergeSort: split is trivial; combine does all the work
    QuickSort: split does all the work; combine is trivial
[Diagram: Array → Split → SubArrays → Sort → SortedSubArrays → Combine → Sorted Array, with each SubArray split, sorted and combined in the same way]

COMP103 16:5 Merge Sort
    Split the array exactly in half
    Sort each half
    “Merge” them together
[Diagram: the two sorted halves being merged into a temporary array]

COMP103 16:6 MergeSort
Needs a temporary array for copying:
    create temporary array
    [fill with a copy of the original data; not needed for the simple version]
Need a "wrapper" method to start it off.

    public static <E> void mergeSort(E[] data, int size, Comparator<E> comp){
        E[] other = (E[])new Object[size];     // temporary array
        for (int i=0; i<size; i++)             // copy of the data (not needed for simple version)
            other[i]=data[i];
        mergeSort(data, other, 0, size, comp);
    }

COMP103 16:7 MergeSort

    private static <E> void mergeSort(E[] data, E[] temp, int low, int high, Comparator<E> comp){
        // sort items from low..high-1 using temp array
        if (low < high-1){
            int mid = (low+high)/2;   // mid = low of upper half = high of lower half
            // sort each half
            mergeSort(data, temp, low, mid, comp);
            mergeSort(data, temp, mid, high, comp);
            // merge into temp
            merge(data, temp, low, mid, high, comp);
            // copy back
            for (int i=low; i<high; i++)
                data[i]=temp[i];
        }
    }

COMP103 16:8 Merge

    /** Merge from[low..mid-1] with from[mid..high-1] into to[low..high-1]. */
    private static <E> void merge(E[] from, E[] to, int low, int mid, int high, Comparator<E> comp){
        int indxLeft = low;     // index into the lower half of the "from" range
        int indxRight = mid;    // index into the upper half of the "from" range
        int indexTo = low;      // where we will put the item into "to"
        while (indxLeft<mid && indxRight<high){
            if (comp.compare(from[indxLeft], from[indxRight]) <= 0)
                to[indexTo++] = from[indxLeft++];
            else
                to[indexTo++] = from[indxRight++];
        }
        // copy over the remainder. Note only one loop will do anything.
        while (indxLeft<mid)
            to[indexTo++] = from[indxLeft++];
        while (indxRight<high)
            to[indexTo++] = from[indxRight++];
    }
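The wrapper, the recursive sort and the merge step can be put together into one runnable sketch of the simple version (merge into temp, then copy back). The class name SimpleMergeSort and the use of Arrays.copyOf to create the temporary array are assumptions made for this example, not part of the slides.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch of the simple merge sort: merge into temp, then copy back. */
final class SimpleMergeSort {

    /** Wrapper: creates the temporary array and starts the recursion. */
    static <E> void mergeSort(E[] data, int size, Comparator<? super E> comp) {
        E[] temp = Arrays.copyOf(data, size);   // assumption: copyOf instead of a cast
        mergeSort(data, temp, 0, size, comp);
    }

    /** Sorts data[low..high-1], using temp[low..high-1] as working space. */
    private static <E> void mergeSort(E[] data, E[] temp, int low, int high,
                                      Comparator<? super E> comp) {
        if (low < high - 1) {
            int mid = (low + high) / 2;
            mergeSort(data, temp, low, mid, comp);
            mergeSort(data, temp, mid, high, comp);
            merge(data, temp, low, mid, high, comp);
            for (int i = low; i < high; i++) data[i] = temp[i];   // copy back
        }
    }

    /** Merges from[low..mid-1] with from[mid..high-1] into to[low..high-1]. */
    private static <E> void merge(E[] from, E[] to, int low, int mid, int high,
                                  Comparator<? super E> comp) {
        int left = low, right = mid, out = low;
        while (left < mid && right < high) {
            // "<=" takes the left item on ties, which keeps the sort stable
            if (comp.compare(from[left], from[right]) <= 0) to[out++] = from[left++];
            else to[out++] = from[right++];
        }
        while (left < mid) to[out++] = from[left++];
        while (right < high) to[out++] = from[right++];
    }

    public static void main(String[] args) {
        String[] items = {"b2", "a1", "b1", "a2"};
        // compare by first letter only, so b2/b1 and a1/a2 count as equal pairs
        mergeSort(items, items.length, Comparator.comparing((String s) -> s.charAt(0)));
        System.out.println(Arrays.toString(items));   // prints [a1, a2, b2, b1]
    }
}
```

Comparing by first letter only makes the equal items visible: b2 stays ahead of b1 because the merge takes from the left half on a tie.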

COMP103 16:9 MergeSort

COMP103 16:10 MergeSort
Why copy items over twice?

    private static <E> void mergeSort(E[] data, E[] temp, int low, int high, Comparator<E> comp){
        // sort items from low..high-1 using temp array
        if (high > low+1){
            int mid = (low+high)/2;   // mid = low of upper half = high of lower half
            // sort each half in temp (using data as extra space)
            mergeSort(temp, data, low, mid, comp);
            mergeSort(temp, data, mid, high, comp);
            // merge halves from temp back into data
            merge(temp, data, low, mid, high, comp);
        }
    }

Note how we swap temp and data on each recursive call. This only works if temp starts out as a copy of data, which is why the wrapper method fills it.
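A runnable sketch of this swapping version follows. The class name SwappingMergeSort and the use of Arrays.copyOf for the initial copy are assumptions for illustration; the recursion itself follows the slide.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch of merge sort that swaps the roles of data and temp at each level. */
final class SwappingMergeSort {

    /** Wrapper: temp must start as a copy of data for the swapping to work. */
    static <E> void mergeSort(E[] data, int size, Comparator<? super E> comp) {
        E[] other = Arrays.copyOf(data, size);
        mergeSort(data, other, 0, size, comp);
    }

    /** On return, data[low..high-1] is sorted; temp is used as working space. */
    private static <E> void mergeSort(E[] data, E[] temp, int low, int high,
                                      Comparator<? super E> comp) {
        if (high > low + 1) {
            int mid = (low + high) / 2;
            mergeSort(temp, data, low, mid, comp);    // sorts temp[low..mid-1]
            mergeSort(temp, data, mid, high, comp);   // sorts temp[mid..high-1]
            merge(temp, data, low, mid, high, comp);  // merged result lands in data
        }
    }

    private static <E> void merge(E[] from, E[] to, int low, int mid, int high,
                                  Comparator<? super E> comp) {
        int left = low, right = mid, out = low;
        while (left < mid && right < high) {
            if (comp.compare(from[left], from[right]) <= 0) to[out++] = from[left++];
            else to[out++] = from[right++];
        }
        while (left < mid) to[out++] = from[left++];
        while (right < high) to[out++] = from[right++];
    }

    public static void main(String[] args) {
        Integer[] nums = {5, 3, 9, 1, 7, 2, 8};
        mergeSort(nums, nums.length, Comparator.<Integer>naturalOrder());
        System.out.println(Arrays.toString(nums));   // prints [1, 2, 3, 5, 7, 8, 9]
    }
}
```

A range of one item needs no work because both arrays already hold that item, so whichever array the caller merges from has it in place.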

COMP103 16:11
data            [p a1 r f e q2 w q1 t z2 x c v b z1 a2]
msort(0..16)    [p a1 r f e q2 w q1 t z2 x c v b z1 a2]
  msort(0..8)     [p a1 r f e q2 w q1]
    msort(0..4)     [p a1 r f]
      msort(0..2)     [p a1]
        msort(0..1)     [p]
        msort(1..2)     [a1]
        merge(0.1.2)    [a1 p]
      msort(2..4)     [r f]
        msort(2..3)     [r]
        msort(3..4)     [f]
        merge(2.3.4)    [f r]
      merge(0.2.4)    [a1 f p r]
    msort(4..8)     [e q2 w q1]
      msort(4..6)     [e q2]
        :
        merge(4.5.6)    [e q2]
      msort(6..8)     [w q1]
        :
        merge(6.7.8)    [q1 w]
      merge(4.6.8)    [e q2 q1 w]
    merge(0.4.8)    [a1 e f p q2 q1 r w]
  msort(8..16)    [t z2 x c v b z1 a2]
    :
    merge(8.12.16)  [a2 b c t v x z2 z1]
  merge(0.8.16)   [a1 a2 b c e f p q2 q1 r t v w x z2 z1]

COMP103 16:12 MergeSort Cost

COMP103 16:13 MergeSort Cost
    Level 1:   2 * n/2   = n
    Level 2:   4 * n/4   = n
    Level 3:   8 * n/8   = n
    Level 4:  16 * n/16  = n
       :
    Level k:   n * 1     = n
How many levels?
Total cost? = O(   )
    n = 1,000:
    n = 1,000,000:
    n = 1,000,000,000:
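One way to explore the "how many levels?" question is to count how often n items can be halved before the pieces have size 1. The sketch below does that for the three values of n on the slide; the class and method names are hypothetical, and "steps" here just means levels times n, following the table above.

```java
/** Sketch: count the halving levels of merge sort for a given n. */
final class MergeSortLevels {

    /** Number of doublings needed to reach at least n pieces, i.e. ceil(lg n). */
    static int levels(long n) {
        int levels = 0;
        long pieces = 1;
        while (pieces < n) {
            pieces *= 2;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        for (long n : new long[] {1_000L, 1_000_000L, 1_000_000_000L}) {
            int k = levels(n);
            // each level costs about n, so the total is about k * n
            System.out.println("n = " + n + ": " + k + " levels, about " + (k * n) + " steps");
        }
    }
}
```

Multiplying n by a thousand adds only about ten levels, which is why the total grows only a little faster than n itself.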

COMP103 16:14 Analysing with Recurrence Relations

    private static <E> void mergeSort(E[] data, E[] temp, int low, int high, Comparator<E> comp){
        if (high > low+1){
            int mid = (low+high)/2;
            mergeSort(temp, data, low, mid, comp);
            mergeSort(temp, data, mid, high, comp);
            merge(temp, data, low, mid, high, comp);
        }
    }

Assume the cost of mergeSort on n items is C(n).
Recurrence Relation:
    C(n) = C(n/2) + C(n/2) + n
         = 2 C(n/2) + n
Solve by repeated substitution & find the pattern
Solve by general method (MATH 261)

COMP103 16:15 Solving Recurrence Relations
    C(n) = 2 C(n/2) + n
         = 2 [2 C(n/4) + n/2] + n
         = 4 C(n/4) + 2 * n
         = 4 [2 C(n/8) + n/4] + 2 * n
         = 8 C(n/8) + 3 * n
         = 16 C(n/16) + 4 * n
           :
         = 2^k C(n/2^k) + k * n
    when n = 2^k, k = lg(n):
         = n C(1) + lg(n) * n
    since C(1) = 0:
    C(n) = lg(n) * n
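The closed form can be checked numerically by evaluating the recurrence directly. This is only a sketch: the class and method names are hypothetical, and it assumes n is a power of two and C(1) = 0, as in the derivation above.

```java
/** Sketch: evaluate C(n) = 2 C(n/2) + n directly and compare with n * lg(n). */
final class MergeSortRecurrence {

    /** C(1) = 0, C(n) = 2 C(n/2) + n; intended for n a power of two. */
    static long cost(long n) {
        return (n <= 1) ? 0 : 2 * cost(n / 2) + n;
    }

    /** lg(n) for n a power of two (position of the single set bit). */
    static int lg(long n) {
        return 63 - Long.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (long n = 2; n <= 1024; n *= 2) {
            System.out.println("n = " + n + "  C(n) = " + cost(n) + "  n*lg(n) = " + (n * lg(n)));
        }
    }
}
```

For every power of two the two columns agree, for example C(1024) = 10240 = 1024 * 10.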

COMP103 16:16 Other Properties?
Stable:
    Doesn’t jump any item over an unsorted region ⇒ two equal items preserve their order
Same cost on all input:
    No bad worst cases
“Natural merge” variant doesn’t sort already sorted regions
    ⇒ will be very fast, O(n), on almost sorted lists
Not in place:
    Needs double the space for temporary work
There is an iterative version:
    do all size 1's, then size 2's, then size 4's, etc.
Can be done with huge files on disk
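The iterative version mentioned above can be sketched as a bottom-up merge sort: merge all runs of size 1 into runs of size 2, then size 2 into size 4, and so on. The class name BottomUpMergeSort and the way the two arrays swap roles after each pass are assumptions for this example; the slides do not show this code.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch of an iterative (bottom-up) merge sort. */
final class BottomUpMergeSort {

    static <E> void mergeSort(E[] data, Comparator<? super E> comp) {
        int n = data.length;
        E[] from = data;
        E[] to = Arrays.copyOf(data, n);
        // width = length of the already-sorted runs: 1, 2, 4, ...
        for (int width = 1; width < n; width *= 2) {
            for (int low = 0; low < n; low += 2 * width) {
                int mid = Math.min(low + width, n);
                int high = Math.min(low + 2 * width, n);
                merge(from, to, low, mid, high, comp);
            }
            E[] swap = from;   // the merged runs are now in "to": swap roles
            from = to;
            to = swap;
        }
        if (from != data) System.arraycopy(from, 0, data, 0, n);
    }

    private static <E> void merge(E[] from, E[] to, int low, int mid, int high,
                                  Comparator<? super E> comp) {
        int left = low, right = mid, out = low;
        while (left < mid && right < high) {
            if (comp.compare(from[left], from[right]) <= 0) to[out++] = from[left++];
            else to[out++] = from[right++];
        }
        while (left < mid) to[out++] = from[left++];
        while (right < high) to[out++] = from[right++];
    }

    public static void main(String[] args) {
        Integer[] nums = {9, 4, 7, 1, 8, 2, 6, 3, 5};
        mergeSort(nums, Comparator.<Integer>naturalOrder());
        System.out.println(Arrays.toString(nums));   // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

Because each pass reads runs sequentially and writes sequentially, the same pattern suits data that lives in files rather than in memory.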

COMP103 16:17 QuickSort
Divide and Conquer, but does its work in the “split” step.
It splits the array into two (possibly unequal) parts:
    choose a “pivot” item
    make sure all items < pivot are in the left part
    and all items > pivot are in the right part
Then (recursively) sorts each part.

    public static <E> void quickSort(E[] data, int size, Comparator<E> comp){
        quickSort(data, 0, size, comp);
    }

COMP103 16:18 QuickSort

    public static <E> void quickSort(E[] data, int low, int high, Comparator<E> comp){
        if (high-low < 2)    // at most one item: nothing to sort
            return;
        else {
            // split into two parts, mid = index of boundary
            int mid = partition(data, low, high, comp);
            quickSort(data, low, mid, comp);
            quickSort(data, mid, high, comp);
        }
    }

[Diagram: example array S E X B Q R F A P L J M]

COMP103 16:19 QuickSort: Partition

    /** Partition into small items (low..mid-1) and large items (mid..high-1) */
    private static <E> int partition(E[] data, int low, int high, Comparator<E> comp){
        E pivot = data[(low+high-1)/2];    // simple but may be poor choice!
        // or: E pivot = median(data[low], data[high-1], data[(low+high-1)/2], comp);
        int left = low-1;
        int right = high;
        while (left <= right){
            do {
                left++;     // skip over items on the left < pivot
            } while (left<high && comp.compare(data[left], pivot) < 0);
            do {
                right--;    // skip over items on the right > pivot
            } while (right>=low && comp.compare(data[right], pivot) > 0);
            if (left < right)
                swap(data, left, right);
        }
        return left;
    }

Getting this code exactly right is very tricky! Many published versions were wrong!
[Diagram: example array S E X B Q R F A P L J M]
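Putting the wrapper, the recursion and the partition step together gives the runnable sketch below. The partition logic follows the slide with the middle item as pivot; the class name QuickSortSketch and the swap helper are assumptions, since the slides call swap without showing it.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch of QuickSort using the middle item as the pivot. */
final class QuickSortSketch {

    static <E> void quickSort(E[] data, int size, Comparator<? super E> comp) {
        quickSort(data, 0, size, comp);
    }

    /** Sorts data[low..high-1]. */
    private static <E> void quickSort(E[] data, int low, int high,
                                      Comparator<? super E> comp) {
        if (high - low < 2) return;               // zero or one item
        int mid = partition(data, low, high, comp);
        quickSort(data, low, mid, comp);
        quickSort(data, mid, high, comp);
    }

    /** Moves small items to low..mid-1 and large items to mid..high-1; returns mid. */
    private static <E> int partition(E[] data, int low, int high,
                                     Comparator<? super E> comp) {
        E pivot = data[(low + high - 1) / 2];
        int left = low - 1;
        int right = high;
        while (left <= right) {
            do { left++; }  while (left < high && comp.compare(data[left], pivot) < 0);
            do { right--; } while (right >= low && comp.compare(data[right], pivot) > 0);
            if (left < right) swap(data, left, right);
        }
        return left;
    }

    /** Hypothetical helper: exchanges data[i] and data[j]. */
    private static <E> void swap(E[] data, int i, int j) {
        E tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    public static void main(String[] args) {
        String[] letters = "S E X B Q R F A P L J M".split(" ");
        quickSort(letters, letters.length, Comparator.<String>naturalOrder());
        System.out.println(Arrays.toString(letters));
        // prints [A, B, E, F, J, L, M, P, Q, R, S, X]
    }
}
```

Both scans stop on items equal to the pivot, so arrays with many duplicates still split into two non-empty parts and the recursion terminates.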

COMP103 16:20 QuickSort Cost:
If Quicksort divides the array exactly in half (best case):
    C(n) = n + 2 C(n/2)
         = n lg(n) comparisons = O(n log(n))
If Quicksort always divides the array into 1 and n-1 (worst case):
    C(n) = n + (n-1) + (n-2) + (n-3) + …
         ≈ n^2/2 comparisons = O(n^2)
Average case?
    Very hard to analyse.
    Still O(n log(n)), and very good.
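The gap between the two cases is easy to see by evaluating both recurrences. This sketch assumes a partition of m items costs m comparisons and that C(1) = 0; the class and method names are hypothetical.

```java
/** Sketch: compare the best-case and worst-case QuickSort recurrences. */
final class QuickSortCost {

    /** Best case: C(n) = n + 2 C(n/2), C(1) = 0 (n a power of two). */
    static long bestCase(long n) {
        return (n <= 1) ? 0 : n + 2 * bestCase(n / 2);
    }

    /** Worst case: C(n) = n + C(n-1), C(1) = 0, computed as a loop. */
    static long worstCase(long n) {
        long total = 0;
        for (long size = n; size > 1; size--) {
            total += size;    // one partition over "size" items
        }
        return total;
    }

    public static void main(String[] args) {
        long n = 1024;
        System.out.println("best case:  " + bestCase(n));    // prints best case:  10240
        System.out.println("worst case: " + worstCase(n));   // prints worst case: 524799
    }
}
```

For n = 1024 the worst case is already about fifty times the best case, and the ratio keeps growing with n.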

COMP103 16:21 Other Properties?
Unstable:
    Partition “jumps” items to the other end ⇒ two equal items likely to reverse their order
Cost depends on choice of pivot:
    Choosing first item ⇒ very slow, O(n^2), on almost sorted lists
    Better choice (median of three) ⇒ O(n log(n)) on almost sorted lists
    Can spend more time choosing pivot.
In place: doesn't use any extra space for the data (only the recursion stack).
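The median-of-three pivot choice needs a small helper that returns the middle of three items. The slides refer to a median method without showing it, so the version below is a hypothetical sketch of what such a helper could look like.

```java
import java.util.Comparator;

/** Sketch of a median-of-three helper for choosing a QuickSort pivot. */
final class MedianOfThree {

    /** Returns whichever of a, b, c lies between the other two under comp. */
    static <E> E median(E a, E b, E c, Comparator<? super E> comp) {
        if (comp.compare(a, b) > 0) {   // order the first two so that a <= b
            E tmp = a;
            a = b;
            b = tmp;
        }
        if (comp.compare(c, a) < 0) return a;   // c <= a <= b
        if (comp.compare(c, b) > 0) return b;   // a <= b <= c
        return c;                               // a <= c <= b
    }

    public static void main(String[] args) {
        Comparator<Integer> order = Comparator.naturalOrder();
        System.out.println(median(3, 1, 2, order));   // prints 2
    }
}
```

On an already sorted range the first, middle and last items have the middle item as their median, so the split is even instead of 1 and n-1.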

COMP103 16:22 QuickSort
data array : [p a1 r f e q1 w q2 t z1 x c v b z2 a2]
indexes    : [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
do 0..16 : [p a1 r f e q1 w q2 t z1 x c v b z2 a2]
         : [p a1 a2 f e b c q2 t z1 x w v q1 z2 r]
  do 0..8 : [p a1 a2 f e b c q2]
          : [c a1 a2 b e f p q2]
    do 0..5 : [c a1 a2 b e]
            : [a2 a1 c b e]
      do 0..2 : [a2 a1]
              : [a1 a2]
        do 0..1 : [a1]
        do 1..2 : [a2]
      done 0..2 : [a1 a2]
      do 2..5 : [c b e]
              : [b c e]
        do 2..3 : [b]
        do 3..5 : [c e]
                : [c e]
          do 3..4 : [c]
          do 4..5 : [e]
        done 3..5 : [c e]
      done 2..5 : [b c e]
    done 0..5 : [a1 a2 b c e]
    do 5..8 : [f p q2]
            : [f p q2]
      do 5..7 : [f p]
              : [f p]
        do 5..6 : [f]
        do 6..7 : [p]
      done 5..7 : [f p]
      do 7..8 : [q2]
    done 5..8 : [f p q2]
  done 0..8 : [a1 a2 b c e f p q2]
  do 8..16 : [t z1 x w v q1 z2 r]
           : [t r q1 v w x z2 z1]
    do 8..12 : [t r q1 v]
             : [q1 r t v]
      do 8..10 : [q1 r]
               : [q1 r]
        do 8..9 : [q1]
        do 9..10 : [r]
      done 8..10 : [q1 r]
      do 10..12 : [t v]
                : [t v]
        do 10..11 : [t]
        do 11..12 : [v]
      done 10..12 : [t v]
    done 8..12 : [q1 r t v]
    do 12..16 : [w x z2 z1]
              : [w x z2 z1]
      do 12..14 : [w x]
                : [w x]
        do 12..13 : [w]
        do 13..14 : [x]
      done 12..14 : [w x]
      do 14..16 : [z2 z1]
                : [z1 z2]
        do 14..15 : [z1]
        do 15..16 : [z2]
      done 14..16 : [z1 z2]
    done 12..16 : [w x z1 z2]
  done 8..16 : [q1 r t v w x z1 z2]
done 0..16 : [a1 a2 b c e f p q2 q1 r t v w x z1 z2]
sorted     : [a1 a2 b c e f p q2 q1 r t v w x z1 z2]


COMP103 16:26 Where have we been?
Implementing Collections:
    ArrayList:       O(n) to add/remove, except at end
    Queue, Stack:    O(1)
    ArraySet:        O(n) (cost of searching)
    SortedArraySet:  O(log(n)) to search (with binary search)
                     O(n) to add/remove (cost of inserting)
                     O(n^2) to add n items
                     O(n log(n)) to initialise with n items (with fast sorting)
Where next?
    We’re tired of arrays; let’s look at dynamic data structures.
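The O(log(n)) search cost for a sorted array comes from binary search, which halves the remaining range on every comparison. The sketch below is a generic illustration only: the class name SortedArraySearch and the method signature are assumptions, not the course's SortedArraySet code.

```java
import java.util.Comparator;

/** Sketch of binary search over a sorted array. */
final class SortedArraySearch {

    /** Returns the index of item in data[0..size-1], or -1 if it is absent. */
    static <E> int binarySearch(E[] data, int size, E item, Comparator<? super E> comp) {
        int low = 0;
        int high = size;                 // the search range is low..high-1
        while (low < high) {
            int mid = low + (high - low) / 2;
            int c = comp.compare(item, data[mid]);
            if (c == 0) return mid;
            if (c < 0) high = mid;       // item can only be in the lower part
            else low = mid + 1;          // item can only be in the upper part
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] sorted = {"a", "c", "e", "g"};
        Comparator<String> order = Comparator.naturalOrder();
        System.out.println(binarySearch(sorted, sorted.length, "e", order));   // prints 2
        System.out.println(binarySearch(sorted, sorted.length, "b", order));   // prints -1
    }
}
```

Sorting n items once with a fast sort and then searching this way is what makes initialising a sorted set O(n log(n)) rather than O(n^2).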