Download presentation
Presentation is loading. Please wait.
Published byVirgil Hicks Modified over 9 years ago
1
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.1 Content: Lecture 8: Sorting part I: –Intro: aspects of sorting, different strategies –Insertion Sort, Selection Sort, Quick Sort Lecture 9: Sorting part II: –Heap Sort, Merge Sort (Vilhelm Dahllöf) –A movie ”Sort out Sorting”: survey of 9 comparison-based sorting algorithms (Bengt Werstén) Lecture 10: Sorting part III and Selection –Theoretical lower bound for comparison-based sorting, –BucketSort, RadixSort –Selection, median finding, quick select
2
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.2 The Sorting Problem Input: A list L of data items with keys (the part of each data item we base our sorting on) Output: A list L’ of the same data items placed in order, i.e.: Caution! Don’t over use sorting! Do you really need to have it sorted, or will a dictionary do fine instead of a sorted array?
3
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.3 Aspects of Sorting: Internal vs. External sorting: Can data be kept in fast, random accessed internal memory – or... Sorting in-place vs. sorting with auxiliary data structures Does the sorting algorithm need extra data structures? Often the stack is used as a ”hidden” extra structure! Worst-case vs. expected-case performance How does the algorithm behave in different situations?
4
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.4 Aspects of Sorting (cont.) What is the ”expected case”? In some applications we may never have a really bad worst case – pick algorithm accordingly! Sorting by comparison vs. Sorting digitally compare keys, or use e.g. Binary representation of data as sorting criteria? Stable vs. unstable sorting What happens with multiple occurrences of the same key? ”Quick’n’Dirty” vs. Efficient but hard-to-remember...
5
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.5 Different strategies used when sorting... Insertion sorts: For each new element to add to the sorted set, look for the right place in that set to put the element... Linear insertion, Binary insertion, Shell sort,... Selection sorts: In each iteration, search the unsorted set for the smallest (largest) remaining item to add to the end of the sorted set Straight selection, Tree selection 1, Heap sort,... Exchange sorts: Browse back and forth in some pattern, and whenever we are looking at a pair with wrong relative order, swap them... Bubble sort, Shaker sort, Quick sort, Merge sort 1... 1) Requires extra data structures apart from the stack
6
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.6 (Linear) insertion sort ” In each iteration, insert the first item from unsorted part Its proper place in the sorted part” An in-place sorting algorithm! Data stored in A[0.. n -1] Iterate i from 1 to n -1: The table consist of: –Sorted data in A[0.. i -1] –Unsorted data in A[ i.. n-1 ] Scan sorted part for index s for insertion of the selected item Increase i iisi
7
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.7 Analysis of Insertion Sort t 1 : n-1 passes over this ”constant speed” code t 2 : n-1 passes... t 3 : I = worst case no. of iterations in inner loop: I = 1+2+…+n-1 = (n-2)(n-1)/2 = n 2 -3n+2 t 4 : I passes t 5 : n-1 passes T: t 1 +t 2 +t 3 +t 4 +t 5 = 3*(n-1)+2*(n 2 -3n+2) = 3n-3+2n 2 -6n+4 = 2n 2 - 3n+1...thus we have an algorithm in O ( n 2 )in worst case, but …. good if file almost sorted Procedure InsertionSort (table A[0..n-1]): 1for i from 1 to n-1 do 2 s i; x A[i] 3 while j 1 and A[j-1]>x do 4 A[j] A[j-1] ; j j-1 5 A[j] x
8
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.8 (Straight) selection sort ” In each iteration, search the unsorted set for the smallest remaining item to add to the end of the sorted set” An in-place sorting algorithm! Data stored in A[0.. n -1] Iterate i from 1 to n -1: The table consist of: –Sorted data in A[0.. i -1] –Unsorted data in A[ i.. n-1 ] Scan unsorted part for index s for smallest remaining item Swap places for A[ i ] and A[ s ] i is i iisisisisisisisis
9
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.9 Analysis of Straight selection t 1 : n-1 passes over this ”constant speed” code t 2 : n-1 passes... t 3 : I = no. of iterations in inner loop: I = n-2 + n-3 + n-4 +...+1 = (n-2)(n-1)/2 = n 2 -3n+2 t 4 : I passes t 5 : n-1 passes T: t 1 +t 2 +t 3 +t 4 +t 5 = 3*(n-1)+2*(n 2 -3n+2) = 3n-3+2n 2 -6n+4 = 2n 2 - 3n+1...thus we have an algorithm in O ( n 2 )...rather bad! Procedure StraightSelection (table A[0..n-1]): 1for i from 0 to n-2 do 2 s i 3 for j from i+1 to n-1 do 4 if A[j] < A[s] then s j 5 A[i] A[s] Is this analysis good enough? Can we compare algs. of similar order? How expensive is... Index comparison Data comparison Data copying...are they comparable or quite different? Worst case? Best case? Expected case?
10
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.10 Analysis of Straight selection – details? We may redo the analysis and differentiate between Cheap operations as assignment and comparison of index or pointers Different levels of ”expensive” operations such as –Procedure calls –Comparison of data –Copying of data...and we may then find two O ( x ) algorithms to be quite different... Procedure StraightSelection (table A[0..n-1]): 1for i from 0 to n-2 do 2 s i 3 for j from i+1 to n-1 do 4 if A[j] < A[s] then s j 5 A[i] A[s] Is this an expensive op? It’s allways called if data is in reverse order! ”worst case”!
11
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.11 Quick Sort - overview 1. divide-and-conquer principle 2. example, basic ideas 3. quick sort algorithm, top–down 4. examples: worst and best case 5. randomization principle 6. randomized quick sort 7. fine tuning – make it faster!
12
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.12 Divide–and–conquer principle 1.divide a problem into smaller, independent sub- problems 2.conquer: solve the sub-problems recursively (or directly if trivial) 3.combine the solutions of the sub-problems
13
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.13 Quick Sort Example – basic idea Procedure QuickSort (table A[l : r] ): 1.If l r return 2.select some element of A, e.g. A[l], as the so–called pivot element: p A[l] ; 3.partition A in–place into two disjoint sub-arrays A L, A R : m partition( A[l : r], p ) ; { determines m, l<m<r, and reorders A[l : r], such that all elements in A[l : m] are now p and all in A[m+1 : r] are now p.} 4.apply the algorithm recursively to A L and A R : quicksort ( A[l : m] ); {sorts A L } quicksort ( A[m +1 : r] ); {sorts A R }
14
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.14 Quick sort: Partitioning an array in–place int partition ( array A[l : r], key p ) { the pivot element p is A[l] } i l-1 ; j r+1 ; while ( true ) do do i i+1 while A[i] p if ( i < j ) A[i] A[j] else return j; This code will scan through the entire set once, and will as a max move each element once!...thus: Running time of partition : r – l + 1)
15
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.15 Warning – details matter! Book: right most element as pivot, swaps it in at end, recurses at either side excluding the old pivot Film: left most element as pivot, swaps it in at end recurses at either side excluding the old pivot Slides: left most as pivot, includes it in area to partition, returns one position containing an element of size equal to the pivot – recurse on both halves including the pivot...and the way i ’s and j ’s are compared ( < or ), if they are incremented (decremented) before or after comparison, etc...
16
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.16 Quick Sort - Analysis Run time as a recursive expression. We implicitly build a search tree! What is the worst case? Expression reformulated to select a ”worst” partition!
17
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.17 Quick Sort – Analysis – worst case... If the pivot element happens to be the min or max element of A in each call to quicksort... (e.g., pre-sorted data) –Unbalanced recursion tree –Recursion depth becomes n Sorting nearly–sorted arrays occurs frequently in practice
18
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.18 Quick Sort – Analysis – best case... Best – balanced search tree! How...? If pivot is median of data set!
19
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.19 Quick sort: apply randomization Randomization algorithmic design principle applicable where choosing among several alternative directions to avoid long sequences of bad decisions with high probability, independently of the input simplifies the average case analysis Select pivot randomly (not first, not last...) p A[random(l,r)] ; running time not only dependant on good input data can not construct bad input data...
20
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.20 Quick sort – fine tuning.... Median-of-three and sentinels... Inner loop of partition fn should be: i i+1 ; while i r and A[i] < p do i i+1; j j-1 ; while j l and A[j] > p do j j-1; Improve: –Sort the first, middle and last elements of A –Use the content of middle element as pivot value –Data set now has sentinels at the end (values selected to stop the iteration) and we may remove extra test i r and j l. –Probability of the middle value to be a bad pivot is low!
21
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.21 Quick sort – fine tuning.... Reduce need for extra space – upper bound on stack! Observations: –Large partition lead to a large stack depth –Does not matter in which order we perform recursive calls (left part of A before right part or vice versa) Enhancements: –Replace last (tail-) recursive call with iteration (reusing the same procedure call), leave first recursive call as is –Select the larger part of A for the repeated iteration, and use recursion for the smaller part of A...and we have the worst maximum stack depth when we have a balanced search tree, i.e. O(log n).
22
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.22 Quick sort – fine tuning.... When only a few elements remain (e.g., |A| < 4)... Over head for recursion becomes significant Entire A is almost sorted (except for small, locally unsorted sections) Stop sorting by QuickSort, perform one global sort using Linear InsertionSort– although O(n 2 ) worst case, much better on allmost sorted data., which is the case now!
23
TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I) Jan Maluszynski - HT 20058.23 Straight Insertion – the good case? If table is almost sorted? E.g., max 3 items unsorted, then remainder are bigger? t 1 : n-1 passes over this ”constant speed” code t 2 : n-1 passes... T 3+4+5 : I = no. of iterations in inner loop (max 3 elements ”totaly unsorted”): I = (n-1)*3 worst case, all three allways in reverse order t 6 : n-1 passes T: t 1 +t 2 +t 3+4+5 +t 6 = 3*(n-1)+3*(n-1)= 3n-3...thus we have an algorithm in O ( n )...rather good! Procedure InsertionSort(table A[0..n-1]): 1for i from 1 to n-1 do 2 j i; tmp A[i] 3 while j>0 and tmp < A[j-1] do 4 j j-1 5 A[j+1] A[j] 6 A[j] tmp
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.