Chapter 7: Sorting (Insertion Sort, Shellsort)
CE 221 Data Structures and Algorithms
Izmir University of Economics
Text: Read Weiss, § 7.1 – 7.4
Preliminaries
Main-memory sorting algorithms. All algorithms are interchangeable: each is passed an array containing N elements. "<" and ">" (comparison) and "=" (assignment) are the only operations allowed on the input data: comparison-based sorting.
Selection Sort
In the ith iteration, find the smallest element among a[i] and the elements to its right, record its index as min, and swap a[i] with a[min].
Example : Selection Sort
Implementation: Selection Sort
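The code for this slide did not survive extraction. A minimal sketch of the selection sort described above might look as follows; the function name selection_sort and the use of a plain int array are assumptions made for illustration, not the slide's original code.

```c
#include <assert.h>

/* Selection sort sketch (illustrative, not the original slide code).
   For each i, find the index min of the smallest element in a[i..n-1]
   and swap a[i] with a[min]. */
void selection_sort(int a[], int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)   /* scan everything right of i */
            if (a[j] < a[min])
                min = j;
        int tmp = a[i];                   /* one swap per outer iteration */
        a[i] = a[min];
        a[min] = tmp;
    }
}
```

This version performs one swap per outer iteration, N in total (the last one swaps an element with itself), and the inner loop makes (N-1) + (N-2) + ... + 1 comparisons.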
Time Analysis of Selection Sort
Comparisons: (N-1) + (N-2) + ... + 1 = N(N-1)/2
Exchanges (swaps): N
Example: Insertion Sort
In the ith iteration, swap a[i] with each larger element to its left. Everything to the left of index i is sorted.
Example: Insertion Sort
Implementation: Insertion Sort
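Time Analysis: Insertion Sort
Number of comparisons: at most N(N-1)/2 (worst case, input in reverse order)
Number of exchanges: at most N(N-1)/2 (worst case, input in reverse order)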
Shell Sort
Move entries more than one position at a time by h-sorting the array.
1. Start at L and look at every 4th element and sort it
2. Start at E and look at every 4th element and sort it
3. Start at E and look at every 4th element and sort it
4. Start at A and look at every 4th element and sort it
Decreasing sequence of values of h
Example: S O R T E X A M P L E Result: A E E L M O P R S T X
Implementation
Time Analysis
Shell Sort vs Insertion Sort
Insertion sort finds the place of each item by comparing it with its neighbours one position at a time, while Shell sort compares items that lie far apart. This lets small elements move to the front of the list faster.
Insertion Sort
One of the simplest sorting algorithms. Consists of N-1 passes. For pass p = 1 to N-1 (positions 0 through p-1 are already known to be sorted), the elements in positions 0 through p (p+1 elements) are sorted by moving the element in position p left until an element that is not larger is encountered or the front of the array is reached.
Insertion Sort - Algorithm
    typedef int ElementType;

    void InsertionSort( ElementType A[ ], int N )
    {
        int j, P;
        ElementType Tmp;

        for( P = 1; P < N; P++ )
        {
            Tmp = A[ P ];   /* element to insert into the sorted prefix A[0..P-1] */
            for( j = P; j > 0 && A[ j - 1 ] > Tmp; j-- )
                A[ j ] = A[ j - 1 ];   /* shift larger elements one place right */
            A[ j ] = Tmp;
        }
    }
Each of the two nested loops can take up to N iterations, so the running time is O(N^2) in the worst case. This bound is tight (input in reverse order). The number of element comparisons in the inner loop is at most p for pass p; summing over p = 1, ..., N-1 gives 1 + 2 + ... + (N-1) = N(N-1)/2 = Θ(N^2). If the input is already sorted, the running time is O(N).
A Lower Bound for Simple Sorting Algorithms
An inversion in an array of numbers is any ordered pair (i, j) with i < j such that a[i] > a[j]. In the example set we have 9 inversions: (34,8), (34,32), (34,21), (64,51), (64,32), (64,21), (51,32), (51,21), and (32,21). Swapping two adjacent elements that are out of place removes exactly one inversion, and a sorted array has no inversions. The running time of insertion sort is therefore O(I + N), where I is the number of inversions in the input.
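The link between inversions and adjacent swaps can be checked with a short sketch. It assumes the example set is the list 34, 8, 64, 51, 32, 21 (the slide does not show the list itself, but this ordering is consistent with the nine pairs above); the function names count_inversions and insertion_sort_swaps are illustrative.

```c
#include <assert.h>

/* Count pairs (i, j) with i < j and a[i] > a[j] by brute force, O(n^2). */
long count_inversions(const int a[], int n)
{
    assert(n >= 0);
    long inv = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (a[i] > a[j])
                inv++;
    return inv;
}

/* Insertion sort written with adjacent swaps; returns the number of swaps.
   Each swap removes exactly one inversion. */
long insertion_sort_swaps(int a[], int n)
{
    assert(n >= 0);
    long swaps = 0;
    for (int p = 1; p < n; p++)
        for (int j = p; j > 0 && a[j - 1] > a[j]; j--) {
            int tmp = a[j];
            a[j] = a[j - 1];
            a[j - 1] = tmp;
            swaps++;
        }
    return swaps;
}
```

On the assumed list, count_inversions reports 9 and insertion_sort_swaps performs 9 swaps, one per inversion, which is where the I in the O(I + N) bound comes from.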
Average Running Time for Simple Sorting - I
Theorem: The average number of inversions in an array of N distinct elements, taken over all N! permutations, is N(N-1)/4.
Proof: The average is the total number of inversions in the N! different permutations divided by N!. Each permutation L has a corresponding permutation L_R, which is L in reverse order. If L has x inversions, then L_R has N(N-1)/2 - x inversions, so each of the N!/2 pairs (L, L_R) contributes N(N-1)/2 inversions. As a result, ((N(N-1)/2) * (N!/2)) / N! = N(N-1)/4 is the number of inversions in an average list.
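For small N the theorem can be verified by brute force. The sketch below enumerates every permutation of 0..n-1 and sums the inversions; the names total_inversions, permute, and count_inv, and the cap of n <= 10, are assumptions chosen for illustration.

```c
#include <assert.h>

/* Brute-force inversion count of a[0..n-1]. */
static long count_inv(const int a[], int n)
{
    long inv = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (a[i] > a[j])
                inv++;
    return inv;
}

/* Visit every arrangement of a[k..n-1], accumulating inversion totals. */
static void permute(int a[], int k, int n, long *total, long *perms)
{
    if (k == n) {
        *total += count_inv(a, n);
        (*perms)++;
        return;
    }
    for (int i = k; i < n; i++) {
        int t = a[k]; a[k] = a[i]; a[i] = t;   /* choose a[i] for slot k */
        permute(a, k + 1, n, total, perms);
        t = a[k]; a[k] = a[i]; a[i] = t;       /* undo the choice */
    }
}

/* Returns the total number of inversions over all n! permutations of
   0..n-1 and stores n! in *perms. Limited to n <= 10 to keep it fast. */
long total_inversions(int n, long *perms)
{
    assert(n >= 0 && n <= 10);
    int a[10];
    for (int i = 0; i < n; i++)
        a[i] = i;
    long total = 0;
    *perms = 0;
    permute(a, 0, n, &total, perms);
    return total;
}
```

For N = 4 the total is 72 over 24 permutations, an average of 3 = 4*3/4; for N = 5 it is 600 over 120 permutations, an average of 5 = 5*4/4.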
Average Running Time for Simple Sorting - II
Theorem: Any algorithm that sorts by exchanging adjacent elements requires Ω(N^2) time on average.
Proof: Initially there are N(N-1)/4 inversions on average, and each swap removes only one inversion, so Ω(N^2) swaps are required. This is valid for all types of sorting algorithms (including those undiscovered) that perform only adjacent exchanges.
Result: For a sorting algorithm to run in subquadratic, i.e. o(N^2), time, it must exchange elements that are far apart (eliminating more than just one inversion per exchange).
Shellsort - I
Shellsort, invented by Donald Shell in 1959, works by comparing elements that are distant; the distance decreases as the algorithm runs until the last phase (diminishing increment sort). The sequence h_1, h_2, ..., h_t is called the increment sequence; h_1 = 1 always. After a phase using increment h_k, a[i] ≤ a[i + h_k] for every i. The file is then said to be h_k-sorted.
Shellsort - II
An h_k-sorted file that is then h_{k-1}-sorted remains h_k-sorted. To h_k-sort, for each i in h_k, h_k + 1, ..., N-1, place the element in the correct spot among i, i - h_k, i - 2h_k, ... This is equivalent to performing an insertion sort on h_k independent subarrays.
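A single h-sorting phase can be sketched on its own. The helper names h_sort and is_h_sorted are hypothetical; the loop body is an insertion sort that steps by h instead of 1.

```c
#include <assert.h>

/* One Shellsort phase: insertion sort with stride h, so that afterwards
   a[i] <= a[i + h] for every valid i. */
void h_sort(int a[], int n, int h)
{
    assert(n >= 0 && h >= 1);
    for (int i = h; i < n; i++) {
        int tmp = a[i];
        int j;
        /* place a[i] correctly among positions i, i-h, i-2h, ... */
        for (j = i; j >= h && a[j - h] > tmp; j -= h)
            a[j] = a[j - h];
        a[j] = tmp;
    }
}

/* Returns 1 if a[i] <= a[i + h] for every i, 0 otherwise. */
int is_h_sorted(const int a[], int n, int h)
{
    for (int i = 0; i + h < n; i++)
        if (a[i] > a[i + h])
            return 0;
    return 1;
}
```

Calling h_sort with h = 4 and then h = 3 leaves the array both 4-sorted and 3-sorted, and a final call with h = 1 is an ordinary insertion sort that finishes the job.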
Shellsort - III
    void Shellsort( ElementType A[ ], int N )
    {
        int i, j, Increment;
        ElementType Tmp;

        /* Shell's increments: N/2, N/4, ..., 1 */
        for( Increment = N / 2; Increment > 0; Increment /= 2 )
            for( i = Increment; i < N; i++ )
            {
                Tmp = A[ i ];
                /* insertion sort step with stride Increment */
                for( j = i; j >= Increment; j -= Increment )
                    if( Tmp < A[ j - Increment ] )
                        A[ j ] = A[ j - Increment ];
                    else
                        break;
                A[ j ] = Tmp;
            }
    }
Increment sequence by Shell: h_t = floor(N/2), h_k = floor(h_{k+1}/2) (poor)
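Worst-Case Analysis of Shellsort - I
Theorem: The worst-case running time of Shellsort, using Shell's increments, is Θ(N^2).
Proof, part I: prove Ω(N^2). Take N a power of 2 and place the N/2 largest elements in the even positions and the N/2 smallest elements in the odd positions (positions counted from 1). The previous passes all have even increments, so they never move an element between odd and even positions. During the last pass, the ith smallest element (i ≤ N/2) therefore goes from position 2i-1 to position i, a move of i-1 places, and Σ_{i=1}^{N/2} (i-1) = Ω(N^2).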
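The lower-bound argument can be observed directly on a small bad input. This sketch assumes N is a power of 2 and builds the input 1, 9, 2, 10, ... for N = 16 (small elements in the odd 1-based positions, large elements in the even ones); fill_bad_input and shellsort_count_moves are illustrative names, and the move counter is an addition for the experiment.

```c
#include <assert.h>

/* Fill a[0..n-1] (n even) so the n/2 smallest values sit in the odd
   1-based positions and the n/2 largest in the even 1-based positions. */
void fill_bad_input(int a[], int n)
{
    assert(n >= 0 && n % 2 == 0);
    for (int i = 0; i < n / 2; i++) {
        a[2 * i] = i + 1;               /* 1-based position 2i+1: small */
        a[2 * i + 1] = n / 2 + i + 1;   /* 1-based position 2i+2: large */
    }
}

/* Shellsort with Shell's increments; returns how many times an element
   was shifted by the inner loop. */
long shellsort_count_moves(int a[], int n)
{
    assert(n >= 0);
    long moves = 0;
    for (int inc = n / 2; inc > 0; inc /= 2)
        for (int i = inc; i < n; i++) {
            int tmp = a[i];
            int j;
            for (j = i; j >= inc && tmp < a[j - inc]; j -= inc) {
                a[j] = a[j - inc];
                moves++;
            }
            a[j] = tmp;
        }
    return moves;
}
```

For N = 16 the passes with increments 8, 4, and 2 shift nothing, and the final pass performs 0 + 1 + ... + 7 = 28 shifts, matching the sum of i-1 over the N/2 smallest elements.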
Worst-Case Analysis of Shellsort - II
Proof, part II: prove O(N^2). A pass with increment h_k consists of h_k insertion sorts of about N/h_k elements each. One pass is hence O(h_k (N/h_k)^2) = O(N^2/h_k). Summing over all passes gives O(Σ_{k=1}^{t} N^2/h_k) = O(N^2 Σ_{k=1}^{t} 1/h_k); because the increments form a geometric series with common ratio 2, Σ_{k=1}^{t} 1/h_k < 2, so the total is O(N^2).
Shell's increments: pairs of increments are not relatively prime.
Hibbard's increments: 1, 3, 7, ..., 2^k - 1
Worst-Case Analysis of Shellsort - III
Theorem: The worst-case running time of Shellsort, using Hibbard's increments, is Θ(N^(3/2)).
Proof: (results from additive number theory)
- For h_k > N^(1/2), use the bound O(N^2/h_k). // h_k = 1, 3, 7, ..., 2^t - 1
- When the file is h_k-sorted, it has already been h_{k+2}-sorted and h_{k+1}-sorted.
- a[p-i] ≤ a[p] if i can be expressed as a linear combination, with nonnegative integer coefficients, of h_{k+1} and h_{k+2}.
- But h_{k+2} = 2h_{k+1} + 1, hence gcd(h_{k+2}, h_{k+1}) = 1.
- Thus all integers at least as large as (h_{k+1} - 1)(h_{k+2} - 1) = 8h_k^2 + 4h_k can be expressed as such.
- Therefore the innermost for loop executes at most 8h_k + 4 = O(h_k) times for each of the N - h_k positions. This gives a bound of O(N h_k) per pass.
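A Shellsort that uses Hibbard's increments instead of Shell's can be sketched as below. The function name shellsort_hibbard and the choice to start from the largest increment 2^k - 1 below N are assumptions made for illustration; the slides give only the increment sequence, not code for it.

```c
#include <assert.h>

/* Shellsort sketch using Hibbard's increments ..., 7, 3, 1. */
void shellsort_hibbard(int a[], int n)
{
    assert(n >= 0);

    /* Largest increment of the form 2^k - 1 that is below n (at least 1). */
    int h = 1;
    while (2 * h + 1 < n)
        h = 2 * h + 1;

    /* (h - 1) / 2 maps 2^k - 1 to 2^(k-1) - 1, and 1 to 0, ending the loop. */
    for (; h > 0; h = (h - 1) / 2)
        for (int i = h; i < n; i++) {
            int tmp = a[i];
            int j;
            for (j = i; j >= h && tmp < a[j - h]; j -= h)
                a[j] = a[j - h];
            a[j] = tmp;
        }
}
```

For N = 13 the passes use increments 7, 3, and 1. Consecutive increments are relatively prime, which is the property the proof above relies on.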
Homework Assignments
7.1, 7.2, 7.3, 7.4
You are requested to study and solve the exercises. Note that these are for you to practice only. You are not to deliver the results to me.