Sorting. We live in a world obsessed with keeping information, and to find it, we must keep it in some sensible order. You learned in the last chapter.


Sorting

We live in a world obsessed with keeping information, and to find it, we must keep it in some sensible order. You learned in the last chapter that in the worst case the time to search an unordered list is proportional to the size of the list, $O(n)$. To reduce the searching time we keep the list ordered, which permits binary search in $O(\log_2 n)$ time.

Sorting Sorting is the process of creating some sensible order. Sorting is closely related to searching in that we must sift through an unordered list a number of times looking for a particular element or a particular place to put an element.

Ordered List An ordered list is a list in which each entry contains a key, such that the keys are in order. That is, if entry i comes before entry j in the list, then the key of entry i is less than or equal to the key of entry j.
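The definition above can be turned directly into a check. The following is only a minimal sketch: the function name is_ordered and the use of std::vector in place of the text's List class are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns true when the list is ordered in the sense defined above:
// whenever entry i comes before entry j, key i <= key j.
// Checking adjacent pairs is enough, because <= is transitive.
template <class Key>
bool is_ordered(const std::vector<Key> &list)
{
    for (std::size_t i = 1; i < list.size(); ++i)
        if (list[i] < list[i - 1])   // a later key is smaller: not ordered
            return false;
    return true;                     // also true for empty and one-entry lists
}
```

Note that equal keys are allowed, and that lists with 0 or 1 entries count as ordered.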

Sorting Several years ago, it was estimated that more than half the time on many commercial computers was spent in sorting. Because sorting is so important, a great many algorithms have been devised for doing it. Knuth treats about twenty-five sorting methods in Volume 3 of The Art of Computer Programming and claims that they are “only a fraction of the algorithms that have been devised so far.”

Sorting Your text describes only a few of them:
–Insertion Sort
–Selection Sort
–Shell Sort
–Divide-and-Conquer Sorting
–Mergesort for Linked Lists
–Quicksort for Contiguous Lists
–Heaps and Heapsort

Evaluate sorting methods We will evaluate sorting methods using “Big Oh” notation. In searching, the total amount of work done was clearly related to the number of comparisons of keys. The same observation is true for sorting algorithms, but sorting algorithms must also move their entries around the list or change pointers.

Required tasks when Sorting
–Compare the target item to other items
–Rearrange unordered items
The work done depends on:
–number of comparisons
–number of moves

Analysis As before, both the worst-case performance and the average performance of a sorting algorithm are of interest. To find the average, we shall consider what would happen if the algorithm were run on all possible orderings of the list (with n entries, there are n! such orderings altogether) and take the average of the results.

Sortable Lists We shall be particularly concerned with the performance of our sorting algorithms. In order to optimize performance of a program for sorting a list, we shall need to take advantage of any special features of the list’s implementation. For example, we shall see that some sorting algorithms work very efficiently on contiguous lists, but different implementations and different algorithms are needed to sort linked lists efficiently. Hence, to write efficient sorting programs, we shall need access to the private data members of the lists being sorted. Therefore, we shall add sorting functions as methods of our basic List data structures. The augmented list structure forms a new ADT that we shall call a Sortable_List.

Class definition of Sortable Lists The class definition for a Sortable_list takes the following form.
template <class Record>
class Sortable_list: public List<Record> {
public:
   // Add prototypes for sorting methods here.
private:
   // Add prototypes for auxiliary functions here.
};
This definition shows that a Sortable_list is a List with extra sorting methods. The base list class can be any of the List implementations of Chapter 6.

Record and Key We use a template parameter class called Record to stand for entries of the Sortable_list. As in Chapter 7, we assume that the class Record has the following properties: Every Record has an associated key of type Key. A Record can be implicitly converted to the corresponding Key. Moreover, the keys (hence also the records) can be compared under the operations ‘<,’ ‘>,’ ‘>=,’ ‘<=,’ ‘==,’ and ‘!=.’
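A minimal sketch of a class with these properties is given below. Everything in it is an assumption made for illustration (integer keys, a string payload, the member names); it is not the Record class of the text. It shows only how one conversion operator makes all six comparisons available.

```cpp
#include <string>

typedef int Key;   // assumption: keys are plain integers

// Hypothetical record: a key plus some other data.
class Record {
public:
    explicit Record(Key k = 0, const std::string &d = "") : key_(k), data_(d) {}

    // Implicit conversion Record -> Key. Because of it, two Records can be
    // compared with <, >, <=, >=, == and != through their keys.
    operator Key() const { return key_; }

    const std::string &data() const { return data_; }

private:
    Key key_;
    std::string data_;
};
```

With this sketch, an expression such as a < b on two Record objects converts both operands to Key and compares the keys.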

Instance of Sortable List A program for testing our Sortable_list might simply declare: Sortable_list<int> test_list; Here, the client uses the type int to represent both records and their keys.

INSERTION SORT The name of this algorithm comes from the fact that as we build an ordered list from an unordered one, we do so by choosing an element from the unordered list and “inserting” it into its correct place in the ordered list.

Sortable Lists

Algorithm
1. Take the first item in the unsorted list.
2. Insert it into the correct position in the sorted list.
3. Repeat until the unsorted list is empty.

Implementation If we wish to design an implementation of an algorithm to do this, we must be more specific, i.e.
-what data structure will be used?
-where does the sorted list begin and end?
-how do we “do” steps 1 & 2 above?

Ordered insertion An ordered list is an abstract data type, defined as a list in which each entry has a key, and such that the keys are in order; that is, if entry i comes before entry j in the list, then the key of entry i is less than or equal to the key of entry j. For ordered lists, we shall often use two new operations that have no counterparts for other lists, since they use keys rather than positions to locate the entry. One operation retrieves an entry with a specified key from the ordered list. Retrieval by key from an ordered list is exactly the same as searching. The second operation, ordered insertion, inserts a new entry into an ordered list by using the key in the new entry to determine where in the list to insert it. Note that ordered insertion is not uniquely specified if the list already contains an entry with the same key as the new entry, since the new entry could go into more than one position.

Ordered insertion

ordered insertion We begin with the ordered list shown in part (a) of the figure and wish to insert the new entry hen. In contrast to the implementation-independent version of insert from Section 7.3, we shall start comparing keys at the end of the list, rather than at its beginning. Hence we first compare the new key hen with the last key ram shown in the coloured box in part (a). Since hen comes before ram, we move ram one position down, leaving the empty position shown in part (b). We next compare hen with the key pig shown in the coloured box in part (b). Again, hen belongs earlier, so we move pig down and compare hen with the key dog shown in the coloured box in part (c). Since hen comes after dog, we have found the proper location and can complete the insertion as shown in part (d).
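The walk-through above can be sketched in code as follows. This is an illustration under assumptions: the function name ordered_insert, the use of std::vector<std::string>, and the sample keys other than dog, pig, ram and hen are not taken from the text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Inserts new_entry into an already ordered list, comparing from the END
// of the list and moving each larger key one position down.
// Returns the position at which new_entry was stored.
std::size_t ordered_insert(std::vector<std::string> &list,
                           const std::string &new_entry)
{
    list.push_back(new_entry);               // make room for one more entry
    std::size_t position = list.size() - 1;  // the currently empty position
    while (position > 0 && list[position - 1] > new_entry) {
        list[position] = list[position - 1]; // move the larger key down
        --position;                          // the gap moves toward the front
    }
    list[position] = new_entry;              // proper location found
    return position;
}
```

For example, inserting hen into the ordered list cat, cow, dog, pig, ram moves ram and then pig down one position, stops at dog, and stores hen at position 3.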

Sorting by Insertion To sort an unordered list, we think of –removing its entries one at a time and then –inserting each of them into an initially empty new list, always keeping the entries in the new list in the proper order according to their keys. This method is illustrated in Figure 8.2, which shows the steps needed to sort a list of six words. At each stage, the words that have not yet been inserted into the sorted list are shown in coloured boxes, and the sorted part of the list is shown in white boxes.

Sorting by Insertion

In the initial diagram, the first word hen is shown as sorted, since a list of length 1 is automatically ordered.

The main step of contiguous insertion sort

Sorting by Insertion The main step required to insert an entry denoted current into the sorted part of the list is shown in Figure 8.3. In the method that follows, we assume that the class Sortable_list is based on the contiguous List implementation of Chapter 6. Both the sorted list and the unsorted list occupy the same List member array, which we recall is called entry. The variable first_unsorted marks the division between the sorted and unsorted parts of this array.

insertion_sort( )
template <class Record>
void Sortable_list<Record>::insertion_sort()
/* Post: The entries of the Sortable_list have been rearranged so that the keys in all the entries are sorted into increasing order.
   Uses: Methods for the class Record; the contiguous List implementation of Chapter 6 */
{
   int first_unsorted; // position of first unsorted entry
   int position;       // searches sorted part of list
   Record current;     // holds the entry temporarily removed from list
   for (first_unsorted = 1; first_unsorted < count; first_unsorted++)
      if (entry[first_unsorted] < entry[first_unsorted - 1]) {
         position = first_unsorted;
         current = entry[first_unsorted]; // Pull unsorted entry out of the list.
         do { // Shift all entries until the proper position is found.
            entry[position] = entry[position - 1];
            position--; // position is empty.
         } while (position > 0 && entry[position - 1] > current);
         entry[position] = current;
      }
}

insertion_sort( ) Since a list with only one entry is automatically sorted, the loop on first_unsorted starts with the second entry (first_unsorted = 1).
–If it is in the correct position, nothing needs to be done.
–Otherwise, the new entry is pulled out of the list into the variable current, and
–the do ... while loop pushes entries one position down the list until the correct position is found, and finally current is inserted there.
–The case when current belongs in the first position of the list must be detected specially, since in this case there is no entry with a smaller key that would terminate the search. We treat this special case as the first clause in the condition of the do ... while loop, position > 0.

Analysis of Insertion Sort Analyze the performance of the contiguous version of the program.

Analysis of Insertion Sort Assumptions: We restrict our attention to the case when the list is initially in random order (meaning that all possible orderings of the keys are equally likely). When we deal with entry i, how far back must we go to insert it? There are i possible ways to move it:
–not moving it at all,
–moving it one position,
–and so on, up to moving it i - 1 positions to the front of the list.
Given randomness, these are equally likely. The probability that it need not be moved is thus 1/i, in which case only one comparison of keys is done, with no moving of entries.

Analysis of Insertion Sort: inserting one entry The remaining case, in which entry i must be moved, occurs with probability (i - 1)/i. Let us begin by counting the average number of iterations of the do ... while loop. Since all of the i - 1 possible positions are equally likely, the average number of iterations is (see p. 647)
$$\frac{1 + 2 + \cdots + (i-1)}{i-1} = \frac{(i-1)\,i}{2\,(i-1)} = \frac{i}{2}.$$

Analysis of Insertion Sort One key comparison and one assignment are done for each of these iterations, with one more key comparison done outside the loop, along with two assignments of entries. Hence, in this second case, entry i requires, on average, i /2 + 1 comparisons and i /2 + 2 assignments.

Analysis of Insertion Sort When we combine the two cases with their respective probabilities, we have
$$\frac{1}{i}\cdot 1 + \frac{i-1}{i}\left(\frac{i}{2}+1\right) = \frac{i+1}{2}$$
comparisons and
$$\frac{1}{i}\cdot 0 + \frac{i-1}{i}\left(\frac{i}{2}+2\right) = \frac{i+3}{2} - \frac{2}{i}$$
assignments.

Analysis of Insertion Sort: inserting all entries We wish to add these numbers from i = 2 to i = n, but to avoid complications in the arithmetic, we first use the big-O notation to approximate each of these expressions by suppressing the terms bounded by a constant; that is, terms that are $O(1)$. We thereby obtain $i/2 + O(1)$ for both the number of comparisons and the number of assignments of entries. In making this approximation, we are really concentrating on the actions within the main loop and suppressing any concern about operations done outside the loop or variations in the algorithm that change the amount of work only by some bounded amount.

Analysis of Insertion Sort To add $i/2 + O(1)$ from i = 2 to i = n, we apply Theorem A.1 on page 647. We also note that adding n terms, each of which is $O(1)$, produces a result that is $O(n)$. We thus obtain
$$\sum_{i=2}^{n}\left(\tfrac{1}{2}\,i + O(1)\right) = \tfrac{1}{2}\sum_{i=2}^{n} i + O(n) = \tfrac{1}{4}\,n^2 + O(n)$$
for both the number of comparisons of keys and the number of assignments of entries.

Analysis of Insertion Sort Both the number of comparisons of keys and the number of assignments of entries are therefore $\tfrac{1}{4}n^2 + O(n)$. As n becomes larger, the contributions from the term involving $n^2$ become much larger than the remaining terms collected as $O(n)$. Hence as the size of the list grows, the time needed by insertion sort grows like the square of this size: insertion sort is $O(n^2)$.
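The averaging over all n! orderings can be carried out by brute force for small n. The sketch below is an illustration only: the function names and the use of std::vector<int> are assumptions. It counts iterations of the inner shifting loop, one per entry moved down. Each iteration removes exactly one out-of-order pair of keys, so the average over all orderings is n(n - 1)/4, in agreement with the leading term ¼n² derived above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Contiguous insertion sort on a copy of the keys; returns the number of
// entries shifted one position down (iterations of the inner loop).
long insertion_sort_shifts(std::vector<int> entry)
{
    long shifts = 0;
    for (std::size_t first_unsorted = 1; first_unsorted < entry.size();
         ++first_unsorted) {
        int current = entry[first_unsorted];
        std::size_t position = first_unsorted;
        while (position > 0 && entry[position - 1] > current) {
            entry[position] = entry[position - 1];
            --position;
            ++shifts;
        }
        entry[position] = current;
    }
    return shifts;
}

// Averages the shift count over all n! orderings of the keys 1..n.
double average_shifts(int n)
{
    std::vector<int> keys(n);
    for (int i = 0; i < n; ++i)
        keys[i] = i + 1;                  // sorted order is the first permutation
    long total = 0, orderings = 0;
    do {
        total += insertion_sort_shifts(keys);
        ++orderings;
    } while (std::next_permutation(keys.begin(), keys.end()));
    return static_cast<double>(total) / orderings;
}
```

Because the number of orderings grows as n!, this check is practical only for small lists (roughly n up to 10).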

Analysis of Insertion Sort The worst case for the contiguous version of insertion sort is when the keys are input in reversed order. This would require i - 1 comparisons and i + 1 assignments for the i th entry in the list, with n keys being checked, giving a worst-case comparison count of
$$\sum_{i=2}^{n}(i-1) = \tfrac{1}{2}\,(n-1)\,n.$$
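The worst-case counts can be checked by instrumenting the contiguous algorithm. The following is a sketch under assumptions: it works on a std::vector<int> rather than on a Sortable_list, and the struct Counts and the function name are hypothetical. The loop structure follows the insertion_sort method given earlier, including the fact that the key comparison is skipped once position reaches 0.

```cpp
#include <cstddef>
#include <vector>

struct Counts {
    long comparisons;   // comparisons of keys
    long assignments;   // assignments of entries
};

// Same loop structure as the contiguous insertion_sort, with counters added.
Counts insertion_sort_counts(std::vector<int> &entry)
{
    Counts c = {0, 0};
    for (std::size_t first_unsorted = 1; first_unsorted < entry.size();
         ++first_unsorted) {
        ++c.comparisons;                          // the test just below
        if (entry[first_unsorted] < entry[first_unsorted - 1]) {
            std::size_t position = first_unsorted;
            int current = entry[first_unsorted];  // pull the entry out
            ++c.assignments;
            do {
                entry[position] = entry[position - 1];
                ++c.assignments;
                --position;
                // The key comparison is counted only when it is evaluated,
                // that is, when position > 0.
            } while (position > 0 &&
                     (++c.comparisons, entry[position - 1] > current));
            entry[position] = current;
            ++c.assignments;
        }
    }
    return c;
}
```

For the reversed list 6, 5, 4, 3, 2, 1 the counters reach 15 comparisons, which is ½(n - 1)n for n = 6, and 25 assignments, the sum of i + 1 for i = 2 to 6.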

Analysis of Insertion Sort Worst Case: as the sorted part grows, each newly inserted entry needs more moves, ending with n - 1 moves and then n moves. Total moves = … + (n - 1) + n = $O(n^2)$

Linked Version of Insertion Sort For a linked version of insertion sort, since there is no movement of data, there is no need to start searching at the end of the sorted sublist. Instead, we shall traverse the original list, taking one entry at a time and inserting it in the proper position in the sorted list. The pointer variable last_sorted will reference the end of the sorted part of the list, and last_sorted->next will reference the first entry that has not yet been inserted into the sorted sublist. We shall let first_unsorted also point to this entry and use a pointer current to search the sorted part of the list to find where to insert *first_unsorted. If *first_unsorted belongs before the current head of the list, then we insert it there. Otherwise, we move current down the list until first_unsorted->entry <= current->entry and then insert *first_unsorted before *current. To enable insertion before *current we keep a second pointer trailing in lock step one position closer to the head than current.

A sentinel is an extra entry added to one end of a list to ensure that a loop will terminate without having to include a separate check. Since we have last_sorted->next == first_unsorted, the node *first_unsorted is already in position to serve as a sentinel for the search, and the loop moving current is simplified. Finally, let us note that a list with 0 or 1 entry is already sorted, so that we can check these cases separately and thereby avoid trivialities elsewhere. The details appear in the following function and are illustrated in Figure 8.4.

Insertion Sort function
template <class Record>
void Sortable_list<Record>::insertion_sort()
/* Post: The entries of the Sortable_list have been rearranged so that the keys in all the entries are sorted into nondecreasing order.
   Uses: Methods for the class Record. The linked List implementation of Chapter 6. */
{
   Node<Record> *first_unsorted, // the first unsorted node to be inserted
                *last_sorted,    // tail of the sorted sublist
                *current,        // used to traverse the sorted sublist
                *trailing;       // one position behind current
   if (head != NULL) {           // Otherwise, the empty list is already sorted.
      last_sorted = head;        // The first node alone makes a sorted sublist.
      while (last_sorted->next != NULL) {
         first_unsorted = last_sorted->next;
         if (first_unsorted->entry < head->entry) {
            // Insert *first_unsorted at the head of the sorted list:
            last_sorted->next = first_unsorted->next;
            first_unsorted->next = head;
            head = first_unsorted;
         }
         else {
            // Search the sorted sublist to insert *first_unsorted:
            trailing = head;
            current = trailing->next;
            while (first_unsorted->entry > current->entry) {
               trailing = current;
               current = trailing->next;
            }
            // *first_unsorted now belongs between *trailing and *current.
            if (first_unsorted == current)
               last_sorted = first_unsorted; // already in right position
            else {
               last_sorted->next = first_unsorted->next;
               first_unsorted->next = current;
               trailing->next = first_unsorted;
            }
         }
      }
   }
}


Analysis of Insertion Sort

Linked Insertion Sort With no movement of data, there is no need to search from the end of the sorted sublist, as for the contiguous case. Traverse the original list, taking one entry at a time and inserting it in the proper position in the sorted list. Pointer last_sorted references the end of the sorted part of the list. Pointer first_unsorted == last_sorted->next references the first entry that has not yet been inserted into the sorted sublist.

Linked Insertion Sort Pointer current searches the sorted part of the list to find where to insert *first_unsorted. If *first_unsorted belongs before the head of the list, then insert it there. Otherwise, move current down the list until first_unsorted->entry <= current->entry and then insert *first_unsorted before *current. To enable insertion before *current, keep a second pointer trailing in lock step one position closer to the head than current.

Linked Insertion Sort A sentinel is an extra entry added to one end of a list to ensure that a loop will terminate without having to include a separate check. Since last_sorted->next == first_unsorted, the node *first_unsorted is already in position to serve as a sentinel for the search, and the loop moving current is simplified. A list with 0 or 1 entry is already sorted, so by checking these cases separately we avoid trivialities elsewhere.
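To experiment with the pointer manipulation outside the Sortable_list class, the method can be adapted into a free function. The version below is a sketch under assumptions: the Node struct holding an int, the function name linked_insertion_sort, and the helpers make_list and to_vector are hypothetical and exist only to make the example self-contained.

```cpp
#include <cstddef>
#include <vector>

struct Node {                     // hypothetical node holding an int key
    int entry;
    Node *next;
    Node(int e, Node *n = NULL) : entry(e), next(n) {}
};

// Sorts the chain starting at head into nondecreasing order by relinking
// nodes (no entries are copied). Returns the new head of the chain.
Node *linked_insertion_sort(Node *head)
{
    if (head == NULL) return head;            // empty list is already sorted
    Node *last_sorted = head;                 // one node is a sorted sublist
    while (last_sorted->next != NULL) {
        Node *first_unsorted = last_sorted->next;
        if (first_unsorted->entry < head->entry) {
            last_sorted->next = first_unsorted->next;   // unlink the node
            first_unsorted->next = head;                // and put it in front
            head = first_unsorted;
        } else {
            Node *trailing = head;
            Node *current = trailing->next;
            // *first_unsorted acts as the sentinel: the loop must stop there.
            while (first_unsorted->entry > current->entry) {
                trailing = current;
                current = trailing->next;
            }
            if (first_unsorted == current) {
                last_sorted = first_unsorted;           // already in place
            } else {
                last_sorted->next = first_unsorted->next;
                first_unsorted->next = current;
                trailing->next = first_unsorted;
            }
        }
    }
    return head;
}

// Helper: builds a chain holding the given values in the given order.
Node *make_list(const std::vector<int> &values)
{
    Node *head = NULL;
    for (std::size_t i = values.size(); i > 0; --i)
        head = new Node(values[i - 1], head);
    return head;
}

// Helper: copies the entries of a chain into a vector.
std::vector<int> to_vector(const Node *head)
{
    std::vector<int> values;
    for (const Node *p = head; p != NULL; p = p->next)
        values.push_back(p->entry);
    return values;
}
```

The helpers allocate nodes with new and never free them, which is acceptable only for a short experiment.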

Sorting Algorithms and Average Case Number of Comparisons
Simple Sorts, $O(N^2)$:
–Straight Selection Sort
–Bubble Sort
–Insertion Sort
More Complex Sorts, $O(N \log N)$:
–Quick Sort
–Merge Sort
–Heap Sort

Selection Sort We can analyze the performance of function selection_sort in the same way that it is programmed. The main function does nothing except some bookkeeping and calling the subprograms. The function swap is called n - 1 times, and each call does 3 assignments of entries, for a total count of 3(n - 1). The function max_key is called n - 1 times, with the length of the sublist ranging from n down to 2. If t is the number of entries on the part of the list for which it is called, then max_key does exactly t - 1 comparisons of keys to determine the maximum. Hence, altogether, there are $(n-1) + (n-2) + \cdots + 1 = \tfrac{1}{2}\,n(n-1)$ comparisons of keys, which we approximate to $\tfrac{1}{2}\,n^2 + O(n)$.
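The functions analyzed here are not reproduced on the slides, so the following is a hypothetical reconstruction that matches the description (a main loop, a max_key helper, and one swap per pass), written for std::vector<int> with counters added. The signatures and the SelectionCounts struct are assumptions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct SelectionCounts {
    long comparisons;   // comparisons of keys made by max_key
    long swaps;         // calls to swap (3 assignments of entries each)
};

// Returns the position of the largest key in entry[low..high].
// A sublist of t entries costs exactly t - 1 comparisons.
std::size_t max_key(const std::vector<int> &entry, std::size_t low,
                    std::size_t high, long &comparisons)
{
    std::size_t largest = low;
    for (std::size_t current = low + 1; current <= high; ++current) {
        ++comparisons;
        if (entry[largest] < entry[current])
            largest = current;
    }
    return largest;
}

// Repeatedly moves the largest remaining key to the end of the unsorted part.
SelectionCounts selection_sort(std::vector<int> &entry)
{
    SelectionCounts c = {0, 0};
    for (std::size_t position = entry.size(); position > 1; --position) {
        std::size_t largest = max_key(entry, 0, position - 1, c.comparisons);
        std::swap(entry[largest], entry[position - 1]);
        ++c.swaps;
    }
    return c;
}
```

For a list of 6 entries the counters give 15 comparisons, which is ½n(n - 1), and 5 swaps, which is n - 1, whatever the initial order.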

Analysis and comparison:

Selection sort moves the entries very efficiently but does many redundant comparisons. In its best case, insertion sort does the minimum number of comparisons, but it is inefficient in moving entries only one position at a time. Our goal now is to derive another method that avoids, as much as possible, the problems with both of these. Let us start with insertion sort and ask how we can reduce the number of times it moves an entry.

Shell Sort The reason why insertion sort can move entries only one position is that it compares only adjacent keys. If we were to modify it so that it first compares keys far apart, then it could sort the entries far apart. Afterward, the entries closer together would be sorted, and finally the increment between keys being compared would be reduced to 1, to ensure that the list is completely in order. This is the idea implemented in 1959 by D. L. Shell in the sorting method bearing his name. This method is also sometimes called diminishing-increment sort.

Example of Shell Sort

Shell Sort In the example, we first sort all names that are at distance 5 from each other (so there will be only two or three names on each such list), then re-sort the names using increment 3, and finally perform an ordinary insertion sort (increment 1). Even though we make three passes through all the names, the early passes move the names close to their final positions, so the final pass, an ordinary insertion sort, goes rapidly.

Shell Sort In the implementation, we start with increment equal to count, where we recall that count represents the size of the List being sorted, and at each pass reduce the increment with the statement: increment = increment/3 + 1;
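A minimal sketch of Shell sort using this rule for the increment is shown below. It is an illustration under assumptions: it sorts a std::vector<int>, the function name shell_sort is hypothetical, and each pass is organized as a single insertion sort with stride increment over the whole list, which has the same effect as sorting every sublist of entries that lie increment positions apart.

```cpp
#include <cstddef>
#include <vector>

// Shell sort with the increment reduced by increment = increment/3 + 1.
void shell_sort(std::vector<int> &entry)
{
    std::size_t increment = entry.size();     // start with increment == count
    do {
        increment = increment / 3 + 1;        // reduce the increment
        // Insertion sort in which "adjacent" means "increment apart".
        for (std::size_t first_unsorted = increment;
             first_unsorted < entry.size(); ++first_unsorted) {
            int current = entry[first_unsorted];
            std::size_t position = first_unsorted;
            while (position >= increment &&
                   entry[position - increment] > current) {
                entry[position] = entry[position - increment];
                position -= increment;
            }
            entry[position] = current;
        }
    } while (increment > 1);                  // last pass uses increment 1
}
```

For a list of 14 entries this rule produces the increments 5, 2 and 1. The last pass always uses increment 1, an ordinary insertion sort, so the result is fully sorted whatever the earlier increments were.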

Analysis of Shell Sort Very large empirical studies have been made of Shell sort, and it appears that the number of moves, when n is large, is in the range of $n^{1.25}$ to $1.6\,n^{1.25}$. This constitutes a substantial improvement over insertion sort.

Merge Sort Algorithm Cut the array in half. Sort the left half. Sort the right half. Merge the two sorted halves into one sorted array. (Figure: the array divided into [first .. middle] and [middle + 1 .. last].)

// Recursive merge sort algorithm
template <class ItemType>
void MergeSort(ItemType values[], int first, int last)
// Pre: first <= last
// Post: Array values[first .. last] sorted into ascending order.
{
   if (first < last) // general case
   {
      int middle = (first + last) / 2;
      MergeSort(values, first, middle);
      MergeSort(values, middle + 1, last);
      // now merge two subarrays
      // values[first .. middle] with
      // values[middle + 1 .. last].
      Merge(values, first, middle, middle + 1, last);
   }
}

Using Merge Sort Algorithm with N =

Merge Sort of N elements: How many comparisons? The entire array can be subdivided into halves only $\log_2 N$ times. Each time it is subdivided, function Merge is called to re-combine the halves. Function Merge uses a temporary array to store the merged elements. Merging is $O(N)$ because it compares each element in the subarrays. Copying elements back from the temporary array to the values array is also $O(N)$. Merge sort is $O(N \log_2 N)$.
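The function Merge is described but not shown. A possible version is sketched below. It is an assumption-laden illustration: the parameter names are hypothetical, chosen to fit the call Merge(values, first, middle, middle + 1, last), and a std::vector stands in for the temporary array.

```cpp
#include <vector>

// Merges the sorted subarrays values[leftFirst .. leftLast] and
// values[rightFirst .. rightLast] (assumed adjacent) into one sorted run
// occupying values[leftFirst .. rightLast].
template <class ItemType>
void Merge(ItemType values[], int leftFirst, int leftLast,
           int rightFirst, int rightLast)
{
    std::vector<ItemType> temp;               // temporary storage, O(N) extra
    temp.reserve(rightLast - leftFirst + 1);
    int left = leftFirst;
    int right = rightFirst;
    // Repeatedly take the smaller front element; ties are taken from the
    // left half, which keeps equal keys in their original order.
    while (left <= leftLast && right <= rightLast) {
        if (values[right] < values[left])
            temp.push_back(values[right++]);
        else
            temp.push_back(values[left++]);
    }
    while (left <= leftLast)                  // leftovers of the left half
        temp.push_back(values[left++]);
    while (right <= rightLast)                // leftovers of the right half
        temp.push_back(values[right++]);
    // Copy the merged run back into the original array: another O(N) step.
    for (int i = 0; i < static_cast<int>(temp.size()); ++i)
        values[leftFirst + i] = temp[i];
}
```

Each element is appended to the temporary storage once and copied back once, which is where the two $O(N)$ costs mentioned above come from.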

Figure 11-24

Figure 11-25

Mergesort