
Sorting
Sorting takes an arbitrary permutation of n items and rearranges them into a total order.
Sorting is, without doubt, the most fundamental algorithmic problem.
Supposedly, between 25% and 50% (depending on the source) of all CPU cycles are spent sorting.
Sorting is used throughout office applications (databases, spreadsheets, word processors, ...).
Many other problems build on sorting; binary search, for example, requires sorted input.
Many different approaches lead to useful sorting algorithms.
It generally helps to know the properties of the data to be sorted, so that we can sort it faster.

Applications of Sorting
Sorting is important because once a list is sorted, other problems become easy.
Searching: speeding up searching is perhaps the most important application of sorting.
Closest pair: given n numbers, find the pair that are closest to each other. Once the list is sorted, how long will this take?
Element uniqueness: given a set of n items, are they all unique? Remove duplicates. Sorted list versus unsorted list?
Set differences: compare two large sets and find where they differ.
Frequency distribution (mode): given a set of n items, which element occurs the largest number of times?
Median and selection: what is the kth largest item in the set? The median element?
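To make the questions above concrete, here is a minimal sketch of three of these applications, assuming the data fits in a std::vector<int> and that sorting a copy is acceptable. The function names closest_pair_gap, all_unique, and kth_largest are hypothetical names chosen for illustration. After the O(n log n) sort, each answer needs at most one linear scan, because closest values and duplicates end up adjacent.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Smallest difference between any two values. After sorting, the closest
// pair must be adjacent, so one linear scan is enough.
// Assumes at least two values and that differences fit in an int.
int closest_pair_gap(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    int best = v[1] - v[0];
    for (std::size_t i = 2; i < v.size(); ++i)
        best = std::min(best, v[i] - v[i - 1]);
    return best;
}

// True when no value occurs twice: after sorting, duplicates are adjacent.
bool all_unique(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return std::adjacent_find(v.begin(), v.end()) == v.end();
}

// k-th largest value (k = 1 is the maximum). Assumes 1 <= k <= v.size().
int kth_largest(std::vector<int> v, std::size_t k) {
    std::sort(v.begin(), v.end());
    return v[v.size() - k];
}
```

For the values 3, 2, 8, 4, 1, 6 this sketch gives a closest-pair gap of 1, reports that all values are unique, and returns 6 as the second largest.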

Write a function that sorts an array of ints, for example:
    int x[6] = {3,2,8,4,1,6};  // unsorted
    int x[6] = {1,2,3,4,5,6};  // already sorted
    int x[6] = {6,5,4,3,2,1};  // reverse order

Selection Sort
Your basic sorting algorithm
Straightforward
How do we do this?

Selection Sort Example
35 65 30 60 20   scan 0-4, smallest = 20, swap 35 and 20
20 65 30 60 35   scan 1-4, smallest = 30, swap 65 and 30
20 30 65 60 35   scan 2-4, smallest = 35, swap 65 and 35
20 30 35 60 65   scan 3-4, smallest = 60, swap 60 and 60
20 30 35 60 65   done
Algorithm design?

Selection Sort Algorithm
1. for i = 0 to n-2 do   // steps 2-6 form a pass
2.     set min_pos to i
3.     for j = i+1 to n-1 do
4.         if item at j < item at min_pos
5.             set min_pos to j
6.     exchange item at min_pos with the one at i
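The pseudocode above can be sketched in C++ as follows. This is one possible rendering, assuming the items are ints held in a std::vector; the name selection_sort is chosen here for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts items into ascending order using selection sort.
void selection_sort(std::vector<int>& items) {
    const std::size_t n = items.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {      // i = 0 .. n-2
        std::size_t min_pos = i;
        for (std::size_t j = i + 1; j < n; ++j) {  // j = i+1 .. n-1
            if (items[j] < items[min_pos]) min_pos = j;
        }
        // Exchange the smallest remaining item with the one at i.
        std::swap(items[i], items[min_pos]);
    }
}
```

Each pass performs exactly one exchange, but the inner loop always scans the whole unsorted remainder, so the number of comparisons is the same whether or not the input is already sorted.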

Bubble Sort
Compares adjacent array elements
Exchanges their values if they are out of order
Smaller values bubble up to the top of the array
Larger values sink to the bottom

Bubble Sort Algorithm
do
    for each pair of adjacent array elements
        if values are out of order
            exchange the values
while the array is not sorted

Bubble Sort Algorithm, Refined
do
    initialize exchanges to false
    for each pair of adjacent array elements
        if values are out of order
            exchange the values
            set exchanges to true
while exchanges

Bubble Sort Code
    // Sorts arr[first] .. arr[last - 1]; last is one past the final element.
    void bubble_sort(int first, int last, int arr[]) {
        int pass = 1;
        bool exchanges;
        do {
            exchanges = false;  // No exchanges yet.
            // Compare each pair of adjacent elements.
            for (int x = first; x < last - pass; x++) {
                int y = x + 1;
                if (arr[y] < arr[x]) {
                    // Exchange pair.
                    int temp = arr[y];
                    arr[y] = arr[x];
                    arr[x] = temp;
                    exchanges = true;  // Set flag.
                }
            }
            pass++;  // The largest remaining value is now in place.
        } while (exchanges);
    }

Analysis of Bubble Sort
Is this better than selection sort?
In what cases would this algorithm work best? Worst?

Analysis of Bubble Sort
Excellent performance in some cases, but very poor performance in others!
Works best when the array is nearly sorted to begin with
Worst case number of comparisons: O(n^2)
Worst case number of exchanges: O(n^2)
Best case occurs when the array is already sorted:
  O(n) comparisons
  O(1) exchanges (none actually)
Can we do better?
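One way to check these counts is to instrument the algorithm. The sketch below is an illustration only: the BubbleStats struct and the bubble_sort_counted function are hypothetical names, and the data is assumed to be ints in a std::vector. It runs the refined bubble sort and tallies comparisons and exchanges.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Tallies of the work done by one run of bubble sort.
struct BubbleStats {
    long comparisons = 0;
    long exchanges = 0;
};

// Bubble sort with an early exit when a pass makes no exchanges.
BubbleStats bubble_sort_counted(std::vector<int>& a) {
    BubbleStats stats;
    std::size_t pass = 1;
    bool exchanged;
    do {
        exchanged = false;
        // After each pass the largest remaining value is in place,
        // so every pass can stop one element earlier.
        for (std::size_t x = 0; x + pass < a.size(); ++x) {
            ++stats.comparisons;
            if (a[x + 1] < a[x]) {
                std::swap(a[x], a[x + 1]);
                ++stats.exchanges;
                exchanged = true;
            }
        }
        ++pass;
    } while (exchanged);
    return stats;
}
```

On six already-sorted values this sketch reports 5 comparisons and 0 exchanges (a single pass). On six values in reverse order it reports 15 comparisons and 15 exchanges, which is n(n-1)/2 for n = 6.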

Insertion Sort
Based on the technique card players use to arrange a hand
The player keeps the cards picked up so far in sorted order
When the player picks up a new card, the player:
  Makes room for the new card
  Then inserts it in its proper place

Insertion Sort Algorithm
For each element from 2nd to last:
  Insert the element where it belongs in the first (sorted) part of the list
Inserting into the sorted part increases the sorted subarray size by 1
To make room:
  Hold the value to be inserted in a temp variable
  Shuffle elements to the right until the gap is at the right place
  Place the temp value in the gap

Insertion Sort Example

More Efficient Version
    // Inserts arr[next_pos] into the sorted run arr[first] .. arr[next_pos - 1].
    void insert(int first, int next_pos, int arr[]) {
        int next_val = arr[next_pos];  // next_val is element to insert.
        // Shift every value greater than next_val one position to the right.
        while (next_pos != first && next_val < arr[next_pos - 1]) {
            arr[next_pos] = arr[next_pos - 1];
            next_pos--;  // Check next smaller element.
        }
        arr[next_pos] = next_val;  // Store next_val where it belongs.
    }

    // Sorts arr[first] .. arr[last - 1]; last is one past the final element.
    void insertion_sort(int first, int last, int arr[]) {
        for (int next_pos = first + 1; next_pos < last; next_pos++) {
            // Elements at positions first through next_pos - 1 are sorted.
            // Insert element at next_pos into the sorted subarray.
            insert(first, next_pos, arr);
        }
    }
Analysis? Best case? Worst case?

Analysis of Insertion Sort
Maximum number of comparisons: O(n^2)
In the best case, number of comparisons: O(n)
# shifts for an insertion = # comparisons - 1
When the new value is the smallest so far, # shifts = # comparisons
A shift in insertion sort moves only one item (1 assignment)
A bubble or selection sort exchange takes 3 assignments
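The two bounds can be worked out by counting value comparisons per insertion. This is a sketch under one assumption: the element at 0-based position i is inserted into a sorted prefix of i elements, as in the code earlier. In the worst case (input in reverse order) every insertion compares against the whole prefix; in the best case (input already sorted) every insertion stops after one comparison.

```latex
\[
C_{\text{worst}}(n) = \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = O(n^2),
\qquad
C_{\text{best}}(n) = \sum_{i=1}^{n-1} 1 = n - 1 = O(n).
\]
```

For n = 6 this gives 15 comparisons in the worst case and 5 in the best case.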

Comparison of Quadratic Sorts Good enough? Can we do better?

What sorts? Best case? Worst?
1.
    void asort(int first, int last, int arr[]) {
        int pass = 1;
        bool exchanges;
        do {
            exchanges = false;
            for (int x = first; x < last - pass; x++) {
                int y = x + 1;
                if (arr[y] < arr[x]) {
                    int temp = arr[y];
                    arr[y] = arr[x];
                    arr[x] = temp;
                    exchanges = true;
                }
            }
            pass++;
        } while (exchanges);
    }
2.
    void asort(int arr[], int n) {
        int i, j, min_idx;
        for (i = 0; i < n - 1; i++) {
            min_idx = i;
            for (j = i + 1; j < n; j++) {
                if (arr[j] < arr[min_idx]) {
                    min_idx = j;
                }
            }
            int tmp = arr[i];
            arr[i] = arr[min_idx];
            arr[min_idx] = tmp;
        }
    }
3.
    void asort(int arr[], int n) {
        int i, key, j;
        for (i = 1; i < n; i++) {
            key = arr[i];
            j = i - 1;
            while (j >= 0 && arr[j] > key) {
                arr[j + 1] = arr[j];
                j = j - 1;
            }
            arr[j + 1] = key;
        }
    }

Quicksort
Developed by C. A. R. Hoare around 1960 and published in 1961-62
Given a pivot value, rearranges the array into two parts:
  Left part <= pivot value
  Right part > pivot value

Trace of Algorithm for Partitioning

Quicksort Example
44 75 12 43 64 23 55 77 33
44 33 12 43 23 64 55 77 75
23 33 12 43 44 64 55 77 75
23 33 12 43    64 55 77 75
23 12 33 43    55 64 77 75
12 23 33 43
      33 43          77 75
                     75 77

In English:
Pick a pivot (we picked the first value in each subarray).
Place a firstptr at the first value in the subarray after the pivot.
Place a lastptr at the last value in the subarray.
Repeat the following until the firstptr is no longer to the left of the lastptr:
  While the firstptr is less than the lastptr and the firstptr points to a value less than or equal to the pivot, increment the firstptr to the next value in the array.
  While the lastptr points to a value greater than the pivot, decrement the lastptr to the previous value in the array. (It stops at the pivot at the latest.)
  If firstptr < lastptr, switch the values at the firstptr and the lastptr.
Switch the values at the pivot and the lastptr. Now the pivot is in place.
All values before the pivot become a new subarray, and all the values after the pivot become a new subarray.
Repeat until every subarray has at most one element.

Algorithm for Quicksort
first and last are the end points of the region to sort
if first < last
    Partition using pivot, which ends in piv_index
    Apply Quicksort recursively to left subarray
    Apply Quicksort recursively to right subarray

Algorithm for Partitioning
Set pivot value to a[first]
Set up to first+1 and down to last
do
    Increment up until a[up] > pivot or up = last
    Decrement down until a[down] <= pivot or down = first
    if up < down, swap a[up] and a[down]
while up is to the left of down
swap a[first] and a[down]
return down as pivIndex

Quicksort Code
    // Sorts arr[first] .. arr[last]; both end points are inclusive.
    void quick_sort(int first, int last, int arr[]) {
        if (first < last) {  // There is data to be sorted.
            // Partition the table.
            int pivot = partition(first, last, arr);
            // Sort the left half.
            quick_sort(first, pivot - 1, arr);
            // Sort the right half.
            quick_sort(pivot + 1, last, arr);
        }
    }

Partitioning Code
    // Partitions arr[first] .. arr[last] (inclusive) around pivot arr[first].
    // Returns the final index of the pivot.
    int partition(int first, int last, int arr[]) {
        int pivot = arr[first];
        int i = first + 1, j = last;
        int tmp;
        while (i <= j) {
            // Skip values that are already on the correct side.
            while (i <= j && arr[i] <= pivot) i++;
            while (i <= j && arr[j] > pivot) j--;
            if (i < j) {
                tmp = arr[i];
                arr[i] = arr[j];
                arr[j] = tmp;
                i++;
                j--;
            }
        }
        // arr[j] is the last value <= pivot; put the pivot there.
        arr[first] = arr[j];
        arr[j] = pivot;
        return j;
    }
Analysis? Does this preserve stability? What happens with a sorted list?
Try: int s[] = {'g','n','o','a','d','c','e'}; partition(0, 6, s);

Analysis of Quicksort
Average case for Quicksort is O(n log n)
  With balanced splits there are about log n levels of partitioning
  Each level compares about n values (and swaps some of them)
Worst case is O(n^2)
  What would make the worst case happen? The chosen pivot always ends up with all the other values on one side of it
  When would this happen? With a first-element pivot, on an already sorted list (go figure)

Solution: pick better pivot values
The worst case occurs when the list is sorted or almost sorted
To eliminate this problem, pick a better pivot:
  Use the middle element of the subarray as the pivot
  Use a random element of the array as the pivot
  Perhaps best: take the median of three elements as the pivot
    Use three "marker" elements: first, middle, last
    Let the pivot be the one whose value is between the others
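A minimal sketch of the median-of-three idea follows, assuming inclusive int indexes first and last into an int array. The names median_of_three and choose_pivot are hypothetical. Swapping the chosen element into arr[first] lets a partition routine that expects the pivot in the first position be reused unchanged.

```cpp
#include <utility>

// Returns the index (first, middle, or last) that holds the median
// of the three marker values.
int median_of_three(const int arr[], int first, int last) {
    int middle = first + (last - first) / 2;
    int a = arr[first], b = arr[middle], c = arr[last];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return middle;
    if ((b <= a && a <= c) || (c <= a && a <= b)) return first;
    return last;
}

// Moves the median-of-three value into arr[first] so that a
// first-element partition routine uses it as the pivot.
void choose_pivot(int arr[], int first, int last) {
    std::swap(arr[first], arr[median_of_three(arr, first, last)]);
}
```

On an already sorted subarray the median of the three markers is the middle element, so the split is balanced instead of one-sided.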

Merge Sort
Like Quicksort in that it takes a "divide and conquer" approach
Divide and conquer often leads to O(n log n) running time
A merge is a common data processing operation: we're merging two sets of ordered data
Goal: combine the two sorted sequences into one larger sorted sequence
Merge sort starts small and merges longer and longer sequences

Merge Algorithm (Two Arrays)
Merging two arrays:
Access the first item from both sequences
While neither sequence is finished:
  Compare the current items of both
  Copy the smaller current item to the output
  Access the next item from that input sequence
Copy any remaining items from the first sequence to the output
Copy any remaining items from the second sequence to the output
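The steps above can be sketched as a function that merges two separate sorted sequences into a new one. This is an illustrative version, assuming the inputs are sorted std::vector<int> values; the name merge_sorted is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Merges two sorted sequences into one sorted output sequence.
std::vector<int> merge_sorted(const std::vector<int>& a,
                              const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    // While neither sequence is finished, copy the smaller current item.
    while (i < a.size() && j < b.size()) {
        if (a[i] <= b[j]) out.push_back(a[i++]);
        else out.push_back(b[j++]);
    }
    // At most one of these loops copies anything.
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}
```

Taking from the first sequence when the two current items are equal keeps equal items in their original relative order, which is what makes a merge-based sort stable.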

Picture of Merge
Analysis of this? Time? Space?

Analysis of Merge
Two input sequences, total length n elements
Must move each element to the output, so merge time is O(n)
Must store both the input and the output sequences
The two halves of an array cannot be merged in place by this simple algorithm
Additional space needed: O(n)

Using Merge to Sort
So far, we've merged 2 files that are already in order. We can do this in O(n) time. Good!
Can we use merge to sort an entire list? Yes!
Take an unordered list, and divide it into 2 lists.
Can we merge these lists? No, these lists are also unordered.
So let's divide each of these lists into 2 lists.
We continue to divide until each list contains one element.
Is a one-element list ordered? Yes! Now we can start merging lists.
This looks recursive!

Merge Sort Algorithm
Overview:
  Split the array into two halves
  MergeSort the left half (recursively)
  MergeSort the right half (recursively)
  Merge the two sorted halves
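The overview can be sketched as a short recursive function. This is an illustration under a few assumptions: the data is a std::vector<int>, ranges are half-open (hi is one past the last element), and the standard library's std::merge does the merging into a temporary buffer. The name merge_sort is chosen here for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sorts a[lo] .. a[hi - 1]; hi is one past the last element.
void merge_sort(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;               // 0 or 1 element: already sorted
    std::size_t mid = lo + (hi - lo) / 2;  // split into two halves
    merge_sort(a, lo, mid);                // sort the left half
    merge_sort(a, mid, hi);                // sort the right half
    // Merge the two sorted halves into a temporary buffer, then copy back.
    std::vector<int> merged(hi - lo);
    std::merge(a.begin() + lo, a.begin() + mid,
               a.begin() + mid, a.begin() + hi, merged.begin());
    std::copy(merged.begin(), merged.end(), a.begin() + lo);
}

// Convenience overload that sorts the whole vector.
void merge_sort(std::vector<int>& a) { merge_sort(a, 0, a.size()); }
```

The temporary buffer is the O(n) extra space that merging requires; allocating it once up front instead of in every call would be a natural refinement.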

Merge Sort Example
50 60 45 30 90 20 80 15
50 60 45 30 | 90 20 80 15
50 60 | 45 30 | 90 20 | 80 15
50 | 60 | 45 | 30 | 90 | 20 | 80 | 15
50 60 | 30 45 | 20 90 | 15 80
30 45 50 60 | 15 20 80 90
15 20 30 45 50 60 80 90

Algorithm (in English) for merging
You have two arrays you are going to merge into one array:
Create a new array the length of the two arrays combined.
Place a pointer at the beginning of both arrays.
Take the smaller of the two pointer values and place it in the new array.
Increment that pointer.
Continue until the pointer in one array is at the end of the array.
Copy the remainder of the other array to the new array.
Example:
  30 45 50 60  and  15 20 80 90
  merge into  15 20 30 45 50 60 80 90

Merge Sort Code
    /* Merges the sorted runs arr[l..m] and arr[m+1..r] (inclusive bounds). */
    void merge(int arr[], int l, int m, int r) {
        int i, j, k;
        int n1 = m - l + 1;
        int n2 = r - m;
        /* create temp arrays (variable-length arrays: C99, or a compiler
           extension in C++) */
        int L[n1], R[n2];
        for (i = 0; i < n1; i++)  /* Copy data to temp arrays L[] and R[] */
            L[i] = arr[l + i];
        for (j = 0; j < n2; j++)
            R[j] = arr[m + 1 + j];
        i = 0;  /* Merge the temp arrays back into arr[l..r] */
        j = 0;
        k = l;
        while (i < n1 && j < n2) {
            if (L[i] <= R[j]) {
                arr[k] = L[i];
                i++;
            } else {
                arr[k] = R[j];
                j++;
            }
            k++;
        }
        while (i < n1) {  /* Copy the remaining elements of L[], if there are any */
            arr[k] = L[i];
            i++;
            k++;
        }
        while (j < n2) {  /* Copy the remaining elements of R[], if there are any */
            arr[k] = R[j];
            j++;
            k++;
        }
    }

Merge Sort Analysis
Merging: must go through all the elements in every array being merged
This is O(n) per level of merging
But there are only log n levels: we merge runs of length 1, then 2, then 4, then 8, ...
So the total is O(n log n)
Not bad!
Sorted lists? Reverse order lists?

Noncomparison Based Sorting
All the sorting algorithms we have seen assume binary comparisons as the basic primitive: is x before y?
Suppose we had a set of n integers that range from 1...n, no two integers the same, and an array of length n. How long would it take to sort this array?
What if we had a set of integers, all between 1...n in value, and we had duplicates? How can we tell how many of each integer we have?
Can we write an algorithm that runs in O(n)?
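One possible answer to the last question is to count how many times each value occurs and then write the values back out in order. The sketch below assumes every value lies between 1 and a known max_value; the name counting_sort and its parameters are chosen here for illustration. No two elements are ever compared with each other.

```cpp
#include <cstddef>
#include <vector>

// Sorts values that all lie in 1..max_value by counting occurrences.
// Runs in O(n + max_value) time.
std::vector<int> counting_sort(const std::vector<int>& values, int max_value) {
    // count[v] = how many times v occurs (index 0 is unused).
    std::vector<std::size_t> count(static_cast<std::size_t>(max_value) + 1, 0);
    for (int v : values) ++count[static_cast<std::size_t>(v)];

    std::vector<int> sorted;
    sorted.reserve(values.size());
    for (int v = 1; v <= max_value; ++v) {
        // Append count[v] copies of v.
        sorted.insert(sorted.end(), count[static_cast<std::size_t>(v)], v);
    }
    return sorted;
}
```

When max_value is about n, as in the questions above, the running time is O(n). The count array also answers "how many of each integer do we have" directly.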

Radix Sort
What if we have very large key values?
Think about the decimal representation of a number: x = a + 10*b + 100*c + 1000*d + ...
  a, b, c, d, ... are all single-digit integers (0...9)
Now we can do a bucket sort on a, b, c, d, ...
We do a bucket sort on a, then on b, then c, then d, etc. (smallest place value to largest)
Why smallest first?

Radix Sort in action
Let's try it: 427, 496, 834, 222, 333, 444, 595, 582, 767, 294
First pass (ones digit):
  2: 222, 582
  3: 333
  4: 834, 444, 294
  5: 595
  6: 496
  7: 427, 767
Second pass (tens digit):
  2: 222, 427
  3: 333, 834
  4: 444
  6: 767
  8: 582
  9: 294, 595, 496
Third pass (hundreds digit):
  2: 222, 294
  3: 333
  4: 427, 444, 496
  5: 582, 595
  7: 767
  8: 834

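Try: (missing left digits are 0s) 647, 315, 16, 14, 359, 453, 203, 235
First pass (ones digit):
  3: 453, 203
  4: 14
  5: 315, 235
  6: 16
  7: 647
  9: 359
Second pass (tens digit):
  0: 203
  1: 14, 315, 16
  3: 235
  4: 647
  5: 453, 359
Third pass (hundreds digit):
  0: 14, 16
  2: 203, 235
  3: 315, 359
  4: 453
  6: 647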

Radix Sort Analysis
Each bucket sort pass takes O(n) time.
There are k passes, where k is the digit length of the keys (3 digits in the examples above).
So the total time is O(n*k).
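A least-significant-digit radix sort along these lines can be sketched as follows. The assumptions are that keys are non-negative ints with at most `digits` decimal digits and that `digits` is at most 9, so the place value fits in a 32-bit int. The name radix_sort and its parameters are chosen here for illustration.

```cpp
#include <array>
#include <vector>

// Sorts non-negative integers that have at most `digits` decimal digits.
void radix_sort(std::vector<int>& keys, int digits) {
    int place = 1;  // 1 for ones, 10 for tens, 100 for hundreds, ...
    for (int pass = 0; pass < digits; ++pass, place *= 10) {
        // One bucket per digit value 0..9.
        std::array<std::vector<int>, 10> buckets;
        for (int key : keys) buckets[(key / place) % 10].push_back(key);
        // Read the buckets back in order. Keys keep their relative order
        // inside a bucket, which preserves the work of earlier passes.
        keys.clear();
        for (const auto& bucket : buckets)
            keys.insert(keys.end(), bucket.begin(), bucket.end());
    }
}
```

Each pass touches every key once, so k passes over n keys cost O(n*k), matching the analysis above.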