CSE 332: Data Abstractions Sorting I


CSE 332: Data Abstractions Sorting I Spring 2016


Sorting
Input:
- an array A of data records
- a key value in each data record
- a comparison function which imposes a consistent ordering on the keys
Output:
- a “sorted” array A such that for any i and j, if i < j then A[i] ≤ A[j]

Consistent Ordering
The comparison function must provide a consistent ordering on the set of possible keys:
- You can compare any two keys and get back an indication of a < b, a > b, or a = b (trichotomy)
- The comparison function must be consistent:
  - If compare(a,b) says a<b, then compare(b,a) must say b>a
  - If compare(a,b) says a=b, then compare(b,a) must say b=a
  - If compare(a,b) says a=b, then equals(a,b) and equals(b,a) must say a=b
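This contract can be sketched in Java (an illustrative sketch; the class and method names are mine, not from the course materials). `Integer.compare` already provides a consistent total order on ints; note the classic pitfall of writing `a - b` instead, which can overflow and break trichotomy.

```java
// Sketch of a consistent comparison function (illustrative names).
public class KeyComparison {
    // Negative if a < b, zero if a = b, positive if a > b (trichotomy).
    static int compare(int a, int b) {
        // Integer.compare avoids the overflow bug of "return a - b".
        return Integer.compare(a, b);
    }

    // Consistency check from the slide: compare(a,b) and compare(b,a)
    // must give opposite signs (or both be zero when a = b).
    static boolean consistent(int a, int b) {
        return Integer.signum(compare(a, b)) == -Integer.signum(compare(b, a));
    }
}
```

The `a - b` shortcut fails exactly where `Integer.compare` succeeds, e.g. comparing `Integer.MIN_VALUE` with a positive number.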

Why Sort?
Sorting provides fast search: binary search finds a key in O(log n).
It also makes order statistics trivial: the kth largest element can be found in O(1) by indexing from the end of the sorted array.
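As a concrete sketch using only standard library calls (the helper name is illustrative): after one O(n log n) sort, `Arrays.binarySearch` answers queries in O(log n), and the kth largest element is a single index from the end.

```java
// Sketch: what a sorted array buys you (illustrative helper name).
public class WhySortDemo {
    // kth largest of an ascending-sorted array; k = 1 gives the maximum.
    static int kthLargest(int[] sortedAsc, int k) {
        return sortedAsc[sortedAsc.length - k];  // O(1) once sorted
    }
}
```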

Space
How much space does the sorting algorithm require?
- In-place: no more than the array itself, or at most O(1) additional space
- Out-of-place: uses separate data structures, then copies back
- External memory sorting: the data is so large that it does not fit in memory

Stability
A sorting algorithm is stable if items in the input with the same value end up in the same relative order as when they began.

Input:         Adams 1, Black 2, Brown 4, Jackson 2, Jones 4, Smith 1, Thompson 4, Washington 2, White 3, Wilson 3
Unstable sort: Adams 1, Smith 1, Washington 2, Jackson 2, Black 2, White 3, Wilson 3, Thompson 4, Brown 4, Jones 4
Stable sort:   Adams 1, Smith 1, Black 2, Jackson 2, Washington 2, White 3, Wilson 3, Brown 4, Jones 4, Thompson 4
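The stable outcome above can be reproduced with a short Java sketch (illustrative names; `java.util.Arrays.sort` on objects is documented to be stable). Sorting the "Name key" records by the numeric key only, ties keep their original alphabetical order:

```java
import java.util.Arrays;
import java.util.Comparator;

// Stability sketch: sort "Name key" records by the numeric key only.
public class StabilityDemo {
    static void sortByKey(String[] records) {
        // Arrays.sort on objects is stable, so records with equal keys
        // stay in their original relative (here: alphabetical) order.
        Arrays.sort(records, Comparator.comparingInt(
            (String r) -> Integer.parseInt(r.substring(r.lastIndexOf(' ') + 1))));
    }
}
```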

Time
How fast is the algorithm?
The requirement: for any i < j, A[i] ≤ A[j].
At the very minimum you must examine each element, so the complexity is at least Ω(n).
And you could end up checking each element against every other element, so the complexity could be as bad as O(n²).
The big question: how close to O(n) can you get?

Sorting: The Big Picture
- Simple algorithms, O(n²): insertion sort, selection sort, …
- Fancier algorithms, O(n log n): heap sort, merge sort, quick sort (avg), …
- Comparison lower bound: Ω(n log n)
- Specialized algorithms, O(n): bucket sort, radix sort
- Handling huge data sets: external sorting

Demo (with sound!) http://www.youtube.com/watch?v=kPRA0W1kECg

Selection Sort: idea
- Find the smallest element, put it 1st
- Find the next smallest element, put it 2nd
- Find the next smallest, put it 3rd
- And so on …

Try it out: Selection Sort 31, 16, 54, 4, 2, 17, 6

Selection Sort: Code

void SelectionSort(Array a[0..n-1]) {
  for (i = 0; i < n; i++) {
    j = index of smallest entry in a[i..n-1]
    Swap(a[i], a[j])
  }
}

One good feature: does at most n swaps.
For stability: if there are duplicate smallest elements, choose the first.
Runtime: worst, best, and average case are all O(n²); each selection of the minimum takes linear time, even when the array is already sorted!
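A runnable Java version of the pseudocode above (a sketch; the names are mine). Following the slide's advice, the strict `<` in the inner scan picks the first of any duplicate minimums:

```java
// Runnable Java sketch of the slide's selection sort (illustrative names).
public class SelectionSortDemo {
    static void selectionSort(int[] a) {
        int n = a.length;
        for (int i = 0; i < n; i++) {
            // Find the index of the smallest entry in a[i..n-1];
            // strict < picks the first of any duplicate minimums.
            int j = i;
            for (int k = i + 1; k < n; k++)
                if (a[k] < a[j]) j = k;
            // One swap per outer iteration: at most n swaps total.
            int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
        }
    }
}
```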

Bubble Sort
Take a pass through the array: if neighboring elements are out of order, swap them.
Repeat until no swaps are needed.
Worst and average case: O(n²). There is pretty much no reason to ever use this algorithm.

Insertion Sort
- Sort the first 2 elements.
- Insert the 3rd element in order. (The first 3 elements are now sorted.)
- Insert the 4th element in order. (The first 4 elements are now sorted.)
- And so on…

How to do the insertion? Suppose my sequence is: 16, 31, 54, 78, 32, 17, 6 And I’ve already sorted up to 78. How to insert 32?

Try it out: Insertion sort 31, 16, 54, 4, 2, 17, 6

Insertion Sort: Code

void InsertionSort(Array a[0..n-1]) {
  for (i = 1; i < n; i++) {
    for (j = i; j > 0; j--) {
      if (a[j] < a[j-1])
        Swap(a[j], a[j-1])
      else
        break
    }
  }
}

Runtime:
- worst case: O(n²) (reverse-sorted input)
- best case: O(n) (sorted input)
- average case: O(n²) (moves k/2 elements in the kth step)
But good for nearly sorted inputs like 2,3,4,…,1.
Note: can instead move the “hole” to minimize copying, as with a binary heap.
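A runnable Java sketch of insertion sort using the "hole" variant from the note (names are mine): instead of a three-assignment swap per step, the element being inserted is held aside while larger elements shift right, one copy each, as in binary-heap percolation.

```java
// Insertion sort with the "hole" optimization (illustrative names).
public class InsertionSortDemo {
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int x = a[i];              // element to insert; a[i] is the hole
            int j = i;
            while (j > 0 && a[j - 1] > x) {
                a[j] = a[j - 1];       // shift right: one copy, not a swap
                j--;
            }
            a[j] = x;                  // drop the element into the hole
        }
    }
}
```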

Insertion Sort vs. Selection Sort
Same worst-case and average-case complexity.
Insertion sort has the better best case, so it is preferable when the input is “almost sorted”; it is one of the best sorting algorithms for that case (and also for small arrays).

Sorting: The Big Picture
- Simple algorithms, O(n²): insertion sort, selection sort, …
- Fancier algorithms, O(n log n): heap sort, merge sort, quick sort (avg), …
- Comparison lower bound: Ω(n log n)
- Specialized algorithms, O(n): bucket sort, radix sort
- Handling huge data sets: external sorting

Heap Sort: Sort with a Binary Heap
buildHeap: O(n). Pull out the minimums one by one: O(n log n).
Can make it in-place by copying each deleted element to the end of the array (use a max heap).
Worst and average case runtime: O(n log n).
Is this really better than shell sort?

In-place heap sort
Treat the initial array as a heap (via buildHeap). When you delete the ith element, put it at arr[n-i]; it’s not part of the heap anymore!

  4 7 5 9 8 6 10 | 3 2 1
  heap part        sorted part

After arr[n-i] = deleteMin():

  5 7 6 9 8 10 | 4 3 2 1
  heap part      sorted part

But this leaves the list in reverse order. How would you fix this? Use a max heap instead of a min heap, or make a final pass to reverse the array (n/2 swaps).
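A runnable in-place heapsort sketch (illustrative names), taking the max-heap route suggested above so no final reversal is needed: buildHeap in O(n), then repeatedly swap the max to the end and shrink the heap.

```java
// In-place heapsort with a MAX heap (illustrative names).
public class HeapSortDemo {
    static void heapSort(int[] a) {
        int n = a.length;
        for (int i = n / 2 - 1; i >= 0; i--)      // buildHeap: O(n)
            percolateDown(a, i, n);
        for (int end = n - 1; end > 0; end--) {
            int tmp = a[0]; a[0] = a[end]; a[end] = tmp;  // deleteMax -> slot end
            percolateDown(a, 0, end);             // restore heap of size end
        }
    }

    static void percolateDown(int[] a, int i, int size) {
        while (2 * i + 1 < size) {
            int child = 2 * i + 1;                       // left child
            if (child + 1 < size && a[child + 1] > a[child])
                child++;                                 // pick larger child
            if (a[i] >= a[child]) break;
            int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
            i = child;
        }
    }
}
```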

AVL Sort
Insert the nodes into an AVL tree, then do an in-order traversal to extract them in sorted order.
Not array-based; good for later inserts (each in O(log n) time).
buildTree = O(n log n), traverse = O(n) [the opposite of heap sort!]
Worst, best, and average case runtime: O(n log n).

“Divide and Conquer”
Very important strategy in computer science:
- Divide problem into smaller parts
- Independently solve the parts
- Combine these solutions to get overall solution
Idea 1: Divide array in half, recursively sort left and right halves, then merge two halves → known as Mergesort
Idea 2: Partition array into small items and large items, then recursively sort the two sets → known as Quicksort

Mergesort
- Divide it in two at the midpoint
- Sort each half (recursively)
- Merge the two halves together

Example input: 8 2 9 4 5 3 1 6

Mergesort Example

Divide:             8 2 9 4 5 3 1 6
Divide:             8 2 9 4 | 5 3 1 6
Divide:             8 2 | 9 4 | 5 3 | 1 6
Divide (1 element): 8 | 2 | 9 | 4 | 5 | 3 | 1 | 6
Merge:              2 8 | 4 9 | 3 5 | 1 6
Merge:              2 4 8 9 | 1 3 5 6
Merge:              1 2 3 4 5 6 8 9

Merging: Two Pointer Method (or “Two *Finger* Method”)
Perform the merge using an auxiliary array.

  2 4 8 9 | 1 3 5 6   →   auxiliary array: (empty)

It takes constant time to decide which element goes in the output and then copy it. Thus, merging is O(n).

Merging: Two Pointer Method
Perform the merge using an auxiliary array.

  2 4 8 9 | 1 3 5 6   →   auxiliary array: 1

Merging: Two Pointer Method
Perform the merge using an auxiliary array.

  2 4 8 9 | 1 3 5 6   →   auxiliary array: 1 2 3 4 5

Merging: Finishing Up
Eventually one side runs out, with pointers i (left run), j (right run), and target (next output slot):
- If the left run finishes first, the remaining right elements are already in their final positions, so just copy the merged prefix back from the auxiliary array.
- If the right run finishes first, first shift the remaining left elements to the end of the array, then copy the merged prefix back.

Merging: Two Pointer Method
Final result in the auxiliary array: 1 2 3 4 5 6 8 9
Complexity? Stability?

Merging

Merge(A[], Temp[], left, mid, right) {
  int i, j, k, l, target
  i = left
  j = mid + 1
  target = left
  while (i <= mid && j <= right) {
    if (A[i] <= A[j])
      Temp[target] = A[i++]
    else
      Temp[target] = A[j++]
    target++
  }
  if (i > mid)                    // left completed: remaining right
    for (k = left to target-1)    // elements are already in place
      A[k] = Temp[k]
  if (j > right) {                // right completed: shift the rest of
    k = mid                       // the left run to the end, then copy
    l = right
    while (k >= i)
      A[l--] = A[k--]
    for (k = left to target-1)
      A[k] = Temp[k]
  }
}

Taking from the left run on ties (A[i] <= A[j]) ensures stability.

Recursive Mergesort

MainMergesort(A[1..n], n) {
  Array Temp[1..n]
  Mergesort(A, Temp, 1, n)
}

Mergesort(A[], Temp[], left, right) {
  if (left < right) {
    mid = (left + right)/2
    Mergesort(A, Temp, left, mid)
    Mergesort(A, Temp, mid+1, right)
    Merge(A, Temp, left, mid, right)
  }
}

What is the recurrence relation? Base case: T(1) = 1; otherwise T(n) = 2T(n/2) + n.
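A runnable Java translation of the recursive structure above (a sketch with 0-based indices and names of my choosing; it uses the simpler "copy both tails" merge rather than the slide's shifting trick, but the recurrence is the same).

```java
// Recursive mergesort, 0-based (illustrative names).
public class MergeSortDemo {
    static void mergeSort(int[] a) {
        mergeSort(a, new int[a.length], 0, a.length - 1);
    }

    static void mergeSort(int[] a, int[] temp, int left, int right) {
        if (left < right) {                       // base case: 1 element
            int mid = (left + right) / 2;
            mergeSort(a, temp, left, mid);        // T(n/2)
            mergeSort(a, temp, mid + 1, right);   // T(n/2)
            merge(a, temp, left, mid, right);     // + n
        }
    }

    // Two-pointer merge; <= takes from the left run on ties (stable).
    static void merge(int[] a, int[] temp, int left, int mid, int right) {
        int i = left, j = mid + 1, target = left;
        while (i <= mid && j <= right)
            temp[target++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i <= mid)   temp[target++] = a[i++];   // copy left tail
        while (j <= right) temp[target++] = a[j++];   // copy right tail
        for (int k = left; k <= right; k++) a[k] = temp[k];
    }
}
```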

Mergesort: Complexity
Expanding the recurrence (a “divide” term per level plus the “conquer” work at each level):
T(n) = 2T(n/2) + n = 4T(n/4) + 2n = … = 2^k T(n/2^k) + kn
We want n/2^k = 1, i.e., n = 2^k, so k = log n.
Therefore T(n) = n·T(1) + n log n = O(n log n).

Iterative Mergesort
Merge by 1, then merge by 2, then merge by 4, then merge by 8, …

Iterative Mergesort
Merge by 1, merge by 2, merge by 4, merge by 8, merge by 16, …, then copy back.
Iterative mergesort reduces copying. Complexity? Still O(n log n).
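A bottom-up sketch of iterative mergesort in Java (illustrative names): merge runs of width 1, 2, 4, 8, … while alternating the roles of source and destination buffers between passes, which is one way to realize the reduced copying mentioned above.

```java
// Bottom-up (iterative) mergesort with ping-pong buffers (illustrative names).
public class IterativeMergeSortDemo {
    static void mergeSort(int[] a) {
        int n = a.length;
        int[] src = a, dst = new int[n];
        for (int width = 1; width < n; width *= 2) {
            // One pass: merge adjacent runs of the current width.
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                int i = lo, j = mid, t = lo;
                while (i < mid && j < hi)
                    dst[t++] = (src[i] <= src[j]) ? src[i++] : src[j++];
                while (i < mid) dst[t++] = src[i++];
                while (j < hi)  dst[t++] = src[j++];
            }
            int[] swap = src; src = dst; dst = swap;  // swap roles, no copy
        }
        if (src != a) System.arraycopy(src, 0, a, 0, n);  // final copy if needed
    }
}
```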

Properties of Mergesort
In-place? No: it needs an O(n) auxiliary array.
Stable? Yes, as long as the merge takes from the left run on ties.
Sorted input complexity? Still O(n log n), but mergesort generally does fewer comparisons: when an array is sorted, each merge only checks half the elements, then copies the rest.
Nicely extends to handle linked lists.
Multi-way merge is the basis of big data sorting.
Java uses Mergesort on Collections and on Arrays of Objects.
Later, we get quicksort: more comparisons, but no extra space needed.