TDDB56 DALGOPT-D – Lecture 8 – Sorting (part I), Jan Maluszynski, HT 2005


Content
Lecture 8: Sorting part I
- Intro: aspects of sorting, different strategies
- Insertion Sort, Selection Sort, Quick Sort
Lecture 9: Sorting part II
- Heap Sort, Merge Sort (Vilhelm Dahllöf)
- A movie "Sort out Sorting": survey of 9 comparison-based sorting algorithms (Bengt Werstén)
Lecture 10: Sorting part III and Selection
- Theoretical lower bound for comparison-based sorting
- BucketSort, RadixSort
- Selection, median finding, quick select

The Sorting Problem
Input: a list L of data items with keys (the key is the part of each data item we base our sorting on).
Output: a list L' of the same data items, arranged in order of their keys.
Caution: don't overuse sorting! Do you really need the data sorted, or will a dictionary do fine instead of a sorted array?

Aspects of Sorting
Internal vs. external sorting
- Can the data be kept in fast, random-access internal memory, or...
Sorting in-place vs. sorting with auxiliary data structures
- Does the sorting algorithm need extra data structures?
- Often the stack is used as a "hidden" extra structure!
Worst-case vs. expected-case performance
- How does the algorithm behave in different situations?

Aspects of Sorting (cont.)
What is the "expected case"?
- In some applications we may never have a really bad worst case: pick the algorithm accordingly!
Sorting by comparison vs. sorting digitally
- Compare keys, or use e.g. the binary representation of the data as the sorting criterion?
Stable vs. unstable sorting
- What happens with multiple occurrences of the same key?
"Quick'n'Dirty" vs. efficient but hard-to-remember...
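The question of what happens to multiple occurrences of the same key can be made concrete with a small sketch. It relies on Python's built-in `sorted`, which is documented to be stable; the `records` list and its labels are made up for illustration.

```python
# Records are (key, label) pairs; two records share key 1 and two share key 2.
records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]

# A stable sort keeps records with equal keys in their original relative order.
by_key = sorted(records, key=lambda rec: rec[0])
print(by_key)  # [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

An unstable algorithm would be free to output "d" before "b", or "c" before "a".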

Different strategies used when sorting...
Insertion sorts: for each new element to add to the sorted set, look for the right place in that set to put the element.
- Linear insertion, Binary insertion, Shell sort, ...
Selection sorts: in each iteration, search the unsorted set for the smallest (largest) remaining item to add to the end of the sorted set.
- Straight selection, Tree selection (1), Heap sort, ...
Exchange sorts: browse back and forth in some pattern, and whenever we are looking at a pair in the wrong relative order, swap them.
- Bubble sort, Shaker sort, Quick sort, Merge sort
(1) Requires extra data structures apart from the stack

(Linear) insertion sort
"In each iteration, insert the first item from the unsorted part into its proper place in the sorted part."
An in-place sorting algorithm!
Data stored in A[0..n-1]
Iterate i from 1 to n-1. The table consists of:
- Sorted data in A[0..i-1]
- Unsorted data in A[i..n-1]
Scan the sorted part for the index s at which to insert the selected item
Increase i

Analysis of Insertion Sort
Procedure InsertionSort (table A[0..n-1]):
1 for i from 1 to n-1 do
2     j ← i; x ← A[i]
3     while j ≥ 1 and A[j-1] > x do
4         A[j] ← A[j-1]; j ← j-1
5     A[j] ← x
t1: n-1 passes over this "constant speed" code
t2: n-1 passes
t3: I = worst-case no. of iterations in inner loop: I = 1 + 2 + ... + (n-1) = n(n-1)/2
t4: I passes
t5: n-1 passes
T = t1 + t2 + t3 + t4 + t5 = 3(n-1) + 2 * n(n-1)/2 = 3n - 3 + n^2 - n = n^2 + 2n - 3
...thus we have an algorithm in O(n^2) in the worst case, but it is good if the file is almost sorted.
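The pseudocode can be sketched in Python as follows. The shift counter is an addition for illustration only (it is not part of the slide's procedure); it makes the gap between reverse-sorted and almost sorted input visible.

```python
def insertion_sort(a):
    """Sort list a in place; return the number of element shifts performed."""
    shifts = 0
    for i in range(1, len(a)):
        j, x = i, a[i]
        # Shift larger elements of the sorted part one step to the right.
        while j >= 1 and a[j - 1] > x:
            a[j] = a[j - 1]
            j -= 1
            shifts += 1
        a[j] = x
    return shifts


n = 6
worst = list(range(n, 0, -1))        # reverse-sorted input
print(insertion_sort(worst))         # 15, i.e. 1 + 2 + ... + (n-1)
print(insertion_sort([1, 2, 4, 3, 5, 6]))  # 1: only one pair is out of order
```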

(Straight) selection sort
"In each iteration, search the unsorted set for the smallest remaining item to add to the end of the sorted set."
An in-place sorting algorithm!
Data stored in A[0..n-1]
Iterate i from 0 to n-2. The table consists of:
- Sorted data in A[0..i-1]
- Unsorted data in A[i..n-1]
Scan the unsorted part for the index s of the smallest remaining item
Swap A[i] and A[s]

Analysis of Straight selection
Procedure StraightSelection (table A[0..n-1]):
1 for i from 0 to n-2 do
2     s ← i
3     for j from i+1 to n-1 do
4         if A[j] < A[s] then s ← j
5     A[i] ↔ A[s]
t1: n-1 passes over this "constant speed" code
t2: n-1 passes
t3: I = no. of iterations in inner loop: I = (n-1) + (n-2) + ... + 1 = n(n-1)/2
t4: I passes
t5: n-1 passes
T = t1 + t2 + t3 + t4 + t5 = 3(n-1) + 2 * n(n-1)/2 = 3n - 3 + n^2 - n = n^2 + 2n - 3
...thus we have an algorithm in O(n^2)... rather bad!
Is this analysis good enough? Can we compare algorithms of similar order?
How expensive is...
- Index comparison
- Data comparison
- Data copying
...are they comparable or quite different?
Worst case? Best case? Expected case?

Analysis of Straight selection: details?
We may redo the analysis and differentiate between
- Cheap operations, such as assignment and comparison of indices or pointers
- Different levels of "expensive" operations, such as procedure calls, comparison of data, and copying of data
...and we may then find two O(x) algorithms to be quite different...
Procedure StraightSelection (table A[0..n-1]):
1 for i from 0 to n-2 do
2     s ← i
3     for j from i+1 to n-1 do
4         if A[j] < A[s] then s ← j
5     A[i] ↔ A[s]
Is this an expensive op? It is always executed if the data is in reverse order ("worst case")!
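One way to separate the operation classes is to count them individually. The sketch below is an illustrative instrumentation of straight selection, not part of the slides: it counts data comparisons and swaps separately, and the counter names are made up.

```python
def straight_selection(a):
    """Sort list a in place; return (data comparisons, swaps)."""
    comparisons = swaps = 0
    n = len(a)
    for i in range(n - 1):
        s = i
        for j in range(i + 1, n):
            comparisons += 1          # one data comparison per inner iteration
            if a[j] < a[s]:
                s = j                 # cheap: index assignment only
        a[i], a[s] = a[s], a[i]       # one swap (data copying) per outer pass
        swaps += 1
    return comparisons, swaps


print(straight_selection([3, 1, 2]))              # (3, 2)
print(straight_selection([6, 5, 4, 3, 2, 1]))     # (15, 5)
```

The comparison count is n(n-1)/2 for every input, while data is copied in only n-1 swaps; insertion sort, also O(n^2), can copy data a quadratic number of times.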

Quick Sort: overview
1. divide-and-conquer principle
2. example, basic ideas
3. quick sort algorithm, top-down
4. examples: worst and best case
5. randomization principle
6. randomized quick sort
7. fine tuning: make it faster!

Divide-and-conquer principle
1. divide a problem into smaller, independent sub-problems
2. conquer: solve the sub-problems recursively (or directly if trivial)
3. combine the solutions of the sub-problems

Quick Sort Example: basic idea
Procedure QuickSort (table A[l : r]):
1. If l ≥ r, return.
2. Select some element of A, e.g. A[l], as the so-called pivot element: p ← A[l];
3. Partition A in-place into two disjoint sub-arrays A_L, A_R: m ← partition(A[l : r], p);
   { determines m, l ≤ m < r, and reorders A[l : r] such that all elements in A[l : m] are now ≤ p and all in A[m+1 : r] are now ≥ p }
4. Apply the algorithm recursively to A_L and A_R:
   quicksort(A[l : m]); { sorts A_L }
   quicksort(A[m+1 : r]); { sorts A_R }

Quick sort: Partitioning an array in-place
int partition (array A[l : r], key p)
    { the pivot element p is A[l] }
    i ← l-1; j ← r+1;
    while (true) do
        do i ← i+1 while A[i] < p
        do j ← j-1 while A[j] > p
        if (i < j) A[i] ↔ A[j]
        else return j;
This code scans through the entire set once and moves each element at most once!
...thus: running time of partition: Θ(r - l + 1)
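A minimal Python sketch of this partitioning scheme together with the recursive driver, assuming inclusive index bounds l and r as on the slides. The default arguments of `quicksort` are a convenience added here.

```python
def partition(a, l, r):
    """Partition a[l..r] (inclusive) around the pivot p = a[l].

    Returns m with l <= m < r such that every element of a[l..m] is <= p
    and every element of a[m+1..r] is >= p. Requires l < r.
    """
    p = a[l]
    i, j = l - 1, r + 1
    while True:
        i += 1
        while a[i] < p:      # scan right for an element >= p
            i += 1
        j -= 1
        while a[j] > p:      # scan left for an element <= p
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort(a, l=0, r=None):
    """Sort a[l..r] in place; both halves include the pivot's old region."""
    if r is None:
        r = len(a) - 1
    if l >= r:
        return
    m = partition(a, l, r)
    quicksort(a, l, m)
    quicksort(a, m + 1, r)


data = [5, 3, 8, 1, 9, 2, 5]
quicksort(data)
print(data)  # [1, 2, 3, 5, 5, 8, 9]
```

Because the returned index satisfies l <= m < r, both recursive calls work on strictly smaller ranges, so the recursion terminates even when many keys are equal.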

Warning: details matter!
Book: rightmost element as pivot, swaps it in at the end, recurses on either side excluding the old pivot
Film: leftmost element as pivot, swaps it in at the end, recurses on either side excluding the old pivot
Slides: leftmost element as pivot, includes it in the area to partition, returns one position containing an element of size equal to the pivot, and recurses on both halves including the pivot
...and the way the i's and j's are compared (< or ≤), whether they are incremented (decremented) before or after the comparison, etc...

Quick Sort: Analysis
Run time as a recursive expression. We implicitly build a search tree!
What is the worst case? The expression is reformulated to select a "worst" partition!

Quick Sort: Analysis, worst case...
If the pivot element happens to be the min or max element of A in each call to quicksort (e.g., pre-sorted data):
- Unbalanced recursion tree
- Recursion depth becomes n
Sorting nearly-sorted arrays occurs frequently in practice.

Quick Sort: Analysis, best case...
Best: a balanced search tree! How...? If the pivot is the median of the data set!
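The two extremes can be written as recurrences. This is a sketch under the assumption that partitioning n elements costs cn for some constant c (consistent with the linear-time partition) and that T(1) is constant; the symbols T and c are introduced here for illustration.

```latex
\begin{align*}
\text{Worst case (pivot is min or max):}\quad
  T(n) &= T(n-1) + T(1) + cn
        \;\Rightarrow\; T(n) = \Theta(n^2) \\
\text{Best case (pivot is the median):}\quad
  T(n) &= 2\,T(n/2) + cn
        \;\Rightarrow\; T(n) = \Theta(n \log n)
\end{align*}
```

In the worst case the partition costs sum to c(n + (n-1) + ... + 2), which is quadratic; in the best case there are about log n levels of recursion, each costing cn in total.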

Quick sort: apply randomization
Randomization is an algorithmic design principle
- applicable where we choose among several alternative directions
- used to avoid long sequences of bad decisions with high probability, independently of the input
- simplifies the average-case analysis
Select the pivot randomly (not first, not last...): p ← A[random(l, r)];
→ running time no longer depends on good input data
→ cannot construct bad input data...
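A sketch of randomized quick sort in Python. To reuse the leftmost-pivot partitioning scheme, the randomly chosen element is first swapped to position l; that swap is one possible way to plug in the random choice and is an assumption of this sketch, not something the slides prescribe.

```python
import random


def randomized_quicksort(a, l=0, r=None):
    """Sort a[l..r] (inclusive) in place using a uniformly random pivot."""
    if r is None:
        r = len(a) - 1
    if l >= r:
        return
    k = random.randint(l, r)          # randint is inclusive at both ends
    a[l], a[k] = a[k], a[l]           # move the random pivot to the front
    p = a[l]
    i, j = l - 1, r + 1
    while True:
        i += 1
        while a[i] < p:
            i += 1
        j -= 1
        while a[j] > p:
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    randomized_quicksort(a, l, j)
    randomized_quicksort(a, j + 1, r)


data = list(range(200))               # pre-sorted: bad for a fixed first-element pivot
randomized_quicksort(data)
print(data == list(range(200)))       # True
```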

Quick sort: fine tuning. Median-of-three and sentinels...
The inner loop of the partition function should be:
    i ← i+1; while i ≤ r and A[i] < p do i ← i+1;
    j ← j-1; while j ≥ l and A[j] > p do j ← j-1;
Improve:
- Sort the first, middle and last elements of A
- Use the content of the middle element as the pivot value
- The data set now has sentinels at the ends (values selected to stop the iteration) and we may remove the extra tests i ≤ r and j ≥ l
- The probability of the middle value being a bad pivot is low!
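A sketch of the median-of-three idea in Python. The function name and the direct handling of ranges with fewer than three elements are choices made for this illustration. After the three sampled elements are ordered, a[l] <= p <= a[r], so the scans start just inside the ends and need no bounds tests.

```python
def median3_quicksort(a, l=0, r=None):
    """Sort a[l..r] (inclusive) in place with a median-of-three pivot."""
    if r is None:
        r = len(a) - 1
    if r - l < 2:                     # at most two elements: handle directly
        if r > l and a[l] > a[r]:
            a[l], a[r] = a[r], a[l]
        return
    mid = (l + r) // 2
    # Order the three samples so that a[l] <= a[mid] <= a[r].
    if a[mid] < a[l]:
        a[l], a[mid] = a[mid], a[l]
    if a[r] < a[l]:
        a[l], a[r] = a[r], a[l]
    if a[r] < a[mid]:
        a[mid], a[r] = a[r], a[mid]
    p = a[mid]                        # pivot value: median of the three
    i, j = l, r                       # a[l] and a[r] are already in place
    while True:
        i += 1
        while a[i] < p:               # stopped by a value >= p, at the latest a[r]
            i += 1
        j -= 1
        while a[j] > p:               # stopped by a value <= p, at the latest a[l]
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    median3_quicksort(a, l, j)
    median3_quicksort(a, j + 1, r)


data = [9, 4, 7, 4, 1, 8, 2, 2, 6, 0, 5, 3]
median3_quicksort(data)
print(data)  # [0, 1, 2, 2, 3, 4, 4, 5, 6, 7, 8, 9]
```

For pre-sorted input the middle element is the true median, so the case that is worst for a leftmost pivot becomes a balanced split here.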

Quick sort: fine tuning. Reduce the need for extra space: upper bound on the stack!
Observations:
- A large partition leads to a large stack depth
- It does not matter in which order we perform the recursive calls (left part of A before right part, or vice versa)
Enhancements:
- Replace the last (tail-)recursive call with iteration (reusing the same procedure call), leave the first recursive call as is
- Select the larger part of A for the repeated iteration, and use recursion for the smaller part of A
...and we have the worst maximum stack depth when we have a balanced search tree, i.e. O(log n).
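The two enhancements can be sketched in Python as below. The `depth` parameter and the returned maximum depth exist only to make the stack bound observable; they are illustrative additions, as are the function names.

```python
def partition(a, l, r):
    """Partition a[l..r] around p = a[l]; return m with l <= m < r."""
    p = a[l]
    i, j = l - 1, r + 1
    while True:
        i += 1
        while a[i] < p:
            i += 1
        j -= 1
        while a[j] > p:
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort_small_first(a, l=0, r=None, depth=1):
    """Sort a[l..r] in place; return the deepest recursion level reached."""
    if r is None:
        r = len(a) - 1
    deepest = depth
    while l < r:
        m = partition(a, l, r)
        if m - l + 1 <= r - m:
            # Left part is the smaller one: recurse on it, iterate on the right.
            deepest = max(deepest, quicksort_small_first(a, l, m, depth + 1))
            l = m + 1
        else:
            # Right part is the smaller one: recurse on it, iterate on the left.
            deepest = max(deepest, quicksort_small_first(a, m + 1, r, depth + 1))
            r = m
    return deepest


data = list(range(1000))              # pre-sorted: depth n with naive recursion
print(quicksort_small_first(data))    # stays far below 1000
```

Each recursive call receives at most half of the current range, so the depth is bounded by about log2(n) + 1 regardless of how unbalanced the partitions are.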

Quick sort: fine tuning. When only a few elements remain (e.g., |A| < 4)...
- Overhead for recursion becomes significant
- The entire A is almost sorted (except for small, locally unsorted sections)
- Stop sorting by QuickSort and perform one global sort using Linear InsertionSort: although O(n^2) in the worst case, it is much better on almost sorted data, which is the case now!
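A sketch of this cutoff strategy in Python. The threshold `CUTOFF = 4` follows the slide's example value; the function names are made up for illustration.

```python
CUTOFF = 4  # ranges with fewer elements than this are left to insertion sort


def coarse_quicksort(a, l, r):
    """Quick sort a[l..r] but leave ranges shorter than CUTOFF unsorted."""
    if r - l + 1 < CUTOFF:
        return
    p = a[l]
    i, j = l - 1, r + 1
    while True:
        i += 1
        while a[i] < p:
            i += 1
        j -= 1
        while a[j] > p:
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    coarse_quicksort(a, l, j)
    coarse_quicksort(a, j + 1, r)


def insertion_sort(a):
    """One global pass of linear insertion sort over the whole list."""
    for i in range(1, len(a)):
        j, x = i, a[i]
        while j >= 1 and a[j - 1] > x:
            a[j] = a[j - 1]
            j -= 1
        a[j] = x


def hybrid_sort(a):
    coarse_quicksort(a, 0, len(a) - 1)   # leaves small, locally unsorted runs
    insertion_sort(a)                    # each element moves only a few places


data = [8, 3, 9, 1, 7, 2, 6, 0, 5, 4]
hybrid_sort(data)
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

After the coarse phase every element already lies inside a block of fewer than CUTOFF positions that contains its final place, so the inner loop of the insertion pass runs only a bounded number of times per element.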

Straight Insertion: the good case?
If the table is almost sorted? E.g., at most 3 items unsorted, and the remainder are bigger?
Procedure InsertionSort (table A[0..n-1]):
1 for i from 1 to n-1 do
2     j ← i; tmp ← A[i]
3     while j > 0 and tmp < A[j-1] do
4         j ← j-1
5         A[j+1] ← A[j]
6     A[j] ← tmp
t1: n-1 passes over this "constant speed" code
t2: n-1 passes
t3: I = no. of iterations in inner loop (max 3 elements "totally unsorted"): I = 3(n-1) in the worst case, all three always in reverse order
t4, t5: I passes each
t6: n-1 passes
T = t1 + t2 + t3 + t4 + t5 + t6 = 3(n-1) + 3 * 3(n-1) = 12n - 12
...thus we have an algorithm in O(n)... rather good!