1/28 COP 3540 Data Structures with OOP Chapter 7 - Part 1 Advanced Sorting

2/28 Advanced Sorting
 • Two topics we will cover first:
 • Shell Sort – an O(n(log₂n)²) sort in general, and one that can approach O(n) performance!
 • Partitioning – the basis of Quicksort, which is an O(n log₂n) sort.
 • Then we'll cover the Quicksort itself.

3/28 Recall how the Insertion Sort worked
 • It took an element out of the array, assuming all elements to its left were sorted.
 • We marked this spot and extracted that element.
 • We then:
   • compared the extracted element with the elements to its left, and
   • inserted the element into its proper place,
   • shifting elements to the right as needed to make room for the inserted element and fill the vacated spot.
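
As a refresher, here is a minimal insertion sort sketch in Java, written in the style of the shellSort() listing that appears later; theArray and nElems come from that listing, while insertionSort itself is an assumed method name.

   public void insertionSort()
   {
      for (int outer = 1; outer < nElems; outer++)  // outer marks the spot
      {
         long temp = theArray[outer];               // extract the marked element
         int inner = outer;
         // shift sorted elements right until temp's proper place is found
         while (inner > 0 && theArray[inner-1] >= temp)
         {
            theArray[inner] = theArray[inner-1];
            inner--;
         }
         theArray[inner] = temp;                    // insert into the vacated spot
      }
   }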

4/28 Approach that helped us
 • We helped ourselves by starting with a single element on the left, so we knew that element was sorted – certainly sorted unto itself.
 • Then we proceeded: the sorted elements to the left of the marked element slowly grew in number as new values found their proper place in the left subarray, while the unsorted elements to the right diminished in number.

5/28 Potential Problems with the Insertion Sort
 • What happens if the new number to be sorted is very small (or very large) and our sort is ascending (or descending)?
 • This may require a large number of copies to the right to make room for the new element.
 • In fact, the number of copies can be close to n; the average number of copies is n/2.
 • For n elements to be sorted, at an average of n/2 copies per element, we have n × n/2 = n²/2 copies.
 • That can make for a very inefficient sort. This is why the insertion sort is an O(n²) sort: it is this number of copies (comparing and shifting) that degrades its performance.

6/28 Shell Sort Approach
 • We want to reduce this large number of shifts.
 • Shell sort does so by sorting a very small subset of numbers at a time – like three or four:
   • the numbers themselves might be large distances apart (as in a large array), and
   • it sorts them with respect to each other.
 • By sorting a small set of numbers, very small (or very large) values can be brought much more nearly 'in place' much more quickly than with other approaches.
 • How is this done?

7/28 Shell Sort uses the notion of a computed 'gap'
 • The Shell Sort uses a computed 'gap' between numbers, called 'h': the distance between the numbers in each subset to be sorted.
 1. It sorts all numbers in the array that share the same gap h – like numbers eight apart, or four apart – with respect to each other.
 2. Then the algorithm reduces the gap (or distance) to a smaller number, like maybe 4 apart, and sorts again.
 3. Ultimately the gap reaches size 1; the algorithm then '1-sorts' the array using the insertion sort.

8/28 Example
 • Consider sorting three elements at a time with respect to each other, where the numbers are some distance h apart.
 • For array size n = 10 and gap size h = 4, we have four sub-arrays (we call this a 4-sort):
 • Indices: (0,4,8), (1,5,9), (2,6) and (3,7). These sets are each sorted with respect to themselves. (Note: all ten elements participate!)
 • The sub-arrays are interleaved but, again, each is sorted with respect to itself.
 • (Note: the integers are not yet in their final spots.)
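
A tiny throwaway snippet (illustrative, assumed names only) makes the interleaving visible by printing which indices fall in each h-spaced sub-array:

   // print the interleaved sub-arrays for n = 10, h = 4
   int n = 10, h = 4;
   for (int start = 0; start < h; start++)
   {
      System.out.print("sub-array " + start + ":");
      for (int i = start; i < n; i += h)
         System.out.print(" " + i);    // prints 0 4 8, then 1 5 9, 2 6, 3 7
      System.out.println();
   }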

9/28 Consider Improved Performance!
 • Recall again how the Insertion Sort behaves:
   • very efficient for arrays that are nearly sorted (fewer swaps and less movement), and yet
   • very inefficient (due to shifts and copies) if the data are badly out of order – particularly true for very large / very small numbers.
 • Shell sort does 'h-sorting':
   • It capitalizes on the initial position of elements, especially those far from where they will ultimately end up.
   • It brings numbers more quickly to (or nearer to) their final position.
   • The algorithm moves elements that may be very far apart much closer to their final position quickly, thus reducing copying, shifting and swapping!
 • Shell Sort can approach O(n) performance: much better than O(n²)!

10/28 What about Larger Arrays? Gap Size?
 • We need a carefully researched algorithm to compute an optimum gap size.
 • Don Knuth developed a 'recursive' relationship: h = 3*h + 1 to grow the starting gap, and then subsequent gaps at h = (h-1)/3.
 • (Note the 'recursion' in the formula itself: it uses the current value of h to compute the new value of h.)
 • These h-values are referred to as the interval sequence or gap sequence, and they are recursively computed as functions of h.
 • In more detail:

11/28 Gap sizes
Don Knuth's algorithm will start with a 3-sort; that is, it sorts three numbers some distance apart. As Knuth's research reveals (the algorithm is a few slides ahead), for an array of size > 364 and < 1093, we 3-sort with a gap size of 364; after that sort, we use a gap size of 121; then gap size = 40; steadily decreasing.
The initial gap size is developed recursively by computing h: starting from h = 1, compute h = h*3 + 1 until h <= nElems/3 is false. Computing h this way, we see that h increases from 1 to 4 to 13 to 40 to 121 to 364 to …
Once the original gap is determined, the sort proceeds and the algorithm steadily reduces the gap h, from 364 down until h = 1. So for array size > 364 and < 1093, the starting gap = 364, etc.
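
A minimal stand-alone sketch of this gap computation (local names assumed) for a 1000-element array:

   // grow Knuth's gap for nElems = 1000, then shrink it pass by pass
   int nElems = 1000;
   int h = 1;
   while (h <= nElems / 3)             // grows h: 1, 4, 13, 40, 121, 364
      h = h * 3 + 1;                   // stops at 364, the first value > 1000/3
   while (h > 0)
   {
      System.out.println("gap = " + h);   // prints 364, 121, 40, 13, 4, 1
      // ... h-sort the array here ...
      h = (h - 1) / 3;                 // reduce the gap for the next pass
   }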

12/28 Algorithm (covered in previous slide)
 • The algorithm first uses a short loop to generate the first (initial) value of h.
 • Then, once we have an initial value of h:
   • the successive values of h are recursively computed, and the initial value depends on the size of the array to be sorted.
 • The gap then starts with the largest h-value.
 • For a 1000-element array, our initial gap size is 364.
 • After each sorting pass, we successively decrease the gap using the formula h = (h-1)/3, as shown.

13/28 Note
 1. As it turns out, for a given gap the algorithm actually sorts the first two elements of each group first; then it goes back and sorts all of the three-element groups. This results in better performance time.
 • You will see this if you look carefully at the algorithm.

14/28
public void shellSort()
{
   int inner, outer;
   long temp;

   int h = 1;                          // find initial value of h
   while (h <= nElems/3)               // COMPUTE GAP SIZE
      h = h*3 + 1;                     // (1, 4, 13, 40, 121, 364, ...)
   // The initial value of h depends on the original size of the array, nElems:
   // grow h until it first exceeds nElems/3.

   while (h > 0)                       // for a 1000-element array, h = 364
   {
      for (outer = h; outer < nElems; outer++)   // h-sort the structure
      {  // for 1000 elements: h = 364; outer < nElems (1000); increment by one
         temp = theArray[outer];
         inner = outer;
         while (inner > h-1 && theArray[inner-h] >= temp)
         {
            theArray[inner] = theArray[inner-h];
            inner -= h;
         }  // end while
         theArray[inner] = temp;
      }  // end for
      h = (h-1) / 3;                   // computes new gap: decreases h
   }  // end while (h > 0)
}  // end shellSort()
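
For context, a minimal hypothetical harness (the class name ArraySh and its helper methods are assumptions, not the book's exact listing) showing where shellSort() would live and how it might be called:

   class ArraySh
   {
      private long[] theArray;         // holds the data
      private int nElems;              // number of items currently stored

      public ArraySh(int max) { theArray = new long[max]; nElems = 0; }
      public void insert(long value) { theArray[nElems++] = value; }
      public void display()
      {
         for (int j = 0; j < nElems; j++)
            System.out.print(theArray[j] + " ");
         System.out.println();
      }

      // the shellSort() method from the listing above goes here
   }

A typical use: create the array wrapper, insert values, sort, display:

   ArraySh arr = new ArraySh(100);
   arr.insert(20); arr.insert(10); arr.insert(50);
   arr.shellSort();
   arr.display();                      // prints: 10 20 50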

15/28 Google: Shell Sort Applet
 • Google: applet Lafore
 • You will get a number of applet choices.
 • Select one and enjoy.

16/28 Demo of the Shell Sort
 • Run n = 12 and notice how the gap varies across the bars.
 • You can see when h goes from 4 to 1.
 • You can see when it compares two in the interval … then three; then 1-sorts.
 • Run the 100-element sort.
 • It starts with h = 40. See how it compares two of the three in each interval until only intervals of two are left.
 • There is a larger number of intervals when it goes to h = 13.
 • Go to h = 4 and see more intervals yet.
 • Finally, h = 1.
 • Do this.

17/28 Shell Sort – Evaluation
 • Good for medium-sized arrays, up to a few thousand items.
 • Shell Sort, at O(n(log₂n)²), is not as fast as the Quicksort's O(n log₂n) (coming soon).
 • Not so good for large files, but:
   • easy to implement, and
   • requires very little extra space.
 • All sorts have a 'worst case' performance. For the Shell Sort, the worst case is not much worse than its average performance, so this is good!
 • (The worst case is very different from the average case in a Quicksort.)

18/28 Final Remarks on the Shell Sort
 • Other gap sequences are available; many alternatives exist. You can experiment…
 • Ultimately, the sequence needs to end with a 1, which forces the last pass to be an insertion sort.
 • Guideline: gaps should be relatively prime.
   • Note that the numbers presented for the Shell Sort are not all prime (4, 40…); such shared factors led to some inefficiencies in earlier sequences.
 • Experiments on the Shell Sort yield performance mostly between O(n^(3/2)) and O(n^(7/6)),
   • or from almost O(n²) down to almost O(n)!
 • Quite a difference, and the difference is realized as n increases, which makes sense.

19/28 Partitioning

20/28 Partitioning
 • Partitioning is key to Quicksort thinking.
 • Partitioning divides data into two groups depending on the value of a key.
 • E.g., divide students into two groups: gpa < 3.0 and gpa >= 3.0. (Incidentally, why is a gpa of 3.0 important??)
 • We select a Pivot Value: the value used to separate the data items into two groups:
   • we end up with data < pivot value on one side and data >= pivot value on the other.

21/28 Pivot Values
 • Note: the pivot point can be any key value; it need not be a midpoint or a value 'half-way'.
 • It would be nice if the pivot were the half-way point, but we have no way of knowing…
 • Later we will see how the choice of the pivot impacts performance!
 • The pivot value is used to separate the array into a left side and a right side.
 • Ideally, we'd 'like' the sub-arrays to be roughly the same size, and we will work toward that reality.

22/28 Run the Partition Algorithm to Build the Sub-Arrays
 • Once the pivot value is selected, we run the partition algorithm.
 • Once run:
   • data less than the pivot 'belongs' to the left side of the array (whatever number of elements may be on the left), and
   • data greater than or equal to (>=) the pivot value belongs to the right side, however many elements are on the right side.
 • Note: once partitioning is run, the data is NOT sorted,
   • but the items are a lot 'closer' to their final positions…
   • and the array is partitioned based on the pivot value.

23/28 The Partitioning Algorithm
 • Pick a pivot value… (more later).
 • Start with an index at the left end of the partition; let's call it the left scan.
   • Move toward the right, comparing each element to the pivot value.
   • If an element is less than the pivot value, leave it alone and move right.
   • Advance to the right until an element is >= the pivot value, and then stop.
 • Start with an index at the right-most position on the right side; let's call it the right scan.
   • Move toward the left, comparing each element to the pivot value.
   • If an element is >= the pivot value, leave it alone and move left.
   • Advance to the left until an element is < the pivot value, and then stop.
 • Swap the two values.
 • Iterate (back on the left, then the right) until the left and right scans are looking at the same entry. (A sketch follows.)
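
A minimal sketch of this scan-and-swap loop, in the style of the shellSort() listing (theArray comes from that listing; the method name partitionIt and its exact signature are assumptions):

   // partition theArray[left..right] around pivot;
   // returns the index where the right partition begins
   public int partitionIt(int left, int right, long pivot)
   {
      int leftPtr  = left - 1;         // just before the left end
      int rightPtr = right + 1;        // just past the right end
      while (true)
      {
         while (leftPtr < right && theArray[++leftPtr] < pivot)
            ;                          // left scan: stop at an item >= pivot
         while (rightPtr > left && theArray[--rightPtr] >= pivot)
            ;                          // right scan: stop at an item < pivot
         if (leftPtr >= rightPtr)      // the scans have met:
            break;                     // partitioning is complete
         long temp = theArray[leftPtr];   // otherwise swap the out-of-place pair
         theArray[leftPtr] = theArray[rightPtr];
         theArray[rightPtr] = temp;
      }
      return leftPtr;                  // first element of the right partition
   }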

24/28
 • Let's look at the applet.

25/28 Partition.html
 • Google: applet Lafore
 • Run with n = 12 with various orderings…
 • Run with n = 40. Notice the partition first and then the final ordering…
 • Note: running the partitioning algorithm does not totally sort the data – but it brings them a good bit closer.

26/28 Partitioning and the Pivot Value
 • Note that partitioning is not stable.
   • As elements on one side are moved to the other side of the pivot value, they are NOT necessarily in the same relative positions in the 'new' partition!
   • In fact, they tend to be in reverse order.
 • Further, the number of elements on each side need not be the same either – it depends on the pivot value.
   • Very likely, there is NOT the same number of elements on each side of the pivot.

27/28 One (of Several) Problems with Partitioning
 1. What if a poor pivot value were chosen, such that all of the elements were < the pivot value?
   • The algorithm's left-scan index would keep advancing past the end of the data,
   • and we would end up with an array-index-out-of-bounds exception.
   • Ditto the other way for the right scan. See the code below: the extra test on leftPtr guards against this.

 while (leftPtr < right && theArray[++leftPtr] < pivot)
    ;  // nop

 • Clearly, as in any program that is to be robust, there must be such checks alongside the pivot-value comparisons.

28/28 Efficiency of the Partition
 • The algorithm is pretty efficient, too: it runs in O(n) time.
 • The pointers move in from opposite ends, advancing and swapping at a constant rate.
 • If n were doubled to 2n, the algorithm would take roughly twice as long.
 • Thus the algorithm operates in O(n) time – meaning the time is proportional to the number of items being partitioned.

29/28 Efficiency of the Partitioning Algorithm
 • Non-random data yields terrible results!
   • If the data is inversely ordered, then every pair will be swapped: n/2 swaps, the maximum possible. Very inefficient!
   • (And if a full sort had to repeat such a partitioning pass n times, as can happen with poor pivots, that would be n × n/2 = n²/2 swaps. Poor!)
 • Random data yields fewer than n/2 swaps.
   • Some elements will already be on the correct side of the pivot.
   • On average, for random data, about half of the maximum number of swaps will take place.
 • Regardless of random / non-random data, both situations result in an efficiency proportional to n.
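
A self-contained sketch (hypothetical, not from the slides) that counts the swaps a single partitioning pass makes over inversely ordered data; for n = 10 it reports 4 swaps, close to the n/2 maximum discussed above:

   public class PartitionSwapCount
   {
      public static void main(String[] args)
      {
         int n = 10;
         long[] a = new long[n];
         for (int i = 0; i < n; i++)
            a[i] = n - i;              // inversely ordered: 10, 9, ..., 1
         long pivot = (n + 1) / 2;     // a middle value as the pivot
         int left = -1, right = n, swaps = 0;
         while (true)
         {
            while (left < n - 1 && a[++left] < pivot)
               ;                       // left scan
            while (right > 0 && a[--right] >= pivot)
               ;                       // right scan
            if (left >= right)
               break;
            long t = a[left]; a[left] = a[right]; a[right] = t;
            swaps++;                   // count each out-of-place pair swapped
         }
         System.out.println("swaps = " + swaps);   // prints: swaps = 4
      }
   }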