Presentation on theme: "Sorting HKOI Training Team (Advanced) 2006-01-21."— Presentation transcript:

1 Sorting HKOI Training Team (Advanced) 2006-01-21

2 What is sorting? Given: A list of n elements: A1, A2, …, An Re-arrange the elements to make them follow a particular order, e.g. Ascending Order: A1 ≤ A2 ≤ … ≤ An Descending Order: A1 ≥ A2 ≥ … ≥ An We will talk about sorting in ascending order only

3 Why is sorting needed? Some algorithms work only when the data is sorted, e.g. binary search Better presentation of data Often required by problem setters, to reduce workload in judging

4 Why learn Sorting Algorithms? C++ STL already provides a sort() function Unfortunately, there is no such implementation for Pascal  This is a minor point, though

5 Why learn Sorting Algorithms? Most importantly, OI problems do not directly ask for sorting, but their solutions may be closely linked with sorting algorithms In most cases, C++ STL sort() alone is not enough: you still need to write your own “sort” So it is important to understand the idea behind each algorithm, and also their strengths and weaknesses

6 Some Sorting Algorithms… Bubble Sort Insertion Sort Selection Sort Shell Sort Heap Sort Merge Sort Quick Sort Counting Sort Radix Sort How many of them do you know?

7 Bubble, Insertion, Selection… Simple, in terms of both idea and implementation Unfortunately, they are inefficient: O(n²) – not good if N is large The algorithms being taught today are far more efficient than these

8 Shell Sort Named after its inventor, Donald Shell Observation: Insertion Sort is very efficient when n is small, or when the list is almost sorted

9 Shell Sort Divide the list into k non-contiguous segments Elements in each segment are k elements apart In the beginning, choose a large k so that all segments contain only a few elements (e.g. k = n/2) Sort each segment with Insertion Sort

10 Shell Sort Definition: A list is said to be “k-sorted” when A[i] ≤ A[i+k] for 1 ≤ i ≤ n−k Now the list is 5-sorted

11–16 Shell Sort After each pass, reduce k (e.g. halve it) Although the number of elements in each segment increases, the segments are usually mostly sorted Sort each segment with Insertion Sort again

17 Shell Sort Finally, k is reduced to 1 The list now looks mostly sorted Perform a 1-sort, i.e. the ordinary Insertion Sort

18 Shell Sort – Worse than Ins. Sort? In Shell Sort, we still have to perform an Insertion Sort at the end A lot of operations are done before that final Insertion Sort Isn’t it worse than Insertion Sort alone?

19 Shell Sort – Worse than Ins. Sort? The final Insertion Sort is more efficient than before All sorting operations before the final one are done efficiently: k-sorts compare far-apart elements Elements “move” faster, reducing the total amount of movement and comparison

20 Shell Sort – Increment Sequence In our example, k starts at n/2 and halves in each pass until it reaches 1, i.e. {n/2, n/4, n/8, …, 1} This is called the “Shell sequence” In a good increment sequence, all numbers should be relatively prime to each other Hibbard’s Sequence: {2^m − 1, 2^(m−1) − 1, …, 7, 3, 1}

21 Shell Sort – Analysis Average complexity: O(n^1.5) Worst case of Shell Sort with the Shell sequence: O(n²) When will it happen?

22 Heap Sort In Selection Sort, we scan the entire list to search for the maximum, which takes O(n) time Is there a better way to get the maximum? With the help of a heap, we may reduce the search time to O(lg n)

23 Heap Sort – Build Heap 1. Build a heap from the list

24 Heap Sort 2. Pick the maximum, then restore the heap property

25–29 Heap Sort 3. Repeat step 2 until the heap is empty

30 Heap Sort – Analysis Complexity: O(n lg n) Not a stable sort Difficult to implement

31 Merging Given two sorted lists, merge them to form a new sorted list A naïve approach: append the second list to the first, then sort the whole thing Slow: takes O(n lg n) time Is there a better way?

32 Merging We make use of a property of sorted lists: the first element is always the minimum What does that imply? An additional array is needed to store the temporary merged list Pick the smallest of the un-inserted numbers and append it to the merged list

33 Merging Example: List A = 1, 3, 7, 9; List B = 2, 3, 6; merged step by step into a temporary array

34 Merge Sort Merge Sort follows the divide-and-conquer approach Divide: Divide the n-element sequence into two (n/2)-element subsequences Conquer: Sort the two subsequences recursively Combine: Merge the two sorted subsequences to produce the answer

35 Merge Sort 1. Divide the list into two 2. Call Merge Sort recursively to sort the two subsequences (e.g. 2 8 5 7 1 4 is divided into 2 8 5 and 7 1 4)

36 3. Merge the two sorted halves (into a temporary array) 4. Move the elements back to the list

37 Merge Sort – Analysis Complexity: O(n lg n) Stable Sort What is a stable sort? Not an “In-place” sort i.e. Additional memory required Easy to implement, no knowledge of other data structures needed

38 Stable Sort What is a stable sort? Is it: The name of a sorting algorithm? A sorting algorithm with stable performance over all distributions of elements, i.e. Best ≈ Average ≈ Worst? Or a sorting algorithm that preserves the original order of duplicated keys?

39 Stable Sort Original list: 1, 3a, 5, 3b, 4, 2 Stable sort: 1, 2, 3a, 3b, 4, 5 Un-stable sort: 1, 2, 3b, 3a, 4, 5

40 Stable Sort Which sorting algorithms is/are stable? Stable: Bubble Sort, Insertion Sort, Merge Sort Un-stable: Selection Sort, Shell Sort, Heap Sort

41 Stable Sort In our previous example, what is the difference between 3a and 3b? When will a stable sort be more useful? Sorting records Multiple keys

42 Quick Sort Quick Sort also uses the divide-and-conquer approach Divide: Divide the list into two by partitioning Conquer: Sort the two lists by calling Quick Sort recursively Combine: Combine the two sorted lists (nothing to do – they are already in place)

43 Quick Sort – Partitioning Given: A list and a “pivot” (usually an element in the list) Re-arrange the elements so that Elements on the left-hand side of the pivot are less than the pivot, and Elements on the right-hand side of the pivot are greater than or equal to the pivot Result: [ < pivot | pivot | ≥ pivot ]

44 Quick Sort – Partitioning e.g. Take the first element as pivot Swap all pairs of elements that meet the following criteria: The left one is greater than or equal to the pivot The right one is smaller than the pivot Finally, swap the pivot with A[hi] (Example list: 4 6 7 0 9 3 9 4, scanned with lo and hi pointers)

45 Quick Sort After partitioning, apply Quick Sort recursively on both sub-lists

46 Quick Sort – Analysis Complexity: Best O(n lg n), Worst O(n²), Average O(n lg n) When will the worst case happen? How can we avoid it? In-place sort Not a stable sort

47 Counting Sort Consider the following list of numbers: 5, 4, 2, 1, 4, 3, 4, 2, 5, 1, 4, 5, 3, 2, 3, 5, 5 Range of numbers = [1,5] We may count the occurrences of each number: Number: 1 2 3 4 5 Count: 2 3 3 4 5

48 Counting Sort (1) With the frequency table, we can reconstruct the list in ascending order: 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5

49 Counting Sort (1) Can we sort records with this counting sort? Is this sort stable?

50 Counting Sort (2) An alternative way: use a cumulative frequency table and a temporary array Given the following “records”: 3, 2, 1, 2, 2, 3 Key: 1 2 3 Frequency: 1 3 2 Cumulative: 1 4 6

51 Counting Sort (2) Each record is copied into the temporary array at the position given by the cumulative table for its key, and that table entry is decremented; scanning the records from right to left keeps the sort stable

52 Counting Sort – Analysis Complexity: O(n+k), where k is the range of the numbers Not an in-place sort Stable sort (method 2) Cannot be applied to data with wide ranges

53 Radix Sort Counting Sort requires a “frequency table” The size of the frequency table depends on the range of the elements If the range is large (e.g. 32-bit), it may be infeasible, if not impossible, to create such a table

54 Radix Sort We may consider an integer as a “record of digits”, where each digit is a key The significance of the keys decreases from left to right e.g. the number 123 consists of 3 digits: Leftmost digit: 1 (most significant) Middle digit: 2 Rightmost digit: 3 (least significant)

55 Radix Sort Now the problem becomes a multi-key record sorting problem Sort the records on the least significant key with a stable sort Repeat with the 2nd least significant key, the 3rd least significant key, and so on

56 Radix Sort For all keys in these “records”, the range is [0,9] – a narrow range We can therefore apply Counting Sort to do the sorting here

57 Radix Sort Original list: 101, 097, 141, 110, 997, 733

58 Radix Sort Sort on the least significant digit: 110, 101, 141, 733, 097, 997

59 Radix Sort Sort on the 2nd least significant digit: 101, 110, 733, 141, 097, 997

60 Radix Sort Lastly, sort on the most significant digit: 097, 101, 110, 141, 733, 997

61 Radix Sort – Analysis Complexity: O(dn), where d is the number of digits Not an in-place sort Stable sort Can we run Radix Sort on real numbers? On strings?

62 Choosing Sorting Algorithms List Size Data distribution Data Type Availability of Additional Memory Cost of Swapping/Assignment

63 Choosing Sorting Algorithms List Size If N is small, any sorting algorithm will do If N is large (e.g. ≥ 5000), O(n²) algorithms may not finish within the time limit Data Distribution If the list is mostly sorted, running Quick Sort with the “first element as pivot” is extremely painful – it degenerates to the O(n²) worst case Insertion Sort, on the other hand, is very efficient in this situation

64 Choosing Sorting Algorithms Data Type It is difficult to apply Counting Sort and Radix Sort on real numbers or any other data types that cannot be converted to integers Availability of Additional Memory Merge Sort, Counting Sort, Radix Sort require additional memory

65 Choosing Sorting Algorithms Cost of Swapping/Assignment Moving large records may be very time-consuming Selection Sort takes at most (n−1) swap operations Alternatively, swap pointers to the records (i.e. swap the records logically rather than physically)

