Published by Heather Clark. Modified over 9 years ago.
1
CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan davek@cs
2
Outline Sorting: The Problem Space Sorting by Comparison –Lower bound for comparison sorts –Insertion Sort –Heap Sort –Merge Sort –Quick Sort External Sorting Comparison of Sorting by Comparison
3
Sorting: The Problem Space General problem: given a set of N orderable items, put them in order. Without (significant) loss of generality, assume: –Items are integers –Ordering is ≤ (Most sorting problems map to the above in linear time.)
4
Lower Bound for Sorting by Comparison Sorting by Comparison –Only information available to us is the set of N items to be sorted –Only operation available to us is pairwise comparison between 2 items What is the best running time we can possibly achieve?
5
Decision Tree Analysis of Sorting by Comparison
6
Max depth of decision tree How many permutations are there of N numbers? How many leaves does the tree have? What’s the shallowest tree with a given number of leaves? What is therefore the worst running time (number of comparisons) by the best possible sorting algorithm?
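The questions above chain into the standard decision-tree argument. A sketch, writing $d$ for the depth of the tree (a symbol that is not on the slide) and assuming the N items are distinct:

```latex
\begin{align*}
\text{permutations of } N \text{ items} &= N! \\
\text{leaves in the decision tree} &\ge N! \\
\text{leaves in a binary tree of depth } d &\le 2^{d} \\
\Rightarrow\quad d &\ge \lceil \log_2 N! \rceil
\end{align*}
```

For example, with N = 3 there are 3! = 6 possible orderings, so even the best comparison sort needs at least $\lceil \log_2 6 \rceil = 3$ comparisons in the worst case.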
7
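Lower Bound for log(n!) Stirling’s approximation: $n! \approx \sqrt{2\pi n}\,(n/e)^n$, so $\log(n!) = n \log n - n \log e + O(\log n) = \Theta(n \log n)$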
8
Insertion Sort Basic idea: after the kth pass, the first k+1 elements are sorted. On the kth pass, swap the (k+1)th element to the left as necessary.
Start: 7 2 8 3 5 9 6
After Pass 1: 2 7 8 3 5 9 6
After Pass 2: 2 7 8 3 5 9 6
During Pass 3: 2 7 3 8 5 9 6
After Pass 3: 2 3 7 8 5 9 6
What if array is initially sorted? What if array is initially reverse sorted?
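The pass-by-pass behaviour can be reproduced with a short Python sketch. The function name `insertion_sort_passes` and the choice to return a snapshot after every pass are illustrative assumptions, not part of the lecture.

```python
def insertion_sort_passes(items):
    """Sort a copy of items and return the array state after each pass."""
    a = list(items)
    states = []
    for k in range(1, len(a)):
        j = k
        # Swap the new element leftward until it is no smaller than its left neighbour.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
        states.append(list(a))
    return states


if __name__ == "__main__":
    for number, state in enumerate(insertion_sort_passes([7, 2, 8, 3, 5, 9, 6]), start=1):
        print(f"After pass {number}: {state}")
```

On an already sorted array the inner loop never runs, so each pass costs one comparison; on a reverse-sorted array every pass shifts the new element all the way to the front.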
9
Why Insertion Sort is Slow Inversion: a pair (i,j) such that i < j but Array[i] > Array[j] Array of size N can have $\Theta(N^2)$ inversions –average number of inversions in a random set of elements is N(N-1)/4 Insertion Sort only swaps adjacent elements –each swap removes only 1 inversion!
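A small Python check of the inversion argument: the number of adjacent swaps insertion sort performs equals the number of inversions in the input. Both helper names are hypothetical, and the brute-force inversion counter is quadratic on purpose, for clarity only.

```python
from itertools import combinations


def count_inversions(a):
    """Count pairs (i, j) with i < j and a[i] > a[j] by brute force."""
    return sum(1 for i, j in combinations(range(len(a)), 2) if a[i] > a[j])


def insertion_sort_swaps(items):
    """Run insertion sort on a copy and return how many adjacent swaps it made."""
    a = list(items)
    swaps = 0
    for k in range(1, len(a)):
        j = k
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]  # removes exactly one inversion
            swaps += 1
            j -= 1
    return swaps


if __name__ == "__main__":
    data = [7, 2, 8, 3, 5, 9, 6]
    print(count_inversions(data), insertion_sort_swaps(data))
```

A reverse-sorted array of size N has the maximum N(N-1)/2 inversions, which is why that input forces quadratic work.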
10
HeapSort Sorting via Priority Queue (Heap) Basic idea: Shove items into a priority queue, take them out smallest to largest. Worst Case: $\Theta(N \log N)$ Best Case: $\Theta(N \log N)$
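The basic idea maps directly onto Python's standard `heapq` module. This is a minimal sketch that returns a new list rather than sorting in place, which differs from the textbook in-place HeapSort; the function name `heap_sort` is an assumption.

```python
import heapq


def heap_sort(items):
    """Sort by loading a binary min-heap and popping the smallest item repeatedly."""
    heap = list(items)
    heapq.heapify(heap)  # builds the heap in O(N)
    # Each pop costs O(log N), so the N pops dominate the running time.
    return [heapq.heappop(heap) for _ in range(len(heap))]


if __name__ == "__main__":
    print(heap_sort([7, 2, 8, 3, 5, 9, 6]))
```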
11
MergeSort Merging Cars by key [Aggressiveness of driver]. Most aggressive goes first.
MergeSort (Table [1..n])
  Split Table in half
  Recursively sort each half
  Merge two sorted halves together
Merge (T1[1..n], T2[1..n])
  i1 = 1, i2 = 1
  While i1 <= n and i2 <= n
    If T1[i1] < T2[i2]
      Next is T1[i1]
      i1++
    Else
      Next is T2[i2]
      i2++
    End If
  End While
  Copy any remaining items of T1 or T2 to the output
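A runnable Python rendering of the pseudocode, as a sketch under two assumptions: the halves may differ in length, and ties take the item from the left half (`<=`) so the sort is stable. It also copies whatever is left over once one input runs out.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # ties go to the left list, keeping the sort stable
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted; the rest of the other is already in order.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(table):
    """Return a sorted copy of table using top-down merge sort."""
    if len(table) <= 1:
        return list(table)
    mid = len(table) // 2
    return merge(merge_sort(table[:mid]), merge_sort(table[mid:]))


if __name__ == "__main__":
    print(merge_sort([7, 2, 8, 3, 5, 9, 6]))
```

The `merged` list is the extra memory that MergeSort needs, one of the "other considerations" besides running time.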
12
MergeSort Analysis Running Time –Worst case? –Best case? –Average case? Other considerations besides running time?
13
QuickSort Basic idea: Pick a “pivot”. Divide into less-than & greater-than pivot. Sort each side recursively. Picture from PhotoDisc.com
14
QuickSort Partition Array: 7 2 8 3 5 9 6 Pick pivot: 7 (the first element) Partition with cursors: < scans from the left, > scans from the right 2 goes to less-than
15
QuickSort Partition (cont’d) 7 2 6 3 5 9 8: 6 and 8 swap between less-than and greater-than 3, 5 less-than; 9 greater-than Partition done. Recursively sort each side.
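The cursor-based partition can be sketched in Python as follows. The first element is used as the pivot, which is inferred from the slide example rather than stated on it, and the final step that swaps the pivot into its resting place is an assumption about how the slide's "Partition done" is reached. The name `partition` is illustrative.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] in place around a[lo]; return the pivot's final index."""
    pivot = a[lo]
    left = lo + 1   # the "<" cursor, scanning right
    right = hi      # the ">" cursor, scanning left
    while True:
        while left <= right and a[left] < pivot:
            left += 1
        while left <= right and a[right] > pivot:
            right -= 1
        if left >= right:
            break
        # a[left] belongs on the greater-than side and a[right] on the less-than side.
        a[left], a[right] = a[right], a[left]
        left += 1
        right -= 1
    a[lo], a[right] = a[right], a[lo]  # move the pivot between the two sides
    return right


if __name__ == "__main__":
    data = [7, 2, 8, 3, 5, 9, 6]
    index = partition(data, 0, len(data) - 1)
    print(index, data)
```

On the slide's array the only cursor swap is the 6 and 8 exchange, after which 3 and 5 stay on the less-than side and 9 on the greater-than side.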
16
Analyzing QuickSort Picking pivot: constant time Partitioning: linear time Recursion: time for sorting left partition (say of size i) + time for right (size N-i-1) T(1) = b T(N) = T(i) + T(N-i-1) + cN where i is the number of elements smaller than the pivot
17
QuickSort Worst case Pivot is always smallest element, so i = 0. T(N) = T(i) + T(N-i-1) + cN becomes $T(N) = T(N-1) + cN = T(N-2) + c(N-1) + cN = T(N-k) + c\sum_{j=0}^{k-1}(N-j) = O(N^2)$
18
Optimizing QuickSort Choosing the Pivot –Randomly choose pivot Good theoretically and practically, but call to random number generator can be expensive –Pick pivot cleverly “Median-of-3” rule takes Median(first, middle, last). Works well in practice. Cutoff –Use simpler sorting technique below a certain problem size (Weiss suggests using insertion sort, with a cutoff limit of 5-20)
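Both optimizations can be combined in a short Python sketch. The cutoff of 10 is one arbitrary value inside the 5-20 range quoted from Weiss, the helper names are hypothetical, and the partition step places the median-of-3 at the front before scanning with two cursors, which is only one of several common layouts.

```python
CUTOFF = 10  # assumed value; the slide only gives the range 5-20


def _insertion_sort(a, lo, hi):
    """Sort the small slice a[lo..hi] in place."""
    for k in range(lo + 1, hi + 1):
        item = a[k]
        j = k
        while j > lo and a[j - 1] > item:
            a[j] = a[j - 1]
            j -= 1
        a[j] = item


def _partition(a, lo, hi):
    """Partition a[lo..hi] around a[lo]; return the pivot's final index."""
    pivot = a[lo]
    left, right = lo + 1, hi
    while True:
        while left <= right and a[left] < pivot:
            left += 1
        while left <= right and a[right] > pivot:
            right -= 1
        if left >= right:
            break
        a[left], a[right] = a[right], a[left]
        left += 1
        right -= 1
    a[lo], a[right] = a[right], a[lo]
    return right


def quick_sort(a, lo=0, hi=None):
    """Sort list a in place with median-of-3 pivots and an insertion-sort cutoff."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        _insertion_sort(a, lo, hi)
        return
    mid = (lo + hi) // 2
    # Median-of-3: pick the index holding the median of first, middle, last.
    median_index = sorted((lo, mid, hi), key=lambda i: a[i])[1]
    a[lo], a[median_index] = a[median_index], a[lo]
    p = _partition(a, lo, hi)
    quick_sort(a, lo, p - 1)
    quick_sort(a, p + 1, hi)


if __name__ == "__main__":
    data = [7, 2, 8, 3, 5, 9, 6, 1, 4, 0, 12, 11, 10]
    quick_sort(data)
    print(data)
```

With median-of-3, an already sorted input no longer triggers the worst case, because the middle element becomes the pivot.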
19
QuickSort Best Case Pivot is always the median element, so i = (N-1)/2. T(N) = T(i) + T(N-i-1) + cN becomes $T(N) = 2T((N-1)/2) + cN \approx 2T(N/2) + cN = O(N \log N)$
20
QuickSort Average Case Assume all partition sizes are equally likely, each with probability 1/N. Details: Weiss pg 278-279
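A sketch of how the average-case recurrence is usually solved under this assumption, using the same T and c as the earlier analysis slides. The constant term dropped in the third line does not affect the asymptotic result.

```latex
\begin{align*}
T(N) &= \frac{1}{N}\sum_{i=0}^{N-1}\bigl[T(i) + T(N-i-1)\bigr] + cN
      = \frac{2}{N}\sum_{j=0}^{N-1} T(j) + cN \\
N\,T(N) - (N-1)\,T(N-1) &= 2\,T(N-1) + c(2N-1) \\
N\,T(N) &\approx (N+1)\,T(N-1) + 2cN \\
\frac{T(N)}{N+1} &= \frac{T(N-1)}{N} + \frac{2c}{N+1}
      = \frac{T(1)}{2} + 2c\sum_{k=3}^{N+1}\frac{1}{k} = O(\log N) \\
\Rightarrow\quad T(N) &= O(N \log N)
\end{align*}
```

The second line comes from multiplying the recurrence by N, writing the same equation for N-1, and subtracting; the fourth line telescopes into a harmonic sum.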
21
External Sorting When you just ain’t got enough RAM … –e.g. Sort 10 billion numbers with 1 MB of RAM. –Databases need to be very good at this MergeSort Good for Something! –Basis for most external sorting routines –Can sort any number of records using a tiny amount of main memory: in the extreme case, only 2 records need to be in memory at any one time!
22
External MergeSort Split input into two tapes Each group of 1 record is sorted by definition, so merge groups of 1 into groups of 2, again split between two tapes Merge groups of 2 into groups of 4 Repeat until data entirely sorted: $\lceil \log_2 N \rceil$ passes
23
Better External MergeSort Suppose main memory can hold M records. Initially read in groups of M records and sort them (e.g. with QuickSort). Number of passes reduced to $\log_2(N/M)$ k-way mergesort reduces number of passes to $\log_k(N/M)$ –Requires 2k output devices (e.g. mag tapes) But wait, there’s more … Polyphase merge does a k-way mergesort using only k+1 output devices (plus kth-order Fibonacci numbers!)
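The pass counts can be checked with an in-memory simulation in Python. This is only a sketch: plain lists stand in for tapes, `sorted` stands in for the in-memory QuickSort of each initial run, and the function name and parameters (`memory_size` for M, `k` for the merge width) are assumptions. Setting `memory_size=1` and `k=2` gives the basic two-tape scheme.

```python
import heapq


def external_merge_sort(records, memory_size, k):
    """Simulate a k-way external merge sort; return (sorted records, merge passes)."""
    # Run formation: sort chunks of at most memory_size records "in memory".
    runs = [
        sorted(records[start:start + memory_size])
        for start in range(0, len(records), memory_size)
    ]
    passes = 0
    # Each pass merges groups of k runs into longer runs.
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[start:start + k]))
            for start in range(0, len(runs), k)
        ]
        passes += 1
    return (runs[0] if runs else []), passes


if __name__ == "__main__":
    import random

    data = list(range(1000))
    random.Random(23).shuffle(data)
    for m, k in [(1, 2), (10, 2), (10, 10)]:
        result, passes = external_merge_sort(data, memory_size=m, k=k)
        print(f"M={m}, k={k}: {passes} passes, sorted={result == sorted(data)}")
```

`heapq.merge` only ever inspects the current head of each run, which mirrors the point that a merge needs very few records in memory at once.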
24
Sorting by Comparison Summary
Sorting algorithms that only compare adjacent elements are $\Theta(N^2)$ worst case, but may be $\Theta(N)$ best case
HeapSort - $\Theta(N \log N)$ both best and worst case –Suffers from two test-ops per data move
MergeSort - $\Theta(N \log N)$ running time –Suffers from extra-memory problem
QuickSort - $\Theta(N^2)$ worst case, $\Theta(N \log N)$ best and average case –In practice, median-of-3 almost always gets us $\Theta(N \log N)$ –Big win comes from {sorting in place, one test-op, few swaps}!
Any comparison-based sorting algorithm is $\Omega(N \log N)$
External sorting: MergeSort with $\Theta(\log(N/M))$ passes