COMP 103 Sorting with Binary Trees: Tree sort, Heap sort
Alex Potanin
Lindsay Groves, Marcus Frean, Peter Andreae, and Thomas Kuehne, VUW
School of Engineering and Computer Science, Victoria University of Wellington
2016-T2 Lecture 29
RECAP
  Slow sorts: selection sort, insertion sort, bubble sort – O(n^2)
  Faster sorts: merge sort, quicksort – O(n log n)
TODAY
  Sorting with binary trees:
    sorting with a Binary Search Tree – Tree sort
    sorting with a Partially Ordered Tree/heap – HeapSort
ANNOUNCEMENTS
  No lecture on Friday
Sorting with Binary Trees
Binary search trees provide an efficient way to insert into an ordered sequence
  Mmmm – sounds a bit like insertion sort
Partially ordered trees provide an efficient way of removing the smallest element in a set, and thus of extracting elements in ascending order
  Mmmm – sounds a bit like selection sort
Sorting with a BST: Tree Sort
Binary search trees provide an efficient way to insert into an ordered sequence
  Insert each element into a BST
  Traverse the BST in order, copying elements one at a time to the output list
Cost:
  O(n log n) to insert all the elements into the BST (on average; if the tree degenerates into a chain, e.g. on already-sorted input, this becomes O(n^2))
  O(n) to traverse the BST
  = O(n log n)
Note: Not an in-place sort.
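The two steps above (insert everything, then traverse in order) can be sketched as follows. This is a minimal illustration, not the course's implementation: the class `TreeSortSketch`, its `Node` type and the method names are all made up for this example, it sorts `int` values only, and the BST is deliberately unbalanced, so already-sorted input hits the slow case.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative tree sort over an unbalanced BST of ints (all names are hypothetical).
class TreeSortSketch {
    private static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Insert value into the subtree rooted at node; duplicates go to the right.
    private static Node insert(Node node, int value) {
        if (node == null) return new Node(value);
        if (value < node.value) node.left = insert(node.left, value);
        else node.right = insert(node.right, value);
        return node;
    }

    // In-order traversal visits the values in ascending order.
    private static void inOrder(Node node, List<Integer> out) {
        if (node == null) return;
        inOrder(node.left, out);
        out.add(node.value);
        inOrder(node.right, out);
    }

    static List<Integer> treeSort(int[] data) {
        Node root = null;
        for (int v : data) root = insert(root, v);   // n inserts
        List<Integer> out = new ArrayList<>(data.length);
        inOrder(root, out);                          // O(n) traversal
        return out;
    }

    public static void main(String[] args) {
        System.out.println(treeSort(new int[] {13, 9, 23, 7, 14, 26, 4, 19, 35, 1}));
    }
}
```

The output list is a separate structure from the input array, which is why tree sort is not in-place.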
Sorting with a Priority Queue
Put all the items to be sorted into a priority queue, using the item’s value as its priority
Remove the items from the priority queue one at a time and add to the output list
Output is constructed in the correct order, as in selection sort
Cost: Depends on the implementation of the priority queue
Sorting with a Priority Queue
What happens if we implement the priority queue as:
  Unordered list: Making the priority queue is easy: O(1) per item. Selecting/removing the next item is hard: O(n).
  Ordered list: Making the priority queue is hard: O(n) per item. Selecting/removing the next item is easy: O(1).
  Partially ordered tree/heap: Making the priority queue is “quite easy”: O(log n) per item. Selecting/removing the next item is “quite easy”: O(log n).
Can we get an in-place sort?
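As a rough illustration of the heap-backed option, the sketch below sorts with the standard library's `java.util.PriorityQueue`, which is a binary heap ordered smallest-first by default. The class and method names (`PriorityQueueSortSketch`, `pqSort`) are invented for this example, and the version shown is not in-place: it uses a separate queue and output list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative priority-queue sort (names are hypothetical).
class PriorityQueueSortSketch {
    static List<Integer> pqSort(List<Integer> items) {
        // Each item's value doubles as its priority; the smallest comes out first.
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (Integer item : items) queue.add(item);      // O(log n) per add
        List<Integer> out = new ArrayList<>(items.size());
        while (!queue.isEmpty()) out.add(queue.poll());  // O(log n) per removal
        return out;
    }

    public static void main(String[] args) {
        System.out.println(pqSort(Arrays.asList(35, 19, 26, 14, 7, 23, 9, 2, 13, 4)));
    }
}
```

With n adds and n removals at O(log n) each, the total is O(n log n), matching the heap row of the comparison above.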
Recap: Heap
“Heap” = a complete POT, implemented in an array
Bottom right node is last element used
We can compute the index of parent and children of a node:
  the children of node i are at (2i+1) and (2i+2)
  the parent of node i is at (i-1)/2 (integer division)
Note: no gaps!
[Figure: the heap drawn as a tree and as an array, indices 0 to 9: Bee 35, Kea 19, Eel 26, Dog 14, Fox 7, Hen 23, Ant 9, Jay 2, Gnu 13, Cat 4]
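The index formulas can be tried out directly on the example array. A small sketch, with the helper names `leftChild`, `rightChild` and `parent` (and the class name) chosen for this illustration only:

```java
// Illustrative index arithmetic for an array-based heap (names are hypothetical).
class HeapIndexSketch {
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    // Integer division rounds down for i > 0; the root (i = 0) has no parent.
    static int parent(int i)     { return (i - 1) / 2; }

    public static void main(String[] args) {
        String[] heap = {"Bee", "Kea", "Eel", "Dog", "Fox", "Hen", "Ant", "Jay", "Gnu", "Cat"};
        int i = 3; // Dog
        System.out.println(heap[i] + ": parent " + heap[parent(i)]
                + ", children " + heap[leftChild(i)] + " and " + heap[rightChild(i)]);
        // prints "Dog: parent Kea, children Jay and Gnu"
    }
}
```

A child index that is greater than or equal to the number of elements in use simply means that child does not exist, which is how the last parent (Fox, index 4) ends up with only one child (Cat, index 9).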
Heapsort: In-Place Sorting
Use an array-based heap: an in-place sorting algorithm!
  1. turn the array into a heap
  2. “remove” the top element, and restore the heap property again
  3. repeat step 2 n-1 times
Steps 2 and 3 are in-place dequeueing: each removed element is stored in the array slot that the shrinking heap has just freed, so the sorted region grows from the right-hand end of the array.
[Figure: the heap array 35 19 26 13 14 23 4 9 7 1 (indices 0 to 9), with the “Sorted →” region marked as elements are removed, and so on]
Heapsort: In-Place Sorting
How to turn the array into a heap? Heapify:
  for i = lastParent down to 0
    sinkDown(i)
where lastParent = (n-2)/2
[Figure: the array 13 9 23 7 14 26 4 19 35 1 (indices 0 to 9) before heapify; afterwards the heap property is installed]
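The heapify loop can be sketched on a plain `int` array as below. This is an assumption-laden illustration rather than the lecture's code: `sinkDown` is written here as "swap with the larger child until neither child is larger" (a max-heap, as in the slide's example), and the class name is invented.

```java
import java.util.Arrays;

// Illustrative heapify for a max-heap stored in an int array (names are hypothetical).
class HeapifySketch {
    // Push data[i] down until neither child within the first `size` slots is larger.
    static void sinkDown(int[] data, int i, int size) {
        while (true) {
            int largest = i;
            int left = 2 * i + 1, right = 2 * i + 2;
            if (left < size && data[left] > data[largest]) largest = left;
            if (right < size && data[right] > data[largest]) largest = right;
            if (largest == i) return;            // heap property holds at i
            int tmp = data[i]; data[i] = data[largest]; data[largest] = tmp;
            i = largest;                         // keep sinking from the child's slot
        }
    }

    // Sink every parent, starting from the last one, (n-2)/2, and ending at the root.
    static void heapify(int[] data) {
        for (int i = (data.length - 2) / 2; i >= 0; i--) sinkDown(data, i, data.length);
    }

    public static void main(String[] args) {
        int[] data = {13, 9, 23, 7, 14, 26, 4, 19, 35, 1};
        heapify(data);
        System.out.println(Arrays.toString(data));
        // prints [35, 19, 26, 13, 14, 23, 4, 9, 7, 1]
    }
}
```

Working from the last parent upwards means that whenever a node is sunk, both of its subtrees are already heaps, so a single downward pass is enough.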
HeapSort: Algorithm
(a) Turn data into a heap
(b) Repeatedly swap root with last item and push down

public void heapSort(E[] data, int size, Comparator<E> comp) {
    // (a) "heapify": sink every parent, from the last one up to the root
    for (int i = (size-2)/2; i >= 0; i--)
        sinkDown(i, data, size, comp);
    // (b) in-place dequeueing: move the root to the slot just past the shrinking heap
    while (size > 0) {
        size--;
        swap(data, size, 0);
        sinkDown(0, data, size, comp);
    }
}
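The slide's method relies on `sinkDown` and `swap` helpers that are not shown. A self-contained version might look like the sketch below, where the bodies of both helpers are assumptions written for this example (a max-heap under `comp`, so the result is ascending), and the methods are made `static` purely so the example runs on its own.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative complete heapsort; helper bodies are assumed, not taken from the lecture.
class HeapSortSketch {
    static <E> void swap(E[] data, int i, int j) {
        E tmp = data[i]; data[i] = data[j]; data[j] = tmp;
    }

    // Push data[i] down until neither child (within the first `size` slots) is larger.
    static <E> void sinkDown(int i, E[] data, int size, Comparator<E> comp) {
        while (true) {
            int largest = i;
            int left = 2 * i + 1, right = 2 * i + 2;
            if (left < size && comp.compare(data[left], data[largest]) > 0) largest = left;
            if (right < size && comp.compare(data[right], data[largest]) > 0) largest = right;
            if (largest == i) return;
            swap(data, i, largest);
            i = largest;
        }
    }

    static <E> void heapSort(E[] data, int size, Comparator<E> comp) {
        // (a) heapify
        for (int i = (size - 2) / 2; i >= 0; i--) sinkDown(i, data, size, comp);
        // (b) in-place dequeueing: the largest remaining item moves to slot `size`
        while (size > 0) {
            size--;
            swap(data, size, 0);
            sinkDown(0, data, size, comp);
        }
    }

    public static void main(String[] args) {
        Integer[] data = {13, 9, 23, 7, 14, 26, 4, 19, 35, 1};
        heapSort(data, data.length, Comparator.<Integer>naturalOrder());
        System.out.println(Arrays.toString(data));
        // prints [1, 4, 7, 9, 13, 14, 19, 23, 26, 35]
    }
}
```

Passing a reversed comparator (for example `Comparator.<Integer>reverseOrder()`) would turn the heap into a min-heap and produce descending output with the same code.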
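Cost of “heapify”
Worst-case swaps per node, level by level from the bottom of the tree:
  about n/2 nodes (the leaves): 0 swaps each
  about n/4 nodes: 1 swap each
  about n/8 nodes: 2 swaps each
  ⋮
  1 node (the root): log(n+1)-1 swaps
Cost = n [1/4 + 2/8 + 3/16 + 4/32 + ⋯ + (log(n+1)-1)/n] swaps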
Cost of “heapify”
Cost = n [1/4 + 2/8 + 3/16 + 4/32 + ⋯] ≈ n [1] = n swaps
[Figure: the terms of the sum arranged in rows (1st row, 2nd row, 3rd row, ...)]
We can turn an unordered list into a heap in linear time!!!
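Why the bracketed sum is about 1 can be seen by splitting each term k/2^(k+1) into k copies of 1/2^(k+1) and regrouping them into rows. The derivation below is one plausible reading of the slide's row labels, not a transcription of the figure; it treats the sum as infinite, which only overestimates the finite cost.

```latex
\begin{align*}
\frac14 + \frac28 + \frac3{16} + \frac4{32} + \cdots
  &= \underbrace{\Bigl(\tfrac14 + \tfrac18 + \tfrac1{16} + \cdots\Bigr)}_{\text{1st row}\;=\;1/2}
   + \underbrace{\Bigl(\tfrac18 + \tfrac1{16} + \tfrac1{32} + \cdots\Bigr)}_{\text{2nd row}\;=\;1/4}
   + \underbrace{\Bigl(\tfrac1{16} + \tfrac1{32} + \cdots\Bigr)}_{\text{3rd row}\;=\;1/8}
   + \cdots \\
  &= \frac12 + \frac14 + \frac18 + \cdots \;=\; 1
\end{align*}
```

Each row is a geometric series equal to twice its first term, and the row totals themselves form a geometric series summing to 1, so heapify needs at most about n swaps.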
HeapSort: Summary
Cost of heapify = O(n)
n × cost of remove = O(n log n)
Total cost = O(n log n)
True for worst-case and average case! (unlike QuickSort and TreeSort)
Can be done in-place
Unlike MergeSort, doesn’t need extra memory to be fast
Not stable