Ch 6: Heapsort Ming-Te Chi Algorithms
Why sorting 1. Sometimes the need to sort information is inherent in an application. 2. Algorithms often use sorting as a key subroutine. 3. There is a wide variety of sorting algorithms, and they use a rich set of techniques. 4. The sorting problem has a nontrivial lower bound. 5. Many engineering issues come to the fore when implementing sorting algorithms.
Sorting algorithm Insertion sort: worst case O(n^2). In place: only a constant number of elements of the input array are ever stored outside the array. Merge sort: worst case O(n lg n), not in place. Heap sort (Chapter 6): sorts n numbers in place in O(n lg n).
Sorting algorithm Heap sort (Chapter 6): O(n lg n) worst case, like merge sort. Sorts in place, like insertion sort. Combines the best of both algorithms. Introduces another algorithm design technique: the use of a data structure.
Sorting algorithm To understand heapsort, we will cover heaps and heap operations, and then take a look at priority queues.
Sorting algorithm (preview) Quick sort (Chapter 7): worst-case time complexity O(n^2), average time complexity O(n lg n). Decision tree model (Chapter 8): lower bound Ω(n lg n) for comparison sorts. Counting sort. Radix sort. Order statistics.
Heap data structure A heap (not garbage-collected storage) is a nearly complete binary tree. Height of node = # of edges on a longest simple path from the node down to a leaf. Height of tree = height of root. Height of heap = height of root = Θ(lg n).
Heap data structure A heap can be stored as an array A. Root of tree is A[1]. Parent of A[i] = A[⌊i/2⌋]. Left child of A[i] = A[2i]. Right child of A[i] = A[2i + 1]. These indices are fast to compute in binary: 2i is a left shift of i by one bit, 2i + 1 additionally sets the low-order bit, and ⌊i/2⌋ is a right shift by one bit.
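Remarks about a tree [Figure: a tree whose levels are labeled with depth d and height h (of a node). Depth runs from 0 at the root to 3 at the leaves; height runs from 3 at the root to 0 at the leaves.]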
6.1 Heaps (Binary heap) The binary heap data structure is an array object that can be viewed as a nearly complete binary tree. PARENT(i): return ⌊i/2⌋. LEFT(i): return 2i. RIGHT(i): return 2i + 1.
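The index arithmetic above can be sketched with bit operations. This is a minimal illustration that keeps the slides' 1-based numbering (the root is node 1); the function names are chosen here for illustration only.

```python
def parent(i: int) -> int:
    """Parent of node i (1-based): floor(i / 2), computed as a right shift."""
    return i >> 1


def left(i: int) -> int:
    """Left child of node i (1-based): 2i, computed as a left shift."""
    return i << 1


def right(i: int) -> int:
    """Right child of node i (1-based): 2i + 1, a left shift plus the low bit."""
    return (i << 1) | 1


# Node 5 has parent 2; node 2 has children 4 and 5.
example = (parent(5), left(2), right(2))
```

With 0-based arrays, as most programming languages use, the same relations become (i - 1) // 2, 2i + 1, and 2i + 2.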
Heap property Two kinds of heap: max-heap and min-heap. Max-heap property: A[PARENT(i)] ≥ A[i] for every node i other than the root. The value of a node is at most the value of its parent, so the largest element in a max-heap is the root. Min-heap property: A[PARENT(i)] ≤ A[i] for every node i other than the root. The smallest element in a min-heap is the root.
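The max-heap property can be checked directly from its definition. The sketch below assumes a 0-based Python list, so the parent of index i is (i - 1) // 2 instead of ⌊i/2⌋; the two sample arrays are illustrative and not taken from the slides.

```python
def is_max_heap(a):
    """Return True if every non-root element is at most its parent (0-based)."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


heap_ok = is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1])
heap_bad = is_max_heap([16, 4, 10, 14, 7, 9, 3, 2, 8, 1])  # 4 is smaller than its child 14
```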
Heap property The max-heap is used in the heapsort algorithm. Min-heaps are commonly used in priority queues. Basic operations on heaps run in time at most proportional to the height of the tree, so they take O(lg n) time.
Basic procedures on Max-heap MAX-HEAPIFY: O(lg n), the key to maintaining the max-heap property. BUILD-MAX-HEAP: O(n), produces a max-heap from an unordered input array. HEAPSORT: O(n lg n), sorts an array in place. MAX-HEAP-INSERT, HEAP-EXTRACT-MAX, HEAP-INCREASE-KEY, and HEAP-MAXIMUM: O(lg n) (HEAP-MAXIMUM in O(1)), allow the heap data structure to be used as a priority queue.
6.2 Maintaining the heap property MAX-HEAPIFY is used to maintain the max-heap property. Before MAX-HEAPIFY, A[i ] may be smaller than its children. Assume left and right subtrees of i are max-heaps. After MAX-HEAPIFY, subtree rooted at i is a max-heap.
MAX-HEAPIFY Input: an array A and an index i into the array (the root of a subtree). Find the largest among A[i] and its two children. If the largest is not the root of the subtree, exchange it with the root of the subtree. The subtree rooted at largest may then no longer be a max-heap, so recursively call MAX-HEAPIFY on it.
Max-Heapify (A, i)
  l ← Left(i)
  r ← Right(i)
  if l ≤ heap-size[A] and A[l] > A[i]
    then largest ← l
    else largest ← i
  if r ≤ heap-size[A] and A[r] > A[largest]
    then largest ← r
  if largest ≠ i
    then exchange A[i] ↔ A[largest]
         Max-Heapify (A, largest)
Max-Heapify(A,2) heap-size[A] = 10
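A runnable sketch of MAX-HEAPIFY may help when tracing the call above. It assumes a 0-based Python list, so node 2 in the slides is index 1 here, and heap_size is passed as an argument in place of heap-size[A]. The 10-element array is an assumed example, since the figure's values are not reproduced in the text.

```python
def max_heapify(a, i, heap_size):
    """Float a[i] down until the subtree rooted at index i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    """
    l = 2 * i + 1
    r = 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        # The exchanged child may now violate the property; fix it recursively.
        max_heapify(a, largest, heap_size)


a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]  # assumed example; only a[1] = 4 is out of place
max_heapify(a, 1, len(a))
# a is now [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The value 4 moves down two levels: it is exchanged first with 14 and then with 8.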
Time Complexity of MAX-HEAPIFY Running time of MAX-HEAPIFY on a subtree of size n rooted at a given node i: Θ(1) to fix up the relationships among the elements A[i], A[LEFT(i)], and A[RIGHT(i)], plus the time to run MAX-HEAPIFY on a subtree rooted at one of the children of node i. The children's subtrees each have size at most 2n/3; the worst case occurs when the bottom level of the tree is exactly half full. This gives the recurrence T(n) ≤ T(2n/3) + Θ(1).
Time Complexity of MAX-HEAPIFY Running time of MAX-HEAPIFY: the recurrence T(n) ≤ T(2n/3) + Θ(1) solves to T(n) = O(lg n) (case 2 of the master theorem). Alternatively, the running time on a node of height h is O(h).
6.3 Building a heap The elements in the subarray A[(⌊n/2⌋ + 1) .. n], where n = length[A], are all leaves of the tree. Each one is a 1-element heap, so there is no need to examine these elements. The BUILD-MAX-HEAP procedure examines the remaining nodes of the tree, from ⌊n/2⌋ down to 1, and applies MAX-HEAPIFY to each one.
Build-Max-Heap(A)
  heap-size[A] ← length[A]
  for i ← ⌊length[A]/2⌋ downto 1
    do Max-Heapify(A, i)
(Given an unordered array, this produces a max-heap.)
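The procedure above can be sketched in Python as follows. The sketch assumes 0-based indexing, so the last internal node is at index n // 2 - 1, and it uses an iterative MAX-HEAPIFY, which is a substitution for the recursive version in the slides. The input array is an assumed example.

```python
def max_heapify(a, i, heap_size):
    """Iteratively float a[i] down within a[0:heap_size] (0-based)."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Rearrange a in place into a max-heap, bottom-up."""
    n = len(a)
    # Indices n // 2 .. n - 1 are leaves, so start at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)


a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]  # assumed unordered input
build_max_heap(a)
# a is now [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

Processing nodes from the bottom up guarantees that both subtrees of node i are already max-heaps when MAX-HEAPIFY is called on i.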
Analysis Simple bound: (A good approach to analysis in general is to start by proving an easy bound, then try to tighten it.) O(n) calls to MAX-HEAPIFY, each of which takes O(lg n) time ⇒ O(n lg n).
Analysis Tighter analysis: the time to run MAX-HEAPIFY is linear in the height of the node it is run on, and most nodes have small heights. An n-element heap has height ⌊lg n⌋ and at most ⌈n/2^(h+1)⌉ nodes of any height h. The time required by MAX-HEAPIFY when called on a node of height h is O(h).
n/2^(h+1) analysis n = total number of nodes = 2^0 + 2^1 + 2^2 + 2^3 + 2^4 + 2^5 = 2^6 - 1 = 63

Level        Depth d  Height h  # of nodes at each level  2^(h+1)  n/2^(h+1)  Ceiling of n/2^(h+1)
1 (root)     0        5         2^0 = 1                   64       63/64      1
2            1        4         2^1 = 2                   32       63/32      2
3            2        3         2^2 = 4                   16       63/16      4
4            3        2         2^3 = 8                   8        63/8       8
5            4        1         2^4 = 16                  4        63/4       16
6 (leaves)   5        0         2^5 = 32                  2        63/2       32
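The counts for n = 63 can be reproduced with a short script. This sketch uses the 1-based numbering of the slides; the helper name nodes_per_height is introduced here for illustration only.

```python
from collections import Counter


def nodes_per_height(n):
    """Map each height h to the number of nodes of that height in an n-node heap."""
    height = [0] * (n + 1)  # height[0] is unused (1-based node numbers)
    # Children have larger indices than their parent, so go from n down to 1.
    for i in range(n, 0, -1):
        child_heights = [height[c] for c in (2 * i, 2 * i + 1) if c <= n]
        height[i] = 1 + max(child_heights) if child_heights else 0
    return Counter(height[1:])


def bound(n, h):
    """Ceiling of n / 2^(h+1), using integer arithmetic."""
    return -(-n // 2 ** (h + 1))


counts = nodes_per_height(63)
rows = [(h, counts[h], bound(63, h)) for h in range(6)]
# rows == [(0, 32, 32), (1, 16, 16), (2, 8, 8), (3, 4, 4), (4, 2, 2), (5, 1, 1)]
```

For a full tree such as n = 63 the bound is met with equality at every height; for other values of n the count can be strictly smaller.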
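Analysis: tighter analysis (continued) The total cost of BUILD-MAX-HEAP is bounded by $\sum_{h=0}^{\lfloor \lg n \rfloor} \lceil n/2^{h+1} \rceil \, O(h) = O\left(n \sum_{h=0}^{\lfloor \lg n \rfloor} h/2^{h}\right)$. Differentiating both sides of Appendix formula A.6, $\sum_{k=0}^{\infty} x^{k} = 1/(1-x)$ for $|x| < 1$, and then multiplying by $x$ gives $\sum_{k=0}^{\infty} k x^{k} = x/(1-x)^{2}$. With $x = 1/2$ this yields $\sum_{h=0}^{\infty} h/2^{h} = (1/2)/(1-1/2)^{2} = 2$, so the running time of BUILD-MAX-HEAP is $O(n \cdot 2) = O(n)$.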
6.4 The Heapsort algorithm Given an input array, the heapsort algorithm acts as follows: Builds a max-heap from the array. Starting with the root (the maximum element), the algorithm places the maximum element into the correct place in the array by swapping it with the element in the last position in the array. "Discard" this last node (knowing that it is in its correct place) by decreasing the heap size, and calling MAX-HEAPIFY on the new (possibly incorrectly-placed) root. Repeat this "discarding" process until only one node (the smallest element) remains, and therefore is in the correct place in the array.
6.4 The Heapsort algorithm
Heapsort(A)
  Build-Max-Heap(A)
  for i ← length[A] downto 2
    do exchange A[1] ↔ A[i]
       heap-size[A] ← heap-size[A] - 1
       Max-Heapify(A, 1)
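Putting the pieces together, heapsort might look like this in Python. It is a sketch under the same assumptions as before: 0-based indexing, an explicit heap_size argument in place of heap-size[A], and an iterative MAX-HEAPIFY. The sample data is illustrative.

```python
def max_heapify(a, i, heap_size):
    """Iteratively float a[i] down within a[0:heap_size] (0-based)."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a):
    """Sort the list a in place in ascending order."""
    n = len(a)
    # Build a max-heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)
    # Repeatedly move the maximum to the end and shrink the heap by one.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        max_heapify(a, 0, end)  # the heap is now a[0:end]


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(data)
# data is now [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

The sorted suffix grows from the right while the heap shrinks from the left, which is why no extra array is needed.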
The operation of Heapsort
Analysis: O(n logn)
Analysis BUILD-MAX-HEAP: O(n). for loop: n - 1 iterations, each doing an O(1) exchange of elements and an O(lg n) MAX-HEAPIFY. Total time (for loop): O(n lg n). Total time (heapsort): O(n lg n).
Analysis Though heapsort is a great algorithm, a well-implemented quicksort usually beats it in practice. Heap data structure is useful. Priority queue is one of the most popular applications of a heap.
6.5 Priority queues A queue is a data structure with the FIFO property. A priority queue maintains a set of elements, each with an associated value called a key (the priority); elements leave in priority order, not in arrival order. Insertion: inserts an element into the queue according to its priority. Removal: removes the element from the queue, also according to its priority. Max-priority queue vs. min-priority queue.
6.5 Priority queues Max-priority queue: based on a max-heap. Supports the following operations: insert, maximum, extract-max, and increase-key. Example max-priority queue applications: scheduling jobs on a shared computer; greedy search. Example min-priority queue application: event-driven simulator.
Insert (S, x) O(log n): inserts the element x into the set S. Maximum (S) O(1): returns the element of S with the largest key. Extract-Max (S) O(log n): removes and returns the element of S with the largest key. Increase-Key (S, x, k) O(log n): increases the value of element x's key to the new value k (assume that k is no less than x's current key value).
Heap_Maximum(A) 1 return A[1] Time O(1)
Heap_Extract-Max(A)
  1 if heap-size[A] < 1
  2   then error "heap underflow"
  3 max ← A[1]
  4 A[1] ← A[heap-size[A]]
  5 heap-size[A] ← heap-size[A] - 1
  6 Max-Heapify (A, 1)
  7 return max
Why do we need Max-Heapify at step 6?
Analysis: Heap_Extract-Max(A) Constant time assignments plus time for MAX-HEAPIFY. Time required: O(lg n).
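A Python sketch of extract-max illustrates why the MAX-HEAPIFY call is needed: the last leaf is moved to the root, where it is usually smaller than its new children. The sketch assumes a 0-based list whose length plays the role of heap-size[A]; the starting heap is an assumed example.

```python
def max_heapify(a, i, heap_size):
    """Iteratively float a[i] down within a[0:heap_size] (0-based)."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_extract_max(a):
    """Remove and return the largest element of the max-heap a."""
    if not a:
        raise IndexError("heap underflow")
    maximum = a[0]
    last = a.pop()  # shrinks the heap by one element
    if a:
        a[0] = last  # the old last leaf is probably too small for the root
        max_heapify(a, 0, len(a))  # restore the max-heap property
    return maximum


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
top = heap_extract_max(heap)
# top == 16 and heap is now [14, 8, 10, 4, 7, 9, 3, 2, 1]
```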
Heap-Increase-Key (A, i, key)
  if key < A[i]
    then error "new key is smaller than current key"
  A[i] ← key
  while i > 1 and A[Parent(i)] < A[i]
    do exchange A[i] ↔ A[Parent(i)]
       i ← Parent(i)
Analysis: Heap-Increase-Key (A, i, key) Upward path from node i has length O(lg n) in an n-element heap. Time required: O(lg n).
Heap-Increase-Key (key = 15)
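The increase-key operation can be traced with a small Python sketch. It assumes 0-based indexing, so the loop stops at index 0 instead of node 1. The starting heap and the choice of which element to raise to 15 are assumptions, since the figure's values are not reproduced in the text.

```python
def heap_increase_key(a, i, key):
    """Raise a[i] to key and float it up until its parent is at least as large."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        p = (i - 1) // 2
        a[i], a[p] = a[p], a[i]
        i = p


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]  # assumed starting max-heap
heap_increase_key(heap, 8, 15)  # raise the element 4 (index 8) to 15
# heap is now [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]
```

The new key 15 passes 8 and then 14 on its way up, and stops below the root because 16 is larger.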
Max_Heap_Insert(A, key)
  heap-size[A] ← heap-size[A] + 1
  A[heap-size[A]] ← -∞
  Heap-Increase-Key (A, heap-size[A], key)
Analysis: Max_Heap_Insert(A, key) Constant time assignments + time for HEAP-INCREASE-KEY. Time required: O(lg n).
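Insertion can be sketched by appending a sentinel and then raising it to the real key. The sketch assumes a 0-based list and uses -math.inf as the stand-in for -∞, which works for numeric keys only; the inserted keys are illustrative.

```python
import math


def heap_increase_key(a, i, key):
    """Raise a[i] to key and float it up until its parent is at least as large."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        p = (i - 1) // 2
        a[i], a[p] = a[p], a[i]
        i = p


def max_heap_insert(a, key):
    """Insert key into the max-heap a."""
    a.append(-math.inf)  # new leaf with the smallest possible key
    heap_increase_key(a, len(a) - 1, key)


heap = []
for k in [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]:
    max_heap_insert(heap, k)
largest = heap[0]  # the maximum is always at the root
```

Building a heap by n successive insertions takes O(n lg n) time in the worst case, compared with O(n) for BUILD-MAX-HEAP.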
Summary A heap gives a good compromise between fast insertion but slow extraction and vice versa: both operations take O(lg n) time. A heap can support any priority-queue operation on a set of size n in O(lg n) time. Min-priority queue operations are implemented similarly with min-heaps.
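For the min-priority queue case, Python's standard heapq module already maintains a plain list as a min-heap. The sketch below uses it for a toy event queue of the kind an event-driven simulator might keep; the event times and names are made up for illustration.

```python
import heapq

events = []  # a plain list that heapq keeps ordered as a min-heap
heapq.heappush(events, (5.0, "job finishes"))
heapq.heappush(events, (1.5, "job arrives"))
heapq.heappush(events, (3.0, "job starts"))

order = []
while events:
    # heappop removes and returns the smallest item, here the earliest event.
    when, name = heapq.heappop(events)
    order.append(name)
# order == ["job arrives", "job starts", "job finishes"]
```

Tuples compare element by element, so placing the time first makes it act as the key.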
Questions: Name 3 algorithms that use D & C. Name 3 algorithms that use recursion.