CS 583 Analysis of Algorithms


1 CS 583 Analysis of Algorithms
Heapsort Algorithm CS 583 Analysis of Algorithms 11/21/2018 CS583 Fall'06: Heapsort

2 Outline
Sorting Problem
Heaps
    Definition
    Maintaining heap property
    Building a heap
Heapsort Algorithm

3 Sorting Problem
Sorting is usually performed on records rather than on isolated values. Each record contains a key, which is the value to be sorted on; the remainder is called satellite data. When a sorting algorithm permutes the keys, it must permute the satellite data as well. If the satellite data of each record is large, we often permute pointers to the records instead. This level of detail is usually irrelevant in the study of algorithms, but it is important when converting an algorithm to a program.
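As a rough illustration of permuting pointers rather than records, the sketch below sorts a list of indices by key and leaves the records themselves in place. The Record class, its field names, and the sample values are hypothetical and exist only for this example; Python's built-in sorted is used here, not heapsort.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: int        # the value to sort on
    satellite: str  # stands in for a large payload (hypothetical)


records = [Record(30, "c"), Record(10, "a"), Record(20, "b")]

# Sort indices ("pointers") by key; the records themselves are never moved.
order = sorted(range(len(records)), key=lambda i: records[i].key)

for i in order:
    print(records[i].key, records[i].satellite)
```

Only the small index list is permuted, which is the point when each record is expensive to move.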

4 Sorting Problem: Importance
Sorting is arguably the most fundamental problem in the study of algorithms, for the following reasons:
The need to sort information is often a key part of an application, for example sorting financial reports by security ID.
Algorithms often use sorting as a key subroutine. For example, to match a security against benchmarks, the set of benchmarks needs to be sorted by some key elements.
There is a wide variety of sorting algorithms, and they use a rich set of techniques.

5 Heaps
The heapsort algorithm sorts in place, and its running time is O(n lg n). It combines the better attributes of insertion sort (sorting in place) and merge sort (O(n lg n) running time). It is based on a data structure called a heap. The (binary) heap data structure is an array object that can be viewed as a nearly complete binary tree. An array A that represents a heap is an object with two attributes: length[A], the number of elements in the array, and heap-size[A], the number of elements of the heap stored within the array A.

6 Heaps: Example
A = {10, 8, 6, 5, 7, 3, 2}
[Figure: the heap drawn as a binary tree with root 10 and children 8 and 6.]
The root of the tree is A[1]. The children of a node i are determined as follows:
Left(i): return 2i
Right(i): return 2i+1

7 Heaps: Example (cont.)
The above is proven by induction. The root's left child is 2 = 2*1. Assume the formulas hold for node n, so that Left(n) = 2n and Right(n) = 2n+1. The left child of node (n+1) immediately follows the right child of node n: Left(n+1) = (2n+1) + 1 = 2n + 2 = 2(n+1). The parent of a node i is calculated from i = 2p or i = 2p+1, where p is the parent node. Hence:
Parent(i): return floor(i/2)
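The index formulas can be tried out directly. The following is a minimal sketch in Python, assuming a list whose slot 0 is an unused placeholder so that the root sits at index 1 as on the slides; the function names are lower-case versions of Left, Right, and Parent.

```python
def left(i: int) -> int:
    """Index of the left child of node i (1-based)."""
    return 2 * i


def right(i: int) -> int:
    """Index of the right child of node i (1-based)."""
    return 2 * i + 1


def parent(i: int) -> int:
    """Index of the parent of node i (1-based); floor(i/2)."""
    return i // 2


# A[0] is an unused placeholder so that the root is A[1].
A = [None, 10, 8, 6, 5, 7, 3, 2]

# Print each internal node together with its two children.
for i in range(1, 4):
    print(A[i], "->", A[left(i)], A[right(i)])
```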

8 Max-Heaps
In a max-heap, for every node i other than the root:
A[Parent(i)] >= A[i]
For the heapsort algorithm, we use max-heaps. The height of the heap is defined as the number of edges on the longest path from the root to a leaf, and it is Θ(lg n), since the heap is a nearly complete binary tree. We will consider the following basic procedures on the heap:
Max-Heapify, which maintains the max-heap property.
Build-Max-Heap, which produces a max-heap from an unordered input array.
Heapsort, which sorts an array in place.
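The max-heap property and the height are easy to check mechanically. The sketch below assumes the same 1-based layout (slot 0 unused); is_max_heap and heap_height are illustrative helper names, not procedures from the slides.

```python
def is_max_heap(A: list, heap_size: int) -> bool:
    """True if A[parent(i)] >= A[i] for every non-root node i in 2..heap_size."""
    return all(A[i // 2] >= A[i] for i in range(2, heap_size + 1))


def heap_height(n: int) -> int:
    """Height floor(lg n) of an n-element heap, for n >= 1."""
    return n.bit_length() - 1


A = [None, 10, 8, 6, 5, 7, 3, 2]  # A[0] is an unused placeholder
print(is_max_heap(A, 7), heap_height(7))
```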

9 Maintaining the Heap Property
The Max-Heapify procedure takes an array A and an index i into the array. It is assumed that the subtrees rooted at Left(i) and Right(i) are already max-heaps, but that A[i] may be smaller than its children. The procedure lets the value of A[i] "float down" in the max-heap so that the subtree rooted at index i becomes a max-heap.

10 Max-Heapify: Algorithm
Max-Heapify(A, i)
    l = Left(i)
    r = Right(i)
    if l <= heap-size[A] and A[l] > A[i]
        largest = l
    else
        largest = i
    if r <= heap-size[A] and A[r] > A[largest]
        largest = r
    if largest ≠ i
        exchange A[i] with A[largest]
        Max-Heapify(A, largest)
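One possible Python rendering of the procedure is sketched below. It assumes a 1-based layout with A[0] unused, and it passes the heap size as an explicit parameter instead of storing it as an attribute of A; both choices are conveniences of this sketch rather than part of the slides.

```python
def max_heapify(A: list, i: int, heap_size: int) -> None:
    """Let A[i] float down so that the subtree rooted at i becomes a max-heap.

    Assumes the subtrees rooted at 2i and 2i+1 are already max-heaps.
    """
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)


# The root (4) violates the max-heap property; both of its subtrees are max-heaps.
A = [None, 4, 10, 8, 5, 7, 3, 2]
max_heapify(A, 1, 7)
print(A[1:])
```

The recursion follows a single root-to-leaf path, so its depth is bounded by the height of the heap.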

11 Max-Heapify: Analysis
It takes (1) to find A[largest], plus the time to run the procedure recursively on at most 2n/3 elements. (This is the maximum size of a child tree. It occurs when the last row of the tree is exactly half full.) Assume there n nodes and x levels in the tree that has half of the last row. This means: n = ^(x-1) + 2^x/2 2^x – 1 + 2^x/2 = n 2^(x-1) = a => 2a + a = n+1 => 2^(x-1) = (n+1)/3 11/21/2018 CS583 Fall'06: Heapsort

12 Max-Heapify: Analysis (cont.)
Max subtree size = (half of the elements in the full levels, excluding the root) + (elements in the last row)
= (2^x - 1 - 1)/2 + 2^x/2
= 2^(x-1) - 1 + 2^(x-1)
= (n+1)/3 - 1 + (n+1)/3
= 2n/3 - 1/3 < 2n/3
Therefore the running time of Max-Heapify is described by the following recurrence:
T(n) <= T(2n/3) + Θ(1)
According to the master theorem (case 2 with a = 1, b = 3/2, f(n) = Θ(1)), the recurrence T(n) = T(2n/3) + Θ(1) has the solution T(n) = Θ(lg n). Since this recurrence describes the worst case, the running time of Max-Heapify is O(lg n).
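The 2n/3 bound can be sanity-checked numerically. The sketch below counts the nodes in the left subtree of an n-node heap (1-based indices) and compares the count with 2n/3; subtree_size is a helper introduced here for illustration. The values n = 5, 11, 23 are cases where the last row is exactly half full.

```python
def subtree_size(i: int, n: int) -> int:
    """Number of nodes in the subtree rooted at index i of an n-node heap."""
    if i > n:
        return 0
    return 1 + subtree_size(2 * i, n) + subtree_size(2 * i + 1, n)


# Half-full last row: the left subtree comes closest to 2n/3.
for n in (5, 11, 23):
    print(n, subtree_size(2, n), 2 * n / 3)
```

For other values of n the left subtree is a smaller fraction of the tree, which is why 2n/3 serves as the worst case in the recurrence.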

13 Building a Heap
We can use the procedure Max-Heapify in a bottom-up manner to convert the whole array A[1..n] into a max-heap. Note that the elements A[floor(n/2)+1..n] are all leaves: the last node that is not a leaf is the parent of the last node, floor(n/2). The procedure Build-Max-Heap goes through all non-leaf nodes and runs Max-Heapify on each of them.

14 Build-Max-Heap: Algorithm
Build-Max-Heap(A, n)
1 heap-size[A] = n
2 for i = floor(n/2) downto 1
3     Max-Heapify(A, i)
Invariant: At the start of each iteration of the loop in lines 2-3, each node i+1, ..., n is the root of a max-heap.
Proof.
Initialization: i = floor(n/2). The nodes floor(n/2)+1, ..., n are all leaves and hence are roots of trivial max-heaps.

15 Build-Max-Heap: Correctness
Maintenance: The children of node i are numbered higher than i and, by the loop invariant, are roots of max-heaps. This is exactly the condition required by Max-Heapify. Moreover, Max-Heapify preserves the property that nodes i+1, ..., n are roots of max-heaps. Decrementing i by 1 re-establishes the loop invariant for the next iteration.
Termination: i = 0, hence each node 1, 2, ..., n is the root of a max-heap. In particular, node 1 is, so the whole array is a max-heap.
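A Python sketch of Build-Max-Heap follows, under the same assumptions as before: 1-based indices with A[0] unused, and the heap size passed as a parameter rather than kept as an attribute. The input values are arbitrary.

```python
def max_heapify(A: list, i: int, heap_size: int) -> None:
    """Let A[i] float down until the subtree rooted at i is a max-heap."""
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)


def build_max_heap(A: list, n: int) -> None:
    """Turn A[1..n] into a max-heap by heapifying the non-leaf nodes bottom-up."""
    for i in range(n // 2, 0, -1):  # floor(n/2) down to 1
        max_heapify(A, i, n)


A = [None, 3, 5, 2, 10, 7, 6, 8]
build_max_heap(A, 7)
print(A[1:])
```

The loop starts at floor(n/2) because every node with a larger index is a leaf and is already a trivial max-heap.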

16 Build-Max-Heap: Performance
Each call to Max-Heapify takes O(lg n) time, and there are O(n) such calls. Therefore the running time of Build-Max-Heap is O(n lg n). To derive a tighter bound, we observe that the running time of Max-Heapify depends on the node's height. An n-element heap has height floor(lg n), and there are at most ceil(n/2^(h+1)) nodes of any height h. To see this for a complete tree, assume the nodes of height h are at depth x of the original tree. Then we have:

17 Build-Max-Heap: Performance (cont.)
2^0 + ... + 2^x + ... + 2^(x+h) = n
2^(x+h+1) = n + 1
2^x = (n+1)/2^(h+1) = ceil(n/2^(h+1))
The time required by Max-Heapify when called on a node of height h is O(h). Hence the total cost is:
Σ_{h=0}^{floor(lg n)} ceil(n/2^(h+1)) O(h) = O(n Σ_{h=0}^{floor(lg n)} h/2^h)
By formula A.8, for |x| < 1:
Σ_{k=0}^{∞} k x^k = x/(1-x)^2
Substituting x = 1/2:
Σ_{h=0}^{∞} h/2^h = (1/2) / (1 - 1/2)^2 = 2
Thus, the running time of Build-Max-Heap can be bounded as:
O(n Σ_{h=0}^{floor(lg n)} h/2^h) = O(n Σ_{h=0}^{∞} h/2^h) = O(n)
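Both ingredients of this bound can be checked numerically. The sketch below counts the nodes of each height in an n-node heap, compares the counts with ceil(n/2^(h+1)), and evaluates a partial sum of h/2^h. It assumes 1-based node indices; height_counts is a helper name introduced for this example.

```python
def height_counts(n: int) -> dict:
    """Map each height h to the number of nodes of that height in an n-node heap."""
    height = [0] * (n + 1)  # leaves keep height 0; index 0 is unused
    for i in range(n // 2, 0, -1):
        l, r = 2 * i, 2 * i + 1
        # A non-leaf node always has a left child; the right child may be absent.
        height[i] = 1 + max(height[l], height[r] if r <= n else 0)
    counts = {}
    for i in range(1, n + 1):
        counts[height[i]] = counts.get(height[i], 0) + 1
    return counts


n = 7
for h, count in sorted(height_counts(n).items()):
    bound = -(-n // 2 ** (h + 1))  # ceil(n / 2^(h+1)) using integer arithmetic
    print(h, count, bound)

# Partial sum of h / 2^h; the infinite series sums to 2.
print(sum(h / 2 ** h for h in range(60)))
```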

18 The Heapsort Algorithm
The heapsort algorithm first uses Build-Max-Heap on A[1..n]. Since the maximum element of the array is then at A[1], it can be put into its correct final position by exchanging it with A[n]. After the heap size is reduced by one, A[1..(n-1)] can be made into a max-heap again by calling Max-Heapify on the root.
Heapsort(A, n)
1 Build-Max-Heap(A, n)
2 for i = n downto 2
3     swap A[1] with A[i]
4     heap-size[A] = heap-size[A] - 1
5     Max-Heapify(A, 1)
Step 1 takes O(n) time. Loop 2 is repeated (n-1) times, with step 5 taking the most time, O(lg n). Hence the running time of heapsort is O(n) + O(n lg n) = O(n lg n).
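Putting the three procedures together gives the following self-contained Python sketch. It assumes 1-based indexing with an unused A[0] and tracks the heap size in a local variable instead of an attribute of A; these are choices of this sketch, and the sample input is the array from the earlier heap example.

```python
def max_heapify(A: list, i: int, heap_size: int) -> None:
    """Let A[i] float down until the subtree rooted at i is a max-heap."""
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)


def build_max_heap(A: list, n: int) -> None:
    """Turn A[1..n] into a max-heap."""
    for i in range(n // 2, 0, -1):
        max_heapify(A, i, n)


def heapsort(A: list, n: int) -> None:
    """Sort A[1..n] in place in ascending order."""
    build_max_heap(A, n)
    heap_size = n
    for i in range(n, 1, -1):
        A[1], A[i] = A[i], A[1]       # move the current maximum to its final slot
        heap_size -= 1                # shrink the heap to exclude it
        max_heapify(A, 1, heap_size)  # restore the max-heap property at the root


A = [None, 10, 8, 6, 5, 7, 3, 2]
heapsort(A, 7)
print(A[1:])
```

No auxiliary array is allocated: the sorted suffix grows inside the same list as the heap shrinks.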

