Heaps
Dr. Aiman Hanna
Department of Computer Science & Software Engineering, Concordia University, Montreal, Canada

These slides have been extracted, modified and updated from the original slides of:
Data Structures and Algorithms in Java, 5th edition, by Michael T. Goodrich and Roberto Tamassia. John Wiley & Sons.
Data Structures and the Java Collections Framework, 3rd edition, by William J. Collins.
Both books are published by Wiley.

Copyright © Wiley
Copyright © 2010 Michael T. Goodrich, Roberto Tamassia
Copyright © 2011 William J. Collins
Copyright © 2011-2016 Aiman Hanna
All rights reserved
Coverage & Warning
Coverage: Section 8.3 (Heaps).
Warning: the “heap” data structure discussed here has nothing to do with the memory heap used in the run-time environment.
Recall Priority Queue ADT
A priority queue stores a collection of entries; each entry is a pair (key, value).
Main methods of the Priority Queue ADT:
- insert(k, x): inserts an entry with key k and value x
- removeMin(): removes and returns the entry with smallest key
Additional methods:
- min(): returns, but does not remove, an entry with smallest key
- size(), isEmpty()
Applications: standby flyers, auctions, the stock market.
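This ADT maps directly onto Java's built-in heap-based java.util.PriorityQueue. A minimal sketch (the Entry record and the sample entries are ours, purely illustrative):

import java.util.Comparator;
import java.util.PriorityQueue;

public class PQDemo {
    record Entry(int key, String value) {}   // a (key, value) pair

    public static void main(String[] args) {
        // min-oriented priority queue, ordered by key
        PriorityQueue<Entry> pq = new PriorityQueue<>(Comparator.comparingInt(Entry::key));
        pq.offer(new Entry(5, "Pat"));    // insert(k, x)
        pq.offer(new Entry(9, "Jeff"));
        pq.offer(new Entry(2, "Sue"));
        System.out.println(pq.peek());    // min(): Entry[key=2, value=Sue]
        System.out.println(pq.poll());    // removeMin(): removes and returns (2, Sue)
        System.out.println(pq.size() + ", " + pq.isEmpty());  // 2, false
    }
}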
Recall P.Q. Sorting
We use a priority queue:
- Insert the elements with a series of insert operations
- Remove the elements in sorted order with a series of removeMin operations
The running time depends on the priority queue implementation:
- Unsorted sequence gives selection-sort: O(n^2) time
- Sorted sequence gives insertion-sort: O(n^2) time
Can we do better?

Algorithm PQ-Sort(S, C)
  Input: sequence S, comparator C for the elements of S
  Output: sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.remove(S.first())
    P.insertItem(e, ∅)
  while ¬P.isEmpty()
    e ← P.removeMin().getKey()
    S.addLast(e)
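A compact Java rendering of PQ-Sort (a sketch, ours). It uses the heap-based java.util.PriorityQueue, so this particular instance already achieves the O(n log n) bound discussed later; with an unsorted-sequence P.Q. the same skeleton would give selection-sort:

import java.util.*;

public class PQSort {
    // Phase 1: move every element of s into the priority queue.
    // Phase 2: pull the elements back out in non-decreasing order.
    static <E> void pqSort(List<E> s, Comparator<E> c) {
        Queue<E> p = new PriorityQueue<>(c);
        while (!s.isEmpty()) p.add(s.remove(0));
        while (!p.isEmpty()) s.add(p.poll());
    }

    public static void main(String[] args) {
        List<Integer> s = new ArrayList<>(List.of(7, 4, 8, 2, 5, 3, 9));
        pqSort(s, Comparator.naturalOrder());
        System.out.println(s);  // [2, 3, 4, 5, 7, 8, 9]
    }
}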
Heaps
Both the insertion-sort and selection-sort realizations of a P.Q. achieve O(n^2) running time. An efficient realization of a priority queue instead uses a data structure called a heap. This data structure allows us to perform both insertions and removals in logarithmic time, which is a significant improvement over the list-based implementations. Fundamentally, the heap achieves this improvement by abandoning the idea of storing entries in a list; instead, it stores the entries in a binary tree.
Heaps
In other words, a heap is a binary tree storing entries at its nodes. Additionally, a heap satisfies the following properties:
- Heap-Order: for every internal node v other than the root, key(v) ≥ key(parent(v)).
- Complete Binary Tree: let h be the height of the heap; for i = 0, …, h − 1 there are 2^i nodes of depth i; at depth h − 1, the internal nodes are to the left of the external nodes; there is at most one node with a single child, and this child must be a left child.
The last node of a heap is the rightmost node of maximum depth.
Heap-Order Property
Heap-Order Property: this is a relational property. For every internal node v other than the root, key(v) ≥ key(parent(v)). Consequently, the keys encountered on a path from the root to an external node are in non-decreasing order. Additionally, the minimum key (which is the most important one) is hence always stored at the root (the “top of the heap”, hence the name “heap” for this data structure).
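The heap-order property is easy to check mechanically. A small sketch, assuming the array layout introduced later in these slides (index 0 unused, parent of rank i at rank i/2):

public class HeapOrderCheck {
    // Returns true if a[1..n] satisfies the heap-order property.
    static boolean isMinHeap(int[] a, int n) {
        for (int i = 2; i <= n; i++)
            if (a[i] < a[i / 2]) return false;  // a child is smaller than its parent
        return true;
    }

    public static void main(String[] args) {
        int[] a = {0, 2, 5, 6, 9, 7};           // the five-key heap used in these slides
        System.out.println(isMinHeap(a, 5));    // true
    }
}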
Complete Binary Tree Property
Complete Binary Tree Property: this is a structural property. It is needed to ensure that the height of the heap is as small as possible.
Complete Binary Tree Property
A heap T with height h is a complete binary tree if each depth i = 0, …, h − 1 has the maximum possible number of entries, and there is at least one entry at the last depth (that is, depth h). Formally, each level i = 0, …, h − 1 must have 2^i nodes. In a complete binary tree, a node v is to the left of node w if v and w are at the same level and v is encountered before w in a left-to-right scan of that level (equivalently, v comes before w in an inorder traversal).
Complete Binary Tree Property
For instance, the node with entry (15, K) is to the left of the node with entry (7, Q). Similarly, the node with entry (14, E) is to the left of the node with entry (8, W), etc. Another important node is the last node: the right-most, deepest node in the tree. Note: right-most means to the right of all other nodes at its level (it need not be a right child).
Height of a Heap
Theorem: A heap storing n keys has height O(log n).
Proof (applying the complete binary tree property): let h be the height of a heap storing n keys. Since there are 2^i keys at each depth i = 0, …, h − 1 and at least one key at depth h, we have n ≥ 1 + 2 + 4 + … + 2^(h−1) + 1 = 2^h. Thus n ≥ 2^h, i.e., h ≤ log n.
This theorem is very important: it implies that if we can perform updates on a heap in time proportional to its height, then those operations run in logarithmic time.
Heaps and Priority Queues
We can use a heap to implement a priority queue:
- We store a (key, element) item at each internal node, e.g., a heap with (2, Sue) at the root, children (5, Pat) and (6, Mark), and (9, Jeff) and (7, Anna) at the bottom level.
- We keep track of the position of the last node.
Insertion into a Heap
Method insert(k, x) of the priority queue ADT corresponds to the insertion of a key k into the heap. The insertion algorithm consists of three steps:
1. Find the insertion node z (the new last node)
2. Store k at z
3. Restore the heap-order property (discussed next)
Upheap
After the insertion of a new key k, the heap-order property may be violated. Algorithm upheap restores the heap-order property by swapping k along an upward path from the insertion node.
Upheap
Upheap terminates when the key k reaches the root or a node whose parent has a key smaller than or equal to k. Since a heap has height O(log n), upheap runs in O(log n) time.
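A minimal sketch of up-heap bubbling (ours), assuming the array layout introduced later in these slides (index 0 unused, parent of rank i at rank i/2). The example replays the slides' insertion of key 1 into the heap 2, 5, 6, 9, 7:

public class Upheap {
    // Swap the key at index i with its parent until heap order holds.
    static void upheap(int[] h, int i) {
        while (i > 1 && h[i] < h[i / 2]) {
            int tmp = h[i]; h[i] = h[i / 2]; h[i / 2] = tmp;  // swap with parent
            i /= 2;                                           // continue from the parent
        }
    }

    public static void main(String[] args) {
        int[] h = {0, 2, 5, 6, 9, 7, 1};   // key 1 just stored at the new last node
        upheap(h, 6);
        System.out.println(java.util.Arrays.toString(h));  // [0, 1, 5, 2, 9, 7, 6]
    }
}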
Upheap Example
Example: entry (2, T) is inserted as the new last node of the heap rooted at (4, C). Up-heap bubbling then swaps (2, T) with its parent (20, B), then with (6, Z), and finally with the root (4, C); upheap terminates with (2, T) at the root.
Removal from a Heap
Method removeMin() of the priority queue ADT corresponds to the removal of the root entry from the heap. The removal algorithm consists of four steps:
1. Save the root entry (to be returned)
2. Replace the root entry with the entry of the last node w
3. Remove w
4. Restore the heap-order property (discussed next)
Downheap
After replacing the root key with the key k of the last node, the heap-order property may be violated. Algorithm downheap restores the heap-order property by swapping key k along a downward path from the root:
- If the current node has no right child, then the swap (if needed) is with the left child, which is the only child.
- If both left and right children exist, then the swap is with the child holding the smaller key; otherwise only one side of the tree would be corrected, which could force further swap operations.
Downheap
Downheap terminates when key k reaches a leaf or a node whose children have keys greater than or equal to k. The downward swapping process is called down-heap bubbling. Since a heap has height O(log n), downheap runs in O(log n) time.
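The matching sketch of down-heap bubbling on the array layout (again index 0 unused; ours). The example replays the slides' removal from the heap 2, 5, 6, 9, 7, after the last key 7 has been moved to the root:

public class Downheap {
    // Swap the key at index i with its smaller child until heap order holds.
    static void downheap(int[] h, int n, int i) {
        while (2 * i <= n) {
            int c = 2 * i;                           // left child
            if (c + 1 <= n && h[c + 1] < h[c]) c++;  // pick the smaller child
            if (h[i] <= h[c]) break;                 // heap order restored
            int tmp = h[i]; h[i] = h[c]; h[c] = tmp;
            i = c;
        }
    }

    public static void main(String[] args) {
        int[] h = {0, 7, 5, 6, 9};  // last key 7 replaced the removed root 2
        downheap(h, 4, 1);
        System.out.println(java.util.Arrays.toString(h));  // [0, 5, 7, 6, 9]
    }
}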
Downheap Removal Example
Example: removeMin() removes and returns the root entry (4, C); the entry of the last node, (13, W), is moved to the root. Down-heap bubbling then swaps (13, W) with the smaller child (5, A), then with (9, F), and finally with (12, H), at which point heap order is restored.
Updating the Last Node
The insertion node can be found by traversing a path of O(log n) nodes:
- If the current last node is a left child: go up to the parent and down right to the new last node.
- If the current last node is a right child: go up until a left child or the root is reached; if a left child is reached, go to its sibling (the parent's right child); then go down left until a leaf is reached.
A similar algorithm is used for updating the last node after a removal. A linked-node sketch follows.
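A sketch for a linked representation (the Node class and method name are ours, purely illustrative), following the steps above:

public class LastNode {
    static class Node { int key; Node parent, left, right; }

    // Returns the node that becomes the parent of the new last node.
    // Attach the new node as a right child when the returned node already
    // has a left child, and as a left child otherwise.
    static Node findInsertionParent(Node last, Node root) {
        Node cur = last;
        while (cur != root && cur == cur.parent.right)
            cur = cur.parent;          // go up while the current node is a right child
        if (cur != root) {
            if (cur.parent.right == null)
                return cur.parent;     // last was a left child: fill its parent's right slot
            cur = cur.parent.right;    // step over to the sibling
        }
        while (cur.left != null)
            cur = cur.left;            // go down left until a leaf is reached
        return cur;                    // attach as this node's left child
    }
}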
Heap Performance
The performance of a P.Q. realized by means of a heap is as follows:

Operation            Complexity
size(), isEmpty()    O(1)
min()                O(1)
insert()             O(log n)
removeMin()          O(log n)
Heap-Sort
Let us recall the PQ-Sort algorithm:

Algorithm PQ-Sort(S, C)
  Input: sequence S, comparator C for the elements of S
  Output: sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.removeFirst()
    P.insert(e, ∅)
  while ¬P.isEmpty()
    e ← P.removeMin().getKey()
    S.addLast(e)
Heap-Sort
Using a heap-based priority queue, the following is true:
- Each insertion takes O(log n) time. In fact, the i-th insertion (1 <= i <= n) takes O(1 + log i), since the heap holds only i entries at that point.
- Each removal takes O(log n) time. In fact, the i-th removal (1 <= i <= n) takes O(1 + log(n − i + 1)), since earlier removals have already shrunk the heap.
Consequently, the algorithm spends O(n log n) time on the n insertions of the first phase (first loop) and O(n log n) time on the n removals of the second phase (second loop), which leads to a final complexity of O(n log n).
Heap-Sort
The resulting algorithm, using a heap to realize the P.Q., is hence called heap-sort. Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort.
Heap-Sort
Consider a priority queue with n items implemented by means of a heap:
- the space used is O(n)
- methods insert and removeMin take O(log n) time
- methods size, isEmpty, and min take O(1) time
Using a heap-based priority queue, we can therefore sort a sequence of n elements in O(n log n) time; the resulting algorithm is heap-sort.
Array-based Heap Implementation
We can represent a heap with n keys by means of an array of length n + 1:
- For the node at rank (index) i, the left child is at rank 2i and the right child is at rank 2i + 1 (so the parent is at rank i/2, using integer division).
- Links between nodes are not explicitly stored.
- The cell at rank 0 is not used.
- Operation insert corresponds to inserting at rank n + 1.
- Operation removeMin corresponds to removing at rank n.
This yields in-place heap-sort.
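Putting the pieces together, a compact array-based min-heap sketch (class and field names are ours); insert and removeMin inline the up-heap and down-heap loops shown earlier:

public class ArrayHeap {
    private int[] h;     // h[1..n] holds the keys; h[0] is unused
    private int n = 0;   // current number of keys

    ArrayHeap(int capacity) { h = new int[capacity + 1]; }

    void insert(int key) {                  // place at rank n + 1, then upheap
        h[++n] = key;
        for (int i = n; i > 1 && h[i] < h[i / 2]; i /= 2)
            swap(i, i / 2);
    }

    int removeMin() {                       // move the last key to the root, then downheap
        int min = h[1];
        h[1] = h[n--];
        int i = 1;
        while (2 * i <= n) {
            int c = 2 * i;
            if (c + 1 <= n && h[c + 1] < h[c]) c++;
            if (h[i] <= h[c]) break;
            swap(i, c);
            i = c;
        }
        return min;
    }

    private void swap(int i, int j) { int t = h[i]; h[i] = h[j]; h[j] = t; }

    public static void main(String[] args) {
        ArrayHeap pq = new ArrayHeap(8);
        for (int k : new int[]{9, 2, 7, 5, 6}) pq.insert(k);
        while (pq.n > 0) System.out.print(pq.removeMin() + " ");  // 2 5 6 7 9
    }
}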
Merging Two Heaps
We are given two heaps and a key k:
- We create a new heap with a root node storing k and with the two heaps as subtrees.
- We perform downheap to restore the heap-order property.
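A linked-node sketch of the merge (the Node class is ours; it assumes the two heaps have the same size, so the result remains a complete tree). Down-heap here swaps keys rather than relinking nodes:

public class MergeHeaps {
    static class Node { int key; Node left, right; Node(int k) { key = k; } }

    // New root storing k, with h1 and h2 as its subtrees, then downheap.
    static Node merge(Node h1, Node h2, int k) {
        Node r = new Node(k);
        r.left = h1;
        r.right = h2;
        downheap(r);
        return r;
    }

    static void downheap(Node v) {
        while (v.left != null) {                      // in a complete tree, no left child means a leaf
            Node c = v.left;
            if (v.right != null && v.right.key < c.key) c = v.right;  // smaller child
            if (v.key <= c.key) break;                // heap order restored
            int t = v.key; v.key = c.key; c.key = t;  // swap keys downward
            v = c;
        }
    }
}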
Bottom-up Heap Construction
It is possible to construct a heap of n entries in O(n log n) time by simply performing n successive insertions: since the worst case for any of these insertions is O(log n), the total time is O(n log n). However, if all the entries (that is, all the key-value pairs) are known in advance, an alternative approach achieves a complexity of only O(n). This approach is referred to as bottom-up heap construction.
Bottom-up Heap Construction
To simplify the description of the algorithm, assume the final heap is a complete binary tree in which all levels are full; consequently:
n = 2^(h+1) − 1, so h = log(n + 1) − 1
The construction of the heap then consists of h + 1 steps (which is log(n + 1) steps).
Bottom-up Heap Construction
Generally, in each step, pairs of heaps with 2^i − 1 keys are merged to construct larger heaps with 2^(i+1) − 1 keys.
Step 1: construct (n + 1) / 2^1 elementary heaps with one entry each.
Step 2: construct (n + 1) / 2^2 heaps, each storing 3 entries, by joining pairs of elementary heaps and adding a new entry. The new entry is added at the root, and swapping is performed if needed.
Bottom-up Heap Construction
Step h + 1: in this last step, construct (n + 1) / 2^(h+1) = 1 heap. This is the final heap containing all n entries, obtained by joining the last two heaps from Step h (each storing (n − 1)/2 entries). As before, the new entry is added at the root, and swapping is performed if needed.
Note that all swap operations in each of the steps use down-heap bubbling to preserve the heap-order property.
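On the array layout, bottom-up construction has an equivalent and very short formulation (a sketch, ours): down-heap every internal node, from the deepest up to the root. The example uses the fifteen keys of the next slides, listed in the level order of the example's tree:

public class BuildHeap {
    // Down-heap every internal node, deepest first; O(n) overall.
    static void buildHeap(int[] h, int n) {
        for (int i = n / 2; i >= 1; i--) {   // ranks n/2 .. 1 are the internal nodes
            int j = i;
            while (2 * j <= n) {             // down-heap bubbling from rank j
                int c = 2 * j;
                if (c + 1 <= n && h[c + 1] < h[c]) c++;
                if (h[j] <= h[c]) break;
                int t = h[j]; h[j] = h[c]; h[c] = t;
                j = c;
            }
        }
    }

    public static void main(String[] args) {
        int[] h = {0, 10, 7, 8, 25, 5, 11, 27, 16, 15, 4, 12, 6, 9, 23, 20};
        buildHeap(h, 15);
        System.out.println(h[1]);  // 4 -- the smallest key ends up at the root
    }
}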
Bottom-up Heap Construction Example
Step 1: construct eight elementary one-entry heaps with the keys 16, 15, 4, 12, 6, 9, 23, 20.
Step 2: join pairs of elementary heaps under the new roots 25, 5, 11, 27.
Example (contd.)
Where a new root violates the heap-order property, down-heap bubbling is applied; for instance, the root 25 is swapped down below 16 and 15. Step 2 concludes with four 3-entry heaps, rooted at 15, 4, 6 and 20.
Example (contd.)
Step 3: pairs of 3-entry heaps are joined under the new roots 7 and 8. Where the heap-order property is violated, down-heap bubbling is applied again, producing two 7-entry heaps rooted at 4 and 6.
Example (end)
Step 4: the final root 10 is added above the two 7-entry heaps, and down-heap bubbling is applied one last time. The final heap has 4 at the root, with 5 and 6 as its children.
Analysis
Consider any internal node v. The construction of the tree rooted at v (a subtree, in fact, except when v is the root) takes time proportional to the size of the path associated with v. Which path is associated with a node? The associated path p(v) of a node v is the path that goes to the right child of v and then goes left downward until it reaches an external node. (Note that this path may differ from the actual downheap path.)
Analysis
Clearly, the size (number of nodes) of p(v) equals the height of the tree rooted at v, plus 1. Each node in the tree belongs to at most two associated paths: the path p(v) of the node v itself, and the path coming down from its parent. (The root and all the nodes on the left-most root-to-leaf path belong to only one associated path.)
Analysis
Therefore, the total size of the paths associated with the internal nodes of the tree is at most 2n − 1. Consequently, the construction of the tree (all its paths) using bottom-up heap construction runs in O(n) time. Bottom-up heap construction thus speeds up the first phase of heap-sort. Intuitively, n successive insertions may charge O(log n) to every entry, whereas in bottom-up construction most down-heaps start near the bottom of the tree and have very little distance to travel; only the few nodes near the root can move down by a full O(log n) levels.
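A complementary, level-by-level count gives the same bound (a standard argument, not from these slides): a heap with n keys has at most ⌈n / 2^(j+1)⌉ nodes of height j, and a down-heap from a node of height j costs O(j), so the total work is

Σ_{j=0..⌊log n⌋} ⌈n / 2^(j+1)⌉ · O(j) = O( n · Σ_{j≥0} j / 2^j ) = O(n),

since the series Σ_{j≥0} j / 2^j converges to 2.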