Heaps and Heapsort Prof. Sin-Min Lee Department of Computer Science San Jose State University.

Slides:



Advertisements
Similar presentations
COL 106 Shweta Agrawal and Amit Kumar
Advertisements

Analysis of Algorithms
 Chapter 11 has several programming projects, including a project that uses heaps.  This presentation shows you what a heap is, and demonstrates two.
CS 206 Introduction to Computer Science II 03 / 23 / 2009 Instructor: Michael Eckmann.
CS Section 600 CS Section 002 Dr. Angela Guercio Spring 2010.
P p Chapter 10 has several programming projects, including a project that uses heaps. p p This presentation shows you what a heap is, and demonstrates.
Trees II Kruse and Ryba Ch 10.1,10.2,10.4 and 11.3.
CS 315 March 24 Goals: Heap (Chapter 6) priority queue definition of a heap Algorithms for Insert DeleteMin percolate-down Build-heap.
Binary Heaps CSE 373 Data Structures Lecture 11. 2/5/03Binary Heaps - Lecture 112 Readings Reading ›Sections
More sorting algorithms: Heap sort & Radix sort. Heap Data Structure and Heap Sort (Chapter 7.6)
Heapsort. 2 Why study Heapsort? It is a well-known, traditional sorting algorithm you will be expected to know Heapsort is always O(n log n) Quicksort.
CS 206 Introduction to Computer Science II 10 / 31 / 2008 Happy Halloween!!! Instructor: Michael Eckmann.
CS 206 Introduction to Computer Science II 11 / 04 / 2009 Instructor: Michael Eckmann.
Version TCSS 342, Winter 2006 Lecture Notes Priority Queues Heaps.
Chapter 10 Heaps Anshuman Razdan Div of Computing Studies
Source: Muangsin / Weiss1 Priority Queue (Heap) A kind of queue Dequeue gets element with the highest priority Priority is based on a comparable value.
Priority Queues1 Part-D1 Priority Queues. Priority Queues2 Priority Queue ADT (§ 7.1.3) A priority queue stores a collection of entries Each entry is.
PQ, binary heaps G.Kamberova, Algorithms Priority Queue ADT Binary Heaps Gerda Kamberova Department of Computer Science Hofstra University.
Heapsort CIS 606 Spring Overview Heapsort – O(n lg n) worst case—like merge sort. – Sorts in place—like insertion sort. – Combines the best of both.
1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 17: Binary Search Trees; Heaps.
Maps A map is an object that maps keys to values Each key can map to at most one value, and a map cannot contain duplicate keys KeyValue Map Examples Dictionaries:
Heapsort Based off slides by: David Matuszek
1 HEAPS & PRIORITY QUEUES Array and Tree implementations.
Compiled by: Dr. Mohammad Alhawarat BST, Priority Queue, Heaps - Heapsort CHAPTER 07.
Computer Science and Software Engineering University of Wisconsin - Platteville 12. Heap Yan Shi CS/SE 2630 Lecture Notes Partially adopted from C++ Plus.
PRIORITY QUEUES (HEAPS). Queues are a standard mechanism for ordering tasks on a first-come, first-served basis However, some tasks may be more important.
Binary Heap.
The Binary Heap. Binary Heap Looks similar to a binary search tree BUT all the values stored in the subtree rooted at a node are greater than or equal.
Priority Queues and Binary Heaps Chapter Trees Some animals are more equal than others A queue is a FIFO data structure the first element.
Chapter 21 Binary Heap.
Chapter 11 Heap. Overview ● The heap is a special type of binary tree. ● It may be used either as a priority queue or as a tool for sorting.
Data Structures Week 8 Further Data Structures The story so far  Saw some fundamental operations as well as advanced operations on arrays, stacks, and.
Chapter 21 Priority Queue: Binary Heap Saurav Karmakar.
P p Chapter 10 has several programming projects, including a project that uses heaps. p p This presentation shows you what a heap is, and demonstrates.
Sorting. Pseudocode of Insertion Sort Insertion Sort To sort array A[0..n-1], sort A[0..n-2] recursively and then insert A[n-1] in its proper place among.
Priority Queue. Priority Queues Queue (FIFO). Priority queue. Deletion from a priority queue is determined by the element priority. Two kinds of priority.
1 Heaps (Priority Queues) You are given a set of items A[1..N] We want to find only the smallest or largest (highest priority) item quickly. Examples:
Priority Queues and Heaps. October 2004John Edgar2  A queue should implement at least the first two of these operations:  insert – insert item at the.
CSC211 Data Structures Lecture 18 Heaps and Priority Queues Instructor: Prof. Xiaoyan Li Department of Computer Science Mount Holyoke College.
CPSC 252 Binary Heaps Page 1 Binary Heaps A complete binary tree is a binary tree that satisfies the following properties: - every level, except possibly.
David Luebke 1 12/23/2015 Heaps & Priority Queues.
Heapsort. What is a “heap”? Definitions of heap: 1.A large area of memory from which the programmer can allocate blocks as needed, and deallocate them.
INTRO TO HEAP & Adding and Removing a NODE Presented to: Sir AHSAN RAZA Presented by: SHAH RUKH Roll #07-22 Semester BIT 3 rd Session 2007—2011 Department.
CSC212 Data Structure Lecture 16 Heaps and Priority Queues Instructor: George Wolberg Department of Computer Science City College of New York.
Lecture 15 Jianjun Hu Department of Computer Science and Engineering University of South Carolina CSCE350 Algorithms and Data Structure.
Lecture 8 : Priority Queue Bong-Soo Sohn Assistant Professor School of Computer Science and Engineering Chung-Ang University.
FALL 2005CENG 213 Data Structures1 Priority Queues (Heaps) Reference: Chapter 7.
1 Heap Sort. A Heap is a Binary Tree Height of tree = longest path from root to leaf =  (lgn) A heap is a binary tree satisfying the heap condition:
AVL Trees and Heaps. AVL Trees So far balancing the tree was done globally Basically every node was involved in the balance operation Tree balancing can.
Priority Queues, Heaps, and Heapsort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.
1 Chapter 6 Heapsort. 2 About this lecture Introduce Heap – Shape Property and Heap Property – Heap Operations Heapsort: Use Heap to Sort Fixing heap.
Sorting Cont. Quick Sort As the name implies quicksort is the fastest known sorting algorithm in practice. Quick-sort is a randomized sorting algorithm.
Heaps, Heap Sort, and Priority Queues. Background: Binary Trees * Has a root at the topmost level * Each node has zero, one or two children * A node that.
"Teachers open the door, but you must enter by yourself. "
Source: Muangsin / Weiss
Bohyung Han CSE, POSTECH
Heap Sort Example Qamar Abbas.
Design and Analysis of Algorithms
i206: Lecture 14: Heaps, Graphs intro.
CSC212 Data Structure - Section RS
Tree Representation Heap.
Heaps Chapter 10 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates two.
Heaps Chapter 10 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates two.
Heap Sort The Heap Data Structure
"Teachers open the door, but you must enter by yourself. "
Heaps Chapter 11 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates.
Heaps Chapter 11 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates.
Heaps & Multi-way Search Trees
Heaps Chapter 10 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates two.
Heaps.
Presentation transcript:

Heaps and Heapsort Prof. Sin-Min Lee Department of Computer Science San Jose State University

Heap Definition. Adding a Node. Removing a Node. Array Implementation. Analysis Source Code. Heaps

What is Heap? A heap is like a binary search tree. It is a binary tree where the six comparison operators form a total order semantics that can be used to compare the node’s entries. But the arrangement of the elements in a heap follows some new rules that are different from a binary search tree.

Heaps A heap is a certain kind of complete binary tree. It is defined by the following properties.

Heaps (Priority Queues) A heap is a data structure which supports efficient implementation of three operations: insert, findMin, and deleteMin. (Such a heap is a min-heap, to be more precise.) Two obvious implementations are based on an array (assuming the size n is known) and on a linked list. The comparisons of their relative time complexity are as follows: insert findMin deleteMin Sorted array O(n) O(1) O(1) (from large (to move (find at end) (delete the end) to small) elements) Linked list O(1) O(1) O(n) (append) (a pointer to Min) (reset Min pointer)

Heaps 1. All leaf nodes occur at adjacent levels. When a complete binary tree is built, its first node must be the root. When a complete binary tree is built, its first node must be the root. Root

Heaps 2. All levels of the tree, except for the bottom level are completely filled. All nodes on the bottom level occur as far left as possible. Left child of the root The second node is always the left child of the root. The second node is always the left child of the root.

Heaps 3. The key of the root node is greater than or equal to the key of each child. Each subtree is also a heap. Right child of the root The third node is always the right child of the root. The third node is always the right child of the root.

Heaps The next nodes always fill the next. level from left-to-right. The next nodes always fill the next. level from left-to-right.

Heaps The next nodes always fill the next level from left-to-right. The next nodes always fill the next level from left-to-right.

Heaps Complete binary tree.

Heap Storage Rules A heap is a binary tree where the entries of the nodes can be compared with a total order semantics. In addition, these two rules are followed: The entry contained by a node is greater than or equal to the entries of the node’s children. The tree is a complete binary tree, so that every level except the deepest must contain as many nodes as possible; and at the deepest level, all the nodes are as far left as possible.

Heaps A heap is a certain kind of complete binary tree The "heap property" requires that each node's key is >= the keys of its children The "heap property" requires that each node's key is >= the keys of its children

Heaps This tree is not a heap. It breaks rule number

Heaps This tree is not a heap. It breaks rule number

Heaps This tree is not a heap. It breaks rule number

Heap implementation of priority queue Each node of the heap contains one entry along with the entry’s priority, and the tree is maintained so that it follows the heap storage rules using the entry’s priorities to compare nodes. Therefore: The entry contained by a node has a priority that is greater than or equal to the priority of the entries of the nodes children. The tree is a complete binary tree.

Adding a Node to a Heap ¶ Put the new node in the next available spot

Adding a Node to a Heap ·Push the new node upward, swapping with its parent until the new node reaches an acceptable location

Adding a Node to a Heap ·Push the new node upward, swapping with its parent until the new node reaches an acceptable location

Adding a Node to a Heap 4The parent has a key that is >= new node, or 4The node reaches the root. ÚThe process of pushing the new node upward is called reheapification upward

Removing the Top of a Heap ¶Move the last node onto the root

Removing the Top of a Heap ¶Move the last node onto the root

Removing the Top of a Heap ·Push the out-of- place node downward, swapping with its larger child until the new node reaches an acceptable location

Removing the Top of a Heap ·Push the out-of- place node downward, swapping with its larger child until the new node reaches an acceptable location

Removing the Top of a Heap ·Push the out-of- place node downward, swapping with its larger child until the new node reaches an acceptable location

Removing the Top of a Heap 4The children all have keys <= the out-of-place node, or 4The node reaches the leaf. ØThe process of pushing the new node downward is called reheapification downward

Adding an Entry to a Heap As an example, suppose that we already have nine entries that are arranged in a heap with the above priorities.

Suppose that we are adding a new entry with a priority 42.The first step is to add this entry in a way that keeps the binary tree complete. In this case the new entry will be the left child of the entry with priority

The structure is no longer a heap, since the node with priority 21 has a child with a higher priority. The new entry (with priority 42) rises upward until it reaches an acceptable location. This is accomplished by swapping the new entry with its parent

The new entry is still bigger than its parent, so a second swap is done

Adding an entry to a priority queue Pseudocode for Adding an entry The priority queue has been implemented as a heap. 1.Place the new entry in the heap in the first available location.This keeps the structure as a complete binary tree, but it might no longer be a heap since the new entry might have a higher priority than its parent. 2. while (the new entry has a priority that is higher than its parent) swap the new entry with its parent.

Reheapification upward Now the new entry stops rising, because its parent (with priority 45) has a higher priority than the new entry. In general, the “new entry rising” stops when the new entry reaches the root. The rising process is called reheapification upward.

Implementing a Heap We will store the data from the nodes in a partially-filled array. An array of data

Implementing a Heap Data from the root goes in the first location of the array. An array of data

Implementing a Heap Data from the next row goes in the next two array locations. An array of data

Implementing a Heap Data from the next row goes in the next two array locations. Any node’s two children reside at indexes (2n) and (2n + 1) in the array. An array of data

Implementing a Heap Data from the next row goes in the next two array locations. An array of data We don't care what's in this part of the array.

Important Points about the Implementation The links between the tree's nodes are not actually stored as pointers, or in any other way. The only way we "know" that "the array is a tree" is from the way we manipulate the data. An array of data

Important Points about the Implementation If you know the index of a node, then it is easy to figure out the indexes of that node's parent and children. [1] [1] [2] [3] [4] [5]

Removing an entry from a Heap When an entry is removed from a priority, we must always remove the entry with the highest priority - the entry that stands “on top of heap”. For example, the entry at the root, with priority 45 will be removed.

The highest priority item, at the root, will be removed from the heap

Removing an entry from a heap During the removal, we must ensure that the resulting structure remains a heap. If the root entry is the only entry in the heap, then there is really no more work to do except to decrement the member variable that is keeping track of the size of the heap. But if there are other entries in the tree, then the tree must be rearranged because a heap is not allowed to run around without a root. The rearrangement begins by moving the last entry in the last level upto the root, like this :

Before the move After the move The last entry in the last level has been moved to the root. Removing an entry from a Heap

Heapsort Heapsort combines the time efficiency of mergesort and the storage efficiency of quicksort. Like mergesort, heapsort has a worst case running time that is O(nlogn), and like quicksort it does not require an additional array. Heapsort works by first transforming the array to be sorted into a heap.

Heap representation A heap is a complete binary tree, and an efficient method of representing a complete binary tree in an array follows these rules: Data from the root of the complete binary tree goes in the [0] component of the array. The data from depth one nodes is placed in the next two components of the array. We continue in this way, placing the four nodes of depth two in the next four components of the array, and so forth. For example, a heap with ten nodes can be stored in a ten element array as shown below:

Rules for location of data in the array The data from the root always appears in the [0] component of the array. Suppose that the data for a root node appears in the component [i] of the array. Then the data for its parent is always at location [(i - 1) / 2] (using integer division).

Suppose that the data for a node appears in component [i] of the array. Then its children (if they exist)always have their data at these locations: Left child at component [2i + 1] Right child at component [2i + 2]

The largest value in a heap is stored in the root and that in our array representation of a complete binary tree, the root node is stored in the first array position. Thus, since the array represents a heap, we know that the largest array element is in the first array position. To get the largest value in the correct array position, we simply interchange the first and the final array elements. After interchanging these two array elements, we know that the largest array element is in the final array position, as shown below:

The dark vertical line in the array separates the sorted side of the array from the unsorted side (on the left). Moreover the left side still represents a complete binary tree which is almost a heap

When we consider the unsorted side as a complete binary tree, it is only the root that violates the heapstorage rule. This root is smaller than its children. The next step of the heapsort is to reposition this out of place value in order to restore the heap. The process called reheapification downward begins by comparing the value in the root with the value of each of its children. If one or both children are larger than the root, then the root’s value is exchanged with the larger of the two children. This moves the troublesome value down one level in the tree. For example:

In the example 5 is still out of place, so we will once again exchange it with its largest child resulting in the array and heap shown below:

The unsorted side of the array must be maintained a heap. When the unsorted side of the array is once again a heap, the heapsort continues by exchanging the largest element in the unsorted side with the rightmost element of the unsorted side. For example 42 is exchanged with 4 as shown below:

The sorted side now has two largest elements, and the unsorted side is once again almost a heap, as shown below : Only the root (4) is out of place, and that may be fixed by another reheapification downward. After the reheapification downward, the largest value of the unsorted side will once again reside at location [0]and we can continue to pull out the next largest value.

Pseudocode for the heapsort algorithm //Heapsort for the array called data with n elements 1. Convert the array of n elements into a heap. 2. unsorted = n; // The number of elements in the unsorted side 3. while ( unsorted > 1) { // Reduce the unsorted side by one unsorted - -; Swap data[0] with data [unsorted]. The unsorted side of the array is now a heap with the root out of place. Do a reheapification downward to turn the unsorted side back into a heap. }

While the Heapsort is a constant factor slower than the Quicksort for average data sets, it still has a complexity of O(n log n) for best, worst and average data. Reheapification is the process by which an out of place node is exchanged with its parent (upward) or its children (downward) until the node’s key is = to it’s children. Then reheapification is complete. Analysis

A simpler but related problem: Find the smallest and the second smallest elements in an array (of size n): A straightforward method: Find the smallest using n –1 comparisons; swap it to the end of the array, find the next smallest in n –2 comparisons, for a total of 2n – 3 comparisons. A method that “saves” results of some earlier comparisons: (1) Divide the array into two halves, find the smallest elements of each half, call them min1 and min2, respectively; (2) Compare min1 and min2, to find the smallest; (3) If min1 is the smallest, compare min2 with the remaining elements of the first half to determine the second smallest; similarly for the case if min2 is the smallest. The number of comparisons is (n –1) + (  n/2  –1) = (  3n/2  –2.

The second method can be depicted in the following figure: min1min2 min First half of arraySecond half of array In the figure, a link connects a smaller element to a larger one going downward; for example, min  min1 and min  min2, and min1  each element in the first half of the array, similarly for min2. This organization facilitates finding the smallest and second smallest elements. H1H1 H2H2

A binary tree data structure for Min-heaps: A binary tree is called a left-complete binary tree if (1)the tree is full at each level except possibly at the maximum level (depth), where the tree is full at level i means there are 2 i nodes at that level (recall the root is at level 0); and (2)at the tree’s maximum level h, if there are fewer than 2 h nodes then the missing nodes are on the right side of the tree. Missing on the right side If there are n nodes in a left- complete binary tree of depth h, then 2 h  n  2 h+1 –1. Thus, h  lg n < (h +1). In particular, h = O(lg n).

A binary tree satisfies the (min-)heap-order property if the value at each is less than or equal to the value at each of the child nodes (if exist). A binary min-heap is a left-complete binary tree that also satisfies the heap-order property Note that there is no definite order between values in sibling nodes; also, it is obvious that the minimum value is at the root. A binary min-heap

Two basic operations on binary min-heaps: (1)Suppose the value at a node is decreased which causes violation to the heap-order property (because the value is smaller than that of its parent’s). The operation to restore the heap property is named percolateUp (siftUp). (2)Suppose the value at a node is increased which becomes larger than either (or both) of the value of the children’s. The operation to restore the heap order is named percolateDown (siftDown). Both operations can be completed in time proportional to the tree height, thus, of time complexity O(lg n) where n is the total number of nodes. These operations are crucial in supporting the heap operations insert and deleteMin.

The percolateUp (siftUp) operation: The heap order violated at node valued percolateUp percolateUp Note that the time of each step (each level) is constant, i.e., compare with parent and swap

The percolateDown (siftDown) operation: The heap order violated at node valued 43 Each step of percolateDown finds the smaller of the two children (if exist) then swap it with the violating node Who is smaller ? percolateDown percolate- Down

Implementation of the heap operations: insert(): insert a node next to the last node, call percolateUp from it, treating the new node as a violation. deleteMin(): delete the root node, move the last node to the root, then call percolateDown from the root. findMin(): return the root’s value. Finally, a binary heap can be implemented using an array storing the heap elements from the root towards the leaves, level by level and from left to right within each level. The left-completeness property guarantees that there are no “holes” in the array. Also, if we use array indexes 1..n to store n elements, the parent’s index (if exists) of node i is  i/2 , the left child’s index is 2i, the right child 2i + 1 (if exist). The time complexity of deleteMin and insert both are O(lg n); the time complexity of findMin is O(1).

An array implementation of a binary heap: Number the nodes (starting at 1) by levels, from top to bottom and left to right within level Level 2 nodesLevel 1 nodesRoot Parent to children (index i to 2i, 2i+1)

Convert an array T[1..n] into a heap (known as heapify): A top-down method: repeatedly call insert() by inserting T[i] into a heap T[1..(i –1)], for i = 2 to n Initially, T[1] by itself ia a heap Insert 12 into T[1..1], making T[1..2] a heap Insert 9 into T[1..2], making T[1..3] a heap Insert 13 into T[1..3], making T[1..4} a heap Insert 7 into T[1..4], making T[1..5] a heap The total time complexity of top-down heapify is O(n lg n).

A bottom-up heapify method: For i =  n/2  down to 1 /* percolate down T[i], making the subtree rooted at i a heap */ call percolateDown(i) n = 5, start at node  n/2 , the node of the highest index that has any children Complexity analysis: Node1 travels down h levels during percolateDown, nodes 2 and 3 each (h –1) levels, nodes 4 through 7 each (h –2) levels, etc. The total number of levels traveled is  h + 2(h –1) + 4 (h –2) + …+ 2 h –1 (h – (h – 1)) = 2 h –1 (1 + 2(1/2) + 3(1/2) 2 + 4(1/2) 3 + …)  2n, because 2 h –1 < n/2 since h  lg n, and the infinite series 1 + 2x + 3x 2 + … = 1/(1 –x) 2 when |x| < 1. Thus, the time complexity of heapify is O(n).