CSE 373: Data Structures and Algorithms, Lecture 12: Binary Heaps. Instructor: Lilian de Greef. Quarter: Summer 2017

Announcements
Midterm on Friday. Practice midterms are on the course website; note that some may cover slightly different material. The exam will start at 10:50 and end promptly at 11:50 (even if you're late), so be early. Homework 3 grades will be back before the midterm. Reminder: course feedback session on Wednesday.

Priority Queue ADT
Meaning: a priority queue holds comparable data.
Key property: deleteMin returns and deletes the item with the highest priority (ties can be resolved arbitrarily).
Operations: insert, deleteMin, isEmpty.
(Diagram: items such as 7, 18, 12, 23, 3, 45, 15, 6, 2 go in through insert and come out through deleteMin in priority order.)
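To make the ADT concrete, here is a minimal sketch of what the interface might look like in Java. The interface name and the Comparable bound are illustrative assumptions; the slide only names the three operations.

// Sketch of the priority queue ADT as a Java interface (names are illustrative).
interface MinPriorityQueue<E extends Comparable<E>> {
    void insert(E item);   // add an item to the queue
    E deleteMin();         // remove and return the highest-priority (minimum) item
    boolean isEmpty();     // true if the queue holds no items
}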

Finding a good data structure
We will show an efficient, non-obvious data structure for this ADT. But first let's analyze some "obvious" ideas for n data items:

data structure          insert: algorithm / time             deleteMin: algorithm / time
unsorted array          add at end / O(1)                    search / O(n)
unsorted linked list    add at front / O(1)                  search / O(n)
sorted circular array   search + shift / O(n)                move front / O(1)
sorted linked list      put in right place / O(n)            remove at front / O(1)
binary search tree      put in right place / O(n) worst      leftmost / O(n) worst
AVL tree                put in right place / O(log n)        leftmost / O(log n)

Our data structure
A binary min-heap (or just binary heap, or just heap) has:
Structure property: it is a complete binary tree.
Heap property: the priority of every (non-root) node is less important than the priority of its parent.
So: Where is the highest-priority item? At the root.
Where is the lowest-priority item? At a leaf node.
What is the height of a heap with n items? ⌊log₂ n⌋, because the tree is complete.
(Diagram: example min-heaps.)

deleteMin: Step #1: remove (and save, to return later) the value at the root node.
(Diagram: an example heap containing the values 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, with 1 at the root.)

deleteMin: Step #2 (Keep Structure Property): we want to keep the structure property, so move the last node in the bottom row into the hole left at the root.
(Diagram: the same heap with the root removed.)

deleteMin: Step #3: now we want to restore the heap property.
(Diagram: the last node has been moved to the root, which breaks the heap property.)

deleteMin: Step #3 (Restore Heap Property)
Percolate down:
Compare the priority of the item with its children.
If the item has lower priority, swap it with the most important child.
Repeat until both children have lower priority or we've reached a leaf node.
(Diagram: the value moved to the root percolates down, swapping with the smaller child at each level.)
What is the run time? O(height of heap) = O(log n), since the height of a complete binary tree of n nodes is ⌊log₂(n)⌋.

deleteMin: Run Time Analysis
Run time is O(height of the heap).
A heap is a complete binary tree.
So its height with n nodes is ⌊log₂(n)⌋.
So the run time of deleteMin is O(log n).

insert: Step #1: put the new item in the next available position in the bottom row, so the structure property is kept.
(Diagram: an example heap; the new value, 2, is placed in the next open spot in the bottom row.)

insert: Step #2: with the structure property preserved, we now need to restore the heap property.
(Diagram: the same heap with the new item in the last position.)

insert: Step #2 (Restore Heap Property)
Percolate up:
Put the new data in the new location.
If it has higher priority than its parent, swap it with the parent.
Repeat until the parent is more important or we've reached the root.
(Diagram: the new item percolates up, swapping with its parent at each level.)
What is the running time? O(height of heap) = O(log n).

Summary: basic idea for operations
findMin: return root.data.
deleteMin: answer = root.data; move the right-most node in the last row to the root to restore the structure property; "percolate down" to restore the heap property.
insert: put the new node in the next position on the bottom row to restore the structure property; "percolate up" to restore the heap property.
Overall strategy: preserve the structure property, then restore the heap property.

Binary Heap Operations
O(log n) insert and O(log n) deleteMin worst-case, with very good constant factors.
If items arrive in random order, then insert is O(1) on average.
This is possible because we don't support unneeded operations; there is no need to maintain a full sort.

Summary: Priority Queue ADT
Priority Queue ADT: insert a comparable object, deleteMin.
Binary heap data structure: a complete binary tree in which each node has a less important priority value than its parent.
insert and deleteMin operations = O(height-of-tree) = O(log n).
insert: put the item at the new last position in the tree and percolate up.
deleteMin: remove the root, put the last element at the root, and percolate down.
(Diagram: the ADT view of insert/deleteMin, and an example heap with values 10, 20, 80, 40, 60, 85, 99, 700, 50.)
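Not from the slides, but useful for trying the ADT out: Java's standard library provides a priority queue backed by exactly this kind of binary min-heap (java.util.PriorityQueue), where insert is offer and deleteMin is poll.

import java.util.PriorityQueue;

public class PriorityQueueDemo {
    public static void main(String[] args) {
        // java.util.PriorityQueue is a binary min-heap under the hood.
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        pq.offer(18);                      // insert
        pq.offer(3);
        pq.offer(45);
        pq.offer(7);
        System.out.println(pq.poll());     // deleteMin: prints 3
        System.out.println(pq.poll());     // prints 7
        System.out.println(pq.isEmpty());  // prints false (45 and 18 remain)
    }
}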

Binary Trees Implemented with an Array
From node i:
left child: i*2
right child: i*2 + 1
parent: ⌊i/2⌋
(Wasting index 0 is convenient for the index arithmetic.)
If the tree is complete, the array won't have empty slots (except maybe at the end).
(Diagram: a tree with nodes A through L stored level by level in array positions 1, 2, 3, ….)
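A minimal Java sketch of that index arithmetic (the class and method names are mine; the slide only gives the formulas). Index 0 of the array is left unused.

class HeapIndex {
    static int leftChild(int i)  { return 2 * i; }
    static int rightChild(int i) { return 2 * i + 1; }
    static int parent(int i)     { return i / 2; }  // integer division = floor, for i >= 1
}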

Judging the array implementation
Pros:
Non-data space is just index 0 and the unused space on the right. (In a conventional pointer-based tree representation there is one edge per node except the root, so roughly n-1 pointers' worth of overhead, like linked lists.) The array would waste more space only if the tree were not complete.
Multiplying and dividing by 2 is very fast (shift operations in hardware).
The last used position is just the size.
Cons:
The same might-be-empty or might-get-full problems we saw with array-based stacks and queues (resize by doubling as necessary).
Pros outweigh cons: min-heaps almost always use the array implementation.
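The cons above mention resizing by doubling. Here is a minimal sketch of what that resize() helper, called by the insert semi-pseudocode a few slides below, might look like; the arr and size fields follow that pseudocode, and this is my illustration rather than lecture code.

// Double the backing array when it fills up (amortized O(1) extra cost per insert).
void resize() {
    int[] bigger = new int[arr.length * 2];
    for (int j = 1; j <= size; j++)   // index 0 stays unused
        bigger[j] = arr[j];
    arr = bigger;
}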

Practice time!
Starting with an empty array-based binary heap, which is the result (array positions 1-7) after
insert (in this order): 16, 32, 4, 67, 105, 43, 2
and then deleteMin once?
A) 4, 16, 32, 43, 67, 105
B) 16, 32, 4, 67, 105, 43
C) 4, 32, 16, 43, 105, 67
D) 4, 32, 16, 67, 105, 43

(extra space for your scratch work and notes)

Semi-Pseudocode: insert into binary heap

void insert(int val) {
  if(size == arr.length-1)
    resize();                     // grow the array if it is full
  size++;
  i = percolateUp(size, val);     // find where val belongs, shifting parents down
  arr[i] = val;
}

int percolateUp(int hole, int val) {
  while(hole > 1 && val < arr[hole/2]) {   // while val is higher priority than its parent
    arr[hole] = arr[hole/2];               // move the parent down into the hole
    hole = hole / 2;
  }
  return hole;
}

This pseudocode uses ints. In real use, you will have data nodes with priorities.
(Diagram: the example heap with array contents 10, 20, 80, 40, 60, 85, 99, 700, 50 in positions 1-9.)

Semi-Pseudocode: deleteMin from binary heap

int deleteMin() {
  if(isEmpty())
    throw…                                 // can't deleteMin from an empty heap
  ans = arr[1];                            // save the root's value to return
  hole = percolateDown(1, arr[size]);      // find where the last item belongs
  arr[hole] = arr[size];
  size--;
  return ans;
}

int percolateDown(int hole, int val) {
  while(2*hole <= size) {                  // while the hole has at least one child
    left = 2*hole;
    right = left + 1;
    if(right > size || arr[left] < arr[right])
      target = left;                       // target = the more important (smaller) child
    else
      target = right;
    if(arr[target] < val) {
      arr[hole] = arr[target];             // move that child up into the hole
      hole = target;
    } else
      break;
  }
  return hole;
}

(Diagram: the same example heap, with array contents 10, 20, 80, 40, 60, 85, 99, 700, 50 in positions 1-9.)

Example
insert (in this order): 16, 32, 4, 67, 105, 43, 2, then deleteMin once. (Work through the inserts in array positions 1-7 before looking at the next slide.)

Example
insert (in this order): 16, 32, 4, 67, 105, 43, 2, then deleteMin once.
After the seven inserts, the array (positions 1-7) holds 2, 32, 4, 67, 105, 43, 16; after deleteMin it holds 4, 32, 16, 67, 105, 43.

Other operations
decreaseKey: given a pointer to an object in the priority queue (e.g., its array index), lower its priority value by p. Change the priority and percolate up.
increaseKey: given a pointer to an object in the priority queue (e.g., its array index), raise its priority value by p. Change the priority and percolate down.
remove: given a pointer to an object in the priority queue (e.g., its array index), remove it from the queue. decreaseKey with p = ∞, then deleteMin.
Running time for all of these operations? O(log n).
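A minimal sketch of decreaseKey under the same assumptions as the earlier semi-pseudocode (an array-based min-heap with arr, size, and the percolateUp helper from the insert slide); the method name and signature are illustrative, not from the lecture.

// Lower the priority value of the item at array index i by p, then percolate it up.
void decreaseKey(int i, int p) {
    int newVal = arr[i] - p;           // smaller priority value = more important
    int hole = percolateUp(i, newVal); // walk up from index i, shifting parents down
    arr[hole] = newVal;
}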

Build Heap
Suppose you have n items to put in a new (empty) priority queue. Call this operation buildHeap.
n inserts works: it is the only choice if the ADT doesn't provide buildHeap explicitly, and it takes O(n log n).
Why would an ADT provide this "unnecessary" operation?
Convenience.
Efficiency: an O(n) algorithm.
Common issue in ADT design: how many specialized operations to provide.

heapify (Floyd's Method)
Use the n items to make any complete tree you want; that is, put them in array indices 1, …, n.
Then fix the heap-order property bottom-up: percolate down starting at the nodes one level up from the leaves, and work up toward the root.

heapify (Floyd's Method): Example
Use the n items to make any complete tree you want, then fix the heap-order property from the bottom up.
(Diagram: a complete tree containing the values 1 through 12 in arbitrary order.)
Which nodes break the heap-order property?
Why work from the bottom up to fix them?
Why start at one level above the leaf nodes? Where do we start here?

heapify (Floyd's Method): Example (continued)
(The next several slides show the same tree after each percolate-down step, starting one level above the leaves and working up to the root; the diagrams are not reproduced in this transcript.)

heapify (Floyd's Method)

void buildHeap() {
  for(int i = size/2; i > 0; i--) {   // start at the last non-leaf node, work toward the root
    val = arr[i];
    hole = percolateDown(i, val);     // percolate the value down within i's subtree
    arr[hole] = val;
  }
}

… it "seems to work". But is it right?
Let's prove it restores the heap property, then let's prove its running time.

Correctness

void buildHeap() {
  for(i = size/2; i > 0; i--) {
    val = arr[i];
    hole = percolateDown(i, val);
    arr[hole] = val;
  }
}

Loop invariant: for all j > i, arr[j] is higher priority than its children.
(A loop invariant is a statement that holds before and after each iteration of a loop.)
True initially: if j > size/2, then j is a leaf; otherwise its left child would be at a position > size.
True after one more iteration: the loop body and percolateDown make arr[i] higher priority than its children without breaking the property for any descendants.
So after the loop finishes, every node is less than (i.e., higher priority than) its children, and the heap property holds.

Efficiency

void buildHeap() {
  for(i = size/2; i > 0; i--) {
    val = arr[i];
    hole = percolateDown(i, val);
    arr[hole] = val;
  }
}

buildHeap is O(n log n), where n is the size.
Easier argument: there are about n/2 loop iterations, each iteration does one percolateDown, and each percolateDown is O(log n).
This is correct, but there is a more precise ("tighter") analysis of the algorithm…

Efficiency (continued)

buildHeap is O(n), where n is the size.
Better argument:
size/2 total loop iterations: O(n).
1/2 of the loop iterations percolateDown at most 1 level.
1/4 of the loop iterations percolateDown at most 2 levels.
1/8 of the loop iterations percolateDown at most 3 levels.
…
((1/2) + (2/4) + (3/8) + (4/16) + …) < 2 (page 4 of Weiss)
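As a worked version of that bound (my notation, not the slides'): roughly n/2^(d+1) nodes sit d levels above the leaves, and each of them percolates down at most d levels, so

\[
\text{total work} \;\le\; \sum_{d \ge 1} \frac{n}{2^{d+1}} \cdot d
\;=\; \frac{n}{2} \sum_{d \ge 1} \frac{d}{2^{d}}
\;=\; \frac{n}{2} \cdot 2 \;=\; n,
\]

using the same series ((1/2) + (2/4) + (3/8) + …) = 2 that the slide cites, so buildHeap is O(n).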

Lessons from buildHeap
Without providing buildHeap, clients can implement their own that runs in O(n log n) worst case.
By providing a specialized operation (with access to the internal data), we can do O(n) worst case.
Intuition: most of the data is near a leaf, so it is better to percolate down.
Can analyze this algorithm for:
Correctness: a non-trivial inductive proof using the loop invariant.
Efficiency: the first (easier) analysis proved it was O(n log n); the tighter analysis shows the same algorithm is O(n).

Other branching factors for heaps
d-heaps: each node has d children instead of 2, which makes heaps shallower.
Example: a 3-heap. The only difference is three children instead of 2; we still use an array with all positions from 1 … heapSize.
Indices for a 3-heap (children of node i are at 3i-1, 3i, 3i+1):

Index   Children's indices
1       2, 3, 4
2       5, 6, 7
3       8, 9, 10
4       11, 12, 13
5       14, 15, 16
…       …
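A minimal sketch of the general d-ary index arithmetic in Java (setting d = 2 recovers the binary-heap formulas from earlier; the class and method names are mine, not the slides').

class DHeapIndex {
    // First child of node i in a 1-based array; the children are
    // firstChild(i, d), firstChild(i, d) + 1, ..., firstChild(i, d) + d - 1.
    static int firstChild(int i, int d) { return d * (i - 1) + 2; }

    // Parent of node c; integer division gives the floor.
    static int parent(int c, int d)     { return (c + d - 2) / d; }
}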

Wrapping up Heaps What are heaps a data structure for? What is it usually implemented with? Why? What are some example uses?