2015-T2 Lecture 30 School of Engineering and Computer Science, Victoria University of Wellington © Lindsay Groves, Marcus Frean, Peter Andreae, and Thomas.


COMP 103 HeapSort
Marcus Frean
2015-T2 Lecture 30
School of Engineering and Computer Science, Victoria University of Wellington
© Lindsay Groves, Marcus Frean, Peter Andreae, and Thomas Kuehne, VUW

2 RECAP-TODAY
RECAP
- Priority Queue, POTs, Heaps
- Priority Queue ⇒ another fast sorting algorithm!
TODAY
- HeapSort: the PriorityQueue sort, done in an array
- Introduction to Hashing (maybe)

3 recap: Heap
- “Heap” = a complete POT, implemented in an array
- Bottom right node is the last element used
- We can compute the index of the parent and children of a node:
  - the children of node i are at (2i+1) and (2i+2)
  - the parent of node i is at (i-1)/2 (integer division)
- note: no gaps!
[Figure: a heap shown both as a tree and as an array, holding Bee 35, Eel 26, Kea 19, Dog 14, Fox 7, Ant 9, Hen 23, Gnu 13, Jay 2, Cat 4]
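The index rules above can be sketched in a few lines of Java. The class and method names here (`HeapIndex`, `leftChild`, `rightChild`, `parent`) are illustrative and not taken from the lecture.

```java
// Sketch of the index arithmetic for an array-based heap (0-based indices).
// Names are hypothetical; only the formulas come from the slide.
final class HeapIndex {
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }

    // Integer division rounds down, so both children map back to the same parent.
    // The root (i == 0) has no parent; callers should check i > 0 first.
    static int parent(int i)     { return (i - 1) / 2; }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println("node " + i + ": children at " + leftChild(i)
                    + " and " + rightChild(i));
        }
        System.out.println("parent of node 9 is node " + parent(9)); // 4
    }
}
```

Because the tree is complete, every index from 0 to size-1 is in use, which is why no pointers are needed.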

4 Heapsort: In-Place Sorting
- Use an array-based Heap ⇒ in-place sorting algorithm!
  1. turn the array into a heap
  2. “remove” the top element, and restore the heap property again
  3. repeat step 2 n-1 times
[Figure: in-place dequeueing, with the “Sorted” part of the array growing at each step, and so on]

5 Heapsort: In-Place Sorting
How to turn the array into a heap? Heapify:
    for i = lastParent down to 0
        sinkdown(i)
Here lastParent = (n-2)/2, the index of the parent of the last element (index n-1).
[Figure: parts of the array marked “heap property installed”]

6 HeapSort: Algorithm
(a) Turn data into a heap
(b) Repeatedly swap root with last item and push down

    public void heapSort(E[] data, int size, Comparator<E> comp) {
        // (a) "heapify": sink each parent, from the last parent back to the root
        for (int i = (size-2)/2; i >= 0; i--)
            sinkDown(i, data, size, comp);
        // (b) in-place dequeueing: move the root to the end, shrink the heap, repair it
        while (size > 0) {
            size--;
            swap(data, size, 0);
            sinkDown(0, data, size, comp);
        }
    }
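The slide relies on `sinkDown` and `swap` helpers that the transcript does not show. The following is a minimal, self-contained sketch of how they might look, assuming a max-heap ordered by the comparator (so the result is in ascending order). The class name `HeapSortSketch` and the helper bodies are assumptions, not the lecture's own code.

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch only: the helper bodies are assumed, since the slide shows just heapSort.
final class HeapSortSketch {
    static <E> void heapSort(E[] data, int size, Comparator<? super E> comp) {
        // (a) heapify: sink every parent, starting from the last one
        for (int i = (size - 2) / 2; i >= 0; i--)
            sinkDown(i, data, size, comp);
        // (b) repeatedly move the largest item to the end of the shrinking heap
        while (size > 1) {
            size--;
            swap(data, size, 0);
            sinkDown(0, data, size, comp);
        }
    }

    // Push the item at index i down until neither child is larger than it.
    static <E> void sinkDown(int i, E[] data, int size, Comparator<? super E> comp) {
        while (true) {
            int largest = i;
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            if (left < size && comp.compare(data[left], data[largest]) > 0) largest = left;
            if (right < size && comp.compare(data[right], data[largest]) > 0) largest = right;
            if (largest == i) return;   // heap property holds here
            swap(data, i, largest);
            i = largest;
        }
    }

    static <E> void swap(E[] data, int a, int b) {
        E tmp = data[a];
        data[a] = data[b];
        data[b] = tmp;
    }

    public static void main(String[] args) {
        Integer[] data = {14, 7, 35, 2, 26, 19, 9, 23, 13, 4};
        heapSort(data, data.length, Comparator.naturalOrder());
        System.out.println(Arrays.toString(data)); // [2, 4, 7, 9, 13, 14, 19, 23, 26, 35]
    }
}
```

This sketch stops the second loop at `size > 1`, because a heap of one item is already in its final place; the slide's `size > 0` does one extra, harmless iteration.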

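7 Cost of “heapify”
Worst-case number of swaps per node, level by level, for a heap of n nodes:
- 1 node (the root): up to log(n+1)-1 swaps
- ⋮
- n/8 nodes: up to 2 swaps each
- n/4 nodes: up to 1 swap each
- n/2 nodes (the bottom level): 0 swaps

\[
\text{Cost} = n \left[ \frac{1}{4} + \frac{2}{8} + \frac{3}{16} + \frac{4}{32} + \cdots + \frac{\log(n+1)-1}{n} \right] \text{ swaps}
\]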

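8 Cost of “heapify”
\[
\begin{aligned}
\text{Cost} &= n \times \left[ \tfrac{1}{4} + \tfrac{2}{8} + \tfrac{3}{16} + \tfrac{4}{32} + \cdots \right] \\
&= n \times \Big[ \left( \tfrac{1}{4} + \tfrac{1}{8} + \tfrac{1}{16} + \tfrac{1}{32} + \cdots \right) && \text{(1st row)} \\
&\qquad\quad + \left( \tfrac{1}{8} + \tfrac{1}{16} + \tfrac{1}{32} + \cdots \right) && \text{(2nd row)} \\
&\qquad\quad + \left( \tfrac{1}{16} + \tfrac{1}{32} + \cdots \right) && \text{(3rd row)} \\
&\qquad\quad + \cdots \Big] \\
&\approx n \times \left[ \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \cdots \right] \\
&\approx n \times [1] = n \text{ swaps}
\end{aligned}
\]
Each row is a geometric series: the 1st row sums to 1/2, the 2nd row to 1/4, the 3rd row to 1/8, and so on.
- We can turn an unordered list into a heap in linear time!!!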
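The linear bound can be checked empirically. The sketch below builds a max-heap over an `int` array and counts the swaps; the class name `HeapifyCost` and the choice of ascending input (which forces many swaps) are assumptions made for illustration.

```java
// Sketch: count the swaps made by bottom-up heapify on an int array (max-heap).
final class HeapifyCost {
    // Builds a max-heap in place and returns how many swaps were needed.
    static int heapify(int[] data) {
        int swaps = 0;
        for (int i = (data.length - 2) / 2; i >= 0; i--) {
            int node = i;
            while (true) {
                int largest = node;
                int left = 2 * node + 1;
                int right = 2 * node + 2;
                if (left < data.length && data[left] > data[largest]) largest = left;
                if (right < data.length && data[right] > data[largest]) largest = right;
                if (largest == node) break;   // this node has stopped sinking
                int tmp = data[node];
                data[node] = data[largest];
                data[largest] = tmp;
                node = largest;
                swaps++;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        for (int n : new int[] {15, 1023, 65535}) {
            int[] data = new int[n];
            for (int i = 0; i < n; i++) data[i] = i;   // ascending input
            System.out.println("n = " + n + ", swaps = " + heapify(data));
        }
    }
}
```

On every run the swap count stays below n, in line with the estimate above, whereas inserting the items one at a time into a priority queue could cost on the order of n log n.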

9 HeapSort: Summary
- Cost of heapify = O(n)
- n × cost of remove = O(n log n), since each remove costs O(log n)
- Total cost = O(n log n)
- True for worst-case and average case!
  - unlike QuickSort and TreeSort
- Can be done in-place
  - Unlike MergeSort, doesn’t need extra memory to be fast
- Not stable

10 NEW TOPIC: Sets with O(1) operations?
- We need a way to compute an index for an object: add(“2001 – A Space Odyssey”)
- “Hashing”: compute the “hash code” of an object
[Figure: a hash function maps “2001 – A Space Odyssey” to index 581 in an array of ✔/✗ flags with indices up to N]

11 O(1) Sets with big values?
- But there are too many possible film titles!
- Suppose the hash function always produces a number between 0 and 1000
  ⇒ some film titles must end up with the same number!
  ⇒ “Collision”
[Figure: HASH applied to “Gravity” and to “2001 – A Space Odyssey”, both pointing into the array of ✔/✗ flags]

12 Detecting collisions
- Store the item in the array, instead of a boolean
- Questions
  1. How to choose a hash function that minimises collisions?
  2. How to manage collisions when they occur?
[Figure: HASH applied to “Gravity” and “2001 – A Space Odyssey”, with the titles themselves stored in the array]
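To see why storing the item helps, here is a small sketch that keeps both a boolean array and an item array side by side. Everything in it is hypothetical: the class `TinyHashSets`, the character-sum hash, and the policy of simply refusing a colliding item. The lecture leaves the question of how to manage collisions open at this point.

```java
// Sketch: a boolean-flag set cannot tell colliding items apart; an item array can.
final class TinyHashSets {
    static final int CAPACITY = 1001;   // indices 0..1000, as in the slides

    // Hypothetical hash function: sum of character codes, reduced to the table range.
    static int hash(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) sum += s.charAt(i);
        return sum % CAPACITY;
    }

    private final boolean[] flags = new boolean[CAPACITY];
    private final String[] items = new String[CAPACITY];

    // Returns false when the slot already holds a different item (a detected collision).
    // This sketch does not store the colliding item anywhere.
    boolean add(String item) {
        int h = hash(item);
        flags[h] = true;
        if (items[h] == null) {
            items[h] = item;
            return true;
        }
        return items[h].equals(item);
    }

    boolean containsByFlag(String item) { return flags[hash(item)]; }

    boolean containsByItem(String item) { return item.equals(items[hash(item)]); }

    public static void main(String[] args) {
        TinyHashSets set = new TinyHashSets();
        set.add("Gravity");
        // "ytivarG" has the same characters, so this hash gives it the same index.
        System.out.println(set.containsByFlag("ytivarG")); // true  (false positive)
        System.out.println(set.containsByItem("ytivarG")); // false (collision detected)
    }
}
```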

13 Computing Hash Codes
Wish list (summary) for a hashCode function:
- Should produce an integer
- Should distribute the hash codes evenly through the range
  - minimises collisions
- Should be fast to compute
- Should take account of all components of the object
- Must be consistent with equals()
  - two items that are equal must have the same hash value
Can we avoid clashes altogether? That would be perfect! ⇒ perfect hash function
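As an illustration of the "consistent with equals()" requirement, the sketch below defines a hypothetical `Film` class whose `hashCode()` uses exactly the fields that `equals()` compares. The class, its fields, and the use of `java.util.Objects.hash` are choices made for this example rather than anything prescribed by the lecture.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class: equals() and hashCode() use the same two fields.
final class Film {
    private final String title;
    private final int year;

    Film(String title, int year) {
        this.title = title;
        this.year = year;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Film)) return false;
        Film f = (Film) other;
        return year == f.year && title.equals(f.title);
    }

    // Takes account of all components, so equal films always get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(title, year);
    }

    public static void main(String[] args) {
        Set<Film> films = new HashSet<>();
        films.add(new Film("Gravity", 2013));
        // A separate but equal object is found, because its hash code matches.
        System.out.println(films.contains(new Film("Gravity", 2013))); // true
    }
}
```

If `hashCode()` were left at the default inherited from `Object`, two equal `Film` objects would usually land in different slots and the lookup above would fail.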

14 A Simple Hash Function for Strings
- We could add up the codes of all the characters:

    private int hash(String value) {
        int hashCode = 0;
        for (int i = 0; i < value.length(); i++)
            hashCode += value.charAt(i);
        return hashCode;
    }

Why is this not very good?

15 Example: Hashing course codes
Hash value ← course codes with that value (using the sum-of-characters hash; codes ending in “…” were truncated in the transcript):
418 ← DEAF…
419 ← DEAF102 DEAF201
⋮
429 ← BBSC201 MDIA…
430 ← ECHI410 MDIA102 MDIA…
431 ← ECHI303 JAPA111 JAPA201 MDIA202 MDIA220 MDIA…
432 ← ARCH101 ASIA101 BBSC231 BBSC303 BBSC321 CHEM201 ECHI403 ECHI412 JAPA112 JAPA211 JAPA301 MDIA203 MDIA302 MDIA320
⋮
450 ← ANTH412 ARCH389 ARTH111 BIOL228 BIOL327 BIOL372 CHEM489 COML304 COML403 COML421 COMP102 COMP201 CRIM313 CRIM421 DESN215 DESN233 ECON328 ECON409 ECON418 ECON508 EDUC449 EDUC458 EDUC548 EDUC557 ENGL228 ENGL408 ENGL426 ENGL435 ENGL444 ENGL453 FREN124 FREN331 FREN403 FREN412 GEOL362 GEOL407 GERM214 GERM403 GERM412 INFO213 INFO312 INFO402 ITAL206 ITAL215 LALS501 LATI404 LING224 LING323 LING404 MAOR102 MARK304 MARK403 MATH206 MATH314 MATH323 MATH431 MOFI403 PHIL104 PHIL203 PHIL302 PHIL320 PHIL401 PHIL410 RELI321 RELI411 SAMO101
⋮
a lot of collisions!
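The collisions are easy to reproduce. This sketch wraps the sum-of-characters hash in a hypothetical class `AdditiveHash` and applies it to three codes from the 450 group.

```java
// Sketch: the additive hash ignores character order, so rearranged
// digits (and many unrelated codes) end up with the same value.
final class AdditiveHash {
    static int hash(String value) {
        int hashCode = 0;
        for (int i = 0; i < value.length(); i++)
            hashCode += value.charAt(i);
        return hashCode;
    }

    public static void main(String[] args) {
        System.out.println(hash("COMP102")); // 450
        System.out.println(hash("COMP201")); // 450
        System.out.println(hash("MATH206")); // 450
    }
}
```

Two things go wrong here: the order of the characters has no effect, and all seven-character codes are squeezed into a narrow band of sums, so many codes share each value.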

16 Better Hash Functions
- Make the contribution of each character depend on its position:

    private int hash(String course) {
        int k = 257;
        int hashCode = 0;
        for (int i = 0; i < course.length(); i++)
            hashCode = hashCode * k + course.charAt(i);
        return hashCode;
    }

For a 7-character course code s = s_0 s_1 … s_6:
\[
\text{hashCode}(s) = k^6 \times s_0 + k^5 \times s_1 + k^4 \times s_2 + k^3 \times s_3 + k^2 \times s_4 + k^1 \times s_5 + s_6
\]
(it is best to use a prime number for the constant k)
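A short sketch shows the positional hash separating codes that the additive hash could not. The class name `PositionalHash` and the `index` helper are additions for illustration. `Math.floorMod` is used there because Java `int` arithmetic wraps around on overflow, so the hash of a seven-character code can come out negative.

```java
// Sketch: positional (polynomial) hash with k = 257, plus a hypothetical helper
// that maps the possibly negative hash code onto a table index.
final class PositionalHash {
    static int hash(String course) {
        int k = 257;
        int hashCode = 0;
        for (int i = 0; i < course.length(); i++)
            hashCode = hashCode * k + course.charAt(i);   // may overflow and wrap
        return hashCode;
    }

    // floorMod always returns a value in 0..capacity-1, even for negative hashes.
    static int index(String course, int capacity) {
        return Math.floorMod(hash(course), capacity);
    }

    public static void main(String[] args) {
        System.out.println(hash("AB"));                          // 65 * 257 + 66 = 16771
        System.out.println(hash("COMP102") == hash("COMP201"));  // false
        System.out.println(index("COMP102", 1001));
    }
}
```

Swapping the digits now changes the result, because each character is multiplied by a different power of k.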