4/10/2003CSE 202 - Quick&Heap CSE 202 - Algorithms Quicksort vs Heapsort: the “official” story Next time, we’ll get the “inside” story!

Slides:



Advertisements
Similar presentations
Algorithms Analysis Lecture 6 Quicksort. Quick Sort Divide and Conquer.
Advertisements

Analysis of Algorithms
Sorting Comparison-based algorithm review –You should know most of the algorithms –We will concentrate on their analyses –Special emphasis: Heapsort Lower.
CSE332: Data Abstractions Lecture 14: Beyond Comparison Sorting Dan Grossman Spring 2010.
Using Divide and Conquer for Sorting
Quicksort CSE 331 Section 2 James Daly. Review: Merge Sort Basic idea: split the list into two parts, sort both parts, then merge the two lists
Sorting Chapter Sorting Consider list x 1, x 2, x 3, … x n We seek to arrange the elements of the list in order –Ascending or descending Some O(n.
September 19, Algorithms and Data Structures Lecture IV Simonas Šaltenis Nykredit Center for Database Research Aalborg University
Sorting Heapsort Quick review of basic sorting methods Lower bounds for comparison-based methods Non-comparison based sorting.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
CS 253: Algorithms Chapter 6 Heapsort Appendix B.5 Credit: Dr. George Bebis.
Tirgul 10 Rehearsal about Universal Hashing Solving two problems from theoretical exercises: –T2 q. 1 –T3 q. 2.
Analysis of Algorithms CS 477/677
More sorting algorithms: Heap sort & Radix sort. Heap Data Structure and Heap Sort (Chapter 7.6)
CS420 lecture five Priority Queues, Sorting wim bohm cs csu.
Tirgul 4 Sorting: – Quicksort – Average vs. Randomized – Bucket Sort Heaps – Overview – Heapify – Build-Heap.
10/3/2002CSE Quick&Heap CSE Algorithms Quicksort vs Heapsort: the “official” story Next time, we’ll get the “inside” story!
Data Structures, Spring 2004 © L. Joskowicz 1 Data Structures – LECTURE 7 Heapsort and priority queues Motivation Heaps Building and maintaining heaps.
10/10/2002CSE Memory Hierarchy CSE Algorithms Quicksort vs Heapsort: the “inside” story or A Two-Level Model of Memory.
3-Sorting-Intro-Heapsort1 Sorting Dan Barrish-Flood.
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
Tirgul 7 Heaps & Priority Queues Reminder Examples Hash Tables Reminder Examples.
Tirgul 4 Order Statistics Heaps minimum/maximum Selection Overview
UMass Lowell Computer Science Analysis of Algorithms Prof. Karen Daniels Spring, 2007 Heap Lecture Chapter 6 Use NOTES feature to see explanation.
Chapter 7 (Part 2) Sorting Algorithms Merge Sort.
David Luebke 1 7/2/2015 Merge Sort Solving Recurrences The Master Theorem.
UMass Lowell Computer Science Analysis of Algorithms Prof. Karen Daniels Fall, 2001 Heap Lecture 2 Chapter 7 Wed. 10/10/01 Use NOTES feature to.
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
David Luebke 1 10/3/2015 CS 332: Algorithms Solving Recurrences Continued The Master Theorem Introduction to heapsort.
MS 101: Algorithms Instructor Neelima Gupta
Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.
Heaps, Heapsort, Priority Queues. Sorting So Far Heap: Data structure and associated algorithms, Not garbage collection context.
Binary Heap.
Elementary Sorting Algorithms Many of the slides are from Prof. Plaisted’s resources at University of North Carolina at Chapel Hill.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
September 29, Algorithms and Data Structures Lecture V Simonas Šaltenis Aalborg University
1 Joe Meehean.  Problem arrange comparable items in list into sorted order  Most sorting algorithms involve comparing item values  We assume items.
Computer Sciences Department1. Sorting algorithm 3 Chapter 6 3Computer Sciences Department Sorting algorithm 1  insertion sort Sorting algorithm 2.
Sorting. Pseudocode of Insertion Sort Insertion Sort To sort array A[0..n-1], sort A[0..n-2] recursively and then insert A[n-1] in its proper place among.
David Luebke 1 6/3/2016 CS 332: Algorithms Analyzing Quicksort: Average Case.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 9.
1 Heaps (Priority Queues) You are given a set of items A[1..N] We want to find only the smallest or largest (highest priority) item quickly. Examples:
Priority Queues and Heaps. October 2004John Edgar2  A queue should implement at least the first two of these operations:  insert – insert item at the.
1 CSE 326: Data Structures A Sort of Detour Henry Kautz Winter Quarter 2002.
Sorting: Implementation Fundamental Data Structures and Algorithms Klaus Sutner February 24, 2004.
David Luebke 1 12/23/2015 Heaps & Priority Queues.
CSC 413/513: Intro to Algorithms Solving Recurrences Continued The Master Theorem Introduction to heapsort.
Chapter 6: Heapsort Combines the good qualities of insertion sort (sort in place) and merge sort (speed) Based on a data structure called a “binary heap”
Lecture 8 : Priority Queue Bong-Soo Sohn Assistant Professor School of Computer Science and Engineering Chung-Ang University.
Heaps and basic data structures David Kauchak cs161 Summer 2009.
Tree Data Structures. Heaps for searching Search in a heap? Search in a heap? Would have to look at root Would have to look at root If search item smaller.
Liang, Introduction to Java Programming, Sixth Edition, (c) 2007 Pearson Education, Inc. All rights reserved Chapter 23 Algorithm Efficiency.
FALL 2005CENG 213 Data Structures1 Priority Queues (Heaps) Reference: Chapter 7.
Mergeable Heaps David Kauchak cs302 Spring Admin Homework 7?
David Luebke 1 2/5/2016 CS 332: Algorithms Introduction to heapsort.
Sorting Fundamental Data Structures and Algorithms Aleks Nanevski February 17, 2004.
David Luebke 1 2/19/2016 Priority Queues Quicksort.
CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan
Quicksort This is probably the most popular sorting algorithm. It was invented by the English Scientist C.A.R. Hoare It is popular because it works well.
Amortized Analysis and Heaps Intro David Kauchak cs302 Spring 2013.
"Teachers open the door, but you must enter by yourself. "
CSC317 Selection problem q p r Randomized‐Select(A,p,r,i)
Heaps, Heapsort, and Priority Queues
CSE Algorithms Quicksort vs Heapsort: the “inside” story or
Data Structures & Algorithms Priority Queues & HeapSort
Heaps, Heapsort, and Priority Queues
Lecture 3 / 4 Algorithm Analysis
Topic 5: Heap data structure heap sort Priority queue
Amortized Analysis and Heaps Intro
Heaps & Multi-way Search Trees
Presentation transcript:

4/10/2003CSE Quick&Heap CSE Algorithms Quicksort vs Heapsort: the “official” story Next time, we’ll get the “inside” story!

CSE Quick&Heap2 Overview Quicksort and Heapsort are comparable –Both use only binary comparisons –Both sort in-place Heapsort: – worst-case complexity is  (n lg n) Quicksort: –worst-case complexity is  (n 2 ). –average-case complexity is  (n lg n) –probabilistic analysis is also  (n lg n) Yet Quicksort is often considered superior.

CSE Quick&Heap3 Priority queue A “task” is an object with a “key” field Key of task x is x.key or key(x) (or whatever style you like). We won’t be concerned with the other fields. A (max-) priority queue is a data structure that has the following operations: –Insert(S, x) - add task x to the queue S. –Extract-Max(S) - return the task with largest key and remove it from the queue. –Max(S) - return task with largest key (don’t remove it). –Increase-Key(S,x,k) – Increase x’s key to be k (return error code if x.key is already > k.)

CSE Quick&Heap4 Heaps can implement priority queues A heap is a binary tree: All levels except bottom are completely filled in. Bottom level is filled in from left (no holes). Has heap property: parent’s key  either child’s A heap can be stored in an array H: Root is H[1]. Left child of H[k] is H[2k] Right child of H[k] is H[2k+1]

CSE Quick&Heap5 Heaps can implement priority queues Insert(S, x) - add task x to the queue S. Add x as new last node. “Bubble up” to re-establish heap property. Extract-Max(S) - return the task with largest key and remove it from heap. Pull task from top of heap (it has largest key). Replace it with the last node of heap. “Bubble down” (heapify) to re-establish heap property. Increase-Key(S, x, k) – Increase x’s key to be k. Report error if x.key > k Set x.key = k. “Bubble up” to re-establish heap property.

CSE Quick&Heap6 Exercise Pick a random permutation of {1,2,3,4,5,6} Insert these priorities into heap in the chosen order. Now do two Extract-Max’s.

CSE Quick&Heap7 Heapsort Insert all the n data items into a heap, then extract them all. Insert and Extract-Max operations use at most c lg n time. What property of binary trees does this use? (Aside: if it helps, we have a theorem, “If T is a non-empty binary tree of height h, then T has fewer than 2 h+1 nodes.”) So T(n) = time to sort n items < 2 n c lg n, T(n)  O(n lg n).

CSE Quick&Heap8 Build-Heap Builds heap from set S using O(n) operations (n=|S|). Still, Heapsort's asymptotic complexity is O(n lg n), since you need n Extract-Max’s. Stuffs S into a binary tree, then massages it from last parent to first to establish heap property. No comparisons needed for leaves. Each node at level h-i needs at most 2i comparisons. Comparisons bounded by: n/2 x 2x1 + n/4 x 2x2 + n/8 x 2x x 2x lg n = 2n (1/2 + 2/4 + 3/8 + 4/ lg n/n)

CSE Quick&Heap9 Summing i/2 i 1/2 + 1/4 + 1/8 + 1/16 + 1/  1 1/4 + 1/8 + 1/16 + 1/   1/2 1/8 + 1/16 + 1/  1/4 1/16 + 1/   1/ /2 + 2/4 + 3/8 + 4/16 + 5/   2

CSE Quick&Heap10 The two ways to build a heap Use Build-Heap T(n) < 2n (1/2 + 2/4 + 3/ ) = 2n x 2 = 4n Make repeated calls on Insert(H,x) In the worst case, each insertion requires “bubbling up” all the way to root. For half the nodes, this takes (lg n)-2 comparisons. So T(n) > (n/2) (lg n – 2), i.e. T(n)   (n lg n) (Average case may not be so bad.) Intuition why Build-Heap is (or might be) better: Most nodes in a heap are close to the leaves. Most nodes in a heap are far from the root.

CSE Quick&Heap11 Quicksort Basic idea: Split w.r.t A[1] Arrange A as: small elements (  A[1]) big (  A[1]) A[1] Recursively arrange “small” part and “big” parts. Doesn’t need extra array. Keep pointers to current ends of small & big parts. small unpartitioned big “Pick up” A[1] (splitter) and A[n] (current element), leaves space in to deposit element in either part drop current element appropriately; pick up next innermost.

CSE Quick&Heap12 Worst-case Quicksort complexity What happens if A is already sorted? T(n) = T(1) + T(n-1) + (n-1) T(n) = (n-1) + (n-2) +... = n(n-1)/2 Similar problem if A is nearly sorted. Is this unlikely? Can a “hack” help? E.g., splitter = median(first, middle, last)?

CSE Quick&Heap13 Average-case Quicksort complexity Intuitively, hope that on “random” input, most of the splits aren’t too uneven. –If, say, 50% of time, splits are no worse than 1/10 vs 9/10, you might be OK As long as the bad splits are evenly spread around, this kind of looks like the recurrence: T(n) < T(n/10) + T(9n/10) + 2n Actually, on random input, things are more even. But this is far from a proof!

CSE Quick&Heap14 Digression: Probability A sample space S is a set of “elementary events”. An event is a subset of S. A probability distribution is a function Pr from events to real numbers in [0,1] which satisfies certain properties. If S is discrete (finite or countably infinite), these properties amount to: –If s  S, Pr{s} =  Pr{e}, where sum is over e  s. –Pr{S} = 1. If |S| = n and Pr{e} = 1/n for all e  S, Pr is called uniform. Note: We write Pr{s} or Pr{e} rather than Pr(s) or Pr({e}).

CSE Quick&Heap15 Probability factoids Probabilities are often abused What does “There’s a 30% chance of rain” mean ?? It is not possible to have a uniform probability space on  or  (or any countably infinite set). –“Pick n  with equal probability” is meaningless. There are 3 reasons for using uniform probabilities: 1.You control selection of events and ensure uniformity. 2.Someone else assures uniformity (you can shift blame.) 3.You can’t think of anything better. Reason #3 is a lousy reason!!

CSE Quick&Heap16 Random Variables A (discrete) random variable X is a function from elementary events in a sample space to . Examples: –X(p) = p’s height for p  S = people in this class. –R n (I) = algorithm’s runtime on instance I of size n. Notation: “X>72” is the event { p  S | X(p)>72 }. X+Y is function (X+Y)(e) = X(e) + Y(e). The expectation E[X] of X is  X(e) Pr{e} E[X] is the average, weighted by the probabilities. Theorem: E[X+Y] = E[X] + E[Y]. (Proof: exercise) eSeS

CSE Quick&Heap17 Average-case complexity Let P n = {I 1, I 2,..., I k } be set of instances of size n P n is the sample space. Assume uniform probability distribution (Pr{I} = 1/k). What’s the justification? For I in P n, let R n (I) be the algorithm’s running time R is a random variable. Average-case complexity T(n) is E[R n ]. Expected value of the random variable R n. Fancy way of saying “average of R n (I 1 ), R n (I 2 ),..., R n (I k )”.

CSE Quick&Heap18 Average-case Quicksort complexity Let S n be a set of n items to be sorted. Let P n be the set instances of size n of sorting. How many elements are there in P n ? For each x, y in S n, and I in P n, define C xy (I) = 1 if Quicksort compares x to y 0 otherwise. C xy is a random variable on P n. Note that R n (I) =  C xy (I). [Sum is over all x and y] So T(n) = E[R n ] = E[  C xy ] =  E[C xy ]. by definition by Theorem

CSE Quick&Heap19 Important Quicksort Insight: Let x & y be two elements with x  y. –If Quicksort picks any splitter z with x  z  y before it picks either x or y, then it never compares x to y. –Assume it picks the splitters randomly from all the candidates in an unpartitioned group. –If x and y are k places apart in the final sorted order, the chance of picking one of x or y before picking z between them is 2/(k+1). (Intuitively, the further apart two elements are in the final order, the less likely they will be compared.)

CSE Quick&Heap20 Average-case Quicksort complexity “If x and y are k places apart in the final sorted order, the chance of picking one of x or y before picking something between them is 2/(k+1).” In other words, E[C xy ] = 2/(k+1) 1 pair x,y are n-1 apart, i.e. have E[C xy ] = 2/(k+1), 1 x 2/n 2 pairs are n-2 apart, 2 x 2/(n-1) n-1 pairs 1 apart, (n-1) x 2/2 T(n) is the sum  (n lg n)

CSE Quick&Heap21 Randomized algorithm Quicksort has “bad” problem instances. A randomized algorithm makes random choices after the instance is selected. –E.g. it can choose splitter by flipping coins. –Algorithm can ensure each possible choice has equal probability. –An adversary can’t find any particularly bad instance.

CSE Quick&Heap22 Average vs Probabilistic Complexity Average case complexity (of deterministic algorithm) Sample space is P n, the set of instances of size n For I in P n, let R n (I) be the algorithm’s running time R is a random variable. Average-case complexity T(n) is E[R n ] (i.e., average R n (I j )) Probabilistic complexity (of randomized algorithm) For each instance I in P n, let S I (x 1,x 2,...,x k ) be the running time on I when the random choices are x 1,x 2,...,x k. Sample space is choice of randomization What probabilities? What’s the justification?? S I is a random variable (for each I). Probabilistic complexity T(n) is Max{E[S I ]} Intuitively: the average runtime of the hardest instance. I  P n

CSE Quick&Heap23 Probabilistic analysis of randomized Quicksort For each instance of sorting, randomized Quicksort has expected time  (n lg n). –The same analysis (actually, easier) as for that average time of (non-randomized) Quicksort. Warning: Result only holds for “truly” random choices of pivot elements. Amazing paper by Karloff & Raghavan shows: for any standard linear congruential pseudo-random number generator (e.g. Unix’s “rand”), there is a (carefully constructed) “bad” sorting instance that, averaged over all PRNG “seeds” has expected time O(n 2 )

CSE Quick&Heap24 Why use Quicksort? “Quicksort has tight code, so the hidden constant factor in its running time is small” (Text, pg 125). –Doesn’t Heapsort also?? “[Quicksort] works well even in virtual memory environments.” (Text, pg 145). –We’ll see what that means!

CSE Quick&Heap25 Glossary (in case symbols are weird)  Ísubset  element of "for all  there exists Qbig theta  big omega  summation ³ >=  <=  about equal ¹not equal  natural numbers(N)  reals(R)  rationals(Q)  integers(Z)