Introduction to Algorithms Jiafen Liu Sept. 2013.



Today’s Tasks
Quicksort
–Divide and conquer
–Partitioning
–Worst-case analysis
–Intuition
–Randomized quicksort
–Analysis

Quick Sort
Proposed by Tony Hoare.
Divide-and-conquer algorithm.
Sorts “in place” (like insertion sort, but unlike merge sort).
Very practical.

Divide and conquer
Quicksort an n-element array:
Divide: Partition the array into two subarrays around a pivot x such that elements in lower subarray ≤ x ≤ elements in upper subarray.
Conquer: Recursively sort the two subarrays.
Combine: Trivial.
Key: linear-time partitioning subroutine.

Example of Partition

Exercise: write down an algorithm that partitions an array A between indices p and q.

Partitioning subroutine
PARTITION(A, p, q)          // partitions A[p..q]
    x ← A[p]                // pivot = A[p]
    i ← p
    for j ← p+1 to q
        do if A[j] ≤ x
            then i ← i+1
                 exchange A[i] ↔ A[j]
    exchange A[p] ↔ A[i]
    return i
Running time = Θ(n) for n elements.
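
The partitioning pseudocode can be sketched in Python as follows. This is a minimal illustration, assuming 0-based indexing and an inclusive range a[p..q] (the slides use 1-based arrays); the function name `partition` is chosen here for illustration.

```python
def partition(a, p, q):
    """Partition a[p..q] (inclusive) around the pivot a[p].

    Returns the final index of the pivot. Assumption: 0-based indices,
    unlike the 1-based pseudocode in the slides.
    """
    x = a[p]   # pivot
    i = p      # invariant: a[p+1..i] <= x and a[i+1..j-1] > x
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]   # put the pivot between the two regions
    return i


data = [6, 10, 13, 5, 8, 3, 2, 11]
pivot_index = partition(data, 0, len(data) - 1)
print(pivot_index, data)
```

The loop touches each element once, which is where the Θ(n) running time comes from.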

Pseudo-code for Quick Sort
QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q–1)
             QUICKSORT(A, q+1, r)
Initial call: QUICKSORT(A, 1, n)
Boundary case: when there are zero or one elements (p ≥ r), nothing needs to be done.
Optimizations:
–Use another special-purpose sorting routine for small numbers of elements.
–Eliminate tail recursion.
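
A runnable Python sketch of QUICKSORT, again assuming 0-based inclusive indices. The default arguments and the function names are conveniences added for this illustration, and neither optimization mentioned on the slide is applied.

```python
def partition(a, p, r):
    """Partition a[p..r] around the pivot a[p]; return the pivot's final index."""
    x, i = a[p], p
    for j in range(p + 1, r + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place (whole list by default)."""
    if r is None:
        r = len(a) - 1
    if p < r:                      # zero or one element: nothing to do
        q = partition(a, p, r)
        quicksort(a, p, q - 1)     # elements <= pivot
        quicksort(a, q + 1, r)     # elements > pivot


values = [6, 10, 13, 5, 8, 3, 2, 11]
quicksort(values)
print(values)
```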

Analysis of Quicksort Let T(n) = worst-case running time on an array of n elements. What is the worst case? –The input is sorted or reverse sorted. –Partition around min or max element. –One side of partition always has no elements.

The Worst Case
In the worst case, how can we compute T(n)?
T(n) = T(0) + T(n-1) + Θ(n)
     = Θ(1) + T(n-1) + Θ(n)
     = T(n-1) + Θ(n)
     = ? Can you guess it?

Recursion Tree T(n) = T(0)+ T(n-1)+ cn

Recursion Tree
T(n) = T(0) + T(n-1) + cn
Height = n
The levels cost cn + c(n-1) + … + c = Θ(n²), and each of the n leaves T(0) costs Θ(1):
T(n) = Θ(n²) + n · Θ(1) = Θ(n²) + Θ(n) = Θ(n²)
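
The quadratic worst case can be observed by counting comparisons. The sketch below assumes the first-element pivot rule from the partitioning slide and uses an explicit stack (an implementation choice made here to avoid deep recursion on sorted input); `count_comparisons` is a hypothetical helper name.

```python
def count_comparisons(values):
    """Quicksort a copy of `values` with a first-element pivot.

    Returns the number of element-versus-pivot comparisons performed.
    """
    a = list(values)
    comparisons = 0
    stack = [(0, len(a) - 1)]        # subarrays still to be sorted
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue
        x, i = a[p], p
        for j in range(p + 1, r + 1):
            comparisons += 1
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[p], a[i] = a[i], a[p]
        stack.append((p, i - 1))
        stack.append((i + 1, r))
    return comparisons


n = 100
print(count_comparisons(range(n)), n * (n - 1) // 2)
```

On sorted or reverse-sorted input of distinct elements, every partition leaves one side empty, so the count is (n-1) + (n-2) + … + 1 = n(n-1)/2.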

Best-case analysis (For intuition only!)
What’s the best case? If we’re lucky, PARTITION splits the array evenly:
T(n) = 2T(n/2) + Θ(n) = Θ(n lg n)
What if the split is always 1/10 : 9/10?
T(n) = T(n/10) + T(9n/10) + Θ(n)
What is the solution to this recurrence?

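Analysis of this asymmetric case
Recursion tree for T(n) = T(n/10) + T(9n/10) + cn: the shortest root-to-leaf path has length log_10 n, the longest has length log_{10/9} n, and every full level costs cn.
Lower bound: T(n) ≥ cn log_10 n
Upper bound: T(n) ≤ cn log_{10/9} n + O(n)
∴ T(n) = Θ(n lg n)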
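
To see why both bounds are of order n lg n, it helps to change the logarithm bases. The following worked step is a sketch, with c the per-level constant from the recursion tree and the numeric factors rounded:

```latex
\begin{align*}
T(n) &= T\!\left(\tfrac{n}{10}\right) + T\!\left(\tfrac{9n}{10}\right) + cn \\
\log_{10} n &= \frac{\lg n}{\lg 10} \approx 0.30\,\lg n,
\qquad
\log_{10/9} n = \frac{\lg n}{\lg(10/9)} \approx 6.58\,\lg n \\
0.30\,cn\lg n \;\lesssim\; T(n) &\;\lesssim\; 6.58\,cn\lg n + O(n)
\quad\Longrightarrow\quad T(n) = \Theta(n\lg n)
\end{align*}
```

Even a consistently lopsided 1/10 : 9/10 split therefore costs only a constant factor more than the perfectly balanced case.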

Another case
Suppose we alternate lucky, unlucky, lucky, unlucky, lucky, ….
–L(n) = 2U(n/2) + Θ(n)   lucky
–U(n) = L(n-1) + Θ(n)    unlucky
Solving:
L(n) = 2(L(n/2 - 1) + Θ(n/2)) + Θ(n)
     = 2L(n/2 - 1) + Θ(n)
     = Θ(n lg n)

Analysis of Quicksort
How can we make sure we are usually lucky?
As long as the input is not sorted or nearly sorted, we are usually lucky.
–We can arrange the elements randomly.
–We can choose a random element as the pivot.

Randomized quicksort IDEA: Partition around a random element. Running time is independent of the input order. No assumptions need to be made about the input distribution. No specific input elicits the worst-case behavior. The worst case is determined only by the output of a random-number generator.

Randomized Quicksort
Basic scheme: pivot on a random element.
In the code for partition, before partitioning on the first element, swap the first element with an element of the array chosen uniformly at random (possibly the first element itself).
As a result, every element is equally likely to be chosen as the pivot.
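
A minimal Python sketch of this scheme, assuming 0-based inclusive indices. The pivot index is drawn uniformly from the whole subarray, including the first position, so that every element is equally likely to be the pivot; the `rng` parameter is an addition made here so that runs can be made reproducible.

```python
import random


def randomized_partition(a, p, r, rng=random):
    """Swap a random element of a[p..r] into position p, then partition around it."""
    k = rng.randint(p, r)          # randint is inclusive on both ends
    a[p], a[k] = a[k], a[p]
    x, i = a[p], p
    for j in range(p + 1, r + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


def randomized_quicksort(a, p=0, r=None, rng=random):
    """Sort a[p..r] in place using random pivots."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r, rng)
        randomized_quicksort(a, p, q - 1, rng)
        randomized_quicksort(a, q + 1, r, rng)


already_sorted = list(range(20))
randomized_quicksort(already_sorted, rng=random.Random(0))
print(already_sorted)
```

Sorted input is no longer a special bad case: the split sizes depend only on the random choices, not on the input order.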

Randomized Quicksort Analysis
Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent.
For k = 0, 1, …, n–1, define the indicator random variable
X_k = 1 if PARTITION generates a k : n–k–1 split, and X_k = 0 otherwise.

Randomized Quicksort Analysis
E[X_k] = 1 · Pr{X_k = 1} + 0 · Pr{X_k = 0} = Pr{X_k = 1} = 1/n, since all splits are equally likely.

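Randomized Quicksort Analysis
Take expectations of both sides of T(n) = Σ_{k=0}^{n–1} X_k (T(k) + T(n–k–1) + Θ(n)). Each step of the calculation is justified as follows:
–Linearity of expectation: the expectation of a sum is the sum of the expectations.
–Independence of X_k from other random choices.
–The two summations have identical terms.
–The k = 0, 1 terms can be absorbed in the Θ(n).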
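
The steps listed above can be written out as the following derivation. It is a reconstruction of the standard calculation, assuming X_k indicates a k : n–k–1 split and E[X_k] = 1/n:

```latex
\begin{align*}
T(n) &= \sum_{k=0}^{n-1} X_k \bigl(T(k) + T(n-k-1) + \Theta(n)\bigr) \\
E[T(n)] &= \sum_{k=0}^{n-1} E\bigl[X_k \bigl(T(k) + T(n-k-1) + \Theta(n)\bigr)\bigr]
  && \text{linearity of expectation} \\
&= \sum_{k=0}^{n-1} E[X_k]\; E\bigl[T(k) + T(n-k-1) + \Theta(n)\bigr]
  && \text{independence} \\
&= \frac{1}{n}\sum_{k=0}^{n-1} E[T(k)] + \frac{1}{n}\sum_{k=0}^{n-1} E[T(n-k-1)]
   + \frac{1}{n}\sum_{k=0}^{n-1} \Theta(n)
  && E[X_k] = 1/n \\
&= \frac{2}{n}\sum_{k=0}^{n-1} E[T(k)] + \Theta(n)
  && \text{identical summations} \\
&= \frac{2}{n}\sum_{k=2}^{n-1} E[T(k)] + \Theta(n)
  && \text{absorb } k = 0, 1
\end{align*}
```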

Our Objective
Prove: E[T(n)] ≤ an lg n for constant a > 0.
–Choose a big enough so that an lg n dominates E[T(n)] for sufficiently small n ≥ 2.
–That’s why we absorb the k = 0, 1 terms.
–How to prove that?
–Substitution method

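Substitution Method
To prove E[T(n)] ≤ an lg n, we are going to use the fact that Σ_{k=2}^{n–1} k lg k ≤ (1/2)n² lg n – (1/8)n².
Substituting the inductive hypothesis and applying this fact, we express the bound as desired – residual:
E[T(n)] ≤ an lg n – (an/4 – Θ(n)) ≤ an lg n,
if a is chosen large enough so that an/4 dominates the Θ(n).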
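
The substitution proof in the standard treatment relies on the summation bound Σ_{k=2}^{n–1} k lg k ≤ (1/2)n² lg n – (1/8)n², which is where the an/4 residual comes from. The sketch below is only a numerical sanity check of that bound over a finite range, not a proof; the function names are hypothetical.

```python
import math


def sum_k_lg_k(n):
    """Left-hand side: sum of k * lg(k) for k = 2 .. n-1."""
    return sum(k * math.log2(k) for k in range(2, n))


def upper_bound(n):
    """Right-hand side: (1/2) n^2 lg n - (1/8) n^2."""
    return 0.5 * n * n * math.log2(n) - n * n / 8


# Check the inequality for every n in a finite range.
holds = all(sum_k_lg_k(n) <= upper_bound(n) for n in range(2, 1000))
print(holds)
```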

Advantages of Quicksort
Quicksort is a great general-purpose sorting algorithm.
Quicksort is typically over twice as fast as merge sort.
Quicksort can benefit substantially from code tuning.
Quicksort behaves well even with caching and virtual memory.

The Birthday Paradox
How many people must there be in a room to guarantee that two of them were born on the same day of the year?
How many people must there be in a room for there to be a good chance that two of them were born on the same day, say a probability of more than 50%?

Indicator Random Variable
We know that the probability that i's birthday and j's birthday fall on the same day is 1/n, where n = 365.
We define the indicator random variable X_ij for 1 ≤ i < j ≤ k by
X_ij = 1 if person i and person j have the same birthday, and X_ij = 0 otherwise.

Indicator Random Variable
Thus we have E[X_ij] = Pr{persons i and j have the same birthday} = 1/n.
Letting X be the random variable that counts the number of pairs of individuals having the same birthday, we have X = Σ_{i<j} X_ij, and by linearity of expectation E[X] = (k choose 2) · (1/n) = k(k–1)/(2n).

The Birthday Paradox
The expected number of matching pairs is at least 1 when k(k–1) ≥ 2n. So if we have at least √(2n) + 1 individuals in a room, we can expect at least two to have the same birthday.
For n = 365, if k = 28, the expected number of pairs with the same birthday is (28 · 27)/(2 · 365) ≈ 1.0356.
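
The arithmetic can be checked with a few lines of Python. This sketch assumes n = 365 equally likely birthdays; the function names are illustrative. It computes both the expected number of matching pairs, k(k–1)/(2n), and the exact probability that at least two of k people share a birthday.

```python
def expected_matching_pairs(k, n=365):
    """Expected number of pairs (i, j), i < j, sharing a birthday."""
    return k * (k - 1) / (2 * n)


def smallest_k_expecting_a_pair(n=365):
    """Smallest k for which the expected number of matching pairs is at least 1."""
    k = 2
    while expected_matching_pairs(k, n) < 1:
        k += 1
    return k


def prob_shared_birthday(k, n=365):
    """Exact probability that at least two of k people share a birthday."""
    p_all_distinct = 1.0
    for i in range(k):
        p_all_distinct *= (n - i) / n
    return 1.0 - p_all_distinct


print(expected_matching_pairs(28))
print(smallest_k_expecting_a_pair())
print(prob_shared_birthday(23))
```

The two questions have different answers: an expected count of one matching pair needs 28 people, while a probability above 50% is already reached with 23.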

Expanded Content: The hiring problem
The employment agency sends you one candidate each day. You interview that person and then decide whether or not to hire them.
–You must pay the employment agency a fee to interview an applicant.
–Actually hiring an applicant is more costly.
–You are committed to having, at all times, the best possible person for the job.
Now we wish to estimate what that price will be.

Algorithm of hiring problem
We are not concerned with the running time of HIRE-ASSISTANT, but instead with the cost incurred by interviewing and hiring.
The analytical techniques used are identical whether we are analyzing cost or running time.
In either case, we are counting the number of times certain basic operations are executed.

Worst Case of hiring problem
In the worst case, we actually hire every candidate that we interview.
–This situation occurs if the candidates come in increasing order of quality, in which case we hire n times, for a total hiring cost of O(n c_h).
In general, we have no idea about the order in which they arrive, nor do we have any control over this order.
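
The hiring procedure and its cost can be sketched in Python. This is an illustration under stated assumptions: candidates are represented by numeric quality scores, and the cost values `interview_cost` and `hire_cost` are hypothetical placeholders for c_i and c_h.

```python
def hire_assistant(qualities, interview_cost=1, hire_cost=10):
    """Interview candidates in order; hire whenever one beats the current best.

    Returns (number of hires, total cost). The default cost values are
    placeholders chosen for illustration only.
    """
    best = float("-inf")   # quality of the current assistant (none yet)
    hires = 0
    for quality in qualities:      # one interview per candidate
        if quality > best:
            best = quality
            hires += 1
    total_cost = interview_cost * len(qualities) + hire_cost * hires
    return hires, total_cost


print(hire_assistant([1, 2, 3, 4, 5]))   # increasing quality: worst case
print(hire_assistant([5, 4, 3, 2, 1]))   # best candidate arrives first
```

With candidates in increasing order of quality, every interview leads to a hire, which is the O(n c_h) worst case described above.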

Probabilistic analysis Probabilistic analysis is the use of probability in the analysis of problems. –In order to perform a probabilistic analysis, we must make assumptions about the distribution of the inputs. –Then we analyze our algorithm, computing an expected running time. –The expectation is taken over the distribution of the possible inputs.

Randomized algorithms By making the behavior of part of the algorithm random, we can use probability and randomness as a tool for algorithm design and analysis. More generally, we call an algorithm randomized if its behavior is determined not only by its input but also by values produced by a random-number generator.

Using indicator random variables Assume that the candidates arrive in a random order. Let X be the random variable that indicates the number of times we hire a new office assistant. We use indicator random variables to simplify the calculation.

Using indicator random variables (P655 harmonic series)
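
The calculation with indicator random variables can be written out as follows. This is a sketch of the standard argument, assuming the candidates arrive in uniformly random order, so that candidate i is the best of the first i with probability 1/i:

```latex
\begin{align*}
X_i &= I\{\text{candidate } i \text{ is hired}\}, \qquad X = \sum_{i=1}^{n} X_i \\
E[X_i] &= \Pr\{\text{candidate } i \text{ is better than candidates } 1, \dots, i-1\} = \frac{1}{i} \\
E[X] &= \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{1}{i} = \ln n + O(1)
  && \text{harmonic series}
\end{align*}
```

Under this assumption the expected hiring cost is O(c_h ln n), far below the worst-case O(n c_h).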

Probabilistic Analysis and Randomized Algorithms With Probabilistic Analysis and Randomized Algorithms – Your enemy cannot produce a bad input array, since the random permutation makes the input order irrelevant. –The randomized algorithm performs badly only if the random-number generator produces an "unlucky" permutation. –A 1 = –A 2 = –A 3 =