1 Hiring Problem and Generating Random Permutations Andreas Klappenecker Partially based on slides by Prof. Welch.



2 You need to hire a new employee. The headhunter sends you a different applicant every day for n days. If the applicant is better than the current employee, then fire the current employee and hire the applicant. Firing and hiring are expensive. How expensive is the whole process?
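The hiring process above can be sketched in Python (a minimal sketch; the `hire_assistant` name and the convention that cost = number of hires are assumptions for illustration):

```python
def hire_assistant(candidates):
    """Interview candidates in the given order and hire whenever the
    current candidate is better than the best seen so far.
    Every hire after the first also implies firing the previous hire,
    so the returned hire count measures the total cost."""
    best = None
    hires = 0
    for quality in candidates:
        if best is None or quality > best:
            best = quality    # the new employee becomes the benchmark
            hires += 1
    return hires
```

On the worst-case input (strictly increasing quality) this performs n hires; on the best-case input (the best candidate first) it performs just one.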

3 Worst case is when the headhunter sends you the n applicants in increasing order of goodness. Then you hire (and fire) each one in turn: n hires.

4 Best case is when the headhunter sends you the best applicant on the first day. Total cost is just 1 (fire and hire once).

5 What about the average cost? An input to the hiring problem is an ordering of the n applicants, so there are n! different inputs. Assume there is some distribution on the inputs; for instance, each ordering is equally likely, but other distributions are also possible. The average cost is then the expected value under that distribution.

6 We want to know the expected cost of our hiring algorithm, measured by how many times we hire an applicant. An elementary event s is a sequence of the n applicants; the sample space is all n! such sequences. Assume the uniform distribution, so each sequence is equally likely, i.e., has probability 1/n!. The random variable X(s) is the number of applicants that are hired, given the input sequence s. What is E[X]?

7 Break the problem down using indicator random variables and properties of expectation. Change viewpoint: instead of one random variable that counts how many applicants are hired, consider n random variables, each one keeping track of whether or not a particular applicant is hired. The indicator random variable X_i for applicant i is 1 if applicant i is hired and 0 otherwise.

8 Important fact: X = X_1 + X_2 + … + X_n, i.e., the number hired is the sum of all the indicator r.v.'s. Important fact: E[X_i] = Pr[applicant i is hired]. Why? Plug the 0/1 values into the definition of expected value. The probability of hiring applicant i is the probability that i is better than the previous i−1 applicants…

9 Suppose n = 4 and i = 3. In what fraction of all the inputs is the 3rd applicant better than the 2 previous ones? 8/24 = 1/3.

10 In general, since all permutations are equally likely, if we only consider the first i applicants, the largest of them is equally likely to occur in each of the i positions. Thus Pr[X_i = 1] = 1/i.
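This can be checked exhaustively for small n (a brute-force sketch; `prob_hired` is a hypothetical helper that enumerates all n! orders):

```python
from itertools import permutations

def prob_hired(n, i):
    """Count in how many of the n! input orders the i-th candidate
    (1-based) is hired, i.e. is the best among the first i."""
    perms = list(permutations(range(n)))
    hits = sum(1 for p in perms if p[i - 1] == max(p[:i]))
    return hits, len(perms)
```

For n = 4 and i = 3 this yields 8 out of 24 orders, i.e. 1/3, matching Pr[X_i = 1] = 1/i.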

11 Recall that X is the random variable equal to the number of hires, and that X = the sum of the X_i's (each X_i is the random variable that tells whether or not the i-th applicant is hired).
E[X] = E[∑ X_i]
= ∑ E[X_i], by linearity of expectation
= ∑ Pr[X_i = 1], by the property of indicator r.v.'s
= ∑ 1/i, by the argument on the previous slide
≤ ln n + 1, by the formula for the harmonic number.
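The identity E[X] = ∑ 1/i can be confirmed exactly for small n by averaging over all n! orders with exact rational arithmetic (a brute-force sketch, not part of the original slides; helper names are assumptions):

```python
from fractions import Fraction
from itertools import permutations

def num_hires(order):
    """Number of hires made by the greedy process on this input order."""
    best, count = None, 0
    for q in order:
        if best is None or q > best:
            best, count = q, count + 1
    return count

def average_hires(n):
    """Exact expected number of hires under a uniform random order."""
    perms = list(permutations(range(n)))
    return Fraction(sum(num_hires(p) for p in perms), len(perms))
```

For n = 5 this returns 137/60, which is exactly the harmonic number H_5 = 1 + 1/2 + 1/3 + 1/4 + 1/5.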

12 So the average number of hires is about ln n, which is much better than the worst-case number (n). But this relies on the headhunter sending you the applicants in random order. What if you cannot rely on that? Maybe the headhunter always likes to impress you by sending better and better applicants. If you can get access to the list of applicants in advance, you can create your own randomization by randomly permuting the list and then interviewing the applicants. Move from (passive) probabilistic analysis to an (active) randomized algorithm by putting the randomization under your control!

13 Instead of relying on a (perhaps incorrect) assumption that the inputs exhibit some distribution, make your own input distribution by, say, permuting the input randomly or taking some other random action. On the same input, a randomized algorithm has multiple possible executions, so no single input elicits worst-case behavior. Typically we analyze the average-case behavior for the worst possible input.

14 Suppose we have access to the entire list of candidates in advance. Randomly permute the candidate list, then interview the candidates in this random order. The expected number of hirings/firings is O(log n) no matter what the original input is.
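This shuffle-then-interview strategy can be sketched using Python's built-in `random.shuffle` (the `randomized_hire` name is an assumption for illustration):

```python
import random

def randomized_hire(candidates, rng=random):
    """Permute the candidates uniformly at random, then run the
    greedy hiring process; the expected number of hires is O(log n)
    regardless of the order in which the candidates arrived."""
    pool = list(candidates)   # copy, so the caller's list is untouched
    rng.shuffle(pool)
    best, hires = None, 0
    for q in pool:
        if best is None or q > best:
            best, hires = q, hires + 1
    return hires
```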

15 Probabilistic Analysis versus Randomized Algorithm. Probabilistic analysis of a deterministic algorithm: assume some probability distribution on the inputs. Randomized algorithm: use random choices within the algorithm itself.

Generating Random Permutations

17 How to Randomly Permute an Array
input: array A[1..n]
for i := 1 to n do
    j := value in [i..n] chosen uniformly at random
    swap A[i] with A[j]
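In Python (0-indexed) the loop above can be written as follows; this is a sketch of the same algorithm, also known as the Fisher–Yates shuffle:

```python
import random

def randomly_permute(a, rng=random):
    """Permute the list a in place, uniformly at random: at step i,
    swap a[i] with an element a[j] chosen uniformly from a[i..n-1]."""
    n = len(a)
    for i in range(n):
        j = rng.randrange(i, n)   # uniform over {i, i+1, ..., n-1}
        a[i], a[j] = a[j], a[i]
    return a
```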

18 Show that after the i-th iteration of the for loop, A[1..i] equals each permutation of i elements from {1,…,n} with probability (n−i)!/n!. Basis: after the first iteration, A[1] contains each permutation of 1 element from {1,…,n} with probability (n−1)!/n! = 1/n. This is true since A[1] is swapped with an element drawn uniformly at random from the entire array.

19 Induction: assume that after the (i−1)-st iteration of the for loop, A[1..i−1] equals each permutation of i−1 elements from {1,…,n} with probability (n−(i−1))!/n!. The probability that A[1..i] contains the permutation x_1, x_2, …, x_i is the probability that A[1..i−1] contains x_1, x_2, …, x_{i−1} after the (i−1)-st iteration AND that the i-th iteration puts x_i in A[i].

20 Let e_1 be the event that A[1..i−1] contains x_1, x_2, …, x_{i−1} after the (i−1)-st iteration. Let e_2 be the event that the i-th iteration puts x_i in A[i]. We need to show that Pr[e_1 ∩ e_2] = (n−i)!/n!. Unfortunately, e_1 and e_2 are not independent: if some element appears in A[1..i−1], then it is not available to appear in A[i].

21 Recall: e_1 is the event that A[1..i−1] = x_1,…,x_{i−1}, and e_2 is the event that A[i] = x_i.
Pr[e_1 ∩ e_2] = Pr[e_2 | e_1]·Pr[e_1].
Pr[e_2 | e_1] = 1/(n−i+1), because x_i is still available in A[i..n] to be chosen (e_1 already occurred and did not include x_i) and every element of A[i..n] is equally likely to be chosen.
Pr[e_1] = (n−(i−1))!/n! by the inductive hypothesis.
So Pr[e_1 ∩ e_2] = [1/(n−i+1)]·[(n−(i−1))!/n!] = (n−i)!/n!.
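The conclusion can also be verified exhaustively for small n by running the swapping loop under every possible sequence of random choices; each of the n! permutations should arise from exactly one choice sequence, hence with probability 1/n! (a brute-force check, not part of the original proof; `count_outcomes` is a hypothetical helper):

```python
from itertools import product
from collections import Counter

def count_outcomes(n):
    """Simulate the swapping loop under every possible choice sequence
    (j drawn from [i..n-1] at step i) and count each final permutation."""
    counts = Counter()
    for choices in product(*(range(i, n) for i in range(n))):
        a = list(range(n))
        for i, j in enumerate(choices):
            a[i], a[j] = a[j], a[i]
        counts[tuple(a)] += 1
    return counts
```

For n = 3 there are 3·2·1 = 6 choice sequences and 6 permutations, and each permutation occurs exactly once.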

22 After the last iteration (the n-th), the invariant tells us that A[1..n] equals each permutation of n elements from {1,…,n} with probability (n−n)!/n! = 1/n!. Thus the algorithm produces a uniform random permutation.