Independence and Dependence
Krishna V. Palem, Kenneth and Audrey Kennedy Professor of Computing, Department of Computer Science, Rice University
Contents
In-class warm-up problem
Independence and Dependence
In-class exercise
Conditional Probability
Monty Hall Problem
Mini-Exercise
Can you build a model for the snakes and ladders game? The game starts at square 1. Model only those transitions that take you to square 6 or less. If a roll would take you past square 6, disregard the roll; there is no need to model such a transition.
Remember the elements of a model:
A node (a circle) that indicates a particular square in the game.
An arrow (an edge) that indicates the possibility of reaching another square from the current square.
A label on each arrow that indicates the probability of that transition taking place.
Mini-Exercise - 4
Using the model of snakes and ladders, calculate the following values:
The probability of reaching square 4 from square 1. Ans: 49/216
The probability of reaching square 5 OR square 6 from square 2. Ans: 631/1296
The probability of reaching square 3 AND square 5 from square 1. Ans: 49/1296
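The first and third answers can be checked by summing the probabilities of all paths through the transition model. The sketch below is one possible reading of the model, not the slide's own solution: it assumes every modeled edge has probability 1/6, that rolls past square 6 are simply left out, and that "reaching 3 AND 5" means passing through square 3 and then reaching square 5. The names `reach`, `STEP` and `LAST_SQUARE` are made up for this illustration.

```python
from fractions import Fraction

LAST_SQUARE = 6          # the exercise only models squares 1..6
STEP = Fraction(1, 6)    # assumption: every modeled edge has probability 1/6


def reach(start: int, target: int) -> Fraction:
    """Sum the probabilities of all forward paths from start to target."""
    if start == target:
        return Fraction(1)
    total = Fraction(0)
    # A roll of 1..6 moves forward; rolls that overshoot square 6 are not modeled.
    for nxt in range(start + 1, min(start + 6, LAST_SQUARE) + 1):
        if nxt <= target:
            total += STEP * reach(nxt, target)
    return total


print(reach(1, 4))                 # 49/216
print(reach(1, 3) * reach(3, 5))   # 49/1296
```

Exact fractions are used instead of floating point so that the results can be compared directly with the answers on the slide.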
Mini-Exercise
Let us start with the transition model we built.
If you land on square 6, the snake takes you automatically to square 3. So you can never “reach” square 6.
Hint: New edge(s) might be added as an extension.
Solution to Exercise
(Figure: transition model extended with the snake; edges labeled 1/6.)
Mini-Exercise - 6
Using the new model of snakes and ladders with the snake, calculate the following values:
The probability of reaching square 4 from square 1. Ans: 49/216
The probability of reaching square 5 OR square 6 from square 2. Ans: 49/216 (Remember, you can't reach 6)
The probability of reaching square 3 AND square 5 from square 1. Ans: 49/1296
Sum Rule and Relative Frequency
Consider a die D1 and $n$ rolls of D1.
D1: …
$n_1$ = number of times D1 is 1 = 3 + …
$n_2$ = number of times D1 is 2 = 3 + …
$n_6$ = number of times D1 is 6 = 2 + …
Total = $n$
Relative frequency of D1 is 1 = $n_1/n$
Relative frequency of D1 is 2 = $n_2/n$
Relative frequency of D1 is 6 = $n_6/n$
In the same experiment, $n_{\text{even}}$ = number of times D1 is even = 9 + …, and the relative frequency of D1 is even = $n_{\text{even}}/n$.
But $n_{\text{even}} = n_2 + n_4 + n_6$, so
$$\frac{n_{\text{even}}}{n} = \frac{n_2 + n_4 + n_6}{n} = \frac{n_2}{n} + \frac{n_4}{n} + \frac{n_6}{n}$$
Therefore relative frequency of D1 is even = relative frequency of D1 is 2 + relative frequency of D1 is 4 + relative frequency of D1 is 6
p(D1 is even) = p(D1 is 2 OR 4 OR 6) = p(D1 is 2) + p(D1 is 4) + p(D1 is 6)
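The relative-frequency argument for the sum rule can be tried out numerically. The following is a minimal sketch, assuming a fair die simulated with Python's standard `random` module; the seed and the number of rolls are arbitrary choices made for this illustration.

```python
import random
from collections import Counter

rng = random.Random(0)           # fixed seed so the run is repeatable
n = 6000                         # number of rolls (arbitrary)
rolls = [rng.randint(1, 6) for _ in range(n)]

counts = Counter(rolls)                        # n_1 ... n_6
n_even = sum(1 for r in rolls if r % 2 == 0)   # counted directly from the rolls

# Relative frequency of "even" versus the sum of the three relative frequencies.
freq_even = n_even / n
freq_sum = counts[2] / n + counts[4] / n + counts[6] / n
print(freq_even, freq_sum)
```

The two printed values agree up to floating-point rounding because the underlying counts add up exactly, whatever the rolls happen to be.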
Take Home Exercise - 1
The sum rule for one die with 6 outcomes, with the favorable outcome being an even number, is
p(D1 is even) = p(D1 is 2 OR 4 OR 6) = p(D1 is 2) + p(D1 is 4) + p(D1 is 6)
Question: Derive the sum rule using relative frequencies where the experiment can have K total outcomes and there are F different favorable outcomes.
Independence and Dependence
Until now, we have been looking at events individually. In practice, events are interrelated: some events depend on other events taking place.
Consider the following example. Let us play a small game. Bob rolls a die and asks Alice to guess what he rolled.
1. What is the probability that Alice is correct?
2. Suppose the Oracle tells Alice whether the outcome of the die is an even number or an odd number. Then what is the probability that Alice is correct?
There is an underlying mathematical concept that can be stated explicitly to calculate the answer to this question.
Consider an experiment. The outcome of the experiment can be specified in terms of two different event spaces:
Event 1, Event 2, Event 3, … with probabilities $p_1$, $p_2$, $p_3$, …
Event Even, Event Odd with probabilities $p_{\text{even}}$, $p_{\text{odd}}$
Knowing whether the outcome of the roll of a die is an even number or an odd number allows us to predict the actual outcome more precisely.
Let us calculate how much it improves the quality of our guess.
Exercise-5
Let us demonstrate this concept with an exercise.
Divide yourselves into groups of two and give yourselves the distinguishable names Player A and Player B.
The game proceeds in two phases.
Phase 1: Player A rolls a die. Player B has to guess the outcome.
Phase 2: Player A tells Player B whether the outcome was an odd or an even number. Player B guesses again.
Repeat the game 20 times.
Swap the Player A and Player B roles and repeat.
Pseudo Random Number Generator
For exercises of this type, you do not have to roll the die every time.
You can use the pseudo random number generator at this link.
It automatically generates random numbers between the values that you specify.
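The two-phase guessing game can also be played entirely with a pseudo random number generator. The sketch below uses Python's standard `random` module in place of the linked generator and assumes that Player B guesses uniformly at random among the faces still possible; the function name `play` and the number of rounds are choices made for this illustration.

```python
import random


def play(rounds: int, seed: int = 0) -> tuple[float, float]:
    """Return the fraction of correct guesses (without hint, with hint)."""
    rng = random.Random(seed)
    correct_blind = correct_hinted = 0
    for _ in range(rounds):
        roll = rng.randint(1, 6)
        # Phase 1: Player B guesses among all six faces.
        if rng.randint(1, 6) == roll:
            correct_blind += 1
        # Phase 2: Player B is told the parity and guesses among three faces.
        candidates = [f for f in range(1, 7) if f % 2 == roll % 2]
        if rng.choice(candidates) == roll:
            correct_hinted += 1
    return correct_blind / rounds, correct_hinted / rounds


blind, hinted = play(20_000)
print(f"without suggestion: {blind:.3f}, with suggestion: {hinted:.3f}")
```

With many rounds the two fractions should settle near 1/6 and 1/3. With only 20 rounds, as in the classroom version, they can deviate noticeably.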
Data
Collect the data in the following form:
Roll | Guess before suggestion | Guess after suggestion | Correct outcome
…
Consolidate the data from the above table in the following form:
CASE | Total number of guesses | Correct guesses | Probability of correct guess*
Without suggestion | | |
With suggestion | | |
* Use the frequentist definition
Lessons Learnt
What did you observe in the probabilities? The probability improves after some additional information.
Why does it improve? Because the additional information discards some events, and the size of the event space decreases.
So what is the relation between the information and the events? There is a “dependence” of the events on this information.
Can you show mathematically what the dependence is? The answer is “Conditional Probability”.
Product Rule and Relative Frequency
Consider two dice, D1 and D2, and multiple simultaneous rolls of the two dice.
D1: …
D2: …
$p_1$ = number of times D1 is 1 = 3 + …
$p_2$ = number of times D1 is 2 = 2 + …
$p_6$ = number of times D1 is 6 = …
Total = $p$
$q_1$ = number of times D2 is 1 = 2 + …
$q_2$ = number of times D2 is 2 = 1 + …
$q_6$ = number of times D2 is 6 = …
Total = $q$
$r_{11}$ = number of times D1 is 1 and D2 is 1 = 1 + …
$r_{23}$ = number of times D1 is 2 and D2 is 3 = 1 + …
$r_{66}$ = number of times D1 is 6 and D2 is 6 = …
Total = $r$
Product Rule and Relative Frequency
Consider two dice, D1 and D2, and multiple simultaneous rolls of the two dice.
As the total number of trials of D1 and D2 is the same, $p = q = r$.
The share of trials in which D2 is 1 with respect to all trials = $q_1/q$
The share of trials in which D2 is 1 with respect to all trials in which D1 is 1 = $r_{11}/p_1$
As the two dice are independent, for a large number of trials the share of trials in which D2 is 1 should not be affected by the fact that D1 is 1.
Therefore, $q_1/q = r_{11}/p_1$
Product Rule and Relative Frequency
We have $q_1/q = r_{11}/p_1$. By rearranging the terms, $r_{11} = p_1 (q_1/q)$.
Relative frequency of (D1 is 1, D2 is 1), using $p = r$:
$$\frac{r_{11}}{r} = \frac{p_1 (q_1/q)}{r} = \frac{p_1}{r} \cdot \frac{q_1}{q} = \frac{p_1}{p} \cdot \frac{q_1}{q}$$
= Relative frequency of D1 is 1 * Relative frequency of D2 is 1
Probability of D1 is 1 AND D2 is 1 = Probability of D1 is 1 * Probability of D2 is 1
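The product rule can be checked against simulated rolls in the same spirit. This is a sketch under the assumption of two fair, independent dice simulated with Python's `random` module; the variable names follow the counts p1, q1 and r11 used in the derivation, while the function name `joint_and_product` and the trial count are invented for this illustration.

```python
import random


def joint_and_product(n: int, seed: int = 0) -> tuple[float, float]:
    """Return (r11/n, (p1/n) * (q1/n)) for n simultaneous rolls of two dice."""
    rng = random.Random(seed)
    p1 = q1 = r11 = 0
    for _ in range(n):
        d1, d2 = rng.randint(1, 6), rng.randint(1, 6)
        p1 += d1 == 1                  # D1 is 1
        q1 += d2 == 1                  # D2 is 1
        r11 += d1 == 1 and d2 == 1     # both are 1 in the same trial
    return r11 / n, (p1 / n) * (q1 / n)


joint, product_of_marginals = joint_and_product(60_000)
print(joint, product_of_marginals)
```

For a finite number of trials the two numbers are only approximately equal; the derivation describes what happens as the number of trials grows large.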
Take Home Exercise - 2
The product rule for two dice with 6 outcomes for each die is
Probability of D1 is 1 AND D2 is 1 = Probability of D1 is 1 * Probability of D2 is 1
Question: Derive the product rule using relative frequencies where there are N independent experiments with K outcomes for each experiment.
Conditional Probability
Consider the same situation. Consider an experiment whose outcome can be specified in terms of two different event spaces:
Event A, Event B, Event C, … with probabilities $p_A$, $p_B$, $p_C$, …
Event 1, Event 2, Event 3, … with probabilities $p_1$, $p_2$, $p_3$, …
The refined probability of Event A when information about Event 1 is given is written as P(Event A | Event 1), read as “probability of Event A given Event 1”.
Consider the die
For example, a die is rolled. Through the knowledge of the Oracle, you know that the outcome is an even number. What is the probability that the outcome is 2?
Because only even numbers are now possible, the event space of the die is: Event 2, Event 4, Event 6.
As all these events are equally likely, the probability that Event 2 occurs is 1/3.
Conditional Probability
Thus conditional probability can be defined as follows. If event A is dependent on another event B, then the probability of event A given knowledge about event B is
$$P(\text{Event A} \mid \text{Event B}) = \frac{P(\text{Event A and Event B occurring})}{P(\text{Event B occurring})}$$
For the die problem,
$$P(\text{Die rolled a 2} \mid \text{Die rolled an even number}) = \frac{P(\text{Die rolled 2 and Die rolled even})}{P(\text{Die rolled even})} = \frac{1/6}{1/2} = \frac{1}{3}$$
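The definition can be applied mechanically by counting outcomes. The following is a minimal sketch for a fair six-sided die, using exact fractions; the helper names `prob`, `conditional`, `is_two` and `is_even` are invented for this illustration.

```python
from fractions import Fraction

FACES = range(1, 7)  # a fair six-sided die


def prob(event) -> Fraction:
    """Probability of an event, given as a predicate over die faces."""
    return Fraction(sum(1 for f in FACES if event(f)), len(FACES))


def conditional(a, b) -> Fraction:
    """P(A | B) = P(A and B) / P(B)."""
    return prob(lambda f: a(f) and b(f)) / prob(b)


def is_two(face: int) -> bool:
    return face == 2


def is_even(face: int) -> bool:
    return face % 2 == 0


print(conditional(is_two, is_even))  # 1/3
```

The same `conditional` helper works for any pair of events on the die, for example the probability of rolling a 6 given that the roll is greater than 3.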
Intuition for conditional probability
Let us try to find an equation for conditional probability.
Event spaces: Event A, Event B, Event C, … and Event 1, Event 2, Event 3, …
For example, let us say “Event A” and “Event 1” occur simultaneously.
“Event A and Event 1 occurred simultaneously” is the same as “Event 1 occurred” and “Event A occurred given that Event 1 occurred” (or vice versa).
$$P(\text{Event A and Event 1 occurred}) = P(\text{Event 1 occurred}) \, P(\text{Event A occurred} \mid \text{Event 1 occurred})$$
$$P(\text{Event A occurred} \mid \text{Event 1 occurred}) = \frac{P(\text{Event A and Event 1 occurred})}{P(\text{Event 1 occurred})}$$
Exercise - 6
Consider the following experiment. There are two players in the game.
Player A rolls a pair of dice.
Player B has to guess the two outcomes.
Player A informs Player B of the sum.
Player B guesses again.
Calculate the probability of a correct guess in both cases using conditional probability:
(i) Before knowing the sum
(ii) After knowing the sum
Data
Collect the data in the following form:
Roll | Guess before suggestion | Guess after suggestion | Correct outcome
…
Consolidate the data from the above table in the following form:
CASE | Total number of guesses | Correct guesses | Probability of correct guess*
Without suggestion | | |
With suggestion | | |
* Use the frequentist definition
Using conditional probability
Calculate the probability of a correct guess in both cases using conditional probability:
(i) Before knowing the sum
(ii) After knowing the sum
Recall the formula
$$P(\text{Event A} \mid \text{Event B}) = \frac{P(\text{Event A and Event B occurring simultaneously})}{P(\text{Event B occurring})}$$
Case 1: The first part of the question, without knowledge of the sum.
Define the favorable event as Event A: (1,5)
Total number of events = 6*6 = 36
Number of favorable events = 1
Therefore, P(Event A) = 1/36
Case 2: The second part of the question, when you know the sum.
The total number of events is less than 36 because of the knowledge of the sum.
For example, Event B: Player A informs Player B that the sum = 6. The event space is then reduced to { (1,5), (2,4), (3,3), (4,2), (5,1) }.
First, let us solve the question in the conventional way, using the total number of events and the number of favorable events:
P(Event A | Event B) = 1/5
Now let us use conditional probability to arrive at the same answer.
Let Event A: the two dice show the outcome guessed by Player B
Event B: the sum of the two dice is what Player A told Player B
Using the same example of sum = 6, let us say that Player B guessed (1,5).
P(Event A) = 1/36 = P(Event A AND Event B)
For sum = 6: total number of events = 36, favorable events = (1,5), (2,4), (3,3), (4,2), (5,1)
P(Event B) = P(sum of the two dice is 6) = 5/36
$$P(\text{Event A} \mid \text{Event B}) = \frac{P(\text{Event A and Event B})}{P(\text{Event B})} = \frac{1/36}{5/36} = \frac{1}{5}$$
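The two-dice calculation can be reproduced by enumerating all 36 equally likely outcomes. The sketch below assumes two fair dice and uses exact fractions; `pair_prob` and `guess_given_sum` are hypothetical helper names for this illustration.

```python
from fractions import Fraction
from itertools import product

OUTCOMES = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs


def pair_prob(event) -> Fraction:
    """Probability of an event, given as a predicate over (d1, d2) pairs."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def guess_given_sum(guess: tuple[int, int], total: int) -> Fraction:
    """P(roll == guess | sum == total), by the conditional probability formula."""
    p_b = pair_prob(lambda o: sum(o) == total)
    p_a_and_b = pair_prob(lambda o: o == guess and sum(o) == total)
    return p_a_and_b / p_b


print(pair_prob(lambda o: o == (1, 5)))   # 1/36
print(guess_given_sum((1, 5), 6))         # 1/5
```

Trying other sums shows how much the information helps: a sum of 7 leaves six candidates, while a sum of 2 or 12 leaves exactly one.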
Independence and Conditional Probability
We introduced conditional probability to explain the magnitude of dependence of one random variable upon another.
What is the conditional probability of a random variable X given another random variable Y, if X and Y are independent? Let us see…
If X and Y are independent, then the outcome of Y should not have any effect on the outcome of X. Therefore, given information about Y, the probability of X is not affected.
Therefore, p(X=x | Y=y) = p(X=x)
Independence of two random experiments
Two random events can also be shown to be independent in another way.
Consider two random experiments with the following event spaces:
Random variable X: Event A, Event B, Event C, …
Random variable Y: Event 1, Event 2, Event 3, …
Recall that while discussing the method of intersection of events, we mentioned that the events must be independent for the rule to apply.
The method of intersection of events states that “the probability of two independent events occurring simultaneously is equal to the product of the probabilities of the individual events”.
The most important condition for that to be true is that the two events are independent.
Therefore, another way of checking the independence of two experiments is:
Random variables X and Y are independent if and only if p(X=x, Y=y) = p(X=x) p(Y=y) for every x and y.
Exercise 7
Random variable X
Event | X | Probability
Head | 1 | 0.5
Tail | 0 | 0.5
Random variable Y
Event | Y | Probability
Head | 1 | 0.5
Tail | 0 | 0.5
Joint experiment with Random variable X+Y
Event | Probability
Head – Head | 0.25
Head – Tail | 0.3
Tail – Head | 0.3
Tail – Tail | 0.15
Can you check if the two random variables are independent using the formulation we just discussed?
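One way to carry out the check is to compare every joint probability with the product of the corresponding individual probabilities, as the method of intersection of events requires for independent events. The sketch below takes the three tables of the exercise as given and stores them as exact fractions; the function name `independent` and the dictionary layout are choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Individual and joint probabilities, copied from the exercise tables.
p_x = {"Head": Fraction(1, 2), "Tail": Fraction(1, 2)}
p_y = {"Head": Fraction(1, 2), "Tail": Fraction(1, 2)}
joint = {
    ("Head", "Head"): Fraction(25, 100),
    ("Head", "Tail"): Fraction(30, 100),
    ("Tail", "Head"): Fraction(30, 100),
    ("Tail", "Tail"): Fraction(15, 100),
}


def independent(p_x, p_y, joint) -> bool:
    """True if p(x, y) == p(x) * p(y) for every pair of outcomes."""
    return all(joint[(x, y)] == p_x[x] * p_y[y] for x, y in product(p_x, p_y))


print(independent(p_x, p_y, joint))  # False
```

A single mismatching pair is enough to rule out independence. Here Head – Tail has joint probability 0.3, while the product of the individual probabilities is 0.25.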
Exercise 8
The Monty Hall problem is a probability puzzle based on the American television game show Let's Make a Deal.
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say Number 3, which has a goat. He then says to you, "Do you want to pick door Number 2?" Is it to your advantage to switch your choice?
Developing the intuition
Do you think the host opening the door is independent of your choice of door? Ans: NO!
Hint: If you choose a door with a goat, the host MUST open the other door with the goat. If you choose the door with the car, the host can pick either of the 2 doors with a goat.
So the host opening the door is DEPENDENT on your choice of door!
Now try to solve the problem!
Solution
Imagine doing this experiment 99 times. Suppose you are asked to choose a door.
No. of times you can expect to choose a door with a goat = (99*2)/3 = 66 times.
No. of times you can expect to choose the door with the car = (99*1)/3 = 33 times.
Case 1: You don't switch
No. of times you win the car after the host opens a door with a goat = no. of times you chose the door with the car in the first place = 33 times.
Probability of winning without switching = 33/99 = 1/3.
Solution
Case 2: You switch
Important intuition:
Case 2.1: If you choose a door with a goat, the host MUST open the other door with the goat. As a result, the third, unopened door MUST have the car. Clearly, switching = winning the car.
Case 2.2: If you choose the door with the car, switching clearly does not help at all!
Therefore, the number of times you will win if you switch = number of times you choose a door with a goat in the first place = 66 times (2 x 33 times).
Probability of winning with switching = 66/99 = 2/3.
Result: Switching doubles your chance of winning!
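The counting argument can be compared with a simulation of many games. This is a sketch, assuming the car is placed uniformly at random and that the host, when two goat doors are available, opens one of them at random; the function name `monty_hall` and the number of trials are choices made for this illustration.

```python
import random


def monty_hall(trials: int, switch: bool, seed: int = 0) -> float:
    """Fraction of games won when always switching or always staying."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        choice = rng.randrange(3)
        # The host opens a goat door that is not the contestant's choice.
        opened = rng.choice([d for d in range(3) if d != choice and d != car])
        if switch:
            # Move to the only door that is neither chosen nor opened.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += choice == car
    return wins / trials


print("stay:  ", monty_hall(30_000, switch=False))
print("switch:", monty_hall(30_000, switch=True))
```

The two fractions should come out near 1/3 and 2/3, in line with the 33 and 66 wins out of 99 games in the counting argument.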
Exercise-9
The game: You will be given three cups. There will be a marble under one of them; the other two will be empty.
Divide yourselves into groups of 2. One of the two players is the host for the first 10 rounds; then swap roles.
First play 10 rounds each without changing your choice after the first cup is opened.
Then play another 10 rounds, changing your choice each time.
Compare the probabilities in the two cases.
Data
Game | First choice | First cup opened | Changed choice (if any) | Second cup opened | Correct cup
…
From the above data, calculate the probability of each case:
(i) No change in choice
(ii) Choice changed
Observations
What did you observe? Was the case where you changed your choice better or worse?
Can you mathematically explain the correct answer to the above question and also show by how much?
Hint: Use conditional probabilities
END 46
Let us see if we can prove this using conditional probability.
Define:
1, 2, 3 = the three doors
Event C: contestant chooses Door 1
Event H: host opens Door 3
Before Event C and Event H:
Probability(winning) = P(1 has car) = P(2 has car) = P(3 has car) = 1/3
After the contestant chooses Door 1:
Probability(winning | Event C) = P(1 has car) = 1/3
Probability(losing | Event C) = P(2 has car) + P(3 has car) = 2/3
After Event H, that is, the host opening Door 3 with a goat behind it, there are two choices for the contestant: switch or not switch.
Probability(winning | Event C and Event H and Not switch) = P(1 has car) = 1/3
Probability(losing | Event C and Event H and Not switch) = P(2 has car) + P(3 has car) = 2/3
But now P(3 has car) = 0. That means
Probability(losing | Event C and Event H and Not switch) = P(2 has car) = 2/3
Therefore P(1 has car) = 1/3 and P(2 has car) = 2/3.
Switching is twice as likely to win the car as staying.
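The probabilities after Event C and Event H can also be computed directly from the conditional probability formula. The sketch below makes one assumption that the slides leave implicit: when the contestant has picked the car, the host opens either goat door with probability 1/2. The names `host_opens` and `posterior` are invented for this illustration.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def host_opens(opened: int, car: int, chosen: int) -> Fraction:
    """P(host opens `opened` | car location, contestant's choice)."""
    # The host never opens the chosen door or the door hiding the car.
    options = [d for d in DOORS if d != chosen and d != car]
    return Fraction(1, len(options)) if opened in options else Fraction(0)


def posterior(chosen: int = 1, opened: int = 3) -> dict[int, Fraction]:
    """P(car behind each door | contestant chose `chosen`, host opened `opened`)."""
    prior = Fraction(1, 3)  # the car is equally likely to be behind any door
    joint = {car: prior * host_opens(opened, car, chosen) for car in DOORS}
    evidence = sum(joint.values())  # P(host opens `opened`)
    return {car: p / evidence for car, p in joint.items()}


print(posterior())  # {1: Fraction(1, 3), 2: Fraction(2, 3), 3: Fraction(0, 1)}
```

The result agrees with P(1 has car) = 1/3 and P(2 has car) = 2/3. The host's forced choice when the car is behind Door 2 is what shifts the probability onto that door.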
Let us analyze this using conditional probability
Doors: Door 1, Door 2, Door 3
Let us say that the contestant chooses Door 1.
The host opens a door with a goat (say Door 3).
Probability of winning = 1/3
Probability of losing = 2/3
Now there are two choices for the contestant: switch, or stay with the previous choice.
Let us analyze both cases.
CASE 1: If the contestant stays with the previous choice
Because Door 3 has been opened, we can remove it from the choices. The remaining doors are Door 1 and Door 2.
Probability of winning = 1/3
Probability of losing = 2/3
CASE 2: If the contestant switches the choice
The remaining doors are Door 1 and Door 2, and the contestant moves from Door 1 to Door 2.
Probability of winning = 2/3
Probability of losing = 1/3
Therefore switching is better than staying with the previous choice.