Statistics 200 Lecture #12
Thursday, September 29, 2016
Textbook: Sections 7.3, 7.4, 7.7
Objectives:
• Identify complementary events and handle probability calculations
• Identify mutually exclusive events and handle probability calculations
• Identify independent events and handle probability calculations
• Understand that conditioning on a particular individual can change risks
• Develop a sense for why seeming coincidences occur frequently
• Identify, and resist the temptation to fall for, the “gambler’s fallacy”
• Understand a common situation where confusion of the inverse occurs
This week…
Probability: randomness; interpretations of probability (relative frequency, personal probability); sample spaces and events (complementary, mutually exclusive, dependent / independent); basic rules (complement rule, addition rule, multiplication rule)
Thursday: flawed intuition; more probability practice
Summary of Rules
Rule 1: Complement Rule. P(A) + P(A^c) = 1, where A^c represents the complement of A.
Rule 2B: Additive Rule. P(A or B) = P(A) + P(B) if events A and B are mutually exclusive.
Note: two events that are complements are always mutually exclusive.
Rule 3B: Multiplication Rule. P(A and B) = P(A) × P(B) if events A and B are independent.
Example
Maria wants to take French or Spanish, or both. But classes are closed, and she must apply to enroll in a language class. She has a 60% chance of being admitted to French, a 50% chance of being admitted to Spanish, and a 20% chance of being admitted to both French and Spanish. If she applies to both French and Spanish, the probability that she will be enrolled in either French or Spanish (or possibly both) is…
P(French) = 0.6
P(Spanish) = 0.5
P(French and Spanish) = 0.2
Example
The probability that she will be enrolled in either French or Spanish (or possibly both) is…
P(French or Spanish) = P(French) + P(Spanish) – P(both) = 0.6 + 0.5 – 0.2 = 0.9
Clicker Question: Are these events independent? (Yes / No)
Example
The probability that she will be enrolled in either French or Spanish (or possibly both) is…
P(French or Spanish) = P(French) + P(Spanish) – P(both) = 0.6 + 0.5 – 0.2 = 0.9
Clicker Question: Are these events mutually exclusive? (Yes / No)
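The calculation and the two clicker questions can be checked numerically. The following is a minimal sketch using only the three probabilities given in the example; the variable names are illustrative and not part of the lecture.

```python
import math

# Probabilities stated in the Maria example.
p_french = 0.6
p_spanish = 0.5
p_both = 0.2

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_either = p_french + p_spanish - p_both

# Independence would require P(A and B) = P(A) * P(B).
independent = math.isclose(p_both, p_french * p_spanish)

# Mutual exclusivity would require P(A and B) = 0.
mutually_exclusive = math.isclose(p_both, 0.0, abs_tol=1e-12)

print(round(p_either, 3), independent, mutually_exclusive)
```

Because P(both) is neither 0 nor equal to 0.6 × 0.5 = 0.3, neither of the simplified rules (2B or 3B) applies here, which is why the overlap has to be subtracted.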
Specific people vs. a random individual
As people, we constantly hear risks, statistics, and probabilities:
• 1 in 3.5 million planes crash
• 50% of marriages end in divorce
• Acceptance rate at Penn State was 50.3% for Fall 2014
• Four-year graduation rate at Penn State is 66%
• Six-year graduation rate at Penn State is 86%
Do these probabilities apply specifically to YOU?
Do these probabilities apply specifically to YOU? These statistics generally apply to the bigger group – or to a person randomly selected from that population. The proper language to communicate this is often omitted. You should always understand that randomness is part of the communication in these statistics, even though it’s not explicitly mentioned.
Coincidences
Coincidence: a surprising concurrence of events, perceived as meaningfully related, with no apparent causal connection.
Many occurrences are more common than we suspect. Many are unlikely for a specific instance but do happen when lots of instances are possible.
Examples:
• Running into a friend in an unfamiliar city.
• Sharing a birthday with someone in your class.
• Meeting someone who has a dog with your name.
What's the difference between these two statements?
• “I'm confident that there is at least one set of matching birthdays in this room.”
• “I'm confident that there is at least one person in this room whose birthday matches my birthday.”
Which statement is more likely to be true? How many possible pairs of people are eligible for matching in each case? Assume 50 people are in the room.
With 50 people in the room…
There are 49 possible pairs with me.
Pr(no match with my birthday) = (364/365)^49 = 0.874
There are 49 + 48 + 47 + … + 1 = 1225 total possible pairs.
Pr(no match at all) = .030, and we can estimate it by treating the pairs as independent: (364/365)^1225 = .035
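These figures can be reproduced with a short calculation. The sketch below assumes 365 equally likely birthdays (no leap day), which is the assumption implicit in the 364/365 factor; the variable names are illustrative.

```python
from math import comb

n_people = 50
days = 365  # assumption: 365 equally likely birthdays

# Second statement: none of the other 49 people shares my birthday.
p_no_match_with_me = (364 / 365) ** (n_people - 1)

# First statement: every pair of people in the room is eligible to match.
n_pairs = comb(n_people, 2)

# Exact probability that all 50 birthdays are different.
p_no_match_exact = 1.0
for k in range(n_people):
    p_no_match_exact *= (days - k) / days

# Estimate that treats the pairs as if they were independent.
p_no_match_approx = (364 / 365) ** n_pairs

print(n_pairs)
print(round(p_no_match_with_me, 3))
print(round(p_no_match_exact, 3))
print(round(p_no_match_approx, 3))
```

The contrast between 49 pairs and 1225 pairs is what makes the first statement so much safer to bet on than the second.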
Shuffle two decks of cards. Stack the two decks side-by-side, face down next to each other. One by one, flip over one card from each deck. I bet I see at least one match. Do you want to bet against me?
Probability of no match:
Probability of no match on 1st flip: 51/52
Probability of no match on 2nd flip: 51/52
…
Probability of no match on 52nd flip: 51/52
These events are NOT independent; however, they are APPROXIMATELY independent because, say, whether a match occurs on the 36th flip doesn't strongly influence whether a match occurs on the 47th flip. Thus, Pr(no match in 52 flips) ≈ (51/52)^52 ≈ 0.364, so Pr(at least one match) ≈ 0.636.
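As a rough check on the approximate-independence argument above, this sketch multiplies the 52 per-flip probabilities and compares the result with the exact no-match probability. The exact value uses the standard derangement formula, which is not part of the lecture and is brought in here only for comparison.

```python
from math import factorial

n = 52  # cards per deck

# Approximation: treat the 52 flips as independent, each with
# probability 51/52 of not matching.
p_no_match_approx = (51 / 52) ** n

# Exact value (derangement formula, an addition for comparison):
# P(no match) = sum over k = 0..n of (-1)^k / k!
p_no_match_exact = sum((-1) ** k / factorial(k) for k in range(n + 1))

print(round(p_no_match_approx, 3), round(p_no_match_exact, 3))
print(round(1 - p_no_match_approx, 3), round(1 - p_no_match_exact, 3))
```

Both versions put the chance of at least one match well above one half, so betting against the match is the losing side.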
Confusion of the inverse Suppose that a particular disease affects 1% of those who get tested for it. Also suppose that the test is 98% accurate. What would you advise a patient who tests positive if the test result were the only piece of information? True probability of disease: about 33%
Cancer testing: confusion of the inverse
Suppose we have a cancer test for a certain type of cancer.
Sensitivity of the test: If you have cancer, then the probability of a positive test is .98. Pr(+ given you have C) = .98
Specificity of the test: If you do not have cancer, then the probability of a negative test is .98. Pr(– given you do not have C) = .98
Base rate: The percent of the population that has the cancer. This is the probability that someone has C. Suppose for our example it is 1%. Hence, Pr(C) = .01.
Percent table

                   + Positive             – Negative             Base Rate
C (Cancer)         .98 (sensitivity)      .02 (false negative)   .01
no C (no Cancer)   .02 (false positive)   .98 (specificity)      .99

Suppose you go in for a test and it comes back positive. What is the probability that you have cancer?
Table of proportions (given):

         +      –      Base rate
C        .98    .02    .01
no C     .02    .98    .99

Hypothetical table of counts (10,000 people):

         +      –      Total
C        98     2      100
no C     198    9702   9,900
Total    296    9704   10,000

Pr(C given a positive test result) = 98/296 = 33.1%
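The hypothetical table of counts can be rebuilt directly from the three given numbers. This is a minimal sketch of that bookkeeping; the population size of 10,000 matches the slide, and the variable names are illustrative.

```python
sensitivity = 0.98   # Pr(+ given C)
specificity = 0.98   # Pr(- given no C)
base_rate = 0.01     # Pr(C)
population = 10_000  # hypothetical number of people tested

# Split the population by disease status.
with_c = base_rate * population
without_c = population - with_c

# Count the positive tests in each group.
true_positives = sensitivity * with_c
false_positives = (1 - specificity) * without_c

# Among everyone who tests positive, what fraction actually has C?
p_c_given_positive = true_positives / (true_positives + false_positives)

print(round(true_positives), round(false_positives))
print(round(p_c_given_positive, 3))
```

The false positives outnumber the true positives about two to one because the group without cancer is 99 times larger than the group with it. Confusing Pr(+ given C) = .98 with Pr(C given +) ≈ .33 is the confusion of the inverse.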
Probability review Suppose we roll two fair dice, one red and one blue. Let the event A be that we roll the same number on both dice. Let the event B be that the sum of the two dice is even. What is P(A)? What is P(B)? What is P(A and B)? What is P(A or B)?
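One way to check answers to questions like these is to list all 36 equally likely (red, blue) outcomes and count. The sketch below does that with exact fractions; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes as (red, blue) pairs.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Fraction of outcomes for which event(outcome) is true."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def is_a(o):
    # Event A: same number on both dice.
    return o[0] == o[1]

def is_b(o):
    # Event B: the sum of the two dice is even.
    return (o[0] + o[1]) % 2 == 0

p_a = prob(is_a)
p_b = prob(is_b)
p_a_and_b = prob(lambda o: is_a(o) and is_b(o))
p_a_or_b = prob(lambda o: is_a(o) or is_b(o))

print(p_a, p_b, p_a_and_b, p_a_or_b)
# Addition rule check: P(A or B) = P(A) + P(B) - P(A and B)
print(p_a_or_b == p_a + p_b - p_a_and_b)
```

Every double has an even sum, so A is contained in B. That is why P(A and B) equals P(A) and P(A or B) equals P(B).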
More probability review
Suppose that you flip a fair coin until the first occurrence of tails. What is the probability that the first “tails” occurs on the third flip?
• 1/2
• 1/3
• (1/2) × (1/2) × (1/2)
• (1/2) × (1/4) × (1/8)
• (1/2) + (1/4) + (1/8)
Gambler’s fallacy
The gambler’s fallacy rests on two false beliefs:
• Long-term probabilities should apply in the short term (false!)
• Random events should be “self-correcting” (false!)
Example: a gambler who loses 48 times at a slot machine thinks that he is about to win, since he knows the slot machine pays big 1 in every 50 times in the long run.
Law of large numbers (this is true!)
If an event is repeated many times independently with the same probability of success each time, the long-run success proportion will approach that probability.
With independent events, knowing what has happened tells you nothing about what will happen.
Misunderstanding this leads to the gambler’s fallacy, also known as the “law of small numbers” (not a real thing): the mistaken belief that small samples will always be representative of the population from which they are drawn.
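A small simulation can illustrate the contrast between short runs and the long run. This is only a sketch: the function name, the seed, and the checkpoints are arbitrary choices made for illustration, and the exact proportions printed depend on the seed.

```python
import random

def running_proportions(p, n_trials, checkpoints, seed=0):
    """Simulate independent trials with success probability p and
    record the success proportion at each checkpoint."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = 0
    proportions = {}
    for i in range(1, n_trials + 1):
        if rng.random() < p:
            successes += 1
        if i in checkpoints:
            proportions[i] = successes / i
    return proportions

checkpoints = {10, 100, 1_000, 10_000, 100_000}
props = running_proportions(0.5, 100_000, checkpoints)

for n in sorted(props):
    print(n, round(props[n], 4))
```

Early proportions can sit noticeably away from 0.5, while the later ones settle close to it. Nothing in the simulation "corrects" earlier results; they are simply outweighed by the many trials that follow.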
More on Gambler’s Fallacy
Suppose you flip four coins, keeping track of the results in order.
Which is more likely, HHHH or HTTH?
Which is more likely, four total heads or two total heads?
Note: These questions are not the same! One of these questions is often answered incorrectly due to belief in the “law of small numbers” (also known as the gambler’s fallacy).
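The difference between the two questions shows up when all 16 equally likely ordered sequences are listed. The following sketch counts them with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 2^4 = 16 equally likely ordered sequences of four flips.
sequences = ["".join(flips) for flips in product("HT", repeat=4)]
total = len(sequences)

# Question 1: two specific ordered sequences.
p_hhhh = Fraction(sequences.count("HHHH"), total)
p_htth = Fraction(sequences.count("HTTH"), total)

# Question 2: total number of heads, in any order.
p_four_heads = Fraction(sum(s.count("H") == 4 for s in sequences), total)
p_two_heads = Fraction(sum(s.count("H") == 2 for s in sequences), total)

print(p_hhhh, p_htth)
print(p_four_heads, p_two_heads)
```

Any one specific sequence is as likely as any other. Two total heads is more likely than four only because six different sequences produce it.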
More probability review
Suppose that you roll a fair 6-sided die until the first occurrence of a 4. What is the probability that the first 4 occurs on the third roll?
• 1/6
• 5/6
• (5/6) × (5/6) × (1/6)
• (1/6) × (1/6) × (1/6)
• (1/6) + (1/6) + (1/6)
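Questions of the "first occurrence on trial k" type all follow the same pattern under the multiplication rule for independent trials: k − 1 failures followed by one success. The sketch below assumes independent trials with the same success probability each time; the function name is hypothetical.

```python
from fractions import Fraction

def first_success_on(k, p):
    """Probability that the first success occurs on trial k, for
    independent trials with success probability p:
    (k - 1) failures, then one success."""
    return (1 - p) ** (k - 1) * p

# First 4 on the third roll of a fair die.
p_die = first_success_on(3, Fraction(1, 6))

# First tails on the third flip of a fair coin.
p_coin = first_success_on(3, Fraction(1, 2))

print(p_die, p_coin)
```

For the die this is (5/6) × (5/6) × (1/6), and for the coin it is (1/2) × (1/2) × (1/2).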
If you understand today’s lecture…
7.83, 7.84, 7.85, 7.88, 7.89, 7.91, 7.92
Recall Objectives:
• Identify complementary events and handle probability calculations
• Identify mutually exclusive events and handle probability calculations
• Identify independent events and handle probability calculations
• Understand that conditioning on a particular individual can change risks
• Develop a sense for why seeming coincidences occur frequently
• Identify, and resist the temptation to fall for, the “gambler’s fallacy”
• Understand a common situation where confusion of the inverse occurs