Probability.


Probability

Lotto I am offered two lotto cards: Card 1: has numbers Card 2: has numbers Which card should I take so that I have the greatest chance of winning lotto?

Roulette In the casino I wait at the roulette wheel until I see a run of at least five reds in a row. I then bet heavily on a black. I am now more likely to win.

Coin Tossing I am about to toss a coin 20 times. What do you expect to happen? Suppose that the first four tosses have been heads and there are no tails so far. What do you expect will have happened by the end of the 20 tosses?

Coin Tossing
Option A: Still expect to get 10 heads and 10 tails. Since there are already 4 heads, now expect to get 6 heads from the remaining 16 tosses. In the next few tosses, expect to get more tails than heads.
Option B: There are 16 tosses to go. For these 16 tosses I expect 8 heads and 8 tails. Now expect to get 12 heads and 8 tails for the 20 throws.
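
The reasoning behind Option B can be checked with a short calculation. The sketch below assumes a fair coin and independent tosses (the usual reading of the slide) and computes the expected number of heads in the remaining 16 tosses exactly; the function and variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

def expected_heads(n_tosses, p=Fraction(1, 2)):
    """Expected number of heads in n_tosses independent tosses, from the binomial distribution."""
    return sum(
        k * comb(n_tosses, k) * p**k * (1 - p) ** (n_tosses - k)
        for k in range(n_tosses + 1)
    )

heads_so_far = 4   # the first four tosses were all heads
remaining = 16     # tosses still to come

# Past tosses do not change the coin, so only the remaining tosses are uncertain.
expected_total_heads = heads_so_far + expected_heads(remaining)
print(expected_total_heads)
```

Under these assumptions the expected total is 12 heads, which matches Option B: the coin has no memory, so the early run of heads is not "paid back" later.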

TV Game Show In a TV game show, a car will be given away. 3 keys are put on the table, with only one of them being the right key. The 3 finalists are given a chance to choose one key and the one who chooses the right key will take the car. If you were one of the finalists, would you prefer to be the 1st, 2nd or last to choose a key?

Let’s Make a Deal Game Show You pick one of three doors: two have booby prizes behind them, and one has lots of money behind it. The game show host then shows you a booby prize behind one of the other doors. Then he asks you, “Do you want to change doors?” Should you? (Does it matter?) See the following website: http://www.stat.sc.edu/~west/javahtml/LetsMakeaDeal.html

Game Show Dilemma Suppose you choose door A. Monty Hall will then show you either door B or door C, depending upon what is behind each. No Switch Strategy ~ here is what happens:
A      B      C      Result
Car    Goat   Goat   Win
Goat   Car    Goat   Lose
Goat   Goat   Car    Lose
P(WIN) = 1/3

Game Show Dilemma Suppose you choose door A, but ultimately switch. Again Monty Hall will show you either door B or door C, depending upon what is behind each. Switch Strategy ~ here is what happens:
A      B      C      Result
Car    Goat   Goat   Lose
Goat   Car    Goat   Win
Goat   Goat   Car    Win
P(WIN) = 2/3 !
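
The two strategies can also be compared by enumerating every case. The following is a minimal sketch, assuming the car is equally likely to be behind each door, the host always opens a goat door other than your pick, and he chooses at random when two goat doors are available (that last assumption does not affect the result). The names are illustrative.

```python
from fractions import Fraction

DOORS = ("A", "B", "C")

def win_probability(switch, pick="A"):
    """Exact probability of winning the car when staying or switching."""
    wins = Fraction(0)
    for car in DOORS:  # each car location has probability 1/3
        # Doors the host may open: not the contestant's pick and not the car.
        openable = [d for d in DOORS if d != pick and d != car]
        for opened in openable:
            weight = Fraction(1, 3) / len(openable)
            final = pick
            if switch:
                # Switch to the one door that is neither picked nor opened.
                final = next(d for d in DOORS if d not in (pick, opened))
            if final == car:
                wins += weight
    return wins

print("stay:  ", win_probability(switch=False))
print("switch:", win_probability(switch=True))
```

The enumeration agrees with the slides: staying wins with probability 1/3 and switching wins with probability 2/3.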

Matching Birthdays In a room with 23 people what is the probability that at least two of them will have the same birthday? Answer: .5073 or 50.73% chance!!!!! How about 30? .7063 or 71% chance! How about 40? .8912 or 89% chance! How about 50? .9704 or 97% chance!
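
These figures can be reproduced with the complement rule: find the probability that all birthdays differ and subtract it from 1. The sketch below assumes 365 equally likely birthdays (no leap years) and independent people; the function name is illustrative.

```python
def p_shared_birthday(n_people, days=365):
    """Probability that at least two of n_people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n_people):
        # Person i must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

for n in (23, 30, 40, 50):
    print(n, round(p_shared_birthday(n), 4))
```

Rounded to four decimals, the results for 23, 30, 40 and 50 people agree with the values quoted on the slide.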

Probability In our discussion of probability we will:
Introduce the basic ideas about probabilities: what they are and where they come from; simple probability models (genetics); conditional probabilities; independent events; Bayes’ Rule.
Examine how to calculate probabilities: using counting methods, tables of counts (not in text), and properties of probabilities such as independence.

What is the probability it will turn up heads? I toss a fair coin (where fair means ‘equally likely outcomes’). What are the possible outcomes? Head and tail. This is called a “dichotomous experiment” because it has only two possible outcomes: S = {H, T}. What is the probability it will turn up heads? 1/2. I choose a duck nest at random and observe whether it gets predated or not. The possible outcomes are Predated or Not Predated (“Success” and “Failure”). What is the probability of predation? ? What factors influence this probability? ?

What are Probabilities? A probability is a number between 0 and 1 that quantifies uncertainty. A probability of 0 identifies impossibility. A probability of 1 identifies certainty.

Where do probabilities come from? Probabilities from models: The probability of getting a four when a fair die is rolled is 1/6 (0.1667 or 16.7% chance). The probability of winning craps on a “Pass Line” bet is .493. The probability of winning the jackpot in a Powerball lottery is .000000006844.

Where do probabilities come from? Probabilities from data, or empirical probabilities: What is the probability that a randomly selected starling is female? A random sample of n = 67 starlings was taken. 40 of these starlings are female. The estimated probability that a randomly chosen starling will be a female is 40/67 (0.597 or 59.7% chance).

Where do probabilities come from? Subjective probabilities: The probability that there will be another outbreak of Ebola in Africa within the next year is 0.1. The probability of snow in the next 24 hours is very high. Perhaps the weather forecaster might say there is an 80% chance of snow. A doctor may state your chance of successful treatment, e.g. 70% chance of remission.

Simple Probability Models “The probability that an event A occurs” is written in shorthand as P(A). For equally likely outcomes, and a given event A: P(A) = (Number of outcomes in A) / (Total number of outcomes)

Example 1: Hodgkin’s Disease
                      Response to Treatment
Type            None   Partial   Positive   Row Totals
LD                44        10         18           72
LP                12        18         74          104
MC                58        54        154          266
NS                12        16         68           96
Column Totals    126        98        314      n = 538

Example 1: Hodgkin’s Disease Had positive response to treatment: P(Pos) = 314/538 = .584 or 58.4% chance

Example 1: Hodgkin’s Disease Had at least some response to treatment: P(Par or Pos) = (98 + 314)/538 = 412/538 = .766 or 76.6% chance

Example 1: Hodgkin’s Disease Had LP and positive response to treatment: P(LP and Pos) = 74/538 = .138 or 13.8% chance

Example 1: Hodgkin’s Disease Had LP or NS as their histological type: P(LP or NS) = (104 + 96)/538 = 200/538 = .372 or 37.2% chance
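
The four calculations above all follow the same pattern: add up the relevant cells and divide by n. The sketch below is one way to express that, not part of the original slides. It assumes the full 4 x 3 table of counts; the LP and NS cells for "None" and "Partial" are inferred here from the row and column totals, and the helper name `prob` is hypothetical.

```python
from fractions import Fraction

# Counts by histological type and response; some LP/NS cells are inferred from the totals.
table = {
    "LD": {"None": 44, "Partial": 10, "Positive": 18},
    "LP": {"None": 12, "Partial": 18, "Positive": 74},
    "MC": {"None": 58, "Partial": 54, "Positive": 154},
    "NS": {"None": 12, "Partial": 16, "Positive": 68},
}
ALL_TYPES = tuple(table)
ALL_RESPONSES = ("None", "Partial", "Positive")
n = sum(sum(row.values()) for row in table.values())

def prob(types, responses):
    """P(type in types and response in responses) under equally likely patients."""
    count = sum(table[t][r] for t in types for r in responses)
    return Fraction(count, n)

p_pos = prob(ALL_TYPES, ["Positive"])
p_par_or_pos = prob(ALL_TYPES, ["Partial", "Positive"])
p_lp_and_pos = prob(["LP"], ["Positive"])
p_lp_or_ns = prob(["LP", "NS"], ALL_RESPONSES)

for name, p in [("P(Pos)", p_pos), ("P(Par or Pos)", p_par_or_pos),
                ("P(LP and Pos)", p_lp_and_pos), ("P(LP or NS)", p_lp_or_ns)]:
    print(f"{name} = {float(p):.3f}")
```

Because "Partial" and "Positive" cannot both happen to the same patient, and likewise "LP" and "NS", the "or" probabilities here are simple sums of cell counts.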

Conditional Probability We wish to find the probability of an event occurring given information about the occurrence of another event. For example, what is the probability of developing lung cancer given that we know the person smoked a pack of cigarettes a day for the past 30 years? Key words that indicate conditional probability are: “given that”, “of those”, “if …”, “assuming that”

Conditional Probability “The probability of event A occurring given that event B has already occurred” is written in shorthand as P(A|B)

Independence Events A and B are said to be independent if P(A|B) = P(A) and P(B|A) = P(B) i.e. knowing something about the occurrence of B tells you nothing about the occurrence of A.

Independence  

Example 1: Hodgkin’s Disease Conditional Probs P(Pos|LD) = ? P(Pos|LD) = 18/72 = .25 or a 25% chance

Example 1: Hodgkin’s Disease Conditional Probs P(Pos|LP) = ? P(Pos|LP) = 74/104 = .712 or a 71.2% chance

Example 1: Hodgkin’s Disease Conditional Probs P(Pos|MC) = ? P(Pos|MC) = 154/266 = .579 or a 57.9% chance

Example 1: Hodgkin’s Disease Conditional Probs P(Pos|NS) = ? P(Pos|NS) = 68/96 = .708 or a 70.8% chance
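
Conditioning on a histological type simply restricts attention to that row of the table. As a rough sketch (not from the original slides), the code below computes P(Pos|type) for each type and compares it with the unconditional P(Pos), which is the comparison the definition of independence asks for. It only checks exact equality in the sample; a formal test of independence would need more than this.

```python
from fractions import Fraction

# Positive-response counts and row totals taken from the Hodgkin's disease table.
positive = {"LD": 18, "LP": 74, "MC": 154, "NS": 68}
row_total = {"LD": 72, "LP": 104, "MC": 266, "NS": 96}
n = sum(row_total.values())

p_pos = Fraction(sum(positive.values()), n)  # unconditional P(Pos)
p_pos_given = {t: Fraction(positive[t], row_total[t]) for t in positive}

for t, p in p_pos_given.items():
    print(f"P(Pos|{t}) = {float(p):.3f}   P(Pos) = {float(p_pos):.3f}")

# If response were independent of type, every P(Pos|type) would equal P(Pos).
looks_independent = all(p == p_pos for p in p_pos_given.values())
print("P(Pos|type) equals P(Pos) for every type:", looks_independent)
```

In these data the conditional probabilities range from .25 (LD) to about .71 (LP and NS), so knowing the type clearly changes the chance of a positive response.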

Example 1: Hodgkin’s Disease See the tutorial “Bivariate Displays for Categorical Data …” for full details on this example and how to construct this plot in JMP. It is under JMP Tutorials, not the new JMP Videos page.

Example 1: Hodgkin’s Disease

Example 1: Hodgkin’s Disease Right-click on a colored panel in the mosaic plot and select Cell Labeling > Show Percents.  

Example 2: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991)
                   Brain Injury (BI)   No Brain Injury (NBI)   Row Totals
No Helmet (NH)                    97                    1918         2015
Helmet Worn (H)                   17                     977          994
Column Totals                    114                    2895         3009
BI = the event the motorcyclist sustains brain injury; NBI = no brain injury; H = the event the motorcyclist was wearing a helmet; NH = no helmet worn.
What is the probability that a motorcyclist involved in an accident sustains brain injury? P(BI) = 114/3009 = .0379

Example 2: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) What is the probability that a motorcyclist involved in an accident was wearing a helmet? P(H) = 994/3009 = .3303

Example 2: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) What is the probability that the cyclist sustained brain injury given they were wearing a helmet? P(BI|H) = 17/994 = .0171

Example 2: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) What is the probability that a cyclist not wearing a helmet sustained brain injury? P(BI|NH) = 97/2015 = .0481

Example 2: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) How many times more likely is a non-helmet wearer to sustain brain injury? P(BI|NH)/P(BI|H) = .0481/.0171 = 2.81 times more likely. This is called the relative risk or risk ratio (denoted RR).
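
The relative risk is just a ratio of two conditional probabilities, so it is easy to compute directly from the counts. A minimal sketch follows; the function name and argument names are illustrative, not from the original slides.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the risk in the exposed group to the risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total        # e.g. P(BI|NH)
    risk_unexposed = unexposed_cases / unexposed_total  # e.g. P(BI|H)
    return risk_exposed / risk_unexposed

# "Exposed" here means not wearing a helmet.
p_bi_given_nh = 97 / 2015
p_bi_given_h = 17 / 994
rr_no_helmet = relative_risk(97, 2015, 17, 994)

print(f"P(BI|NH) = {p_bi_given_nh:.4f}")
print(f"P(BI|H)  = {p_bi_given_h:.4f}")
print(f"RR       = {rr_no_helmet:.2f}")
```

Working from the raw counts rather than the rounded probabilities avoids small rounding differences in the final ratio.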

Example 2: Helmet Use and Head Injuries in Motorcycle Accidents (Wisconsin, 1991) The shading for Brain Injury for the No Helmet group is roughly three times higher than the shading for Brain Injury for the Helmet Worn group. Motorcyclists not wearing a helmet are at three times the risk of suffering brain injury.

Building a Contingency Table from a Story Example 3: HIV and Condom Use (not in notes) A European study on the transmission of the HIV virus involved 470 heterosexual couples. Originally only one of the partners in each couple was infected with the virus. There were 293 couples that always used condoms. From this group, 3 of the non-infected partners became infected with the virus. Of the 177 couples who did not always use a condom, 20 of the non-infected partners became infected with the virus.

Example 3: HIV and Condom Use (not in notes) Let C be the event that the couple always used condoms (NC is the complement). Let I be the event that the non-infected partner became infected (NI is the complement).
                        Condom Usage
Infection Status      C     NC    Total
I
NI
Total

Example 3: HIV and Condom Use (not in notes) A European study on the transmission of the HIV virus involved 470 heterosexual couples. Originally only one of the partners in each couple was infected with the virus. There were 293 couples that always used condoms. From this group, 3 of the non-infected partners became infected with the virus.
                        Condom Usage
Infection Status      C     NC    Total
I                     3
NI
Total               293             470

Example 3: HIV and Condom Use (not in notes) Of the 177 couples who did not always use a condom, 20 of the non-infected partners became infected with the virus.
                        Condom Usage
Infection Status      C     NC    Total
I                     3     20       23
NI                  290    157      447
Total               293    177      470
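
Filling in the table from the story is a matter of subtraction: every missing cell is a total minus a known cell. The sketch below walks through those steps using only the four numbers given in the story; the variable names are illustrative.

```python
# Numbers stated in the story.
total_couples = 470
always_condom = 293        # couples that always used condoms (C)
infected_condom = 3        # infections among the C couples
infected_no_condom = 20    # infections among the NC couples

# Everything else follows by subtraction.
not_always_condom = total_couples - always_condom

table = {
    "I": {"C": infected_condom, "NC": infected_no_condom},
    "NI": {
        "C": always_condom - infected_condom,
        "NC": not_always_condom - infected_no_condom,
    },
}
row_totals = {status: sum(cells.values()) for status, cells in table.items()}
col_totals = {col: table["I"][col] + table["NI"][col] for col in ("C", "NC")}

print(table)
print("row totals:", row_totals)
print("column totals:", col_totals)
```

A useful check is that the row totals and the column totals both add up to the 470 couples in the study.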

Example 3: HIV and Condom Use (not in notes) What proportion of the couples in this study always used condoms? P(C) = 293/470 = 0.623

Example 3: HIV and Condom Use (not in notes) If a non-infected partner became infected, what is the probability that he/she was one of a couple that always used condoms? P(C|I) = 3/23 = 0.130

Example 3: HIV and Condom Use (not in notes) In what percentage of couples did the non-HIV partner become infected amongst those that did not always use condoms? P(I|NC) = 20/177 = .113 or 11.3%. Amongst those that did always use condoms? P(I|C) = 3/293 = .0102 or 1.02%. What is the relative risk of infection associated with not always using a condom? RR = P(I|NC)/P(I|C) = .113/.0102 = 11.08, i.e. about 11 times more likely to become infected (11.04 if the unrounded fractions are used).

Example 3: HIV and Condom Use (not in notes) The percentage of couples where the non-HIV partner became infected in the non-condom user group is 11 times higher than that for condom group.

Relative Risk (RR) and Odds Ratio (OR) Example: Age at First Pregnancy and Cervical Cancer A case-control study was conducted to determine whether there was increased risk of cervical cancer amongst women who had their first child before age 25. A sample of 49 women with cervical cancer was taken of which 42 had their first child before the age of 25. From a sample of 317 “similar” women without cervical cancer it was found that 203 of them had their first child before age 25. Q: Do these data suggest that having a child at or before age 25 increases risk of cervical cancer?

Odds Ratio (OR) The ODDS for an event A are defined as Odds for A = P(A)/(1 - P(A)). For example, suppose we roll a single die; the odds for a 6 are: Odds for 6 = P(6)/(1 - P(6)) = (1/6)/(1 - (1/6)) = 1/5 (1:5 odds for, or 5:1 odds against), i.e. 1 six for every 5 rolls that don’t result in a six.
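
The conversion between probability and odds is a one-line formula in each direction. The following is a small sketch using exact fractions; the function names are illustrative, and the inverse conversion is included only as a convenience check.

```python
from fractions import Fraction

def odds(p):
    """Odds for an event with probability p (requires p < 1)."""
    return p / (1 - p)

def probability_from_odds(o):
    """Inverse conversion: the probability that corresponds to odds o."""
    return o / (1 + o)

p_six = Fraction(1, 6)
odds_six = odds(p_six)
print("odds for a six:", odds_six)                       # 1 six per 5 non-sixes
print("back to probability:", probability_from_odds(odds_six))
```

Using `Fraction` keeps the die example exact, so the odds come out as the ratio 1/5 rather than a rounded decimal.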

Odds Ratio (OR) The Odds Ratio (OR) for a disease associated with a risk factor is the ratio of the odds for disease for those with the risk factor to the odds for disease for those without the risk factor:
OR = [P(Disease|Risk Factor) / (1 - P(Disease|Risk Factor))] / [P(Disease|No Risk Factor) / (1 - P(Disease|No Risk Factor))]
The numerator is the odds for disease amongst those with the risk factor present; the denominator is the odds for disease amongst those without the risk factor. The Odds Ratio gives us the multiplicative increase in odds associated with having the “risk factor”.

Relative Risk (RR) and Odds Ratio (OR)
                         Cervical Cancer
Age at 1st Pregnancy     Case   Control   Row Totals
Age < 25                   42       203          245
Age > 25                    7       114          121
Column Totals              49       317      n = 366
a) Why can’t we calculate P(Cervical Cancer | Age < 25)? Because the number of women with the disease was fixed in advance and is therefore NOT RANDOM!

Relative Risk (RR) and Odds Ratio (OR) b) What is P(risk factor|disease status) for each group? P(Age < 25|Case) = 42/49 = .857 or 85.7%. P(Age < 25|Control) = 203/317 = .640 or 64.0%.

Relative Risk (RR) and Odds Ratio (OR) c) What are the odds for the risk factor amongst the cases? Amongst the controls? Odds for risk factor amongst cases = .857/(1 - .857) = 5.99. Odds for risk factor amongst controls = .64/(1 - .64) = 1.78.

Relative Risk (RR) and Odds Ratio (OR) d) What is the odds ratio for the risk factor associated with being a case? Odds Ratio (OR) = 5.99/1.78 = 3.37. The odds for having the 1st child on or before age 25 are 3.37 times higher for women who currently have cervical cancer than for those who do not have cervical cancer.

Relative Risk (RR) and Odds Ratio (OR) The ratio of dark to light shading is 3.37 times larger for the cervical cancer group than it is for the control group.

Relative Risk (RR) and Odds Ratio (OR) e) Even though it is inappropriate to do so, calculate P(disease|risk status). P(case|Age < 25) = 42/245 = .171 or 17.1%. P(case|Age > 25) = 7/121 = .058 or 5.8%. Now calculate the odds for disease given the risk factor status. Odds for Disease for 1st Preg. Age < 25 = .171/(1 - .171) = .207 (from the unrounded fraction, 42/203). Odds for Disease for 1st Preg. Age > 25 = .058/(1 - .058) = .061 (from the unrounded fraction, 7/114).

Relative Risk (RR) and Odds Ratio (OR) f) Finally calculate the odds ratio for disease associated with 1st pregnancy age < 25 years of age. Odds Ratio = .207/.061 = 3.37 This is exactly the same as the odds ratio for having the risk factor (Age < 25) associated with being in the cervical cancer group!!!! Final Conclusion: Women who have their first child at or before age 25 have 3.37 times the odds of developing cervical cancer when compared to women who had their first child after the age of 25.

Relative Risk (RR) and Odds Ratio (OR)
                        Disease Status
Risk Factor Status      Case   Control
Risk Factor Present        a         b
Risk Factor Absent         c         d
OR = (a × d)/(b × c)
Much easier computational formula!

Relative Risk (RR) and Odds Ratio (OR) When the disease is fairly rare, i.e. P(disease) < .10 or 10%, one can show that the odds ratio (OR) and relative risk (RR) are similar in value. OR is approximately equal to RR when P(disease) < .10 or 10% chance. In these cases we can use the phrase “… times more likely” when interpreting the OR.
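
The closeness of OR and RR for a rare outcome can be seen in the two earlier examples, where both quantities are legitimate to compute and the outcome is rare (P(BI) = .0379 in the helmet data and P(I) = 23/470 = .049 in the HIV data). The sketch below is an illustration only; the function name and its argument order are assumptions made for this example.

```python
def rr_and_or(exposed_events, exposed_non, unexposed_events, unexposed_non):
    """Return (relative risk, odds ratio) for a 2 x 2 table given as raw cell counts."""
    risk_exposed = exposed_events / (exposed_events + exposed_non)
    risk_unexposed = unexposed_events / (unexposed_events + unexposed_non)
    rr = risk_exposed / risk_unexposed
    odds_ratio = (exposed_events * unexposed_non) / (exposed_non * unexposed_events)
    return rr, odds_ratio

# Helmet data: exposed = no helmet, event = brain injury.
helmet_rr, helmet_or = rr_and_or(97, 1918, 17, 977)
# HIV data: exposed = did not always use condoms, event = partner infected.
hiv_rr, hiv_or = rr_and_or(20, 157, 3, 290)

print(f"Helmet: RR = {helmet_rr:.2f}, OR = {helmet_or:.2f}")
print(f"HIV:    RR = {hiv_rr:.2f}, OR = {hiv_or:.2f}")
```

In both cases the OR is somewhat larger than the RR but of similar size, which is what the rare-disease approximation leads one to expect.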

Relative Risk (RR) and Odds Ratio (OR)
Age at 1st Pregnancy     Case      Control    Row Totals
Age < 25                 a = 42    b = 203           245
Age > 25                 c = 7     d = 114           121
Column Totals            49        317           n = 366
OR = (42 × 114)/(7 × 203) = 3.37
Because less than 10% of the population of women develop cervical cancer we can say: “Women who have their first child at or before age 25 are 3.37 times more likely to develop cervical cancer than women who have their first child after age 25.”
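
The claim that the odds ratio is the same whichever way it is computed can be verified exactly. The sketch below uses the a, b, c, d cell labels from the table and exact fractions; the helper name `odds` is illustrative.

```python
from fractions import Fraction

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1 - p)

# Cell labels as in the table: rows are Age < 25 / Age > 25, columns are Case / Control.
a, b, c, d = 42, 203, 7, 114

# 1) Cross-product shortcut.
or_cross = Fraction(a * d, b * c)

# 2) Odds of the risk factor among cases versus among controls.
or_risk_factor = odds(Fraction(a, a + c)) / odds(Fraction(b, b + d))

# 3) Odds of being a case with versus without the risk factor.
or_disease = odds(Fraction(a, a + b)) / odds(Fraction(c, c + d))

print(or_cross, or_risk_factor, or_disease)
print(round(float(or_cross), 2))
```

All three routes give the same fraction, which rounds to 3.37; this is why the odds ratio can be interpreted in terms of disease even though the case-control design only fixes the risk-factor probabilities.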