The law of total probability and Bayes Formula

1 The law of total probability and Bayes Formula

2 Law of Total Probability
[Figure: sets F1, F2, F3 and event E]
Suppose a sample space S is the union of n pairwise disjoint sets F_1, F_2, ..., F_n, and suppose E is any subset of S. Then E is the union of the n pairwise disjoint sets E∩F_1, E∩F_2, ..., E∩F_n, and therefore P(E) is the following sum:
P(E) = P(E∩F_1) + P(E∩F_2) + ... + P(E∩F_n)
The multiplication theorem for conditional probability can be applied to each of these intersections (i.e., P(E∩F_k) = P(F_k)P(E|F_k) for k = 1, 2, ..., n), giving:
P(E) = P(F_1)P(E|F_1) + P(F_2)P(E|F_2) + ... + P(F_n)P(E|F_n)

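3 Tree Diagram
Branch F_1: leads to E∩F_1, with probability P(F_1)P(E|F_1), or to E^c∩F_1
Branch F_2: leads to E∩F_2, with probability P(F_2)P(E|F_2), or to E^c∩F_2
...
Branch F_n: leads to E∩F_n, with probability P(F_n)P(E|F_n), or to E^c∩F_n
The total probability of event E is the sum of these n products.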

4 A 'total probability' example
At the University of San Francisco “Women make up 61% of the student body” -- From Wikipedia, the free encyclopedia
“Half of women and 10 per cent of men have, to some degree, a fear of spiders” -- From
Problem: What is the probability that a USF student selected at random has a fear of spiders?
Event E: the student selected at random has some fear of spiders ('Arachnophobia')
Partition of the sample space: F_1 = the student is male, F_2 = the student is female
Solution: P(E) = P(F_1)P(E|F_1) + P(F_2)P(E|F_2) = (0.39)(0.1) + (0.61)(0.5) = 0.039 + 0.305 = 0.344
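The sum on this slide can be checked with a few lines of Python. This is only a sketch: the function name `total_probability` and its input checks are illustrative choices, not part of the original slides.

```python
def total_probability(priors, likelihoods):
    """Return P(E) = sum over k of P(F_k) * P(E | F_k) for a partition F_1..F_n."""
    if len(priors) != len(likelihoods):
        raise ValueError("need one conditional probability per partition set")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("partition probabilities must sum to 1")
    return sum(p * q for p, q in zip(priors, likelihoods))


# Partition from the slide: F_1 = male (39%), F_2 = female (61%).
# Conditional probabilities of fearing spiders: 10% for men, 50% for women.
p_afraid = total_probability([0.39, 0.61], [0.10, 0.50])
print(round(p_afraid, 3))  # 0.344
```

The check that the partition probabilities sum to 1 reflects the requirement that F_1, ..., F_n cover the whole sample space.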

5 Bayes Formula
Recalling the formula that defines conditional probability, P(F_k|E) = P(F_k∩E) / P(E), and the multiplication theorem for conditional probability, P(F_k∩E) = P(F_k)P(E|F_k), we can use the law of total probability in the denominator of that defining formula:
P(F_k|E) = P(F_k)P(E|F_k) / [ P(F_1)P(E|F_1) + P(F_2)P(E|F_2) + ... + P(F_n)P(E|F_n) ]
where F_1, F_2, ..., F_n are pairwise disjoint sets whose union is the full sample space.
Continuing with our spider-fear example, we can use Bayes Formula to answer questions such as: What is the probability of a randomly selected student being female if we know that this student has a fear of spiders to some degree?

6 A 'Bayes Formula' exercise
Tree diagram:
male (39%): fear of spiders (10%), unafraid (90%)
female (61%): fear of spiders (50%), unafraid (50%)
Recall that we already calculated the total probability: P(afraid) = 0.344
So, applying Bayes Formula, we learn the conditional probability of a random student being male, given that this student has a fear of spiders:
P(male | afraid) = P(male)P(afraid | male) / P(afraid) = (0.39)(0.1) / (0.344) ≈ 0.113
In other words, the “extra” knowledge that the randomly selected student has a fear of spiders makes it less probable that this student is male (about 11% rather than 39%).
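Bayes Formula for every set of the partition at once can be sketched as follows. The helper name `bayes_posterior` is hypothetical; the numbers are the ones from the spider-fear example.

```python
def bayes_posterior(priors, likelihoods):
    """Return the list of P(F_k | E) given the P(F_k) and the P(E | F_k)."""
    joint = [p * q for p, q in zip(priors, likelihoods)]  # P(F_k ∩ E)
    total = sum(joint)                                     # P(E), law of total probability
    if total == 0:
        raise ValueError("P(E) is zero, so conditioning on E is undefined")
    return [j / total for j in joint]


# Order of the partition: male, female.
p_male, p_female = bayes_posterior([0.39, 0.61], [0.10, 0.50])
print(round(p_male, 3), round(p_female, 3))  # 0.113 0.887
```

The second value answers the question posed on the previous slide: given a fear of spiders, the student is female with probability of about 0.887.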

7 U.S. population statistics
The U.S. Census Bureau reported these figures:
Year 2000 population: 281,421,906
Year 2010 population: 308,745,536
If population grew linearly during this decade, then in year 2007 the U.S. population was (281,421,906)(0.3) + (308,745,536)(0.7) = 300,548,447
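The weights 0.3 and 0.7 come from 2007 lying seven-tenths of the way from 2000 to 2010. A minimal sketch of that linear interpolation, with `interpolate_linear` as an illustrative helper name:

```python
def interpolate_linear(x0, y0, x1, y1, x):
    """Linearly interpolate between the points (x0, y0) and (x1, y1) at x."""
    t = (x - x0) / (x1 - x0)       # fraction of the way from x0 to x1
    return (1 - t) * y0 + t * y1   # weights (1 - t) and t sum to 1


pop_2007 = interpolate_linear(2000, 281_421_906, 2010, 308_745_536, 2007)
print(round(pop_2007))  # 300548447
```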

8 Incidence of lung cancer
For 2007 the U.S. Centers for Disease Control reported 203,586 diagnoses of lung cancer in the United States.
So the probability of being diagnosed with lung cancer in the U.S. during year 2007 was 203,586 / 300,548,447 ≈ 0.00068

10 Smokers versus non-smokers
About 10% of lung cancers occur in nonsmokers, according to American Cancer Society estimates.
Event S: person is a smoker
Event C: person gets a lung cancer diagnosis
So which one of these equalities is correct? P(C | S) = 0.90 or P(S | C) = 0.90

11 Venn Diagram
Total U.S. population in year 2007: 300,548,447 people
Number of 2007 lung cancer diagnoses: 203,586 people
Smokers were about 18 times as likely as nonsmokers to be diagnosed
[Venn diagram: 33% smokers, 67% nonsmokers, and the people who received a lung cancer diagnosis in 2007]
How many sample points (i.e., people) belong to each region of this Venn Diagram?
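One way to approach the counting question is sketched below. It assumes that 90% of those diagnosed were smokers (the complement of the 10% nonsmoker estimate quoted on the previous slide) and treats the 33% and 90% figures as exact, so the region sizes are estimates rather than census counts. The variable names are illustrative.

```python
total_population = 300_548_447
diagnosed = 203_586
p_smoker = 0.33                   # share of the population who smoke
p_smoker_given_diagnosed = 0.90   # assumption: 10% of lung cancers occur in nonsmokers

smokers = p_smoker * total_population
smoker_and_diagnosed = p_smoker_given_diagnosed * diagnosed
nonsmoker_and_diagnosed = diagnosed - smoker_and_diagnosed
smoker_not_diagnosed = smokers - smoker_and_diagnosed
nonsmoker_not_diagnosed = total_population - smokers - nonsmoker_and_diagnosed

regions = {
    "smoker, diagnosed": smoker_and_diagnosed,
    "nonsmoker, diagnosed": nonsmoker_and_diagnosed,
    "smoker, not diagnosed": smoker_not_diagnosed,
    "nonsmoker, not diagnosed": nonsmoker_not_diagnosed,
}
for name, count in regions.items():
    print(f"{name}: {count:,.0f}")
```

The four regions are pairwise disjoint and together cover the whole population, so their sizes add back up to 300,548,447.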

12 Tree Diagram exercise
Tree diagram:
lung cancer diagnosis: smoker, nonsmoker
no lung cancer diagnosis: smoker, nonsmoker
Event C: person gets lung cancer diagnosis (probability = 0.00068)
Event S: person is a smoker (probability = 0.33)
Conditional probabilities: P(S | C) = 0.90 and P(S^c | C) = 0.10
Where do these numbers go in this tree diagram?

13 A useful question
How likely was a lung cancer diagnosis for someone who was a smoker?
P(C | S) = ???
This is a question Bayes Formula may answer.

14 Applying the formulas
Event S: a person selected at random is a smoker
Event C: a person selected at random received a lung cancer diagnosis
P(S|C) = 0.90 (i.e., 90% of lung cancer victims are smokers)
P(S'|C) = 0.10 (i.e., 10% of lung cancer victims are nonsmokers)
P(C) = 203,586 / 300,548,447 ≈ 0.00068 (i.e., 0.068% chance of diagnosis)
P(S) = 0.33 (i.e., about one-third of the U.S. population are smokers)
P(S') = 0.67 (i.e., about two-thirds of the population are nonsmokers)
Law of Total Probability: P(C) = P(S)P(C|S) + P(S')P(C|S')
Bayes Theorem: P(S|C) = P(S)P(C|S) / [ P(S)P(C|S) + P(S')P(C|S') ]
Substituting, with the denominator equal to P(C): 0.90 = (0.33)P(C|S) / 0.00068
Solving for P(C|S) gives P(C|S) = (0.90)(0.00068) / (0.33) ≈ 0.00185
i.e., the probability of a smoker being diagnosed with lung cancer in 2007 was about 0.00185, versus a probability of 0.00068 for any person, smoker or not.
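The rearrangement above can be reproduced numerically. The sketch below uses the rounded slide values, also solves for P(C|S') in the same way (a step the slide does not show), and confirms that the two conditional probabilities recombine to P(C) under the law of total probability. All variable names are illustrative.

```python
p_c = 0.00068          # P(C), rounded as on the slide
p_s = 0.33             # P(S)
p_not_s = 0.67         # P(S')
p_s_given_c = 0.90     # P(S | C)
p_not_s_given_c = 0.10 # P(S' | C)

# Bayes Formula rearranged: P(C | S) = P(S | C) P(C) / P(S)
p_c_given_s = p_s_given_c * p_c / p_s
# Same rearrangement for nonsmokers (not shown on the slide)
p_c_given_not_s = p_not_s_given_c * p_c / p_not_s

# Consistency check with the law of total probability
recovered_p_c = p_s * p_c_given_s + p_not_s * p_c_given_not_s

print(round(p_c_given_s, 5))                    # 0.00185
print(round(p_c_given_s / p_c_given_not_s, 1))  # 18.3
```

The last ratio compares a smoker's probability of diagnosis with a nonsmoker's, which is where a figure of roughly 18 comes from.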

15 An insurance viewpoint
Our foregoing calculations show that approximately 68 people out of every 100,000 individuals in the United States got a lung cancer diagnosis in 2007. When this happens, it can be a catastrophic event for the individual affected, resulting in a huge unplanned financial outlay for medical treatments and other expenses.
If we estimate the average yearly cost of lung cancer care at $50,000 per patient, then this amounts to $3.4 million per 100,000 individuals in the nation, or $34 per person per year. So an opportunity exists for an insurance company to sell an affordable policy that will spread the financial risk, yet still allow a profit.
But if individuals who smoke, and who are thus nearly 3 times as likely as the average person to get a lung cancer diagnosis, are allowed to purchase the insurance policy at the same cost as nonsmokers, and if they do so in numbers greater than their overall proportion of the population, then the insurance company will likely find that its payouts exceed its revenues, which can result in bankruptcy. This observation highlights the significance of what may at first seem like very small probabilities.
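The argument can be made concrete with a small calculation. This sketch assumes the $50,000 yearly cost and the rounded probabilities from the earlier slides; the function `expected_payout_per_policy` and the 50% smoker share used as a test case are hypothetical illustrations, not figures from the source.

```python
COST_PER_PATIENT = 50_000  # assumed average yearly cost of care, in dollars
P_C = 0.00068              # P(C): probability of a diagnosis for any person

p_c_given_s = 0.90 * P_C / 0.33      # P(C | S), from Bayes Formula
p_c_given_not_s = 0.10 * P_C / 0.67  # P(C | S'), same rearrangement


def expected_payout_per_policy(smoker_share):
    """Expected yearly payout per policy when smoker_share of buyers smoke."""
    risk = smoker_share * p_c_given_s + (1 - smoker_share) * p_c_given_not_s
    return risk * COST_PER_PATIENT


# With smokers at their population share (33%) the payout matches the $34 figure.
print(round(expected_payout_per_policy(0.33), 2))
# If smokers are over-represented among buyers (here a hypothetical 50%),
# the expected payout rises above a $34 premium.
print(round(expected_payout_per_policy(0.50), 2))
```

The comparison shows why a single premium priced for the whole population can fall short once the mix of policyholders shifts toward the higher-risk group.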

