The Role of Probability Chapter 5
Objectives
–Understand probability as it pertains to statistical inference
–Understand the concepts “at random” and “equally likely”
–Understand the attributes and applications of the binomial distribution
–Understand the attributes and applications of the normal distribution
–Understand and apply the results of the Central Limit Theorem
Sampling: Population Size=N, Sample Size=n
Simple random sample
–Enumerate all N members of the population (the sampling frame); select n individuals at random (each has the same probability of being selected)
Systematic sample
–Start with the sampling frame; determine the sampling interval (N/n); select the first person at random from the first N/n individuals, then every (N/n)th person thereafter
Stratified sample
–Organize the population into mutually exclusive strata; select individuals at random within each stratum
Convenience sample
–Non-probability sample (not for inference)
Quota sample
–Select a pre-determined number of individuals into the sample from groups of interest
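The three probability sampling schemes above can be sketched in a few lines of Python. This is only an illustration: the population of 100 numbered individuals, the two strata, the fixed seed, and the function names are all assumptions made for the example, and the systematic version assumes n divides N evenly.

```python
import random

def simple_random_sample(frame, n, rng):
    """Each member of the sampling frame has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start within the first interval, then every k-th member."""
    k = len(frame) // n          # sampling interval N/n (assumes n divides N)
    start = rng.randrange(k)     # random position within the first interval
    return frame[start::k][:n]

def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample of a fixed size within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
frame = list(range(1, 101))            # hypothetical population, N = 100
srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strata = {"low": frame[:50], "high": frame[50:]}   # hypothetical strata
strat = stratified_sample(strata, 5, rng)
```

In the systematic sample, consecutive selections are always exactly N/n = 10 positions apart; only the starting point is random.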
Basics
Probability reflects the likelihood that an outcome will occur
0 ≤ Probability ≤ 1
Example 5.1. Basic Probability
[Table: number of boys and girls by age, with totals; 5290 children in all]
P(Select any child) = 1/5290 = 0.0002
Example 5.1. Basic Probability
P(Select a boy) = 2560/5290 = 0.484
P(Select boy age 10) = 418/5290 = 0.079
P(Select child at least 8 years of age) = 2645/5290 = 0.500
Conditional Probability
Probability of an outcome in a specific sub-population
Example 5.1:
P(Select 9 year old from among girls) = P(Select 9 year old|girl) = 461/2730 = 0.169
P(Select boy|6 years of age) = 379/892 = 0.425
Example 5.2. Conditional Probability
               Prostate Cancer   No Prostate Cancer   Total
Low PSA               3                  61             64
Moderate PSA         13                  28             41
High PSA             12                   3             15
Total                28                  92            120
Example 5.2. Conditional Probability
P(Prostate Cancer|Low PSA) = 3/64 = 0.047
P(Prostate Cancer|Moderate PSA) = 13/41 = 0.317
P(Prostate Cancer|High PSA) = 12/15 = 0.80
Sensitivity and Specificity
Sensitivity = true positive fraction = P(test +|disease)
Specificity = true negative fraction = P(test -|disease free)
False negative fraction = P(test -|disease)
False positive fraction = P(test +|disease free)
Example 5.4. Sensitivity and Specificity
                  Affected Fetus   Unaffected Fetus   Total
Positive Screen          9               351            360
Negative Screen          1              4449           4450
Total                   10              4800           4810
Sensitivity and Specificity
Sensitivity = P(test +|disease) = 9/10 = 0.90
Specificity = P(test -|disease free) = 4449/4800 = 0.927
False negative fraction = P(test -|disease) = 1/10 = 0.10
False positive fraction = P(test +|disease free) = 351/4800 = 0.073
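The four fractions above all come from the same 2×2 table, so they can be computed together. A minimal sketch using the Example 5.4 counts; the function name and the tp/fp/fn/tn argument names are illustrative choices, not part of the original example.

```python
def screening_metrics(tp, fp, fn, tn):
    """Test performance measures from the cells of a 2x2 screening table."""
    diseased = tp + fn          # column total: truly affected
    disease_free = fp + tn      # column total: truly unaffected
    return {
        "sensitivity": tp / diseased,                      # P(test +|disease)
        "specificity": tn / disease_free,                  # P(test -|disease free)
        "false_negative_fraction": fn / diseased,          # P(test -|disease)
        "false_positive_fraction": fp / disease_free,      # P(test +|disease free)
    }

# Counts from Example 5.4: 9 of 10 affected screen positive,
# 4449 of 4800 unaffected screen negative.
metrics = screening_metrics(tp=9, fp=351, fn=1, tn=4449)
```

Sensitivity and the false negative fraction sum to 1, as do specificity and the false positive fraction, because each pair is conditioned on the same column of the table.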
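Independence
Two events, A and B, are independent if P(A|B) = P(A) or if P(B|A) = P(B)
Example 5.2. Is screening test independent of prostate cancer diagnosis?
–P(Prostate Cancer) = 28/120 = 0.233
–P(Prostate Cancer|Low PSA) = 0.047
–P(Prostate Cancer|Moderate PSA) = 0.317
–P(Prostate Cancer|High PSA) = 0.80
The conditional probabilities differ from P(Prostate Cancer), so the screening test result and prostate cancer diagnosis are not independent.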
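The independence check for Example 5.2 amounts to comparing each conditional probability with the overall probability. A small sketch using the counts from that example; the variable names and the numerical tolerance are assumptions made for illustration, and the comparison is purely descriptive.

```python
# (men with prostate cancer, total men) at each PSA level, from Example 5.2
counts = {"low": (3, 64), "moderate": (13, 41), "high": (12, 15)}

total_cancer = sum(cancer for cancer, _ in counts.values())
total_men = sum(total for _, total in counts.values())
p_cancer = total_cancer / total_men            # unconditional P(Prostate Cancer)

# P(Prostate Cancer | PSA level) for each level
conditional = {level: cancer / total
               for level, (cancer, total) in counts.items()}

# Independence would require every conditional probability to equal p_cancer.
independent = all(abs(p - p_cancer) < 1e-9 for p in conditional.values())
```

Here `independent` comes out `False`: the probability of prostate cancer rises sharply with PSA level.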
Bayes Theorem
P(A|B) = P(B|A) P(A) / P(B)
Example 5.8. Bayes Theorem P(disease) = Sensitivity = 0.85 = P(test +|disease) P(test +)=0.08 and P(test -) = 0.92 What is P(disease|test +)?
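Bayes' theorem turns the sensitivity, P(test +|disease), into the quantity asked for, P(disease|test +). The sketch below uses the sensitivity and P(test +) given in Example 5.8, but the prevalence is a hypothetical placeholder chosen only for illustration and is not the example's value; the function and constant names are likewise assumptions.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# ASSUMPTION: hypothetical prevalence, for illustration only.
HYPOTHETICAL_PREVALENCE = 0.01

p_disease_given_positive = posterior(
    prior=HYPOTHETICAL_PREVALENCE,   # P(disease), assumed
    likelihood=0.85,                 # sensitivity = P(test +|disease)
    evidence=0.08,                   # P(test +)
)
```

Substituting the actual P(disease) for the placeholder gives the answer to the example.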
Binomial Distribution
Model for discrete outcome
–Process or experiment has 2 possible outcomes: success and failure
–Replications of process are independent
–P(success) is constant for each replication
Binomial Distribution
Notation: n = number of times process is replicated, p = P(success), x = number of successes of interest, 0 ≤ x ≤ n
P(x successes) = [n!/(x!(n−x)!)] p^x (1−p)^(n−x)
Example 5.9. Binomial Distribution
Medication for allergies is effective in reducing symptoms in 80% of patients. If medication is given to 10 patients, what is the probability it is effective in exactly 7?
P(7 successes) = [10!/(7!3!)] (0.8)^7 (0.2)^3 = 120(0.2097)(0.008) = 0.2013
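The calculation in Example 5.9 can be checked with a short function for the binomial probability of exactly x successes in n independent replications. The function name is an illustrative choice; `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials, each with P(success) = p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 5.9: n = 10 patients, p = 0.8, x = 7 patients helped
p_seven = binomial_pmf(7, 10, 0.8)

# The probabilities over all possible x (0 through n) sum to 1.
total = sum(binomial_pmf(x, 10, 0.8) for x in range(11))
```

`comb(10, 7)` is the factor 120 in the worked example, and `p_seven` rounds to 0.2013.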
Normal Distribution Model for continuous outcome Mean=median=mode
Normal Distribution Notation: μ = mean and σ = standard deviation
Example Normal Distribution Body mass index (BMI) for men age 60 is normally distributed with a mean of 29 and standard deviation of 6. What is the probability that a male has BMI less than 35?
Example Normal Distribution P(X<35)=?
Standard Normal Distribution Z
Normal distribution with μ = 0 and σ = 1
Z = (X − μ)/σ
Example Normal Distribution P(X<35) = P(Z < (35 − 29)/6) = P(Z<1) = ?
Example Normal Distribution P(X<35) = P(Z<1). Using Table 1 (Probabilities of Z; table entries represent P(Z < Zi)), P(Z<1.00) = 0.8413
Example Normal Distribution What is the probability that a male has BMI less than 30? P(X<30) = ?
Example Normal Distribution P(X<30) = P(Z < (30 − 29)/6) = P(Z<0.17) = 0.5675
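The two BMI probabilities can also be obtained without a printed table. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

bmi_men = NormalDist(mu=29, sigma=6)    # BMI for men age 60, from the example

z_35 = (35 - 29) / 6                    # standardize: Z = (X - mu) / sigma
p_below_35 = bmi_men.cdf(35)            # P(X < 35), same as P(Z < 1)

p_below_30 = bmi_men.cdf(30)            # P(X < 30) with unrounded Z = 1/6
p_z_017 = NormalDist().cdf(0.17)        # table-style lookup with Z rounded to 0.17
```

The table-based answer rounds Z to two decimal places before the lookup, so `p_z_017` matches the table entry while `p_below_30` is slightly smaller; the two agree to about two decimal places.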
Percentiles of the Normal Distribution A percentile is a value that holds a specified percentage of the distribution below it. The median is the 50th percentile, Q1 is the 25th percentile and Q3 is the 75th percentile.
Percentiles of the Normal Distribution Percentiles are determined by X = μ + Zσ, where μ is the mean, σ is the standard deviation, and Z is the value from the standard normal distribution for the desired percentile (see Table 1A)
Example Percentiles of the Normal Distribution
BMI in men follows a normal distribution with μ = 29, σ = 6. BMI in women follows a normal distribution with μ = 28, σ = 7.
The 90th percentile of the standard normal distribution is Z = 1.282.
The 90th percentile of BMI for men: X = 29 + 1.282(6) = 36.69
The 90th percentile of BMI for women: X = 28 + 1.282(7) = 36.97
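The percentile formula X = μ + Zσ can be sketched with the inverse of the standard normal cumulative distribution, which plays the role of the percentile table. This uses `NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

z_90 = NormalDist().inv_cdf(0.90)   # Z value with 90% of the distribution below it

men_90 = 29 + z_90 * 6              # X = mu + Z * sigma for men
women_90 = 28 + z_90 * 7            # X = mu + Z * sigma for women

# Equivalent shortcut: ask the BMI distribution for its own percentile.
men_90_direct = NormalDist(mu=29, sigma=6).inv_cdf(0.90)
```

Although men have the higher mean BMI, the larger standard deviation among women places their 90th percentile slightly above the men's.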
Central Limit Theorem Suppose we have a population with known mean μ and standard deviation σ. If we take simple random samples of size n with replacement, then for large n, the sampling distribution of the sample means is approximately normal with mean μ and standard deviation σ/√n.
Application
–Non-normal population
–Take samples of size n, as long as n is sufficiently large (usually n > 30 suffices)
–The distribution of the sample mean is approximately normal, so Z can be used to compute probabilities
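A small simulation can make this concrete. The sketch below draws repeated samples from a strongly skewed population and looks at the mean and spread of the sample means. The choice of an exponential population with mean 1 (and so standard deviation 1), the sample size of 40, the number of repetitions, and the seed are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(1)            # fixed seed so the sketch is repeatable
MU, N, REPS = 1.0, 40, 5000       # population mean, sample size, number of samples

# Each entry is the mean of one sample of size N from a skewed population.
sample_means = [
    statistics.fmean([rng.expovariate(1 / MU) for _ in range(N)])
    for _ in range(REPS)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = MU / N ** 0.5       # sigma / sqrt(n); sigma equals MU for an exponential
```

The sample means cluster around the population mean, with a spread close to σ/√n, even though the individual observations are far from normal.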
Example Central Limit Theorem HDL cholesterol has a mean of 54 and standard deviation of 17 in patients over 50. A physician has 40 patients over age 50 and wants to know the probability that their mean cholesterol is above 60.
Example Central Limit Theorem
P(X̄ > 60) = P(Z > (60 − 54)/(17/√40)) = P(Z > 2.23) = 1 − 0.9871 = 0.0129
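The physician's question can be worked numerically by applying the Central Limit Theorem to the mean of n = 40 patients. A minimal sketch with the Python standard library; the variable names are illustrative, and the result relies on the normal approximation being adequate at this sample size.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 54, 17, 40             # HDL mean, standard deviation, sample size

standard_error = sigma / sqrt(n)      # standard deviation of the sample mean
z = (60 - mu) / standard_error        # standardize the threshold of 60
p_above_60 = 1 - NormalDist().cdf(z)  # upper-tail probability P(Z > z)
```

The standard error is about 2.69 and Z is about 2.23, giving a probability of roughly 0.013 that the mean HDL of the 40 patients exceeds 60.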