Chapter 3 Discrete Random Variables and Probability Distributions


1 Chapter 3 Discrete Random Variables and Probability Distributions
3.2 - Probability Distributions for Discrete Random Variables
3.3 - Expected Values
3.4 - The Binomial Probability Distribution
3.5 - Hypergeometric and Negative Binomial Distributions
3.6 - The Poisson Probability Distribution

2 Discrete random variable X
Recall… POPULATION
Discrete random variable X. Examples: shoe size, dosage (mg), # cells, …

Pop values x    Probabilities p(x)    Cumul Probs F(x)
x1              p(x1)                 p(x1)
x2              p(x2)                 p(x1) + p(x2)
x3              p(x3)                 p(x1) + p(x2) + p(x3)
⋮               ⋮                     1
Total           1

[Figure: probability histogram of X, Total Area = 1]
Mean: $\mu = \sum x \, p(x)$
Variance: $\sigma^2 = \sum (x - \mu)^2 \, p(x)$
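As a small illustration of the recap table, the R sketch below builds the three columns for a made-up random variable and then computes its mean and variance. The values in x and p are hypothetical and are not taken from the slides.

```r
# Hypothetical population values and probabilities (illustrative only)
x <- c(1, 2, 3)
p <- c(0.2, 0.5, 0.3)

cum    <- cumsum(p)               # cumulative probabilities F(x)
mu     <- sum(x * p)              # mean: sum of x * p(x)
sigma2 <- sum((x - mu)^2 * p)     # variance: sum of (x - mu)^2 * p(x)

# The same layout as the recap table: x, p(x), F(x)
print(data.frame(x = x, p = p, cum = cum))
print(c(mean = mu, variance = sigma2))
```

The last entry of cum should be 1, matching the "Total Area = 1" requirement.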

3 ~ The Binomial Distribution ~
Used only when dealing with binary outcomes (two categories: “Success” vs. “Failure”), with a fixed probability of Success (π) in the population.
Calculates the probability of obtaining any given number of Successes in a random sample of n independent “Bernoulli trials.”
Has many applications and generalizations, e.g., multiple categories, variable probability of Success, etc.

4 How can we calculate the probability of
POPULATION: 40% Male, 60% Female
For any randomly selected individual, define a binary random variable (1 = Male, 0 = Female).
RANDOM SAMPLE: n = 100
Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100)

x        p(x)     F(x)
x1       p(x1)    F(x1)
x2       p(x2)    F(x2)
x3       p(x3)    F(x3)
⋮        ⋮        1
Total    1

How can we calculate the probabilities P(X = 0), P(X = 1), P(X = 2), …, P(X = 99), P(X = 100), that is, p(x) = P(X = x) for x = 0, 1, 2, 3, …, 100?
And the cumulative probabilities F(x) = P(X ≤ x), for x = 0, 1, 2, 3, …, 100?

5 How can we calculate the probability of
POPULATION: 40% Male, 60% Female
For any randomly selected individual, define a binary random variable (1 = Male, 0 = Female).
RANDOM SAMPLE: n = 100
Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100)
Example: How can we calculate the probability p(25) = P(X = 25)? More generally, p(x) = P(X = x) and F(x) = P(X ≤ x), for x = 0, 1, 2, 3, …, 100?
Solution: Model the sample as a sequence of independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female), where P(H) = 0.4, P(T) = 0.6.

6 How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
[Figure: row of 100 coin positions, numbered 1 to 100]
X = 25 Heads: { H1, H2, H3, …, H25 }
There are 100 possible open slots for H1 to occupy.
For each one of them, there are 99 possible open slots left for H2 to occupy.
For each one of them, there are 98 possible open slots left for H3 to occupy.
…etc…
For each one of them, there are 77 possible open slots left for H24 to occupy.
For each one of them, there are 76 possible open slots left for H25 to occupy.
Hence, there are 100 × 99 × 98 × … × 77 × 76 possible outcomes.
This value is the number of permutations of 25 among 100 coins, denoted 100P25.
HOWEVER…

7 How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
[Figure: row of 100 coin positions, numbered 1 to 100]
X = 25 Heads: { H1, H2, H3, …, H25 }
100 × 99 × 98 × … × 77 × 76 permutations of 25 among 100
HOWEVER… This number unnecessarily includes the distinct permutations of the 25 among themselves, all of which have Heads in the same positions.
For example: [Figure: the labelled Heads rearranged within the same positions] We would not want to count this as a distinct outcome.

8 How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
[Figure: row of 100 coin positions, numbered 1 to 100]
X = 25 Heads: { H1, H2, H3, …, H25 }
100 × 99 × 98 × … × 77 × 76 permutations of 25 among 100
HOWEVER… This number unnecessarily includes the distinct permutations of the 25 among themselves, all of which have Heads in the same positions.
How many is that? By the same logic… 25 × 24 × 23 × … × 3 × 2 × 1, “25 factorial”, denoted 25!
Dividing out the repeats:
$$\frac{100 \times 99 \times 98 \times \cdots \times 77 \times 76}{25 \times 24 \times 23 \times \cdots \times 3 \times 2 \times 1} = \frac{100!}{25!\,75!}$$
“100-choose-25”, denoted $\binom{100}{25}$ or 100C25
R: choose(100, 25)
Calculator: 100 nCr 25
This value counts the number of combinations of 25 Heads among 100 coins.
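The counting argument can be checked numerically. The following R sketch is a quick check, not part of the original slides; the variable names are illustrative.

```r
# Ordered placements of H1, ..., H25 into 100 slots: 100 x 99 x ... x 76 (100P25)
perms <- prod(100:76)

# Orderings of the 25 Heads among themselves: 25!
within <- factorial(25)

# Dividing out the repeats gives the number of combinations
combos <- perms / within

# Both comparisons should print TRUE (up to floating-point tolerance)
print(isTRUE(all.equal(combos, choose(100, 25))))
print(isTRUE(all.equal(factorial(100) / (factorial(25) * factorial(75)),
                       choose(100, 25))))
```

These numbers are far beyond exact integer range, so R works in double precision and the comparison uses all.equal rather than ==.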

9 How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
[Figure: row of 100 coin positions, numbered 1 to 100]
Answer: $\binom{100}{25}$
What is the probability of each such outcome?
Recall that, per toss, P(Heads) = π = 0.4 and P(Tails) = 1 – π = 0.6.
Answer: Via independence in binary outcomes between any two coins,
0.4 × 0.6 × 0.6 × 0.4 × 0.6 × … × 0.6 × 0.4 × 0.4 × 0.6 = $(0.4)^{25} (0.6)^{75}$
Therefore, the probability P(X = 25) is equal to $\binom{100}{25} (0.4)^{25} (0.6)^{75}$
R: dbinom(25, 100, .4)
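A short R check of this slide: multiplying the number of arrangements by the probability of any one arrangement should reproduce dbinom(25, 100, .4). The variable names below are illustrative.

```r
n    <- 100
x    <- 25
prob <- 0.4                                 # P(Heads) per toss

n_outcomes <- choose(n, x)                  # arrangements with exactly 25 Heads
p_each     <- prob^x * (1 - prob)^(n - x)   # probability of any one such arrangement
p_manual   <- n_outcomes * p_each           # P(X = 25) assembled by hand

print(p_manual)
print(dbinom(x, n, prob))                   # built-in binomial pmf, for comparison
```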

10 How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
[Figure: row of 100 coin positions, numbered 1 to 100]
Question: What if the coin were “fair” (unbiased), i.e., π = 1 – π = 0.5?
Answer: $\binom{100}{25}$ possible outcomes, as before.
What is the probability of each such outcome?
Recall that, per toss, P(Heads) = π = 0.5 and P(Tails) = 1 – π = 0.5.
Answer: Via independence in binary outcomes between any two coins,
0.5 × 0.5 × 0.5 × 0.5 × 0.5 × … × 0.5 × 0.5 × 0.5 × 0.5 = $(0.5)^{100}$
This is the “equally likely” scenario!
Therefore, the probability P(X = 25) is equal to $\binom{100}{25} (0.5)^{100}$
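In the fair-coin case every one of the 2^100 sequences is equally likely, so the probability reduces to "favourable outcomes over total outcomes". A minimal R sketch of that observation (variable names are illustrative):

```r
favourable <- choose(100, 25)   # sequences with exactly 25 Heads
total      <- 2^100             # all possible sequences of 100 tosses

p_fair <- favourable / total    # classical "equally likely" probability

print(p_fair)
print(dbinom(25, 100, 0.5))     # same value from the binomial pmf
```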

11 independent, with constant probability (π) per trial
POPULATION: “Success” (probability π) vs. “Failure” (probability 1 – π). In the example: 40% Male, 60% Female.
For any randomly selected individual, define a binary random variable: “Success” vs. “Failure”.
RANDOM SAMPLE: size n (in the example, n = 100)
Discrete random variable X = # “Successes” in sample (0, 1, 2, 3, …, n). In the example: X = # Males in sample (0, 1, 2, 3, …, 99, 100).
Example: What is the probability P(X = x), for x = 0, 1, 2, 3, …, n?
Solution: Model the sample as n Bernoulli trials with P(“Success”) = π, P(“Failure”) = 1 – π, independent, with constant probability (π) per trial. In the example: n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female).
Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability mass function”
$$p(x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x}, \quad x = 0, 1, 2, \ldots, n.$$
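The general formula can be written as a small R function and compared with the built-in dbinom and pbinom. This is a sketch; binom_pmf is a hypothetical helper name, not something defined in the slides.

```r
# Hypothetical helper: binomial pmf written directly from the formula
binom_pmf <- function(x, n, prob) {
  choose(n, x) * prob^x * (1 - prob)^(n - x)
}

x <- 0:100
p <- binom_pmf(x, n = 100, prob = 0.4)   # p(x) = P(X = x)
cdf <- cumsum(p)                         # F(x) = P(X <= x)

print(sum(p))                                        # total probability, should be 1
print(isTRUE(all.equal(p, dbinom(x, 100, 0.4))))     # agrees with dbinom
print(isTRUE(all.equal(cdf, pbinom(x, 100, 0.4))))   # agrees with pbinom
```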

12 Example: Blood Type probabilities, revisited
              Rh Factor
Blood Type    +       –       Total
O             .384    .077    .461
A             .323    .065    .388
B             .094    .017    .111
AB            .032    .007    .039
Total         .833    .166    .999

Suppose n = 10 individuals are to be selected at random from the population. Probability table for X = #(Type O)
Binomial model applies? Check:
1. Independent outcomes? Reasonably assume that outcomes “Type O” vs. “Not Type O” between two individuals are independent of each other. ✓
2. Constant probability π? From table, π = P(Type O) = .461 throughout population. ✓

13 Example: Blood Type probabilities, revisited
Suppose n = 10 individuals are to be selected at random from the population. Probability table for X = #(Type O)
Binomial model applies. X ~ Bin(10, .461)
$$p(x) = \binom{10}{x} (.461)^x (.539)^{10 - x}$$
R: dbinom(0:10, 10, .461)

x     p(x)
0     $\binom{10}{0} (.461)^0 (.539)^{10}$
1     $\binom{10}{1} (.461)^1 (.539)^9$
2     $\binom{10}{2} (.461)^2 (.539)^8$
3     $\binom{10}{3} (.461)^3 (.539)^7$
4     $\binom{10}{4} (.461)^4 (.539)^6$
5     $\binom{10}{5} (.461)^5 (.539)^5$
6     $\binom{10}{6} (.461)^6 (.539)^4$
7     $\binom{10}{7} (.461)^7 (.539)^3$
8     $\binom{10}{8} (.461)^8 (.539)^2$
9     $\binom{10}{9} (.461)^9 (.539)^1$
10    $\binom{10}{10} (.461)^{10} (.539)^0$

F(x) = p(0) + p(1) + … + p(x)
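The numeric columns of this probability table can be produced in R. The sketch below extends the slide's dbinom(0:10, 10, .461) call with pbinom for the cumulative column; the data frame layout and column names are illustrative.

```r
x <- 0:10

tab <- data.frame(
  x   = x,
  p   = dbinom(x, 10, 0.461),   # p(x) = P(X = x)
  cum = pbinom(x, 10, 0.461)    # F(x) = P(X <= x)
)

print(round(tab, 4))
```

The cum column is the running total of the p column, so its last entry should be 1.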

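15 R: probability histogram of X ~ Bin(10, .461)
n = 10
p = .461
pmf = function(x)(dbinom(x, n, p))   # binomial pmf p(x)
N = 100000                           # ASSUMED value; the original number was lost in the transcript
x = 0:10
bin.dat = rep(x, round(N*pmf(x)))    # repeat each x in proportion to p(x)
# one unit-width bar centred on each integer, so bar area equals p(x)
hist(bin.dat, freq = FALSE, breaks = c(-.5, x+.5), col = "green")
axis(1, at = x)                      # tick mark at every value of x
axis(2)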

16 Example: Blood Type probabilities, revisited
Suppose n = 10 individuals are to be selected at random from the population. Probability table for X = #(Type O)
Binomial model applies. X ~ Bin(10, .461)
$$p(x) = \binom{10}{x} (.461)^x (.539)^{10 - x}$$
R: dbinom(0:10, 10, .461)
Also, can show
mean μ = Σ x p(x) = nπ = (10)(.461) = 4.61
and variance σ² = Σ (x – μ)² p(x) = nπ(1 – π) = 2.48
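The shortcut formulas nπ and nπ(1 – π) can be checked against the defining sums. A minimal R sketch, with illustrative variable names:

```r
n    <- 10
prob <- 0.461
x    <- 0:n
p    <- dbinom(x, n, prob)

mu     <- sum(x * p)            # mean from the definition
sigma2 <- sum((x - mu)^2 * p)   # variance from the definition

print(c(by_sum = mu,     by_formula = n * prob))
print(c(by_sum = sigma2, by_formula = n * prob * (1 - prob)))
```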

18 Example: Blood Type probabilities, revisited
Suppose n = 1500 individuals are to be selected at random from the population. Probability table for X = #(Type AB–)
From the table, π = P(Type AB–) = .007. RARE EVENT!
Binomial model applies. X ~ Bin(1500, .007)
Therefore,
$$p(x) = \binom{1500}{x} (.007)^x (.993)^{1500 - x}, \quad x = 0, 1, 2, \ldots, 1500.$$
Also, can show
mean μ = Σ x p(x) = nπ = (1500)(.007) = 10.5
and variance σ² = Σ (x – μ)² p(x) = nπ(1 – π) = (1500)(.007)(.993) ≈ 10.43

19 Example: Blood Type probabilities, revisited
Therefore,
$$p(x) = \binom{1500}{x} (.007)^x (.993)^{1500 - x}, \quad x = 0, 1, 2, \ldots, 1500.$$
RARE EVENT!
Long positive skew as x → 1500 …but contribution ≈ 0
Is there a better alternative?
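To get a feel for how little the long right tail contributes, the R sketch below measures the probability beyond x = 40 and finds where 99.9% of the mass has accumulated. The cut-off 40 and the 99.9% level are arbitrary choices for illustration.

```r
n    <- 1500
prob <- 0.007   # P(Type AB-) from the blood type table

# Probability mass strictly beyond x = 40, i.e. P(X > 40)
tail_mass <- pbinom(40, n, prob, lower.tail = FALSE)

# Smallest x with F(x) >= 0.999
x_999 <- qbinom(0.999, n, prob)

print(tail_mass)
print(x_999)
```

Almost all of the 1501 possible values of x carry negligible probability, which motivates looking for a simpler model.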

21 Example: Blood Type probabilities, revisited
Suppose n = 1500 individuals are to be selected at random from the population. Probability table for X = #(Type AB–)
RARE EVENT! Binomial model applies. X ~ Bin(1500, .007)
Therefore,
$$p(x) = \binom{1500}{x} (.007)^x (.993)^{1500 - x}, \quad x = 0, 1, 2, \ldots, 1500,$$
with mean μ = Σ x p(x) = nπ = 10.5 and variance σ² = Σ (x – μ)² p(x) = nπ(1 – π).
Is there a better alternative?
Poisson distribution:
$$p(x) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, \ldots,$$
where mean and variance are μ = nπ and σ² = nπ.
Here μ = nπ = 10.5, so X ~ Poisson(10.5)
Notation: Sometimes the symbol λ (“lambda”) is used instead of μ (“mu”).

22 Example: Blood Type probabilities, revisited
Suppose n = 1500 individuals are to be selected at random from the population. Probability table for X = #(Type AB–)
RARE EVENT! X ~ Bin(1500, .007)
Is there a better alternative?
Poisson distribution:
$$p(x) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, \ldots,$$
where mean and variance are μ = nπ and σ² = nπ.
Here μ = nπ = 10.5, so X ~ Poisson(10.5)
Ex: Probability of exactly X = 15 Type(AB–) individuals = ?
Poisson: $p(15) = \dfrac{e^{-10.5} (10.5)^{15}}{15!}$
Binomial: $p(15) = \binom{1500}{15} (.007)^{15} (.993)^{1485}$
(both ≈ .0437)
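The two calculations on this slide can be reproduced in R with dbinom and dpois. This is a sketch with illustrative variable names, and the range 0:40 used for the overall comparison is an arbitrary choice.

```r
n    <- 1500
prob <- 0.007
mu   <- n * prob                       # 10.5

p_binom <- dbinom(15, n, prob)         # exact binomial probability
p_pois  <- dpois(15, mu)               # Poisson approximation
p_hand  <- exp(-mu) * mu^15 / factorial(15)   # Poisson pmf written out

print(c(binomial = p_binom, poisson = p_pois, by_hand = p_hand))

# Largest pointwise gap between the two pmfs over x = 0, ..., 40
print(max(abs(dbinom(0:40, n, prob) - dpois(0:40, mu))))
```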

23 Example: Deaths in Wisconsin

24 Example: Deaths in Wisconsin
Assuming deaths among young adults are relatively rare, we know the following:
Average deaths per year λ = 584
Mortality rate (α) seems constant.
Therefore, the Poisson distribution can be used as a good model to make future predictions about the random variable X = “# deaths” per year, for this population (15-24 yrs)… assuming current values will still apply.

Probability of exactly X = 600 deaths next year:
P(X = 600) = $\dfrac{e^{-584} (584)^{600}}{600!}$ = 0.0131
R: dpois(600, 584)

Probability of exactly X = 1200 deaths in the next two years:
Mean of 584 deaths per yr ⇒ mean of 1168 deaths per two yrs, so let λ = 1168:
P(X = 1200) = $\dfrac{e^{-1168} (1168)^{1200}}{1200!}$

Probability of at least one death per day:
λ = 584/365 = 1.6 deaths/day
P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + … True, but not practical.
P(X ≥ 1) = 1 – P(X = 0) = 1 – $\dfrac{e^{-1.6} (1.6)^0}{0!}$ = 1 – e^(–1.6) = 0.798
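The three calculations can be sketched in R with dpois, using the yearly mean of 584 from the slide. The variable names are illustrative.

```r
lambda_year <- 584                       # mean deaths per year

# Exactly 600 deaths next year (the slide reports 0.0131)
p600 <- dpois(600, lambda_year)

# Exactly 1200 deaths in two years: the mean scales with the time window
p1200 <- dpois(1200, 2 * lambda_year)

# At least one death on a given day, via the complement of "no deaths"
lambda_day     <- lambda_year / 365      # 1.6 deaths per day
p_at_least_one <- 1 - dpois(0, lambda_day)

print(c(p600 = p600, p1200 = p1200, p_at_least_one = p_at_least_one))
```

The complement trick avoids summing the infinitely many terms P(X = 1) + P(X = 2) + …; ppois(0, lambda_day, lower.tail = FALSE) gives the same result.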

25 Classical Discrete Probability Distributions
Binomial ~ X = # Successes in n trials, P(Success) = π
Poisson ~ As above, but n large, π small, i.e., Success RARE
Negative Binomial ~ X = # trials for k Successes, P(Success) = π
Geometric ~ As above, but specialized to k = 1
Hypergeometric ~ As Binomial, but π changes between trials
Multinomial ~ As Binomial, but for multiple categories, with π1 + π2 + … + πlast = 1 and x1 + x2 + … + xlast = n
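Base R has a density function for each of these distributions. The sketch below uses made-up numbers (they do not come from the slides) and checks each call against the hand formula. Note that R's dnbinom and dgeom count Failures before the k-th Success rather than total trials, so a trial count t is passed as t – k.

```r
prob <- 0.4   # hypothetical P(Success)

# Negative binomial: 3rd Success occurs on trial 5 (so 2 Failures first)
p_nb      <- dnbinom(5 - 3, size = 3, prob = prob)
p_nb_hand <- choose(4, 2) * prob^3 * (1 - prob)^2

# Geometric: 1st Success occurs on trial 4 (so 3 Failures first)
p_geo      <- dgeom(4 - 1, prob = prob)
p_geo_hand <- (1 - prob)^3 * prob

# Hypergeometric: 10 people (4 Male, 6 Female), sample 3 WITHOUT
# replacement, probability of exactly 2 Males
p_hyp      <- dhyper(2, m = 4, n = 6, k = 3)
p_hyp_hand <- choose(4, 2) * choose(6, 1) / choose(10, 3)

# Multinomial: 4 trials, three categories, observed counts (2, 1, 1)
p_multi      <- dmultinom(c(2, 1, 1), prob = c(0.5, 0.3, 0.2))
p_multi_hand <- factorial(4) / (factorial(2) * 1 * 1) * 0.5^2 * 0.3 * 0.2

print(c(negbin = p_nb, geometric = p_geo, hypergeo = p_hyp, multinom = p_multi))
```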

