Chapter 3: Discrete Random Variables and Probability Distributions
3.1 - Random Variables
3.2 - Probability Distributions for Discrete Random Variables
3.3 - Expected Values
3.4 - The Binomial Probability Distribution
3.5 - Hypergeometric and Negative Binomial Distributions
3.6 - The Poisson Probability Distribution
Recall… POPULATION: Discrete random variable X. Examples: shoe size, dosage (mg), # cells, …

Pop values x    Probabilities f(x)    Cumul Probs F(x)
x1              f(x1)                 f(x1)
x2              f(x2)                 f(x1) + f(x2)
x3              f(x3)                 f(x1) + f(x2) + f(x3)
⋮               ⋮                     ⋮ (up to 1)
Total           1

Mean: $\mu = \sum x\, f(x)$. Variance: $\sigma^2 = \sum (x - \mu)^2 f(x)$.
[Figure: probability histogram of X; Total Area = 1]
~ The Binomial Distribution ~
Used only when dealing with binary outcomes (two categories: “Success” vs. “Failure”), with a fixed probability of Success (π) in the population.
Calculates the probability of obtaining any given number of Successes in a random sample of n independent “Bernoulli trials.”
Has many applications and generalizations, e.g., multiple categories, variable probability of Success, etc.
POPULATION: 40% Male, 60% Female. RANDOM SAMPLE: n = 100.
For any randomly selected individual, define a binary random variable (1 = Male, 0 = Female).
Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100).
How can we calculate the probability f(x) = P(X = x), for x = 0, 1, 2, 3, …, 100? And the cumulative probability F(x) = P(X ≤ x), for x = 0, 1, 2, 3, …, 100?
Example: f(25) = P(X = 25)?
Solution: Model the sample as a sequence of n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female), where P(H) = 0.4, P(T) = 0.6.
How many possible outcomes of n = 100 tosses exist? Each toss has 2 possible results, so there are $2^{100}$.
How many possible outcomes of n = 100 tosses exist with X = 25 Heads? Label the Heads { H1, H2, H3, …, H25 }.
There are 100 possible open slots for H1 to occupy.
For each one of them, there are 99 possible open slots left for H2 to occupy.
For each one of them, there are 98 possible open slots left for H3 to occupy.
…etc…
For each one of them, there are 77 possible open slots left for H24 to occupy.
For each one of them, there are 76 possible open slots left for H25 to occupy.
Hence, there are 100 × 99 × 98 × … × 77 × 76 possible ordered placements. This value is the number of permutations of 25 among 100 coins, denoted 100P25.
This number, 100 × 99 × 98 × … × 77 × 76, unnecessarily includes the distinct permutations of the 25 Heads among themselves, all of which have Heads in the same positions. For example, swapping H1 and H2 leaves Heads in exactly the same slots; we would not want to count this as a distinct outcome.
How many permutations of the 25 Heads among themselves are there? By the same logic: 25 × 24 × 23 × … × 3 × 2 × 1, i.e., “25 factorial,” denoted 25!.
Dividing out these repeats gives
$$\binom{100}{25} = \frac{100 \times 99 \times 98 \times \cdots \times 77 \times 76}{25 \times 24 \times 23 \times \cdots \times 3 \times 2 \times 1} = \frac{100!}{25!\;75!}$$
This is “100-choose-25,” denoted $\binom{100}{25}$ or 100C25. This value counts the number of combinations of 25 Heads among 100 coins. On your calculator: 100 nCr 25.
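The counting argument can be checked numerically. The following is a minimal sketch in base R; the variable names are illustrative and not from the slides.

```r
# Ordered placements of 25 labelled Heads among 100 slots: 100 * 99 * ... * 76
n_perm <- prod(100:76)

# Each set of 25 positions was counted 25! times, so divide the repeats out
n_comb <- n_perm / factorial(25)

# Built-in binomial coefficient "100 choose 25" for comparison
n_comb_builtin <- choose(100, 25)

print(n_perm)
print(n_comb)
print(n_comb_builtin)
```

Both routes should agree up to floating-point rounding, since R stores these large counts as doubles rather than exact integers.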
What is the probability of each such outcome (a specific sequence with X = 25 Heads)?
Answer: Recall that, per toss, P(Heads) = π = 0.4 and P(Tails) = 1 – π = 0.6. Via independence in binary outcomes between any two coins, a sequence such as 0.4 × 0.6 × 0.6 × 0.4 × 0.6 × … × 0.6 × 0.4 × 0.4 × 0.6 has probability $(0.4)^{25}(0.6)^{75}$.
Therefore, the probability P(X = 25) is equal to
$$P(X = 25) = \binom{100}{25}(0.4)^{25}(0.6)^{75}$$
Question: What if the coin were “fair” (unbiased), i.e., π = 1 – π = 0.5?
Answer: Every one of the $2^{100}$ possible outcomes has the same probability, 0.5 × 0.5 × 0.5 × … × 0.5 = $(0.5)^{100}$, regardless of the number of Heads. This is the “equally likely” scenario! In that case $P(X = 25) = \binom{100}{25}(0.5)^{100} = \binom{100}{25} / 2^{100}$.
Example: POPULATION 40% Male, 60% Female; RANDOM SAMPLE n = 100; X = # Males in sample (0, 1, 2, 3, …, 99, 100). What is the probability P(X = 25)?
Solution: Model the sample as a sequence of n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female). Then
$$P(X = 25) = \binom{100}{25}(0.4)^{25}(0.6)^{75}$$
In general: a random sample of size n is a sequence of n Bernoulli trials (“Success” vs. “Failure”), independent, with constant probability per trial: P(“Success”) = π, P(“Failure”) = 1 – π.
Discrete random variable X = # “Successes” in sample (0, 1, 2, 3, …, n).
Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function”
$$f(x) = \binom{n}{x}\,\pi^{x}(1-\pi)^{n-x}, \qquad x = 0, 1, 2, \ldots, n.$$
The cumulative probabilities are F(x) = P(X ≤ x) = f(0) + f(1) + … + f(x).
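As a rough check of the probability function, the sketch below codes the formula directly and compares it with R's built-in `dbinom` and `pbinom`. The helper name `binom_pmf` is hypothetical, and `p` stands in for π because `pi` is already a built-in constant in R.

```r
# Binomial probability function written out from the formula:
# f(x) = choose(n, x) * p^x * (1 - p)^(n - x)
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

# P(X = 25) for X ~ Bin(100, 0.4): hand-coded formula vs. built-in
p25_formula <- binom_pmf(25, n = 100, p = 0.4)
p25_builtin <- dbinom(25, size = 100, prob = 0.4)

# F(25) = P(X <= 25): built-in cumulative probability vs. summing f(0..25)
cdf25_builtin <- pbinom(25, size = 100, prob = 0.4)
cdf25_summed  <- sum(binom_pmf(0:25, n = 100, p = 0.4))

print(c(p25_formula, p25_builtin))
print(c(cdf25_builtin, cdf25_summed))
```

The hand-coded version is only for illustration; for large n the built-in functions are the safer choice because they avoid overflow in the binomial coefficient.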
Example: Blood Type probabilities, revisited
[Table: Blood Type (O, A, B, AB) by Rh Factor (+, –)]
Suppose n = 10 individuals are to be selected at random from the population. Probability table for X = #(Type O).
Binomial model applies? Check:
1. Independent outcomes? Reasonably assume that outcomes “Type O” vs. “Not Type O” between two individuals are independent of each other.
2. Constant probability π? From the table, π = P(Type O) = .461 throughout the population.
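Example: Blood Type probabilities, revisited
[Table: Blood Type (O, A, B, AB) by Rh Factor (+, –)]
Suppose n = 10 individuals are to be selected at random from the population. Probability table for X = #(Type O). Binomial model applies: X ~ Bin(10, .461), with
$$f(x) = \binom{10}{x}(.461)^{x}(.539)^{10-x}, \qquad x = 0, 1, 2, \ldots, 10.$$

x     f(x)
0     $\binom{10}{0}(.461)^{0}(.539)^{10}$
1     $\binom{10}{1}(.461)^{1}(.539)^{9}$
2     $\binom{10}{2}(.461)^{2}(.539)^{8}$
3     $\binom{10}{3}(.461)^{3}(.539)^{7}$
4     $\binom{10}{4}(.461)^{4}(.539)^{6}$
5     $\binom{10}{5}(.461)^{5}(.539)^{5}$
6     $\binom{10}{6}(.461)^{6}(.539)^{4}$
7     $\binom{10}{7}(.461)^{7}(.539)^{3}$
8     $\binom{10}{8}(.461)^{8}(.539)^{2}$
9     $\binom{10}{9}(.461)^{9}(.539)^{1}$
10    $\binom{10}{10}(.461)^{10}(.539)^{0}$

The cumulative column F(x) is the running sum of f(x). R: dbinom(0:10, 10, .461)
Also, can show mean $\mu = \sum x\, f(x) = n\pi = (10)(.461) = 4.61$ and variance $\sigma^2 = \sum (x - \mu)^2 f(x) = n\pi(1 - \pi) = (10)(.461)(.539) = 2.48$.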
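The probability table and the mean and variance shortcuts can be reproduced along the following lines. This is a sketch that uses only n = 10 and π = .461 from the slide; the variable and column names are illustrative.

```r
n <- 10
p <- 0.461                 # P(Type O), as given on the slide
x <- 0:n

fx <- dbinom(x, n, p)      # f(x) = P(X = x)
Fx <- pbinom(x, n, p)      # F(x) = P(X <= x), the running sum of f(x)

# Probability table for X = #(Type O)
tab <- data.frame(x = x, f_x = round(fx, 4), F_x = round(Fx, 4))
print(tab)

# Mean and variance from their definitions ...
mu     <- sum(x * fx)
sigma2 <- sum((x - mu)^2 * fx)

# ... compared with the shortcuts n*p and n*p*(1 - p)
print(c(mu, n * p))
print(c(sigma2, n * p * (1 - p)))
```

Computing the sums explicitly is a way to confirm that the shortcut formulas and the definitions give the same numbers for this example.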
Example: Blood Type probabilities, revisited
[Table: Blood Type (O, A, B, AB) by Rh Factor (+, –)]
Now suppose n = 1500 individuals (instead of n = 10) are to be selected at random from the population. Probability table for X = #(Type AB–), where π = P(Type AB–) = .007. RARE EVENT!
Binomial model applies: X ~ Bin(1500, .007). Therefore,
$$f(x) = \binom{1500}{x}(.007)^{x}(.993)^{1500-x}, \qquad x = 0, 1, 2, \ldots, 1500.$$
Also, can show mean $\mu = \sum x\, f(x) = n\pi = 10.5$ and variance $\sigma^2 = \sum (x - \mu)^2 f(x) = n\pi(1 - \pi) = 10.43$.
Example: Blood Type probabilities, revisited
n = 1500 individuals are to be selected at random from the population; X = #(Type AB–). Binomial model applies: X ~ Bin(1500, .007), with mean nπ = 10.5 and variance nπ(1 – π) = 10.43. RARE EVENT!
Is there a better alternative? When n is large and π is small, use the Poisson distribution
$$f(x) = \frac{e^{-\mu}\mu^{x}}{x!}, \qquad x = 0, 1, 2, \ldots,$$
where mean and variance are $\mu = n\pi$ and $\sigma^2 = n\pi$.
Here μ = nπ = 10.5, so X ~ Poisson(10.5), with mean = variance = 10.5.
Notation: Sometimes the symbol λ (“lambda”) is used instead of μ (“mu”).
Example: Blood Type probabilities, revisited
Suppose n = 1500 individuals are to be selected at random from the population. Probability table for X = #(Type AB–). RARE EVENT!
Poisson distribution: $f(x) = \dfrac{e^{-\mu}\mu^{x}}{x!}$, x = 0, 1, 2, …, where mean and variance are $\mu = n\pi$ and $\sigma^2 = n\pi$. Here μ = 10.5, so X ~ Poisson(10.5).
Ex: Probability of exactly X = 15 Type(AB–) individuals = ?
Binomial: $P(X = 15) = \binom{1500}{15}(.007)^{15}(.993)^{1485} \approx .0437$
Poisson: $P(X = 15) = \dfrac{e^{-10.5}(10.5)^{15}}{15!} \approx .0438$
The two answers agree to about three decimal places.
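The Binomial and Poisson answers can be compared side by side. The sketch below assumes only n = 1500 and π = .007 from the example; the variable names and the range 0 to 40 used for the overall comparison are illustrative choices.

```r
n  <- 1500
p  <- 0.007               # P(Type AB-), as given on the slide
mu <- n * p               # Poisson mean, n * p = 10.5

# P(X = 15) under each model
p_binom <- dbinom(15, size = n, prob = p)
p_pois  <- dpois(15, lambda = mu)
print(c(binomial = p_binom, poisson = p_pois))

# Largest pointwise gap between the two probability functions over x = 0..40
x <- 0:40
max_gap <- max(abs(dbinom(x, n, p) - dpois(x, mu)))
print(max_gap)
```

The gap stays small here because n is large and π is small; with a larger π the Poisson approximation would be expected to degrade.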
Example: Deaths in Wisconsin
Assuming deaths among young adults are relatively rare, we know the following:
● Average of 584 deaths per year, so λ = 584.
● Mortality rate (α) seems constant.
Therefore, the Poisson distribution can be used as a good model to make future predictions about the random variable X = “# deaths” per year, for this population (15-24 yrs)… assuming current values will still apply.
Probability of exactly X = 600 deaths next year (mean of 584 deaths per yr):
$$P(X = 600) = \frac{e^{-584}\,584^{600}}{600!}$$
R: dpois(600, 584)
Probability of exactly X = 1200 deaths in the next two years (mean of 1168 deaths per two yrs, so let λ = 1168):
$$P(X = 1200) = \frac{e^{-1168}\,1168^{1200}}{1200!}$$
Probability of at least one death per day: λ = 584/365 = 1.6 deaths/day.
P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + … True, but not practical. Instead,
$$P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{e^{-1.6}(1.6)^{0}}{0!} = 1 - e^{-1.6} = 0.798$$
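The three Poisson calculations above can be sketched in R as follows. Only the yearly mean of 584 comes from the slide; the variable names are illustrative, and a 365-day year is assumed for the daily rate.

```r
lambda_year <- 584                      # mean deaths per year, from the slide

# Exactly 600 deaths next year
p600 <- dpois(600, lambda_year)

# Exactly 1200 deaths in the next two years: the mean doubles to 1168
p1200 <- dpois(1200, 2 * lambda_year)

# At least one death on a given day: convert to a daily rate first
lambda_day <- lambda_year / 365         # 1.6 deaths/day
p_at_least_one <- 1 - dpois(0, lambda_day)

# Same quantity via the upper tail of the cumulative distribution
p_upper_tail <- ppois(0, lambda_day, lower.tail = FALSE)

print(c(p600, p1200, p_at_least_one, p_upper_tail))
```

Evaluating the formula for P(X = 600) literally would overflow (584^600 and 600! are both enormous), which is one reason to rely on `dpois` rather than coding the expression by hand.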
● Binomial ~ X = # Successes in n trials, P(Success) = π
● Poisson ~ As above, but n large, π small, i.e., Success RARE
● Negative Binomial ~ X = # trials for k Successes, P(Success) = π
● Geometric ~ As above, but specialized to k = 1
● Hypergeometric ~ As Binomial, but π changes between trials
● Multinomial ~ As Binomial, but for multiple categories, with π1 + π2 + … + π_last = 1 and x1 + x2 + … + x_last = n
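For the distributions in this summary that were not worked through above, base R offers `dnbinom`, `dgeom`, `dhyper`, and `dmultinom`. The sketch below uses hypothetical numbers chosen only for illustration (none come from the slides). One caveat: `dnbinom` and `dgeom` count Failures before the k-th Success rather than the number of trials, so the trial count is shifted by k.

```r
p <- 0.4   # hypothetical P(Success), for illustration only

# Negative Binomial: P(the k = 3rd Success occurs on trial 10).
# dnbinom() takes the number of Failures, i.e. trials - k = 7.
p_negbin <- dnbinom(10 - 3, size = 3, prob = p)

# Geometric (k = 1): P(the first Success occurs on trial 5), i.e. 4 Failures first.
p_geom <- dgeom(5 - 1, prob = p)

# Hypergeometric: sample 10 WITHOUT replacement from a hypothetical finite
# population of 40 Males and 60 Females; P(exactly 4 Males in the sample).
p_hyper <- dhyper(4, m = 40, n = 60, k = 10)

# Multinomial: three categories with probabilities summing to 1,
# and counts (2, 1, 1) summing to n = 4.
p_multi <- dmultinom(c(2, 1, 1), prob = c(0.5, 0.3, 0.2))

print(c(p_negbin, p_geom, p_hyper, p_multi))
```

In the hypergeometric case the chance of drawing a Male changes after every draw, which is the sense in which π “changes between trials.”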