Chapter 3 Discrete Random Variables and Probability Distributions


3.2 - Probability Distributions for Discrete Random Variables
3.3 - Expected Values
3.4 - The Binomial Probability Distribution
3.5 - Hypergeometric and Negative Binomial Distributions
3.6 - The Poisson Probability Distribution

Recall…

POPULATION
Discrete random variable X
Examples: shoe size, dosage (mg), # cells, …

Pop values x    Probabilities p(x)    Cumul. probs F(x)
x1              p(x1)                 p(x1)
x2              p(x2)                 p(x1) + p(x2)
x3              p(x3)                 p(x1) + p(x2) + p(x3)
⋮               ⋮                     ⋮
Total           1

Total area = 1
Mean: $\mu = \sum x \, p(x)$
Variance: $\sigma^2 = \sum (x - \mu)^2 \, p(x)$
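The three columns of the recap table, along with the mean and variance, can be computed directly from a list of values and probabilities. The sketch below is in Python (standard library only) rather than the slides' R, and the fair six-sided die is a hypothetical example chosen only for illustration.

```python
from itertools import accumulate

# Hypothetical population values and probabilities (a fair six-sided die),
# chosen only for illustration; they are not from the slides.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Cumulative probabilities F(x): running totals of p(x)
cumulative = list(accumulate(probs))

# Mean: sum of x * p(x); variance: sum of (x - mean)^2 * p(x)
mean = sum(x * p for x, p in zip(values, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

print(cumulative)
print(mean, variance)
```

The last entry of the cumulative column is 1 (up to rounding), matching the "Total" row of the table.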

~ The Binomial Distribution ~

Used only when dealing with binary outcomes (two categories: “Success” vs. “Failure”), with a fixed probability of Success (π) in the population.
Calculates the probability of obtaining any given number of Successes in a random sample of n independent “Bernoulli trials.”
Has many applications and generalizations, e.g., multiple categories, variable probability of Success, etc.

POPULATION: 40% Male, 60% Female
For any randomly selected individual, define a binary random variable (1 = Male, 0 = Female).

RANDOM SAMPLE: n = 100
Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100)

x      p(x)     F(x)
x1     p(x1)    F(x1)
x2     p(x2)    F(x2)
x3     p(x3)    F(x3)
⋮      ⋮        ⋮
       1        1

How can we calculate the probabilities p(x) = P(X = x), i.e., P(X = 0), P(X = 1), P(X = 2), …, P(X = 99), P(X = 100)?
How can we calculate F(x) = P(X ≤ x), for x = 0, 1, 2, 3, …, 100?

Example: How can we calculate the probability p(25) = P(X = 25)?

Solution: Model the sample as a sequence of independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female), where P(H) = 0.4, P(T) = 0.6.

How many possible outcomes of n = 100 tosses exist with X = 25 Heads?

Slots: 1 2 3 4 5 . . . . . . 97 98 99 100
X = 25 Heads: { H1, H2, H3, …, H25 }

There are 100 possible open slots for H1 to occupy.
For each one of them, there are 99 possible open slots left for H2 to occupy.
For each one of them, there are 98 possible open slots left for H3 to occupy.
…etc…
For each one of them, there are 77 possible open slots left for H24 to occupy.
For each one of them, there are 76 possible open slots left for H25 to occupy.

Hence, there are 100 × 99 × 98 × … × 77 × 76 possible outcomes.
This value is the number of permutations of 25 among 100, denoted 100P25.

HOWEVER… the number 100 × 99 × 98 × … × 77 × 76 (permutations of 25 among 100) unnecessarily includes the distinct permutations of the 25 Heads among themselves, all of which have Heads in the same positions.
For example, exchanging the slots occupied by H1 and H2 leaves Heads in exactly the same 25 positions. We would not want to count this as a distinct outcome.

How many is that? By the same logic, there are 25 × 24 × 23 × … × 3 × 2 × 1 of them: “25 factorial,” denoted 25!.

Dividing out these repeats gives

$$\frac{100 \times 99 \times 98 \times \cdots \times 77 \times 76}{25 \times 24 \times 23 \times \cdots \times 3 \times 2 \times 1} = \frac{100!}{25!\,75!} = \binom{100}{25}$$

“100-choose-25,” denoted $\binom{100}{25}$ or 100C25.
R: choose(100, 25)
Calculator: 100 nCr 25

This value counts the number of combinations of 25 Heads among 100 coins.
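The counting argument can be checked numerically. The sketch below is in Python (standard library only; `math.perm` and `math.comb` require Python 3.8 or later) rather than the slides' R, and the variable names are illustrative.

```python
import math

n, k = 100, 25

# Ordered placements of H1..H25 into 100 slots: 100 x 99 x ... x 76
ordered = math.perm(n, k)

# Reorderings of the 25 Heads among themselves: 25!
repeats = math.factorial(k)

# Distinct sets of Head positions: 100! / (25! 75!)
combinations = ordered // repeats

# Same value as the built-in binomial coefficient (R: choose(100, 25))
print(combinations == math.comb(n, k))
print(combinations)
```

Integer division is exact here because every set of 25 positions is counted exactly 25! times among the ordered placements.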

How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
Answer: $\binom{100}{25}$

What is the probability of each such outcome?
Recall that, per toss, P(Heads) = π = 0.4 and P(Tails) = 1 – π = 0.6.
Answer: Via independence in binary outcomes between any two coins,
0.4 × 0.6 × 0.6 × 0.4 × 0.6 × … × 0.6 × 0.4 × 0.4 × 0.6 = $(0.4)^{25} (0.6)^{75}$.

Therefore, the probability P(X = 25) is equal to

$$P(X = 25) = \binom{100}{25} (0.4)^{25} (0.6)^{75}$$

R: dbinom(25, 100, .4)
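The two answers (the number of outcomes and the probability of each) multiply to give P(X = 25). A minimal Python sketch, using only the standard library instead of the slides' R call `dbinom(25, 100, .4)`; the function name `binomial_pmf` is an illustrative choice.

```python
import math

def binomial_pmf(x, n, pi):
    """P(X = x) for X ~ Bin(n, pi): (# outcomes) * (probability of each outcome)."""
    return math.comb(n, x) * pi**x * (1 - pi)**(n - x)

# 25 Males in a sample of 100 when 40% of the population is Male
p25 = binomial_pmf(25, 100, 0.4)
print(p25)

# The probabilities over all possible counts 0..100 should add up to 1
total = sum(binomial_pmf(x, 100, 0.4) for x in range(101))
print(total)
```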

Question: What if the coin were “fair” (unbiased), i.e., π = 1 – π = 0.5?

How many possible outcomes of n = 100 tosses exist with X = 25 Heads?
Answer: $\binom{100}{25}$

What is the probability of each such outcome?
Answer: Via independence in binary outcomes between any two coins,
0.5 × 0.5 × 0.5 × 0.5 × 0.5 × … × 0.5 × 0.5 × 0.5 × 0.5 = $(0.5)^{100}$.
This is the “equally likely” scenario!

Therefore, the probability P(X = 25) is equal to

$$P(X = 25) = \binom{100}{25} (0.5)^{100}$$

POPULATION: 40% Male, 60% Female
For any randomly selected individual, define a binary random variable: “Success” (probability π) vs. “Failure” (probability 1 – π).

RANDOM SAMPLE: size n
Discrete random variable X = # “Successes” in sample (0, 1, 2, 3, …, n)

Example: With n = 100 and X = # Males in sample, what is the probability P(X = 25)?
Solution: Model the sample as a sequence of n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female).

In general: n Bernoulli trials with P(“Success”) = π, P(“Failure”) = 1 – π, independent, with constant probability (π) per trial.

Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability mass function”

$$p(x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x}, \quad x = 0, 1, 2, \ldots, n.$$

Example: Blood Type probabilities, revisited

                Rh Factor
Blood Type      +       –       Total
O               .384    .077    .461
A               .323    .065    .388
B               .094    .017    .111
AB              .032    .007    .039
Total           .833    .166    .999

Suppose n = 10 individuals are to be selected at random from the population.
Probability table for X = #(Type O)

Binomial model applies? Check:
1. Independent outcomes? Reasonably assume that outcomes “Type O” vs. “Not Type O” between two individuals are independent of each other. ✓
2. Constant probability π? From table, π = P(Type O) = .461 throughout population. ✓

Example: Blood Type probabilities, revisited

Suppose n = 10 individuals are to be selected at random from the population.
Probability table for X = #(Type O)
Binomial model applies. X ~ Bin(10, .461)

$$p(x) = \binom{10}{x} (.461)^x (.539)^{10 - x}, \quad x = 0, 1, 2, \ldots, 10$$

R: dbinom(0:10, 10, .461)

x     p(x)       F(x)
0     0.00207    0.00207
1     0.01770    0.01977
2     0.06813    0.08790
3     0.15538    0.24328
4     0.23257    0.47585
5     0.23870    0.71455
6     0.17013    0.88468
7     0.08315    0.96783
8     0.02667    0.99450
9     0.00507    0.99957
10    0.00043    1.00000
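The probability table for X ~ Bin(10, .461) can be regenerated from the probability mass function. This is a sketch in Python (standard library only) in place of the slides' `dbinom(0:10, 10, .461)`; the helper name `binomial_pmf` is illustrative.

```python
import math

n, pi = 10, 0.461  # sample size and P(Type O) from the blood type table

def binomial_pmf(x, n, pi):
    """P(X = x) for X ~ Bin(n, pi)."""
    return math.comb(n, x) * pi**x * (1 - pi)**(n - x)

pmf = [binomial_pmf(x, n, pi) for x in range(n + 1)]

# Running totals give the cumulative column F(x) = P(X <= x)
cdf = []
running = 0.0
for p in pmf:
    running += p
    cdf.append(running)

for x in range(n + 1):
    print(f"{x:2d}  {pmf[x]:.5f}  {cdf[x]:.5f}")
```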


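n = 10
p = .461
pmf = function(x) dbinom(x, n, p)   # Bin(10, .461) probability mass function
N = 100000                          # size of the artificial data set
x = 0:10
# Repeat each value of x in proportion to its probability
bin.dat = rep(x, round(N * pmf(x)))
# Density histogram with one unit-width bar centered on each integer
hist(bin.dat, freq = FALSE, breaks = c(-.5, x + .5), col = "green")
axis(1, at = x)
axis(2)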

Example: Blood Type probabilities, revisited

For X ~ Bin(10, .461), can also show:

mean $\mu = \sum x \, p(x) = n\pi = (10)(.461) = 4.61$
variance $\sigma^2 = \sum (x - \mu)^2 \, p(x) = n\pi(1 - \pi) = 2.48$
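The shortcut formulas nπ and nπ(1 – π) can be compared against the defining sums over the probability table. A small Python sketch (standard library only, not the slides' R), with illustrative variable names:

```python
import math

n, pi = 10, 0.461
pmf = [math.comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)]

# Defining sums over the probability table
mean = sum(x * p for x, p in enumerate(pmf))
variance = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))

# Compare with the shortcut formulas n*pi and n*pi*(1 - pi)
print(mean, n * pi)
print(variance, n * pi * (1 - pi))
```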


Example: Blood Type probabilities, revisited

Suppose n = 1500 individuals are to be selected at random from the population.
Probability table for X = #(Type AB–)
From the table, π = P(Type AB–) = .007. RARE EVENT!

Binomial model applies. X ~ Bin(1500, .007)

Therefore,

$$p(x) = \binom{1500}{x} (.007)^x (.993)^{1500 - x}, \quad x = 0, 1, 2, \ldots, 1500.$$

Also, can show
mean $\mu = \sum x \, p(x) = n\pi = 10.5$
variance $\sigma^2 = \sum (x - \mu)^2 \, p(x) = n\pi(1 - \pi) = 10.43$

Example: Blood Type probabilities, revisited

Therefore, $p(x) = \binom{1500}{x} (.007)^x (.993)^{1500 - x}$, x = 0, 1, 2, …, 1500. RARE EVENT!
[Figure: plot of p(x), with a long positive skew as x → 1500, but contribution ≈ 0]

Is there a better alternative?


Example: Blood Type probabilities, revisited

Suppose n = 1500 individuals are to be selected at random from the population.
Probability table for X = #(Type AB–)
Binomial model applies. X ~ Bin(1500, .007), with

$$p(x) = \binom{1500}{x} (.007)^x (.993)^{1500 - x}, \quad x = 0, 1, 2, \ldots, 1500,$$

mean nπ = 10.5 and variance nπ(1 – π) = 10.43. RARE EVENT!

Is there a better alternative? Poisson distribution:

$$p(x) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, \ldots,$$

where mean and variance are μ = nπ and σ² = nπ.

Here μ = nπ = 10.5, so approximately X ~ Poisson(10.5).

Notation: Sometimes the symbol λ (“lambda”) is used instead of μ (“mu”).

Example: Blood Type probabilities, revisited

Ex: Probability of exactly X = 15 Type(AB–) individuals = ?

Poisson: $P(X = 15) = \frac{e^{-10.5} (10.5)^{15}}{15!}$
Binomial: $P(X = 15) = \binom{1500}{15} (.007)^{15} (.993)^{1485}$
(both ≈ .0437)
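The agreement between the two models can be checked by evaluating both probability mass functions side by side. The sketch below uses Python's standard library rather than the slides' R; the function names and the choice to compare counts 0 through 40 are illustrative assumptions.

```python
import math

n, pi = 1500, 0.007
mu = n * pi  # Poisson mean, 10.5

def binomial_pmf(x, n, pi):
    """P(X = x) for X ~ Bin(n, pi)."""
    return math.comb(n, x) * pi**x * (1 - pi)**(n - x)

def poisson_pmf(x, mu):
    """P(X = x) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)

p_binom = binomial_pmf(15, n, pi)
p_pois = poisson_pmf(15, mu)
print(p_binom, p_pois)

# Largest gap between the two pmfs over a range of plausible counts
max_gap = max(abs(binomial_pmf(x, n, pi) - poisson_pmf(x, mu)) for x in range(41))
print(max_gap)
```

With n large and π small, the two sets of probabilities differ only slightly, which is why the simpler Poisson formula is attractive for rare events.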

Example: Deaths in Wisconsin

Assuming deaths among young adults are relatively rare, we know the following:
Average 584 deaths per year, so λ = 584.
Mortality rate (α) seems constant.

Therefore, the Poisson distribution can be used as a good model to make future predictions about the random variable X = “# deaths” per year, for this population (15-24 yrs)… assuming current values will still apply.

Probability of exactly X = 600 deaths next year:
$P(X = 600) = \frac{e^{-584} (584)^{600}}{600!} = 0.0131$
R: dpois(600, 584)

Probability of exactly X = 1200 deaths in the next two years:
Mean of 584 deaths per yr ⇒ mean of 1168 deaths per two yrs, so let λ = 1168:
$P(X = 1200) = \frac{e^{-1168} (1168)^{1200}}{1200!} = 0.00746$

Probability of at least one death per day:
λ = 584/365 = 1.6 deaths/day
P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + … True, but not practical.
$P(X \geq 1) = 1 - P(X = 0) = 1 - \frac{e^{-1.6} (1.6)^0}{0!} = 1 - e^{-1.6} = 0.798$
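These three Poisson probabilities can be reproduced in a few lines. The sketch below is in Python (standard library only) rather than the slides' `dpois`. Working on the log scale is an implementation choice made here, since terms such as e^(–1168) and 1200! are too small or too large to form directly in floating point.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), computed on the log scale to avoid overflow."""
    # log p(x) = -lam + x*log(lam) - log(x!), with log(x!) = lgamma(x + 1)
    log_p = -lam + x * math.log(lam) - math.lgamma(x + 1)
    return math.exp(log_p)

p_600 = poisson_pmf(600, 584)         # exactly 600 deaths next year
p_1200 = poisson_pmf(1200, 2 * 584)   # exactly 1200 deaths in the next two years

lam_day = 584 / 365                   # 1.6 deaths per day
p_at_least_one = 1 - poisson_pmf(0, lam_day)   # complement of "no deaths"

print(p_600, p_1200, p_at_least_one)
```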

Classical Discrete Probability Distributions

Binomial ~ X = # Successes in n trials, P(Success) = π
Poisson ~ As above, but n large, π small, i.e., Success RARE
Negative Binomial ~ X = # trials for k Successes, P(Success) = π
Geometric ~ As above, but specialized to k = 1
Hypergeometric ~ As Binomial, but π changes between trials
Multinomial ~ As Binomial, but for multiple categories, with π1 + π2 + … + πlast = 1 and x1 + x2 + … + xlast = n
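As one illustration of how the entries in this list relate, the Negative Binomial and Geometric rows can be sketched together. The slides do not state these formulas; the code assumes the standard textbook form for X = # trials needed for k Successes, P(X = x) = C(x – 1, k – 1) π^k (1 – π)^(x – k). It is written in Python (standard library only), and the function names are illustrative.

```python
import math

def negative_binomial_pmf(x, k, pi):
    """P(X = x), where X = # trials needed to reach k Successes (x = k, k+1, ...)."""
    # The x-th trial must be the k-th Success; the other k-1 Successes
    # can fall anywhere among the first x-1 trials.
    return math.comb(x - 1, k - 1) * pi**k * (1 - pi)**(x - k)

def geometric_pmf(x, pi):
    """P(X = x), where X = # trials needed for the first Success (k = 1)."""
    return negative_binomial_pmf(x, 1, pi)

pi = 0.461  # P(Type O) from the blood type table
print(geometric_pmf(1, pi))             # first person sampled is Type O
print(negative_binomial_pmf(5, 3, pi))  # third Type O found on the fifth person
```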