Statistics: Random Variables and Binomial Distributions
McClave, Statistics, 11th ed. Chapter 4: Discrete Random Variables
Where We’re Going
  Develop the notion of a random variable
  Numerical data and discrete random variables
  Discrete random variables and their probabilities
Random Variable
A random variable x takes on a defined set of values with different probabilities.
  For example, if you roll a die, the outcome is random (not fixed) and there are 6 possible outcomes, each of which occurs with probability one-sixth.
  For example, if you poll people about their voting preferences, the percentage of the sample that responds “Yes on Proposition 100” is also a random variable (the percentage will be slightly different every time you poll).
Roughly, probability is how frequently we expect different outcomes to occur if we repeat the experiment over and over.
Two Types of Random Variables
A discrete random variable can assume a countable number of values.
  Example: number of steps to the top of the Eiffel Tower*
A continuous random variable can assume any value along a given interval of a number line.
  Example: the time a tourist stays at the top once s/he gets there
* McClave, Statistics, 11th ed. Chapter 4: Discrete Random Variables
Two Types of Random Variables
Discrete random variables: number of sales, number of calls, shares of stock, people in line, mistakes per page
Continuous random variables: length, depth, volume, time, weight
Probability functions
A probability function maps the possible values of x against their respective probabilities of occurrence, p(x).
  p(x) is a number from 0 to 1.0.
  The area under a probability function is always 1 (the probabilities all add up to 1).
It turns out that if you were to go out and sample many, many times, most sample statistics that you could calculate would follow a normal distribution.
  What are the 2 parameters (from last time) that define any normal distribution? A normal curve is characterized by two parameters: a mean and a variability (SD).
  What do you think the mean value of a sample statistic would be? The standard deviation?
  Remember that the standard deviation is the natural variability of the population. The standard error is the standard deviation of a sample statistic: it can be the standard error of the mean, of the odds ratio, of the difference of 2 means, etc.
Discrete example: roll of a die
[Figure: bar chart of p(x) for x = 1, 2, ..., 6; every bar has height 1/6.]
Probability mass function (pmf)
x      p(x)
1      p(x=1) = 1/6
2      p(x=2) = 1/6
3      p(x=3) = 1/6
4      p(x=4) = 1/6
5      p(x=5) = 1/6
6      p(x=6) = 1/6
Total  1.0
Cumulative distribution function (CDF)
[Figure: step plot of P(x) for x = 1, 2, ..., 6, rising through 1/6, 1/3, 1/2, 2/3, 5/6 and 1.0.]
Cumulative distribution function
x      P(x≤A)
1      P(x≤1) = 1/6
2      P(x≤2) = 2/6
3      P(x≤3) = 3/6
4      P(x≤4) = 4/6
5      P(x≤5) = 5/6
6      P(x≤6) = 6/6
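The link between the pmf and the CDF of the die can be sketched in a few lines of Python. This is only an illustration, not part of the original slides; the names faces, pmf and cdf are made up for the example. Each CDF value is a running total of the pmf values up to and including x.

```python
from fractions import Fraction
from itertools import accumulate

# Fair six-sided die: every face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
pmf = {x: Fraction(1, 6) for x in faces}

# The CDF at x is the running total of p over all faces up to and including x.
cdf = dict(zip(faces, accumulate(pmf[x] for x in faces)))

for x in faces:
    print(x, pmf[x], cdf[x])
```

Exact fractions are used so the totals come out as 1/6, 1/3, 1/2 and so on, without floating-point rounding.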
Practice Problem
The number of patients seen in the ER in any given hour is a random variable represented by x. The probability distribution for x is:
x      10   11   12   13   14
P(x)   .4   .2   .2   .1   .1
Find the probability that in a given hour:
  a. exactly 14 patients arrive
  b. at least 12 patients arrive
  c. at most 11 patients arrive
Solution:
  a. p(x=14) = .1
  b. p(x≥12) = (.2 + .1 + .1) = .4
  c. p(x≤11) = (.4 + .2) = .6
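The three answers can be checked with a short Python sketch. The probabilities for x = 10 through 14 (.4, .2, .2, .1, .1) are taken from the sums in the worked solution, which is an assumption about the original table; the helper name prob is hypothetical.

```python
from fractions import Fraction

# Probabilities in tenths, as implied by the worked answers (assumption).
tenths = {10: 4, 11: 2, 12: 2, 13: 1, 14: 1}
pmf = {x: Fraction(t, 10) for x, t in tenths.items()}

def prob(event):
    """Sum p(x) over every x for which event(x) is true."""
    return sum((p for x, p in pmf.items() if event(x)), Fraction(0))

exactly_14 = prob(lambda x: x == 14)
at_least_12 = prob(lambda x: x >= 12)
at_most_11 = prob(lambda x: x <= 11)

print(float(exactly_14), float(at_least_12), float(at_most_11))  # 0.1 0.4 0.6
```

Note that "at least 12" and "at most 11" are complementary events, so their probabilities add to 1.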
Review Question 1
If you toss a die, what’s the probability that you roll a 3 or less?
  a. 1/6
  b. 1/3
  c. 1/2
  d. 5/6
  e. 1.0
Review Question 1 (answer)
Answer: 1/2. Three of the six equally likely faces (1, 2 and 3) qualify, so P(x≤3) = 3/6 = 1/2.
Review Question 2
Two dice are rolled and the sum of the face values is six. What is the probability that at least one of the dice came up a 3?
  a. 1/5
  b. 2/3
  c. 1/2
  d. 5/6
  e. 1.0
Review Question 2 (answer)
How can you get a 6 on two dice? 1-5, 5-1, 2-4, 4-2, 3-3. These five outcomes are equally likely, and only one of them (3-3) has a 3.
Answer: 1/5
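The counting argument can be double-checked by brute force. The sketch below is an illustration only (variable names are made up): it lists all 36 ordered outcomes of two dice, keeps those whose sum is six, and counts how many of those contain a 3.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two dice.
rolls = list(product(range(1, 7), repeat=2))

# Condition on the event "the sum is six".
sum_is_six = [r for r in rolls if sum(r) == 6]

# Within that event, keep the outcomes that show at least one 3.
with_a_three = [r for r in sum_is_six if 3 in r]

answer = Fraction(len(with_a_three), len(sum_is_six))
print(sum_is_six)
print(answer)  # 1/5
```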
Probability Distributions for Discrete Random Variables
The probability distribution of a discrete random variable is a graph, table or formula that specifies the probability associated with each possible outcome the random variable can assume.
  p(x) ≥ 0 for all values of x
  Σ p(x) = 1, where the sum runs over all possible values of x
Probability Distributions for Discrete Random Variables
Say a random variable x follows this pattern: p(x) = (.3)(.7)^(x-1) for x = 1, 2, 3, ...
This table gives the probabilities (rounded to two digits) for x between 1 and 10.
x      P(x)
1      .30
2      .21
3      .15
4      .10
5      .07
6      .05
7      .04
8      .02
9      .02
10     .01
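The table can be regenerated directly from the formula. The following Python sketch is an illustration only (the function name p and the variable names are made up); it prints four decimals so the effect of rounding to two digits is visible.

```python
def p(x):
    """p(x) = (.3)(.7)^(x - 1) for x = 1, 2, 3, ..."""
    return 0.3 * 0.7 ** (x - 1)

# Probabilities for x = 1, ..., 10.
table = {x: p(x) for x in range(1, 11)}
for x, px in table.items():
    print(f"{x:2d}  {px:.4f}")

# The first ten values do not add to 1: the remaining probability
# belongs to x = 11, 12, ... (the sum of the first ten is 1 - .7^10).
total_first_ten = sum(table.values())
print(f"sum for x = 1..10: {total_first_ten:.4f}")
```

The probabilities only add to exactly 1 when every x = 1, 2, 3, ... is included.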
Important discrete probability distribution: The binomial
Binomial Probability Distribution
FOUR ESSENTIAL REQUIREMENTS:
1. A fixed number of observations (trials), n
   e.g., 15 tosses of a coin; 20 patients; 1000 people surveyed
2. A binary outcome
   e.g., head or tail in each toss of a coin; disease or no disease
   Generally called “success” and “failure”
   Probability of success is p, probability of failure is 1 – p
3. Constant probability for each observation
   e.g., probability of getting a tail is the same each time we toss the coin
4. The trials are independent (the outcome of one trial does not affect the other trials)
EXAMPLE: Random Experiments (Binomial or Not?)
Let’s consider a few random experiments. In each of them, we’ll decide whether the random variable is binomial. If it is, we’ll determine the values for n and p. If it isn’t, we’ll explain why not.
Example A: A fair coin is flipped 20 times; X represents the number of heads.
  X is binomial with n = 20 and p = 0.5.
Example B: You roll a fair die 50 times; X is the number of times you get a six.
  X is binomial with n = 50 and p = 1/6.
Example C: Roll a fair die repeatedly; X is the number of rolls it takes to get a six.
  X is not binomial, because the number of trials is not fixed.
Example D: Draw 3 cards at random, one after the other, without replacement, from a set of 4 cards consisting of one club, one diamond, one heart, and one spade; X is the number of diamonds selected.
  X is not binomial, because the selections are not independent. (The probability p of success is not constant, because it is affected by previous selections.)
The Binomial Distribution
A Binomial Random Variable
  n identical trials
  Two outcomes: Success or Failure
  P(S) = p; P(F) = q = 1 – p
  Trials are independent
  x is the number of Successes in n trials
The Binomial Distribution
A Binomial Random Variable            Example: flip a coin 3 times
n identical trials                    3 flips of the same coin
Two outcomes: Success or Failure      Outcomes are Heads or Tails
P(S) = p; P(F) = q = 1 – p            P(H) = .5; P(T) = 1 – .5 = .5
Trials are independent                A head on flip i doesn’t change P(H) of flip i + 1
x is the number of S’s in n trials    x is the number of heads in 3 flips
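For a case this small, the distribution of x can be found by listing every outcome. The sketch below is an illustration only (names are made up): it enumerates the 8 equally likely sequences of three fair coin flips and counts the heads in each.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 = 8 equally likely sequences, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Count how many sequences give each number of heads.
heads_count = Counter(seq.count("H") for seq in outcomes)

# Divide by the number of sequences to get p(x) for x = 0, 1, 2, 3.
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(heads_count.items())}

for x, px in pmf.items():
    print(x, px)
```

Enumeration works here only because a fair coin makes every sequence equally likely; with p different from .5 the sequences would have to be weighted.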
The Binomial Distribution
The Binomial Probability Distribution
  p = probability of success on a single trial
  q = 1 – p (i.e., probability of failure)
  n = fixed number of trials
  x = specified number of successes
FORMULA: p(x) = C(n, x) p^x q^(n-x) for x = 0, 1, ..., n, where C(n, x) = n! / (x! (n-x)!) is the number of ways to choose which x of the n trials are successes.
The Binomial Distribution
Say 40% of the class is female. What is the probability that 6 of the first 10 students walking in will be female?
Treating each student as an independent trial: n = 10, p = .4, x = 6.
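The slide poses the question without a worked answer. A minimal Python sketch follows, assuming the standard binomial formula p(x) = C(n, x) p^x q^(n-x) and treating the 10 students as independent trials with p = .4. The function name binomial_pmf is made up for the example; math.comb is the standard-library binomial coefficient (Python 3.8+).

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# n = 10 students, p = 0.4 chance each is female, x = 6 females.
prob_six_female = binomial_pmf(6, 10, 0.4)
print(round(prob_six_female, 4))  # 0.1115
```

By hand: C(10, 6) = 210, .4^6 = .004096 and .6^4 = .1296, so p(6) = 210 × .004096 × .1296 ≈ .1115.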
Combinations Formula: C(n, r) = n! / (r! (n-r)!)
The formula shows us the number of ways (combinations) a sample of “r” (or “x” or “k”) elements can be obtained from a larger set of “n” distinguishable objects where order does not matter and repetitions are not allowed. Also referred to as r-combination or "n choose r" or the binomial coefficient. In some resources the notation uses k instead of r, so you may see these referred to as k-combination or "n choose k."
Binomial coefficient
Okay, so how do we compute binomial coefficients? As you might have guessed, there is a formula:
  C(n, k) = n! / (k! (n - k)!) = [n (n - 1) ... (n - k + 1)] / [k (k - 1) ... 1]
The exclamation points are actually part of the formula (and they don't mean the numbers are excited). The notation n! is called the factorial of n, and it means to multiply n times (n - 1) times (n - 2), times every whole number down to 1. For example, 5! = 5 * 4 * 3 * 2 * 1 = 120. Just be careful of one special case: 0! = 1, by definition.
On the other hand, most people will end up using the second form of the formula, in which the multiplications are written out more explicitly (and some cancellation has already been done for you).
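Both ways of computing a binomial coefficient can be sketched in Python. This is an illustration only: the function names choose_factorial and choose_cancelled are made up, while math.factorial and math.comb are standard-library functions (math.comb needs Python 3.8+).

```python
from math import comb, factorial

def choose_factorial(n, k):
    """First form: n! / (k! (n - k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def choose_cancelled(n, k):
    """Second form: k numerator factors over k!, taken one factor at a time."""
    result = 1
    for i in range(1, k + 1):
        # After step i, result equals C(n - k + i, i), so the division is exact.
        result = result * (n - k + i) // i
    return result

print(factorial(5), factorial(0))   # 120 1
print(choose_factorial(10, 6))      # 210
print(choose_cancelled(10, 6))      # 210
print(comb(10, 6))                  # 210
```

The second form avoids building very large factorials, which is why it is the one usually used for hand calculation.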
Evaluating Binomial Coefficients
Example: Evaluate (a) and (b).
[The two coefficients and their worked solutions were shown as images and are not recoverable here.]
GIFT!!!!! Just use this calculator: https://www.danielsoper.com/statcalc/calculator.aspx?id=72
Let’s do another example using the formula: red lights on the way to school
And if it’s too complicated (or boring), let’s use the table!!