Section 5.1 Review and Preview
Review and Preview This chapter combines the methods of descriptive statistics presented in Chapters 2 and 3 and those of probability presented in Chapter 4 to describe and analyze probability distributions. Probability distributions describe what will probably happen instead of what actually did happen, and they are often given in the format of a graph, table, or formula. Emphasize the combination of the methods in Chapter 3 (descriptive statistics) with the methods in Chapter 4 (probability). In Chapter 3, one would conduct an actual experiment, find the mean, standard deviation, variance, etc., and construct a frequency table or histogram. Chapter 4 finds the probability of each possible outcome. Chapter 5 presents the possible outcomes along with the relative frequencies we expect. Page 200 of Elementary Statistics, 10th Edition
Preview In order to fully understand probability distributions, we must first understand the concept of a random variable and be able to distinguish between discrete and continuous random variables. In this chapter we focus on discrete probability distributions. In particular, we discuss binomial probability distributions.
Combining Descriptive Methods and Probabilities In this chapter we will construct probability distributions by presenting possible outcomes along with the relative frequencies we expect.
Section 5.2 Random Variables
Learning Targets This section introduces the important concept of a probability distribution, which gives the probability for each value of a variable that is determined by chance. Give consideration to distinguishing between outcomes that are likely to occur by chance and outcomes that are “unusual” in the sense they are not likely to occur by chance.
Learning Targets: Understand the concept of random variables and how they relate to probability distributions. Distinguish between discrete random variables and continuous random variables. Develop formulas for finding the mean, variance, and standard deviation for a probability distribution. Determine whether outcomes are likely to occur by chance or are unusual (in the sense that they are not likely to occur by chance).
Why Is This Important??? There is a 0.9986 probability that a randomly selected 30-year-old male lives through the year (based on data from the U.S. Department of Health and Human Services). A Fidelity life insurance company charges $161 for insuring that the male will live through the year. If the male does not survive the year, the policy pays out $100,000 as a death benefit. Is life insurance worth the cost?
For Example … The Secret Garden: A Probability Distribution
Mr. Llorens has a strange fascination with pea pods. Every summer, Mr. Llorens grows peas in his garden and keeps track of how many green pea pods and yellow pea pods each plant produces. Here is a sample of some of the results in his creeper journal.
x (Number of peas with green pods)    P(x)
0    0.01
1    0.015
2    0.088
3    0.264
4    0.396
5    0.228
Random Variable Probability Distribution Random variable: a variable (typically represented by x) that has a single numerical value, determined by chance, for each outcome of a procedure. EX: the number of peas with green pods among 5 offspring peas Probability distribution: a description that gives the probability for each value of the random variable; often expressed in the format of a graph, table, or formula. page 201 of Elementary Statistics, 10th Edition
Requirements for Probability Distribution 1. The sum of all probabilities is 1: ΣP(x) = 1, where x assumes all possible values (values such as 0.999 or 1.001 are acceptable because they result from rounding errors). 2. Each individual probability is a value between 0 and 1 inclusive: 0 ≤ P(x) ≤ 1 for every individual value of x. Page 203 of Elementary Statistics, 10th Edition
Characteristics of a Probability Distribution The sum of all probabilities must be 1, but values such as 0.999 or 1.001 are acceptable because they result from rounding errors. Each probability value must be between 0 and 1 inclusive.
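The two requirements can be checked mechanically. The sketch below is only an illustration: the function name `is_probability_distribution` and the tolerance `tol` are assumptions made here (the slides only say that sums such as 0.999 or 1.001 are acceptable), and the second list is a made-up example of a table that fails.

```python
def is_probability_distribution(probs, tol=0.0015):
    """Check the two requirements for a probability distribution.

    tol is an assumed allowance for rounding error, chosen so that
    sums such as 0.999 or 1.001 are accepted.
    """
    # Requirement: each probability is between 0 and 1 inclusive.
    each_in_range = all(0 <= p <= 1 for p in probs)
    # Requirement: the probabilities sum to 1 (up to rounding error).
    sums_to_one = abs(sum(probs) - 1) <= tol
    return each_in_range and sums_to_one


# Pea pod table from the slides: the sum is 1.001, which is acceptable.
pea_pods = [0.01, 0.015, 0.088, 0.264, 0.396, 0.228]
print(is_probability_distribution(pea_pods))

# Made-up table whose probabilities add up to more than 1.
too_big = [0.5, 0.7]
print(is_probability_distribution(too_big))
```

A tighter or looser tolerance could be justified depending on how many decimal places the table uses.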
Discrete and Continuous Random Variables Discrete random variable: either a finite number of values or a countable number of values, where “countable” refers to the fact that there might be infinitely many values, but they result from a counting process. (a count cannot be a decimal!!) Continuous random variable: infinitely many values, and those values can be associated with measurements on a continuous scale without gaps or interruptions. (if something could be any decimal value in a range, it is continuous) This chapter deals exclusively with discrete random variables: experiments where the data observed is a ‘countable’ value. Give examples. Following chapters will deal with continuous random variables.
CHECK YOURSELF!!!
x    P(x)
0    0.250
1    0.382
2    0.119
3    0.481
4    0.260
5    0.127
No! ΣP(x) = 1.619 > 1
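CHECK YOURSELF!!! (Probability histogram; the labels on the horizontal axis were not preserved.) Bar heights: 0.23, 0.165, 0.125, 0.10, 0.08, 0.075, 0.06, 0.03, 0.045, 0.035, 0.025, 0.01, 0.02. These probabilities sum to 1.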
CAUTION!!! Section 5.2 only deals with discrete random variables.
Graphs The probability histogram is very similar to a relative frequency histogram, but the vertical scale shows probabilities. page 202 of Elementary Statistics, 10th Edition Probability histograms relate nicely to the relative frequency histograms of Chapter 2, but the vertical scale shows probabilities instead of relative frequencies based on actual sample results. Observe that the probability of each value of the random variable is also the same as the AREA of the rectangle representing that value (each rectangle is 1 unit wide). This fact will be important when we need to find probabilities of continuous random variables in Chapter 6.
Example 1: Identify the given random variable as being discrete or continuous.
a) The number of people now driving a car in the United States. Discrete
b) The weight of the gold stored in Fort Knox. Continuous
c) The height of the last airplane departed from JFK Airport in New York City. Continuous
Example 1 continued: Identify the given random variable as being discrete or continuous.
d) The number of cars in San Francisco that crashed last year. Discrete
e) The time required to fly from Los Angeles to London. Continuous
A New Look at the Mean
Suppose we want to find the mean of a data set: 1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 5, 5, 5, 5, 7, 8, 8, 8, 9. There are six 1’s, three 2’s, two 4’s, four 5’s, one 7, three 8’s, and one 9, for 20 values in all.
The mean of this data set is:
µ = (1•6 + 2•3 + 4•2 + 5•4 + 7•1 + 8•3 + 9•1)/20
  = 1•(6/20) + 2•(3/20) + 4•(2/20) + 5•(4/20) + 7•(1/20) + 8•(3/20) + 9•(1/20)
  = Σ[x • P(x)]
x       1      2      4      5      7      8      9
P(x)    6/20   3/20   2/20   4/20   1/20   3/20   1/20
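The same idea can be checked numerically. The sketch below is an illustration, not part of the original slides: it turns the data set into relative frequencies and confirms that the ordinary mean equals Σ[x • P(x)]. The variable names are chosen here for readability, and exact fractions are used so that no rounding is involved.

```python
from collections import Counter
from fractions import Fraction

data = [1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 5, 5, 5, 5, 7, 8, 8, 8, 9]
n = len(data)

# Ordinary mean: add up all the values and divide by how many there are.
plain_mean = Fraction(sum(data), n)

# Relative frequency of each distinct value: P(x) = (count of x) / n.
counts = Counter(data)
dist = {x: Fraction(count, n) for x, count in counts.items()}

# Mean computed from the distribution: sum of x * P(x).
weighted_mean = sum(x * p for x, p in dist.items())

print(plain_mean, weighted_mean)  # prints: 4 4
```

Both routes give the same number because dividing the whole sum by 20 is the same as dividing each group's contribution by 20.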
“Freaking Finally, Ms. P!!!” – Direct Quote by You
Now that you have the background info … on to the MAIN EVENT
Formula 5-1 Mean for a probability distribution: µ = Σ[x • P(x)]
Formula 5-2 Variance for a probability distribution (easier to understand): σ² = Σ[(x – µ)² • P(x)]
Formula 5-3 Variance for a probability distribution (easier to compute): σ² = Σ[x² • P(x)] – µ²
Formula 5-4 Standard deviation for a probability distribution: σ = √(Σ[x² • P(x)] – µ²)
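As a rough sketch of how the four formulas fit together, the Python below computes the mean, the variance in both the "easier to understand" and "easier to compute" forms, and the standard deviation. The function names are made up for this illustration, and the sample distribution (the number of girls among three children, with probabilities 1/8, 3/8, 3/8, 1/8) is used because its probabilities add up to exactly 1.

```python
import math


def mean(xs, ps):
    # Mean: sum of x * P(x).
    return sum(x * p for x, p in zip(xs, ps))


def variance_definition(xs, ps):
    # Variance, "easier to understand" form: sum of (x - mu)^2 * P(x).
    mu = mean(xs, ps)
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))


def variance_shortcut(xs, ps):
    # Variance, "easier to compute" form: sum of x^2 * P(x), minus mu^2.
    mu = mean(xs, ps)
    return sum(x ** 2 * p for x, p in zip(xs, ps)) - mu ** 2


def std_dev(xs, ps):
    # Standard deviation: square root of the variance.
    return math.sqrt(variance_shortcut(xs, ps))


xs = [0, 1, 2, 3]
ps = [0.125, 0.375, 0.375, 0.125]

print(mean(xs, ps))                  # 1.5
print(variance_definition(xs, ps))   # 0.75
print(variance_shortcut(xs, ps))     # 0.75
print(round(std_dev(xs, ps), 3))     # 0.866
```

The two variance forms agree when the probabilities sum to exactly 1; with a table whose rounded probabilities sum to 0.999 or 1.001, they can differ slightly.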
Roundoff Rule for µ, σ, and σ² Round results by carrying one more decimal place than the number of decimal places used for the random variable x. If the values of x are integers, round µ, σ, and σ² to one decimal place. If the rounding results in a value that looks as if the mean is ‘all’ or ‘none’ (when, in fact, this is not true), then leave as many decimal places as necessary to correctly reflect the true mean. Page 205 of Elementary Statistics, 10th Edition
So, How Do We Use These? Let’s Calculate! Remember Mr. Llorens and his creepy plant obsession?
x (Number of peas with green pods)    P(x)
0    0.01
1    0.015
2    0.088
3    0.264
4    0.396
5    0.228
Find the mean, variance, and standard deviation for the probability distribution. What does this mean if these results are the number of green pods out of 5 possible pods?
Solution for µ, σ², and σ
Save values of variable x to L1
Save values of P(x) to L2
ΣP(x) = sum(L2) = 1.001, requirement is OK.
Calculate µ = Σ[x • P(x)] = sum(L1*L2) = 3.707, stored as A
Calculate σ² = Σ[x² • P(x)] – µ² = sum((L1)²*L2) – A² = 1.0372
Calculate σ = √(σ²) = 1.0184
The value µ = 3.707 is the mean number of green pods out of 5 pods.
Example 2: Determine whether or not a probability distribution is given. If a probability distribution is given, find its mean and standard deviation. If a probability distribution is not given, identify the requirement(s) that are not satisfied. Three males with an X-linked genetic disorder have one child each. The random variable x is the number of children among the three who inherit the X-linked genetic disorder.
x    P(x)
0    0.125
1    0.375
2    0.375
3    0.125
Solution for µ, σ², and σ
Save values of variable x to L1
Save values of P(x) to L2
ΣP(x) = sum(L2) = 1, requirement is OK.
Calculate µ = Σ[x • P(x)] = sum(L1*L2) = 1.5, stored as A
Calculate σ² = Σ[x² • P(x)] – µ² = sum((L1)²*L2) – A² = 0.75
Calculate σ = √(σ²) = 0.866
The value µ = 1.5 is the mean number of children among the three who inherit the X-linked genetic disorder. The standard deviation is 0.866.
Example 3: Determine whether or not a probability distribution is given. If a probability distribution is given, find its mean and standard deviation. If a probability distribution is not given, identify the requirement(s) that are not satisfied. Air America has a policy of routinely overbooking flights. The random variable x represents the number of passengers who cannot be boarded because there are more passengers than seats (based on data from an IBM research paper by Lawrence, Hong, and Cherrier).
x    P(x)
0    0.051
1    0.141
2    0.274
3    0.331
4    0.187
Solution: ΣP(x) = sum(L2) = 0.984, so the requirement ΣP(x) = 1 is not satisfied (the gap is too large to be rounding error). It is not a probability distribution.
Example 4: Determine whether or not a probability distribution is given. If a probability distribution is given, find its mean and standard deviation. If a probability distribution is not given, identify the requirement(s) that are not satisfied.
x    P(x)
1    0.572
2    0.163
3    0.135
4    0.08
5    0.05
Solution for µ, σ², and σ
Save values of variable x to L1
Save values of P(x) to L2
ΣP(x) = sum(L2) = 1, requirement is OK.
Calculate µ = Σ[x • P(x)] = sum(L1*L2) = 1.873, stored as A
Calculate σ² = Σ[x² • P(x)] – µ² = sum((L1)²*L2) – A² = 1.4609
Calculate σ = √(σ²) = 1.2087
The value µ = 1.873 is the mean. The standard deviation is 1.2087.
Identifying Unusual Results Range Rule of Thumb According to the range rule of thumb, most values should lie within 2 standard deviations of the mean. We can therefore identify “unusual” values by determining if they lie outside these limits: Maximum usual value = μ + 2σ Minimum usual value = μ – 2σ Page 205-206 of Elementary Statistics, 10th Edition
The Exciting Stuff: So how do we know what to expect? Range Rule of Thumb Maximum usual value = μ + 2σ Minimum usual value = μ – 2σ
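The range rule of thumb is simple enough to write as two small helper functions. This is only a sketch: the names `usual_range` and `is_unusual` are invented here, and the numbers plugged in are the mean and standard deviation found earlier for the pea pod distribution (µ = 3.707, σ = 1.0184).

```python
def usual_range(mu, sigma):
    # Minimum usual value = mu - 2*sigma, maximum usual value = mu + 2*sigma.
    return mu - 2 * sigma, mu + 2 * sigma


def is_unusual(x, mu, sigma):
    # A value is "unusual" by this rule if it falls outside the usual range.
    low, high = usual_range(mu, sigma)
    return x < low or x > high


mu, sigma = 3.707, 1.0184  # pea pod distribution from the earlier slides
low, high = usual_range(mu, sigma)
print(round(low, 4), round(high, 4))  # 1.6702 5.7438

print(is_unusual(1, mu, sigma))  # True: 1 green pod is below the minimum usual value
print(is_unusual(5, mu, sigma))  # False: 5 green pods is inside the usual range
```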
Identifying Unusual Results Probabilities Rare Event Rule for Inferential Statistics If, under a given assumption (such as the assumption that a coin is fair), the probability of a particular observed event (such as 992 heads in 1000 tosses of a coin) is extremely small, we conclude that the assumption is probably not correct. The discussion of this topic in the text includes some difficult concepts, but it also includes an extremely important approach used often in statistics. Page 207-208 of Elementary Statistics, 10th Edition
Identifying Unusual Results Probabilities Using Probabilities to Determine When Results Are Unusual
Unusually high: x successes among n trials is an unusually high number of successes if P(x or more) ≤ 0.05.
Unusually low: x successes among n trials is an unusually low number of successes if P(x or fewer) ≤ 0.05.
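The key point of these two criteria is that they use the probability of "x or more" (or "x or fewer"), not the probability of exactly x. A minimal sketch, assuming the distribution is stored as a dictionary from each value to its probability; the function names are hypothetical, and the table is the pea pod distribution from the earlier slides.

```python
def prob_at_least(dist, x):
    # P(x or more): add the probabilities of every value >= x.
    return sum(p for value, p in dist.items() if value >= x)


def prob_at_most(dist, x):
    # P(x or fewer): add the probabilities of every value <= x.
    return sum(p for value, p in dist.items() if value <= x)


def unusually_high(dist, x, cutoff=0.05):
    return prob_at_least(dist, x) <= cutoff


def unusually_low(dist, x, cutoff=0.05):
    return prob_at_most(dist, x) <= cutoff


# Number of peas with green pods among 5 offspring peas.
pea_pods = {0: 0.01, 1: 0.015, 2: 0.088, 3: 0.264, 4: 0.396, 5: 0.228}

print(unusually_low(pea_pods, 1))   # True:  P(1 or fewer) = 0.025
print(unusually_low(pea_pods, 2))   # False: P(2 or fewer) = 0.113
print(unusually_high(pea_pods, 5))  # False: P(5 or more) = 0.228
```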
Example 5: Refer to the table, which describes results from eight offspring peas. The random variable x represents the number of offspring peas with green pods. a) Find the probability of getting exactly 7 peas with green pods. b) Find the probability of getting 7 or more peas with green pods.
x    P(x)
0    0+
1    0+
2    0.004
3    0.023
4    0.087
5    0.208
6    0.311
7    0.267
8    0.100
(0+ denotes a very small positive probability.)
a) P(X = 7) = 0.267
b) P(X ≥ 7) = P(X = 7) + P(X = 8) = 0.267 + 0.100 = 0.367
Example 5 continued: Refer to the table, which describes results from eight offspring peas. The random variable x represents the number of offspring peas with green pods. c) Which probability is relevant for determining whether 7 is an unusually high number of peas with green pods: the result from part (a) or part (b)? d) Is 7 an unusually high number of peas with green pods? Why or why not?
c) The result from part (b), the probability of 7 or more, is the relevant one.
d) No. P(X ≥ 7) = P(X = 7) + P(X = 8) = 0.267 + 0.100 = 0.367 > 0.05, so 7 is not an unusually high number. (The part (a) probability, P(X = 7) = 0.267, is also greater than 0.05, but it is not the relevant criterion.)
Example 6: Based on past results found in the Information Please Almanac, there is a 0.1919 probability that a baseball World Series contest will last four games, a 0.2121 probability that it will last five games, a 0.2222 probability that it will last six games, and a 0.3737 probability that it will last seven games. a) Does the given information describe a probability distribution?
x    P(x)
4    0.1919
5    0.2121
6    0.2222
7    0.3737
Yes. 0.1919 + 0.2121 + 0.2222 + 0.3737 = 0.9999, which is close enough to 1, and each probability is between 0 and 1.
Example 6 continued: b) Assuming that the given information describes a probability distribution, find the mean and standard deviation for the numbers of games in World Series contests.
Solution for µ, σ², and σ
Save values of variable x to L1
Save values of P(x) to L2
ΣP(x) = sum(L2) = 0.9999, requirement is OK.
Calculate µ = Σ[x • P(x)] = sum(L1*L2) = 5.7772, stored as A
Calculate σ² = Σ[x² • P(x)] – µ² = sum((L1)²*L2) – A² = 1.3074
Calculate σ = √(σ²) = 1.1434
The value µ = 5.7772 is the mean. The standard deviation is 1.1434.
Example 6 continued: c) Is it unusual for a team to “sweep” by winning in four games? Why or why not? No. P(X = 4) = 0.1919 > 0.05, so a four-game sweep is not unusual.
Your Turn: Based on data from CarMax.com, when a car is randomly selected, the number of bumper stickers and the corresponding probabilities are as follows: 0 (0.824); 1 (0.083); 2 (0.039); 3 (0.014); 4 (0.012); 5 (0.008); 6 (0.008); 7 (0.004); 8 (0.004); 9 (0.004). a) Does the given information describe a probability distribution? 0.824 + 0.083 + 0.039 + 0.014 + 0.012 + 2(0.008) + 3(0.004) = 1. Yes, it describes a probability distribution.
Your Turn continued: b) Assuming that a probability distribution is described, find its mean and standard deviation. µ = 0.435, σ = 1.2774
Your Turn continued: c) Use the range rule of thumb to identify the range of values for usual numbers of bumper stickers. µ – 2σ = 0.435 – 2(1.2774) = –2.1198 and µ + 2σ = 0.435 + 2(1.2774) = 2.9898. Because the number of bumper stickers cannot be negative, the usual number of bumper stickers ranges from 0 to 2.9898.
Your Turn continued: d) Is it unusual for a car to have more than one bumper sticker? Why or why not? No. P(more than one) = 0.039 + 0.014 + 0.012 + 2(0.008) + 3(0.004) = 0.093 > 0.05, so it is not unusual.
Example 7: Let the random variable x represent the number of girls in a family of three children. Construct a table describing the probability distribution, then find the mean and standard deviation. (HINT: List the different possible outcomes.) Is it unusual for a family of three children to consist of three girls?
Different possible outcomes of having three children:
0 girls: BBB
1 girl: GBB, BGB, BBG
2 girls: GGB, GBG, BGG
3 girls: GGG
x    P(x)
0    1/8
1    3/8
2    3/8
3    1/8
µ = 1.5, σ = 0.866
Is it unusual for a family of three children to consist of three girls? µ – 2σ = 1.5 – 2(0.866) ≈ –0.232 and µ + 2σ = 1.5 + 2(0.866) ≈ 3.232, so x = 3 is within the usual range. Alternatively, P(X = 3) = 1/8 = 0.125 > 0.05. Either way, three girls is not unusual.
THOUGHT FROM A GAME You and your neighbor are going to play a game of rolling a die. The rules of the game are as follows: You and your neighbor each roll 1 die, one time. If you roll a number less than 5, then your neighbor gives you $2. If you roll a 5 or 6, then you give your neighbor $4. Does this seem fair?
Expected Value The expected value of a discrete random variable is denoted by E, and it represents the mean value of the outcomes. It is obtained by finding the value of Σ[x • P(x)]: E = Σ[x • P(x)]. It is also called expectation or mathematical expectation, and it plays a very important role in decision theory. page 208 of Elementary Statistics, 10th Edition
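To see the formula in action, here is a small sketch that applies it to the die game described just before this slide. It assumes a fair six-sided die, looks at the game from your side (win $2 on a roll of 1 to 4, lose $4 on a 5 or 6), and uses a made-up helper name, `expected_value`. Exact fractions avoid rounding.

```python
from fractions import Fraction


def expected_value(outcomes):
    # outcomes is a list of (value, probability) pairs; E = sum of x * P(x).
    return sum(x * p for x, p in outcomes)


# Die game, from your point of view, assuming a fair die:
# roll 1-4 (probability 4/6): gain $2; roll 5-6 (probability 2/6): lose $4.
die_game = [(2, Fraction(4, 6)), (-4, Fraction(2, 6))]

print(expected_value(die_game))  # prints: 0
```

An expected value of 0 means that, over many plays, neither player comes out ahead on average, which is one common way to make "fair" precise.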
Example 8: In the Illinois Pick 3 lottery game, you pay 50¢ to select a sequence of three digits, such as 233. If you select the same sequence of three digits that are drawn, you win and collect $250.
a) How many different selections are possible? 10 × 10 × 10 = 1000
b) What is the probability of winning? 1/1000
c) If you win, what is your net profit? $250 – $0.50 = $249.50
x (net gain, $)    P(x)
–0.50     999/1000
249.50    1/1000
Example 8 continued: d) Find the expected value. E = –0.5 • 999/1000 + 249.5 • 1/1000 = –0.25, an expected loss of 25¢ per bet. e) If you bet 50¢ in Illinois’ Pick 4 game, the expected value is –25¢. Which bet is better: a 50¢ bet in the Illinois Pick 3 game or a 50¢ bet in the Illinois Pick 4 game? Explain.
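Pick 4 worked out (this calculation assumes the same $250 prize as Pick 3, which does not reproduce the –25¢ quoted in part (e)):
a) How many different selections are possible? 10 × 10 × 10 × 10 = 10000
b) What is the probability of winning? 1/10000
c) Find your expected value of winning.
x (net gain, $)    P(x)
–0.50     9999/10000
249.50    1/10000
E = –0.5 • 9999/10000 + 249.5 • 1/10000 = –0.475
On average, under this assumption, a Pick 4 bet loses 47.5¢ while a Pick 3 bet loses 25¢, so Pick 3 would be the better bet. If the Pick 4 expected value is –25¢ as stated in part (e), the two bets have the same expected value and neither is better.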
Example 9: When playing roulette at the Bellagio casino in Las Vegas, a gambler is trying to decide whether to bet $5 on the number 13 or to bet $5 that the outcome is any one of these five possibilities: 0 or 00 or 1 or 2 or 3. The expected value for a $5 bet for a single number is –26¢. For the $5 bet that the outcome is 0 or 00 or 1 or 2 or 3, there is a probability of 5/38 of making a net profit of $30 and a 33/38 probability of losing $5. a) Find the expected value for the $5 bet that the outcome is 0 or 00 or 1 or 2 or 3.
x     P(x)
–5    33/38
30    5/38
E = –5 • 33/38 + 30 • 5/38 = –15/38 ≈ –0.395, an expected loss of about 39¢.
Example 9 continued: b) Which bet is better: a $5 bet on the number 13 or a $5 bet that the outcome is 0 or 00 or 1 or 2 or 3? Why? The $5 bet on the number 13 is better, because its expected value of –26¢ is greater than the expected value of about –39¢ for the five-number bet.
Example 10: There is a 0.9986 probability that a randomly selected 30-year-old male lives through the year (based on data from the U.S. Department of Health and Human Services). A Fidelity life insurance company charges $161 for insuring that the male will live through the year. If the male does not survive the year, the policy pays out $100,000 as a death benefit. a) From the perspective of the 30-year-old male, what are the values corresponding to the two events of surviving the year and not surviving? Surviving the year: –$161 (the cost of the policy). Not surviving: $100,000 – $161 = $99,839.
Example 10 continued: b) If a 30-year-old male purchases the policy, what is his expected value?
x        P(x)
–161     0.9986
99839    0.0014
E = –161 • 0.9986 + 99839 • 0.0014 = –21, an expected loss of $21.
Example 10 continued: c) Can the insurance company expect to make a profit from many such policies? Why? Yes. On average, each policyholder loses $21 by purchasing this life insurance; equivalently, the insurance company makes an average profit of $21 per policy, so it can expect to profit from many such policies.
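The three parts of Example 10 can be tied together in a few lines. This is a sketch under stated assumptions: the function name `policy_expected_value` is hypothetical, the only figures used are the $161 premium, the $100,000 benefit, and the 0.9986 survival probability from the example, and the 10,000-policy figure at the end is an arbitrary illustration rather than data from the source.

```python
def policy_expected_value(premium, benefit, p_survive):
    """Expected value of a one-year policy from the buyer's perspective."""
    # Buyer survives: loses the premium. Buyer dies: benefit minus premium.
    outcomes = [
        (-premium, p_survive),
        (benefit - premium, 1 - p_survive),
    ]
    return sum(x * p for x, p in outcomes)


buyer_ev = policy_expected_value(premium=161, benefit=100_000, p_survive=0.9986)
print(round(buyer_ev, 2))  # -21.0

# The insurer's expected profit per policy is the buyer's expected loss.
insurer_ev = -buyer_ev

# Illustrative only: expected total profit over a hypothetical 10,000 policies.
print(round(insurer_ev * 10_000))  # 210000
```

A negative expected value for the buyer does not by itself mean the policy is a bad purchase; it only describes the average outcome over many such policies.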
Recap In this section we have discussed: Combining methods of descriptive statistics with probability. Random variables and probability distributions. Probability histograms. Requirements for a probability distribution. Mean, variance and standard deviation of a probability distribution. Identifying unusual results. Expected value.
HOMEWORK Pg. 216-217 #16, 17, 18, 28, 29