9.3 Tests about a Population Mean Objectives SWBAT: STATE and CHECK the Random, 10%, and Normal/Large Sample conditions for performing a significance test.

Slides:



Advertisements
Similar presentations
+ Paired Data and Using Tests Wisely Chapter 9: Testing a Claim Section 9.3b Tests About a Population Mean.
Advertisements

CHAPTER 9 Testing a Claim
Inference for the Mean of a Population
Do political “attack ads” work
Confidence Interval and Hypothesis Testing for:
Copyright ©2011 Brooks/Cole, Cengage Learning Testing Hypotheses about Means Chapter 13.
Significance Testing Chapter 13 Victor Katch Kinesiology.
5-3 Inference on the Means of Two Populations, Variances Unknown
Testing a Claim Chapter 9 notes.
More About Significance Tests
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Chapter 9: Testing a Claim Section 9.3a Tests About a Population Mean.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Statistical Inferences Based on Two Samples Chapter 9.
+ Chapter 9 Summary. + Section 9.1 Significance Tests: The Basics After this section, you should be able to… STATE correct hypotheses for a significance.
Inference for Linear Regression Conditions for Regression Inference: Suppose we have n observations on an explanatory variable x and a response variable.
CHAPTER 16: Inference in Practice. Chapter 16 Concepts 2  Conditions for Inference in Practice  Cautions About Confidence Intervals  Cautions About.
Chapter 10 Comparing Two Means Target Goal: I can use two-sample t procedures to compare two means. 10.2a h.w: pg. 626: 29 – 32, pg. 652: 35, 37, 57.
Chapter 9: Testing a Claim Section 9.3 Tests About a Population Mean.
CHAPTER 18: Inference about a Population Mean
Significance Tests: THE BASICS Could it happen by chance alone?
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 10 Comparing Two Populations or Groups 10.2.
10.2 Tests of Significance Use confidence intervals when the goal is to estimate the population parameter If the goal is to.
Confidence intervals are one of the two most common types of statistical inference. Use a confidence interval when your goal is to estimate a population.
+ Chapter 12: More About Regression Section 12.1 Inference for Linear Regression.
Section 10.1 Confidence Intervals
Chapter 9: Testing a Claim Section 9.2 Tests About a Population Proportion.
Chapter 12 Confidence Intervals and Hypothesis Tests for Means © 2010 Pearson Education 1.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 12 More About Regression 12.1 Inference for.
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Unit 5: Hypothesis Testing.
+ Unit 6: Comparing Two Populations or Groups Section 10.2 Comparing Two Means.
Learning Objectives After this section, you should be able to: The Practice of Statistics, 5 th Edition1 DESCRIBE the shape, center, and spread of the.
8.3 Estimating a Population Mean Objectives SWBAT: STATE and CHECK the Random, 10%, and Normal/Large Sample conditions for constructing a confidence interval.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 9 Testing a Claim 9.3 Tests About a Population.
Learning Objectives After this section, you should be able to: The Practice of Statistics, 5 th Edition1 STATE the null and alternative hypotheses for.
AP STATISTICS LESSON 11 – 1 (DAY 2) The t Confidence Intervals and Tests.
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Chapter 9: Testing a Claim Section 9.3 Tests About a Population Mean.
10.2 Comparing Two Means Objectives SWBAT: DESCRIBE the shape, center, and spread of the sampling distribution of the difference of two sample means. DETERMINE.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 12 More About Regression 12.1 Inference for.
Review of Power of a Test
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
Unit 5: Hypothesis Testing
Warm Up Check your understanding P. 586 (You have 5 minutes to complete) I WILL be collecting these.
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 12 More About Regression
AP Stats Check In Where we’ve been…
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 18: Inference about a Population Mean
AP Stats Check In Where we’ve been…
Tests About a Population Mean
TESTs about a population mean
CHAPTER 9 Testing a Claim
Tests About a Population Mean
Unit 5: Hypothesis Testing
CHAPTER 12 More About Regression
CHAPTER 10 Comparing Two Populations or Groups
Unit 5: Hypothesis Testing
CHAPTER 9 Testing a Claim
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 9 Testing a Claim
CHAPTER 18: Inference about a Population Mean
CHAPTER 12 More About Regression
CHAPTER 18: Inference about a Population Mean
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
CHAPTER 10 Comparing Two Populations or Groups
Presentation transcript:

9.3 Tests about a Population Mean Objectives SWBAT: STATE and CHECK the Random, 10%, and Normal/Large Sample conditions for performing a significance test about a population mean. PERFORM a significance test about a population mean. USE a confidence interval to draw a conclusion for a two-sided test about a population parameter. PERFORM a significance test about a mean difference using paired data.

What are the three conditions for conducting a significance test for a population mean? How are these different than the conditions for calculating a confidence interval for a population mean? Conditions For Performing A Significance Test About A Mean Random: The data come from a well-designed random sample or randomized experiment. o 10%: When sampling without replacement, check that n ≤ (1/10)N. Normal/Large Sample: The population has a Normal distribution or the sample size is large (n ≥ 30). If the population distribution has unknown shape and n < 30, use a graph of the sample data to assess the Normality of the population. Do not use t procedures if the graph shows strong skewness or outliers. The conditions are the same as the conditions for constructing a confidence interval.

What test statistic do we use when testing a population mean? Is the formula on the formula sheet? As with proportions, all we get on the formula sheet is the generic Reminder: we always operate under the assumption that the null is true. Also, with means, we use the t-distribution unless there is a rare instance when we know the population standard deviation. Because the population standard deviation σ is usually unknown, we use the sample standard deviation s x in its place. The resulting test statistic has the standard error of the sample mean in the denominator When the Normal condition is met, this statistic has a t distribution with n - 1 degrees of freedom.

How do you calculate p-values using the t distributions? You can use either the table or the calculator to get a p-value. Let’s look at an example. A battery company wants to test H 0 : µ = 30 versus H a : µ > 30 based on an SRS of 15 new AAA batteries with mean lifetime and standard deviation The P-value is the probability of getting a result this large or larger in the direction indicated by H a, that is, P(t ≥ 1.54). Go to the df = 14 row. Since the t statistic falls between the values and 1.761, the “Upper-tail probability p” is between 0.10 and The P-value for this test is between 0.05 and Upper-tail probability p df %90%95% Confidence level C Reminder: for two-sided tests, double the p-value

Alternate Example: Short Subs Abby and Raquel like to eat sub sandwiches. However, they noticed that the lengths of the “6-inch sub” sandwiches they get at their favorite restaurant seemed shorter than the advertised length. To investigate, they randomly selected 24 different times during the next month and ordered a “6-inch” sub. Here are the actual lengths of each of the 24 sandwiches (in inches):

Alternate Example: Short Subs Abby and Raquel like to eat sub sandwiches. However, they noticed that the lengths of the “6-inch sub” sandwiches they get at their favorite restaurant seemed shorter than the advertised length. To investigate, they randomly selected 24 different times during the next month and ordered a “6-inch” sub. Here are the actual lengths of each of the 24 sandwiches (in inches): Calculator: tcdf(lower: ; upper: -2.38, df: 23) =

Alternate Example: Short Subs Abby and Raquel like to eat sub sandwiches. However, they noticed that the lengths of the “6-inch sub” sandwiches they get at their favorite restaurant seemed shorter than the advertised length. To investigate, they randomly selected 24 different times during the next month and ordered a “6-inch” sub. Here are the actual lengths of each of the 24 sandwiches (in inches): b) Given your conclusion in part (a), which kind of mistake – a Type I or a Type II error – could you have made? Explain what this mistake would mean in context. Because we rejected the null hypothesis, it is possible that we made a Type I error. In other words, it is possible that we found convincing evidence that the mean length was less than 6 inches when in reality the mean length is 6 inches.

To do this in the calculator: Start by entering the data into a list. STAT > TESTS > 2: T-Test List: wherever you put the data Calculate t = p =

Can you use your calculator for the Do step? Are there any drawbacks? As we’ve seen, yes, you can use your calculator. However, if you just show calculator results with no work, and one or more values are wrong, you won’t get any credit for the “do” step. If you opt for the calculator-only method, name the procedure (t test) and report the test statistic, degrees of freedom, and p-value.

Alternate Example: Don’t break the ice In the children’s game Don’t Break the Ice, small plastic ice cubes are squeezed into a square frame. Each child takes turns tapping out a cube of “ice” with a plastic hammer, hoping that the remaining cubes don’t collapse. For the game to work correctly, the cubes must be big enough so that they hold each other in place in the plastic frame but not so big that they are too difficult to tap out. The machine that produces the plastic cubes is designed to make cubes that are 29.5 millimeters (mm) wide, but the actual width varies a little. To ensure that the machine is working well, a supervisor inspects a random sample of 50 cubes every hour and measures their width. The Fathom output summarizes the data from a sample taken during one hour. a)Interpret the standard deviation and the standard error provided by the computer output. Standard deviation: The widths of the cubes are typically about mm from the mean width ( ). Standard error: In random samples of size 50, the sample mean will typically be about mm from the true mean.

Alternate Example: Don’t break the ice In the children’s game Don’t Break the Ice, small plastic ice cubes are squeezed into a square frame. Each child takes turns tapping out a cube of “ice” with a plastic hammer, hoping that the remaining cubes don’t collapse. For the game to work correctly, the cubes must be big enough so that they hold each other in place in the plastic frame but not so big that they are too difficult to tap out. The machine that produces the plastic cubes is designed to make cubes that are 29.5 millimeters (mm) wide, but the actual width varies a little. To ensure that the machine is working well, a supervisor inspects a random sample of 50 cubes every hour and measures their width. The Fathom output summarizes the data from a sample taken during one hour. b) 1)The true mean really is not 29.5 mm 2)The true mean really is 29.5 mm but we produced mm by random chance

Alternate Example: Don’t break the ice In the children’s game Don’t Break the Ice, small plastic ice cubes are squeezed into a square frame. Each child takes turns tapping out a cube of “ice” with a plastic hammer, hoping that the remaining cubes don’t collapse. For the game to work correctly, the cubes must be big enough so that they hold each other in place in the plastic frame but not so big that they are too difficult to tap out. The machine that produces the plastic cubes is designed to make cubes that are 29.5 millimeters (mm) wide, but the actual width varies a little. To ensure that the machine is working well, a supervisor inspects a random sample of 50 cubes every hour and measures their width. The Fathom output summarizes the data from a sample taken during one hour.

Alternate Example: Don’t break the ice In the children’s game Don’t Break the Ice, small plastic ice cubes are squeezed into a square frame. Each child takes turns tapping out a cube of “ice” with a plastic hammer, hoping that the remaining cubes don’t collapse. For the game to work correctly, the cubes must be big enough so that they hold each other in place in the plastic frame but not so big that they are too difficult to tap out. The machine that produces the plastic cubes is designed to make cubes that are 29.5 millimeters (mm) wide, but the actual width varies a little. To ensure that the machine is working well, a supervisor inspects a random sample of 50 cubes every hour and measures their width. The Fathom output summarizes the data from a sample taken during one hour.

Alternate Example: Don’t break the ice In the children’s game Don’t Break the Ice, small plastic ice cubes are squeezed into a square frame. Each child takes turns tapping out a cube of “ice” with a plastic hammer, hoping that the remaining cubes don’t collapse. For the game to work correctly, the cubes must be big enough so that they hold each other in place in the plastic frame but not so big that they are too difficult to tap out. The machine that produces the plastic cubes is designed to make cubes that are 29.5 millimeters (mm) wide, but the actual width varies a little. To ensure that the machine is working well, a supervisor inspects a random sample of 50 cubes every hour and measures their width. The Fathom output summarizes the data from a sample taken during one hour. p-value: using the t distribution with 40 degrees of freedom, the p-value is greater than 2(0.25) = Using technology: With df = 49, the calculator’s t-test gives a p-value = 2( ) =

Alternate Example: Don’t break the ice In the children’s game Don’t Break the Ice, small plastic ice cubes are squeezed into a square frame. Each child takes turns tapping out a cube of “ice” with a plastic hammer, hoping that the remaining cubes don’t collapse. For the game to work correctly, the cubes must be big enough so that they hold each other in place in the plastic frame but not so big that they are too difficult to tap out. The machine that produces the plastic cubes is designed to make cubes that are 29.5 millimeters (mm) wide, but the actual width varies a little. To ensure that the machine is working well, a supervisor inspects a random sample of 50 cubes every hour and measures their width. The Fathom output summarizes the data from a sample taken during one hour.

On the calculator: T-Test Stats t = p = df = 50-1 = 49

Alternate Example: Don’t break the ice In the children’s game Don’t Break the Ice, small plastic ice cubes are squeezed into a square frame. Each child takes turns tapping out a cube of “ice” with a plastic hammer, hoping that the remaining cubes don’t collapse. For the game to work correctly, the cubes must be big enough so that they hold each other in place in the plastic frame but not so big that they are too difficult to tap out. The machine that produces the plastic cubes is designed to make cubes that are 29.5 millimeters (mm) wide, but the actual width varies a little. To ensure that the machine is working well, a supervisor inspects a random sample of 50 cubes every hour and measures their width. The Fathom output summarizes the data from a sample taken during one hour.

Paired Data Comparative studies are more convincing than single-sample investigations. For that reason, one-sample inference is less common than comparative inference. Study designs that involve making two observations on the same individual, or one observation on each of two similar individuals, result in paired data. When paired data result from measuring the same quantitative variable twice, we can make comparisons by analyzing the differences in each pair. If the conditions for inference are met, we can use one-sample t procedures to perform inference about the mean difference µ d. These methods are sometimes called paired t procedures. Our null hypothesis will that there is no mean difference between the two items being compared (H 0 : µ d = 0). Our alternative will be that there is a mean difference (H a : µ d > 0). Be sure to define the order in which you are subtracting.

Alternate Example: Is the express lane faster? For their second semester project in AP Statistics, Libby and Kathryn decided to investigate which line was faster in the supermarket: the express lane or the regular lane. To collect their data, they randomly selected 15 times during a week, went to the same store, and bought the same item. However, one of them used the express lane and the other used a regular lane. To decide which lane each of them would use, they flipped a coin. If it was heads, Libby used the express lane and Kathryn used the regular lane. If it was tails, Libby used the regular lane and Kathryn used the express lane. They entered their randomly assigned lanes at the same time, and each recorded the time in seconds it took them to complete the transaction. Carry out a test to see if there is convincing evidence that the express lane is faster. Since these data are paired, we will consider the differences in time (regular − express). Here are the 15 differences. In this case, a positive difference means that the express lane was faster. 5246− −94 − −3942 Now find the mean difference by adding up these differences and dividing by /15 = 42.7

Alternate Example: Is the express lane faster? For their second semester project in AP Statistics, Libby and Kathryn decided to investigate which line was faster in the supermarket: the express lane or the regular lane. To collect their data, they randomly selected 15 times during a week, went to the same store, and bought the same item. However, one of them used the express lane and the other used a regular lane. To decide which lane each of them would use, they flipped a coin. If it was heads, Libby used the express lane and Kathryn used the regular lane. If it was tails, Libby used the regular lane and Kathryn used the express lane. They entered their randomly assigned lanes at the same time, and each recorded the time in seconds it took them to complete the transaction. Carry out a test to see if there is convincing evidence that the express lane is faster.

Alternate Example: Is the express lane faster? For their second semester project in AP Statistics, Libby and Kathryn decided to investigate which line was faster in the supermarket: the express lane or the regular lane. To collect their data, they randomly selected 15 times during a week, went to the same store, and bought the same item. However, one of them used the express lane and the other used a regular lane. To decide which lane each of them would use, they flipped a coin. If it was heads, Libby used the express lane and Kathryn used the regular lane. If it was tails, Libby used the regular lane and Kathryn used the express lane. They entered their randomly assigned lanes at the same time, and each recorded the time in seconds it took them to complete the transaction. Carry out a test to see if there is convincing evidence that the express lane is faster.

Alternate Example: Is the express lane faster? For their second semester project in AP Statistics, Libby and Kathryn decided to investigate which line was faster in the supermarket: the express lane or the regular lane. To collect their data, they randomly selected 15 times during a week, went to the same store, and bought the same item. However, one of them used the express lane and the other used a regular lane. To decide which lane each of them would use, they flipped a coin. If it was heads, Libby used the express lane and Kathryn used the regular lane. If it was tails, Libby used the regular lane and Kathryn used the express lane. They entered their randomly assigned lanes at the same time, and each recorded the time in seconds it took them to complete the transaction. Carry out a test to see if there is convincing evidence that the express lane is faster.

Alternate Example: Is the express lane faster? For their second semester project in AP Statistics, Libby and Kathryn decided to investigate which line was faster in the supermarket: the express lane or the regular lane. To collect their data, they randomly selected 15 times during a week, went to the same store, and bought the same item. However, one of them used the express lane and the other used a regular lane. To decide which lane each of them would use, they flipped a coin. If it was heads, Libby used the express lane and Kathryn used the regular lane. If it was tails, Libby used the regular lane and Kathryn used the express lane. They entered their randomly assigned lanes at the same time, and each recorded the time in seconds it took them to complete the transaction. Carry out a test to see if there is convincing evidence that the express lane is faster.

On the calculator: T-Test Data Mu 0: 0 List Mu: >m0 Df: 14 t = p =

What is the difference between statistical and practical significance? Statistical Significance and Practical Importance When a null hypothesis (“no effect” or “no difference”) can be rejected at the usual levels (α = 0.05 or α = 0.01), there is good evidence of a difference. But that difference may be very small. When large samples are available, even tiny deviations from the null hypothesis will be significant. Statistical Significance and Practical Importance When a null hypothesis (“no effect” or “no difference”) can be rejected at the usual levels (α = 0.05 or α = 0.01), there is good evidence of a difference. But that difference may be very small. When large samples are available, even tiny deviations from the null hypothesis will be significant. Remember the wise saying: Statistical significance is not the same thing as practical importance. The remedy for attaching too much importance to statistical significance is to pay attention to the actual data as well as to the p-value. Plot the data and examine it carefully. Are there outliers or other departures from a consistent pattern? A few outlying observations can produce highly significant tests.

What is the problem of multiple tests? Beware of Multiple Analyses Statistical significance ought to mean that you have found a difference that you were looking for. The reasoning behind statistical significance works well if you decide what difference you are seeking, design a study to search for it, and use a significance test to weigh the evidence you get. In other settings, significance may have little meaning.

Suppose that 20 significance tests were conducted and in each case the null hypothesis was true. What is the probability that we avoid a Type I error in all 20 tests?