Chapter 7 Inference for Means


Statistics 303 Chapter 7 Inference for Means

Inference for Means To this point, when examining the mean of a population we have always assumed that the population standard deviation (σ) was known. In practice this is seldom the case. We usually must estimate the population standard deviation σ with the sample standard deviation s (for a review of s, see pp. 49-50 of the book). When we do this, the standardized sample mean no longer follows the standard normal distribution, because estimating σ with s adds extra variability. Thus, instead of using Z, the standard normal distribution, we must use the appropriate t-distribution.

Inference for Means The t-distribution Although there is only one Z-distribution, there are many, many t-distributions. In fact, there is a different t-distribution for each sample size used. The shape of each t-distribution is very similar to the Z-distribution, but is slightly flatter, with more area in the tails. The larger the sample size, the closer the t-distribution is to the Z-distribution.
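To see how quickly the t-distributions approach the Z-distribution, one can compare critical values for a fixed confidence level. The sketch below uses two-sided 95% critical values as they are commonly tabulated; it is an assumption that the book's Table D prints the same rounded values, and the names `T_STAR_95` and `excess_over_z` are made up for this illustration.

```python
# Two-sided 95% critical values t* as commonly tabulated (assumed to
# match the book's Table D). Keys are degrees of freedom.
T_STAR_95 = {1: 12.71, 2: 4.303, 5: 2.571, 10: 2.228, 15: 2.131,
             30: 2.042, 80: 1.990, 1000: 1.962}
Z_STAR_95 = 1.960  # standard normal critical value for 95% confidence


def excess_over_z(df):
    """How much larger t* is than z* for a tabulated df."""
    return T_STAR_95[df] - Z_STAR_95


for df in sorted(T_STAR_95):
    print(f"df = {df:4d}  t* = {T_STAR_95[df]:6.3f}  "
          f"t* - z* = {excess_over_z(df):.3f}")
```

The gap between t* and z* shrinks steadily as df grows, which is why the two distributions are nearly interchangeable for large samples.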

Inference for Means The t-distribution The way we distinguish between various t-distributions is by finding the degrees of freedom (df) that correspond to the sample size. When we are looking at only one sample, the degrees of freedom are the sample size minus one: df = n – 1. We say that the one-sample t-statistic \(t = \frac{\bar{x} - \mu}{s/\sqrt{n}}\) has the t distribution with n – 1 degrees of freedom.

Inference for Means The t-distribution A table of t distribution critical values can be found in Table D (the last page of the book). Note that the probabilities in this table are areas to the right, not areas to the left as in the Z-table. In Table D, the degrees of freedom are listed in the left column. The probabilities are on top (these probabilities are inside the table for the Z-table). The individual t-values are inside the table. Make sure to get acquainted with this table and how it differs from the Z-table.

Inference for Means The t-distribution In the book, p.452, we see an example of how the distributions compare:

Inference for Means The t-distribution With the change from σ to s, and the change from z* to t*, the steps in producing confidence intervals and hypothesis tests are the same as we have seen previously. In Chapter 1, p. 50, we find that s is calculated from the data using the formula: \(s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}\). This formula is very cumbersome to apply by hand. Ideally, a computer is used to calculate s, particularly for large data sets.
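As a small illustration of letting a computer do this work, the sketch below computes the sample standard deviation directly from its definition (sum of squared deviations divided by n – 1, then a square root) and checks it against Python's built-in `statistics.stdev`. The data values and the function name `sample_sd` are hypothetical.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)  # squared deviations
    return math.sqrt(sum_sq / (n - 1))


hours = [3, 4, 10, 2, 5, 20]  # hypothetical observations
print(sample_sd(hours))         # computed from the definition
print(statistics.stdev(hours))  # library routine, same quantity
```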

Confidence Interval for μ with Unknown σ The formula for a confidence interval for μ with unknown σ is \(\bar{x} \pm t^* \frac{s}{\sqrt{n}}\). Here \(\bar{x}\) and s are calculated from the data, and n is the sample size. t* is found in Table D at the back of the book. It must correspond to the appropriate df = n – 1. It is easiest to find the confidence level at the bottom of the table and go up to the correct df.
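A minimal sketch of this interval in code, assuming the critical value t* has already been looked up in the table and is passed in by the caller. The function name `t_interval` and the sample values are hypothetical.

```python
import math
import statistics


def t_interval(values, t_star):
    """Interval xbar +/- t* * s / sqrt(n) for the population mean."""
    n = len(values)
    xbar = statistics.mean(values)
    s = statistics.stdev(values)       # sample sd, n - 1 divisor
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical sample of 6 values, so df = 5; the commonly tabulated
# t* for 95% confidence with 5 df is 2.571.
sample = [6.8, 8.2, 4.8, 6.0, 7.1, 5.5]
low, high = t_interval(sample, t_star=2.571)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Passing t* in keeps the sketch free of any distribution library and mirrors the table-based workflow described on the slide.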

Confidence Interval for μ with Unknown σ Confidence Interval Example An economist wants to determine the average amount a family of four in the United States spends on housing annually. He randomly selects 85 families of size four and finds the amount they spent on housing the previous year. The economist wishes to estimate the mean with 99% confidence.

Confidence Interval for μ with Unknown σ Confidence Interval Example Information given: Sample size: n = 85. Data: $6,789, $8,233, $4,784, …, $5,974 (85 numbers). The sample mean \(\bar{x}\) and the sample standard deviation s are calculated from the data. df = n – 1 = 85 – 1 = 84

Confidence Interval for μ with Unknown σ Confidence Interval Example t* is found in Table D. We first go to the 99% confidence level at the bottom. Then we go up to 80 df, the largest df in the table that does not exceed 84 (always round down). Thus, t* = 2.639. The resulting interval, \(\bar{x} \pm 2.639\, s/\sqrt{85}\), is a 99% confidence interval for the true average amount a family of four in the United States spends on housing annually.
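The "always round down" rule can be written as a small lookup. Rounding down picks a slightly larger t*, so the interval errs on the side of being too wide rather than too narrow. The list of tabulated df rows below is an assumption about how Table D is laid out (rows 1 to 30, then 40, 50, 60, 80, 100, 1000), and `table_df` is a hypothetical helper name.

```python
# df rows assumed to be printed in the t table: 1..30, then a few
# larger values. This layout is an assumption about Table D.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]


def table_df(df):
    """Largest tabulated df that does not exceed the actual df."""
    if df < 1:
        raise ValueError("df must be at least 1")
    return max(d for d in TABLE_DF if d <= df)


print(table_df(84))  # n = 85 families gives df = 84
```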

Hypothesis Test for μ with Unknown σ The steps for a hypothesis test are the same as those seen previously, namely, 1. State the null hypothesis. 2. State the alternative hypothesis. 3. State the level of significance (e.g., α = 0.05). 4. Calculate the test statistic (note change): \(t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\)

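Hypothesis Test for μ with Unknown σ 5. Find the P-value, where T has the t distribution with n – 1 degrees of freedom: For a two-sided test: \(P = 2P(T \ge |t|)\). For a one-sided test: \(P = P(T \ge t)\) when the alternative is μ > μ0, or \(P = P(T \le t)\) when the alternative is μ < μ0. Because of the limited number of t-values given in Table D, it is more common to find a range for the P-value, rather than the exact value (as will be seen in the example). Computers can be used to obtain exact values.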

Hypothesis Test for μ with Unknown σ 6. Reject or fail to reject H0 based on the P-value. If the P-value is less than or equal to α, reject H0. If the P-value is greater than α, fail to reject H0. 7. State your conclusion. If H0 is rejected, “There is significant statistical evidence that the population mean is different from μ0.” If H0 is not rejected, “There is not significant statistical evidence that the population mean is different from μ0.” Notice that these last two steps are exactly the same as for the case where σ is known.
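Steps 4 and 6 translate directly into code. The sketch below computes the one-sample t statistic from raw data and applies the decision rule to a P-value supplied by the caller; it does not compute the P-value itself. The function names and the sample are hypothetical.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Step 4: t = (xbar - mu0) / (s / sqrt(n))."""
    n = len(values)
    xbar = statistics.mean(values)
    s = statistics.stdev(values)
    return (xbar - mu0) / (s / math.sqrt(n))


def decide(p_value, alpha=0.05):
    """Step 6: reject H0 exactly when the P-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


sample = [9, 4, 12, 7, 10, 6, 11, 8]  # hypothetical data
t = one_sample_t(sample, mu0=7)
print(f"t = {t:.3f} with df = {len(sample) - 1}")
print(decide(0.15))  # a P-value above alpha = 0.05
```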

Hypothesis Test for μ with Unknown σ T.V. Example Suppose that the data collected from our class survey is a random sample from the entire university (which it obviously is not). We wish to see if there is evidence that the average amount of television watched for students here is more than 7 hours per week.

Hypothesis Test for μ with Unknown σ T.V. Example Information given: Sample size: n = 38. Values extracted from the slide's data display: 3, 4, 10, 2, 5, 20, 6, 1, 9, 30, 15, 21.

Hypothesis Test for μ with Unknown σ T.V. Example 1. State the null hypothesis: H0: μ = 7. 2. State the alternative hypothesis: Ha: μ > 7 (from “is more than”). 3. State the level of significance: assume α = 0.05.

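Hypothesis Test for μ with Unknown σ T.V. Example 4. Calculate the test statistic from the sample mean and sample standard deviation, with μ0 = 7 and n = 38. 5. Find the P-value. Remember the table gives probabilities to the right, so we do not use the technique of subtracting from 1. The actual df is 38 – 1 = 37, which is not in the table, so use df = 30 (rounding down).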

Hypothesis Test for μ with Unknown σ T.V. Example 6. Do we reject or fail to reject H0 based on the P-value? The P-value is between 0.15 and 0.20, which is greater than α = 0.05. Therefore, we fail to reject H0. 7. State the conclusion. “There is not significant statistical evidence that the average amount of television watched is more than 7 hours per week at the 0.05 level of significance.”
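Bracketing a P-value from a single table row can be sketched as follows. The row of critical values for df = 30 is given as commonly tabulated; it is an assumption that Table D prints the same values. The function name `bracket_upper_tail` is hypothetical, and the t value of 0.9 is only an illustrative number that happens to fall in the same bracket as the example, not the statistic from the class data.

```python
# Upper-tail probabilities (column headings) and the matching critical
# values for df = 30, as commonly tabulated (assumed to match Table D).
UPPER_TAIL = [0.25, 0.20, 0.15, 0.10, 0.05, 0.025,
              0.02, 0.01, 0.005, 0.0025, 0.001, 0.0005]
T_ROW_DF30 = [0.683, 0.854, 1.055, 1.310, 1.697, 2.042,
              2.147, 2.457, 2.750, 3.030, 3.385, 3.646]


def bracket_upper_tail(t, t_row, probs=UPPER_TAIL):
    """Return (low, high) bounds on P(T >= t) for a non-negative t."""
    if t < 0:
        raise ValueError("this sketch handles only t >= 0")
    if t < t_row[0]:
        return probs[0], 0.5       # left of the first column
    if t >= t_row[-1]:
        return 0.0, probs[-1]      # right of the last column
    for i in range(len(t_row) - 1):
        if t_row[i] <= t < t_row[i + 1]:
            return probs[i + 1], probs[i]


low, high = bracket_upper_tail(0.9, T_ROW_DF30)  # illustrative t value
print(f"P-value is between {low} and {high}")
```

For a two-sided test, both bounds would be doubled.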

Matched Pairs t-test To this point we have only looked at tests for single samples. Soon we will look at confidence intervals and hypothesis tests for comparing two groups. When each individual can be given both treatments, we can reduce the two samples to a single sample using a matched pairs design. Examples: Students are each given a pre-test and a post-test to determine the amount of material learned in a given time interval. To examine the effect of a new drug, a large group of identical twins is identified. One twin is given a treatment and the other a placebo. An ophthalmologist is examining the importance of the dominant eye in reading. A large group of subjects is asked to read a passage with the dominant eye covered and again with the non-dominant eye covered. It can be seen in each of these examples that something pairs the two responses.

Matched Pairs t-test To analyze matched pairs data, we first reduce the data from two samples to one sample and then analyze the data using one-sample techniques. The data is reduced from two samples to one by subtracting one of the responses from the other. We could subtract each pre-test score from each post-test score. We could subtract each placebo response from each treatment response. We could subtract the time taken to read the passage with the non-dominant eye from the time taken to read the passage with the dominant eye.
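The reduction from two paired samples to one sample of differences can be sketched in a few lines. The pre-test and post-test scores below are hypothetical, as are the function names; the one-sample t statistic is then computed on the differences with df equal to the number of pairs minus one.

```python
import math
import statistics


def paired_differences(first, second):
    """Subtract the second response from the first, pair by pair."""
    if len(first) != len(second):
        raise ValueError("matched pairs need equal-length samples")
    return [a - b for a, b in zip(first, second)]


def paired_t(first, second, mu0=0.0):
    """One-sample t statistic on the differences, plus its df."""
    d = paired_differences(first, second)
    n = len(d)
    t = (statistics.mean(d) - mu0) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1


post = [78, 85, 69, 90, 74, 88]  # hypothetical post-test scores
pre = [72, 80, 70, 84, 70, 81]   # hypothetical pre-test scores
t, df = paired_t(post, pre)
print(f"t = {t:.3f}, df = {df}")
```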

Matched Pairs t-test Example: Keyboards “Suppose we want to compare two brands of computer keyboards, which we will denote as keyboard 1 and keyboard 2. Keyboard 1 is a standard keyboard, while keyboard 2 is specially designed so that the keys need very little pressure to make them respond. The manufacturer of keyboard 2 would like to claim that typing can be done faster using keyboard 2…A simple random sample of n = 30 teachers was selected from a population of high-school teachers attending a national conference. Each teacher typed the same page of text once using keyboard 1 and once using keyboard 2. For each teacher the order in which the keyboards were used was determined by the toss of a coin. For each teacher the variable measured was the time (in seconds) to correctly type the page of text…” (from Graybill, Iyer and Burdick, Applied Statistics, 1998).

Matched Pairs t-test Example: Keyboards Information given: Sample size: n = 30. Reduction to one sample: for each teacher, take the difference between the two typing times.

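Matched Pairs t-test Example: Keyboards 1. State the null hypothesis: the mean difference in typing time between the two keyboards is 0. 2. State the alternative hypothesis: the mean typing time is lower for keyboard 2 than for keyboard 1, a one-sided alternative (from carefully reading the problem). 3. State the level of significance: assume α = 0.05.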

Matched Pairs t-test Example: Keyboards 4. Calculate the test statistic from the mean and standard deviation of the 30 differences. 5. Find the P-value. Remember the table gives probabilities to the right. Use df = n – 1 = 29.

Matched Pairs t-test Example: Keyboards 6. Do we reject or fail to reject H0 based on the P-value? The P-value is between 0.01 and 0.02, which is less than α = 0.05. Therefore, we reject H0. 7. State the conclusion. “There is significant statistical evidence that the average amount of time needed to type the passage is lower for keyboard 2 than keyboard 1 at the 0.05 level of significance.”

Matched Pairs Confidence Interval After reducing the data to a single sample, we use the same formula as for a confidence interval for μ with unknown σ, namely, \(\bar{x}_d \pm t^* \frac{s_d}{\sqrt{n}}\), where \(\bar{x}_d\) and \(s_d\) are the mean and standard deviation of the differences and n is the number of pairs.

Matched Pairs Confidence Interval Example: Golf Balls “In the manufacture of golf balls two procedures are used. Method I utilizes a liquid center and method II, a solid center. To compare the distance obtained using both types of balls, 12 golfers are allowed to drive a ball of each type, and the length of the drive (in yards) is measured.” (from Milton, McTeer, and Corbet, Introduction to Statistics, 1997) The manufacturer wants to estimate the mean difference with 90% confidence.

Matched Pairs Confidence Interval Example: Golf Balls Information given: Sample size: n = 12. df = n – 1 = 12 – 1 = 11

Matched Pairs Confidence Interval Example: Golf Balls t* is found in Table D. We first go to the 90% confidence level at the bottom. Then we go up to 11 df. Thus, t* = 1.796. The resulting interval, \(\bar{x}_d \pm 1.796\, s_d/\sqrt{12}\), is a 90% confidence interval for the true average difference in the distance traveled for the two types of golf balls.
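A sketch of the matched pairs interval with t* = 1.796 for 11 df. The twelve pairs of drive lengths below are hypothetical stand-ins, not the data from the cited textbook, and `paired_interval` is an illustrative name.

```python
import math
import statistics


def paired_interval(first, second, t_star):
    """Confidence interval for the mean of (first - second)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    center = statistics.mean(diffs)
    margin = t_star * statistics.stdev(diffs) / math.sqrt(n)
    return center - margin, center + margin


# Hypothetical drive lengths in yards for 12 golfers.
liquid = [250, 240, 265, 230, 255, 245, 260, 235, 250, 242, 258, 248]
solid = [245, 238, 260, 232, 250, 240, 255, 236, 244, 240, 252, 245]
low, high = paired_interval(liquid, solid, t_star=1.796)
print(f"90% CI for the mean difference: ({low:.2f}, {high:.2f})")
```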

Comparing Two Means We use the same basic principles for comparing two population means as those used for examining one population mean. If the standard deviations σ1 and σ2 for each of the two populations are known, the two-sample z-statistic is then \(z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}\). But it is very rare that both population standard deviations are known. We will examine the situation in which they are not known.

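Comparing Two Means When we are interested in comparing two population means and we are estimating the population standard deviations σ1 and σ2 with s1 and s2, the two-sample t-statistic is then \(t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}\), with degrees of freedom equal to the smaller of n1 – 1 and n2 – 1 (or an appropriate estimate using computer software). When testing H0: μ1 = μ2, the term μ1 – μ2 in the numerator is 0.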
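A sketch of the two-sample t statistic for testing equal means, using the conservative degrees of freedom (the smaller of n1 – 1 and n2 – 1). The data and the function name `two_sample_t` are hypothetical; statistical software would typically report a different, software-estimated df.

```python
import math
import statistics


def two_sample_t(x1, x2):
    """t statistic for H0: mu1 = mu2 and the conservative df."""
    n1, n2 = len(x1), len(x2)
    # statistics.variance uses the n - 1 divisor, i.e. it returns s^2.
    se = math.sqrt(statistics.variance(x1) / n1
                   + statistics.variance(x2) / n2)
    t = (statistics.mean(x1) - statistics.mean(x2)) / se
    return t, min(n1, n2) - 1


group1 = [3.1, 2.8, 3.4, 3.0, 2.9]       # hypothetical measurements
group2 = [2.6, 2.9, 2.5, 2.7, 2.8, 2.4]  # hypothetical measurements
t, df = two_sample_t(group1, group2)
print(f"t = {t:.3f}, conservative df = {df}")
```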

Comparing Two Means The null hypothesis can be written in either of the following equivalent forms: H0: μ1 = μ2 or H0: μ1 – μ2 = 0. The alternative hypothesis can be any of the following (depending on the question being asked): Ha: μ1 ≠ μ2, Ha: μ1 > μ2, or Ha: μ1 < μ2. The other steps are the same as those used for the tests we have looked at previously.

Comparing Two Means Example: Tomatoes “There has been some discussion among amateur gardeners about the virtues of black plastic versus newspapers as weed inhibitors for growing tomatoes. To compare the two, several rows of tomatoes are planted. Black plastic is used around nine randomly selected plants and newspaper around the remaining ten. All plants start at virtually the same height and receive the same care. The response of interest is the height in feet after a month’s growth.” (from Milton, McTeer, and Corbet, Introduction to Statistics, 1997). Perform a test to see if there is any difference between the average heights with significance level 0.10.

Comparing Two Means Example: Tomatoes Information given: Sample sizes: n1 = 9, n2 = 10.

Comparing Two Means Example: Tomatoes 1. State the null hypothesis: H0: μ1 = μ2. 2. State the alternative hypothesis: Ha: μ1 ≠ μ2 (from “any difference between”). 3. State the level of significance: α = 0.10.

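Comparing Two Means Example: Tomatoes 4. Calculate the test statistic. 5. Find the P-value. Remember the table gives probabilities to the right, so for this two-sided test the right-tail area is doubled. Use df = 8, the smaller of n1 – 1 = 8 and n2 – 1 = 9.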

Comparing Two Means Example: Tomatoes 6. Do we reject or fail to reject H0 based on the P-value? The P-value is between 0.10 and 0.20, which is greater than α = 0.10. Therefore, we fail to reject H0. 7. State the conclusion. “There is not significant statistical evidence that the average tomato plant heights are different for the two types of weed inhibitors at the 0.10 level of significance.”

Comparing Two Means The confidence interval for the difference of two population means (μ1 – μ2) is \((\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\), where t* comes from Table D and corresponds to the confidence level desired and df = smaller of n1 – 1 and n2 – 1.
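The interval can be sketched from summary statistics alone. The means and standard deviations below are hypothetical; t* = 2.131 is the commonly tabulated 95% critical value for 15 df, which is the conservative df when both samples have 16 observations. The function name `two_sample_interval` is illustrative.

```python
import math


def two_sample_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """(xbar1 - xbar2) +/- t* * sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - t_star * se, diff + t_star * se


# Hypothetical summary statistics for two samples of 16 observations.
low, high = two_sample_interval(mean1=2.4, s1=0.6, n1=16,
                                mean2=2.0, s2=0.5, n2=16,
                                t_star=2.131)
print(f"95% CI for mu1 - mu2: ({low:.3f}, {high:.3f})")
```

If the interval contains 0, the data are consistent with no difference between the two population means at the corresponding two-sided significance level.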

Comparing Two Means Example: Commercials “There is some concern that TV commercial breaks are becoming longer. The observations on the following slide are obtained on the length in minutes of commercial breaks for the 1984 viewing season and the current season.” (from Milton, McTeer, and Corbet, Introduction to Statistics, 1997) Find a 95% confidence interval for the difference between the true averages of the two seasons.

Comparing Two Means Example: Commercials Information given: Sample sizes: n1 = 16, n2 = 16.

Comparing Two Means Example: Commercials t* is found in Table D. We first go to the 95% confidence level at the bottom. Then we go up to 15 df. Thus, t* = 2.131. The resulting interval, \((\bar{x}_1 - \bar{x}_2) \pm 2.131 \sqrt{s_1^2/16 + s_2^2/16}\), is a 95% confidence interval for the true difference of average length in minutes for commercial breaks between 1984 and the present.

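Pooled t test: Comparing Two Means The null hypothesis can be written in either of the following equivalent forms: H0: μ1 = μ2 or H0: μ1 – μ2 = 0. The alternative hypothesis can be any of the following (depending on the question being asked): Ha: μ1 ≠ μ2, Ha: μ1 > μ2, or Ha: μ1 < μ2.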

Pooled Estimator Previously, we discussed two-sample t procedures from two populations with two unknown standard deviations. We then used the sample standard deviations to estimate the population standard deviations. But what about when the two populations have the same standard deviation σ? In that case both sample variances estimate the same σ², and we combine them: \(s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\). This estimate is called the pooled estimator of σ² because it combines the information in both samples.

Test Statistic Suppose that an SRS of size n1 is drawn from a normal population with unknown mean μ1 and that an independent SRS of size n2 is drawn from another normal population with unknown mean μ2. Suppose also that the two populations have the SAME standard deviation. Then, to test H0: μ1 = μ2, the pooled two-sample t statistic is \(t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}\), with degrees of freedom equal to n1 + n2 – 2. Here \(s_p\) is the square root of the pooled estimator of σ².

Confidence Interval A level C confidence interval for μ1 – μ2 is \((\bar{x}_1 - \bar{x}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\), where t* comes from Table D and corresponds to the confidence level desired and df = n1 + n2 – 2.
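The pooled procedures can be sketched from summary statistics. The pooled variance weights each sample variance by its degrees of freedom; the test statistic and interval then use the pooled standard deviation with df = n1 + n2 – 2. All function names and numbers below are hypothetical, and the sketch assumes the equal-standard-deviation condition has already been judged reasonable.

```python
import math


def pooled_variance(s1, n1, s2, n2):
    """Weighted average of the two sample variances."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled t statistic for H0: mu1 = mu2, and its df."""
    sp = math.sqrt(pooled_variance(s1, n1, s2, n2))
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se, n1 + n2 - 2


def pooled_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """(xbar1 - xbar2) +/- t* * sp * sqrt(1/n1 + 1/n2)."""
    sp = math.sqrt(pooled_variance(s1, n1, s2, n2))
    margin = t_star * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Hypothetical summary statistics with n1 = 9 and n2 = 10 (df = 17).
t, df = pooled_t(10.0, 2.0, 9, 8.0, 2.0, 10)
print(f"t = {t:.3f}, df = {df}")
```

Compared with the conservative df (the smaller of n1 – 1 and n2 – 1), the pooled df is larger, which gives a smaller t* at the cost of the extra equal-variance assumption.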