Chapter 10 Estimation and Hypothesis Testing II: Independent and Paired Sample T-Test
Learning Objectives:
1. Describe the difference between an independent and a dependent sample.
2. Describe the difference between, and the uses of, an independent sample t-test and a paired sample t-test.
3. List the assumptions for the independent sample t-test.
4. Test a hypothesis using two independent samples.
5. Construct confidence intervals for the difference in means for the independent sample t-test.
6. List the assumptions for the paired sample t-test.
7. Test a hypothesis using two dependent samples.
8. Construct confidence intervals for the difference in means for the paired sample t-test.
© 2012 McGraw-Hill Ryerson Ltd.
Independent Samples: Two samples are independent of one another if they consist of different individuals and the selection of the individuals in the first group does not influence the selection of the second group. For example, you want to determine whether grade 9 students involved in after-school extracurricular activities have different grade point averages than grade 9 students who are not involved in after-school extracurricular activities. LO1
Dependent Samples: Two samples are dependent on one another if they consist of the same individuals and/or the selection of the individuals in the first group determines the selection of the second group. For example, you want to know if a certain diet is effective at reducing LDL cholesterol levels (the bad kind of cholesterol). You randomly select 20 individuals and measure their LDL cholesterol levels. You then have the 20 individuals follow the diet for three months. At the end of the three months, you measure the LDL cholesterol levels of the same 20 individuals. You now have two samples (or groups) taken from the same individuals, so the samples are dependent. LO1
Independent and Paired Sample t-test: Independent sample t-test: When we have two samples that are independent of each other and we want to compare the means of the two samples, we use an independent sample t-test. Paired sample t-test: When we have two samples that are dependent on each other and we want to compare the means of the two samples, we use a paired sample t-test. LO2
Assumptions of the Independent Sample t-test:
1. The variables must be measured at either the interval or ratio level of measurement.
2. The two groups from which you collect the data must be independent of one another.
3. The data must be normally distributed.
4. The population variances must be equal for both groups; in practice this means the sample variances are not statistically significantly different. This is also called the assumption of homogeneity of variances.
LO3
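Independent Sample t-test for Two Means: The general formula for the independent sample t-test is:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$$
Technically, the formula for the independent sample t-test is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$
Here $s_{\bar{x}_1 - \bar{x}_2}$ is the standard error of the difference in sample means. Since we are testing the null hypothesis $\mu_1 = \mu_2$, under which $\mu_1 - \mu_2 = 0$, we set the term $(\mu_1 - \mu_2)$ to zero. Since $(\bar{x}_1 - \bar{x}_2) - 0$ is the same as writing $(\bar{x}_1 - \bar{x}_2)$, we often leave the term $(\mu_1 - \mu_2)$, which equals 0, out of the equation. LO4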
Independent Sample t-test for Two Means: The point estimator for the difference in means is still $\bar{x}_1 - \bar{x}_2$. However, as we are assuming that $\sigma_1^2 = \sigma_2^2$, we label this common variance $\sigma^2$ and estimate it by pooling the two sample variances. The pooled estimate of the variance is given by:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
For an independent sample t-test, the degrees of freedom are calculated as $df = n_1 + n_2 - 2$. LO4
Independent Sample t-test for Two Means: The pooled variance is the weighted average of the two sample variances, with weights $(n_1 - 1)$ and $(n_2 - 1)$. The group with more observations is more reliable and so gets a higher weighting. The standard error for a difference in two means is given in (9.2). As we assume that the two population variances are equal, we estimate them both with the common pooled variance $s_p^2$. So the standard error becomes:
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
LO4
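As a minimal sketch of the pooling step described above, the following Python computes the pooled variance and the resulting standard error from summary statistics. The function names and the input numbers are hypothetical, chosen only for illustration; they are not taken from the slides.

```python
import math


def pooled_variance(var1, n1, var2, n2):
    """Weighted average of two sample variances, with weights (n - 1)."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def standard_error(sp_sq, n1, n2):
    """Standard error of the difference in means under equal variances."""
    return math.sqrt(sp_sq * (1 / n1 + 1 / n2))


# Hypothetical summary statistics (illustrative only, not from the slides).
var1, n1 = 4.0, 6    # smaller group
var2, n2 = 9.0, 11   # larger group

sp_sq = pooled_variance(var1, n1, var2, n2)
se = standard_error(sp_sq, n1, n2)

print(round(sp_sq, 4))  # 7.3333
print(round(se, 4))     # 1.3744
```

With these illustrative inputs the pooled variance (about 7.33) sits closer to the variance of the larger group (9.0) than the simple average of the two variances (6.5) does, which is the weighting effect described above.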
Independent Sample t-test for Two Means: The confidence interval for $(\mu_1 - \mu_2)$ uses the t-distribution for small sample sizes. The degrees of freedom for $s_1^2$ are $(n_1 - 1)$ and for $s_2^2$ are $(n_2 - 1)$. When we pool the information we have $(n_1 - 1) + (n_2 - 1) = (n_1 + n_2 - 2) = f$, where $f$ is the degrees of freedom. The 95% confidence interval for $(\mu_1 - \mu_2)$ is the point estimator plus or minus the t-table value times the standard error. LO5
Independent Sample t-test for Two Means: Then, a 95% confidence interval for $\mu_1 - \mu_2$ is:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{0.025:f} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where the degrees of freedom for the t-distribution are $f = n_1 + n_2 - 2$. LO5
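The interval above can be sketched in Python from raw data. This is an illustration under stated assumptions: the two small samples are hypothetical, the helper name `pooled_ci` is invented here, and the critical value 2.306 is the standard t-table value for 0.025 in the right tail with 8 degrees of freedom.

```python
import math
import statistics


def pooled_ci(sample1, sample2, t_crit):
    """Confidence interval for mu1 - mu2 using the pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    mean_diff = statistics.mean(sample1) - statistics.mean(sample2)
    # statistics.variance uses the (n - 1) divisor, i.e. the sample variance.
    sp_sq = ((n1 - 1) * statistics.variance(sample1)
             + (n2 - 1) * statistics.variance(sample2)) / (n1 + n2 - 2)
    margin = t_crit * math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return mean_diff - margin, mean_diff + margin


# Hypothetical data (illustrative only, not from the slides).
group_a = [12, 15, 11, 14, 13]
group_b = [10, 9, 12, 11, 8]

# t-table value with 0.025 in the right tail and f = 5 + 5 - 2 = 8.
T_TABLE_025_F8 = 2.306

lower, upper = pooled_ci(group_a, group_b, T_TABLE_025_F8)
print(round(lower, 3), round(upper, 3))  # 0.694 5.306
```

Because this illustrative interval does not contain 0, the corresponding two-sided test at the 5% level would reject equal means for these made-up data.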
Test of Hypotheses: Recall the 5 steps of hypothesis testing:
1. Define the null and alternative hypotheses.
2. Define the sampling distribution and critical values.
3. Calculate the test statistic using the sample data.
4. Make the decision regarding the hypothesis.
5. Interpret the results.
LO4
Test of Hypotheses: If the data are collected using two independent samples, then we use the independent sample t-test. Again the hypotheses are: $H_0: \mu_1 = \mu_2$ and $H_a: \mu_1 \neq \mu_2$. We can also perform one-sided tests if we have prior knowledge of the direction of the difference we are looking for. LO4
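Test of Hypotheses: We first calculate the pooled sample variance:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
The t-test for the difference in two means (independent sample case) is then:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
LO4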
Test of Hypotheses: We then reject the null hypothesis of equal means if the t-value is too large or too small. The degrees of freedom associated with the two-sample t-test are $f = n_1 + n_2 - 2$. For a two-sided test at the 5% significance level, we look up the t-value with 0.025 in the right tail and get the value $t_{0.025:f}$. The null hypothesis of equal means is rejected if $|t| > t_{0.025:f}$, where $t_{0.025:f}$ is the table value with $f = n_1 + n_2 - 2$ degrees of freedom. LO4
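The decision rule above can be sketched as follows. The summary statistics are hypothetical and the function name `independent_t` is invented for this illustration; the critical value 2.101 is the standard t-table value for 0.025 in the right tail with 18 degrees of freedom.

```python
import math


def independent_t(mean1, var1, n1, mean2, var2, n2):
    """Return the pooled-variance t statistic and its degrees of freedom."""
    f = n1 + n2 - 2
    sp_sq = ((n1 - 1) * var1 + (n2 - 1) * var2) / f
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, f


# Hypothetical summary statistics (illustrative only, not from the slides).
t_stat, f = independent_t(mean1=25.0, var1=20.0, n1=10,
                          mean2=22.0, var2=16.0, n2=10)

# t-table value with 0.025 in the right tail and f = 18.
T_TABLE_025_F18 = 2.101

# Two-sided test at the 5% level: reject H0 when |t| exceeds the table value.
reject_h0 = abs(t_stat) > T_TABLE_025_F18

print(f)                 # 18
print(round(t_stat, 4))  # 1.5811
print(reject_h0)         # False
```

For these made-up numbers the t-value falls inside the critical region boundaries, so the null hypothesis of equal means would not be rejected.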
Independent Sample t-test for Two Means – Example: "There is no criminal type," according to Dr. Charles Goring, Deputy Medical Officer of H.M. Prison, London. According to The New York Times of November 2, 1913, he measured facial characteristics on 3000 convicts. The article explains that there was no significant difference between the facial measurements of convicts and those of the general population. A similar data set was collected in 1904. LO4
Independent Sample t-test for Two Means – Example: The following data represent left ear measurements on convicts at Parkhurst prison in 1904. Two groups are used here: Ordinary Murderers and Other Criminals.
Ordinary Murderers, Other Criminals: 59 63 60 56 58 62 50 61 68 55
LO4
Independent Sample t-test for Two Means – Example: For these data the summary statistics are:
Quantity | Ordinary Murderers | Other Criminals
Sample means | $\bar{x}_1$ | $\bar{x}_2$
Sample variances | $s_1^2$ | $s_2^2$
Sample sizes | $n_1 = 10$ | $n_2 = 10$
Population means | $\mu_1$ | $\mu_2$
Population variances | $\sigma_1^2$ | $\sigma_2^2$
Source: Griffiths, G.B. (1904). Measurements of One Hundred and Thirty Criminals. Biometrika, 3, 60-62. LO4
Assumptions for the Paired Sample t-test:
1. The variables must be measured at either the interval or ratio level of measurement.
2. The two groups must be dependent on one another.
3. The differences between the two samples must be relatively normally distributed.
LO6
Paired Sample t-test for Two Means: In the previous example we compared the means of two types of criminals. The subjects were not related in any way, so we consider them to be independent samples. The variance parameter measures the variability in ear measurements of people, and can be quite large because people are highly variable. When we compare two groups of people, we can reduce the variability in the data by using pairs of individuals. For example, toothpaste commercials compare two brands of toothpaste by measuring the number of cavities over a six-month period. LO7
Paired Sample t-test for Two Means: They could use two groups of subjects and randomly assign them to a brand of toothpaste. Instead, they normally pick sets of twins and randomly assign one twin from each pair to each brand of toothpaste. In this design, the variance measures variability within twins rather than variability between people, so we expect it to be much smaller. We can choose twins, or pairs of subjects matched on demographic measurements such as age and sex. The idea is to control for differences due to a number of other factors and so reduce the standard errors of our estimates. LO7
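As a sketch of how pairing enters the calculation, the usual paired-sample quantities can be written in terms of the within-pair differences $d_i$. The symbols $\bar{d}$ and $s_d$ are notation assumed here for illustration; the table-value notation $t_{0.025:f}$ follows the independent sample case.

```latex
% Within-pair differences, their mean, and their sample variance
d_i = x_{1i} - x_{2i}, \qquad
\bar{d} = \frac{\sum d_i}{n}, \qquad
s_d^2 = \frac{\sum d_i^2 - \left(\sum d_i\right)^2 / n}{n - 1}

% Test statistic for H_0: \mu_d = 0, with f = n - 1 degrees of freedom
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad f = n - 1

% 95% confidence interval for the mean difference
\bar{d} \pm t_{0.025:f} \, \frac{s_d}{\sqrt{n}}
```

Because the test works on one difference per pair, $n$ is the number of pairs and the degrees of freedom are $n - 1$ rather than $n_1 + n_2 - 2$.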
Paired Sample t-test for Two Means – Example: Consider the left and right ear measurements on the "other criminals" in the example. Note that $\sum d = 8$ and $\sum d^2 = 84$.
Subject | Right ear ($x_1$) | Left ear ($x_2$) | Difference $d = x_1 - x_2$
1 63 2 57 56 3 62 4 58 59 -1 5 -4 6 50 7 8 61 9 55 10
LO7
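One way the sums reported above could be turned into a paired t-test and a 95% confidence interval is sketched below. It uses only the values stated in the example ($\sum d = 8$, $\sum d^2 = 84$, and the 10 subjects in the table) plus the standard t-table value 2.262 for 0.025 in the right tail with 9 degrees of freedom. The variable names are illustrative, and this is not necessarily the calculation shown on the original slide.

```python
import math

n = 10         # number of paired subjects in the table
sum_d = 8      # sum of the differences, as stated in the example
sum_d_sq = 84  # sum of the squared differences, as stated in the example

d_bar = sum_d / n                              # mean difference
var_d = (sum_d_sq - sum_d ** 2 / n) / (n - 1)  # sample variance of d
se_d = math.sqrt(var_d / n)                    # standard error of d_bar
t_stat = d_bar / se_d
f = n - 1                                      # degrees of freedom

# t-table value with 0.025 in the right tail and f = 9.
T_TABLE_025_F9 = 2.262

reject_h0 = abs(t_stat) > T_TABLE_025_F9
ci = (d_bar - T_TABLE_025_F9 * se_d, d_bar + T_TABLE_025_F9 * se_d)

print(round(t_stat, 3))                # 0.862
print(reject_h0)                       # False
print(round(ci[0], 2), round(ci[1], 2))  # -1.3 2.9
```

Under these inputs the t-value is well below the table value and the interval contains 0, so the data would not support a difference between mean right and left ear measurements.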
Conclusion: The independent sample t-test and the paired sample t-test are useful methods for testing hypotheses about two sample means. When the two samples are independent, use the independent sample t-test. When the two samples are dependent, use the paired sample t-test. Research often involves comparing more than two groups, in which case we need to use what is called an analysis of variance (ANOVA).