Comparing Means: t-tests Wednesday 22 February 2012/ Thursday 23 February 2012.


Judging whether differences occur by chance…
How do we judge whether it is plausible that two population means are the same and that any difference between sample means simply reflects sampling error?
Example: Household size of minority ethnic groups (HOH = Head of household; data adapted by Richard Lampard from 1991 Census)
1. The size of the difference between the two sample means
Indian HOH: mean 3.0; Bangladeshi HOH: mean 5.0
Indian HOH: mean 3.0; Pakistani HOH: mean 4.0
The first difference is more ‘convincing’.

Judging whether differences occur by chance…
2. The sample sizes of the two samples
Pakistani HOH: 3 4 5 (mean 4.0); Bangladeshi HOH: 4 5 6 (mean 5.0)
Pakistani HOH: 2 2 3 4 4 4 5 5 5 6 (mean 4.0); Bangladeshi HOH: 2 3 4 4 5 5 6 6 7 8 (mean 5.0)
The second difference is more ‘convincing’.

Judging whether differences occur by chance…
3. The amount of variation in each of the two groups (samples)
Pakistani HOH: 2 2 3 4 4 4 5 5 5 6 (mean 4.0); Bangladeshi HOH: 2 3 4 4 5 5 6 6 7 8 (mean 5.0)
Pakistani HOH: 4 4 4 4 4 4 4 4 4 4 (mean 4.0); Bangladeshi HOH: 5 5 5 5 5 5 5 5 5 5 (mean 5.0)
The second difference is more ‘convincing’.
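The contrast in point 3 can be checked numerically. The sketch below is only an illustration using Python's standard library (the function name `summarise` is an assumption, not part of the lecture): it computes the mean and sample standard deviation of each group, showing that both pairs have the same means but very different amounts of variation.

```python
from statistics import mean, stdev

# Household sizes taken from the example in point 3.
varied = {
    "Pakistani HOH": [2, 2, 3, 4, 4, 4, 5, 5, 5, 6],
    "Bangladeshi HOH": [2, 3, 4, 4, 5, 5, 6, 6, 7, 8],
}
constant = {
    "Pakistani HOH": [4] * 10,
    "Bangladeshi HOH": [5] * 10,
}

def summarise(groups):
    """Return {group name: (mean, sample standard deviation)}."""
    return {name: (mean(values), stdev(values)) for name, values in groups.items()}

for label, groups in (("varied", varied), ("constant", constant)):
    for name, (m, s) in summarise(groups).items():
        print(f"{label:8s} {name:16s} mean={m:.1f} sd={s:.2f}")
```

In both pairs the means differ by 1.0, but in the second pair the standard deviations are zero, so the difference cannot be put down to variation within the groups.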

Example: the impact of variability on the difference of means.

The logic of a statistical test…
The two statistical tests that we have looked at so far:
1. Testing the plausibility of a suggested population mean (via a t-test [or z-test]): Is the sample mean sufficiently different from the suggested population mean that it is implausible that the suggested population mean is correct?
2. Chi-square test: Are the observed frequencies in a table sufficiently different from what one would have expected to see if there were no relationship in the population that the idea of no relationship in the population is implausible?
Both ask whether the difference between the actual (observed) data and what one would have expected to see given a particular hypothesis is sufficiently large that the hypothesis is implausible.

The logic of a statistical test…
…the same logic applies to comparing sample means. If the two samples came from populations with identical population means, then one would expect the difference between the sample means to be (close to) zero. Thus, the larger the difference between the two sample means, the more implausible is the idea that the two population means are identical. How implausible also depends on the sample size and on the extent to which there is variation within each group (or sample).

t-tests
Test the null hypothesis: $H_0: \mu_1 = \mu_2$ or $H_0: \mu_1 - \mu_2 = 0$
The alternative hypothesis is: $H_1: \mu_1 \neq \mu_2$ or $H_1: \mu_1 - \mu_2 \neq 0$

What does a t-test measure? Note: T = treatment group and C = control group (from experimental research). In most discussions these will just be shown as groups 1 and 2, indicating different groups.

Example
We want to compare the average amount of television watched by Australian and by British children. We have a sample of Australian and a sample of British children. We could say that what we have is something like this:
[Diagram: we want to compare the population of Australian children with the population of British children, by inference from a sample of Australian children and a sample of British children.]

t distribution critical values. For a more detailed view of statistics go all the way to Australia: SurfStat.
Example contd.
Here the dependent variable is hours of TV and the independent variable is nationality. When we are comparing means, SPSS calls the independent variable the grouping variable and the dependent variable the test variable.

Example contd.
If the null hypothesis of no difference between the two groups is correct (and children watch the same amount of television in Australia and Britain), we would assume that if we took repeated samples from the two groups the difference in means between them would generally be small or zero. However, it is likely that the difference between any two particular samples will not be exactly zero. Therefore we build up a sampling distribution of the difference between the two sample means. We use this distribution to determine the probability of getting a difference at least as large as the one observed if the samples come from populations with no difference.

If we take a large number of random samples and calculate the difference between each pair of sample means, we will end up with a sampling distribution that has the following properties:
1. It will be a t-distribution.
2. (Under the null hypothesis) the mean of the difference between sample means will be zero: mean of $M_1 - M_2 = 0$.
3. The spread of scores around this mean of zero (the standard error) will be defined by the formula:
$$SDM = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$
where $s_1$ and $s_2$ are the sample standard deviations and $N_1$ and $N_2$ are the sample sizes. This is called the pooled variance estimate.
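The idea of repeated sampling can be tried out with a small simulation. The sketch below is an illustration only: the population mean (175 minutes), population standard deviation (30 minutes) and number of replications are assumed values chosen for the demonstration, and the function name `simulate_differences` is hypothetical. Both samples are drawn from the same normal population, so the null hypothesis is true by construction.

```python
import random
from statistics import mean, stdev

def simulate_differences(pop_mean=175.0, pop_sd=30.0, n=20, reps=5000, seed=1):
    """Draw `reps` pairs of samples of size `n` from the SAME normal
    population and return the difference between each pair of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        sample_1 = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        sample_2 = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        diffs.append(sum(sample_1) / n - sum(sample_2) / n)
    return diffs

diffs = simulate_differences()
print(f"mean of the differences: {mean(diffs):.2f}")
print(f"spread of the differences (standard error): {stdev(diffs):.2f}")
```

With these assumed values the differences centre on zero, and their spread should be close to the theoretical standard error of 30 × √(1/20 + 1/20) ≈ 9.5 minutes. Individual pairs of samples still differ by several minutes purely through sampling error.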

Back to example…
Table 1. Descriptive statistics for the samples
Descriptive statistic | Australian sample | British sample
Mean | 166 minutes | 187 minutes
Standard deviation | 29 minutes | 30 minutes
Sample size | 20 | 20
When we are choosing the test of significance it is important that:
1. We are making an inference from TWO samples (of Australian and of British children). Therefore we need a two-sample test.
2. The two samples are being compared in terms of an interval-ratio variable, hours of TV watched. Therefore the relevant descriptive statistic is the mean.
These facts lead us to select the two-sample t-test for the equality of means as the relevant test of significance.

Calculating the t-score
Using the sample values (Australian sample: mean 166 minutes, standard deviation 29 minutes, sample size 20; British sample: mean 187 minutes, standard deviation 30 minutes, sample size 20):
$$SDM = \sqrt{\frac{(20-1)29^2 + (20-1)30^2}{20 + 20 - 2}} \sqrt{\frac{20 + 20}{20 \times 20}} = 9.3$$
$$t_{sample} = \frac{166 - 187}{9.3} = -2.3$$
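The same calculation can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function names `pooled_standard_error` and `t_score` are illustrative choices, not names from the lecture or from SPSS.

```python
import math

def pooled_standard_error(s1, n1, s2, n2):
    """Standard error of the difference between two sample means,
    based on the pooled variance estimate."""
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance) * math.sqrt((n1 + n2) / (n1 * n2))

def t_score(m1, s1, n1, m2, s2, n2):
    """Two-sample t-score: difference in means divided by its standard error."""
    return (m1 - m2) / pooled_standard_error(s1, n1, s2, n2)

# Australian sample: mean 166, SD 29, n 20; British sample: mean 187, SD 30, n 20
sdm = pooled_standard_error(29, 20, 30, 20)
t = t_score(166, 29, 20, 187, 30, 20)
print(f"SDM = {sdm:.1f}, t = {t:.1f}")  # SDM = 9.3, t = -2.3
```

The sign of t only reflects which sample is listed first; here the Australian mean is subtracted from first, so t is negative.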

Obtaining a p-value for a t-score
To obtain the p-value for this t-score we need to consult the table of critical values for the t-distribution (see Appendix A.2. in Field). The number of degrees of freedom we refer to in the table is the combined sample size minus two: $df = N_1 + N_2 - 2$. Here that is 20 + 20 – 2 = 38. The table doesn’t have a row of probabilities for 38 degrees of freedom, so we refer to the row for the nearest reported number of degrees of freedom below the desired number. Here that is 35. On a two-tail test, the size of t sample (2.3, ignoring the sign) falls between the two stated t-scores of 2.03 and 2.72. The p-value therefore falls between the significance levels for these scores, that is, between 0.01 and 0.05. The difference is therefore statistically significant at the 0.05 level but not at the 0.01 level.
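As a cross-check on the table lookup, the two-tail p-value can also be approximated numerically. The sketch below is one possible approach, not the method used in the lecture: it integrates the density of the t-distribution with Simpson's rule, using only Python's standard library. The function names `t_pdf` and `two_tail_p` and the number of integration steps are assumptions made for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tail_p(t, df, steps=2000):
    """Two-tail p-value: 1 minus the area between -|t| and +|t|.
    The area from 0 to |t| is found with Simpson's rule (`steps` must be even)
    and doubled, because the distribution is symmetric about zero."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return 1 - 2 * half_area

p = two_tail_p(-2.3, 38)
print(f"p = {p:.3f}")  # lies between 0.01 and 0.05
```

Calling `two_tail_p(2.03, 35)` and `two_tail_p(2.72, 35)` gives values very close to 0.05 and 0.01, matching the two table entries used above.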

Reporting the results
We can say that: The mean number of minutes of TV watched by the sample of 20 British children is 187 minutes, which is 21 minutes higher than the mean for the sample of 20 Australian children, and this difference is statistically significant at the 0.05 level (t(38) = −2.3, p = 0.03, two-tail). Based on these results we can reject the hypothesis that British and Australian children watch the same average amount of television every night.

Calculating the effect size
To discover whether the effect is substantive we want to know its size. You can convert a t-value into an r-value (a PRE statistic) with the following equation:
$$r = \sqrt{\frac{t^2}{t^2 + df}} = \sqrt{\frac{(-2.3)^2}{(-2.3)^2 + 38}} = 0.34$$
(The value 0.34 comes from the unrounded t-score of about −2.25; using the rounded −2.3 gives 0.35.) This is a medium-sized effect.
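The conversion from t to r is a one-line calculation. The sketch below is illustrative only (the function name `effect_size_r` is an assumption) and shows how much the result depends on whether the t-score is rounded first.

```python
import math

def effect_size_r(t, df):
    """Convert a t-score into the effect size r = sqrt(t^2 / (t^2 + df))."""
    return math.sqrt(t**2 / (t**2 + df))

print(round(effect_size_r(-2.3, 38), 2))   # 0.35, using t rounded to one decimal place
print(round(effect_size_r(-2.25, 38), 2))  # 0.34, using the less heavily rounded t of about -2.25
```

Because t is squared, the sign of the t-score does not affect r.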
