1 The t-distribution
General comment on z and t
Moving from z to t
Same concept, different assumptions.
You can use a z-test only if you know the population SD (σ).
When you have to estimate σ, use the t-distribution: the t-test estimates the population SD from the sample SD.
The t-test is also more robust against departures from normality (they don't affect the accuracy of the estimated p-value as much).
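Calculating the t-statistic
We don't know the population SD.
Step 1: Estimate σ with "s", the sample SD, where M is the sample mean and n the sample size: $s = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$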
Calculating the t-statistic
We don't know the population SD.
Step 2: Use "s" to estimate the standard error of the mean, $s_M$: $s_M = \frac{s}{\sqrt{n}}$
Calculating the t-statistic
We know the population mean (μ), but not the SD.
Step 3: Use the estimated standard error $s_M$ in the t-statistic, where M is the sample mean: $t = \frac{M - \mu}{s_M}$
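The three steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `one_sample_t` and the scores are made up for the example, and `statistics.stdev` is used because it divides by n - 1.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return (s, se, t) for a one-sample t-test against a known population mean mu."""
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)   # Step 1: sample SD (n - 1 in the denominator)
    se = s / math.sqrt(n)          # Step 2: estimated standard error of the mean
    t = (m - mu) / se              # Step 3: t-statistic
    return s, se, t

# Hypothetical scores, invented for illustration only.
scores = [52, 48, 55, 51, 49, 53, 50, 54]
s, se, t = one_sample_t(scores, mu=50)
print(f"s = {s:.3f}, SE = {se:.3f}, t = {t:.3f}")
```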
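t-statistic: factors in significance
The size of the estimated SE depends on both the sample SD and the sample size.
So the factors affecting the size of the calculated t are the mean difference, the sample SD, and the sample size.
A bigger mean difference, a smaller sample SD, or a bigger sample each makes t bigger.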
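A quick way to see the three factors at work is to compute t from summary numbers and change one at a time. The helper `t_from_summary` and all the values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def t_from_summary(mean_diff, s, n):
    """t from summary statistics: mean difference, sample SD, sample size."""
    return mean_diff / (s / math.sqrt(n))

base = t_from_summary(mean_diff=2.0, s=4.0, n=16)
bigger_diff = t_from_summary(mean_diff=4.0, s=4.0, n=16)  # double the mean difference
smaller_sd = t_from_summary(mean_diff=2.0, s=2.0, n=16)   # halve the sample SD
bigger_n = t_from_summary(mean_diff=2.0, s=4.0, n=64)     # quadruple the sample size

print(base, bigger_diff, smaller_sd, bigger_n)
```

Note that the sample size enters through its square root, so it takes four times as many observations to double t.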
Sampling distribution of t
Moving from the z-distribution to the t-distribution.
Still about estimating probabilities, but the properties of the z- and t-distributions are different.
Sampling distribution of t
The t-distribution varies with sample size.
The good old 1.96 for 95% is toast: the cutoff now depends on how big your sample is.
Sampling distribution of t
df = n - 1 (see next 2 slides).
Because the distribution gets flatter, with heavier tails, as n gets smaller, the t needed for significance gets bigger as n gets smaller.
[Table: critical values of t by df and α (significance level)]
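To see the critical value shrink toward 1.96 as df grows, here is a rough sketch using only the Python standard library. It integrates the t density numerically (Simpson's rule) and finds the cutoff by bisection; the function names are invented for this example, and in practice you would read the value off a table or use a statistics package. The search range of 0 to 100 is an assumption that holds for the usual α levels.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def critical_t(df, alpha=0.05):
    """Two-tailed critical t, found by bisection on the CDF over [0, 100]."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (2, 5, 10, 30, 120):
    print(f"df = {df:3d}  critical t = {critical_t(df):.3f}")
print(f"z cutoff     = {NormalDist().inv_cdf(0.975):.3f}")
```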
(digression – degrees of freedom)
Degrees of freedom: the number of independent pieces of information a sample of observations can provide for purposes of statistical inference.
Why doesn't df = "n" (sample size)? Not all "n" are free to vary.
You "give up" one degree of freedom for every population parameter you use the sample to estimate.
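One concrete way to see why not all "n" are free to vary: once the sample mean is fixed, the deviations from it must sum to zero, so the last one is determined by the other n - 1. The data below are hypothetical.

```python
import statistics

sample = [4, 8, 6, 10]                  # hypothetical data, n = 4
m = statistics.mean(sample)
deviations = [x - m for x in sample]

# Deviations from the sample mean always sum to zero...
print(sum(deviations))

# ...so knowing any n - 1 of them pins down the last one.
known = deviations[:-1]
implied_last = -sum(known)
print(implied_last, deviations[-1])
```

Only three of the four deviations carry independent information here, which is where df = n - 1 comes from when the mean is estimated from the sample.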
(digression – degrees of freedom)
Degrees of freedom: got it? No? Look, it's really about high-dimensional geometry anyway, so just be content with this reality: df = n - 1.
Why does this matter? DF indicate how accurately you can predict the population from your sample: the more DF you have, the more accurate this prediction will be (and therefore, all else being equal, the more likely it is that you get significant results).
Overall logic again
Get your critical t from the table, using df = n - 1 and your chosen α.
Calculate your actual t using $t = \frac{M - \mu}{s_M}$, where $s_M = \frac{s}{\sqrt{n}}$.
If the calculated t is further from 0 than the table t, you reject the null.
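The whole procedure fits in one short function. This is a sketch under stated assumptions: `reject_null` is a made-up name, the data are invented, and the critical value 2.262 is the standard two-tailed table entry for df = 9 at α = .05, passed in by hand just as you would look it up.

```python
import math
import statistics

def reject_null(sample, mu, critical_t):
    """Return (t, decision): reject the null when |t| exceeds the table value."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # estimated SE of the mean
    t = (statistics.mean(sample) - mu) / se        # calculated t
    return t, abs(t) > critical_t

# Hypothetical sample with n = 10, so df = 9.
sample = [53, 49, 56, 52, 50, 54, 51, 55, 57, 53]
t, rejected = reject_null(sample, mu=50, critical_t=2.262)
print(f"t = {t:.3f}, reject null: {rejected}")
```

Comparing the absolute value of t against the table value makes this a two-tailed test, which matches "more different from 0" in either direction.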