Independent-Samples t-test 10/8
Comparing Two Groups
- Often interested in whether two groups have the same mean
  - Experimental vs. control conditions: comparing learning procedures, with vs. without drug, lesions, etc.
  - Men vs. women, depressed vs. not
- Comparison of two separate populations
  - Population A: sample A of size nA, with mean MA estimating mA
  - Population B: sample B of size nB, with mean MB estimating mB
  - Question: does mA = mB?
- Example: maze times
  - Rats with hippocampus: Sample A = [43, 26, 35, 31, 28]
  - Rats without hippocampus: Sample B = [37, 31, 27, 46, 33]
  - MA = 32.6, MB = 34.8
  - Is the difference reliable? Is mA < mB?
- Null hypothesis: mA = mB
  - No assumption about what the common value is (e.g., not mA = 10, mB = 10)
- Alternative hypothesis: mA ≠ mB
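A minimal R sketch (using the rat data from the example above) that enters the two samples and computes the group means:

    # Maze times for the two groups
    A <- c(43, 26, 35, 31, 28)   # rats with hippocampus
    B <- c(37, 31, 27, 46, 33)   # rats without hippocampus

    MA <- mean(A)   # 32.6
    MB <- mean(B)   # 34.8
    MA - MB         # -2.2

These objects A and B are reused in the sketches that follow.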
Finding a Test Statistic
- Goal: define a test statistic for deciding mA = mB vs. mA ≠ mB
- Constraints (apply to all hypothesis testing):
  - Must be a function of the data (both samples)
  - Sampling distribution must be fully determined by H0
    - Can only assume mA = mB
    - Can't depend on mA or mB separately, or on s
  - Alternative hypothesis should predict extreme values
    - Statistic should measure deviation from mA = mB, so that if mA ≠ mB we'll be able to reject H0
- Answer (preview): based on MA – MB (just like M – m0 for the one-sample t-test)
  - t = (MA – MB) / standard error of (MA – MB)
  - MA – MB has a Normal distribution
  - The standard error has a (modified) chi-square distribution
  - The ratio has a t distribution
Likelihood Function for MA – MB
- Central Limit Theorem: MA and MB each have (approximately) Normal sampling distributions, so MA – MB is Normal too
- Distribution of MA – MB under H0:
  - Subtract the means: E(MA – MB) = E(MA) – E(MB) = m – m = 0
  - Add the variances: Var(MA – MB) = Var(MA) + Var(MB) = s²/nA + s²/nB
- Just divide by the standard error, √(s²/nA + s²/nB)?
  - But we don't know s
  - Need to estimate it from the data
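A small R simulation of this result; the population values m = 30 and s = 6 are hypothetical, chosen only to illustrate that when mA = mB the difference of sample means is centered at 0 with variance s²/nA + s²/nB:

    set.seed(1)
    m <- 30; s <- 6                 # hypothetical common mean and SD under H0
    nA <- 5; nB <- 5
    # Draw many pairs of samples and record the difference of their means
    diffs <- replicate(10000, mean(rnorm(nA, m, s)) - mean(rnorm(nB, m, s)))
    mean(diffs)                     # close to 0
    var(diffs)                      # close to s^2/nA + s^2/nB = 14.4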
Estimating s
- We already know the best estimator for a single sample
  - Could just use one sample or the other: sA or sB
  - Works, but not the best use of the data
- Combining sA and sB
  - Both come from averages of (X – M)²
  - (X – M)² for each individual score is an estimate of s²
  - Average them all together (pooled variance):
    sp² = [ΣA(X – MA)² + ΣB(X – MB)²] / (nA + nB – 2)
  - Degrees of freedom: (nA – 1) + (nB – 1) = nA + nB – 2
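A sketch of the pooled-variance computation in R, using the samples A and B entered earlier:

    SSA <- sum((A - mean(A))^2)       # sum of squared deviations in A: 181.2
    SSB <- sum((B - mean(B))^2)       # sum of squared deviations in B: 208.8
    df  <- length(A) + length(B) - 2  # degrees of freedom: 8
    sp2 <- (SSA + SSB) / df           # pooled variance estimate: 48.75
    sp2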
Independent-Samples t Statistic
- t = (MA – MB) / √(sp²/nA + sp²/nB)
  - Numerator: difference between sample means
  - Denominator: typical difference expected by chance (standard error of MA – MB)
    - Variance of MA – MB = variance from MA (sp²/nA) + variance from MB (sp²/nB)
- sp² = [ΣA(X – MA)² + ΣB(X – MB)²] / (nA + nB – 2) is the estimate of s²
  - Numerator: sum of squared deviations
  - Denominator: degrees of freedom
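A minimal R function translating this formula directly; the name ind_t is just for illustration here, not standard R:

    # Independent-samples t statistic with pooled variance
    ind_t <- function(A, B) {
      nA <- length(A); nB <- length(B)
      sp2 <- (sum((A - mean(A))^2) + sum((B - mean(B))^2)) / (nA + nB - 2)
      (mean(A) - mean(B)) / sqrt(sp2/nA + sp2/nB)
    }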
Steps of I.S. t-test
1. State clearly the two hypotheses
2. Determine null and alternative hypotheses
   - H0: mA = mB
   - H1: mA ≠ mB
3. Compute the test statistic t from the data:
   t = (MA – MB) / √(sp²/nA + sp²/nB)
4. Determine the likelihood function for the test statistic according to H0
   - t distribution with nA + nB – 2 degrees of freedom
5. Get the p-value: pt(t, df), 1 - pt(t, df), or 2*(1 - pt(|t|, df)), depending on the direction of H1
6. Choose alpha level
7. Compare p to α:
   7a. p > α: retain null hypothesis, mA = mB
   7b. p < α: reject null hypothesis, mA ≠ mB
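Continuing the sketch above, the remaining steps in R for a two-tailed test at α = .05, using the samples A and B and the illustrative ind_t function defined earlier:

    t_obs <- ind_t(A, B)                    # step 3: test statistic
    df    <- length(A) + length(B) - 2      # step 4: df = 8
    p     <- 2 * (1 - pt(abs(t_obs), df))   # step 5: two-tailed p-value
    alpha <- 0.05                           # step 6
    if (p > alpha) "retain H0" else "reject H0"   # step 7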
Example
- Rats with hippocampus: Sample A = [43, 26, 35, 31, 28]
- Rats without hippocampus: Sample B = [37, 31, 27, 46, 33]
- MA = 32.6, MB = 34.8, MA – MB = -2.2
- df = nA + nB – 2 = 5 + 5 – 2 = 8

  X     X – MA   (X – MA)²
  43     10.4    108.16
  26     -6.6     43.56
  35      2.4      5.76
  31     -1.6      2.56
  28     -4.6     21.16
  ΣA(X – MA)² = 181.20

  X     X – MB   (X – MB)²
  37      2.2      4.84
  31     -3.8     14.44
  27     -7.8     60.84
  46     11.2    125.44
  33     -1.8      3.24
  ΣB(X – MB)² = 208.80

- sp² = (181.20 + 208.80) / 8 = 48.75
- t = -2.2 / √(48.75/5 + 48.75/5) ≈ -0.50
- Compare to the t distribution with 8 df: two-tailed p = .64 (one-tailed p = .32)
- p is well above any conventional α, so retain H0
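For comparison, the same analysis with R's built-in t.test; var.equal = TRUE requests the pooled-variance test described in these slides:

    t.test(A, B, var.equal = TRUE)
    # should report t of about -0.50 on 8 df, with a two-tailed p matching the value above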
Homogeneity of Variance
- The t-test only works if sA = sB
  - Variance is homogeneous (the two populations have the same variance)
  - Not an assumption of H0, but of the whole procedure
    - H0: mA = mB & sA = sB
    - H1: mA ≠ mB & sA = sB
- If variance is heterogeneous
  - The standard procedure doesn't work
  - There is a trick for estimating the standard error and reducing the degrees of freedom
- What to remember
  - The independent-samples t-test assumes homogeneous variance
  - If that's not true, you have to use alternative formulas for SE and df
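One standard version of this trick is the Welch correction, which adjusts the standard error and reduces (usually to a non-integer) the degrees of freedom; R's t.test applies it by default:

    # Welch test: separate variance estimates, adjusted df (the default)
    t.test(A, B)
    # Pooled-variance test that assumes sA = sB:
    t.test(A, B, var.equal = TRUE)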
Mean Squared Error
- General form: a sum of squared deviations divided by its degrees of freedom
- Population: Σ(X – m)² / N
  - Choosing the population mean m as the reference point gives the population variance
- Sample: s² = Σ(X – M)² / (n – 1)
  - Sample variance: gives an estimate of the population variance
- I.S. t-test: sp² = [ΣA(X – MA)² + ΣB(X – MB)²] / (nA + nB – 2)
  - Gives an estimate of the population variance using both samples
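A quick R check of the sample and pooled versions, again using the rat data; note that R's var divides by n – 1:

    n <- length(A)
    sum((A - mean(A))^2) / (n - 1)   # sample variance of A, computed by hand
    var(A)                           # identical: var() divides by n - 1
    # Pooled estimate from both samples, as used in the t-test
    (sum((A - mean(A))^2) + sum((B - mean(B))^2)) / (length(A) + length(B) - 2)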
Degrees of Freedom
- Applies to any sum-of-squares type formula
- Tells how many numbers are really being added
  - n = 2: only one number (the second deviation is determined by the first)
  - In general: one number is determined by the rest
- Every statistic in the formula that's based on X removes 1 df
  - M, MA, MB
  - Fancy algebra to rewrite the formula in terms of only X results in fewer summands
- I will always tell you how to find df for each formula
- To get an average, divide by df
- The distribution of a statistic depends on its df
  - χ², t, F
- Example (n = 2, M = 5):

  X    X – M   (X – M)²
  3    -2      4
  7     2      4

  The two deviations must sum to 0, so only one is free: df = n – 1 = 1.
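A last R illustration of the same idea: deviations from M always sum to zero, so one of them is not free.

    x <- c(3, 7)
    M <- mean(x)                 # 5
    d <- x - M                   # -2  2
    sum(d)                       # 0: knowing one deviation determines the other
    sum(d^2) / (length(x) - 1)   # divide the sum of squares by df = n - 1 = 1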