Review Nine men and nine women are tested for their memory of a list of abstract nouns. The mean scores are M male = 15 and M female = 17. The mean square based on both samples is MS = 25. What is your best estimate of the (non-standardized) effect size, µ female – µ male ? A.0.40 B.1.33 C.2.00 D.18.0
Review Nine men and nine women are tested for their memory of a list of abstract nouns. The mean scores are M male = 15 and M female = 17. The mean square based on both samples is MS = 25. Calculate the standardized effect size. A.0.08 B.0.10 C.0.40 D.0.49
Review A sample of 16 subjects has a mean of 53 and a sample variance of 9. Calculate a 95% confidence interval for the population mean. The critical value for t (at 97.5% with 15 df) is A.[51.40, 54.60] B.[48.21, 57.79] C.[51.80, 54.20] D.[50.16, 55.84]
Three Views of Hypothesis Testing 10/21
What are we doing? In science, we run lots of experiments In some cases, there's an "effect”; in others there's not – Some manipulations have an impact; some don't – Some drugs work; some don't Goal: Tell these situations apart, using the sample data – If there is an effect, reject the null hypothesis – If there’s no effect, retain the null hypothesis Challenge – Sample is imperfect reflection of population, so we will make mistakes – How to minimize those mistakes?
Analogy: Prosecution Think of cases where there is an effect as "guilty” – No effect: "innocent" experiments Scientists are prosecutors – We want to prove guilt – Hope we can reject null hypothesis Can't be overzealous – Minimize convicting innocent people – Can only reject null if we have enough evidence How much evidence to require to convict someone? Tradeoff – Low standard: Too many innocent people get convicted (Type I Error) – High standard: Too many guilty people get off (Type II Error)
Binomial Test Example: Halloween candy – Sack holds boxes of raisins and boxes of jellybeans – Each kid blindly grabs 10 boxes – Some kids cheat by peeking Want to make cheaters forfeit candy – How many jellybeans before we call kid a cheater?
Binomial Test How many jellybeans before we call kid a cheater? Work out probabilities for honest kids – Binomial distribution – Don’t know distribution of cheaters Make a rule so only 5% of honest kids lose their candy For each individual above cutoff – Don’t know for sure they cheated – Only know that if they were honest, chance would have been less than 5% that they’d have so many jellybeans Therefore, our rule catches as many cheaters as possible while limiting Type I errors (false convictions) to 5% HonestCheaters ? Jellybeans
Testing the Mean Example: IQs at various universities – Some schools have average IQs (mean = 100) – Some schools are above average – Want to decide based on n students at each school Need a cutoff for the sample mean – Find distribution of sample means for Average schools – Set cutoff so that only 5% of Average schools will be mistaken as Smart 100 AverageSmart ? p(M)p(M)
Unknown Variance: t-test Want to know whether population mean equals 0 – H 0 : = 0 H 1 : ≠ 0 Don’t know population variance – Don’t know distribution of M under H 0 – Don’t know Type I error rate for any cutoff Use sample variance to estimate standard error of M – Divide M – 0 by standard error to get t – We know distribution of t exactly 00 p(M)p(M) p(t)p(t) 0 Distribution of t when H 0 is True Critical value Critical Region
Two-tailed Tests Often we want to catch effects on either side Split Type I Errors into two critical regions – Each must have probability /2 Critical value H0H0 H1?H1? H1?H1? Test statistic
An Alternative View: p-values p-value – Probability of a value equal to or more extreme than what you actually got – Measure of how consistent data are with H 0 p > – t is within t crit – Retain null hypothesis p < – t is beyond t crit – Reject null hypothesis; accept alternative hypothesis t = 2.57 p =.03 t t t crit for =.05
Three Views of Inferential Statistics Effect size & confidence interval – Values of 0 that don’t lead to rejecting H 0 Test statistic & critical value – Measure of consistency with H 0 p-value & – Type I error rate Can answer hypothesis test at any level – Result predicted by H 0 vs. confidence interval – Test statistic vs. critical value – p vs. 01 p Confidence interval M 00 00 00 00 Test statistic Critical value
Review Which of the following would lead you to reject the null hypothesis? A. t < t crit B. p < C. M outside the confidence interval D. t > E. p > t crit
Review In a paired-samples experiment, you get a confidence interval of [-2.4, 3.6] for µ diff. What is your conclusion? A. t , and you keep the null hypothesis B. t < t crit, p < , and you reject the null hypothesis C. t > t crit, p > , and you reject the null hypothesis D. t > t crit, p < , and you keep the null hypothesis
Review In a one-tailed t-test, you get a result of t = 2.8. The critical value is t crit = What region(s) correspond to alpha? A.Blue B.Blue + red C.Blue + green D.Blue + red + orange + green t crit t (from data)