WELCOME TO THE WORLD OF INFERENTIAL STATISTICS Analysis of variance – ANOVA
What is ANOVA? A method for comparing multiple groups. Question: Are these groups (however many there are) significantly different from each other? Answer: ANOVA
ANOVA – Null Hypothesis $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ (all group means are equal in the population)
ANOVA – Alternative Hypothesis $H_1$: not all $\mu_j$ are equal (at least one group mean differs from the others)
ANOVA question: are the differences between the group means systematic, or only due to chance?
Basics of ANOVA Null hypothesis: all groups come from the same population, and differences in means are due to normal sampling fluctuations between samples drawn from that population. If the null hypothesis is true, then certain other things must also be true, and the F-test checks whether they are. If yes, fail to reject (retain) the null hypothesis. If no, reject the null hypothesis.
Working with multiple samples Under the null hypothesis, all of these samples come from the same population
Same idea, different picture [Figure: a single population from which Sample 1, Sample 2, Sample 3, Sample 4, ..., Sample k are drawn]
Working with multiple samples Each of these sample means estimates the SAME population mean
Working with multiple samples: How can we estimate the population variance? Each of these sample variances estimates the SAME population variance
Working with multiple samples: How can we estimate the population variance? Variance estimate 1: Average the sample variances!
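To make "average the sample variances" concrete, here is a minimal Python sketch. The three samples are made-up numbers chosen for easy arithmetic, not data from these slides, and equal sample sizes are assumed.

```python
from statistics import mean, variance

# Hypothetical data (made-up numbers, NOT from the slides): three samples of size 3.
samples = [
    [2.0, 4.0, 6.0],
    [3.0, 5.0, 7.0],
    [1.0, 5.0, 9.0],
]

# Unbiased variance of each sample (statistics.variance divides by n - 1).
sample_variances = [variance(s) for s in samples]

# Variance estimate 1: the plain average of the sample variances.
# A plain average is only appropriate here because the samples are equal in size.
pooled_estimate = mean(sample_variances)

print(sample_variances)
print(pooled_estimate)
```

Each sample variance is on its own an estimate of the same population variance. Averaging them combines the information from all samples into one estimate.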
Working with multiple samples: A second estimate of the population variance These means are the means of “all possible” samples
What can we compute using the means of all possible samples? The SD of “all possible” sample means is the standard error of the mean (SEM): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
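Estimating the variance based on the SEM Variance estimate 2: since $\sigma_{\bar{X}} = \sigma / \sqrt{n}$, it follows that $\sigma^2 = n \, \sigma_{\bar{X}}^2$, where $\sigma_{\bar{X}}^2$ is the variance of all possible sample means and $\sigma^2$ is the variance of the attribute of interest in the population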
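The relation between the variance of the sample means and the population variance can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with mean 50 and SD 10 (so the variance is 100), "all possible samples" is approximated by 2,000 random samples, and the sample size n = 25 is arbitrary. None of these numbers come from the slides.

```python
import random
from statistics import mean, variance

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50.0, 10.0  # assumed population parameters (variance = 100)
n = 25                         # assumed sample size
num_samples = 2_000            # stand-in for "all possible" samples

# Draw many samples of size n and keep only the mean of each one.
sample_means = [
    mean([random.gauss(POP_MEAN, POP_SD) for _ in range(n)])
    for _ in range(num_samples)
]

# Variance of the sample means (the squared SEM, estimated from the simulation).
variance_of_means = variance(sample_means)

# Multiplying by n recovers an estimate of the population variance.
estimate = n * variance_of_means

print(variance_of_means)
print(estimate)
```

The printed estimate should land near the assumed population variance of 100. It will not match exactly, because only a finite number of samples is drawn.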
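BETWEEN GROUPS estimate of the population variance $s_B^2 = n \, s_{\bar{X}}^2 = \frac{n \sum_{j=1}^{k} (\bar{X}_j - \bar{X}_G)^2}{k - 1}$, where $s_{\bar{X}}^2$ is the estimated variance of all possible sample means, $\bar{X}_j$ is the mean of group $j$, $\bar{X}_G$ is the grand mean, and there are $n$ subjects in each of the $k$ samples. This is an estimate of the population variance based on the variance between the different groups.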
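The between-groups estimate is n times the variance of the group means. The sketch below shows that computation on made-up data: k = 3 groups with n = 5 subjects each, chosen for easy arithmetic and not taken from the slides.

```python
from statistics import mean, variance

# Hypothetical data (made-up numbers): k = 3 groups, n = 5 subjects in each.
groups = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0, 7.0],
    [5.0, 6.0, 7.0, 8.0, 9.0],
]
n = len(groups[0])  # subjects per group (equal group sizes assumed)

# Mean of each group; with equal group sizes the grand mean is the mean of these.
group_means = [mean(g) for g in groups]
grand_mean = mean(group_means)

# statistics.variance divides by k - 1, so this is the estimated
# variance of the sample means around the grand mean.
variance_of_means = variance(group_means)

# Between-groups estimate of the population variance.
between_estimate = n * variance_of_means

print(group_means, grand_mean)
print(between_estimate)
```

For these numbers the group means are 3, 5 and 7, so their variance is 4 and the between-groups estimate is 5 × 4 = 20.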
G&W (for Gorkem & friends!): page 369, comparing the logic of the t-test with ANOVA; page 372, the components of variance; pages 373-374, a very good discussion of the F statistic
Example Students were grouped into low, medium, and high levels of motivation. They were asked how many hours per week they spent doing homework. The question is whether the time spent on homework differs significantly between students with low, medium, and high motivation.
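Example Null and Alternative Hypotheses $H_0: \mu_{low} = \mu_{medium} = \mu_{high}$; $H_1$: at least one of the three group means differs from the others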
Example Means of each group
Example Variance in means
Example Between groups estimate of variance
WITHIN GROUPS estimate of the population variance $s_W^2 = \frac{\sum_{j=1}^{k} s_j^2}{k}$, where $s_j^2$ is the variance of the attribute among the subjects in group $j$. This estimate is based on the within-group variances: it is the average of the group variances.
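A short sketch of the within-groups estimate, again on made-up data (k = 3 groups of n = 5, not from the slides). It also shows that, with equal group sizes, averaging the group variances gives the same result as dividing the pooled within-group sum of squares by k(n − 1).

```python
from statistics import mean, variance

# Hypothetical data (made-up numbers): k = 3 groups, n = 5 subjects in each.
groups = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0, 7.0],
    [5.0, 6.0, 7.0, 8.0, 9.0],
]
k = len(groups)
n = len(groups[0])  # equal group sizes assumed

# Route 1: average of the k group variances.
group_variances = [variance(g) for g in groups]
within_estimate = mean(group_variances)

# Route 2: squared deviations of every score from its OWN group mean,
# summed over all groups and divided by k(n - 1).
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
pooled = ss_within / (k * (n - 1))

print(within_estimate, pooled)
```

Both routes give 2.5 for these numbers. The equivalence relies on the groups being equal in size; with unequal sizes only the pooled form weights each group appropriately.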
Example Within groups estimate of variance
F-Statistic (Fisher’s test, 1923) Logic of the F statistic: the two estimates would be approximately equal if the samples were drawn randomly from a single population, so the ratio of the “between” samples variance to the “within” samples variance should be approximately 1. How close the ratio is to 1 depends on the sample size and on the variation of sample means among all possible samples. If the “between” samples variance is much higher than the “within” samples variance, then the sample means vary more than expected by chance. This is evidence that the independent variable is associated with a significant difference between the means.
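The logic above can be illustrated with a simulation. The settings are assumptions chosen for illustration only: normal populations with SD 2, three groups of five subjects, and 1,000 repetitions per scenario. The first scenario draws every group from the same population; the second gives the groups different population means.

```python
import random
from statistics import mean, variance

random.seed(1)  # fixed seed so the run is repeatable

def f_ratio(groups):
    """Between-groups estimate divided by within-groups estimate (equal n assumed)."""
    n = len(groups[0])
    between = n * variance([mean(g) for g in groups])
    within = mean([variance(g) for g in groups])
    return between / within

def average_f(population_means, n=5, sd=2.0, runs=1000):
    """Average F ratio over many simulated experiments."""
    ratios = []
    for _ in range(runs):
        groups = [[random.gauss(m, sd) for _ in range(n)] for m in population_means]
        ratios.append(f_ratio(groups))
    return mean(ratios)

# Scenario 1: null hypothesis true (all groups share one population mean).
null_avg = average_f([5.0, 5.0, 5.0])

# Scenario 2: null hypothesis false (population means differ).
shifted_avg = average_f([3.0, 5.0, 7.0])

print(null_avg, shifted_avg)
```

Under the null scenario the average ratio stays close to 1. With small samples it tends to sit slightly above 1, which is one way the ratio depends on sample size. When the population means differ, the ratio is clearly larger.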
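Formal presentation of F-statistic $F = \frac{\text{between-groups variance estimate}}{\text{within-groups variance estimate}}$, with $df_{between} = k - 1$ and $df_{within} = k(n - 1)$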
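Putting the pieces together, a minimal sketch of the F statistic and its degrees of freedom for equal-sized groups might look like this. The function name `one_way_f` and the data are hypothetical, introduced only for illustration.

```python
from statistics import mean, variance

def one_way_f(groups):
    """Return (F, df_between, df_within) for k groups of equal size n.

    Hypothetical helper for illustration; it does not handle unequal group sizes.
    """
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    # Between-groups estimate: n times the variance of the group means.
    between = n * variance([mean(g) for g in groups])
    # Within-groups estimate: average of the group variances.
    within = mean([variance(g) for g in groups])
    return between / within, k - 1, k * (n - 1)

# Hypothetical data (made-up numbers): k = 3 groups, n = 5 subjects in each.
groups = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0, 7.0],
    [5.0, 6.0, 7.0, 8.0, 9.0],
]

f_value, df_between, df_within = one_way_f(groups)
print(f_value, df_between, df_within)
```

For these numbers the between-groups estimate is 20 and the within-groups estimate is 2.5, so F = 8 with 2 and 12 degrees of freedom. This value would then be compared against the critical value of the F distribution for those degrees of freedom.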
Partitioning the variance Total variance = Between group variance + Within group variance. Within group variance = Variance between subjects, OR = True variance between subjects + Variance due to measurement error. Between group variance = Variance due to membership of groups + True variance between subjects + Variance due to measurement error. If the null hypothesis is true, the variance due to membership of groups is zero.
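The partition can be verified numerically. It holds exactly when variability is measured as sums of squared deviations (the numerators of the variance estimates), so the sketch below works with sums of squares. The data are made-up numbers, not taken from the slides.

```python
from statistics import mean

# Hypothetical data (made-up numbers): k = 3 groups, n = 5 subjects in each.
groups = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0, 7.0],
    [5.0, 6.0, 7.0, 8.0, 9.0],
]

all_values = [x for g in groups for x in g]
grand_mean = mean(all_values)

# Total: every score compared with the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between: every group mean compared with the grand mean, weighted by group size.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Within: every score compared with its own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

print(ss_total, ss_between, ss_within)
```

Here the total sum of squares is 70, which splits into 40 between groups and 30 within groups. Dividing each part by its degrees of freedom (k − 1 and k(n − 1)) gives the two variance estimates that go into the F ratio.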
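Understanding the F-test $F = \frac{\text{Between groups variance}}{\text{Within groups variance}} = \frac{\text{Variance due to membership of groups} + \text{True variance between subjects} + \text{Measurement error}}{\text{True variance between subjects} + \text{Measurement error}}$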
Assumptions of the F-test Interval/ratio level measurement of the dependent variable. Normal distribution of the dependent variable (the test is robust against deviations from normality). Independence of observations (each unit of analysis is independently selected into the sample).