Nonparametric tests and ANOVAs: What you need to know
Nonparametric tests
– Nonparametric tests are usually based on ranks
– There are nonparametric versions of most parametric tests

Parametric test                  Nonparametric counterpart
One-sample and paired t-test     Sign test
Two-sample t-test                Mann-Whitney U-test

Quick Reference Summary: Sign Test
What is it for? A nonparametric test to compare the median of a group to some constant
What does it assume? Random samples
Formula: Identical to a binomial test with p_o = 0.5. Uses the number of subjects with values greater than and less than a hypothesized median as the test statistic.
P(x) = [n! / (x! (n - x)!)] p^x (1 - p)^(n - x)
P(x) = probability of a total of x successes
p = probability of success in each trial
n = total number of trials
P = 2 * Pr[x ≥ X]

Sign test
Sample → Test statistic: x = number of values greater than m_o
Null hypothesis: Median = m_o → Null distribution: Binomial(n, 0.5)
Compare the test statistic with the null distribution: how unusual is this test statistic?
P < 0.05: Reject H_o
P > 0.05: Fail to reject H_o

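A minimal sketch of the sign test as a binomial calculation, assuming a two-tailed test. The function name `sign_test` and the sample values are hypothetical, and dropping values exactly equal to m_o is a common convention rather than something the slides specify.

```python
from math import comb

def sign_test(values, m0):
    """Two-tailed sign test of H_o: median = m0, using the Binomial(n, 0.5) null."""
    above = sum(1 for v in values if v > m0)
    below = sum(1 for v in values if v < m0)
    n = above + below          # assumption: values equal to m0 are dropped
    x = max(above, below)      # the more extreme of the two counts
    # Pr[X >= x] when X ~ Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n
    return x, n, min(1.0, 2 * tail)

# Hypothetical data, invented for illustration only
sample = [1.2, 3.4, 0.8, 2.9, 4.1, 3.3, 2.7, 5.0, 3.8, 2.2]
x, n, p = sign_test(sample, m0=1.0)
print(x, n, round(p, 4))   # 9 10 0.0215
```

Here 9 of 10 values lie above the hypothesized median, so P = 2 * (10 + 1) / 1024, which is below 0.05.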
Quick Reference Summary: Mann-Whitney U Test
What is it for? A nonparametric test to compare the central tendencies of two groups
What does it assume? Random samples
Test statistic: U
Distribution under H_o: U distribution, with sample sizes n_1 and n_2
Formulae:
U_1 = n_1 n_2 + n_1 (n_1 + 1) / 2 - R_1
U_2 = n_1 n_2 - U_1
n_1 = sample size of group 1
n_2 = sample size of group 2
R_1 = sum of ranks of group 1
Use the larger of U_1 or U_2 for a two-tailed test

Mann-Whitney U test
Sample → Test statistic: U_1 or U_2 (use the larger)
Null hypothesis: The two groups have the same median → Null distribution: U with n_1, n_2
Compare the test statistic with the null distribution: how unusual is this test statistic?
P < 0.05: Reject H_o
P > 0.05: Fail to reject H_o

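A rough sketch of how U can be computed from the ranks of the pooled data. It uses the standard textbook form U_1 = n_1 n_2 + n_1 (n_1 + 1) / 2 - R_1 and U_2 = n_1 n_2 - U_1, and it assumes there are no tied values. The function name and the data are hypothetical.

```python
def mann_whitney_u(group1, group2):
    """Return (U1, U2, U) where U is the larger value, for a two-tailed test."""
    pooled = sorted(group1 + group2)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no tied values")
    rank = {v: i + 1 for i, v in enumerate(pooled)}   # smallest value gets rank 1
    n1, n2 = len(group1), len(group2)
    r1 = sum(rank[v] for v in group1)                 # R1 = sum of ranks of group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return u1, u2, max(u1, u2)

# Hypothetical data, invented for illustration only
group1 = [3.1, 4.5, 2.8, 5.2]
group2 = [6.0, 5.9, 4.9, 7.3, 6.5]
print(mann_whitney_u(group1, group2))   # (19.0, 1.0, 19.0)
```

The larger value, U = 19, would then be compared with the critical value of the U distribution for n_1 = 4 and n_2 = 5.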
Mann-Whitney U test: large-sample approximation
– Use this when n_1 and n_2 are both > 10
– Compare the test statistic to the standard normal distribution

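One common form of the large-sample approximation standardizes U using its mean n_1 n_2 / 2 and standard deviation sqrt(n_1 n_2 (n_1 + n_2 + 1) / 12) under H_o. The sketch below assumes that form and applies no correction for ties. The function name and the input values are hypothetical.

```python
from math import sqrt, erfc

def mann_whitney_z(u, n1, n2):
    """Normal approximation for U; returns (Z, two-tailed P)."""
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # no tie correction in this sketch
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))                  # equals 2 * (1 - Phi(|z|))
    return z, p

# Hypothetical values, invented for illustration only
z, p = mann_whitney_z(u=90, n1=12, n2=12)
print(round(z, 3), round(p, 3))
```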
Mann-Whitney U Test
If you have ties:
– Rank them anyway, pretending they were slightly different
– Find the average of the ranks for the identical values, and give them all that rank
– Carry on as if all the whole-number ranks have been used up

Example [table columns: Data, Sorted Data, Rank A, Rank]
– Sort the data and identify the TIES
– Rank them anyway, pretending they were slightly different (Rank A)
– Find the average of the ranks for the identical values, and give them all that rank (in this example: Average = 1.5 and Average = 6)
– These ranks can now be used for the Mann-Whitney U test

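The tie-handling steps can be sketched in a few lines. The data below are hypothetical, chosen only so that the tied values receive average ranks of 1.5 and 6 as in the example; they are not the values from the original slides. The function name `average_ranks` is also an assumption.

```python
def average_ranks(values):
    """Rank values (1 = smallest), giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the run while the sorted values are identical
        while end < len(order) and values[order[end]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end) / 2   # mean of whole-number ranks pos+1 .. end
        for i in order[pos:end]:
            ranks[i] = mean_rank
        pos = end                         # the next rank continues after the run
    return ranks

# Hypothetical data, invented for illustration only
data = [12, 12, 14, 15, 17, 17, 17, 20]
print(average_ranks(data))   # [1.5, 1.5, 3.0, 4.0, 6.0, 6.0, 6.0, 8.0]
```

The three tied values use up ranks 5, 6 and 7, so the next value gets rank 8.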
Benefits and Costs of Nonparametric Tests
Main benefit:
– Make fewer assumptions about your data
– E.g. only assume random sample
Main cost:
– Reduced statistical power
– Increased chance of Type II error

When Should I Use Nonparametric Tests?
When you have reason to suspect the assumptions of your test are violated:
– Non-normal distribution
– No transformation makes the distribution normal
– Different variances for two groups

Quick Reference Summary: ANOVA (analysis of variance)
What is it for? Testing the difference among k means simultaneously
What does it assume? The variable is normally distributed with equal standard deviations (and variances) in all k populations; each sample is a random sample
Test statistic: F
Distribution under H_o: F distribution with k-1 and N-k degrees of freedom

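Quick Reference Summary: ANOVA (analysis of variance)
Formulae:
F = MS_group / MS_error
MS_group = SS_group / df_group
MS_error = SS_error / df_error
SS_group = Σ_i n_i (Ȳ_i - Ȳ)^2
SS_error = Σ_i Σ_j (Y_ij - Ȳ_i)^2
Y_ij = observation j in group i
Ȳ_i = mean of group i
Ȳ = overall mean
n_i = size of sample i
N = total sample size
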
ANOVA
k Samples → Test statistic: F
Null hypothesis: All groups have the same mean → Null distribution: F with k-1, N-k df
Compare the test statistic with the null distribution: how unusual is this test statistic?
P < 0.05: Reject H_o
P > 0.05: Fail to reject H_o

ANOVA formulae: there are a LOT of equations here, and this is the simplest possible ANOVA

Degrees of freedom: df_group = k-1, df_error = N-k
Sum of Squares / df = Mean Squares; the ratio of the two Mean Squares is the F-ratio

ANOVA Tables
Source of variation   Sum of squares   df     Mean squares        F ratio               P
Treatment             SS_group         k-1    SS_group / (k-1)    MS_group / MS_error
Error                 SS_error         N-k    SS_error / (N-k)
Total                 SS_total         N-1

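To make the table concrete, here is a minimal sketch that fills in the sums of squares, df, mean squares and F ratio for a one-factor ANOVA, using the standard formulas SS_group = Σ n_i (group mean - overall mean)^2 and SS_error = sum of squared deviations from each group's own mean. The function name and the data are hypothetical. The P column is left out because it needs the F distribution, which the Python standard library does not provide.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the entries of a one-factor ANOVA table (without P)."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = mean([v for g in groups for v in g])
    ss_group = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_group, df_error = k - 1, N - k
    ms_group = ss_group / df_group
    ms_error = ss_error / df_error
    return {
        "SS_group": ss_group, "SS_error": ss_error,
        "SS_total": ss_group + ss_error,
        "df_group": df_group, "df_error": df_error, "df_total": N - 1,
        "MS_group": ms_group, "MS_error": ms_error,
        "F": ms_group / ms_error,
    }

# Hypothetical data for k = 3 groups, invented for illustration only
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
table = one_way_anova(groups)
print(table["SS_group"], table["SS_error"], table["F"])   # 24.0 6.0 12.0
```

The resulting F = 12 would be compared with the F distribution with 2 and 6 degrees of freedom.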
ANOVA Table: Example [table columns: Source of variation, Sum of squares, df, Mean squares, F ratio, P; rows: Treatment, Error, Total]

Additions to ANOVA
– R² value: how much variance is explained?
– Comparisons of groups: planned and unplanned
– Fixed vs. random effects
– Repeatability

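For the first item, R² in a one-factor ANOVA is usually computed as SS_group / SS_total. A tiny sketch, with a hypothetical function name and invented sums of squares:

```python
def r_squared(ss_group, ss_error):
    """Fraction of the total variation that is explained by group differences."""
    ss_total = ss_group + ss_error
    return ss_group / ss_total

# Hypothetical sums of squares, invented for illustration only
print(r_squared(ss_group=24.0, ss_error=6.0))   # 0.8
```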
Two-Factor ANOVA
– Often we manipulate more than one thing at a time
– Multiple categorical explanatory variables
– Example: sex and nationality

Two-factor ANOVA
– Don’t worry about the equations for this
– Use an ANOVA table

Two-factor ANOVA
Testing three null hypotheses:
1. Means don’t differ among the levels of treatment 1
2. Means don’t differ among the levels of treatment 2
3. There is no interaction between the two treatments

Two-factor ANOVA Table
Source of variation         Sum of Squares   df                    Mean Square                       F ratio         P
Treatment 1                 SS_1             k_1 - 1               SS_1 / (k_1 - 1)                  MS_1 / MSE
Treatment 2                 SS_2             k_2 - 1               SS_2 / (k_2 - 1)                  MS_2 / MSE
Treatment 1 * Treatment 2   SS_1*2           (k_1 - 1)*(k_2 - 1)   SS_1*2 / ((k_1 - 1)*(k_2 - 1))    MS_1*2 / MSE
Error                       SS_error         XXX                   SS_error / XXX
Total                       SS_total         N-1

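For readers who want to see where the table entries come from, the following is a sketch for one specific case only: a balanced design with the same number of replicates (more than one) in every cell. Under that assumption the error df is N - k_1 k_2; the slides leave this entry open, so treat it as an assumption of the sketch. The function name and the data are hypothetical, and P values are omitted.

```python
from statistics import mean

def two_factor_anova(cells):
    """cells[i][j] = observations at level i of treatment 1, level j of treatment 2.
    Assumes a balanced design with replication in every cell."""
    k1, k2 = len(cells), len(cells[0])
    r = len(cells[0][0])                      # replicates per cell
    N = k1 * k2 * r
    grand = mean([v for row in cells for cell in row for v in cell])
    mean1 = [mean([v for cell in row for v in cell]) for row in cells]
    mean2 = [mean([v for row in cells for v in row[j]]) for j in range(k2)]
    ss1 = k2 * r * sum((m - grand) ** 2 for m in mean1)
    ss2 = k1 * r * sum((m - grand) ** 2 for m in mean2)
    ss_cells = r * sum((mean(c) - grand) ** 2 for row in cells for c in row)
    ss_inter = ss_cells - ss1 - ss2           # SS for treatment 1 * treatment 2
    ss_error = sum((v - mean(c)) ** 2 for row in cells for c in row for v in c)
    df1, df2 = k1 - 1, k2 - 1
    df_inter = df1 * df2
    df_error = N - k1 * k2                    # assumption: balanced, replicated
    mse = ss_error / df_error
    return {
        "F_1": (ss1 / df1) / mse,
        "F_2": (ss2 / df2) / mse,
        "F_1*2": (ss_inter / df_inter) / mse,
        "df_error": df_error,
    }

# Hypothetical 2 x 2 design with 2 replicates per cell, invented for illustration
cells = [[[1.0, 3.0], [5.0, 7.0]],
         [[4.0, 6.0], [8.0, 10.0]]]
print(two_factor_anova(cells))
```

In this invented data set the two treatments act additively, so the interaction F ratio is 0 while both main effects are non-zero.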