Download presentation
Presentation is loading. Please wait.
Published byEmmeline Wallis Modified over 9 years ago
1
Categorical and discrete data. Non-parametric tests
2
Non-parametric tests: estimate sample differences when the known distribution shapes cannot help, or even confuse
3
Metrics of arbitrary distibutions Median: the value that "splits the sample in half" Mode: the value that occurs with the greatest frequency. n-th procentile: the value between n% of the sorted sample and the rest of it. Hence quantiles, quartiles, deciles etc.
4
Non-parametric correlations Spearman R: The closest analog of Pearson r with the difference that, instead of each raw value, its rank in the sample is used 5 – 6 1 – 1 4 – 3 6 – 5 7 – 8 2 – 2 8 – 7 3 – 4 5 – 8 1 – 6 4 – 3 6 – 2 7 – 4 2 – 5 8 – 1 3 – 7 X – YX – Z 2.4 -> 5 3.2 -> 1 2.7 -> 4 2.3 -> 6 2.2 -> 7 3.1 -> 2 2.0 -> 8 2.9 -> 3
5
Non-parametric correlations Spearman R: A variability ratio (X and Y vary synchronously) / (X and Y vary in total) Kendall Tau: A probability ratio P(X and Y are related) / P(X and Y are NOT related)
6
Comparing two independent samples The Mann-Whitney U test: The closest analog of the t-test with the difference that, instead of each raw value, its rank in the sample is used Wald-Wolfowitz runs test: Estimates if two samples differ in BOTH means and distribution shapes – – + – – – – – – – + – + + + – + + – + + + + – + + – – + – + – + + – – + + – + – + + + – + + – + + – –
7
Comparing two dependent samples Wilcoxon matched pairs test: The difference between two tied samples is significant, if a sum of pairwise differences (either positive or negative ones) is TOO BIG Sign test: >><<><<>>><<><<> <<><<<<><<><<<<>
8
2 x 2 tables SmokeDo not smoke Female1226 Male3425 Chi-square test, Fisher’s exact test: Χ 2 = Σ (Observed - Expected) 2 / Expected
9
Nonparametric methods are most appropriate when the sample sizes are small. When the data set is large (e.g., n > 100) it often makes little sense to use nonparametric statistics at all: When the samples become very large, then the sample means will follow the normal distribution even if the respective variable is not normally distributed in the population
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.