1
Non-parametric methods in statistical testing
Douwe Postmus
2
Content
Comparison of a continuous outcome variable between two groups
Independent samples: Student t-test (parametric); Mann-Whitney-Wilcoxon test (non-parametric)
Dependent samples: Paired samples t-test (parametric); Wilcoxon signed rank test (non-parametric)
3
Is a mother’s smoking status during pregnancy associated with the birth weight of an infant?
4
Parametric model
The birth weight of infants whose mothers smoked during pregnancy is normally distributed with mean µ1 and variance σ²
The birth weight of infants whose mothers did not smoke during pregnancy is normally distributed with mean µ2 and the same variance σ²
The study data are independent samples from these two probability distributions
5
Hypotheses
H0: µ1 = µ2 (or µ1 - µ2 = 0)
H1: µ1 ≠ µ2 (or µ1 - µ2 ≠ 0)
6
Results
If H0 is true, t follows a t-distribution with n1 + n2 - 2 degrees of freedom
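The t statistic itself is not spelled out in the transcript. A minimal sketch, assuming the standard textbook pooled-variance form of the Student t-test, is given here. The function name `student_t` and the birth weights are hypothetical and are not the study data.

```python
import math
from statistics import mean, variance

def student_t(x, y):
    """Two-sample Student t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(x), len(y)
    # Pooled estimate of the common variance (the sigma^2 shared by both groups)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    # Standard error of the difference in sample means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean(x) - mean(y)) / se
    return t, n1 + n2 - 2

# Hypothetical birth weights in grams (illustration only)
smokers = [2800, 3100, 2950, 3000]
non_smokers = [3300, 3500, 3200, 3400, 3600]
t, df = student_t(smokers, non_smokers)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The p-value would then be read from a t-distribution with `df` degrees of freedom, which needs a statistics package or a table and is left out of this sketch.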
7
Are our assumptions reasonable?
8
What if the assumptions are not met?
Normality | Equal variances | Approach
+ | - | Welch’s t-test
- | + | Intermediate or large samples (n>30): Student t-test (central limit theorem); small samples: Mann-Whitney-Wilcoxon test
- | - | Welch’s t-test performed on ranked data
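For reference, Welch’s t-test drops the equal-variance assumption by using separate variance estimates and an approximate number of degrees of freedom. The sketch below assumes the usual Welch-Satterthwaite formula; the function name `welch_t` is hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    # Squared standard errors of the two sample means (no pooling)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / math.sqrt(v1 + v2)
    # Approximate degrees of freedom; generally not an integer
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

When both groups have the same size and the same sample variance, `t` and `df` coincide with the Student t-test values.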
9
Mann-Whitney-Wilcoxon test
The observations from group 1 are a random sample from some probability distribution F
The observations from group 2 are a random sample from some probability distribution G
The observations from group 1 are independent of the observations from group 2 (no pairing or clustering)
10
Hypotheses
H0: F is equal to G
H1: F and G have the same shape but a different median (location-shift interpretation)
11
Test statistic (Wilcoxon W)
Step 1: pool all observations and rank them in order of increasing size
Step 2: calculate the sum W of the ranks of the observations that came from one of the groups
The choice of group is arbitrary, because the two rank sums always add up to a known total
SPSS selects the group with the fewest observations
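The two steps can be sketched in a few lines of Python. The slides do not say how ties are handled; giving tied values their average rank (midranks) is an assumption made here. The function names are hypothetical.

```python
def midranks(values):
    """Rank values in increasing order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end j of the run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum(group_a, group_b):
    """Step 1: pool and rank. Step 2: sum the ranks belonging to group_a."""
    ranks = midranks(list(group_a) + list(group_b))
    return sum(ranks[:len(group_a)])
```

With n observations in total, the two rank sums always add up to n(n+1)/2, which is why the choice of group does not matter.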
12
Example - 10 randomly selected infants
13
Step 1: rank the observations
14
Step 2: calculate the sum of the ranks for infants from mothers who smoked during pregnancy
15
Distribution of W under H0
If H0 is true, each assignment of ranks to the 10 infants is equally likely
In total, there are 10*9*8*…*1 = 10! = 3,628,800 possible ways to assign ranks to the 10 infants
The distribution of W is obtained by enumerating the rank sums for all of these assignments
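A sketch of this enumeration follows. Rather than looping over all 10! assignments, it enumerates which ranks end up in the chosen group; every such subset corresponds to the same number of assignments, so the resulting distribution of W is the same. The group size of 4 in the usage line is hypothetical, ties are assumed absent, and the definition of "at least as extreme" used for the two-sided p-value is one common convention, not taken from the slides.

```python
from collections import Counter
from itertools import combinations
from math import comb

def exact_w_distribution(n_group, n_total):
    """Map each possible rank sum W to its probability under H0 (no ties)."""
    counts = Counter(
        sum(subset) for subset in combinations(range(1, n_total + 1), n_group)
    )
    total = comb(n_total, n_group)  # number of equally likely subsets
    return {w: k / total for w, k in sorted(counts.items())}

def two_sided_p(w_obs, n_group, n_total):
    """Probability under H0 of a W at least as far from its mean as w_obs."""
    dist = exact_w_distribution(n_group, n_total)
    center = n_group * (n_total + 1) / 2  # mean of W under H0
    return sum(p for w, p in dist.items()
               if abs(w - center) >= abs(w_obs - center))

# Hypothetical split: 4 of the 10 infants in the group whose ranks are summed
dist = exact_w_distribution(4, 10)
```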
16
Results
17
Full dataset
18
Results
U = W - n2*(n2+1)/2
If H0 is true, U is approximately normally distributed with mean (n1*n2)/2 and variance n1*n2*(n1+n2+1)/12
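The normal approximation can be sketched as follows. Here `n_w` is the size of the group whose ranks were summed (n2 in the formula above, assuming W is the rank sum of group 2) and `n_other` is the size of the other group. No continuity or tie correction is applied, which is a simplification; statistical packages may apply one.

```python
import math

def u_normal_approx(w, n_w, n_other):
    """Return (U, z, two-sided p) using the large-sample normal approximation."""
    u = w - n_w * (n_w + 1) / 2
    mean_u = n_w * n_other / 2
    var_u = n_w * n_other * (n_w + n_other + 1) / 12
    z = (u - mean_u) / math.sqrt(var_u)
    # Two-sided standard normal tail probability: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p
```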
19
Content
Comparison of a continuous outcome variable between two groups
Independent samples: Student t-test (parametric); Mann-Whitney-Wilcoxon test (non-parametric)
Dependent samples: Paired samples t-test (parametric); Wilcoxon signed rank test (non-parametric)
20
How effective is a low-carb diet?
Twenty subjects were put on a low-carb diet
Two measurements per subject:
Body weight at the start of the diet (kg)
Body weight 16 weeks after the start of the diet (kg)
21
Study data
22
Paired samples t-test
Assumptions: The difference in body weight between baseline and week 16 is normally distributed with mean µ and variance σ²
Hypotheses:
H0: µ = 0
H1: µ ≠ 0
23
Results
If H0 is true, t follows a t-distribution with n - 1 degrees of freedom
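The formula for t is not included in the transcript. A minimal sketch, assuming the standard one-sample t statistic applied to the within-subject differences, is shown here. Taking the difference as week 16 minus baseline is an assumption, and the function name `paired_t` is hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Paired-samples t statistic on the differences after - before; returns (t, df)."""
    d = [a - b for b, a in zip(before, after)]
    n = len(d)
    # Mean difference divided by its estimated standard error
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, n - 1
```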
24
Are our assumptions reasonable?
25
Wilcoxon signed-rank test
Assumptions: The distribution of the difference in body weight between baseline and week 16 is symmetric around the median
Hypotheses:
H0: The distribution of the differences is symmetric around zero
H1: The distribution of the differences is symmetric around a value other than zero
26
Test statistic (W+)
Step 1: rank the absolute values of the differences (exclude pairs for which the difference is zero)
Step 2: obtain signed ranks by restoring the signs of the differences to the ranks
Step 3: calculate the sum W+ of the ranks that have a positive sign
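The three steps can be sketched as below. As before, assigning tied absolute differences their average rank is an assumption that the slides do not state, and the function names are hypothetical.

```python
def midranks(values):
    """Ranks in increasing order; tied values share the average rank."""
    sorted_vals = sorted(values)
    # First 0-based position plus (count + 1) / 2 gives the average 1-based rank
    return [sorted_vals.index(v) + (sorted_vals.count(v) + 1) / 2 for v in values]

def signed_rank_w_plus(differences):
    """Wilcoxon signed-rank statistic W+ for a list of paired differences."""
    # Step 1: drop zero differences and rank the absolute values
    nonzero = [d for d in differences if d != 0]
    abs_ranks = midranks([abs(d) for d in nonzero])
    # Steps 2 and 3: restore the signs and sum the ranks with a positive sign
    return sum(r for d, r in zip(nonzero, abs_ranks) if d > 0)
```

For example, the differences -2, 3, 0, -1, 4 have absolute ranks 2, 3, 1, 4 after the zero is dropped, and the positive differences carry ranks 3 and 4.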
27
Step 1: rank the absolute values of the differences
28
Step 2: obtain signed ranks by restoring the signs of the differences to the ranks
29
Step 3: calculate the sum of the ranks W+ that have a positive sign
30
Distribution of W+ under H0
If H0 is true, any particular assignment of signs to the 20 ranks is equally likely
In total, there are 2^20 = 1,048,576 possible ways to assign signs to the 20 ranks
The distribution of W+ is obtained by enumerating the sum of the ranks that have a positive sign for all of these assignments
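A sketch of this null distribution follows. Instead of listing all 2^20 sign assignments one by one, it counts them with a small dynamic program (a swapped-in technique that yields the same counts as explicit enumeration). It assumes no ties and that n is the number of non-zero differences; the two-sided p-value convention and the function names are assumptions.

```python
def w_plus_counts(n):
    """counts[s] = number of sign assignments to ranks 1..n with W+ equal to s."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1  # the assignment in which every sign is negative
    for rank in range(1, n + 1):
        # Iterate downwards so that each rank is added at most once
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts

def two_sided_p(w_plus, n):
    """Probability under H0 of a W+ at least as far from its mean as w_plus."""
    counts = w_plus_counts(n)
    center = n * (n + 1) / 4  # mean of W+ under H0
    extreme = sum(c for s, c in enumerate(counts)
                  if abs(s - center) >= abs(w_plus - center))
    return extreme / 2 ** n
```

Summing `w_plus_counts(20)` recovers the 1,048,576 assignments mentioned above.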
31
Results
32
Summary
Non-parametric methods do not assume that the data follow any particular distributional form
They are more generally applicable than the corresponding parametric methods, at the cost of a (slight) reduction in power when the parametric assumptions hold
Non-parametric methods are not assumption-free!
33
Next lectures
Date | Location | Speaker | Topic
12 February | TBD | H. Burgerhof | Multiple testing: problems and some solutions
9 April | | D. Postmus | Kaplan-Meier survival curves and the log-rank test
11 June | | |
34
contact: d.postmus@umcg.nl