Non-parametric methods in statistical testing

Douwe Postmus

Content
Comparison of a continuous outcome variable between two groups
Independent samples
  Student t-test (parametric)
  Mann-Whitney-Wilcoxon test (non-parametric)
Dependent samples
  Paired samples t-test (parametric)
  Wilcoxon signed-rank test (non-parametric)

Is a mother’s smoking status during pregnancy associated with the birth weight of an infant?

Parametric model
The birth weight of infants from mothers who smoked during the pregnancy is normally distributed with mean µ1 and variance σ²
The birth weight of infants from mothers who did not smoke during the pregnancy is normally distributed with mean µ2 and variance σ²
The study data are independent samples from these two probability distributions

Hypotheses
H0: µ1 = µ2 (or µ1 - µ2 = 0)
H1: µ1 ≠ µ2 (or µ1 - µ2 ≠ 0)

Results
If H0 is true, t follows a t-distribution with n1 + n2 − 2 degrees of freedom
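The formula for t itself did not survive in this transcript. As a minimal sketch, assuming the standard pooled-variance form of the Student t statistic (which matches the equal-variance model and the n1 + n2 − 2 degrees of freedom stated above), the calculation could look as follows. The function name and the birth weights are hypothetical and are not the study data.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group1, group2):
    """Pooled-variance (Student) t statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance shared by both groups
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    # Standard error of the difference between the two sample means
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, df


# Hypothetical birth weights in grams (illustration only)
smokers = [2900, 3100, 2750, 3050, 2800]
non_smokers = [3300, 3450, 3150, 3600, 3250]
t, df = student_t(smokers, non_smokers)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The resulting t would then be compared with a t-distribution on `df` degrees of freedom to obtain a p-value.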

Are our assumptions reasonable?

What if the assumptions are not met?
Normality   Equal variances   Approach
+           -                 Welch’s t-test
-           +                 Intermediate or large samples (n>30): Student t-test (central limit theorem)
                              Small samples: Mann-Whitney-Wilcoxon test
-           -                 Welch’s t-test performed on ranked data

Mann-Whitney-Wilcoxon test
The observations from group 1 are a random sample from some probability distribution F
The observations from group 2 are a random sample from some probability distribution G
The observations from group 1 are independent of the observations from group 2 (no pairing or clustering)

Hypotheses
H0: F is equal to G
H1: F and G have the same shape but a different median (location-shift interpretation)

Test statistic (Wilcoxon W)
Step 1: group all observations together and rank them in order of increasing size
Step 2: calculate the sum of the ranks W of the observations that came from one of the groups
Group selection is arbitrary, as the sum of the two rank sums is known (N(N+1)/2 for N observations in total)
SPSS selects the group with the fewest observations

Example - 10 randomly selected infants

Step 1: rank the observations

Step 2: calculate the sum of the ranks for infants from mothers who smoked during pregnancy
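The data table for this example is not part of the transcript, so the two steps are sketched here on hypothetical birth weights. The helper names are illustrative, and giving tied values the average of their ranks (midranks) is an assumption that the slides do not discuss.

```python
def midranks(values):
    """Rank values in increasing order; tied values share the average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def rank_sum(group, other):
    """Wilcoxon rank sum W of `group` after pooling it with `other`."""
    ranks = midranks(list(group) + list(other))  # Step 1: rank the pooled data
    return sum(ranks[: len(group)])              # Step 2: sum the ranks of one group


# Hypothetical birth weights in grams (not the 10 infants from the slides)
smokers = [2750, 2900, 3050, 3100]
non_smokers = [2800, 3150, 3250, 3300, 3450, 3600]
w = rank_sum(smokers, non_smokers)
print("W =", w)
```

Because the two rank sums always add up to N(N+1)/2, calling `rank_sum(non_smokers, smokers)` carries the same information.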

Distribution of W under H0
If H0 is true, each assignment of ranks to the 10 infants is equally likely
In total, there are 10*9*8*…*1 = 10! = 3,628,800 possible ways to assign ranks to the 10 infants
The distribution of W is obtained by enumerating the rank sums for all of these assignments
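The enumeration can be sketched in a few lines. Rather than looping over all 10! orderings, the sketch below enumerates which set of ranks ends up in the chosen group; every such set corresponds to the same number of orderings, so the resulting distribution of W is the same. The 4/6 split of the 10 infants is a hypothetical choice (the transcript does not give the group sizes), and ties are assumed to be absent.

```python
from collections import Counter
from itertools import combinations
from math import comb


def exact_w_distribution(n_w, n_other):
    """Exact null distribution of the rank sum W for a group of size n_w."""
    n_total = n_w + n_other
    # Every subset of n_w ranks out of 1..N is equally likely under H0
    counts = Counter(sum(c) for c in combinations(range(1, n_total + 1), n_w))
    n_subsets = comb(n_total, n_w)
    return {w: k / n_subsets for w, k in sorted(counts.items())}


dist = exact_w_distribution(4, 6)  # hypothetical group sizes
w_obs = 13                         # hypothetical observed rank sum
p_lower = sum(p for w, p in dist.items() if w <= w_obs)
print(f"P(W <= {w_obs}) = {p_lower:.4f}")
```

A two-sided p-value would add the probability of the equally extreme rank sums in the upper tail.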

Results

Full dataset

Results
U = W − n2*(n2+1)/2, where n2 is the number of observations in the group whose ranks were summed to obtain W
If H0 is true, U is approximately normally distributed with mean (n1*n2)/2 and variance n1*n2*(n1+n2+1)/12
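As a rough sketch, the normal approximation can be evaluated with the standard library alone. Here `n_w` stands for the size of the group whose ranks were summed (n2 in the formula above) and `n_other` for the size of the other group; both names are illustrative. The sketch applies neither a continuity correction nor a correction for ties, which statistical packages may add.

```python
from math import erfc, sqrt


def u_normal_approx(w, n_w, n_other):
    """U statistic, z-score and two-sided p-value from the normal approximation."""
    u = w - n_w * (n_w + 1) / 2
    mean_u = n_w * n_other / 2
    var_u = n_w * n_other * (n_w + n_other + 1) / 12
    z = (u - mean_u) / sqrt(var_u)
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|))
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u, z, p_two_sided


# Hypothetical rank sum and group sizes (illustration only)
u, z, p = u_normal_approx(w=13, n_w=4, n_other=6)
print(f"U = {u}, z = {z:.3f}, two-sided p = {p:.4f}")
```

With groups this small the exact distribution of W is preferable; the approximation is intended for the full dataset.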

Content
Comparison of a continuous outcome variable between two groups
Independent samples
  Student t-test (parametric)
  Mann-Whitney-Wilcoxon test (non-parametric)
Dependent samples
  Paired samples t-test (parametric)
  Wilcoxon signed-rank test (non-parametric)

How effective is a low-carb diet?
Twenty subjects were put on a low-carb diet
Two measurements per subject
  Body weight at the start of the diet (kg)
  Body weight 16 weeks after the start of the diet (kg)

Study data

Paired samples t-test
Assumptions: The difference in body weight between baseline and week 16 is normally distributed with mean µ and variance σ²
Hypotheses:
H0: µ = 0
H1: µ ≠ 0

Results
If H0 is true, t follows a t-distribution with n-1 degrees of freedom
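The transcript does not include the formula for the paired t statistic. A minimal sketch, assuming the usual form (mean difference divided by its standard error), is given below. The direction of the difference (baseline minus week 16) and the body weights are assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(baseline, follow_up):
    """Paired-samples t statistic and its degrees of freedom."""
    # One difference per subject; direction chosen here as baseline - follow-up
    diffs = [b - f for b, f in zip(baseline, follow_up)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical body weights in kg for five subjects (not the study data)
week_0 = [92.0, 85.5, 101.2, 78.4, 88.9]
week_16 = [88.5, 84.0, 97.0, 79.1, 85.2]
t, df = paired_t(week_0, week_16)
print(f"t = {t:.3f} on {df} degrees of freedom")
```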

Are our assumptions reasonable?

Wilcoxon signed-rank test
Assumptions: The distribution of the difference in body weight between baseline and week 16 is symmetric around the median
Hypotheses:
H0: The distribution of the differences is symmetric around zero
H1: The distribution of the differences is symmetric around a value other than zero

Test statistic (W+)
Step 1: rank the absolute values of the differences
  Exclude pairs for which the difference is zero
Step 2: obtain signed ranks by restoring the signs of the differences to the ranks
Step 3: calculate the sum of the ranks W+ that have a positive sign

Step 1: rank the absolute values of the differences

Step 2: obtain signed ranks by restoring the signs of the differences to the ranks

Step 3: calculate the sum of the ranks W+ that have a positive sign
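The three steps can be sketched as follows. The differences are hypothetical (the study data table is not in the transcript), and assigning tied absolute differences the average of their ranks is an assumption beyond what the slides state.

```python
def signed_rank_w_plus(diffs):
    """Wilcoxon signed-rank statistic W+ for a list of paired differences."""
    # Exclude pairs for which the difference is zero
    nonzero = [d for d in diffs if d != 0]
    abs_vals = [abs(d) for d in nonzero]
    # Step 1: rank the absolute values (ties share the average rank)
    ranks = [
        sum(v < x for v in abs_vals) + (sum(v == x for v in abs_vals) + 1) / 2
        for x in abs_vals
    ]
    # Steps 2 and 3: restore the signs and sum the ranks with a positive sign
    return sum(r for r, d in zip(ranks, nonzero) if d > 0)


# Hypothetical weight differences in kg (illustration only)
diffs = [1.5, -0.5, 2.0, 0.0, -1.0]
print("W+ =", signed_rank_w_plus(diffs))
```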

Distribution of W+ under H0
If H0 is true, any particular assignment of signs to the 20 ranks is equally likely
In total, there are 2^20 = 1,048,576 possible ways to assign signs to the 20 ranks
The distribution of W+ is obtained by enumerating the sum of the ranks that have a positive sign for all of these assignments
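Listing all sign assignments one by one is feasible for 20 ranks, but the same tally can be obtained faster with a counting recursion: for every possible value of W+, count how many subsets of the ranks 1..n add up to it. The sketch below uses that recursion instead of literal enumeration. It assumes no ties and no zero differences; the function names and the observed value are hypothetical.

```python
def w_plus_distribution(n):
    """Exact null distribution of W+ for n untied, non-zero differences."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1  # the assignment in which every sign is negative
    for rank in range(1, n + 1):
        # Go downwards so that each rank is used at most once per subset
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    n_assignments = 2 ** n
    return [c / n_assignments for c in counts]


def two_sided_p(w_plus, n):
    """Probability of a W+ at least as far from its null mean as observed."""
    dist = w_plus_distribution(n)
    centre = n * (n + 1) / 4
    dev = abs(w_plus - centre)
    return sum(p for s, p in enumerate(dist) if abs(s - centre) >= dev)


# Hypothetical observed statistic for 20 subjects (illustration only)
print(f"two-sided p = {two_sided_p(170, 20):.4f}")
```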

Results

Summary
Non-parametric methods do not assume that the data follow any particular distributional form
More generally applicable than the corresponding parametric models at the cost of a (slight) reduction in power
Non-parametric methods are not assumption free!

Next lectures
Date          Location   Speaker        Topic
12 February   TBD        H. Burgerhof   Multiple testing: problems and some solutions
9 April                  D. Postmus     Kaplan-Meier survival curves and the log-rank test
11 June

contact: d.postmus@umcg.nl
