Nonparametric tests based on ranks
Outline
1. Concepts of parametric and non-parametric testing
2. Wilcoxon’s signed rank test for paired designs
3. Wilcoxon’s rank sum test for two independent samples
4. Kruskal-Wallis’ H test for multiple independent samples, and pairwise comparison
Review of parametric tests
Example: the independent-samples t test.
Assumptions: independent observations; normal distribution; equal variances.
Objective: to test, under these assumptions, whether the population means (parameters of the specified distribution) are equal.
Review of parametric tests
Key features:
Assume a particular distribution.
Make inferences about the parameters of that distribution.
They are therefore called parametric (specific-distribution-based) tests.
Non-parametric tests (distribution-free tests)
Used when the conditions for parametric testing are not met. For instance:
The variable does not follow a normal distribution.
The distribution is of a type that is not yet known (it cannot be specified with a finite number of parameters).
[Figure: frequency plotted against incidence number]
Non-parametric tests
Scenarios suitable for non-parametric tests:
a. Distribution unknown (conditions of parametric methods not met).
b. Ordinal data: the data have a ranking but no clear numerical interpretation.
c. Imprecise data (e.g. a value recorded only as >80).
d. A quick, brief analysis (for a pilot study).
Non-parametric tests
Key features:
Distribution-free: free of any particular distributional specification, but not free of all assumptions (e.g. independence, symmetry).
Do not make inferences about parameters (e.g. goodness-of-fit testing).
Justification of parametric and non-parametric tests
Non-parametric tests make fewer assumptions and so apply more widely.
However, when the assumptions of a parametric test hold, a non-parametric test used in its place has slightly lower power. In other words, a larger sample may be needed to reach a significant conclusion at the same test level.
So checking whether the assumptions are appropriate (statistical diagnostics) is essential.
1 Wilcoxon’s signed rank test
Used for two related samples or for repeated measurements on a single sample.
An alternative to the paired Student's t-test when the paired differences cannot be assumed to be normally distributed.
Named for the Irish-born US statistician Frank Wilcoxon (1892–1965), who, in a single paper in 1945, proposed both this test and the rank-sum test for two independent samples.
It was popularised by Siegel (1956) in his influential textbook on non-parametric statistics.
1 Wilcoxon’s signed rank test
Example 1: To study the difference in measured fluoride concentration between the electrode method and spectrophotometry, 11 industrial sewage samples were each measured by both methods. The results are listed in Table 1. (According to a normality test, the paired differences are not normally distributed.)
Rank sums from Table 1: T+ = 43.5; T- =
Steps:
(1) Hypotheses: H0: the median of the differences is 0; H1: the median of the differences is not 0; α = 0.05.
(2) Differences: rank the absolute differences (omit zeros; give ties their mean rank), then restore the signs. The statistic T is the rank sum of either the positive or the negative differences.
(3) P-value and conclusion: under the null hypothesis, T should not be far from its expected value n(n+1)/4 (the extremes are 0 and n(n+1)/2). From critical value table C9, T lies within the critical range (8-47), so P > 0.05 and H0 is not rejected.
Conclusion: we cannot infer that the concentrations measured by the two methods differ.
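Step (2) can be sketched in a few lines of Python. This is a minimal illustration only: the helper names `average_ranks` and `signed_rank_sums` are hypothetical, and the paired values are invented for the example (they are not the data of Table 1).

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def signed_rank_sums(x, y):
    """Return (n, T_plus, T_minus) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # omit zero differences
    ranks = average_ranks([abs(d) for d in diffs])   # rank absolute differences
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), t_plus, t_minus


# Hypothetical paired measurements (NOT the data of Table 1).
method_a = [10, 12, 9, 15, 11, 8, 7]
method_b = [9, 12, 11, 12, 15, 3, 9]
n, t_plus, t_minus = signed_rank_sums(method_a, method_b)
print(n, t_plus, t_minus)  # prints: 6 11.0 10.0
```

One pair has a zero difference and is dropped, so n = 6, and the two differences of size 2 share the mean rank 2.5. The two rank sums add up to n(n+1)/2 = 21, which is a quick check on the ranking.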
The total rank sum is always n(n+1)/2. The midpoint of the critical range is the expected rank sum: (8+47)/2 = n(n+1)/4 = 27.5.
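As a worked check of this relationship, the arithmetic can be written out as below. It assumes that the critical range 8-47 quoted in the steps is the one for the number of nonzero differences actually ranked; under that assumption the missing rank sum follows from the reported T+ = 43.5.

```latex
\frac{8+47}{2} = 27.5 = \frac{n(n+1)}{4}
\;\Rightarrow\; n(n+1) = 110
\;\Rightarrow\; n = 10,
\qquad
T_+ + T_- = \frac{n(n+1)}{2} = 55
\;\Rightarrow\; T_- = 55 - 43.5 = 11.5 .
```

So n = 10 would mean that one of the 11 pairs had a zero difference and was omitted, which is consistent with step (2).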
1 Wilcoxon’s signed rank test
When the sample size is beyond the range of the critical value table, a Z test (normal approximation) can be used. The statistic takes one form if there are no ties and a tie-corrected form if there are ties.
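The slide's own formulas are not preserved in the text, so the sketch below uses the usual textbook normal approximation: mean n(n+1)/4, variance n(n+1)(2n+1)/24, reduced by Σ(t_j³ − t_j)/48 when there are ties. The 0.5 continuity correction, the function names and the example numbers are assumptions made for illustration.

```python
import math


def signed_rank_z(t, n, tie_sizes=()):
    """Normal approximation for the signed rank statistic T.

    n is the number of nonzero differences; tie_sizes lists the size t_j
    of each group of tied absolute differences (empty if there are no ties).
    """
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    variance -= sum(t_j**3 - t_j for t_j in tie_sizes) / 48  # tie correction
    # 0.5 is a continuity correction (an assumption; some texts omit it).
    return max(abs(t - mean) - 0.5, 0.0) / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided P-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical example: n = 30 nonzero differences, T = 120, no ties.
z = signed_rank_z(120, 30)
p = two_sided_p(z)
print(round(z, 3), round(p, 4))
```

Passing `tie_sizes=(2, 3)` for the same T and n shrinks the variance and so gives a slightly larger Z, which is the whole effect of the tie correction.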
2 Wilcoxon’s rank sum test for two samples
Used for independent samples when the data are not normally distributed, or when it is uncertain whether the variable follows a normal distribution.
Named for Frank Wilcoxon (1945: equal sample sizes), and for Henry Berthold Mann, Austrian-born US mathematician and statistician, and Donald Ransom Whitney (1947: arbitrary sample sizes).
Also called the Mann–Whitney U test or the Mann–Whitney–Wilcoxon (MWW) test.
2 Wilcoxon’s rank sum test for two samples
Example 2: To study the difference in lethal effect between two drugs, two groups of Oncomelania snails were each treated with one of the drugs, and the mortality rates of the two groups were measured. (The data are not normally distributed.) The results are listed in Table 2.
Steps:
(1) Hypotheses: H0: the distributions of the two populations are the same; H1: the two distributions are not the same; α = 0.05.
(2) Rank all the observations in the two samples together (ties are handled in the same way). The statistic is the rank sum of the smaller sample: T = T1 = 71.5.
(3) P-value and conclusion: from critical value table C10, the critical range for α = 0.02 with sample sizes 7 and 7 is 34-71. T lies outside this range, so P < 0.02 and H0 is rejected.
Conclusion: there is a statistically significant difference between the two mortality rates.
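The joint ranking in step (2) can be sketched as follows. The helper names and the mortality figures are hypothetical (they are not the data of Table 2); only the procedure mirrors the steps above.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sums(sample1, sample2):
    """Pool two samples, rank them jointly, and return (T1, T2)."""
    ranks = average_ranks(list(sample1) + list(sample2))
    t1 = sum(ranks[:len(sample1)])
    t2 = sum(ranks[len(sample1):])
    return t1, t2


# Hypothetical mortality rates in percent (NOT the data of Table 2).
drug_a = [32, 36, 40, 44, 50, 55, 60]
drug_b = [20, 22, 25, 30, 32, 35, 41]
t1, t2 = rank_sums(drug_a, drug_b)
print(t1, t2)  # prints: 72.5 32.5
```

The two rank sums always add up to N(N+1)/2, here 14 × 15 / 2 = 105. Applied to the slide's own result with 7 + 7 observations, the same identity gives the other rank sum as 105 − 71.5 = 33.5.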
2 Wilcoxon’s rank sum test for two samples
When the sample size is beyond the range of the critical value table, a Z test (normal approximation) can be used. The statistic takes one form if there are no ties and a tie-corrected form if there are ties.
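The formulas themselves are missing from the slide text. The sketch below uses the standard normal approximation for the rank sum: mean n1(N+1)/2 and variance n1·n2(N+1)/12, with the variance multiplied by 1 − Σ(t_j³ − t_j)/(N³ − N) when there are ties. The continuity correction of 0.5 and the function names are assumptions. For illustration it is applied to T = 71.5 with n1 = n2 = 7 from Example 2, without a tie correction because the tie sizes in Table 2 are not available here.

```python
import math


def rank_sum_z(t1, n1, n2, tie_sizes=()):
    """Normal approximation for the rank sum T1 of the first sample.

    tie_sizes lists the size t_j of each group of tied observations.
    """
    n = n1 + n2
    mean = n1 * (n + 1) / 2
    variance = n1 * n2 * (n + 1) / 12
    variance *= 1 - sum(t_j**3 - t_j for t_j in tie_sizes) / (n**3 - n)
    # 0.5 is a continuity correction (an assumption; some texts omit it).
    return max(abs(t1 - mean) - 0.5, 0.0) / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided P-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


z = rank_sum_z(71.5, 7, 7)
p = two_sided_p(z)
print(round(z, 3), round(p, 4))
```

With samples this small the critical value table is the better tool, but the approximate P-value falls between 0.01 and 0.02, in line with the table-based conclusion P < 0.02.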
3 Kruskal-Wallis’ H test for comparing more than 2 samples
Used for testing the equality of population medians among groups.
Identical to a one-way ANOVA with the data replaced by their ranks.
Does not assume a normal population, but does assume an identically shaped and scaled distribution for each group, except for any difference in medians.
Named for the American mathematician and statistician William Henry Kruskal (1919–2005) and the American economist and statistician Wilson Allen Wallis, who introduced it in a 1952 paper.
An extension of the Mann–Whitney U test to 3 or more groups.
3.1 Kruskal-Wallis’ H test for comparing more than 2 samples
Example 3: Patients with chronic pharyngitis were grouped into 3 categories according to the treatment they received (A: treatment A; B: treatment B; C: treatment C). The efficacy results are listed in Table 3.
Rank sums from Table 3: R1 = 83182, R2 = 18070, R3 = 13229 (R = mean rank × total).
(1) Hypotheses: H0: the distributions of the three populations are all the same; H1: the distributions of the three populations are not all the same; α = 0.05.
(2) Rank all the observations in the three samples together (ties are handled in the same way). Rank sums for each sample: R1 = 83182, R2 = 18070, R3 = 13229.
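(3) Statistic H: one formula applies if there are no ties, and a tie-corrected statistic Hc is used if there are ties, where tj is the number of individuals in the j-th tie. Example 3 uses the tie-corrected statistic Hc.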
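Since the formulas are not preserved in the slide text, the sketch below uses the standard definitions: H = 12/(N(N+1)) · Σ R_i²/n_i − 3(N+1), and Hc = H / (1 − Σ(t_j³ − t_j)/(N³ − N)). The function names and the small data set are hypothetical; the group sizes of Example 3 are not available here, so its Hc cannot be recomputed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, Hc) for a list of samples; Hc is corrected for ties."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])  # rank sum of this group
        weighted += r_i**2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1        # t_j for each distinct value
    tie_term = sum(t**3 - t for t in counts.values())
    h_c = h / (1 - tie_term / (n_total**3 - n_total))
    return h, h_c


# Hypothetical ordinal scores for three groups (NOT the data of Table 3).
groups = [[1, 1, 2], [2, 3, 3], [3, 4, 4]]
h, h_c = kruskal_wallis_h(groups)
print(round(h, 4), round(h_c, 4))
```

The slide's rank sums pass the usual consistency check: 83182 + 18070 + 13229 = 114481 = N(N+1)/2 for N = 478.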
(4) P-value and conclusion
Compare H with the critical value of H (table C11) when the number of groups is at most 3 and the n of each group is at most 5.
Otherwise, compare it with the chi-square distribution with k − 1 degrees of freedom (k: number of groups).
In Example 3: from table C8, Hc = 51.41 exceeds the critical value, so P < 0.05.
Conclusion: efficacy is not at the same level in all groups.
3.2 Multiple comparison of mean ranks
When the overall comparison among the three groups shows a significant difference, multiple comparison is needed to find out which pairs differ. t tests for pairwise comparison can be used.
(1) Hypotheses: H0: the two population distributions in this pair have the same location; H1: the two population distributions in this pair have different locations; α = 0.05.
(2) Calculate the t value for each pair. Similarly, t(A,C) = 0.7071, t(B,C) =
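The slide's t formula is not preserved. One commonly used form for pairwise comparison of mean ranks after a Kruskal-Wallis test (in the style of Conover and Iman) is sketched below; whether it is the exact formula behind the values above is an assumption. The function name and example numbers are hypothetical, and the tie-corrected Hc is assumed to be passed as `h` when there are ties.

```python
import math


def pairwise_rank_t(mean_rank_a, mean_rank_b, n_a, n_b, n_total, k, h):
    """Return (t, degrees of freedom) for comparing two mean ranks.

    n_total is the total sample size N, k the number of groups and
    h the Kruskal-Wallis statistic.
    """
    pooled = n_total * (n_total + 1) * (n_total - 1 - h) / (12 * (n_total - k))
    se = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean_rank_a - mean_rank_b) / se, n_total - k


# Hypothetical example: three groups of 3 with mean ranks 2, 5 and 8 (H = 7.2).
t_ab, df = pairwise_rank_t(2.0, 5.0, 3, 3, n_total=9, k=3, h=7.2)
print(round(t_ab, 4), df)
```

Each t value is referred to the t distribution with N − k degrees of freedom. With several pairs tested at once, an adjustment of the test level for multiplicity may also be worth considering.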
(3) P-value and conclusion: efficacy in group A is at a different level from that of the other two groups. The comparison of mean ranks suggests that the patients in group A may have better efficacy.
Conclusion: treatment A may lead to better efficacy.
Summary
1. Concepts of parametric and non-parametric testing methods
2. Merits and limitations of nonparametric methods
3. Some nonparametric methods based on ranks
The End
Nonparametric tests I: Back to basics
Lecture Outline
What is a nonparametric test?
Rank tests, distribution-free tests and nonparametric tests
Which type of test to use
MTB > dotplot 'Male' 'Female';
SUBC> same.
[Dotplots of MALE and FEMALE on the same scale]
MTB > desc 'Male' 'Female'
[Descriptive statistics for MALE and FEMALE: N, Mean, Median, TrMean, StDev, SEMean, Min, Max, Q1, Q3]
Lecture Outline
What is a nonparametric test?
–What is a parameter?
–What are examples of non-parametric tests?
Rank tests, distribution-free tests and nonparametric tests
Which type of test to use
Parameters are central to inference in GLM and ANOVA and represent assumptions about the underlying processes
LET K1=4.7   # Group 1 mean minus grand mean
LET K2=-2.5  # Group 2 mean minus grand mean
LET K3=10.4  # The grand mean
LET K4=1.9   # Standard deviation of the error
RANDOM 30 'Error'
LET 'Y'=K3+K1*'DUM1'+K2*'DUM2'+K4*'Error'
Fitted value = μ + group effect, where the group effect is α1 for Group 1, α2 for Group 2 and −α1 − α2 for Group 3.
Error has a Normal distribution with zero mean and standard deviation σ.
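For readers without Minitab, the same simulation can be sketched in Python. Two things are assumptions here rather than facts from the slides: that the 30 observations are split as 10 per group, and that DUM1 and DUM2 are effect-coded (group 3 takes −1 on both), which is what the fitted value for Group 3 suggests. The function name is hypothetical.

```python
import random


def simulate_one_way(n_per_group=10, k1=4.7, k2=-2.5, k3=10.4, k4=1.9, seed=1):
    """Simulate Y = K3 + K1*DUM1 + K2*DUM2 + K4*Error for three groups."""
    rng = random.Random(seed)
    dum1 = {1: 1, 2: 0, 3: -1}  # effect coding (assumed)
    dum2 = {1: 0, 2: 1, 3: -1}
    rows = []
    for group in (1, 2, 3):
        for _ in range(n_per_group):
            error = rng.gauss(0, 1)  # standard Normal error, scaled by k4
            y = k3 + k1 * dum1[group] + k2 * dum2[group] + k4 * error
            rows.append((group, y))
    return rows


rows = simulate_one_way()
for group in (1, 2, 3):
    ys = [y for g, y in rows if g == group]
    print(group, round(sum(ys) / len(ys), 2))
```

With the error term switched off (`k4=0`), the three groups sit exactly at their fitted values 15.1, 7.9 and 8.2, which makes the role of each parameter easy to see.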
Parameters are central to inference in GLM and ANOVA, but they represent assumptions about the underlying processes and can be done without in some simple situations. But how?
[Worksheet columns: Rnk, Wt, Sex]
Remember ties.
Mean Rank
Compute the mean rank of the ‘Male’ weights and the mean rank of the ‘Female’ weights.
MTB > mann-whitney male female
Mann-Whitney Test and CI: MALE, FEMALE
MALE    N = 50   Median =
FEMALE  N = 50   Median =
Point estimate for ETA1-ETA2 is
Percent CI for ETA1-ETA2 is ( ,0.1200)
W =
The sum of ranks of 2763 corresponds to a mean rank of 2763/50 = 55.26.
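The arithmetic behind the mean ranks can be checked with a short sketch. It assumes that 2763 is the rank sum of the MALE sample and that all 100 weights were ranked together. The normal approximation at the end, including its 0.5 continuity correction and the absence of a tie adjustment, is an assumption for illustration and not a reproduction of Minitab's calculation.

```python
import math

n_male = n_female = 50
w_male = 2763                              # rank sum quoted above (assumed MALE)
n_total = n_male + n_female
total = n_total * (n_total + 1) // 2       # all ranks 1..100 add up to 5050
w_female = total - w_male                  # 2287

mean_rank_male = w_male / n_male           # 55.26
mean_rank_female = w_female / n_female     # 45.74

# Normal approximation for the rank sum, no tie adjustment.
mean_w = n_male * (n_total + 1) / 2
sd_w = math.sqrt(n_male * n_female * (n_total + 1) / 12)
z = (abs(w_male - mean_w) - 0.5) / sd_w
p = math.erfc(z / math.sqrt(2))            # two-sided P-value

print(mean_rank_male, mean_rank_female, round(z, 3), round(p, 3))
```

Under the null hypothesis both mean ranks would be close to (N+1)/2 = 50.5. The approximate two-sided P-value comes out near 0.10, which agrees with not rejecting at alpha = 0.05.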
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at
The test is significant at (adjusted for ties)
Cannot reject at alpha = 0.05
The null hypothesis is better expressed as “the distributions of male and female weights are the same”.
Nonparametric vs Parametric
Sign Test vs One-sample t-test
Mann-Whitney Test vs Two-sample t-test
Spearman Rank Test vs Correlation/Regression
Kruskal-Wallis Test vs One-way ANOVA
Friedman Test vs One-way blocked ANOVA
A rose by any other name...
Non-parametric tests lack parameters.
Rank tests start by ranking the data.
Distribution-free tests don’t assume a Normal distribution (or any other).
These are mainly, but not completely, overlapping sets of tests (and some are scale-invariant too).
Fewer assumptions but...
still some assumptions (including independence)
limited range of situations
–no more than 2 x-variables
–can’t mix continuous and categorical x-variables
provide p-values but estimation is dodgy
loss of efficiency if parametric assumptions are upheld
there is a grand scheme for parametric statistics (GLM) but a lot of separate strange names for nonparametrics
When is there a choice?
when there is a non-parametric test
–fewer than two or three variables altogether
and prediction is not required
How to choose:
If the assumptions of the parametric test are upheld, use it, on grounds of efficiency.
If they are not upheld, consider fixing the assumptions (e.g. by transforming the data, as in the practical).
If the assumptions are not fixable, use a nonparametric test.
MTB > dotplot 'LogM' 'LogF';
SUBC> same.
[Dotplots of LogM and LogF on the same scale]
MTB > desc 'LogM' 'LogF'
[Descriptive statistics for LogM and LogF: N, Mean, Median, TrMean, StDev, SEMean, Min, Max, Q1, Q3]
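Why a log transform can help to "fix the assumptions" is easy to see on a small right-skewed sample. The numbers below are invented for illustration (they are not the male or female weights), the base of the logarithm used for LogM and LogF is not stated in the slides (base 10 is assumed), and `skewness` is a hypothetical helper.

```python
import math


def skewness(values):
    """Moment-based sample skewness: 0 for a symmetric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed weights.
weights = [1, 2, 4, 8, 16, 32, 64]
log_weights = [math.log10(w) for w in weights]

print(round(skewness(weights), 2), round(abs(skewness(log_weights)), 2))
```

The raw values have a long right tail (skewness above 1), while their logs are evenly spaced and symmetric. A log transform preserves the order of the observations, so a rank-based test gives the same result before and after, whereas a t test can behave quite differently.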
Last remarks
Nonparametric tests are an opportunity to revise the basic ideas of statistical inference.
They are sometimes useful in biology.
They are often used in biology.
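PROC NPAR1WAY WILCOXON;  /* analysis based on Wilcoxon (rank) scores */
  VAR c;                 /* response variable */
  CLASS g;               /* grouping variable */
  FREQ n;                /* each row counts as n observations */
RUN;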
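The FREQ statement indicates that the data are held as a frequency table, with one row per group and response category. For such data the rank sums can be computed directly from the counts: every observation in a category receives that category's mean rank. The sketch below illustrates this under stated assumptions: the function name and the counts are hypothetical, and the categories are assumed to be listed in increasing order.

```python
def rank_sums_from_frequencies(table):
    """table[g][c] = number of subjects in group g with ordinal category c.

    Returns (mean rank of each category, rank sum of each group).
    """
    n_categories = len(table[0])
    category_totals = [sum(row[c] for row in table) for c in range(n_categories)]
    mean_ranks = []
    below = 0  # number of observations in lower categories
    for t in category_totals:
        mean_ranks.append(below + (t + 1) / 2)  # mean of ranks below+1..below+t
        below += t
    sums = [sum(count * mr for count, mr in zip(row, mean_ranks)) for row in table]
    return mean_ranks, sums


# Hypothetical counts: rows are groups A, B, C; columns are ordered categories.
table = [
    [2, 3, 5],
    [4, 3, 1],
    [4, 4, 2],
]
mean_ranks, sums = rank_sums_from_frequencies(table)
print(mean_ranks)  # prints: [5.5, 15.5, 24.5]
print(sums)        # prints: [180.0, 93.0, 133.0]
```

The rank sums add up to N(N+1)/2 = 28 × 29 / 2 = 406, and the category totals (10, 10, 8) are the tie sizes needed for a tie correction.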