Medical Statistics (full English class) Ji-Qian Fang School of Public Health Sun Yat-Sen University
Rank sum test
Parametric tests
The hypothesis-testing methods we have learnt so far
(1) assume that the variable follows a normal distribution;
(2) test whether the means (parameters) are equal under that assumption.
They are therefore called parametric tests.
Non-parametric tests (distribution-free tests)
They make no assumption about the form of the distribution.
The chi-square test is one kind of non-parametric test.
Rank sum tests are another kind of non-parametric test, based on the ranks of the data.
Non-parametric tests can be used in the following situations:
a. The distribution of the data is unknown;
b. The distribution of the data is skewed;
c. The data are ranked or imprecise;
d. A quick and easy analysis is wanted (e.g. for a pilot study).
They are suitable for a variety of data:
Measurement, enumeration or ordinal data
Normally distributed or not
Symmetric or not
However, if the data are suitable for parametric tests, the power of non-parametric tests will be slightly lower.
1. Wilcoxon’s signed rank sum test (matched pairs)
Example 11-1 The evaluation of nursing care before and after training. Score: 1, 2, 3, 4, …, 10
Table columns: No. of nurse (1); Before training (2); After training (3); Difference (4) = (3) - (2); Rank (5)
Positive rank sum = 60, negative rank sum = 6
Steps:
(1) Hypotheses: H0: The median of the differences is 0; H1: The median of the differences is not 0; α = 0.05.
(2) Calculate the differences.
(3) Rank the absolute differences (omitting zeros) and give the signs back to the ranks.
(4) Rank sums and statistic: T = min(positive sum, negative sum) = min(60, 6) = 6.
(5) P-value and conclusion: From Table 11-2, $T_{0.05,11} = 10 > T$, so P < 0.05 and H0 is rejected.
Conclusion: The training is effective.
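Steps (2) to (4) can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `mean_ranks` and `signed_rank_T` are made up for this example, and the scores are hypothetical, not the data of Example 11-1. Tied absolute differences are assumed to share the mean of their ranks.

```python
def mean_ranks(values):
    """Rank values in ascending order (1-based); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def signed_rank_T(before, after):
    """Return (n, positive rank sum, negative rank sum, T) for paired data."""
    diffs = [a - b for b, a in zip(before, after)]
    diffs = [d for d in diffs if d != 0]          # omit zero differences
    ranks = mean_ranks([abs(d) for d in diffs])   # rank absolute differences
    pos_sum = sum(r for d, r in zip(diffs, ranks) if d > 0)
    neg_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), pos_sum, neg_sum, min(pos_sum, neg_sum)


# Hypothetical scores for 8 nurses (NOT the data of Example 11-1)
before = [4, 5, 6, 3, 7, 5, 6, 4]
after = [6, 5, 9, 2, 8, 9, 6, 9]
n, pos_sum, neg_sum, T = signed_rank_T(before, after)
print(n, pos_sum, neg_sum, T)
```

A useful check is that the two rank sums always add up to n(n+1)/2, where n is the number of non-zero differences. In Example 11-1, 60 + 6 = 66 = 11 × 12 / 2.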
2. Wilcoxon’s rank sum test for two samples
For two independent samples, when the variable is not normally distributed or when it is not clear whether it follows a normal distribution.
Example 11-2 Nicotine content of two brands of cigarettes
Table columns: Cigarette A (1); Rank (2); Cigarette B (3); Rank (4)
n1 = 6, T1 = 40.5; n2 = 8, T2 = 64.5
(1) Hypotheses: H0: The distributions of the two populations are the same; H1: The two distributions are not the same; α = 0.05.
(2) Rank all the observations of the two samples together. If the same value appears more than once (a tie), give each the mean rank. For example, "28" appears in both samples and would take ranks 9 and 10, so each gets (9 + 10)/2 = 9.5.
(3) Take the rank sum of the smaller sample: T = T1 = 40.5.
(4) P-value and conclusion (Table 11-4): $T_{0.05,6,2}$ = 29~61; T is inside the range, so P > 0.05.
Conclusion: No significant difference between the two brands.
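The ranking and rank-sum steps above can be sketched as follows. The function names `pooled_mean_ranks` and `rank_sums` are illustrative only, and the two samples are hypothetical values chosen to contain one tie across samples; they are not the nicotine data of Example 11-2.

```python
def pooled_mean_ranks(values):
    """Rank values in ascending order (1-based); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def rank_sums(sample_a, sample_b):
    """Rank the pooled observations and return the rank sum of each sample."""
    ranks = pooled_mean_ranks(list(sample_a) + list(sample_b))
    t_a = sum(ranks[:len(sample_a)])
    t_b = sum(ranks[len(sample_a):])
    return t_a, t_b


# Hypothetical samples (NOT the data of Example 11-2); 28 is tied across samples
sample_a = [20, 24, 28, 31]
sample_b = [22, 28, 30, 33, 35]
T1, T2 = rank_sums(sample_a, sample_b)
print(T1, T2)

# Consistency check on the rank sums reported in Example 11-2:
# T1 + T2 must equal N(N+1)/2 with N = n1 + n2 = 14
example_total = 40.5 + 64.5
print(example_total == 14 * 15 / 2)
```

The statistic T is the rank sum of the smaller sample, which is then compared with the tabulated range as in step (4).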
3. Kruskal-Wallis’ H test for comparing more than 2 samples (1) For raw data (2) For contingency table with ordinal categories
(1) For raw data
Example 11-3 Survival months of liver cancer patients treated by 3 operation programs
Table columns: A (1); Rank (2); B (3); Rank (4); C (5); Rank (6)
Bottom rows: rank sums R_i; sample sizes n_i = 5, 5, 5
(1) Hypotheses: H0: The distributions of the three populations are all the same; H1: The distributions of the three populations are not all the same.
(2) Rank all the observations of the three samples together (ties are handled in the same way as before).
(3) Rank sum of each sample: R1 = 34, R2 = 60, R3 = 26.
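(4) Statistic H
If there is no tie:
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$
If there are ties:
$$H_c = \frac{H}{C}, \qquad C = 1 - \frac{\sum_j (t_j^3 - t_j)}{N^3 - N}$$
$N$: total number of observations; $n_i$, $R_i$: size and rank sum of the i-th sample; $k$: number of samples; $t_j$: number of individuals in the j-th tie.
Example 11-3: with N = 15, n_i = 5 and rank sums 34, 60, 26, the no-tie formula gives
$$H = \frac{12}{15 \times 16} \cdot \frac{34^2 + 60^2 + 26^2}{5} - 3 \times 16 = 6.32$$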
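As a quick numerical check, the H statistic can be computed directly from the rank sums of Example 11-3 (R = 34, 60, 26 with five patients per group). The sketch below assumes the standard Kruskal-Wallis form of H and of the tie correction factor; the function names are illustrative, and since the raw survival data are not reproduced here, the tie correction is shown only on a hypothetical tie pattern.

```python
def kruskal_wallis_H(rank_sums, sizes):
    """H without tie correction: 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)."""
    N = sum(sizes)
    total = sum(R * R / n for R, n in zip(rank_sums, sizes))
    return 12 / (N * (N + 1)) * total - 3 * (N + 1)


def tie_correction(tie_sizes, N):
    """C = 1 - sum(t_j^3 - t_j) / (N^3 - N); the corrected statistic is H / C."""
    return 1 - sum(t ** 3 - t for t in tie_sizes) / (N ** 3 - N)


# Rank sums and sample sizes from Example 11-3
H = kruskal_wallis_H([34, 60, 26], [5, 5, 5])
print(round(H, 2))

# Hypothetical tie pattern: one pair of tied observations among N = 15
C = tie_correction([2], 15)
print(round(H / C, 2))
```

Because C is never larger than 1, correcting for ties can only increase H.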
(5) P-value and conclusion
Compare H with the critical value of H (Table 11-6), or compare H with $\chi^2_{\alpha,\nu}$, where $\nu = k - 1$ and k is the number of samples.
Example 11-3: P < 0.05, H0 is rejected.
Conclusion: The survival months under the three programs are not all equal.
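For three samples the chi-square approximation has ν = k - 1 = 2 degrees of freedom, and in that special case the upper-tail probability has the closed form exp(-x/2), so the comparison can be sketched with the standard library alone. The sketch recomputes H from the rank sums of Example 11-3 with the no-tie formula (an assumption, since the raw data are not shown); for other values of ν a statistics library or a chi-square table is needed.

```python
import math


def chi2_upper_tail_df2(x):
    """Upper-tail probability of the chi-square distribution with 2 df."""
    return math.exp(-x / 2)


# H from the rank sums of Example 11-3 (no-tie formula, N = 15, n_i = 5)
N = 15
H = 12 / (N * (N + 1)) * (34 ** 2 + 60 ** 2 + 26 ** 2) / 5 - 3 * (N + 1)

alpha = 0.05
critical = -2 * math.log(alpha)   # chi-square critical value for 2 df
p_value = chi2_upper_tail_df2(H)

print(round(critical, 2), round(p_value, 3), H > critical)
```

With small samples such as five per group, the exact critical value of H from the table is preferable to the chi-square approximation.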
(2) For a contingency table with ordinal categories
Conclusion: The pregnancy weeks of the 3 milk-secretion groups are significantly different.
4. Rank test for multiple comparison
When the comparison among three or more groups gives a significant difference, multiple comparison is needed to find out which pairs of groups differ. A t test for pair-wise comparison can be used.
H0: The locations of populations A and B are the same
H1: The locations of populations A and B are different
If $|t| \ge t_{\alpha/2,\nu}$ with $\nu = N - k$, then reject H0.
Example 11-3 Survival months of liver cancer patients treated by 3 operation programs
Table columns: A (1); Rank (2); B (3); Rank (4); C (5); Rank (6)
Bottom rows: rank sums R_i; sample sizes n_i = 5, 5, 5
Conclusion: The survival months of patients treated by Programs A and B are significantly different.
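(1) Hypotheses: H0: the two population distributions in the pair have the same location; H1: the two population distributions in the pair have different locations; α = 0.05.
(2) Calculate the t value:
$$t = \frac{\bar{R}_A - \bar{R}_B}{\sqrt{\dfrac{N(N+1)(N-1-H)}{12(N-k)} \left( \dfrac{1}{n_A} + \dfrac{1}{n_B} \right)}}$$
where $\bar{R}_A = R_A / n_A$ and $\bar{R}_B = R_B / n_B$ are the mean ranks of the two samples being compared.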
The other results are listed in the table.
(3) Determine the P-value with $\nu = N - k = 15 - 3 = 12$.
Conclusion: Operation B is better than the other two operations.
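The three pair-wise comparisons can be sketched numerically. This sketch rests on two assumptions that should be checked against the textbook: that the t statistic is the difference of mean ranks divided by the square root of N(N+1)(N-1-H) / (12(N-k)) × (1/n_A + 1/n_B), and that H is the uncorrected value computed from the rank sums 34, 60, 26. The critical value 2.179 is the two-sided 5% point of the t distribution with 12 degrees of freedom.

```python
import math
from itertools import combinations

# Rank sums and sample sizes from Example 11-3
rank_sums = {"A": 34, "B": 60, "C": 26}
sizes = {"A": 5, "B": 5, "C": 5}

N = sum(sizes.values())
k = len(sizes)

# Kruskal-Wallis H without tie correction (assumed, raw data not shown)
H = 12 / (N * (N + 1)) * sum(rank_sums[g] ** 2 / sizes[g] for g in sizes) - 3 * (N + 1)

T_CRITICAL = 2.179  # two-sided t critical value, alpha = 0.05, nu = N - k = 12


def pairwise_t(a, b):
    """t statistic comparing the mean ranks of groups a and b (assumed form)."""
    mean_diff = rank_sums[a] / sizes[a] - rank_sums[b] / sizes[b]
    variance = N * (N + 1) * (N - 1 - H) / (12 * (N - k)) * (1 / sizes[a] + 1 / sizes[b])
    return mean_diff / math.sqrt(variance)


results = {pair: pairwise_t(*pair) for pair in combinations(sizes, 2)}
for (a, b), t in results.items():
    verdict = "significant" if abs(t) >= T_CRITICAL else "not significant"
    print(f"{a} vs {b}: t = {t:.3f} ({verdict})")
```

Under these assumptions the pairs A-B and B-C come out significant and A-C does not, which agrees with the conclusion that Operation B differs from the other two.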