Lecture 16 Nonparametric Two-Sample Tests

Outline of Today
Wilcoxon Rank-Sum Test
Mann-Whitney Test
Kruskal-Wallis H Test
9/21/2018 SA3202, Lecture 16

Wilcoxon Rank-Sum Test for Two Independent Samples
The problem
A common statistical problem is to compare two populations A and B based on independent samples. The usual parametric test is the two-sample t-test. There are two equivalent nonparametric tests for comparing two populations: Wilcoxon’s rank-sum test and the Mann-Whitney U test.
Let X1, X2, …, Xn1 ~ Population A and Y1, Y2, …, Yn2 ~ Population B. We wish to test
H0: X ~ Y against H1: X ~ Y + theta
that is, under H1 the distribution of X is the distribution of Y shifted by theta.
Suppose we rank the combined (pooled) sample, and let R1, R2, …, Rn1 denote the ranks of the observations from Population A. Let
W = R1 + R2 + … + Rn1

Rejection rules for W
For H1: theta>0, reject H0 when W is too large
For H1: theta<0, reject H0 when W is too small
For H1: theta≠0, reject H0 when W is too large or too small
For large sample sizes (n1, n2>=10), under H0, we can use the Normal Approximation for W with
E(W) = n1(n1+n2+1)/2
Var(W) = n1n2(n1+n2+1)/12
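The rank-sum statistic and its normal approximation can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function names `midranks` and `rank_sum_z` and the sample values are hypothetical, tied values receive the average of their ranks, and the variance is the no-ties formula given above (no tie or continuity correction is applied).

```python
import math

def midranks(values):
    """Return 1-based ranks of `values`, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(x, y):
    """Return (W, z): the rank sum of sample x and its standardized value under H0."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                    # W = R1 + ... + Rn1
    mean = n1 * (n1 + n2 + 1) / 2          # E(W)
    var = n1 * n2 * (n1 + n2 + 1) / 12     # Var(W), no-ties formula
    return w, (w - mean) / math.sqrt(var)

# Hypothetical data, chosen only to illustrate the calculation.
w, z = rank_sum_z([1, 2, 3, 6, 8], [4, 5, 7])
print(w, z)
```

A large positive z supports theta>0 and a large negative z supports theta<0. The samples in this usage example are far smaller than the n1, n2>=10 guideline, so the z value here only illustrates the arithmetic.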

Mann-Whitney Test
Mann and Whitney proposed an alternative test for the two-sample problem. Their test is equivalent to the Wilcoxon rank-sum test, but it is more convenient to use because tables of critical values are readily available.
The Test
The Mann-Whitney statistic is obtained by ordering all the observations and counting the number of times that an observation in the first sample precedes (is smaller than) an observation in the second sample:
Ua = #{(i,j) | Xi<Yj} = U1+U2+…+Un1, where Ui = #{j | Xi<Yj}
Ui is the number of observations in Sample B that the i-th member of Sample A precedes.

Example
Suppose that, for two samples of sizes n1=5, n2=3, the ordered observations are
A A A B B A B A
Then U1=3, U2=3, U3=3, U4=1, U5=0, so Ua=3+3+3+1+0=10.
Ub can be similarly defined. Note that Ua+Ub=n1n2, since the total number of comparisons is n1n2. Clearly, Ua tends to be “too large” (“too small”) when the distribution of X is to the left (right) of the distribution of Y.
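The counts in this example can be checked by direct enumeration. The sketch below is an illustration only: it encodes the ordering A A A B B A B A by using each observation's position as its value (an assumption made purely for convenience), and the function name `mann_whitney_counts` is hypothetical.

```python
def mann_whitney_counts(x, y):
    """Return (Ua, Ub): pairs with Xi < Yj and pairs with Yj < Xi."""
    ua = sum(1 for xi in x for yj in y if xi < yj)
    ub = sum(1 for xi in x for yj in y if yj < xi)
    return ua, ub

# Use the position in the ordered sequence as a stand-in for the observed value.
labels = "AAABBABA"
x = [pos for pos, lab in enumerate(labels, start=1) if lab == "A"]
y = [pos for pos, lab in enumerate(labels, start=1) if lab == "B"]

# Ui = number of B observations that the i-th A observation precedes.
per_observation = [sum(1 for yj in y if xi < yj) for xi in x]
ua, ub = mann_whitney_counts(x, y)

print(per_observation)  # [3, 3, 3, 1, 0]
print(ua, ub)           # 10 5
```

The two totals add up to n1n2 = 15, as noted in the example.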

Equivalence between Ua and W
Note that Ua will be large when the rank-sum statistic W is small, and vice versa. In fact it can be shown that
Ua = n1n2 + n1(n1+1)/2 - W
which provides a more convenient method of computing Ua, and shows that a test based on Ua is equivalent to a test based on W. It also enables us to derive the mean and variance of Ua:
E(Ua) = n1n2/2, Var(Ua) = n1n2(n1+n2+1)/12
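The identity can be derived in a few lines, assuming there are no ties. Write X_(1) < … < X_(n1) for the ordered Sample A values and R_(i) for the pooled rank of X_(i); this ordered-sample notation is introduced here only for the derivation. The pooled rank of X_(i) is i plus the number of Sample B observations below it, and summing those counts over i gives Ub = n1n2 - Ua.

```latex
\begin{align*}
R_{(i)} &= i + \#\{\, j : Y_j < X_{(i)} \,\}, \\
W &= \sum_{i=1}^{n_1} R_{(i)}
   = \frac{n_1(n_1+1)}{2} + U_b
   = \frac{n_1(n_1+1)}{2} + n_1 n_2 - U_a, \\
U_a &= n_1 n_2 + \frac{n_1(n_1+1)}{2} - W, \\
E(U_a) &= n_1 n_2 + \frac{n_1(n_1+1)}{2} - \frac{n_1(n_1+n_2+1)}{2}
        = \frac{n_1 n_2}{2}, \\
\operatorname{Var}(U_a) &= \operatorname{Var}(W)
        = \frac{n_1 n_2 (n_1+n_2+1)}{12}.
\end{align*}
```

The variance is unchanged because Ua differs from -W only by a constant.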

Remarks
Under H0, Ua and Ub have the same, symmetric distribution. Thus, under H0,
Pr(Ua<=U0) = Pr(Ua>=n1n2-U0)
This can be used to find the upper quantiles when the lower quantiles are given.
The test can be conducted using just the lower quantiles via the following procedure:
For H1: theta>0, reject H0 when Ua is too small (X tends to exceed Y, so few pairs have Xi<Yj)
For H1: theta<0, reject H0 when Ub is too small (Y tends to exceed X, so few pairs have Yj<Xi)
For H1: theta≠0, reject H0 when Ua or Ub is too small
For large sample sizes (n1, n2>=10), under H0, we can use the Normal Approximation for Ua or Ub.
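For small samples without ties, the lower quantiles can be obtained by brute force: under H0 every choice of n1 pooled ranks for Sample A is equally likely. The following sketch enumerates those choices and uses the relation Ua = n1n2 + n1(n1+1)/2 - W. The function names are hypothetical, and the enumeration is only practical for small n1 and n2.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def exact_null_distribution(n1, n2):
    """Exact distribution of Ua under H0 (no ties), as {value: probability}."""
    n = n1 + n2
    counts = Counter()
    for ranks_a in combinations(range(1, n + 1), n1):
        w = sum(ranks_a)                           # rank sum of Sample A
        ua = n1 * n2 + n1 * (n1 + 1) // 2 - w      # Ua from W
        counts[ua] += 1
    total = sum(counts.values())
    return {u: Fraction(c, total) for u, c in sorted(counts.items())}

def lower_tail(dist, u0):
    """Pr(U <= u0) under the given distribution."""
    return sum(p for u, p in dist.items() if u <= u0)

def upper_tail(dist, u0):
    """Pr(U >= u0) under the given distribution."""
    return sum(p for u, p in dist.items() if u >= u0)

dist = exact_null_distribution(5, 3)
# Symmetry: Pr(Ua <= 2) equals Pr(Ua >= n1*n2 - 2).
print(lower_tail(dist, 2), upper_tail(dist, 5 * 3 - 2))
```

Exact fractions are used so that the symmetry of the two tails can be compared without rounding error.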

Example
An experiment was conducted to compare the strengths of two types of kraft papers. The following table gives the strength measurements for 10 randomly selected pieces of each type of paper, together with their ranks:
[Table columns: Standard (A), Rank, Treated (B), Rank; the data rows are not preserved.]
Wa=85.5, Wb=124.5
H0: there is no difference in the distribution of strengths for A and B
H1: B tends to be of greater strength
Ub = n1n2 + n2(n2+1)/2 - Wb = 10*10 + 10*11/2 - 124.5 = 30.5
When n1=n2=10, the table value is Pr(Ub<=28)= [value not preserved]. Thus, Ub is not that small. H0 is not rejected.
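As a rough cross-check of this conclusion, the normal approximation from the Remarks can be applied to Ub=30.5. This sketch is an addition, not the table-based calculation used on the slide: it uses the no-ties variance, applies no continuity correction, and the helper `normal_cdf` is a hypothetical name built on `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n1 = n2 = 10
wb = 124.5                                   # rank sum of Sample B (from the example)
ub = n1 * n2 + n2 * (n2 + 1) / 2 - wb        # 30.5
mean_u = n1 * n2 / 2                         # E(U) = 50
var_u = n1 * n2 * (n1 + n2 + 1) / 12         # Var(U) = 175
z = (ub - mean_u) / math.sqrt(var_u)
p_lower = normal_cdf(z)                      # approximate Pr(Ub <= 30.5) under H0

print(ub, round(z, 3), round(p_lower, 3))
```

The approximate lower-tail probability comes out at about 0.07, above the usual 5% level, which agrees with not rejecting H0.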

The Kruskal-Wallis H Test
Suppose that independent samples are taken from k distributions, and consider testing
H0: The k distributions are the same
H1: At least two of them differ in location
Parametric Procedure: F test (one-way ANOVA), assuming normality and common variance
Nonparametric Procedure: the Kruskal-Wallis H test (Nonparametric ANOVA)
Let ni be the size of the sample from the i-th population and let n be their total:
n = n1+n2+…+nk
We rank the n observations from 1 to n (ties treated as usual). Let Ri denote the sum of the ranks of the observations from the i-th population, and let Ai denote the associated average rank
Ai = Ri/ni, i=1,2,…,k
The average of all the ranks:
A = (R1+R2+…+Rk)/n = (1+2+…+n)/n = (n+1)/2

The Test Statistic
The test statistic is
V = n1(A1-A)^2+…+nk(Ak-A)^2
Under H0, all Ai are about the same, and equal to A, so that V will be small. Otherwise, V will be large. So a test may be based on V.
The Kruskal-Wallis H Test statistic
H = 12V/(n(n+1))
The exact distribution of H under H0 can be found by elementary methods or via simulation. For large sample sizes, H is approximately chi-square distributed with k-1 degrees of freedom provided all ni>=5.
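The definition of H translates directly into code. The sketch below is a minimal illustration with hypothetical function names and made-up data: it pools the samples, assigns average ranks to ties, forms the average rank Ai of each sample, and returns H = 12V/(n(n+1)). No tie correction is applied to H.

```python
def midranks(values):
    """Return 1-based ranks of `values`, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(samples):
    """Return H = 12 V / (n (n + 1)) for a list of samples."""
    pooled = [v for s in samples for v in s]
    n = len(pooled)
    ranks = midranks(pooled)
    grand_avg = (n + 1) / 2                  # A, the average of all ranks
    v = 0.0
    start = 0
    for s in samples:
        r_i = sum(ranks[start:start + len(s)])   # Ri, rank sum of sample i
        a_i = r_i / len(s)                       # Ai, average rank of sample i
        v += len(s) * (a_i - grand_avg) ** 2
        start += len(s)
    return 12 * v / (n * (n + 1))

# Hypothetical, completely separated samples: the average ranks are 2, 5 and 8.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

For the made-up data, V = 3(2-5)^2 + 0 + 3(8-5)^2 = 54, so H = 12*54/90 = 7.2.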

Example
To compare 3 different assembly lines, for each line the output of 10 randomly selected hours of production was examined for defects. The data, together with the ranks, are given in the following table:
[Table columns: Defects Line 1, Rank, Defects Line 2, Rank, Defects Line 3, Rank; the data rows are not preserved.]
R1= , R2= , R3=134.5
A1= , A2= , A3= , A=15.5
H=6.097, df=3-1=2
The 5% table value is 5.99 and the 1% table value is 9.21.
Conclusion: since H=6.097 exceeds 5.99 but not the 1% value, H0 is rejected at the 5% level but not at the 1% level.
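With 2 degrees of freedom the chi-square upper-tail probability has the closed form Pr(chi-square > h) = exp(-h/2), so the approximate p-value for H=6.097 can be computed without a table. This is a small sketch for the df=2 case only; the function name is hypothetical.

```python
import math

def chi2_df2_upper_tail(h):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-h / 2.0)

h = 6.097
p_value = chi2_df2_upper_tail(h)
print(round(p_value, 4))

# Compare with the conventional significance levels.
print(p_value < 0.05, p_value < 0.01)
```

The approximate p-value is about 0.047, which is below 0.05 but above 0.01.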
