Lecture 16 Nonparametric Two-Sample Tests

Presentation transcript:

Lecture 16 Nonparametric Two-Sample Tests
Outline of Today
Wilcoxon Rank-Sum Test
Mann-Whitney Test
Kruskal-Wallis H Test
9/21/2018 SA3202, Lecture 16

Wilcoxon Rank-Sum Test for Two Independent Samples
The problem
A common statistical problem is to compare two populations A and B based on independent samples. The usual parametric test is the two-sample t-test. There are two equivalent nonparametric tests for comparing two populations: Wilcoxon’s rank-sum test and the Mann-Whitney U test.
Let X1, X2, …, Xn1 ~ Population A and Y1, Y2, …, Yn2 ~ Population B. We wish to test
H0: X ~ Y against H1: X ~ Y + theta
where theta is a location shift (theta = 0 under H0).
Suppose we rank the combined (pooled) sample from 1 to n1+n2, and let R1, R2, …, Rn1 denote the ranks of the observations from Population A. The test statistic is the rank sum
W = R1 + R2 + … + Rn1

For H1: theta>0, reject H0 when W is too large
For H1: theta<0, reject H0 when W is too small
For H1: theta≠0, reject H0 when W is too large or too small
For large sample sizes (n1, n2 >= 10), under H0, we can use the Normal Approximation for W with
E(W) = n1(n1+n2+1)/2
Var(W) = n1n2(n1+n2+1)/12
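The rank-sum statistic and its normal approximation can be sketched in Python as follows. This is a minimal illustration and not part of the original slides: the helper names `midranks` and `rank_sum_z` and the sample values are made up for the example, tied values receive average ranks, and no tie or continuity correction is applied.

```python
from math import sqrt

def midranks(values):
    """Return 1-based ranks; tied values get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(x, y):
    """Return (W, z): rank sum of x in the pooled sample and its z-score under H0."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12
    return w, (w - mean_w) / sqrt(var_w)

x = [1.1, 2.3, 3.5]       # hypothetical sample from Population A
y = [0.9, 2.0, 4.2, 5.0]  # hypothetical sample from Population B
w, z = rank_sum_z(x, y)
print(w, round(z, 3))
```

Here the ranks of the A observations are 2, 4 and 5, so W = 11. The samples are far too small for the normal approximation to be trusted (the slide asks for n1, n2 >= 10); they only make the arithmetic easy to follow.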

Mann-Whitney Test
Mann and Whitney proposed an alternative test for the two-sample problem. Their test is equivalent to the Wilcoxon rank-sum test, but it is more convenient to use because tables of critical values are readily available.
The Test
The Mann-Whitney statistic is obtained by ordering all the observations and counting the number of times that an observation in the first sample precedes (is smaller than) an observation in the second sample:
Ua = #{(i,j) | Xi<Yj} = U1+U2+…+Un1, where Ui = #{j | Xi<Yj}
Ui is the number of observations in Sample B that the i-th member of Sample A precedes.

Example
If for two samples of sizes n1=5, n2=3, the observations are
Value   25  26  27  28  29  31  32  35
Sample  A   A   A   B   B   A   B   A
Then U1=3, U2=3, U3=3, U4=1, U5=0, so
Ua = 3+3+3+1+0 = 10
Ub can be similarly defined. Note that, when there are no ties, Ua+Ub=n1n2 since the total number of comparisons is n1n2. Clearly, Ua tends to be “too large” (“too small”) when the distribution of X is to the left (right) of the distribution of Y.
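The counting definition can be checked directly on this example. The sketch below is an illustration only; the function name `mann_whitney_counts` is made up here, and the code assumes there are no ties between the two samples.

```python
def mann_whitney_counts(x, y):
    """Count pairs (xi, yj) with xi < yj (Ua) and with yj < xi (Ub)."""
    ua = sum(1 for xi in x for yj in y if xi < yj)
    ub = sum(1 for xi in x for yj in y if yj < xi)
    return ua, ub

sample_a = [25, 26, 27, 31, 35]  # observations labelled A in the example
sample_b = [28, 29, 32]          # observations labelled B in the example
ua, ub = mann_whitney_counts(sample_a, sample_b)
print(ua, ub, ua + ub == len(sample_a) * len(sample_b))
```

The double loop makes all n1n2 = 15 comparisons explicitly, which is fine for small samples and mirrors the definition of Ua.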

Equivalence between Ua and W
Note that Ua will be large when the rank-sum statistic W is small, and vice versa. In fact it can be shown that
Ua = n1n2 + n1(n1+1)/2 - W
which provides a more convenient method of computing Ua, and shows that a test based on Ua is equivalent to a test based on W. It also enables us to derive the mean and variance of Ua under H0:
E(Ua) = n1n2/2, Var(Ua) = n1n2(n1+n2+1)/12
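As a quick numerical check of this identity, the following sketch computes W for the small n1=5, n2=3 example and converts it to Ua, then compares the result with the direct pair count. The function names are hypothetical and the code assumes distinct observations (no ties).

```python
def rank_sum(x, y):
    """W: sum of the pooled-sample ranks of the observations in x (no ties)."""
    pooled = sorted(x + y)
    return sum(pooled.index(v) + 1 for v in x)

def ua_from_w(w, n1, n2):
    """Convert the rank sum W of Sample A into the Mann-Whitney count Ua."""
    return n1 * n2 + n1 * (n1 + 1) // 2 - w

sample_a = [25, 26, 27, 31, 35]
sample_b = [28, 29, 32]
w = rank_sum(sample_a, sample_b)
ua = ua_from_w(w, len(sample_a), len(sample_b))
direct = sum(1 for xi in sample_a for yj in sample_b if xi < yj)
print(w, ua, direct)
```

The A observations hold ranks 1, 2, 3, 6 and 8, so W = 20 and Ua = 15 + 15 - 20 = 10, matching the count obtained pair by pair.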

Remarks
Under H0, Ua and Ub have the same, symmetric distribution. Thus, under H0,
Pr(Ua<=U0) = Pr(Ua>=n1n2-U0)
This can be used to find the upper quantiles when the lower quantiles are given.
The test can be conducted using just the lower quantiles via the following procedure:
For H1: theta>0, reject H0 when Ua is too small
For H1: theta<0, reject H0 when Ub is too small
For H1: theta≠0, reject H0 when Ua or Ub is too small
(When theta>0, X tends to exceed Y, so few pairs satisfy Xi<Yj and Ua is small; this matches rejecting for large W.)
For large sample sizes (n1, n2 >= 10), under H0, we can use the Normal Approximation for Ua or Ub.
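For small samples the null distribution of Ua can be enumerated exactly, which makes the symmetry property easy to verify. The sketch below assumes no ties, so that under H0 every choice of n1 ranks for Sample A is equally likely; the function names are made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def exact_ua_distribution(n1, n2):
    """Exact null distribution of Ua as {value: probability}, assuming no ties."""
    counts = Counter()
    for ranks_a in combinations(range(1, n1 + n2 + 1), n1):
        w = sum(ranks_a)                           # rank sum of Sample A
        ua = n1 * n2 + n1 * (n1 + 1) // 2 - w      # convert W to Ua
        counts[ua] += 1
    total = sum(counts.values())
    return {u: Fraction(c, total) for u, c in sorted(counts.items())}

def lower_tail(dist, u0):
    """Pr(Ua <= u0)."""
    return sum(p for u, p in dist.items() if u <= u0)

def upper_tail(dist, u0):
    """Pr(Ua >= u0)."""
    return sum(p for u, p in dist.items() if u >= u0)

n1, n2 = 5, 3
dist = exact_ua_distribution(n1, n2)
for u0 in range(n1 * n2 + 1):
    # Symmetry: the lower tail at u0 equals the upper tail at n1*n2 - u0.
    assert lower_tail(dist, u0) == upper_tail(dist, n1 * n2 - u0)
print(lower_tail(dist, 0), sum(u * p for u, p in dist.items()))
```

Exact fractions are used so that the tail probabilities can be compared with `==` rather than with a floating-point tolerance. The enumeration grows quickly with the sample sizes, so this approach is only practical for small n1 and n2.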

Example
An experiment was conducted to compare the strengths of two types of kraft paper. The following table gives the strength measurements for 10 randomly selected pieces of each type of paper, together with their ranks in the pooled sample:
Standard (A)  1.21  1.43  1.35  1.51  1.39  1.17  1.48  1.42  1.29  1.40
Rank          2     12    6     17    9     1     14    11    3.5   10
Treated (B)   1.49  1.37  1.67  1.50  1.31  1.29  1.52  1.37  1.44  1.53
Rank          15    7.5   20    16    5     3.5   18    7.5   13    19
Wa=85.5, Wb=124.5
H0: there is no difference in the distribution of strengths for A and B
H1: B tends to be of greater strength
Ub = n1n2 + n2(n2+1)/2 - Wb = 10*10 + 10*11/2 - 124.5 = 30.5
When n1=n2=10, the table value is Pr(Ub<=28)=.0552. Since the observed Ub=30.5 exceeds 28, the one-sided p-value is larger than .0552, so Ub is not small enough to reject H0 at the 5% level. H0 is not rejected.
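The figures in this example can be reproduced with a short script. This is a sketch rather than the lecturer's own computation: the helper `midranks` is a made-up name, ties get average ranks, and the final p-value uses the large-sample normal approximation without a tie or continuity correction, so it will not match the exact table value.

```python
from math import erf, sqrt

def midranks(values):
    """Return 1-based ranks; tied values get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

standard = [1.21, 1.43, 1.35, 1.51, 1.39, 1.17, 1.48, 1.42, 1.29, 1.40]
treated = [1.49, 1.37, 1.67, 1.50, 1.31, 1.29, 1.52, 1.37, 1.44, 1.53]
n1, n2 = len(standard), len(treated)

ranks = midranks(standard + treated)
wa = sum(ranks[:n1])  # rank sum of Standard (A)
wb = sum(ranks[n1:])  # rank sum of Treated (B)
ua = n1 * n2 + n1 * (n1 + 1) / 2 - wa
ub = n1 * n2 + n2 * (n2 + 1) / 2 - wb

# Normal approximation for the lower tail Pr(Ub <= observed value) under H0.
mean_u = n1 * n2 / 2
var_u = n1 * n2 * (n1 + n2 + 1) / 12
z = (ub - mean_u) / sqrt(var_u)
p_lower = 0.5 * (1 + erf(z / sqrt(2)))

print(wa, wb, ua, ub)
print(round(z, 3), round(p_lower, 3))
```

The script recovers Wa=85.5, Wb=124.5 and Ub=30.5. The approximate one-sided p-value comes out at roughly 0.07, which agrees with the table-based conclusion that H0 is not rejected at the 5% level.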

The Kruskal-Wallis H Test
Suppose that independent samples are taken from k distributions, and consider testing
H0: The k distributions are the same
H1: At least two of them differ in location
Parametric Procedure: F test (one-way ANOVA), assuming normality and common variance
Nonparametric Procedure: the Kruskal-Wallis H test (Nonparametric ANOVA)
Let ni be the size of the sample from the i-th population and let n be their total:
n = n1+n2+…+nk
We rank the n observations from 1 to n (ties treated as usual, by assigning average ranks). Let Ri denote the sum of the ranks of the observations from the i-th population, and let Ai denote the associated average rank
Ai = Ri/ni, i=1,2,…,k.
The average of all the ranks is
A = (R1+R2+…+Rk)/n = (1+2+…+n)/n = (n+1)/2

The Test Statistic
The test statistic is
V = n1(A1-A)^2 + … + nk(Ak-A)^2
Under H0, all Ai are about the same and close to A, so that V will be small. Otherwise, V will be large. So a test may be based on V, rejecting H0 when V is too large.
The Kruskal-Wallis H Test statistic is
H = 12V/(n(n+1))
The exact distribution of H under H0 can be found by elementary methods or via simulation. For large sample sizes, H is approximately chi-square distributed with k-1 degrees of freedom, provided all ni >= 5.
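The statistic is simple to compute once the rank sums Ri are known. The sketch below, with hypothetical function names and a toy set of rank sums, evaluates H from the definition via V and also from the algebraically equivalent shortcut H = 12/(n(n+1)) * (R1^2/n1 + … + Rk^2/nk) - 3(n+1), which is a standard rearrangement and does not appear on the slide.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H computed from the definition: H = 12 V / (n (n + 1))."""
    n = sum(sizes)
    overall = (n + 1) / 2  # A, the average of all ranks
    v = sum(ni * (ri / ni - overall) ** 2 for ri, ni in zip(rank_sums, sizes))
    return 12 * v / (n * (n + 1))

def kruskal_wallis_h_shortcut(rank_sums, sizes):
    """Equivalent form: 12 / (n (n + 1)) * sum(Ri^2 / ni) - 3 (n + 1)."""
    n = sum(sizes)
    s = sum(ri ** 2 / ni for ri, ni in zip(rank_sums, sizes))
    return 12 * s / (n * (n + 1)) - 3 * (n + 1)

# Toy case (made up): ranks 1,2 | 3,4 | 5,6 fall into three groups of size 2.
rank_sums = [3, 7, 11]
sizes = [2, 2, 2]
h = kruskal_wallis_h(rank_sums, sizes)
h_short = kruskal_wallis_h_shortcut(rank_sums, sizes)
print(round(h, 4), round(h_short, 4))
```

In the toy case A = 3.5 and the group averages are 1.5, 3.5 and 5.5, so V = 2*(4+0+4) = 16 and H = 12*16/42, about 4.571. The groups are too small for the chi-square approximation; the example only illustrates the arithmetic.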

Example
To compare 3 different assembly lines, for each line the output of 10 randomly selected hours of production was examined for defects. The data, together with the ranks, are given in the following table:
Line 1  Defects  6    38   3    17   11   30   15   16   25   5
        Rank     5    27   2    13   8    21   11   12   17   4
Line 2  Defects  34   28   42   13   40   31   9    32   39   27
        Rank     25   19   30   9.5  29   22   7    23   28   18
Line 3  Defects  13   35   19   4    29   0    7    33   18   24
        Rank     9.5  26   15   3    20   1    6    24   14   16
R1=120, R2=210.5, R3=134.5
A1=12, A2=21.05, A3=13.45, A=15.5
H=6.097, df=3-1=2
The 5% table value=5.99, the 1% table value=9.21.
Conclusion: Since 5.99 < H=6.097 < 9.21, H0 is rejected at the 5% level but not at the 1% level. There is evidence at the 5% level that the assembly lines differ in the distribution of defects.
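The rank sums and the value of H in this example can be recomputed from the raw defect counts. The script below is an illustrative sketch: `midranks` is a made-up helper, no tie correction is applied (matching the slide's formula), and the p-value line relies on the fact that the chi-square upper tail with 2 degrees of freedom is exp(-H/2), so it applies only when k=3.

```python
from math import exp

def midranks(values):
    """Return 1-based ranks; tied values get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

lines = {
    "Line 1": [6, 38, 3, 17, 11, 30, 15, 16, 25, 5],
    "Line 2": [34, 28, 42, 13, 40, 31, 9, 32, 39, 27],
    "Line 3": [13, 35, 19, 4, 29, 0, 7, 33, 18, 24],
}

pooled = [v for obs in lines.values() for v in obs]
ranks = midranks(pooled)
n = len(pooled)

# Split the pooled ranks back into per-line rank sums R1, R2, R3.
rank_sums = {}
start = 0
for name, obs in lines.items():
    rank_sums[name] = sum(ranks[start:start + len(obs)])
    start += len(obs)

overall = (n + 1) / 2
v = sum(len(lines[name]) * (r / len(lines[name]) - overall) ** 2
        for name, r in rank_sums.items())
h = 12 * v / (n * (n + 1))
p_value = exp(-h / 2)  # chi-square upper tail, valid for df = 2 only

print(rank_sums)
print(round(h, 3), round(p_value, 3), h > 5.99)
```

The script reproduces R1=120, R2=210.5, R3=134.5 and H of about 6.097, which exceeds the 5% table value of 5.99; the approximate p-value is a little under 0.05.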