Chapter 18 – Nonparametric Statistics
Introduction to Business Statistics, 6e
Kvanli, Pavur, Keeling
Slides prepared by Jeff Heyl, Lincoln University
©2003 South-Western/Thomson Learning™
Nonparametric Statistics Many of the nonparametric statistical tests answer the same sorts of questions as the parametric tests. With nonparametric tests the assumptions can be relaxed considerably. Consequently, nonparametric methods are used for situations that violate the assumptions of parametric procedures.
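The Runs Test (Small Samples)
A run is an unbroken string of identical symbols. R = number of runs.
Sequence 1: HHHHH TTTTT (Run 1 through Run 2, R = 2)
Sequence 2: H T H T H T H T H T (Run 1 through Run 10, R = 10)
Sequence 3: T H H H T H T H T T (Run 1 through Run 7, R = 7)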
Test for Randomness; The Runs Test
Table 18.1

Arrangement   Arrangement   Number
Number                      of Runs
  1           HHHHHTTTTT    2   (sequence 1)
  2           HHHHTHTTTT    4
  3           HHHHTTHTTT    4
  4           HHHHTTTHTT    4
  5           HHHHTTTTHT    4
  ...
130           THHHHTTTHT    5
131           THHHHTTTTH    4
132           THHHTHHTTT    5
133           THHHTHTHTT    7   (sequence 3)
134           THHHTHTTHT    7
  ...
248           TTTTHHHHTH    4
249           TTTTHHHTHH    4
250           TTTTHHTHHH    4
251           TTTTHTHHHH    4
252           TTTTTHHHHH    2
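The run counts in Table 18.1 can be checked mechanically. The following is a minimal sketch; the function name count_runs is a hypothetical helper chosen for illustration and does not come from the textbook.

```python
def count_runs(sequence):
    """Return R, the number of runs (unbroken blocks of identical symbols)."""
    # Ignore any spaces used to separate the symbols visually.
    symbols = [s for s in sequence if not s.isspace()]
    if not symbols:
        return 0
    # The first symbol starts run 1; every change of symbol starts a new run.
    changes = sum(1 for prev, cur in zip(symbols, symbols[1:]) if prev != cur)
    return 1 + changes


for label, seq in [("sequence 1", "HHHHH TTTTT"),
                   ("sequence 3", "T H H H T H T H T T")]:
    print(label, "R =", count_runs(seq))
```

Any two-symbol sequence can be passed in, for example the signs of regression residuals coded as "+" and "-".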
Test for Randomness; The Runs Test
Table 18.2

Number of   Number of Times   Relative    Cumulative
Runs (R)    R Occurred        Frequency   Relative Frequency
 2            2               .008         .008
 3            8               .032         .040
 4           32               .127         .167
 5           48               .190         .357
 6           72               .286         .643
 7           48               .190         .833
 8           32               .127         .960
 9            8               .032         .992
10            2               .008        1.000
Total       252              1.000

Number of arrangements: $A = \dfrac{n!}{n_1!\,n_2!}$
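The frequency counts in Table 18.2 can be reproduced by brute force for the case of 5 heads and 5 tails. This is a sketch under that assumption; runs_distribution and count_runs are hypothetical helper names, and enumeration is only practical for small samples.

```python
from collections import Counter
from itertools import combinations
from math import comb


def count_runs(symbols):
    """Number of runs in a list of symbols."""
    return 1 + sum(1 for a, b in zip(symbols, symbols[1:]) if a != b)


def runs_distribution(n1, n2):
    """Count how often each value of R occurs over all n!/(n1! n2!) arrangements."""
    n = n1 + n2
    counts = Counter()
    # Each arrangement is fixed by choosing which n1 positions hold an H.
    for head_positions in combinations(range(n), n1):
        heads = set(head_positions)
        seq = ["H" if i in heads else "T" for i in range(n)]
        counts[count_runs(seq)] += 1
    return counts


counts = runs_distribution(5, 5)
total = sum(counts.values())
assert total == comb(10, 5)  # A = n! / (n1! n2!)

cumulative = 0.0
for r in sorted(counts):
    rel = counts[r] / total
    cumulative += rel
    print(f"{r:2d}  {counts[r]:3d}  {rel:.3f}  {cumulative:.3f}")
```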
The Runs Test (Small Samples)
Ho: The sequence was generated in a random manner
Ha: The sequence was not generated in a random manner
Reject Ho if R ≤ k1 or R ≥ k2, where, for α = .05,
k1 = the largest value such that P(R ≤ k1) ≤ α/2 = .025
k2 = the smallest value such that P(R ≥ k2) ≤ α/2 = .025
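As an illustration of the rejection rule, the critical values k1 and k2 can be read off the counts in Table 18.2 (5 heads, 5 tails). This is a minimal sketch; critical_values is a hypothetical helper, and it returns None for a tail in which no value of R satisfies the condition.

```python
# Counts of R taken from Table 18.2 (n1 = n2 = 5, 252 arrangements).
counts = {2: 2, 3: 8, 4: 32, 5: 48, 6: 72, 7: 48, 8: 32, 9: 8, 10: 2}


def critical_values(counts, alpha=0.05):
    """Return (k1, k2) for the two-sided small-sample runs test."""
    total = sum(counts.values())
    values = sorted(counts)

    # k1: largest r with P(R <= r) <= alpha / 2.
    k1, lower = None, 0
    for r in values:
        lower += counts[r]
        if lower / total > alpha / 2:
            break
        k1 = r

    # k2: smallest r with P(R >= r) <= alpha / 2.
    k2, upper = None, 0
    for r in reversed(values):
        upper += counts[r]
        if upper / total > alpha / 2:
            break
        k2 = r

    return k1, k2


k1, k2 = critical_values(counts, alpha=0.05)
print("Reject Ho if R <=", k1, "or R >=", k2)
```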
Runs Test (Large Samples)
$$\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$
$$\sigma_R = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$$
$$Z = \frac{R - \mu_R}{\sigma_R}$$
Ho: The sequence was generated in a random manner
Ha: The sequence was not generated in a random manner
Reject Ho if $|Z| > Z_{\alpha/2}$
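The large-sample statistic is straightforward to compute. The sketch below assumes the usual mean of R, 2n1n2/(n1 + n2) + 1, and uses the standard library's NormalDist for the critical value; the function names and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def runs_test_z(r, n1, n2):
    """Z statistic for the large-sample runs test."""
    n = n1 + n2
    mu_r = 2 * n1 * n2 / n + 1
    sigma_r = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / (n ** 2 * (n - 1)))
    return (r - mu_r) / sigma_r


def reject_two_sided(z, alpha=0.05):
    """True if |Z| exceeds the upper alpha/2 point of the standard normal."""
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical data: 12 runs observed among 20 symbols of one kind and 25 of the other.
z = runs_test_z(r=12, n1=20, n2=25)
print(f"Z = {z:.3f}, reject Ho: {reject_two_sided(z)}")
```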
Excel Runs Test Figure 18.1
Regression and Runs Test Figure 18.2
Residual Plot
[Figure 18.3: residuals (vertical axis, -100 to 100) plotted against time (horizontal axis, 10 to 40)]
Nonparametric Test Central Tendency: Two Populations
Ho: µ2 ≤ µ1
Ha: µ2 > µ1
[Figure 18.4: height distributions for females (mean µ1) and males (mean µ2); horizontal axis: Height]
Dependent (Paired) Samples
[Figure 18.5: paired data; each couple (Couple 1, Couple 2, Couple 3, etc.) contributes one observation to Sample 1 (women) and one to Sample 2 (men)]
Mann-Whitney U Test for Independent Samples
Ho: The two populations have identical probability distributions
Ha: The two populations differ in location
Pool the two samples and rank the pooled observations.
T1 = sum of the ranks of the observations from the first sample in this pooled sample
T2 = sum of the ranks of the observations from the second sample
Mann-Whitney U Small Samples
$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - T_1$$
$$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - T_2$$
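A short sketch of the rank sums and the two U statistics follows. The helper names are hypothetical, and giving tied observations the average of their ranks is an assumption (a common convention) rather than something stated on the slide.

```python
def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share the average rank (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    # Average of the first and last 1-based positions of v in the sorted list.
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def mann_whitney_u(sample1, sample2):
    """Return (U1, U2) computed from the rank sums T1 and T2 of the pooled sample."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    t1 = sum(ranks[:n1])   # rank sum for the first sample
    t2 = sum(ranks[n1:])   # rank sum for the second sample
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - t1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - t2
    return u1, u2


# Hypothetical data for illustration only.
u1, u2 = mann_whitney_u([12, 15, 11, 18], [14, 20, 22, 19, 25])
print("U1 =", u1, "U2 =", u2, "min =", min(u1, u2))
```

A useful check on the arithmetic is that U1 + U2 always equals n1n2.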
Mann-Whitney U Small Samples
Procedure:
1. Assume that n1 ≤ n2 (reverse the samples if necessary)
2. Determine U1 and U2
3. Use the value from Table A.10 to test Ho versus Ha
Two-Sided Test
Ha: the two populations differ in location
Reject Ho if Table A.10 value for U < α/2, where U = the minimum of U1 and U2
Mann-Whitney U Small Samples
Procedure:
1. Assume that n1 ≤ n2 (reverse the samples if necessary)
2. Determine U1 and U2
3. Use the value from Table A.10 to test Ho versus Ha
One-Sided Test
Ha: population 1 is shifted to the right of population 2
Reject Ho if Table A.10 value for U < α, where U = U1
One-Sided Test
Ha: population 1 is shifted to the left of population 2
Reject Ho if Table A.10 value for U < α, where U = U2
Mann-Whitney U Large Samples
$$\mu_U = \frac{n_1 n_2}{2}$$
$$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$
$$Z = \frac{U_2 - \mu_U}{\sigma_U}$$
Mann-Whitney U Large Samples
Ho: The two populations have identical probability distributions
Determine $U_2 = n_1 n_2 + \dfrac{n_2 (n_2 + 1)}{2} - T_2$
Two-Sided Test
Ha: The two populations differ in location
Reject Ho if $|Z| > Z_{\alpha/2}$
Mann-Whitney U Large Samples
Ho: The two populations have identical probability distributions
Determine $U_2 = n_1 n_2 + \dfrac{n_2 (n_2 + 1)}{2} - T_2$
One-Sided Test
Ha: Population 1 is shifted to the right of population 2
Reject Ho if $Z > Z_{\alpha}$
One-Sided Test
Ha: Population 1 is shifted to the left of population 2
Reject Ho if $Z < -Z_{\alpha}$
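The large-sample version can be sketched as follows, taking µU = n1n2/2 and the Z statistic based on U2. The function names, the "alternative" labels, and the sample numbers are hypothetical; the critical values come from the standard library's NormalDist.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_z(u2, n1, n2):
    """Normal approximation Z = (U2 - mu_U) / sigma_U."""
    mu_u = n1 * n2 / 2
    sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u2 - mu_u) / sigma_u


def reject_h0(z, alpha=0.05, alternative="two-sided"):
    """Apply the two-sided or one-sided rejection rule to Z."""
    normal = NormalDist()
    if alternative == "two-sided":   # populations differ in location
        return abs(z) > normal.inv_cdf(1 - alpha / 2)
    if alternative == "right":       # population 1 shifted right of population 2
        return z > normal.inv_cdf(1 - alpha)
    if alternative == "left":        # population 1 shifted left of population 2
        return z < -normal.inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'right', or 'left'")


# Hypothetical values: U2 = 80 with n1 = n2 = 10.
z = mann_whitney_z(u2=80, n1=10, n2=10)
print(f"Z = {z:.3f}, reject (two-sided): {reject_h0(z)}")
```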
Mann-Whitney Test Figure 18.6
Wilcoxon Signed Rank Test for Paired Samples
When small paired samples are drawn from populations suspected to be nonnormal, a nonparametric technique is required. The Wilcoxon signed rank test is used in such situations.
Wilcoxon Signed Rank Test for Paired Samples
1. Determine the difference for each sample pair
2. Arrange the absolute values of these differences in order, assigning a rank to each
3. Let T+ = sum of the ranks of the positive differences and T- = sum of the ranks of the negative differences
4. T+, T-, or T = the minimum of T+ and T- is used to define a test of Ho versus Ha
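The steps above can be sketched in a few lines. The helper names are hypothetical, and two conventions are assumed rather than taken from the slide: pairs with a zero difference are dropped, and tied absolute differences share the average rank.

```python
def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share the average rank (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def signed_rank_sums(sample1, sample2):
    """Return (T+, T-) for paired samples."""
    # Step 1: differences; zero differences are dropped (assumption).
    diffs = [a - b for a, b in zip(sample1, sample2) if a != b]
    # Step 2: rank the absolute differences.
    ranks = average_ranks([abs(d) for d in diffs])
    # Step 3: sum the ranks separately for positive and negative differences.
    t_plus = float(sum(r for r, d in zip(ranks, diffs) if d > 0))
    t_minus = float(sum(r for r, d in zip(ranks, diffs) if d < 0))
    return t_plus, t_minus


# Hypothetical paired data for illustration only.
t_plus, t_minus = signed_rank_sums([10, 12, 9, 15], [8, 13, 5, 9])
print("T+ =", t_plus, "T- =", t_minus, "T =", min(t_plus, t_minus))
```

With n nonzero differences, T+ and T- always add up to n(n + 1)/2, which gives a quick arithmetic check.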
Wilcoxon Signed Rank Test for Small Paired Samples
Ho: The population differences are centered at 0
Two-Sided Test
Ha: the population differences are not centered at 0
Reject Ho if T ≤ table value, where T = the minimum of T+ and T-
Wilcoxon Signed Rank Test for Small Paired Samples
Ho: The population differences are centered at 0
One-Sided Test
Ha: the population differences are centered at a value > 0
Reject Ho if T- ≤ table value
One-Sided Test
Ha: the population differences are centered at a value < 0
Reject Ho if T+ ≤ table value
Wilcoxon Signed Rank Test for Large Paired Samples
$$\mu_{T_+} = \frac{n(n + 1)}{4}$$
$$\sigma_{T_+} = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}$$
$$Z = \frac{T_+ - \mu_{T_+}}{\sigma_{T_+}}$$
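A minimal sketch of the normal approximation follows. The function name wilcoxon_z and the sample numbers are hypothetical, and n is taken to be the number of pairs used to compute T+.

```python
from math import sqrt


def wilcoxon_z(t_plus, n):
    """Z = (T+ - mu) / sigma for the large-sample Wilcoxon signed rank test."""
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t_plus - mu) / sigma


# Hypothetical values: T+ = 50 from n = 20 pairs.
print(f"Z = {wilcoxon_z(50, 20):.3f}")
```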
Wilcoxon Signed Rank Test for Large Paired Samples
Ho: The population differences are centered at 0
Two-Sided Test
Ha: the population differences are not centered at 0
Reject Ho if $|Z| > Z_{\alpha/2}$
Wilcoxon Signed Rank Test for Large Paired Samples
Ho: The population differences are centered at 0
One-Sided Test
Ha: the population differences are centered at a value > 0
Reject Ho if $Z > Z_{\alpha}$
One-Sided Test
Ha: the population differences are centered at a value < 0
Reject Ho if $Z < -Z_{\alpha}$
Hardwood Concentration
[Figure 18.7: standard normal curve with Z = -1.41; p-value = area to the left of -1.41 = .0793]
Wilcoxon Test Solution Figure 18.8
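Kruskal-Wallis Test
The nonparametric counterpart to the one-way ANOVA test
Ho: the k populations have identical probability distributions
Ha: at least two of the populations differ in location
$$KW = \frac{12}{n(n + 1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n + 1)$$
where T_i is the sum of the pooled-sample ranks in sample i, n_i is the size of sample i, and n is the total number of observations
Reject Ho if KW is "large" (KW is compared with a chi-square distribution with k - 1 degrees of freedom)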
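The KW statistic can be computed directly from the pooled ranks. The sketch below uses hypothetical helper names and made-up data; tied observations are given average ranks (an assumption), and no tie correction is applied to KW.

```python
def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share the average rank (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def kruskal_wallis(groups):
    """KW = 12 / (n(n+1)) * sum(T_i^2 / n_i) - 3(n+1), without tie correction."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted_sum = 0.0
    start = 0
    for group in groups:
        n_i = len(group)
        t_i = sum(ranks[start:start + n_i])   # rank sum T_i for this sample
        weighted_sum += t_i ** 2 / n_i
        start += n_i

    return 12 / (n * (n + 1)) * weighted_sum - 3 * (n + 1)


# Hypothetical data: three samples of three observations each.
print(f"KW = {kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]]):.2f}")
```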
Kruskal-Wallis Statistic
[Figure 18.9: chi-square curve; the area to the right of 12.84 is .005, and the computed value 13.83 lies beyond it, so p-value < .005]
Kruskal-Wallis Test Figure 18.10
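The Friedman Test
The nonparametric counterpart to the randomized block ANOVA test
Ho: the k populations have identical probability distributions
Ha: at least two of the populations differ in location
$$FR = \frac{12}{bk(k + 1)} \sum_{i=1}^{k} T_i^2 - 3b(k + 1)$$
where b is the number of blocks, k is the number of treatments, and T_i is the sum of the within-block ranks for treatment i
Reject Ho if $FR > \chi^2_{\alpha,\,df}$, with df = k - 1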
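A short sketch of the FR computation follows, assuming the data are laid out with one row per block and one column per treatment. The helper names and the data are hypothetical, and ties within a block are given average ranks (an assumption).

```python
def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share the average rank (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def friedman(blocks):
    """FR = 12 / (b k (k+1)) * sum(T_i^2) - 3 b (k+1); rows are blocks, columns are treatments."""
    b = len(blocks)
    k = len(blocks[0])
    totals = [0.0] * k
    for row in blocks:
        # Rank the k treatments within this block, then add to each treatment's total.
        for j, rank in enumerate(average_ranks(row)):
            totals[j] += rank
    return 12 / (b * k * (k + 1)) * sum(t ** 2 for t in totals) - 3 * b * (k + 1)


# Hypothetical data: 4 blocks, 3 treatments, identical ordering in every block.
data = [[1, 2, 3], [4, 5, 6], [2, 3, 9], [1, 7, 8]]
print("FR =", friedman(data))
```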
The Friedman Test Figure 18.11
Spearman’s Rank Correlation Figure 18.12
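Spearman’s Rank Correlation
Spearman’s r_s is the nonparametric counterpart to the Pearson correlation
$$r_s = \frac{\sum R(x)R(y) - \left[\sum R(x)\right]\left[\sum R(y)\right]/n}{\sqrt{\sum R^2(x) - \left[\sum R(x)\right]^2/n}\;\sqrt{\sum R^2(y) - \left[\sum R(y)\right]^2/n}}$$
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where d = R(x) - R(y) for each pair; the two forms agree when there are no tied ranks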
Measure of Association
Interval data (from equation 19-15, r < 1):
X      3   5   7   9   11
Y      4   7   8   10  16
Ranks (from equation 19-17, rs = 1):
R(X)   1   2   3   4   5
R(Y)   1   2   3   4   5
Perfect agreement, so rs = 1
[Figure 18.13: scatter plots of Y against X for the interval data and for the ranks]
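The contrast between r and rs for these five data points can be verified numerically. This is a minimal sketch: the function names are hypothetical, and the rank helper assumes there are no tied values, which holds for this data set.

```python
from math import sqrt


def simple_ranks(values):
    """Ranks 1..n in ascending order; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)), valid when there are no ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(simple_ranks(x), simple_ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def pearson_r(x, y):
    """Ordinary Pearson correlation on the raw (interval) data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


x = [3, 5, 7, 9, 11]
y = [4, 7, 8, 10, 16]
print(f"r  = {pearson_r(x, y):.4f}")   # less than 1: the relationship is not exactly linear
print(f"rs = {spearman_rs(x, y):.4f}")  # equal to 1: the ranks agree perfectly
```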
Spearman’s Rank Correlation Figure 18.14
Spearman’s Rank Correlation
Two-Sided Test
Ho: no association exists between X and Y
Ha: association does exist between X and Y
Reject Ho if |rs| > (table value)
Spearman’s Rank Correlation
One-Sided Test
Ho: no association exists between X and Y
Ha: a positive relationship exists between X and Y
Reject Ho if rs > (table value)
Ha: a negative relationship exists between X and Y
Reject Ho if rs < -(table value)