Distribution-Free Procedures


1 Distribution-Free Procedures
15 Distribution-Free Procedures Copyright © Cengage Learning. All rights reserved.

2 Distribution-Free Confidence Intervals (Section 15.3)

3 Distribution-Free Confidence Intervals
The method we have used so far to construct a confidence interval (CI) can be described as follows: Start with a random variable (Z, T, χ², F, or the like) that depends on the parameter of interest and a probability statement involving the variable, manipulate the inequalities of the statement to isolate the parameter between random endpoints, and, finally, substitute computed values for random variables.

4 Distribution-Free Confidence Intervals
Another general method for obtaining CIs takes advantage of a relationship between test procedures and CIs discussed in Section 8.5. A 100(1 – α)% CI for a parameter θ can be obtained from a level α test for H0: θ = θ0 versus Ha: θ ≠ θ0. This method will be used to derive intervals associated with the Wilcoxon signed-rank test and the Wilcoxon rank-sum test.

5 Distribution-Free Confidence Intervals
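Proposition: Suppose we have a level α test procedure for testing H0: θ = θ0 versus Ha: θ ≠ θ0. For fixed sample values, let A denote the set of all values θ0 for which H0 is not rejected. Then A is a 100(1 – α)% CI for θ.
This makes intuitive sense because the CI consists of all values of the parameter that are plausible at the selected confidence level, and we do not want to reject H0 in favor of Ha if θ0 is a plausible value.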
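The coverage claim can be checked in one line. The sketch below assumes the test has exact level α for every null value; θ denotes the true parameter value and A(X) the (random) set of null values θ0 that the test does not reject. These symbols are introduced here for illustration only.

```latex
\begin{aligned}
P_\theta\bigl(\theta \in A(X)\bigr)
  &= P_\theta\bigl(H_0\colon \theta = \theta_0 \text{ is not rejected when } \theta_0 \text{ equals the true } \theta\bigr) \\
  &= 1 - P_\theta(\text{type I error}) \\
  &= 1 - \alpha .
\end{aligned}
```

If the test only guarantees a type I error probability of at most α, as happens with discrete statistics such as S+, the last equality becomes "≥ 1 – α".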

6 Distribution-Free Confidence Intervals
There are actually pathological examples in which the set A defined in the proposition is not an interval of θ values, but instead the complement of an interval or something even stranger. To be more precise, we should really replace the notion of a CI with that of a confidence set. In the cases of interest here, the set A does turn out to be an interval.

7 The Wilcoxon Signed-Rank Interval

8 The Wilcoxon Signed-Rank Interval
To test H0: μ = μ0 versus Ha: μ ≠ μ0 using the Wilcoxon signed-rank test, where μ is the mean of a continuous symmetric distribution, the absolute values |x1 – μ0|, . . . , |xn – μ0| are ordered from smallest to largest, with the smallest receiving rank 1 and the largest rank n. Each rank is then given the sign of its associated xi – μ0, and the test statistic is the sum of the positively signed ranks.

9 The Wilcoxon Signed-Rank Interval
The two-tailed test rejects H0 if s+ is either ≥ c or ≤ n(n + 1)/2 – c, where c is obtained from Appendix Table A.13 once the desired level of significance α is specified. For fixed x1, . . . , xn, the 100(1 – α)% signed-rank interval will consist of all μ0 for which H0: μ = μ0 is not rejected at level α. To identify this interval, it is convenient to express the test statistic S+ in another form.

10 The Wilcoxon Signed-Rank Interval
S+ = the number of pairwise averages (Xi + Xj)/2 with i ≤ j that are ≥ μ0.
That is, if we average each xj in the list with each xi to its left, including (xj + xj)/2 (which is just xj), and count the number of these averages that are ≥ μ0, s+ results. In moving from left to right in the list of sample values, we are simply averaging every pair of observations in the sample [again including (xj + xj)/2] exactly once, so the order in which the observations are listed before averaging is not important.

11 The Wilcoxon Signed-Rank Interval
The equivalence of the two methods for computing s+ is not difficult to verify. The number of pairwise averages is $\binom{n}{2} + n$ (the first term due to averaging of different observations and the second due to averaging each xi with itself), which equals n(n + 1)/2. It can be shown that P-value ≤ α if and only if either too many or too few of these pairwise averages are ≥ μ0, in which case H0 is rejected.
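The equivalence can also be checked numerically. The following is a minimal sketch, not a library routine: the function names are made up for illustration, the null value μ0 = 5.0 is an arbitrary choice, and the signed-rank version assumes there are no ties among the absolute deviations and no deviation equal to zero. The data are the seven observations of Example 15.6.

```python
from itertools import combinations_with_replacement


def s_plus_signed_ranks(xs, mu0):
    """Sum of the ranks of |x - mu0| that belong to positive deviations.

    Assumes no tied absolute deviations and no deviation equal to 0.
    """
    deviations = [x - mu0 for x in xs]
    ordered = sorted(deviations, key=abs)  # rank 1 = smallest |deviation|
    return sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)


def s_plus_pairwise(xs, mu0):
    """Number of pairwise averages (x_i + x_j)/2 with i <= j that are >= mu0."""
    return sum((a + b) / 2 >= mu0
               for a, b in combinations_with_replacement(xs, 2))


x = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]
mu0 = 5.0  # hypothetical null value, chosen only for illustration

# Both computations give 15 for this data and this mu0.
print(s_plus_signed_ranks(x, mu0), s_plus_pairwise(x, mu0))
```

`combinations_with_replacement(xs, 2)` yields each pair with i ≤ j exactly once, including each observation paired with itself, so it produces the n(n + 1)/2 averages described above.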

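12 Example 15.6 The following observations are values of cerebral metabolic rate for rhesus monkeys: x1 = 4.51, x2 = 4.59, x3 = 4.90, x4 = 4.93, x5 = 6.80, x6 = 5.08, x7 = 5.67. The 28 pairwise averages are, in increasing order,
4.51 4.55 4.59 4.705 4.72 4.745 4.76 4.795 4.835 4.90 4.915 4.93 4.99 5.005 5.08 5.09 5.13 5.285 5.30 5.375 5.655 5.67 5.695 5.85 5.865 5.94 6.235 6.80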

13 Example 15.6 cont’d The first few and the last few of these are pictured in Figure 15.2.
[Figure 15.2: Plot of the data for Example 15.6]

14 Example 15.6 cont’d Because S+ is a discrete rv, α = .05 cannot be obtained exactly. Appendix Table A.13 shows that the P-value for a two-tailed test is 2(.023) = .046 if either s+ = 26 or 2. Thus H0 will not be rejected at significance level .046 if 3 ≤ s+ ≤ 25. That is, if the number of pairwise averages ≥ μ0 is between 3 and 25, inclusive, H0 is not rejected. From Figure 15.2 the CI for μ with confidence level 95.4% (approximately 95%) is (4.59, 5.94).
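The interval in this example can be reproduced by inverting the test directly. The sketch below is only an illustration: `not_rejected` is a hypothetical helper name, and the acceptance bounds 3 and 25 are taken from the example rather than looked up programmatically.

```python
from itertools import combinations_with_replacement

x = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]

# All n(n + 1)/2 = 28 pairwise averages, including each x_i with itself.
averages = sorted((a + b) / 2 for a, b in combinations_with_replacement(x, 2))


def not_rejected(mu0, low=3, high=25):
    """True if the number of pairwise averages >= mu0 lies in [low, high]."""
    s_plus = sum(avg >= mu0 for avg in averages)
    return low <= s_plus <= high


# The endpoints are the 3rd smallest and 3rd largest pairwise averages.
lower, upper = averages[2], averages[-3]
print(round(lower, 2), round(upper, 2))

# Null values just inside the interval are retained, just outside are rejected.
print(not_rejected(4.60), not_rejected(5.93))
print(not_rejected(4.58), not_rejected(5.95))
```

Scanning μ0 from left to right, s+ drops by one each time μ0 passes a pairwise average, which is why the endpoints of the non-rejection set are themselves pairwise averages.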

15 The Wilcoxon Signed-Rank Interval
In general, once the pairwise averages are ordered from smallest to largest, the endpoints of the Wilcoxon interval are two of the “extreme” averages. To express this precisely, let the smallest pairwise average be denoted by $\bar{x}_{(1)}$, the next smallest by $\bar{x}_{(2)}$, . . . , and the largest by $\bar{x}_{(n(n+1)/2)}$.

16 The Wilcoxon Signed-Rank Interval
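Proposition: If the level α Wilcoxon signed-rank test for H0: μ = μ0 versus Ha: μ ≠ μ0 is to reject H0 if either s+ ≥ c or s+ ≤ n(n + 1)/2 – c, then a 100(1 – α)% CI for μ is $(\bar{x}_{(n(n+1)/2 - c + 1)}, \bar{x}_{(c)})$.
In words, the interval extends from the dth smallest pairwise average to the dth largest average, where d = n(n + 1)/2 – c + 1. Appendix Table A.15 gives the values of c that correspond to the usual confidence levels for n = 5, 6, . . . , 25.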

17 Example 15.7 Example 15.6 continued…
For n = 7, the P-value for a two-tailed test is 2(.055) = .11 if s+ = 24 or s+ = 4. Therefore the null hypothesis will be rejected at significance level .11 if s+ = 0, 1, 2, 3, 4, 24, 25, 26, 27, or 28. Thus an 89.0% interval (approximately 90%) is obtained by using c = 24. The interval is $(\bar{x}_{(28-24+1)}, \bar{x}_{(24)}) = (\bar{x}_{(5)}, \bar{x}_{(24)}) = (4.72, 5.85)$, which extends from the fifth smallest to the fifth largest pairwise average.
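The rule "dth smallest to dth largest pairwise average, with d = n(n + 1)/2 – c + 1" is easy to wrap in a small function. This is a sketch under the assumption that the critical constant c is supplied by the caller (for instance from the appendix table); `signed_rank_interval` is an illustrative name, not an existing library function.

```python
from itertools import combinations_with_replacement


def signed_rank_interval(xs, c):
    """Return (d-th smallest, d-th largest) pairwise average for a given c.

    c is the upper critical constant of the two-tailed signed-rank test;
    it must be supplied by the caller, e.g. from a table.
    """
    n = len(xs)
    total = n * (n + 1) // 2          # number of pairwise averages
    d = total - c + 1
    averages = sorted((a + b) / 2
                      for a, b in combinations_with_replacement(xs, 2))
    # The d-th largest average is the c-th smallest one.
    return averages[d - 1], averages[c - 1]


x = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]

low, high = signed_rank_interval(x, c=24)
print(round(low, 2), round(high, 2))   # 4.72 5.85, the 89% interval

low, high = signed_rank_interval(x, c=26)
print(round(low, 2), round(high, 2))   # 4.59 5.94, the 95.4% interval
```

Lowering c from 26 to 24 moves both endpoints two positions inward in the ordered list of averages, trading confidence for a shorter interval.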

18 The Wilcoxon Signed-Rank Interval
The derivation of the interval depended on having a single sample from a continuous symmetric distribution with mean (median) μ. When the data is paired, the interval constructed from the differences d1, d2, . . . , dn is a CI for the mean (median) difference μD. In this case, the symmetry of X and Y distributions need not be assumed; as long as the X and Y distributions have the same shape, the X – Y distribution will be symmetric, so only continuity is required.

19 The Wilcoxon Signed-Rank Interval
For n > 20, the large-sample approximation to the Wilcoxon test based on standardizing S+ gives an approximation to c in (15.10). The result [for a 100(1 – α)% interval] is
$c \approx \frac{n(n+1)}{4} + z_{\alpha/2}\sqrt{\frac{n(n+1)(2n+1)}{24}}$
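A small sketch of the large-sample constant follows. It assumes the usual normal approximation, in which S+ has mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24 under H0, so that c is roughly the mean plus z(α/2) standard deviations. The function name and the `conf_level` parameter are illustrative, and how to round the result to an integer is left to the reader.

```python
from math import sqrt
from statistics import NormalDist


def approx_c_signed_rank(n, conf_level=0.95):
    """Normal-approximation value of c for the signed-rank interval.

    Uses E(S+) = n(n+1)/4 and V(S+) = n(n+1)(2n+1)/24 under H0.
    Returns an unrounded float.
    """
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return mean + z * sd


c = approx_c_signed_rank(25)        # n = 25, roughly 95% confidence
d = 25 * 26 / 2 - c + 1             # interval runs from d-th smallest average
print(c, d)
```

The interval would then run from about the dth smallest to the dth largest pairwise average, with d = n(n + 1)/2 – c + 1 as before.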

20 The Wilcoxon Signed-Rank Interval
The efficiency of the Wilcoxon interval relative to the t interval is roughly the same as that for the Wilcoxon test relative to the t test. In particular, for large samples when the underlying population is normal, the Wilcoxon interval will tend to be slightly wider than the t interval, but if the population is quite nonnormal (symmetric but with heavy tails), then the Wilcoxon interval will tend to be much shorter than the t interval.

21 The Wilcoxon Rank-Sum Interval

22 The Wilcoxon Rank-Sum Interval
The Wilcoxon rank-sum test for testing H0: μ1 – μ2 = Δ0 is carried out by first combining the (Xi – Δ0)’s and Yj’s into one sample of size m + n and ranking them from smallest (rank 1) to largest (rank m + n). The test statistic W is then the sum of the ranks of the (Xi – Δ0)’s. For the two-sided alternative, H0 is rejected if w is either too small or too large.

23 The Wilcoxon Rank-Sum Interval
To obtain the associated CI for fixed xi’s and yj’s, we must determine the set of all Δ0 values for which H0 is not rejected. This is easiest to do if the test statistic is expressed in a slightly different form.

24 The Wilcoxon Rank-Sum Interval
The smallest possible value of W is m(m + 1)/2, corresponding to every (Xi – Δ0) less than every Yj, and there are mn differences of the form (Xi – Δ0) – Yj. A bit of manipulation gives
$W = [\text{number of } (X_i - Y_j - \Delta_0)\text{'s} \ge 0] + \frac{m(m+1)}{2} = [\text{number of } (X_i - Y_j)\text{'s} \ge \Delta_0] + \frac{m(m+1)}{2}$ (15.8)
The P-value will be at most α, leading to rejection of the null hypothesis, if w is relatively small (close to m(m + 1)/2) or large (close to m(m + 2n + 1)/2). This is equivalent to rejecting H0 if the number of (xi – yj)’s ≥ Δ0 is either too small or too large.
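The relationship between the rank sum and the count of differences can be verified on a toy data set. Everything in this sketch is hypothetical: the two small samples, the null value Δ0 = 0.5, and the function names are invented for illustration, and the rank computation assumes no ties in the combined sample.

```python
def rank_sum_w(xs, ys, delta0):
    """W = sum of the ranks of the (x_i - delta0)'s in the combined sample.

    Assumes no ties, so the rank of v is 1 + (number of values below v).
    """
    shifted = [x - delta0 for x in xs]
    combined = shifted + list(ys)
    return sum(1 + sum(u < v for u in combined) for v in shifted)


def w_from_differences(xs, ys, delta0):
    """[number of (x_i - y_j)'s >= delta0] + m(m + 1)/2."""
    m = len(xs)
    count = sum((x - y) >= delta0 for x in xs for y in ys)
    return count + m * (m + 1) // 2


xs = [12, 15, 9]          # hypothetical first sample (m = 3)
ys = [7, 11, 14, 10]      # hypothetical second sample (n = 4)
delta0 = 0.5              # hypothetical null value

# Both expressions give 14 for this toy data.
print(rank_sum_w(xs, ys, delta0), w_from_differences(xs, ys, delta0))
```

The identity holds because the rank of each shifted x counts the shifted x's at or below it (these counts sum to m(m + 1)/2 over the sample) plus the y's below it (these counts sum to the number of differences exceeding Δ0).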

25 The Wilcoxon Rank-Sum Interval
Expression (15.8) suggests that we compute xi – yj for each i and j and order these mn differences from smallest to largest. Then if the null value Δ0 is neither smaller than most of the differences nor larger than most, H0: μ1 – μ2 = Δ0 is not rejected. Varying Δ0 now shows that a CI for μ1 – μ2 will have as its lower endpoint one of the ordered (xi – yj)’s, and similarly for the upper endpoint.

26 The Wilcoxon Rank-Sum Interval
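Proposition: Let x1, . . . , xm and y1, . . . , yn be the observed values in two independent samples from continuous distributions that differ only in location (and not in shape). With dij = xi – yj and the ordered differences denoted by dij(1), dij(2), . . . , dij(mn), the general form of a 100(1 – α)% CI for μ1 – μ2 is (dij(mn–c+1), dij(c)), where c is the critical constant for the two-tailed level α Wilcoxon rank-sum test.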

27 The Wilcoxon Rank-Sum Interval
Notice that the form of the Wilcoxon rank-sum interval (15.12) is very similar to the Wilcoxon signed-rank interval (15.10); (15.10) uses pairwise averages from a single sample, whereas (15.12) uses pairwise differences from two samples. Appendix Table A.16 gives values of c for selected values of m and n.

28 Example 15.8 The article “Some Mechanical Properties of Impregnated Bark Board” (Forest Products J., 1977: 31–38) reports the following data on maximum crushing strength (psi) for a sample of epoxy-impregnated bark board and for a sample of bark board impregnated with another polymer: Let’s obtain a 95% CI for the true average difference in crushing strength between the epoxy-impregnated board and the other type of board.

29 Example 15.8 cont’d From Appendix Table A.16, since the smaller sample size is 5 and the larger sample size is 6, c = 26 for a confidence level of approximately 95%. The dij’s appear in Table 15.5 (Differences for the Rank-Sum Interval in Example 15.8).

30 Example 15.8 cont’d The five smallest dij’s [dij(1), . . . , dij(5)] are 4350, 4470, 4610, 4730, and 4830; and the five largest dij’s are (in descending order) 9790, 9530, 8740, 8480, and 8220. Thus the CI is (dij(5), dij(26)) = (4830, 8220).
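The rank-sum interval can be sketched in a few lines: form all mn differences, sort them, and pick the (mn – c + 1)th and cth ordered values. Note the assumptions: the function name is illustrative, c = 26 is taken from the example, and the two samples in the code are assumed values, since this transcript does not reproduce the data table. They were chosen to be consistent with the ten extreme differences quoted above and should be checked against the original source before being relied on.

```python
def rank_sum_interval(xs, ys, c):
    """Return (d_(mn - c + 1), d_(c)) from the ordered differences x_i - y_j.

    c is the critical constant of the two-tailed rank-sum test, supplied
    by the caller (e.g. from a table).
    """
    diffs = sorted(x - y for x in xs for y in ys)
    mn = len(diffs)
    return diffs[mn - c], diffs[c - 1]   # 0-based positions of the two ranks


# ASSUMED data (not given in the transcript); consistent with the quoted
# five smallest and five largest differences of the example.
epoxy = [10860, 11120, 11340, 12130, 14380, 13070]
other = [4590, 4850, 6510, 5640, 6390]

diffs = sorted(x - y for x in epoxy for y in other)
print(diffs[:5])                  # five smallest differences
print(diffs[-5:][::-1])           # five largest, in descending order
print(rank_sum_interval(epoxy, other, c=26))
```

With m·n = 30 differences and c = 26, the lower endpoint is the 5th smallest difference and the upper endpoint is the 26th smallest, which is the 5th largest.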

31 The Wilcoxon Rank-Sum Interval
When m and n are both large, the Wilcoxon test statistic has approximately a normal distribution. This can be used to derive a large-sample approximation for the value c in interval (15.12). The result is
$c \approx \frac{mn}{2} + z_{\alpha/2}\sqrt{\frac{mn(m+n+1)}{12}}$
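A sketch of the large-sample constant for the rank-sum interval follows. It assumes the standard normal approximation in which the number of differences xi – yj that are ≥ Δ0 has mean mn/2 and variance mn(m + n + 1)/12 under H0. The function name and `conf_level` parameter are illustrative, and the result is returned unrounded.

```python
from math import sqrt
from statistics import NormalDist


def approx_c_rank_sum(m, n, conf_level=0.95):
    """Normal-approximation value of c for the rank-sum interval.

    Uses mean mn/2 and variance mn(m + n + 1)/12 for the count of
    differences x_i - y_j that are >= delta0 under H0.
    """
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    return m * n / 2 + z * sqrt(m * n * (m + n + 1) / 12)


# Sanity check only: m = 6 and n = 5 are not large, yet the approximation
# (about 25.7) is already close to the tabled c = 26 of Example 15.8.
print(approx_c_rank_sum(6, 5))
```

For genuinely large samples, the endpoints would be the (mn – c + 1)th and cth ordered differences, with c taken from this approximation.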

32 The Wilcoxon Rank-Sum Interval
As with the signed-rank interval, the rank-sum interval (15.9) is quite efficient with respect to the t interval; in large samples, it will tend to be only a bit longer than the t interval when the underlying populations are normal and may be considerably shorter than the t interval if the underlying populations have heavier tails than do normal populations.

