14 Elements of Nonparametric Statistics
Copyright © Cengage Learning. All rights reserved.
14.4 The Runs Test
The runs test is used most frequently to test the randomness (or lack of randomness) of data. A run is a sequence of data that possess a common property. One run ends and another starts when an observation does not display the property in question. The test statistic in this test is V, the number of runs observed. The following example illustrates what constitutes a run and how to count the number of runs.
Example 9 – Determining the Number of Runs
To illustrate the idea of runs, let’s draw a sample of 10 single-digit numbers from the telephone book, listing the next-to-last digit from each of the selected telephone numbers. Let’s consider the property of “odd” (o) or “even” (e). The sample, as it was drawn, becomes e, o, o, o, e, e, e, e, e, o, which displays four runs: e | o o o | e e e e e | o. Thus V = 4.
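The counting rule in Example 9 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `count_runs` is an assumption of this example, not something taken from the text. A new run is counted every time the label changes.

```python
def count_runs(labels):
    """Return V, the number of runs (maximal blocks of identical adjacent labels)."""
    runs = 0
    previous = None
    for label in labels:
        if label != previous:  # a new run starts whenever the label changes
            runs += 1
        previous = label
    return runs


# The odd/even sequence from Example 9
sample = ["e", "o", "o", "o", "e", "e", "e", "e", "e", "o"]
print(count_runs(sample))  # 4
```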
In Example 9, if the odd and even digits were completely separated, there would be only two runs: all the evens, then all the odds, or the other way around. At the other extreme, we would not expect a random sample to alternate perfectly: odd, even, odd, even. The number of runs can be at most n1 + n2, and it is fewer whenever n1 and n2 differ by more than one (the maximum is then 2·min(n1, n2) + 1), where n1 and n2 are the numbers of data that have each of the two properties being identified.
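As a rough check on the smallest and largest possible values of V, the sketch below computes both bounds from n1 and n2. The helper name `run_bounds` is hypothetical; the bound 2·min(n1, n2) + 1 for unequal counts follows from the fact that runs of the two kinds must alternate.

```python
def run_bounds(n1, n2):
    """Return (fewest, most) possible runs for n1 items of one kind and n2 of the other."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("both categories must be present")
    smaller = min(n1, n2)
    # Runs alternate between the two kinds, so the larger group can
    # contribute at most one run more than the smaller group.
    most = 2 * smaller + (0 if n1 == n2 else 1)
    return 2, most


print(run_bounds(4, 6))    # Example 9 has 4 odd and 6 even digits: (2, 9)
print(run_bounds(15, 15))  # (2, 30)
```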
Assumption for inferences about randomness using the runs test: Each sample data value can be classified into one of two categories.
The runs test is generally a two-tailed test. We will reject the hypothesis of randomness when there are too few runs because this indicates that the data are “separated” according to the two properties. We will also reject the hypothesis when there are too many runs because that indicates that the data alternate between the two properties too often to be random.
For example, if the data alternated all the way down the line, we might suspect that the data had been tampered with. There are many aspects to the concept of randomness. The occurrence of odd and even as discussed in Example 9 is one aspect. Another aspect of randomness that we might wish to check is the ordering of fluctuations of the data above or below the mean or median of the sample.
Example 10 – Hypothesis Test for Randomness
Consider the following sample and determine whether the data points form a random sequence with regard to being above or below the median value. Test the null hypothesis that this sequence is random. Use α = 0.05.
Example 10 – Solution
Step 1
a. Parameter of interest: Randomness of the values above or below the median
b. Statement of hypotheses:
Ho: The numbers in the sample form a random sequence with respect to the two properties “above” and “below” the median value.
Ha: The sequence is not random.
Step 2
a. Assumptions: Each sample data value can be classified as “above” or “below” the median.
b. Test statistic: V, the number of runs in the sample data
c. Level of significance: α = 0.05
Step 3
a. Sample information: The sample data are listed at the beginning of the example.
b. Test statistic: First we must rank the data and find the median. Since there are 30 data values, the depth of the median is (n + 1)/2 = (30 + 1)/2 = 15.5, so the median is the average of the 15th and 16th ranked values. Thus, the median is 3.5.
By comparing each number in the original sample to the value of the median, we obtain the following sequence of a’s (above) and b’s (below):
b a b a a b a b b b a b a b b a b a b a a b a a b a b a b a
We observe na = 15, nb = 15, and 24 runs, so V = 24. If n1 and n2 are both less than or equal to 20 and a two-tailed test at α = 0.05 is desired, then Table 14 in Appendix B is used to complete the hypothesis test.
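The tallies in Step 3 can be checked with a short sketch. The a/b sequence below is the one reported in the example; the helper `above_below` and the small `toy` data set are hypothetical additions for illustration only. Dropping values that tie with the median is an assumption of this sketch, since the example (median 3.5) has no ties and does not say how they would be handled.

```python
from itertools import groupby
from statistics import median


def above_below(values):
    """Label each value 'a' (above) or 'b' (below) the sample median.

    Values equal to the median are dropped; that tie rule is an assumption.
    """
    m = median(values)
    return ["a" if v > m else "b" for v in values if v != m]


def count_runs(labels):
    """Number of runs: groupby yields one group per block of equal neighbours."""
    return sum(1 for _ in groupby(labels))


# The a/b sequence reported in Example 10
seq = "b a b a a b a b b b a b a b b a b a b a a b a a b a b a b a".split()
print(seq.count("a"), seq.count("b"), count_runs(seq))  # 15 15 24

# Hypothetical toy data (not from the example); its median is 3.5
toy = [2, 5, 1, 6, 3, 4]
print(above_below(toy))  # ['b', 'a', 'b', 'a', 'b', 'a']
```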
Step 4 Probability Distribution: p-Value:
a. Since the concern is for values related to “not random,” the test is two-tailed. The p-value is found by finding the probability of the right tail and doubling: P = 2 × P(V ≥ 24 for na = 15 and nb = 15)
To find the p-value, you have two options:
1. Use Table 14 (Appendix B) to place bounds on the p-value: P < 0.05.
2. Use a computer or calculator to find the p-value.
b. The p-value is smaller than α.
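For the "computer or calculator" option, one possible sketch computes the exact null distribution of V by counting arrangements and then doubles the right tail, as described in Step 4. The combinatorial formula used here is the standard one for the number of runs under random ordering; it is not quoted from the text, and the function name `runs_pmf` is an assumption of this sketch.

```python
from math import comb


def runs_pmf(n1, n2):
    """Exact null distribution of V as a dict {v: P(V = v)}."""
    total = comb(n1 + n2, n1)  # equally likely arrangements of the two labels
    pmf = {}
    for v in range(2, n1 + n2 + 1):
        k = v // 2
        if v % 2 == 0:  # k runs of each kind; either kind may come first
            ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:           # one kind has k + 1 runs, the other has k
            ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                    + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
        if ways:
            pmf[v] = ways / total
    return pmf


pmf = runs_pmf(15, 15)
right_tail = sum(p for v, p in pmf.items() if v >= 24)
p_value = min(1.0, 2 * right_tail)  # two-tailed: double the right tail
print(round(p_value, 4))  # 0.0045
```

Under these assumptions the result is well below 0.05, in agreement with the bound obtained from Table 14.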
Alternative procedure for Step 4
Classical:
a. Since the concern is for values related to “not random,” the test is two-tailed. Use Table 14 for two-tailed α = 0.05. The critical values are at the intersection of column n1 = 15 and row n2 = 15: 10 and 22. The critical region is V ≤ 10 or V ≥ 22.
b. V = 24 is in the critical region, as shown in the figure.
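The tabled critical values can be approximated from the exact distribution of V. The sketch below is one way to do it, assuming each tail may hold at most α/2; published tables may follow a slightly different convention, and the names `runs_pmf` and `critical_values` are hypothetical.

```python
from math import comb


def runs_pmf(n1, n2):
    """Exact null distribution of V (standard combinatorial formula)."""
    total = comb(n1 + n2, n1)
    pmf = {}
    for v in range(2, n1 + n2 + 1):
        k = v // 2
        if v % 2 == 0:
            ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:
            ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                    + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
        if ways:
            pmf[v] = ways / total
    return pmf


def critical_values(n1, n2, alpha=0.05):
    """Largest lower and smallest upper cutoff with at most alpha/2 in each tail."""
    pmf = runs_pmf(n1, n2)
    values = sorted(pmf)
    lower = upper = None
    cumulative = 0.0
    for v in values:                 # accumulate the left tail
        cumulative += pmf[v]
        if cumulative > alpha / 2:
            break
        lower = v
    cumulative = 0.0
    for v in reversed(values):       # accumulate the right tail
        cumulative += pmf[v]
        if cumulative > alpha / 2:
            break
        upper = v
    return lower, upper


print(critical_values(15, 15))  # (10, 22)
```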
Step 5
a. Decision: Reject Ho.
b. Conclusion: We are able to reject the hypothesis of randomness at the 0.05 level of significance and conclude that the sequence is not random with regard to above and below the median.
Calculating the p-Value when Using the Runs Test
Method 1: Use Table 14 in Appendix B to place bounds on the p-value. By inspecting Table 14 at the intersection of column n1 = 15 and row n2 = 15, you can determine that the p-value is less than 0.05; the observed value of V = 24 is larger than the larger critical value listed.
Method 2: If you are doing the hypothesis test with the aid of a computer or graphing calculator, most likely it will calculate the p-value for you.
Normal Approximation
To complete the hypothesis test about randomness when n1 and n2 are larger than 20 or when α is other than 0.05, we will use z, the standard normal random variable. V is approximately normally distributed with a mean of μ_V and a standard deviation of σ_V. The formulas for the mean and standard deviation of the V statistic and the test statistic follow:
$$\mu_V = \frac{2 n_1 n_2}{n_1 + n_2} + 1 \tag{14.8}$$
$$\sigma_V = \sqrt{\frac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)}} \tag{14.9}$$
$$z = \frac{V - \mu_V}{\sigma_V} \tag{14.10}$$
Example 11 – Two-tailed Hypothesis Test for Randomness
Test the null hypothesis that the sequence of sample data in Table 14.7 is a random sequence with regard to each data value being odd or even. Use α = 0.10. (Data are in sequence across the rows.)
Table 14.7 Sample Data for Example 11
Example 11 – Solution
Step 1
a. Parameter of interest: Randomness of odd and even numbers
b. Statement of hypotheses:
Ho: The sequence of odd and even numbers is random.
Ha: The sequence is not random.
Step 2
a. Assumptions: Each sample value can be classified as either odd or even.
b. Test statistic: V, the number of runs in the sample data
c. Level of significance: α = 0.10
Step 3
a. Sample information: The data are given at the beginning of the example.
b. Test statistic: The sample data are converted to “o” for odd and “e” for even.
The converted sequence reveals no = 26, ne = 24, and 29 runs, so V = 29. Now use formulas (14.8), (14.9), and (14.10) to determine the z-statistic:
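$$\mu_V = \frac{2(26)(24)}{26 + 24} + 1 = 25.96$$
$$\sigma_V = \sqrt{\frac{2(26)(24)\,[2(26)(24) - 26 - 24]}{(26 + 24)^2 \,(26 + 24 - 1)}} = \sqrt{\frac{(1248)(1198)}{(2500)(49)}} \approx 3.49$$
$$z = \frac{29 - 25.96}{3.49} \approx 0.87$$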
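The normal approximation can be sketched as a small function. The name `runs_z` is hypothetical; the formulas are the usual mean and standard deviation of the number of runs under randomness, applied here to the counts reported in Example 11 (no = 26, ne = 24, V = 29).

```python
from math import sqrt


def runs_z(n1, n2, v):
    """Return (mean, standard deviation, z) for the large-sample runs test."""
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    sigma = sqrt(
        2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    return mu, sigma, (v - mu) / sigma


mu, sigma, z = runs_z(26, 24, 29)
print(round(mu, 2), round(sigma, 2), round(z, 2))  # 25.96 3.49 0.87
```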
Step 4 Probability Distribution: p-Value:
a. A two-tailed test is used: P = 2 × P(z > 0.87)
To find the p-value, you have three options:
1. Use Table 3 (Appendix B) to calculate the p-value.
2. Use Table 5 (Appendix B) to place bounds on the p-value.
3. Use a computer or calculator to find the p-value.
b. The p-value is not smaller than α.
Classical:
a. A two-tailed test is used. The critical values are obtained from Table 4A: –z(0.05) = –1.65 and z(0.05) = 1.65
b. z = 0.87 is not in the critical region, as shown in red in the figure.
Step 5
a. Decision: Fail to reject Ho.
b. Conclusion: At the 0.10 level of significance, we are unable to reject the hypothesis of randomness; the data are consistent with a random sequence.
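Both versions of Step 4 in Example 11 can be checked with Python's standard library. This is a sketch that takes z = 0.87 and α = 0.10 from the example and uses `statistics.NormalDist` in place of the appendix tables, so the critical value comes out as 1.645 rather than the rounded 1.65.

```python
from statistics import NormalDist

z = 0.87      # test statistic from Step 3
alpha = 0.10  # level of significance

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# p-value approach: double the right-tail area beyond |z|
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# Classical approach: two-tailed critical value z(alpha/2)
z_crit = std_normal.inv_cdf(1 - alpha / 2)

print(round(p_value, 3))  # 0.384
print(round(z_crit, 3))   # 1.645
print(p_value < alpha, abs(z) > z_crit)  # False False -> fail to reject Ho
```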