
Ka-fu Wong © 2003 Chap 16- 1 Dr. Ka-fu Wong ECON1003 Analysis of Economic Data.


1 Dr. Ka-fu Wong, ECON1003 Analysis of Economic Data

2 Chapter Sixteen: Nonparametric Methods: Analysis of Ranked Data
GOALS
- Conduct the sign test for dependent samples using the binomial distribution.
- Conduct the sign test for dependent samples using the normal approximation.
- Conduct a test of hypothesis for the population median.
- Conduct a test of hypothesis for dependent samples using the Wilcoxon signed-rank test.
- Conduct the Wilcoxon rank-sum test for independent samples.
- Conduct the Kruskal-Wallis test for several independent samples.
- Compute and interpret Spearman's coefficient of rank correlation.
- Conduct a test of hypothesis to determine whether the correlation among the ranks in the population is different from zero.

3 Tests based on signs and ranks
Some of the tests discussed earlier can be conducted differently, using the signs and ranks of the data. Tests based on signs and ranks:
- Handle a wider range of data types. Most of the tests discussed so far deal mainly with ratio-level data; rank-based and sign-based tests can also handle ordinal-level data.
- Require fewer distributional assumptions. Some of the earlier tests require normality and sometimes equal variances across populations.
Tests based on signs and ranks are therefore called nonparametric: they require fewer parametric assumptions.

4 The Sign Test
The sign test is based on the sign of the difference between two related observations.
- No assumption is necessary regarding the shape of the population of differences.
- The test statistic is referred to the binomial distribution for small samples and to the standard normal (z) distribution for large samples.
- The test requires dependent (related) samples.
Recall that in Chapter 11 we tested the difference between paired samples using the mean of the pairwise differences, with a t-statistic.

5 The Sign Test applications
Tests of "before/after" experiments:
- Have sales increased after the introduction of a new marketing strategy?
- Have general prices fallen after the outbreak of SARS in Hong Kong?
- Have stock prices fallen after the outbreak of SARS in Hong Kong?

6 The Sign Test procedure
Procedure to conduct the test:
- Determine the sign of the difference between related pairs.
- Determine the number of usable pairs (pairs with a nonzero difference).
- Compare the number of positive (or negative) differences to the critical value.
Idea of the test:
- Each observation pair is classified as positive (success) or negative (failure). Hence the number of observed positives follows a binomial distribution, and likewise the number of observed negatives.
- If we test for no change from one group to the other (or from before to after), the probability of success (positive) in any single draw under the null is π = 0.5.
- In a sample of n observations (n usable pairs without ties), the probability of observing X positives may be computed using the binomial probability formula
\[ P(X) = \binom{n}{X} \pi^{X} (1-\pi)^{n-X} \]

7 Normal Approximation
If both nπ and n(1-π) are greater than 5, we can use the z distribution as an approximation to the binomial distribution, adjusted by the continuity correction factor (see Chapter 7). With π = 0.5, the mean is 0.5n and the standard deviation is 0.5√n.
If the number of pluses or minuses, X, is more than n/2, then
\[ z = \frac{(X - 0.5) - 0.5n}{0.5\sqrt{n}} \]
If the number of pluses or minuses, X, is less than n/2, then
\[ z = \frac{(X + 0.5) - 0.5n}{0.5\sqrt{n}} \]

8 EXAMPLE 1
The Gagliano Research Institute for Business Studies is comparing the research and development (R&D) expense as a percent of income for a sample of glass manufacturing firms for 2000 and 2001. At the 0.05 significance level, has the R&D expense declined? Use the sign test.

Company          2000   2001
Savoth Glass       20     16
Ruisi Glass        14     13
Rubin Inc.         23     20
Vaught             24     17
Lambert Glass      31     22
Pimental           22     20
Olson Glass        14     20
Flynn Glass        18     11

9 EXAMPLE 1 continued
First, compute the differences (2000 minus 2001) and determine the signs.

Company          2000   2001   Difference   Sign
Savoth Glass       20     16            4      +
Ruisi Glass        14     13            1      +
Rubin Inc.         23     20            3      +
Vaught             24     17            7      +
Lambert Glass      31     22            9      +
Pimental           22     20            2      +
Olson Glass        14     20           -6      -
Flynn Glass        18     11            7      +

10 EXAMPLE 1 continued
Step 1: Let π be the probability that a randomly drawn firm has a higher R&D expense in 2001 than in 2000 (a - sign, since the differences are computed as 2000 minus 2001).
If the R&D expense remains more or less unchanged, this probability should be about 0.5. H0: π = 0.5
If the R&D expense has declined, this probability should be lower than 0.5. H1: π < 0.5

11 EXAMPLE 1 continued
Step 1: H0: π = 0.5, H1: π < 0.5
Step 2: H0 is rejected if the number of negative signs, X, is 0 or 1. At the 0.05 level of significance the critical z value is -1.65, so under the normal approximation the critical value of X is
\[ X = (-1.65)(0.5)\sqrt{8} + 0.5(8) - 0.5 \approx 1.167 \]
Step 3: There is one negative difference and there are seven positive differences. That is, the percent increased for only one company.
Step 4: H0 is rejected. We conclude that R&D expense as a percent of income declined from 2000 to 2001.
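
Because n × 0.5 = 4 here is not greater than 5, the exact binomial distribution is a natural cross-check on the normal-approximation cutoff. The sketch below is illustrative only: the helper name `binom_cdf` and the variable names are assumptions, not part of the slides.

```python
from math import comb, sqrt

def binom_cdf(x, n, p=0.5):
    """P(X <= x) for X ~ Binomial(n, p). Hypothetical helper name."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Differences (2000 minus 2001) from Example 1.
diffs = [4, 1, 3, 7, 9, 2, -6, 7]
usable = [d for d in diffs if d != 0]     # ties would be dropped (none here)
n = len(usable)
negatives = sum(d < 0 for d in usable)    # firms whose percent increased

# Exact lower-tail probability of seeing this few negative signs under H0.
p_value = binom_cdf(negatives, n)

# Normal-approximation cutoff for X, as in Step 2 (z critical value -1.65).
x_crit = -1.65 * 0.5 * sqrt(n) + 0.5 * n - 0.5

print(n, negatives, p_value, x_crit)
```

The exact probability of one or fewer negative signs is 9/256, about 0.035, which is below 0.05, so the exact test reaches the same decision as Step 4.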

12 Using the sign test to test a hypothesis about a median
When testing the value of the median, we use the normal approximation to the binomial distribution.
- Any observation above the proposed median is classified as positive (success), and any observation below it as negative (failure).
- If the proposed median is correct, the observed positives should be about 50% of the sample. Hence the number of observed positives follows a binomial distribution with π = 0.5, and likewise the number of observed negatives.
- As before, the probability of success (positive) in any single draw under the null is π = 0.5. In a sample of n observations (n usable observations, excluding any equal to the proposed median), the probability of observing X positives may be computed using the binomial probability formula.
- When the sample size is large, the z distribution is used as an approximation, with a continuity correction.

13 EXAMPLE 2
The Gordon Travel Agency claims that its median airfare for all its clients to all destinations is $450. This claim is being challenged by a competing agency, which believes the median is different from $450. A random sample of 300 tickets revealed that 170 tickets were below $450. Use the 0.05 level of significance.

14 EXAMPLE 2 continued
H0: median = $450 versus H1: median ≠ $450
A fare above the proposed median gets a positive sign; a fare below it gets a negative sign.
Because the alternative is two-sided, at the 0.05 significance level H0 is rejected if z is less than -1.96 or greater than 1.96.
Using the 170 negative signs (more than n/2 = 150), the value of the test statistic is
\[ z = \frac{(170 - 0.5) - 0.5(300)}{0.5\sqrt{300}} \approx 2.25 \]
Because z is larger than 1.96, H0 is rejected. We conclude that the median is not $450.
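
The large-sample sign test for the median can be sketched as follows. The function name `sign_test_z` is hypothetical, and the count of tickets above $450 assumes that no ticket in the sample was priced at exactly $450, which the slides do not state.

```python
from math import sqrt

def sign_test_z(x, n):
    """z statistic for the sign test under pi = 0.5, with continuity correction.

    x is the number of pluses (or minuses) among n usable observations.
    """
    mu = 0.5 * n
    sigma = 0.5 * sqrt(n)
    if x > n / 2:
        return (x - 0.5 - mu) / sigma
    if x < n / 2:
        return (x + 0.5 - mu) / sigma
    return 0.0  # assumption: x exactly at n/2 is treated as no evidence either way

n = 300
below = 170          # minus signs, from Example 2
above = n - below    # plus signs, assuming no fare equals $450 exactly

z_below = sign_test_z(below, n)
z_above = sign_test_z(above, n)

# Two-sided test at the 0.05 level: reject H0 when |z| > 1.96.
reject = abs(z_below) > 1.96
print(z_below, z_above, reject)
```

Counting pluses instead of minuses only flips the sign of z, so the two-sided decision is the same either way.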

15 Wilcoxon Signed-Rank Test
If the assumption of normality is violated for the paired t-test (recall Chapter 11), the paired t-test cannot be used. The Wilcoxon signed-rank test does not assume normality and hence can be used in this situation.
- The test requires at least the ordinal scale of measurement.
- The observations must be related or dependent.
- It serves as an alternative to the sign test for testing "no change" in "before/after" experiments.

16 Wilcoxon Signed-Rank Test
The steps for the test are:
1. Compute the differences between related observations.
2. Rank the absolute differences from low to high.
3. Return the signs to the ranks and sum the positive and negative ranks. If there is no change after the experiment, the differences are due only to random error, so the two rank sums should be close.
4. Compare the smaller of the two rank sums with the T value obtained from Appendix H.

17 EXAMPLE 3
Use the Wilcoxon matched-pair signed-rank test to determine whether the R&D expenses as a percent of income (EXAMPLE 1) have declined. Use the 0.05 significance level.
Step 1: H0: The percent stayed the same. H1: The percent declined.
Step 2: H0 is rejected if the smaller of the rank sums is less than or equal to 5. See Appendix H.

18 EXAMPLE 3 continued

Company          2000   2001   Diff   Abs-diff   Rank   R+   R-
Savoth Glass       20     16      4          4      4    4    -
Ruisi Glass        14     13      1          1      1    1    -
Rubin Inc.         23     20      3          3      3    3    -
Vaught             24     17      7          7      7    7    -
Lambert Glass      31     22      9          9      8    8    -
Pimental           22     20      2          2      2    2    -
Olson Glass        14     20     -6          6      5    -    5
Flynn Glass        18     11      7          7      6    6    -
Sum                                                     31    5

Vaught and Flynn tie with an absolute difference of 7. Giving each the average rank 6.5 instead of 7 and 6 leaves both rank sums unchanged.
The smaller rank sum is 5, which is equal to the critical value of T. H0 is rejected. The percent has declined from one year to the next.
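
The four steps of the signed-rank procedure can be sketched in a few lines. This is an illustration under stated assumptions: the helper `average_ranks` is a hypothetical name, tied absolute differences are given their average rank, and the critical value 5 is the one quoted from Appendix H on the slide.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Step 1: differences (2000 minus 2001) from Example 1.
diffs = [4, 1, 3, 7, 9, 2, -6, 7]

# Step 2: rank the absolute differences from low to high.
ranks = average_ranks([abs(d) for d in diffs])

# Step 3: return the signs to the ranks and sum each group.
r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

# Step 4: compare the smaller rank sum with the tabulated T value.
t_stat = min(r_plus, r_minus)
t_crit = 5                               # from Appendix H, as quoted on the slide
reject = t_stat <= t_crit
print(r_plus, r_minus, reject)
```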

19 Wilcoxon Rank-Sum Test
The Wilcoxon rank-sum test is used to determine whether two independent samples came from the same population or from identical populations.
- No assumption about the shape of the population is required.
- The data must be at least ordinal scale.
- Each sample must contain at least eight observations.
If the two samples are from the same population, the average ranks of the two samples should be about the same.
Recall that the similar t-test in Chapter 11 requires the data to follow a normal distribution and the populations to have equal variances.

20 Wilcoxon Rank-Sum Test
To determine the value of the test statistic, all data values are ranked from low to high as if they were from a single population, and the sum of ranks for each of the two samples is determined.
The sum of ranks of the first sample, W, is used to compute the test statistic:
\[ z = \frac{W - \dfrac{n_1(n_1+n_2+1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1+n_2+1)}{12}}} \]
Here n_1 and n_2 are the sizes of the first and second samples, and n_1(n_1+n_2+1)/2 is the implied sum of ranks for the first sample under the null.
For a one-sided test, the sample whose rank sum is used as W has to be chosen to be consistent with the hypothesis.

21 EXAMPLE 4
Hills Community College purchased two vehicles, a Ford and a Chevy, for the administration's use when traveling. The repair costs for the two cars over the last three years are shown below. At the 0.05 significance level, is there a difference in the two distributions?

Ford ($)   Chevy ($)
   25.31       14.89
   33.68       20.31
   46.89       25.97
   51.83       33.68
   87.65       68.98
   87.90       78.23
   90.89       80.31
  120.67       81.75
              157.90

22 EXAMPLE 4 continued
First, rank the combined sample and compute the sum of ranks separately for the two samples.

Ford ($)   Rank   Chevy ($)   Rank
   25.31    3.0       14.89    1.0
   33.68    5.5       20.31    2.0
   46.89    7.0       25.97    4.0
   51.83    8.0       33.68    5.5
   87.65   13.0       68.98    9.0
   87.90   14.0       78.23   10.0
   90.89   15.0       80.31   11.0
  120.67   16.0       81.75   12.0
                     157.90   17.0
Sum        81.5               71.5

23 EXAMPLE 4 continued
Step 1: H0: The repair costs are the same. H1: The repair costs are not the same.
Because it is a two-sided test, W can be the rank sum of either the smaller sample or the larger sample.
Step 2: H0 is rejected if z is greater than 1.96 or less than -1.96.
Suppose instead we have H0: The repair costs are the same. H1: The repair costs are lower for Ford than for Chevy. We should then reject the null in favor of the alternative if the average rank for Ford is lower than that for Chevy, which corresponds to a small (large negative) z. Hence we would choose Ford as the first sample. See the textbook for additional examples.

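24 EXAMPLE 4 continued
Step 3: With Ford as the first sample (W = 81.5, n_1 = 8, n_2 = 9), the value of the test statistic is
\[ z = \frac{81.5 - \dfrac{8(8+9+1)}{2}}{\sqrt{\dfrac{8 \cdot 9 \cdot (8+9+1)}{12}}} = \frac{81.5 - 72}{\sqrt{108}} \approx 0.914 \]
Step 4: We do not reject the null hypothesis. We cannot conclude that there is a difference in the distributions of the repair costs of the two vehicles.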
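
The rank-sum calculation can be reproduced with a short sketch. The helper `average_ranks` is a hypothetical name, and the data assignment is an assumption: the 157.90 repair cost is placed in the Chevy sample, which is the reading under which the rank sums come out to 81.5 and 71.5 and the statistic to 0.914.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from low to high; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Assumption: 157.90 belongs to Chevy (8 Ford and 9 Chevy observations).
ford = [25.31, 33.68, 46.89, 51.83, 87.65, 87.90, 90.89, 120.67]
chevy = [14.89, 20.31, 25.97, 33.68, 68.98, 78.23, 80.31, 81.75, 157.90]

# Rank the combined sample as if it came from a single population.
ranks = average_ranks(ford + chevy)
n1, n2 = len(ford), len(chevy)
w_ford = sum(ranks[:n1])                 # W: rank sum of the first sample
w_chevy = sum(ranks[n1:])

expected_w = n1 * (n1 + n2 + 1) / 2      # implied rank sum under the null
std_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (w_ford - expected_w) / std_w

reject = abs(z) > 1.96                   # two-sided test at the 0.05 level
print(w_ford, w_chevy, z, reject)
```

Using the Chevy rank sum as W instead gives the same magnitude with the opposite sign, which is why the choice of first sample does not matter for the two-sided test.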

25 Kruskal-Wallis Test: Analysis of Variance by Ranks
This test is used to compare three or more samples to determine whether they came from identical populations.
- At least the ordinal scale of measurement is required.
- It is an alternative to the one-way ANOVA.
- The test statistic approximately follows the chi-square distribution with degrees of freedom equal to the number of samples minus 1.
- Each sample should have at least five observations.
- The sample data are ranked from low to high as if they were a single group.

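26 Kruskal-Wallis Test: Analysis of Variance by Ranks continued
The test statistic is given by:
\[ H = \frac{12}{n(n+1)} \left[ \frac{(\sum R_1)^2}{n_1} + \frac{(\sum R_2)^2}{n_2} + \cdots + \frac{(\sum R_k)^2}{n_k} \right] - 3(n+1) \]
Here \(\sum R_j\) is the sum of ranks in sample j, n_j is the size of sample j, k is the number of samples, and n is the combined sample size.
If the samples come from the same population, the average ranks should be approximately the same across samples. If the samples are not from the same population, some of the squared rank sums divided by their sample sizes become large, returning a big H.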

27 EXAMPLE 5
Keely Ambrose, director of Human Resources for Miller Industries, wishes to study the percent increase in salary for middle managers at the four manufacturing plants. She gathers a sample of managers and determines the percent increase in salary from last year to this year. At the 5% significance level, can Keely conclude that there is a difference in the percent increases for the various plants?

28 EXAMPLE 5 continued
First, rank the combined sample.

Millville   Rank   Camden   Rank   Eaton   Rank   Vineland   Rank
      2.2    2.0      1.9    1.0     3.7    6.0        5.7    9.0
      3.6    5.0      2.7    3.0     4.5    7.0        6.8   10.5
      4.9    8.0      3.1    4.0     7.1   13.5        8.9   16.0
      6.8   10.5      6.9   12.0     9.3   17.0       11.6   18.5
      7.1   13.5      8.3   15.0    11.6   18.5       13.9   20.0
Sum         39.0            35.0           62.0              74.0

29 EXAMPLE 5 continued
Step 1: H0: The populations are the same. H1: The populations are not the same.
Step 2: H0 is rejected if χ² is greater than 7.815, the critical value with 3 degrees of freedom at the 0.05 significance level.

30 EXAMPLE 5 continued
Step 3: With rank sums 39, 35, 62 and 74, five observations per plant and n = 20, the value of the test statistic is
\[ H = \frac{12}{20(21)} \left[ \frac{39^2}{5} + \frac{35^2}{5} + \frac{62^2}{5} + \frac{74^2}{5} \right] - 3(21) \approx 5.95 \]
Step 4: Because 5.95 is less than the critical value 7.815 (3 degrees of freedom, 0.05 significance level), the null hypothesis is not rejected. We cannot conclude that there is a difference in the percent increases among the four plants.
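
The Kruskal-Wallis statistic for this example can be checked with a short sketch. The helper `average_ranks` and the variable names are hypothetical, and the code applies the uncorrected H formula (no adjustment for ties), which is an assumption about how the slide's value was obtained.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Percent salary increases by plant, from Example 5.
plants = {
    "Millville": [2.2, 3.6, 4.9, 6.8, 7.1],
    "Camden": [1.9, 2.7, 3.1, 6.9, 8.3],
    "Eaton": [3.7, 4.5, 7.1, 9.3, 11.6],
    "Vineland": [5.7, 6.8, 8.9, 11.6, 13.9],
}

# Rank the combined sample as if it were a single group.
combined = [x for sample in plants.values() for x in sample]
ranks = average_ranks(combined)
n = len(combined)

# Split the ranks back into per-plant rank sums.
rank_sums = {}
start = 0
for name, sample in plants.items():
    rank_sums[name] = sum(ranks[start:start + len(sample)])
    start += len(sample)

h = 12 / (n * (n + 1)) * sum(
    rank_sums[name] ** 2 / len(plants[name]) for name in plants
) - 3 * (n + 1)

chi2_crit = 7.815                        # chi-square, 3 df, 0.05 level
reject = h > chi2_crit
print(rank_sums, h, reject)
```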

31 Rank-Order Correlation
Spearman's coefficient of rank correlation reports the association between two sets of ranked observations. Its features are:
- It can range from -1.00 up to 1.00.
- It is similar to Pearson's coefficient of correlation, but is based on ranked data.

32 Spearman Coefficient of Rank Correlation
The formula to find the coefficient of rank correlation is:
\[ r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \]
where d is the difference in the ranks for each pair and n is the number of paired observations.

33 Testing the Significance of r_s
State the null hypothesis: the rank correlation in the population is 0.
State the alternate hypothesis: the rank correlation in the population is not 0.
The value of the test statistic is computed from the formula:
\[ t = r_s \sqrt{\frac{n - 2}{1 - r_s^2}} \]
which is compared with the t distribution with n - 2 degrees of freedom.

34 EXAMPLE 6
Below are the pre-season football rankings for the Atlantic Coast Conference by the coaches and sports writers. Determine the coefficient of rank correlation between the two groups.

School          Coaches   Writers
Maryland              2         3
NC State              3         4
NC                    6         6
Virginia              5         5
Clemson               4         2
Wake Forest           7         8
Duke                  8         7
Florida State         1         1

35 EXAMPLE 6 continued
Compute the differences in ranks (coaches minus writers) and their squares.

School          Coaches   Writers    d   d^2
Maryland              2         3   -1     1
NC State              3         4   -1     1
NC                    6         6    0     0
Virginia              5         5    0     0
Clemson               4         2    2     4
Wake Forest           7         8   -1     1
Duke                  8         7    1     1
Florida State         1         1    0     0
Total                                      8

36 EXAMPLE 6 continued
\[ r_s = 1 - \frac{6(8)}{8(8^2 - 1)} = 1 - \frac{48}{504} \approx 0.905 \]
There is a strong correlation between the ranks of the coaches and the sports writers.
\[ t = 0.905 \sqrt{\frac{8 - 2}{1 - 0.905^2}} \approx 5.20 \]
Since the test statistic is larger than the critical value (2.447 from the t distribution with 6 degrees of freedom), the null hypothesis of zero correlation is rejected.
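
The rank correlation and its significance test can be verified with a minimal sketch. The variable names are assumptions; the rankings are those listed in Example 6, and the critical value 2.447 is the one quoted on the slide.

```python
from math import sqrt

# Pre-season rankings from Example 6, in the same school order.
coaches = [2, 3, 6, 5, 4, 7, 8, 1]
writers = [3, 4, 6, 5, 2, 8, 7, 1]

n = len(coaches)
sum_d2 = sum((c - w) ** 2 for c, w in zip(coaches, writers))

# Spearman's coefficient of rank correlation.
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# t statistic for H0: the population rank correlation is 0 (n - 2 df).
t = r_s * sqrt((n - 2) / (1 - r_s ** 2))

t_crit = 2.447                           # two-sided, 6 df, as quoted on the slide
reject = abs(t) > t_crit
print(sum_d2, r_s, t, reject)
```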

37 - END - Chapter Sixteen: Nonparametric Methods: Analysis of Ranked Data

