Copyright © 2010, 2007, 2004 Pearson Education, Inc Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by Mario F. Triola
Chapter 13 Nonparametric Statistics
13-1 Review and Preview
13-2 Sign Test
13-3 Wilcoxon Signed-Ranks Test for Matched Pairs
13-4 Wilcoxon Rank-Sum Test for Two Independent Samples
13-5 Kruskal-Wallis Test
13-6 Rank Correlation
13-7 Runs Test for Randomness
Section 13-6 Rank Correlation
Key Concept
This section describes the nonparametric method of the rank correlation test, which uses paired data to test for an association between two variables. In Chapter 10 we used paired sample data to compute values for the linear correlation coefficient r, but in this section we use ranks as the basis for computing the rank correlation coefficient.
Definition
The rank correlation test (or Spearman’s rank correlation test) is a nonparametric test that uses ranks of sample data consisting of matched pairs. It is used to test for an association between two variables.
Advantages
Rank correlation has these advantages over the parametric methods discussed in Chapter 10:
1. The nonparametric method of rank correlation can be used in a wider variety of circumstances than the parametric method of linear correlation. With rank correlation, we can analyze paired data that are ranks or can be converted to ranks.
2. Rank correlation can be used to detect some (not all) relationships that are not linear.
Objective
Compute the rank correlation coefficient $r_s$ and use it to test for an association between two variables.
$H_0: \rho_s = 0$ (There is no correlation between the two variables.)
$H_1: \rho_s \neq 0$ (There is a correlation between the two variables.)
Notation
$r_s$ = rank correlation coefficient for sample paired data ($r_s$ is a sample statistic)
$\rho_s$ = rank correlation coefficient for all the population data ($\rho_s$ is a population parameter)
$n$ = number of pairs of sample data
$d$ = difference between ranks for the two values within a pair
Requirements
1. The sample paired data have been randomly selected.
Note: Unlike the parametric methods of Section 10-2, there is no requirement that the sample pairs of data have a bivariate normal distribution. There is no requirement of a normal distribution for any population.
Rank Correlation Test Statistic
No ties: After converting the data in each sample to ranks, if there are no ties among ranks for either variable, the exact value of the test statistic can be calculated using this formula:
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
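The no-ties calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard Spearman formula $r_s = 1 - 6\sum d^2 / [n(n^2 - 1)]$ with $d$ and $n$ as defined in the Notation slide; the function names `ranks_no_ties` and `spearman_no_ties` are hypothetical and not part of the slides.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n. Assumes no two values are equal."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman_no_ties(x, y):
    """Rank correlation r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid without ties."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    if n < 2:
        raise ValueError("at least two pairs are required")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("ties present: this formula is exact only without ties")
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    # d is the difference between the two ranks within each pair.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
```

Perfectly concordant ranks give $\sum d^2 = 0$ and therefore $r_s = 1$; perfectly reversed ranks give $r_s = -1$.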
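Rank Correlation Test Statistic
Ties: After converting the data in each sample to ranks, if either variable has ties among its ranks, the exact value of the test statistic can be found by using Formula 10-1 with the ranks:
$$r_s = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
Here $x$ and $y$ denote the ranks, not the original data values.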
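When ties occur, the same idea can be sketched by ranking the data and then applying the linear correlation formula to the ranks. The sketch below assumes that tied values each receive the mean of the ranks they jointly occupy (a common convention, not stated on this slide), and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def linear_correlation(x, y):
    """Linear correlation coefficient computed from sums, as in the formula above."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    var_x = n * sx2 - sx ** 2
    var_y = n * sy2 - sy ** 2
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all values of a variable are equal")
    return (n * sxy - sx * sy) / (sqrt(var_x) * sqrt(var_y))


def spearman_with_ties(x, y):
    """Rank correlation: the linear correlation formula applied to the ranks."""
    return linear_correlation(average_ranks(x), average_ranks(y))
```

For data without ties this approach returns the same value as the $\sum d^2$ formula, so it can serve as a general-purpose calculation.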
Rank Correlation
Critical values: If $n \le 30$, critical values are found in Table A-9. If $n > 30$, use the formula
$$r_s = \frac{\pm z}{\sqrt{n - 1}}$$
where the value of $z$ corresponds to the significance level. (For example, if $\alpha = 0.05$, $z = 1.96$.)
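For large samples the critical values can be computed rather than looked up. The sketch below assumes a two-tailed test and the large-sample approximation $r_s = \pm z / \sqrt{n - 1}$ for $n > 30$; the function name `rank_corr_critical_value` is hypothetical. It obtains $z$ from the standard normal distribution instead of a rounded table value, so results can differ slightly from those based on $z = 1.96$.

```python
from math import sqrt
from statistics import NormalDist


def rank_corr_critical_value(n, alpha=0.05):
    """Positive critical value z / sqrt(n - 1) for a two-tailed test (assumes n > 30)."""
    if n <= 30:
        raise ValueError("for n <= 30, look up the critical values in Table A-9")
    # Two-tailed test: alpha / 2 in each tail of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(n - 1)


# Hypothetical sample size: with n = 50 and alpha = 0.05 the critical values
# are approximately +/- 1.96 / sqrt(49).
critical = rank_corr_critical_value(50)
print(f"critical values: +/-{critical:.3f}")
```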
Disadvantages
A disadvantage of rank correlation is its efficiency rating of 0.91, as described in Section …. This efficiency rating shows that, with all other circumstances being equal, the nonparametric approach of rank correlation requires 100 pairs of sample data to achieve the same results as only 91 pairs of sample observations analyzed through parametric methods, assuming that the stricter requirements of the parametric approach are met.
Figure 13-4 Rank Correlation for Testing $H_0: \rho_s = 0$ (figure not reproduced)
Example: Table 13-1 lists overall quality scores and selectivity rankings of a sample of national universities (based on data from U.S. News and World Report). Find the value of the rank correlation coefficient and use it to determine whether there is a correlation between the overall quality scores and the selectivity rankings. Use a 0.05 significance level. Based on the result, does it appear that national universities with higher overall quality scores are more difficult to get into?
Example: Requirement is satisfied: the paired data are a simple random sample. The selectivity data consist of ranks that are not normally distributed, so we use the rank correlation coefficient to test for a relationship between overall quality score and selectivity rank. The null and alternative hypotheses are as follows:
$H_0: \rho_s = 0$
$H_1: \rho_s \neq 0$
Example: Since neither variable has ties in the ranks, the exact value of the test statistic is calculated with the no-ties formula:
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
Example: From Table A-9, using $\alpha = 0.05$ and the sample size $n$, the critical values are $\pm 0.738$. Because the test statistic $r_s$ is not between the critical values of -0.738 and 0.738, we reject the null hypothesis. There is sufficient evidence to support a claim of a correlation between overall quality score and selectivity ranking. It appears that universities with higher quality scores are more selective and are more difficult to get into.
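The decision rule used here (reject the null hypothesis when the test statistic is not between the two critical values) can be sketched as a small helper. The function name `reject_no_correlation` and the sample value of $r_s$ below are hypothetical, chosen only to illustrate the rule; the critical value 0.738 is the one quoted in the example.

```python
def reject_no_correlation(r_s, critical_value):
    """Return True when r_s falls outside the interval (-critical_value, critical_value)."""
    if not -1 <= r_s <= 1:
        raise ValueError("a rank correlation coefficient must lie between -1 and 1")
    if critical_value <= 0:
        raise ValueError("pass the positive critical value")
    return abs(r_s) > critical_value


# Hypothetical test statistic, not the value from Table 13-1.
print(reject_no_correlation(-0.80, 0.738))  # outside the interval, so H0 is rejected
print(reject_no_correlation(0.50, 0.738))   # inside the interval, so H0 is not rejected
```

Because the test is two-tailed, a strongly negative $r_s$ leads to rejection just as a strongly positive one does.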
Example: Detecting a Nonlinear Pattern
An experiment involves a growing population of bacteria. Table 13-8 lists randomly selected times (in hr) after the experiment is begun, and the number of bacteria present. Use a 0.05 significance level to test the claim that there is a correlation between time and population size.
Example: Detecting a Nonlinear Pattern
Requirement is satisfied: the data are from a simple random sample. The hypotheses are:
$H_0: \rho_s = 0$
$H_1: \rho_s \neq 0$
We follow the rank correlation procedure summarized in the figure. The original values are not ranks, so we convert them to ranks and enter the results in Table 13-9.
Example: Detecting a Nonlinear Pattern
There are no ties among ranks of either list, so the test statistic is calculated with the no-ties formula.
Example: Detecting a Nonlinear Pattern
Since $n$ is less than 30, use Table A-9 to find the critical values. The test statistic $r_s = 1$ is not between the critical values, so we reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to conclude there is a correlation between time and population size.
Example: Detecting a Nonlinear Pattern
If this example is done using the methods of Section 10-2, the linear correlation coefficient r falls between the critical values of -0.632 and 0.632. This leads to the conclusion that there is not enough evidence to support the claim of a significant linear correlation, whereas the rank correlation test found that there was enough evidence. The Minitab scatter diagram shows that there is a nonlinear relationship that the parametric method would not have detected.
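The contrast between the two coefficients can be illustrated with a short sketch. The data below are hypothetical (a population that doubles every hour), not the values from Table 13-8, and the function names are illustrative. Because the population always increases with time, the ranks agree exactly and $r_s = 1$, while the linear correlation coefficient falls well below 1 because the points do not lie on a straight line.

```python
from math import sqrt


def ranks(values):
    """Rank values from 1 (smallest) to n. This sketch assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def linear_r(x, y):
    """Linear correlation coefficient r computed from sums of the raw data."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    return (n * sxy - sx * sy) / (sqrt(n * sx2 - sx ** 2) * sqrt(n * sy2 - sy ** 2))


def rank_r(x, y):
    """Rank correlation coefficient r_s from the no-ties formula."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical data: times in hours and a population that doubles each hour.
hours = list(range(1, 11))
bacteria = [2 ** t for t in hours]

r = linear_r(hours, bacteria)
r_s = rank_r(hours, bacteria)
print(f"linear r = {r:.3f}, rank correlation r_s = {r_s:.3f}")
```

Whether the linear coefficient is significant depends on the data; the point of the sketch is only that a monotonic but curved pattern yields a perfect rank correlation and an imperfect linear one.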
Recap
In this section we have discussed rank correlation, which is the nonparametric counterpart of the test for correlation described in Chapter 10. It uses ranks of matched pairs to test for association. Rank correlation can sometimes detect a nonlinear correlation that the parametric test will not recognize.