Lecture 17 Rank Correlation Coefficient Outline of Today Rank Correlation Coefficients 11/14/2018 SA3202, Lecture 17
Rank Correlation Coefficient
Recall that a measure of association (correlation) between two numerical random variables X and Y is Pearson’s correlation coefficient: r = Cor(X,Y).
Properties:
1. -1 <= r <= 1
2. X and Y are positively related when r > 0, negatively related when r < 0, and linearly uncorrelated when r = 0.
Given a sample (Xi, Yi), i = 1, 2, ..., n, the sample correlation coefficient r can be used to test whether there is an association between X and Y. But the null distribution of r depends on the joint distribution of X and Y, which is usually assumed to be bivariate normal.
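The Definition
For a distribution-free test of association, we replace the observations with their ranks and then compute the sample correlation coefficient based on the ranks. This measure is known as Spearman’s rank correlation coefficient:
$$ r_s = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} $$
where xi is the rank of Xi within the X’s, and yi is the rank of Yi within the Y’s. Ties are treated in the usual manner: tied observations each receive the average of the ranks they would otherwise occupy.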
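The definition above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names `midranks`, `pearson`, and `spearman` are made up for this example, and ties are assumed to be handled by average ranks. It ranks from the smallest value upward; any direction works as long as both variables are ranked the same way.

```python
from math import sqrt

def midranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's r_s: Pearson's r computed on the ranks."""
    return pearson(midranks(x), midranks(y))

print(midranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Because only the ranks enter the calculation, any strictly increasing relationship between the two variables gives a value of 1, even when the relationship is far from linear.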
Note that if there are no ties, we can show that
$$ r_s = 1 - \frac{6 \sum_{i=1}^{n} (x_i - y_i)^2}{n(n^2 - 1)} $$
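One way to see why the no-tie shortcut holds is sketched below. It writes d_i = x_i - y_i for the rank differences (a symbol introduced here for convenience) and uses only the fact that, without ties, each set of ranks is a permutation of 1, ..., n.

```latex
% Assumes no ties, so x_1..x_n and y_1..y_n are each a permutation of 1..n.
\begin{align*}
\bar{x} = \bar{y} &= \frac{n+1}{2}, \qquad
\sum_{i=1}^{n}(x_i-\bar{x})^2 = \sum_{i=1}^{n}(y_i-\bar{y})^2 = \frac{n(n^2-1)}{12}, \\
\sum_{i=1}^{n} d_i^2
  &= \sum_{i=1}^{n}\bigl[(x_i-\bar{x})-(y_i-\bar{y})\bigr]^2
   = \frac{n(n^2-1)}{6} - 2\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}), \\
r_s &= \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{n(n^2-1)/12}
     = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)}.
\end{align*}
```

The last step solves the second line for the cross-product sum and divides by n(n^2-1)/12.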
Example
Eight elementary science teachers have been ranked by a judge according to their teaching ability, and all have taken a national teachers’ examination. Do the data suggest agreement between the judge’s ranking and the examination score?

Teacher          1   2   3   4   5   6   7   8
Judge’s Rank     7   4   2   6   1   3   8   5
Exam Score      44  72  69  70  93  82  67  80

For consistency, we ranked the examination scores from the highest to the lowest. We got the following table:

Teacher                 1   2   3   4   5   6   7   8   Total
Judge’s Rank (xi)       7   4   2   6   1   3   8   5
Exam Score Rank (yi)    8   4   6   5   1   2   7   3
xi-yi                  -1   0  -4   1   0   1   1   2
(xi-yi)^2               1   0  16   1   0   1   1   4      24

rs = 1 - (6 x 24) / (8 x (64 - 1)) = .714
This indicates agreement between the judge’s ranking and the examination score.
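The worked example can be reproduced with a short script. This is a minimal sketch that assumes no ties (true for these data) and uses the shortcut formula based on squared rank differences; the variable names are illustrative.

```python
judge_rank = [7, 4, 2, 6, 1, 3, 8, 5]
exam_score = [44, 72, 69, 70, 93, 82, 67, 80]

# Rank exam scores from highest (rank 1) to lowest (rank 8); no ties here.
sorted_desc = sorted(exam_score, reverse=True)
exam_rank = [sorted_desc.index(s) + 1 for s in exam_score]

# Squared rank differences and the no-tie shortcut formula.
d_squared = [(x - y) ** 2 for x, y in zip(judge_rank, exam_rank)]
n = len(judge_rank)
r_s = 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

print(exam_rank)       # [8, 4, 6, 5, 1, 2, 7, 3]
print(sum(d_squared))  # 24
print(round(r_s, 3))   # 0.714
```

Ranking the exam scores from highest to lowest matters here: the judge's rank 1 denotes the best teacher, so the best score must also get rank 1 for a positive rs to mean agreement.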
Hypothesis Test
The Spearman rank correlation coefficient may be used to test the hypothesis
H0: there is no association between the two variables.
Only the upper quantiles of the null distribution of rs need to be tabulated: the null distribution is symmetric about 0, so the lower quantiles are just the negatives of the upper quantiles.
Example
For the above example, consider testing
H0: there is no association between the judge’s ranking and the exam scores
H1: the judge’s ranking and the exam scores are positively associated.
At the 5% level, we reject H0 when rs > .643. Since the observed value of rs is .714, H0 is rejected.
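For small n without ties, the null distribution of rs can also be obtained by brute force instead of a table, since under H0 every pairing of the two rankings is equally likely. The sketch below is illustrative only: `exact_upper_p_value` is a hypothetical helper name, and the approach assumes no ties and an n small enough to enumerate all n! permutations.

```python
from itertools import permutations

def exact_upper_p_value(x_ranks, y_ranks):
    """P(r_s >= observed r_s) under H0, by enumerating all n! rank pairings.

    Assumes no ties, so r_s >= observed exactly when the sum of squared
    rank differences is <= the observed sum.
    """
    n = len(x_ranks)
    d_obs = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    base = range(1, n + 1)
    hits = total = 0
    for perm in permutations(base):
        d = sum((i - p) ** 2 for i, p in zip(base, perm))
        if d <= d_obs:
            hits += 1
        total += 1
    return hits / total

judge_rank = [7, 4, 2, 6, 1, 3, 8, 5]
exam_rank = [8, 4, 6, 5, 1, 2, 7, 3]
p_value = exact_upper_p_value(judge_rank, exam_rank)  # 8! = 40320 pairings
print(p_value < 0.05)
```

For the teacher data, the one-sided p-value should fall below 0.05, in line with the rejection of H0 above. Comparing integer sums of squared differences, rather than rs values, avoids floating-point equality issues.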