Spearman's correlation coefficient, $r_s$, can be computed as Pearson's r on the ranks; i.e., rank the X's (among the X's) and the Y's (among the Y's) and then compute the correlation of the ranks. See Table 5.2.1, and let's do it in R (use cor with method="s", or with method="p" on the ranks).

We may test the null hypothesis of no association between X and Y by doing a permutation test on the ranks, that is, by considering all possible assignments of the ranks of the Y's to the ranks of the X's. If our observed correspondence yields an unusually high (or low) value of $r_s$, then we should reject the hypothesis of no association between X and Y.

We may also test the above hypothesis with the same normal approximation used for Pearson's r: $Z = r_s \sqrt{n-1}$; i.e., under the null hypothesis $r_s$ is approximately normal with mean 0 and variance $1/(n-1)$ (standard deviation $1/\sqrt{n-1}$).

What about ties? There are two methods mentioned on p. 155ff:
1. compute adjusted ranks (midranks) and apply the same formulas we've just mentioned;
2. use the tie-adjusted formulae given on page 156 (see the next slide).
The author (and I too!) recommends the former.
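The steps above (midranks, Pearson's r on the ranks, a permutation test, and the normal approximation) can be sketched in code. The class works in R, where cor does this directly; the sketch below uses plain Python only to make each step explicit. The function names and the data are made up for illustration, and the permutation test samples random permutations rather than enumerating all n! of them, so its p-value is a Monte Carlo approximation.

```python
import math
import random


def midranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's r_s: Pearson's r computed on the midranks."""
    return pearson(midranks(x), midranks(y))


def spearman_perm_test(x, y, n_perm=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value for H0: no association."""
    rng = random.Random(seed)
    rx, ry = midranks(x), midranks(y)
    observed = pearson(rx, ry)
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign the Y ranks to the X ranks at random
        if abs(pearson(rx, shuffled)) >= abs(observed) - 1e-12:
            hits += 1
    return observed, hits / n_perm


def spearman_normal_test(x, y):
    """Z = r_s * sqrt(n - 1) and its two-sided normal p-value."""
    z = spearman(x, y) * math.sqrt(len(x) - 1)
    return z, math.erfc(abs(z) / math.sqrt(2))


# Made-up data with one tie in x and one tie in y (not from Table 5.2.1).
x = [1.2, 2.4, 2.4, 3.1, 4.8, 5.0, 6.3, 7.7]
y = [2.0, 1.5, 3.3, 3.3, 4.1, 6.0, 5.2, 8.4]
rs, p_perm = spearman_perm_test(x, y)
z, p_norm = spearman_normal_test(x, y)
print("r_s =", rs, "permutation p =", p_perm)
print("Z =", z, "normal-approximation p =", p_norm)
```

With a sample this small the two p-values need not agree closely; the normal approximation is a large-sample result.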
The following formula for Spearman's rank correlation (without ties) appears in the literature, so we'll mention it here:

$r_s = 1 - \frac{6 \sum_{i=1}^{n} D_i^2}{n(n^2 - 1)}$, where $D_i$ is the difference between the rank of $X_i$ and the rank of $Y_i$.

It is the one that can be modified for ties (see page 156, where it is defined). Verify that it gives the same results as Pearson's r on the ranks; see problem #13 on pages 192-193 for an outline of the theoretical proof of the equivalence of this formula to the definition of $r_s$.
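As a quick numerical check of that equivalence, the sketch below compares the usual no-ties shortcut, 1 - 6 * sum(D_i^2) / (n(n^2 - 1)) with D_i the difference between the two ranks of observation i, against Pearson's r on the ranks. It is written in Python rather than R, the data are made up and contain no ties, and the function names are hypothetical.

```python
import math


def ranks_no_ties(values):
    """Ranks 1..n for a sequence assumed to contain no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_from_differences(x, y):
    """Shortcut formula: 1 - 6 * sum(D_i^2) / (n * (n^2 - 1))."""
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


def spearman_from_ranks(x, y):
    """Definition: Pearson's r computed on the ranks."""
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Made-up data without ties.
x = [3.1, 1.4, 5.9, 2.6, 4.7]
y = [2.0, 1.1, 4.5, 3.8, 3.0]
shortcut = spearman_from_differences(x, y)
definition = spearman_from_ranks(x, y)
print(shortcut, definition)
```

The two values agree only when there are no ties; with ties, use midranks with the definition, or the tie-adjusted formula.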
Another measure of association is Kendall's tau, $\tau$, which looks at the distribution of concordant and discordant pairs of the (X,Y)s: for $X_i < X_j$, the pairs $(X_i, Y_i)$ and $(X_j, Y_j)$ are concordant if $Y_i < Y_j$ and discordant if $Y_i > Y_j$ (or equivalently, concordant if $(X_i - X_j)(Y_i - Y_j) > 0$; discordant if $(X_i - X_j)(Y_i - Y_j) < 0$).

X and Y are positively associated if pairs are more likely to be concordant than discordant, and negatively associated if pairs are more likely to be discordant than concordant. Kendall's tau is defined as

$\tau = 2 P[(X_i - X_j)(Y_i - Y_j) > 0] - 1$

Note that tau is just the probability of a concordant pair rescaled to be between -1 and +1; if there is no association, then the probability of a concordant pair is the same as the probability of a discordant pair, .5, so $\tau = 0$. We estimate tau by counting the fraction of concordant pairs in the data, doubling it and subtracting 1.
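The estimate just described (fraction of concordant pairs, doubled, minus 1) can be sketched as follows. This is Python rather than R, the function name and data are made up, and the data are assumed to have no ties; a tied pair has product zero and would simply be counted as not concordant here.

```python
from itertools import combinations


def kendall_tau_estimate(x, y):
    """2 * (fraction of concordant pairs) - 1, assuming no ties."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = sum(
        1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) > 0
    )
    return 2 * concordant / len(pairs) - 1


# Made-up data without ties.
x = [3.1, 1.4, 5.9, 2.6, 4.7]
y = [2.0, 1.1, 4.5, 3.8, 3.0]
tau_hat = kendall_tau_estimate(x, y)
print(tau_hat)
```

Of the 10 pairs in this small example, 8 are concordant and 2 are discordant, so the estimate works out to 2(8/10) - 1.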
Ranks may also be used to compute tau, since pairs of ranks are concordant or discordant according to whether the original pairs are concordant or discordant. R computes Kendall's tau in cor.test and SAS computes it in PROC CORR. Exact p-values for testing the hypothesis of no association between X and Y may be obtained by a permutation test; approximate p-values may be obtained from the large-sample (normal) properties of Kendall's tau statistic.

HW: Read Chapter 5 through page 163. We will complete this topic (association between two continuous variables) on Thursday, so have your questions ready by then. Do problems #3-5 on pages 189-190; we'll discuss them next class.
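Returning to the tests for Kendall's tau mentioned above, here is a small Python sketch (not the R cor.test or SAS PROC CORR implementation). It checks that ranks give the same estimate as the raw data, enumerates all permutations for an exact two-sided p-value, and computes a normal-approximation p-value. The null variance used, (4n + 10) / (9n(n - 1)), is the standard large-sample result for data without ties and may be written differently in the text. All names and data are made up.

```python
import math
from itertools import combinations, permutations


def kendall_tau(x, y):
    """2 * (fraction of concordant pairs) - 1, assuming no ties."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = sum(
        1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) > 0
    )
    return 2 * concordant / len(pairs) - 1


def ranks_no_ties(values):
    """Ranks 1..n for a sequence assumed to contain no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def exact_perm_pvalue(x, y):
    """Two-sided exact p-value over all n! reassignments of y (small n only)."""
    observed = abs(kendall_tau(x, y))
    count = total = 0
    for perm in permutations(y):
        total += 1
        if abs(kendall_tau(x, perm)) >= observed - 1e-12:
            count += 1
    return count / total


def normal_approx_pvalue(x, y):
    """Two-sided p-value using Var = (4n + 10) / (9n(n - 1)) under H0."""
    n = len(x)
    var0 = (4 * n + 10) / (9 * n * (n - 1))
    z = kendall_tau(x, y) / math.sqrt(var0)
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up data without ties.
x = [3.1, 1.4, 5.9, 2.6, 4.7]
y = [2.0, 1.1, 4.5, 3.8, 3.0]
tau_raw = kendall_tau(x, y)
tau_ranks = kendall_tau(ranks_no_ties(x), ranks_no_ties(y))
p_exact = exact_perm_pvalue(x, y)
p_normal = normal_approx_pvalue(x, y)
print(tau_raw, tau_ranks, p_exact, p_normal)
```

Full enumeration is only practical for small n, since n! grows quickly; for larger samples one would sample permutations at random or rely on the normal approximation.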