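Bivariate Statistics: statistic by level of measurement of the two variables (Y and X)
Nominal with Nominal: $\chi^2$
Nominal with Ordinal: Rank-sum, Kruskal-Wallis H
Nominal with Interval: t-test, ANOVA
Ordinal with Ordinal: Spearman $r_s$ (rho)
Interval with Interval: Pearson r, Regression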
Sir Francis Galton; Karl Pearson. October 31
Source: Raymond Fancher, Pioneers of Psychology. Norton, 1979.
A correlation coefficient is a numerical expression of the degree of relationship between two continuous variables.
Pearson’s r: $-1 \le r \le +1$
[Figure: a population with mean $\mu$, from which Samples A through E are drawn; each sample has its own mean $\bar{X}$, standard deviation $s$, and size $n$.]
[Figure: a population with correlation $\rho_{XY}$, from which Samples A through E are drawn; each sample has its own correlation $r_{XY}$.]
Pearson’s r: $-1 \le r \le +1$. Pearson’s r is a function of the sum of the cross-products of the z-scores for X and Y.
Pearson’s r: $r = \frac{\sum z_X z_Y}{N}$
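The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the data are made up for the example, and the z-scores use the population standard deviation (dividing by N) so that the result matches the formula's denominator.

```python
from math import sqrt

def z_scores(values):
    """Standardize values using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Mean cross-product of z-scores: r = sum(z_x * z_y) / N."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 3))  # 0.775
```

A variable correlated with itself gives r = 1, which is a quick sanity check on the function.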
[Figure: a population with correlation $\rho_{XY}$, from which Samples A through E are drawn; each sample has its own correlation $r_{XY}$.]
The familiar t distribution, with N − 2 degrees of freedom, can be used to test whether the statistic r was drawn from a population with $\rho = 0$. $H_0: \rho_{XY} = 0$; $H_1: \rho_{XY} \ne 0$, where $t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}}$.
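As a rough sketch of this test, the snippet below computes the t statistic and its degrees of freedom from r and N using the standard formula t = r√(N − 2) / √(1 − r²). The function name and the example values (r = .5, N = 27) are assumptions chosen for illustration; the resulting t would then be compared against a t table at N − 2 degrees of freedom.

```python
from math import sqrt

def t_for_r(r, n):
    """Return (t, df) for testing H0: rho = 0 given a sample r and size n."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    return t, n - 2

# Hypothetical values for illustration.
t, df = t_for_r(0.5, 27)
print(round(t, 3), df)  # 2.887 25
```

With 25 degrees of freedom the two-tailed critical value at the .05 level is about 2.06, so in this made-up case the null hypothesis would be rejected.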
Some uses of r: association of two variables; reliability estimates; validity estimates.
Factors that affect r: non-linearity; restriction of range / variability; outliers; reliability of measure / measurement error.
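Two of these factors are easy to see numerically. The sketch below uses a small invented data set (not from the lecture) to show how restricting the range of X, or adding a single outlier, can lower r. The helper `pearson_r` is a plain implementation written for this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a rising trend with a little alternating noise.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
r_full = pearson_r(x, y)

# Restriction of range: keep only the cases with 4 <= X <= 7.
kept = [(a, b) for a, b in zip(x, y) if 4 <= a <= 7]
r_restricted = pearson_r([a for a, _ in kept], [b for _, b in kept])

# Outlier: add one case far off the trend (X = 11, Y = 0).
r_outlier = pearson_r(x + [11], y + [0])

print(round(r_full, 3), round(r_restricted, 3), round(r_outlier, 3))
# 0.939 0.868 0.455
```

In this example the same underlying trend yields a noticeably smaller r once the range is narrowed, and a much smaller one after a single discrepant case is added.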
Spearman’s Rank Order Correlation: $r_s$. Point Biserial Correlation: $r_{pb}$.
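Both coefficients can be computed as Pearson's r on suitably coded data: Spearman's r_s is Pearson's r applied to ranks, and the point biserial r_pb is Pearson's r when one variable is a 0/1 dichotomy. The sketch below illustrates this with invented data; the helper names are assumptions made for the example, and tied values receive the average of their ranks.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Assign ranks 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rs(x, y):
    """Spearman's r_s: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: Y rises with X, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3), round(spearman_rs(x, y), 3))

# Point biserial: one dichotomous (0/1) variable, one continuous variable.
group = [0, 0, 0, 1, 1, 1]
score = [2, 3, 4, 6, 7, 8]
r_pb = pearson_r(group, score)
print(round(r_pb, 3))  # 0.926
```

Because the relationship in the first example is monotonic but curved, r_s equals 1 while Pearson's r falls slightly below 1.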
Pearson’s r: $-1 \le r \le +1$. Pearson’s r can also be interpreted as how far individuals' scores on Y tend to deviate from the mean of Y, relative to how far their scores on X deviate from the mean of X, when both are expressed in standard deviation units.
Pearson’s r: $-1 \le r \le +1$. Pearson’s r can also be interpreted as the expected value of $z_Y$ given a value of $z_X$: the expected value of $z_Y$ is $r \, z_X$. If you are predicting $z_Y$ from $z_X$ and the correlation is perfect ($r = 1.0$), then $z_Y = z_X$. If the correlation is $r = .5$, then $z_Y = .5 z_X$.
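The prediction rule can be sketched as follows. The function names and all numbers (means, standard deviations, the raw score of 130) are hypothetical, chosen only to make the arithmetic easy to follow; the conversion back to raw units simply undoes the standardization.

```python
def predict_z_y(z_x, r):
    """Expected z-score on Y for a given z-score on X."""
    return r * z_x

def predict_raw_y(x, mean_x, sd_x, mean_y, sd_y, r):
    """Standardize X, predict z_Y = r * z_X, then convert back to raw Y units."""
    z_x = (x - mean_x) / sd_x
    return mean_y + predict_z_y(z_x, r) * sd_y

# With r = 1.0 the predicted z_Y equals z_X; with r = .5 it is half of z_X.
print(predict_z_y(2.0, 1.0))  # 2.0
print(predict_z_y(2.0, 0.5))  # 1.0

# Hypothetical raw-score example: X = 130 is 2 SDs above a mean of 100 (SD 15).
y_hat = predict_raw_y(130, mean_x=100, sd_x=15, mean_y=50, sd_y=10, r=0.5)
print(y_hat)  # 60.0
```

Whenever |r| < 1 the predicted z_Y is closer to zero than z_X, which is the regression toward the mean that Galton described.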