Sir Francis Galton, Karl Pearson. October 28 and 29.
Source: Raymond Fancher, Pioneers of Psychology. Norton, 1979.
A correlation coefficient is a numerical expression of the degree of relationship between two continuous variables.
What might be some practical uses of such a statistic?
Pearson’s r: $-1 \le r \le +1$
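[Figure: a population with mean $\mu$ and five samples (A through E) drawn from it, each with its own sample mean $\bar{X}$, standard deviation $s$, and size $n$.]
[Figure: a population with correlation $\rho_{XY}$ and five samples (A through E) drawn from it, each yielding its own sample correlation $r_{XY}$.]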
Pearson’s r is a function of the sum of the cross-products of the z-scores for x and y.
Pearson’s r: $r = \frac{\sum z_x z_y}{N}$, where z is based on an uncorrected standard deviation, $\sqrt{SS/N}$.
Pearson’s r: $r = \frac{\sum z_x z_y}{N-1}$, if z is based on a corrected standard deviation, $\sqrt{SS/(N-1)}$.
Pearson’s r: … or, for your convenience, $r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2]\,[N\sum Y^2 - (\sum Y)^2]}}$
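The three formulas above should give the same value for any data set. The following is a minimal sketch that checks this, assuming a small made-up data set; the function names `pearson_r_z` and `pearson_r_raw` and the scores in `x` and `y` are hypothetical and chosen only for illustration.

```python
import math

def pearson_r_z(x, y, corrected=False):
    """r as the summed cross-product of z-scores, divided by N (or N - 1)."""
    n = len(x)
    denom = n - 1 if corrected else n
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Standard deviation: sqrt(SS / N) uncorrected, sqrt(SS / (N - 1)) corrected.
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / denom)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / denom)
    z_x = [(v - mean_x) / sd_x for v in x]
    z_y = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / denom

def pearson_r_raw(x, y):
    """r from the raw-score (computational) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical scores for five individuals.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_uncorrected = pearson_r_z(x, y)
r_corrected = pearson_r_z(x, y, corrected=True)
r_raw = pearson_r_raw(x, y)
print(r_uncorrected, r_corrected, r_raw)  # all three are approximately 0.7746
```

The divisor must match the standard deviation used for the z-scores: N with the uncorrected standard deviation, N - 1 with the corrected one. Mixing the two gives a value that is slightly off.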
The familiar t distribution, at N−2 degrees of freedom, can be used to test the probability that the statistic r was drawn from a population with $\rho = 0$. $H_0: \rho_{XY} = 0$; $H_1: \rho_{XY} \ne 0$, where $t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}}$
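As a rough illustration of the test above, the sketch below converts a sample r into a t statistic with N - 2 degrees of freedom. The values r = .5 and N = 27 are hypothetical, the function name `t_from_r` is made up for this example, and the critical value 2.060 is the tabled two-tailed .05 value for 25 degrees of freedom, hard-coded here rather than computed.

```python
import math

def t_from_r(r, n):
    """Return (t, df) for testing H0: rho = 0, given sample r and sample size n."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Hypothetical sample: r = .5 observed in N = 27 pairs of scores.
r, n = 0.5, 27
t, df = t_from_r(r, n)

# Tabled critical t for df = 25, two-tailed alpha = .05 (assumed, taken from a t table).
CRITICAL_T_05_DF25 = 2.060
reject_h0 = abs(t) > CRITICAL_T_05_DF25

print(round(t, 3), df, reject_h0)  # 2.887 25 True
```

With a different sample size the critical value has to be looked up again for the new degrees of freedom.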
Pearson’s r can also be interpreted as how far individuals' Y scores tend to deviate from the mean of Y, relative to how far their X scores deviate from the mean of X, when both are expressed in standard deviation units.
Pearson’s r can also be interpreted as the expected value of $z_Y$ given a value of $z_X$. The expected value of $z_Y$ is $r \cdot z_X$. If you are predicting $z_Y$ from $z_X$ where there is a perfect correlation (r = 1.0), then $z_Y = z_X$. If the correlation is r = .5, then $z_Y = .5\,z_X$.
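The prediction rule above can be sketched in a few lines. The function name `predict_z_y` and the data are hypothetical. The second half of the sketch checks, on a small made-up data set, that the least-squares slope for predicting $z_Y$ from $z_X$ equals r, which is why r serves as the multiplier.

```python
import math

def predict_z_y(z_x, r):
    """Expected z-score on Y for an individual with z-score z_x on X."""
    return r * z_x

print(predict_z_y(2.0, 1.0))  # 2.0: perfect correlation, z_Y = z_X
print(predict_z_y(2.0, 0.5))  # 1.0: r = .5, z_Y = .5 * z_X

def z_scores(values):
    """z-scores based on the uncorrected standard deviation, sqrt(SS / N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

# Hypothetical scores for five individuals.
z_x = z_scores([1, 2, 3, 4, 5])
z_y = z_scores([2, 4, 5, 4, 5])

# r as the mean cross-product of z-scores.
r = sum(a * b for a, b in zip(z_x, z_y)) / len(z_x)
# Least-squares slope for predicting z_y from z_x (both have mean 0).
slope = sum(a * b for a, b in zip(z_x, z_y)) / sum(a * a for a in z_x)
print(abs(slope - r) < 1e-9)  # True
```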
Factors that affect r: non-linearity; restriction of range / variability; outliers; reliability of measure / measurement error.
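Two of the factors listed above, non-linearity and outliers, are easy to see with small made-up data sets. The sketch below is only an illustration: the helper `pearson_r` and all the scores are hypothetical, and restriction of range and measurement error are not covered.

```python
import math

def pearson_r(x, y):
    """Pearson's r from the raw-score formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Non-linearity: Y is completely determined by X (Y = X squared),
# yet the linear correlation is zero.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [v ** 2 for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# Outliers: one extreme case can change the size and even the sign of r.
x_base = [1, 2, 3, 4, 5]
y_base = [2, 4, 5, 4, 5]
r_base = pearson_r(x_base, y_base)
r_with_outlier = pearson_r(x_base + [6], y_base + [0])

print(r_curve)                   # 0.0
print(round(r_base, 3))          # 0.775
print(round(r_with_outlier, 3))  # -0.217
```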
Spearman’s rank-order correlation, $r_s$; point-biserial correlation, $r_{pb}$
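Both coefficients named above can be computed with the same Pearson formula applied to transformed data: $r_s$ is Pearson’s r on the ranks of the scores, and $r_{pb}$ is Pearson’s r when one variable is a dichotomy coded 0/1. The sketch below assumes this approach; the helper names and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's r from the raw-score formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def average_ranks(values):
    """Rank from 1 (lowest) upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Spearman's rank-order correlation: Pearson's r computed on ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but non-linear relationship: r_s is 1.0, Pearson's r is lower.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
r_linear = pearson_r(x, y)
r_s = spearman_rs(x, y)

# Point-biserial: group membership coded 0/1 against a continuous score.
group = [0, 0, 0, 1, 1, 1]
score = [1, 2, 3, 4, 5, 6]
r_pb = pearson_r(group, score)

print(round(r_linear, 3), r_s, round(r_pb, 3))  # 0.943 1.0 0.878
```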