Math 4030 – 13a Correlation & Regression
Correlation (Sec. 11.6): Two random variables, X and Y, both continuous numerical. Correlation exists when the values of one variable go “consistently” up or down with changes in the other variable. Correlation coefficient: $r \in [-1, 1]$.
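Calculation: $r = \dfrac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$, or $r = \dfrac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$, where $S_{xx} = \sum x_i^2 - \left(\sum x_i\right)^2/n$, $S_{yy} = \sum y_i^2 - \left(\sum y_i\right)^2/n$, and $S_{xy} = \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)/n$.
Worksheet table: columns $x$, $y$, $x^2$, $y^2$, $xy$; one row per observation $(x_1, y_1), \ldots, (x_n, y_n)$; the bottom row holds the column sums $\sum x_i$, $\sum y_i$, $\sum x_i^2$, $\sum y_i^2$, $\sum x_i y_i$.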
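The worksheet computation can be sketched in Python as follows. This is a minimal illustration, not the course's own code: the function name `correlation_from_sums` is hypothetical and the data values are made up.

```python
import math

def correlation_from_sums(xs, ys):
    """Sample correlation r from the column sums of the x, y, x^2, y^2, xy worksheet."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Corrected sums of squares and cross products
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up data for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = correlation_from_sums(xs, ys)
print(round(r, 4))
```

Because the made-up y values rise almost linearly with x, the printed r is close to 1.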
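Meaning of r values: [Figure: scatterplot panels, each labeled with its r and b values. Surviving labels: r = 0.5, b = 0.8; r = 0.01, b = 0.9. The labels of the remaining panels are not recoverable.]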
r vs. b: r and b have the same sign; b is the slope of the linear relationship; r measures the strength of the linear relationship; $r \in [-1, 1]$, $b \in (-\infty, +\infty)$.
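A small Python sketch of the contrast, under the assumption that b is the least-squares slope $S_{xy}/S_{xx}$. The function name `slope_and_correlation` and the data are hypothetical. Rescaling y by a factor of 10 changes the slope tenfold but leaves r unchanged, which is why b is unbounded while r stays in $[-1, 1]$.

```python
import math

def slope_and_correlation(xs, ys):
    """Return (b, r): least-squares slope and sample correlation."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx                      # slope: depends on the units of x and y
    r = s_xy / math.sqrt(s_xx * s_yy)    # correlation: unit-free
    return b, r

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
b1, r1 = slope_and_correlation(xs, ys)
b2, r2 = slope_and_correlation(xs, [10 * y for y in ys])  # same data, y in different units
print(b1, r1)
print(b2, r2)
```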
Correlation Coefficient and the Efficiency of the (Linear) Regression Model
Decomposition of Variability
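In standard notation for a least-squares line, the decomposition can be written as below. The symbols $\hat{y}_i$ (fitted value) and $\bar{y}$ (sample mean of y) are assumed here rather than taken from the slide.

```latex
\underbrace{\sum_{i=1}^{n} (y_i - \bar{y})^2}_{\text{total variability } S_{yy}}
  \;=\;
\underbrace{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}_{\text{explained by regression}}
  \;+\;
\underbrace{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}_{\text{residual (unexplained)}}
```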
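Coefficient of determination: Proportion of total variability explained by the linear regression: $r^2 = \dfrac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2} = 1 - \dfrac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}$.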
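Coefficient of Determination vs. Correlation Coefficient: in simple linear regression, the coefficient of determination equals $r^2$, the square of the sample correlation coefficient.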
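The link between the two quantities can be checked numerically. The sketch below fits a least-squares line, splits the total variability into regression and residual parts, and compares the explained proportion with the squared correlation. The function name `variability_decomposition` and the data are hypothetical.

```python
import math

def variability_decomposition(xs, ys):
    """Return (ss_total, ss_regression, ss_residual, r) for a least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx              # least-squares slope
    a = y_bar - b * x_bar        # least-squares intercept
    fitted = [a + b * x for x in xs]
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_regression = sum((f - y_bar) ** 2 for f in fitted)
    ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    r = s_xy / math.sqrt(s_xx * ss_total)
    return ss_total, ss_regression, ss_residual, r

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
ss_total, ss_regression, ss_residual, r = variability_decomposition(xs, ys)
r_squared = ss_regression / ss_total
print(ss_total, ss_regression, ss_residual)
print(r_squared, r ** 2)
```

For this data the regression part plus the residual part adds up to the total, and the explained proportion agrees with $r^2$ up to rounding error.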
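Testing the (normal) population correlation coefficient $\rho$: What is the distribution of the sample statistic r? Fisher Z transformation: $Z = \frac{1}{2}\ln\dfrac{1+r}{1-r}$ maps $r \in (-1, 1)$ to Fisher-$Z \in (-\infty, \infty)$. If the joint distribution of (X, Y) is approximately bivariate normal, then Z is approximately normal with mean $\mu_Z = \frac{1}{2}\ln\dfrac{1+\rho}{1-\rho}$ and variance $\dfrac{1}{n-3}$.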
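Test statistic for $H_0: \rho = \rho_0$: $Z = \dfrac{\sqrt{n-3}}{2}\,\ln\dfrac{(1+r)(1-\rho_0)}{(1-r)(1+\rho_0)}$. Test statistic for $H_0: \rho = 0$: $Z = \dfrac{\sqrt{n-3}}{2}\,\ln\dfrac{1+r}{1-r}$. Under $H_0$, each is approximately standard normal.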
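A minimal Python sketch of this test, using only the standard library. The function names `fisher_z` and `correlation_z_test` are hypothetical, the two-sided alternative is an assumption, and the values r = 0.5, n = 28 are made up.

```python
import math
from statistics import NormalDist

def fisher_z(r):
    """Fisher Z transformation of a correlation value in (-1, 1)."""
    return 0.5 * math.log((1 + r) / (1 - r))

def correlation_z_test(r, n, rho0=0.0):
    """Z statistic and two-sided P-value for H0: rho = rho0 (two-sided test assumed)."""
    z = math.sqrt(n - 3) * (fisher_z(r) - fisher_z(rho0))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up sample summary for illustration only
z, p_value = correlation_z_test(r=0.5, n=28)
print(round(z, 3), round(p_value, 4))
```

Setting `rho0=0.0` gives the special case of testing for no correlation. A one-sided alternative would use a single tail of the normal distribution instead of doubling it.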
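Confidence interval for $\rho$: First find the confidence interval for the Fisher-Z mean $\mu_Z = \frac{1}{2}\ln\dfrac{1+\rho}{1-\rho}$: $Z \pm \dfrac{z_{\alpha/2}}{\sqrt{n-3}}$, where $Z = \frac{1}{2}\ln\dfrac{1+r}{1-r}$. Then solve for the two boundary values of $\rho$ using the relationship $\rho = \dfrac{e^{2\mu_Z} - 1}{e^{2\mu_Z} + 1}$.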
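The two steps can be sketched in Python as follows. The function name `correlation_confidence_interval` is hypothetical and the inputs are made up. The sketch relies on the identities that `math.atanh` equals the Fisher Z transformation and `math.tanh` equals its inverse.

```python
import math
from statistics import NormalDist

def correlation_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence interval for rho via the Fisher Z transformation."""
    z = math.atanh(r)                                   # Fisher Z score of r
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit / math.sqrt(n - 3)
    # Interval on the Z scale, mapped back to the rho scale with tanh
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Made-up sample summary for illustration only
lower, upper = correlation_confidence_interval(r=0.5, n=28)
print(round(lower, 3), round(upper, 3))
```

The resulting interval is not symmetric about r, because the back-transformation is nonlinear.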
Strength vs. significance of the correlation: The significance, given by the P-value, depends on the statistical evidence. When the P-value is small, there is evidence that a correlation exists, regardless of its strength. The strength, given by the r value, is meaningful only if it is supported by statistical significance.
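The distinction can be illustrated with two made-up scenarios, using the Fisher Z test of $H_0: \rho = 0$ with a two-sided alternative (an assumption of this sketch). The function name `p_value_rho_zero` is hypothetical.

```python
import math
from statistics import NormalDist

def p_value_rho_zero(r, n):
    """Two-sided P-value for H0: rho = 0 using the Fisher Z statistic."""
    z = math.sqrt(n - 3) * math.atanh(r)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Weak correlation, large sample: strong evidence that rho is not 0
p_weak_large = p_value_rho_zero(r=0.10, n=2000)
# Strong-looking correlation, tiny sample: evidence is not convincing
p_strong_small = p_value_rho_zero(r=0.60, n=8)
print(p_weak_large, p_strong_small)
```

In the first scenario the P-value is far below 0.05 even though r is only 0.10. In the second, r = 0.60 looks strong but the P-value is above 0.05, so the apparent strength is not supported by the evidence.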