1
BPK 304W Correlation
2
Correlation Coefficient (r)
The correlation coefficient (r) is a measure of association between two variables.
Varies from -1 to +1.
r is the covariance of X and Y divided by the product of their standard deviations.
0 = no linear relationship; +1 or -1 = perfect linear relationship.
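A minimal sketch of how r could be computed, assuming the usual Pearson product-moment definition. The function name `pearson_r` and the data are made up for illustration and are not from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson r: co-variation of x and y scaled by the variation in each."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (not from the slides)
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]      # rises exactly in step with x
y_down = [10, 8, 6, 4, 2]    # falls exactly in step with x

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

The two made-up series sit at the two ends of the scale, which is why r stays between -1 and +1.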
3
Correlation
4
Linear Fit
A high correlation coefficient does not mean a linear fit
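One way to see this point is to correlate X with a clearly curved function of X. The sketch below uses made-up data (Y = X squared) and a hypothetical helper `pearson_r`; it is an illustration, not part of the original slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: Y is an exact quadratic in X, so the relationship is curved
x = list(range(1, 11))
y = [value ** 2 for value in x]

r_curved = pearson_r(x, y)
print(f"r for Y = X^2 over X = 1..10: {r_curved:.3f}")
```

r comes out high even though a straight line is the wrong shape for these data, so a scatter plot should be checked before trusting a linear fit.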
5
Correlation does not mean causation
Spurious correlations: coincidental correlation between two unrelated variables.
A study of boys aged 6 to 18 years produced correlations of standing broad jump with other measures. Which had the highest correlation?
6
Range of the Data affects the Correlation Coefficient
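A small sketch of the range effect, using made-up data in which Y follows X with a fixed amount of scatter. The helper `pearson_r` and the choice of sub-range are assumptions for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: Y = X plus or minus 3 (alternating), for X = 1..20
x_full = list(range(1, 21))
y_full = [value + (3 if value % 2 == 1 else -3) for value in x_full]

# Same data restricted to a narrow range of X (9 to 12)
pairs = [(a, b) for a, b in zip(x_full, y_full) if 9 <= a <= 12]
x_narrow = [a for a, _ in pairs]
y_narrow = [b for _, b in pairs]

r_full = pearson_r(x_full, y_full)
r_narrow = pearson_r(x_narrow, y_narrow)
print(f"full range r = {r_full:.2f}, restricted range r = {r_narrow:.2f}")
```

The scatter is the same in both cases; only the spread of X changes, and r drops sharply when the range is narrow.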
7
Correlation coefficient depends upon the orientation of the two groups
8
Significance of the Correlation Coefficient
The critical value of the correlation coefficient is determined by the sample size.
Bigger sample size = lower critical value of r.
Statistical significance of r does not imply “practical significance”.
9
Degrees of   Probability        Degrees of   Probability
Freedom      0.05     0.01      Freedom      0.05     0.01
1            .997     1.000     24           .388     .496
2            .950     .990      25           .381     .487
3            .878     .959      26           .374     .478
4            .811     .917      27           .367     .470
5            .754     .874      28           .361     .463
6            .707     .834      29           .355     .456
7            .666     .798      30           .349     .449
8            .632     .765      35           .325     .418
9            .602     .735      40           .304     .393
10           .576     .708      45           .288     .372
11           .553     .684      50           .273     .354
12           .532     .661      60           .250
13           .514     .641      70           .232     .302
14           .497     .623      80           .217     .283
15           .482     .606      90           .205     .267
16           .468     .590      100          .195     .254
17                    .575      125          .174     .228
18           .444     .561      150          .159     .208
19           .433     .549      200          .138     .181
20           .423     .537      300          .113     .148
21           .413     .526      400          .098     .128
22           .404     .515      500          .088     .115
23           .396     .505      1,000        .062     .081
Table 2-4.2: Critical Values of the Correlation Coefficient
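A minimal sketch of how the table might be used, assuming degrees of freedom = n - 2 for a correlation between two variables. Only four rows of Table 2-4.2 are copied in; the names `CRITICAL_R` and `is_significant` are made up for this example.

```python
# A few rows copied from Table 2-4.2: df -> critical r at each probability level
CRITICAL_R = {
    10: {0.05: 0.576, 0.01: 0.708},
    20: {0.05: 0.423, 0.01: 0.537},
    30: {0.05: 0.349, 0.01: 0.449},
    100: {0.05: 0.195, 0.01: 0.254},
}

def is_significant(r, n, probability=0.05):
    """Compare |r| with the tabled critical value for df = n - 2 (assumed)."""
    df = n - 2
    if df not in CRITICAL_R:
        raise KeyError(f"df = {df} is not in this abbreviated copy of the table")
    return abs(r) >= CRITICAL_R[df][probability]

# The same r = 0.5 with different sample sizes and probability levels
print(is_significant(0.5, n=12))                    # df = 10, critical r = .576
print(is_significant(0.5, n=22))                    # df = 20, critical r = .423
print(is_significant(0.5, n=22, probability=0.01))  # df = 20, critical r = .537
```

The same r can be non-significant in a small sample and significant in a larger one, which is the sample-size effect the table encodes.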
10
Coefficient of Determination: r squared (r²)
The circle represents the total variance in the measure (Weight).
Correlation of Weight with Arm Girth: r = 0.5, r² = 0.25
Therefore 25% of the variance in weight is explained by arm girth; the remaining 75% is unexplained.
11
Correlations between all variables
Correlation Matrix
Weight vs Arm Girth: r = 0.5, r² = 0.25
Weight vs Calf Girth: r = 0.6, r² = 0.36
Arm Girth vs Calf Girth: r = 0.4, r² = 0.16
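The arithmetic behind these figures can be sketched as below. The three r values are the ones quoted on the slide; the function name `explained_variance` is made up for illustration.

```python
def explained_variance(r):
    """Coefficient of determination: the square of the correlation coefficient."""
    return r * r

# Correlations quoted on the slide
correlations = {
    ("Weight", "Arm Girth"): 0.5,
    ("Weight", "Calf Girth"): 0.6,
    ("Arm Girth", "Calf Girth"): 0.4,
}

for (first, second), r in correlations.items():
    r_squared = explained_variance(r)
    print(f"{first} vs {second}: r = {r}, r squared = {r_squared:.2f}, "
          f"{r_squared * 100:.0f}% of variance shared")
```

Squaring shrinks moderate correlations considerably: r = 0.5 sounds substantial but accounts for only a quarter of the variance.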
12
BPK 304W Regression
13
Prediction Can we predict one variable from another?
Linear Regression Analysis
Y = mX + c
m = slope; c = intercept
14
Linear Regression
Correlation Coefficient (r): how well the line fits
Standard Error of Estimate (S.E.E.): how well the line predicts
15
Least Sum of Squares Curve Fitting
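The fitted line minimizes the sum of squared residuals, where a residual is the vertical distance between an observed Y and the Y predicted by the line.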
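A minimal sketch of a least-squares fit of Y = mX + c, assuming the standard closed-form solution for a single predictor. The data and the names `fit_line` and `sum_squared_residuals` are made up for illustration.

```python
def fit_line(x, y):
    """Return slope m and intercept c that minimize the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

def sum_squared_residuals(x, y, m, c):
    """Sum of squared vertical distances between observed and predicted Y."""
    return sum((b - (m * a + c)) ** 2 for a, b in zip(x, y))

# Made-up illustrative data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

m, c = fit_line(x, y)
best = sum_squared_residuals(x, y, m, c)
print(f"Y = {m:.2f}X + {c:.2f}, sum of squared residuals = {best:.3f}")

# Any other line gives a larger sum of squared residuals
print(sum_squared_residuals(x, y, m + 0.1, c) > best)
print(sum_squared_residuals(x, y, m, c + 0.1) > best)
```

Nudging either the slope or the intercept away from the fitted values increases the sum of squared residuals, which is what "least squares" means.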
16
Assumptions about the relationship between Y and X
For each value of X there is a normal distribution of Y from which the sample value of Y is drawn.
The population of values of Y corresponding to a selected X has a mean that lies on the straight line.
In each population the standard deviation of Y about its mean has the same value.
17
Standard Error of Estimate
A measure of how well the equation predicts Y.
Has the units of Y.
The true score is within plus or minus 1 S.E.E. of the predicted score 68.26% of the time.
It is the standard deviation of the normal distribution of residuals.
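A sketch of how the S.E.E. could be computed for an already fitted line. It assumes the common convention of dividing the sum of squared residuals by n - 2 for a one-predictor equation; some texts divide by n instead. The data, the line Y = 1.99X + 0.05, and the function name are made up for illustration.

```python
import math

def standard_error_of_estimate(x, y, m, c):
    """S.E.E. for the line Y = mX + c, dividing by n - 2 (an assumed convention)."""
    residuals = [b - (m * a + c) for a, b in zip(x, y)]
    return math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))

# Made-up data and a line assumed to have been fitted to it beforehand
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
m, c = 1.99, 0.05

see = standard_error_of_estimate(x, y, m, c)
print(f"S.E.E. = {see:.3f} (same units as Y)")

# Band expected to contain the true score about 68% of the time
predicted = m * 3.5 + c
print(f"predicted Y at X = 3.5: {predicted:.2f} +/- {see:.2f}")
```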
18
Right Hand L. = 0.99 × Left Hand L. + 0.254
r = 0.94
S.E.E. = 0.38 cm
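A short sketch of using this equation for prediction. The slope, intercept, and S.E.E. are taken from the slide; the left hand length of 18.0 cm is a hypothetical input, and treating all lengths as centimetres is an assumption based on the units given for the S.E.E.

```python
SLOPE = 0.99        # from the slide
INTERCEPT = 0.254   # from the slide (assumed to be in cm)
SEE_CM = 0.38       # from the slide

def predict_right_hand_length(left_hand_length_cm):
    """Predicted right hand length from left hand length, per the slide's equation."""
    return SLOPE * left_hand_length_cm + INTERCEPT

left = 18.0  # hypothetical measurement, not from the slides
predicted = predict_right_hand_length(left)
low, high = predicted - SEE_CM, predicted + SEE_CM

print(f"predicted right hand length: {predicted:.2f} cm")
print(f"plus or minus 1 S.E.E.: {low:.2f} to {high:.2f} cm")
```

Under the regression assumptions, the true right hand length would fall inside that band roughly 68% of the time.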
19
How good is my equation? Regression equations are sample specific
Cross-validation studies: test your equation on a different sample.
Split-sample studies: take a 50% random sample and develop your equation, then test it on the other 50% of the sample.
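A split-sample study might be sketched as follows. The data are simulated (Y = 2X + 1 plus random noise) purely for illustration, and the helper names `fit_line` and `rms_error` are made up; root-mean-square prediction error is used here as one possible measure of how well the equation performs.

```python
import math
import random

def fit_line(x, y):
    """Least-squares slope and intercept for Y = mX + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x

def rms_error(x, y, m, c):
    """Root-mean-square difference between observed and predicted Y."""
    return math.sqrt(sum((b - (m * a + c)) ** 2 for a, b in zip(x, y)) / len(x))

# Simulated sample of 40 cases (made up; seeded so the run is repeatable)
rng = random.Random(0)
data = []
for _ in range(40):
    x_value = rng.uniform(0, 10)
    data.append((x_value, 2.0 * x_value + 1.0 + rng.gauss(0, 1)))

# Random 50/50 split: develop the equation on one half, test it on the other
rng.shuffle(data)
half = len(data) // 2
develop, validate = data[:half], data[half:]

m, c = fit_line([p[0] for p in develop], [p[1] for p in develop])
error_develop = rms_error([p[0] for p in develop], [p[1] for p in develop], m, c)
error_validate = rms_error([p[0] for p in validate], [p[1] for p in validate], m, c)

print(f"equation from development half: Y = {m:.2f}X + {c:.2f}")
print(f"error on development half: {error_develop:.2f}")
print(f"error on validation half:  {error_validate:.2f}")
```

The error on the half that was not used to build the equation is the more honest estimate of how it will perform on new cases.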
20
Multiple Regression
More than one independent variable
Y = m1X1 + m2X2 + m3X3 + … + c
Same meaning for r and S.E.E., just more measures used to predict Y.
Stepwise regression: variables are entered into the equation based upon their relative importance.
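A sketch of fitting Y = m1X1 + m2X2 + c by least squares using only the standard library. It assumes the normal-equations approach solved by Gaussian elimination; in practice a statistics package would normally be used. The data are made up so that the true coefficients are known (m1 = 2, m2 = 3, c = 1), and the function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * solution[k] for k in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution

def fit_multiple(predictors, y):
    """Return [c, m1, m2, ...] minimizing squared residuals (normal equations)."""
    rows = [[1.0] + [column[i] for column in predictors] for i in range(len(y))]
    size = len(rows[0])
    xtx = [[sum(row[j] * row[k] for row in rows) for k in range(size)]
           for j in range(size)]
    xty = [sum(row[j] * target for row, target in zip(rows, y))
           for j in range(size)]
    return solve_linear_system(xtx, xty)

# Made-up data built from known coefficients: Y = 2*X1 + 3*X2 + 1
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [2 * a + 3 * b + 1 for a, b in zip(x1, x2)]

c, m1, m2 = fit_multiple([x1, x2], y)
print(f"Y = {m1:.2f}*X1 + {m2:.2f}*X2 + {c:.2f}")
```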
21
Building a multiple regression equation
X1 has the highest correlation with Y, therefore it would be the first variable included in the equation.
X3 has a higher correlation with Y than X2. However, X2 would be a better choice than X3 to include in an equation with X1 to predict Y: X2 has a low correlation with X1 and explains some of the variance that X1 does not.
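The reasoning can be checked numerically with the standard formula for the squared multiple correlation of Y on two predictors. The correlations below are hypothetical values chosen to match the situation described (the slide gives no numbers), and the function name is made up.

```python
def r_squared_two_predictors(r_ya, r_yb, r_ab):
    """Squared multiple correlation of Y with predictors A and B.

    r_ya, r_yb: correlations of each predictor with Y
    r_ab: correlation between the two predictors
    """
    return (r_ya ** 2 + r_yb ** 2 - 2 * r_ya * r_yb * r_ab) / (1 - r_ab ** 2)

# Hypothetical correlations (not from the slides)
r_y_x1 = 0.7    # X1 has the highest correlation with Y
r_y_x3 = 0.6    # X3 correlates with Y more than X2 does...
r_y_x2 = 0.5
r_x1_x3 = 0.8   # ...but X3 overlaps heavily with X1 (assumed)
r_x1_x2 = 0.1   # X2 has a low correlation with X1

with_x3 = r_squared_two_predictors(r_y_x1, r_y_x3, r_x1_x3)
with_x2 = r_squared_two_predictors(r_y_x1, r_y_x2, r_x1_x2)

print(f"X1 alone:   {r_y_x1 ** 2:.3f}")
print(f"X1 and X3:  {with_x3:.3f}")
print(f"X1 and X2:  {with_x2:.3f}")
```

With these assumed values, adding X3 barely improves on X1 alone, while adding X2 explains noticeably more of the variance in Y.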
22
Standardized Regression
The numerical value of mn depends on the size of the independent variable:
Y = m1X1 + m2X2 + m3X3 + … + c
Variables are transformed into standard scores before regression analysis, therefore the mean and standard deviation of all independent variables are 0 and 1 respectively.
The numerical value of zmn now represents the relative importance of that independent variable to the prediction:
Y = zm1X1 + zm2X2 + zm3X3 + … + c
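A small sketch of why standardizing helps, using a single predictor for brevity. The same made-up heights are expressed in centimetres and in metres: the raw slope changes with the units, while the slope on standard scores does not. All data and function names are hypothetical.

```python
import statistics

def slope(x, y):
    """Least-squares slope of Y on X."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def z_scores(values):
    """Transform values to standard scores (mean 0, standard deviation 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Made-up data: the same heights in two different units
height_cm = [150, 160, 170, 180, 190]
height_m = [h / 100 for h in height_cm]
weight_kg = [52, 58, 69, 74, 85]

raw_cm = slope(height_cm, weight_kg)
raw_m = slope(height_m, weight_kg)
std_cm = slope(z_scores(height_cm), weight_kg)
std_m = slope(z_scores(height_m), weight_kg)

print(f"raw slope, height in cm: {raw_cm:.2f}")
print(f"raw slope, height in m:  {raw_m:.2f}")
print(f"slope on standard scores: {std_cm:.2f} (cm) and {std_m:.2f} (m)")
```

Because the standardized slopes no longer depend on the measurement units, they can be compared across independent variables in the same equation.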