The relationship of two quantitative variables
What is a relationship? Going/moving together: co-occurrence. Causal effect, dependence. Independence.
Example I [Figure: scatter plot; axes: birth weight (kg), birth height (cm)]
Example II [Figure: scatter plot; axes: body weight at 10 (kg), height at 10]
The problem of prediction: If Mom weighs 50 kg at age 30, what will the weight of her 10-year-old son be?
Prediction by means of a line [Figure: scatter plot; axes: Mom’s body weight (kg), son’s weight at 10]
Which is the best predicting line? [Figure: scatter plot; axes: Mom’s body weight (kg), son’s weight at 10]
The best line is the one that lies closest to the points of the diagram. The general formula of a line: f(x) = a + bx
The parameters of a line: in y = a + bx, parameter ‘a’ = intercept, parameter ‘b’ = slope. [Figure: the line y = a + bx; axes: Variable X, Variable Y; ‘a’ marked on the plot]
Basic terms of prediction
Predicted (dependent) variable: Y
Predicting (independent) variable: X
Linear prediction: Ŷ = a + bX
True Y-value belonging to value x: y
Prediction belonging to x: ŷ = a + bx
Error of prediction for one subject: (y - ŷ)²
For the best line, E((Y - Ŷ)²) is minimal
Basic terms of regression
The best predicting line: Regression line
The formula y = α + βx of the regression line: Linear regression function
Determining the regression line: Regression problem
Error of regression = Error variance: Res = E((Y - Ŷ)²)
The parameters α, β: Regression coefficients
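The terms above can be made concrete with a small sketch. It assumes the usual least-squares solution for the best line (slope = sum of cross-deviations divided by sum of squared x-deviations, intercept chosen so the line passes through the point of means). The data and the function names `fit_line` and `mean_squared_error` are hypothetical and serve only as illustration.

```python
# Hypothetical toy data (illustration only):
# xs = mother's body weight (kg), ys = son's weight at 10 (kg)
xs = [50.0, 55.0, 60.0, 65.0, 70.0]
ys = [30.0, 31.0, 33.0, 34.0, 37.0]


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Return (a, b) of the line y_hat = a + b*x with minimal mean squared error."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept: the line passes through (mean x, mean y)
    return a, b


def mean_squared_error(xs, ys, a, b):
    """Sample counterpart of Res = E((Y - Y_hat)^2) for the line a + b*x."""
    return mean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])


a, b = fit_line(xs, ys)
res = mean_squared_error(xs, ys, a, b)
predicted = a + b * 50.0   # prediction for a mother weighing 50 kg

print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
print(f"error variance Res = {res:.2f}")
print(f"predicted son's weight = {predicted:.1f} kg")
```

Shifting or tilting the fitted line in any direction increases `mean_squared_error` on the same data, which is what "best line" means here.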
How strong is the relationship between X and Y? The more informative X is for Y, the smaller Res will be relative to Var(Y), that is, the smaller Res/Var(Y) will be, and the greater the coefficient of determination will be: Det(X,Y) = 1 - Res/Var(Y)
The coefficient of determination
0 ≤ Det(X,Y) ≤ 1
A measure of explained variance.
Important: Det(X,Y) = Det(Y,X).
Shows the strength of the linear relationship between X and Y.
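The symmetry Det(X,Y) = Det(Y,X) is easy to check numerically. The sketch below assumes Det is computed as 1 - Res/Var(Y), with Res taken from the least-squares line, and uses hypothetical birth data. The helper name `det` is an illustration, not an established API.

```python
def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])


def det(xs, ys):
    """1 - Res/Var(ys), where Res is the error variance of the best line predicting ys from xs."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    res = mean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])
    return 1 - res / variance(ys)


# Hypothetical toy data (illustration only)
weights = [2.8, 3.1, 3.4, 3.6, 4.0]        # birth weight (kg)
heights = [48.0, 50.0, 51.0, 52.0, 54.0]   # birth height (cm)

det_xy = det(weights, heights)   # predicting height from weight
det_yx = det(heights, weights)   # predicting weight from height

print(det_xy, det_yx)            # the two values agree up to rounding error
```

The two regression lines differ, but the proportion of explained variance is the same in both directions.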
The independence of two random variables QUESTION: Does the height of a person depend on gender?
Does birth height depend on birth weight? [Figure: scatter plot; axes: birth weight (kg), birth height (cm)]
Does variable Y depend on variable X? [Figure: plots of Y against X]
Does variable Y depend on X? [Figure: plot of Y against X]
Independence is mutual. IMPORTANT: If Y is independent of X, then X is independent of Y as well.
The covariance
DEFINITION: Cov(X,Y) = E(X·Y) - E(X)·E(Y)
If X and Y are independent, then Cov(X,Y) = 0.
The converse is not always true.
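A small counterexample shows why zero covariance does not imply independence. The sketch assumes a hypothetical discrete distribution: X takes the values -1, 0, 1 with equal probability and Y = X², so Y is completely determined by X (a U-shaped relationship). Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Hypothetical distribution (illustration only): each (x, y) pair has probability 1/3
outcomes = [(-1, 1), (0, 0), (1, 1)]   # y = x**2
p = Fraction(1, 3)


def expectation(f):
    """Expected value of f(X, Y) under the distribution above."""
    return sum(p * f(x, y) for x, y in outcomes)


def prob(event):
    """Probability that event(X, Y) is true."""
    return sum(p for x, y in outcomes if event(x, y))


# Cov(X,Y) = E(X*Y) - E(X)*E(Y)
cov = (expectation(lambda x, y: x * y)
       - expectation(lambda x, y: x) * expectation(lambda x, y: y))

# Independence would require P(X=0 and Y=0) = P(X=0) * P(Y=0)
p_joint = prob(lambda x, y: x == 0 and y == 0)
p_product = prob(lambda x, y: x == 0) * prob(lambda x, y: y == 0)

print("Cov(X,Y) =", cov)
print("P(X=0, Y=0) =", p_joint, "but P(X=0)*P(Y=0) =", p_product)
```

The covariance is exactly zero, yet the joint probability differs from the product of the marginals, so X and Y are not independent.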
The correlation coefficient. Standardized covariance = correlation coefficient: ρ(X,Y) = Cov(X,Y) / √(Var(X)·Var(Y))
Relationship between correlation coefficient and coefficient of determination: (ρ(X,Y))² = Det(X,Y)
Some characteristics of ρ(X,Y)
-1 ≤ ρ(X,Y) ≤ 1
If X and Y are independent, then ρ(X,Y) = 0.
If ρ(X,Y) = 0, that is, X and Y are uncorrelated, then X and Y can still be related to each other (U-shaped relationship).
Prediction and correlation
IQ of father = 130. IQ of son = ???
z(IQ/father) = 2. z(IQ/son) = ???
z(predicted) = ρ(X,Y) · z(predictor)
z_ŷ = ρ · z_x
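The father/son question can be worked through under stated assumptions. The sketch below assumes the standardized form of linear prediction, in which the predicted z-score equals the correlation coefficient times the predictor's z-score. It also assumes an IQ scale with mean 100 and standard deviation 15 (consistent with z = 2 for IQ 130) and a purely hypothetical father-son correlation of 0.5; the slide does not give this value.

```python
IQ_MEAN = 100.0
IQ_SD = 15.0
RHO = 0.5   # hypothetical father-son IQ correlation, chosen only for illustration


def z_score(value, mean, sd):
    """Standardize a raw value."""
    return (value - mean) / sd


def predict_from_z(z_predictor, rho, mean, sd):
    """Predicted raw value, using z(predicted) = rho * z(predictor)."""
    z_predicted = rho * z_predictor
    return mean + z_predicted * sd


z_father = z_score(130.0, IQ_MEAN, IQ_SD)
iq_son = predict_from_z(z_father, RHO, IQ_MEAN, IQ_SD)

print("z(IQ/father) =", z_father)
print("predicted IQ of son =", iq_son)
```

Whenever the correlation is below 1 in absolute value, the prediction lies closer to the mean than the predictor does; with a correlation of 0 the prediction would simply be the mean.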
Sample correlation coefficient. Notation: r_XY or r. Formula: r = Σ(x_i - x̄)(y_i - ȳ) / √(Σ(x_i - x̄)² · Σ(y_i - ȳ)²)
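As an illustration, the sample correlation coefficient can be computed directly from deviations about the means. This is a minimal sketch assuming the standard Pearson product-moment formula; the function name `sample_correlation` and the data are hypothetical.

```python
import math


def sample_correlation(xs, ys):
    """Pearson r: sum of cross-deviations over the root of the product of squared-deviation sums."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy sample (illustration only)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = sample_correlation(xs, ys)
r_perfect = sample_correlation(xs, [2 * x + 1 for x in xs])   # exact increasing line
r_negative = sample_correlation(xs, [-x for x in xs])         # exact decreasing line

print(f"r = {r:.4f}")
```

Points lying exactly on an increasing line give r = 1 and points on a decreasing line give r = -1, up to rounding error.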
Significance test of the correlation coefficient
H0: ρ_XY = 0
Condition: X and Y are bivariate normal
From the (X,Y) sample: computation of r_XY (df = n - 2)
If r ≤ -r_0.05: H1: ρ_XY < 0
If |r| < r_0.05: H0 is retained
If r ≥ r_0.05: H2: ρ_XY > 0
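The decision rule of the test can be sketched as follows. The sketch assumes the standard relation between the critical value of r and the critical value of Student's t with the same degrees of freedom, r_crit = t_crit / √(t_crit² + df). The sample size n = 10 is hypothetical, and the value 2.306 is the two-tailed 5% critical t for df = 8 taken from a standard t table. The function names are illustrative.

```python
import math


def critical_r(t_crit, df):
    """Critical value of r corresponding to a critical t value with the same df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


def decide(r, r_crit):
    """Three-way decision: negative correlation, positive correlation, or H0 retained."""
    if r <= -r_crit:
        return "H1: rho < 0"
    if r >= r_crit:
        return "H2: rho > 0"
    return "H0 retained"


n = 10                 # hypothetical sample size
df = n - 2             # degrees of freedom of the test
T_CRIT_DF8 = 2.306     # two-tailed 5% critical t for df = 8 (from a t table)

r_crit = critical_r(T_CRIT_DF8, df)

for r in (-0.80, 0.40, 0.70):   # hypothetical sample correlations
    print(f"r = {r:+.2f}: {decide(r, r_crit)}")
```

For other sample sizes the critical t value has to be looked up for the matching df; the conversion to a critical r stays the same.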