Regression 10/29.


1 Regression 10/29

2 Prediction
- Correlation can tell how to predict one variable from another
- What about multiple predictor variables?
  - Explaining Y using X1, X2, … Xm
  - Income based on age, education
  - Memory based on pre-test, IQ
  - Social behavior based on personality dimensions
- Regression
  - Finds best combination of predictors to explain outcome variable
  - Determines unique contribution of each predictor
  - Can test whether each predictor has reliable influence

3 Linear Prediction
- One predictor
  - Draw a line through data
  - Ŷ = b0 + b1X
- Multiple predictors
  - Each predictor has linear effect
  - Effects of different predictors simply add together
  - Ŷ = b0 + b1X1 + b2X2 + … + bmXm
- Intercept: b0
  - Value of Y when all Xs are zero
- Regression coefficients: bi
  - Influence of Xi
  - Sign tells direction; magnitude tells strength
  - Can be any value (not standardized like correlation)

4 Example
- Predict income from education and intelligence
  - Y = income ($k/yr)
  - X1 = years past HS
  - X2 = IQ
- Regression equation: Ŷ = -70 + 10*X1 + 1*X2
- 4-year college, IQ = 120: -70 + 10*4 + 1*120 = 90 → $90k/yr
- 2-year college, IQ = 90: -70 + 10*2 + 1*90 = 40 → $40k/yr
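The income example can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and the intercept of -70 and the education weight of 10 are inferred from the slide's two worked results (90 and 40) together with the IQ weight of 1, since the equation itself did not survive in the transcript.

```python
def predict_income(years_past_hs, iq, b0=-70.0, b_edu=10.0, b_iq=1.0):
    """Predicted income in $k/yr from a linear regression equation.

    The default coefficients are inferred from the slide's worked
    examples; they are an assumption, not values quoted from the deck.
    """
    # Each predictor contributes linearly, and the contributions add.
    return b0 + b_edu * years_past_hs + b_iq * iq


print(predict_income(4, 120))  # 4-year college, IQ = 120
print(predict_income(2, 90))   # 2-year college, IQ = 90
```

Changing one predictor while holding the other fixed shifts the prediction by that predictor's coefficient, which is what "effects simply add together" means.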

5 Finding Regression Coefficients
- Goal: regression equation with best fit to data
  - Minimize error between Y and Ŷ
- Math solution
  - Matrix algebra
  - Treats X as a matrix
    - Rows for subjects, columns for variables
  - Transposes, matrix inverse, and other fun
- Practical solution
  - Use a computer
  - R: linear model function: lm(Y~X1+X2+X3)
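To make the "matrix algebra" step concrete, here is a minimal Python sketch of least squares via the normal equations, (X'X)b = X'Y. It is not what R's lm() does internally, and the function name fit_ols and the data are made up for illustration. In practice a statistics package is the better choice.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ..., bm] via the normal equations.

    rows: one list of predictor values per subject; y: list of outcomes.
    """
    # Design matrix: a leading column of 1s carries the intercept b0.
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Build X'X (k by k) and X'y (length k).
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up data generated exactly by Y = 1 + 2*X1 + 3*X2.
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
coefs = fit_ols(rows, y)
print([round(b, 6) for b in coefs])
```

Because the made-up outcomes follow the generating equation with no noise, the fit should recover the intercept and both slopes up to rounding.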

6 Regression vs. Correlation
- One predictor: regression = correlation
- Multiple predictors
  - Predictors affect each other
  - Each coefficient depends on what other predictors are included
- Example: 3rd-variable problem
  - Math achievement and verbal achievement are correlated
  - Math helps with English?
  - Add IQ as a predictor: verbal ~ math + IQ
  - bmath will be near 0
  - No effect of math once IQ is accounted for
- Regression finds best combination of predictors to explain outcome
  - Best values of b1, b2, etc. taken together
  - Each coefficient shows contribution of predictor beyond effects of others
  - Allows identification of most important predictors
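The third-variable example can be illustrated with a small Python sketch. The scores below are invented so that verbal achievement depends only on IQ, while math achievement also tracks IQ; the function names are hypothetical, and the two-predictor formulas are the standard closed-form solution of the normal equations on mean-centered data.

```python
def centered_sum(a, b):
    """Sum of cross-products of deviations from the means of a and b."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (z - mb) for x, z in zip(a, b))


def simple_slope(x, y):
    """Slope of y regressed on the single predictor x."""
    return centered_sum(x, y) / centered_sum(x, x)


def two_predictor_slopes(x1, x2, y):
    """Slopes (b1, b2) of y regressed on x1 and x2 together."""
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Made-up scores: verbal is driven by IQ alone; math merely tracks IQ.
iq = [90, 100, 110, 120, 90, 100, 110, 120]
math_score = [50, 45, 60, 55, 40, 55, 50, 65]
verbal = [0.5 * q + 10 for q in iq]

b_alone = simple_slope(math_score, verbal)                 # verbal ~ math
b_math, b_iq = two_predictor_slopes(math_score, iq, verbal)  # verbal ~ math + IQ
print(b_alone, b_math, b_iq)
```

On its own, math looks like a useful predictor of verbal scores; once IQ enters the equation, the math coefficient collapses toward zero and IQ carries the whole effect.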

7 Explained Variability
- Sum of squares
  - Same as mean squared error, but without dividing by df
  - Easier because not complicated by degrees of freedom
- Total sum of squares
  - Variability in Y
  - SSY = Σ(Y – MY)²
- Residual sum of squares
  - Deviation between predicted and actual scores
  - SSresidual = Σ(Y – Ŷ)²

8 Explained Variability
- [Figure: SSY, SSresidual, SSregression]
- SSregression
  - Variability explained by regression
  - Difference between total and residual sums of squares
  - SSregression = SSY - SSresidual
- R² ("R-squared")
  - Fraction of variability explained by the regression
  - R² = SSregression/SSY
  - Measures how well the predictors (Xs) can predict or explain the outcome (Y)
  - Same as r² from correlation when only one predictor
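The sums of squares and R² can be computed directly from the outcomes and the predicted scores. The sketch below uses made-up data with one predictor (X = 1..5), whose least-squares line is Ŷ = 2.2 + 0.6X; the function name is hypothetical.

```python
def explained_variability(y, y_hat):
    """Return (SSY, SSresidual, SSregression, R2) for outcomes and predictions."""
    mean_y = sum(y) / len(y)
    ss_y = sum((v - mean_y) ** 2 for v in y)                # total variability in Y
    ss_residual = sum((v - p) ** 2 for v, p in zip(y, y_hat))  # unexplained part
    ss_regression = ss_y - ss_residual                      # explained part
    return ss_y, ss_residual, ss_regression, ss_regression / ss_y


# Made-up outcomes and their least-squares predictions from Y-hat = 2.2 + 0.6*X.
y = [2, 4, 5, 4, 5]
y_hat = [2.2 + 0.6 * x for x in [1, 2, 3, 4, 5]]
ss_y, ss_residual, ss_regression, r_squared = explained_variability(y, y_hat)
print(ss_y, ss_residual, ss_regression, r_squared)
```

The subtraction SSregression = SSY - SSresidual is only meaningful when the predictions come from a least-squares fit that includes an intercept, as they do here.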

9 Hypothesis Testing with Regression
- Does the regression explain meaningful variance in the outcome?
- Null hypothesis
  - All regression coefficients equal zero in population
  - Regression only "explaining" random error in sample
- Approach
  - Find how much variance the predictors explain: SSregression
  - Compare to amount expected by chance
  - Ratio gives F statistic, which we can use to get p-value
  - If F is larger than expected, regression explains more variance than expected by chance
  - Reject null hypothesis if F is large enough

10 Hypothesis Testing with Regression
- [Figure: SSY, SSresidual, SSregression]
- Likelihood function for F is an F distribution
  - Comes from ratio of two chi-square variables
  - Two df values, from numerator and denominator
- p-value is probability of a result greater than F

11 Degrees of Freedom
- SSY: dfY = n - 1
- SSregression: dfregression = m
- SSresidual: dfresidual = n - m - 1
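Putting the sums of squares and degrees of freedom together gives the F statistic. The transcript lost the slide's formula, so the sketch below assumes the standard form, F = (SSregression/dfregression) / (SSresidual/dfresidual); the function name and the input numbers are made up.

```python
def f_statistic(ss_y, ss_residual, n, m):
    """F statistic for a regression with n subjects and m predictors.

    Returns (F, df_regression, df_residual).
    """
    ss_regression = ss_y - ss_residual
    df_regression = m
    df_residual = n - m - 1
    # Mean squares: each sum of squares divided by its degrees of freedom.
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual, df_regression, df_residual


# Made-up values: SSY = 6, SSresidual = 2.4, n = 5 subjects, m = 1 predictor.
f_value, df1, df2 = f_statistic(6.0, 2.4, n=5, m=1)
print(f_value, df1, df2)
```

Python's standard library has no F distribution, so the p-value is left out here; in R, pf(F, df1, df2, lower.tail = FALSE) gives the probability of a result greater than F.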

12 Testing Individual Predictors
- Does predictor variable have any effect on outcome?
  - Does it provide any information beyond other predictors?
- Null hypothesis: bi = 0 in population
  - Other bs might be nonzero
- Compute standard error for bi
  - Estimated using SSresidual (MSE)
- t statistic: t = bi / SE(bi)
- Get p-value from t distribution with dfresidual
- Can do one-tailed test if predicted direction of effect
  - Sign of bi
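The slide's formulas for the standard error and the t statistic did not survive in the transcript. The textbook versions, offered here as an assumption about what was shown, use the design matrix X (with a leading column of 1s, so i runs from 0 to m):

```latex
\mathrm{MSE} = \frac{SS_{\text{residual}}}{n - m - 1},
\qquad
SE(b_i) = \sqrt{\mathrm{MSE}\,\bigl[(X^{\top}X)^{-1}\bigr]_{ii}},
\qquad
t = \frac{b_i}{SE(b_i)}
```

Under the null hypothesis, t follows a t distribution with n - m - 1 degrees of freedom, which matches dfresidual from the previous slide.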

