Statistical Analysis SC504/HS927 Spring Term 2008 Session 5: Week 20: 15th February OLS (2): assessing goodness of fit, extension to multiple regression
2 How well does the estimated regression model fit the data? How well does the estimated model describe the variation in the dependent variable? Regression models try to account for (“explain”) deviations from the mean.
3 Goodness of Fit. We need to compare the line of best fit with something to assess the ‘goodness of fit’. The most basic model we have of our data is the mean, so we use that. We are looking to see whether our line of best fit is a better representation of the data than the mean (remember the example I read out of the Andy Field book last week).
4 The proportion of total variation of Y explained by the regression. To do this we work out the Sum of Squares (SS) for the mean, which we call SS Total, and the SS for the regression line, which we call SS Residual. The difference between SS Total and SS Residual is known as SS Model (or SS Regression). To compute $R^2$ we divide SS Model by SS Total: $R^2 = SS_{Model} / SS_{Total}$.
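The sums of squares described on this slide can be computed by hand for a small data set. The sketch below is only illustrative: the function name `r_squared` and the five data points are made up for the example and do not come from the course data.

```python
from statistics import mean

def r_squared(x, y):
    """Return (ss_total, ss_residual, ss_model, r2) for a simple OLS fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope and intercept of the line of best fit (one predictor)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    # SS Total: squared deviations of y from its mean (the "mean model")
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    # SS Residual: squared deviations of y from the regression line
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # SS Model: the improvement of the line over the mean
    ss_model = ss_total - ss_residual
    return ss_total, ss_residual, ss_model, ss_model / ss_total

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ss_total, ss_residual, ss_model, r2 = r_squared(x, y)
print(f"SS Total={ss_total:.2f}  SS Residual={ss_residual:.2f}  "
      f"SS Model={ss_model:.2f}  R^2={r2:.2f}")
```

If the line fitted no better than the mean, SS Residual would equal SS Total and the ratio would be 0; a perfect fit gives SS Residual = 0 and a ratio of 1.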
5 Proportion of Variance Explained by the Regression. $R^2$ can range from 0 to 1, with higher values indicating better fit.
6 Example from Suicide data:
7 Multiple OLS regression: $y_i = a + b_1 x_{i1} + b_2 x_{i2} + \dots + b_k x_{ik} + e_i$. Here $a$ is the value of $y$ when all the $x$'s equal 0, and $b_1$ is the change in $y$ from a 1-unit change in $x_1$, holding $x_2, x_3, \dots, x_k$ constant, i.e. controlling for all other independent variables in the model.
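To make the interpretation of $a$ and the $b$'s concrete, here is a minimal sketch that fits a two-predictor model by solving the normal equations in plain Python. The helper names `solve` and `ols` and the data are hypothetical; in practice a statistics package would do this step. The data are constructed so that y = 1 + 2·x1 + 3·x2 exactly, so the fit should recover those values up to rounding error.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Return [a, b1, ..., bk] minimising the sum of squared residuals."""
    rows = [[1.0] + list(r) for r in X]  # leading 1 gives the intercept a
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data built from y = 1 + 2*x1 + 3*x2 with no error term
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
a, b1, b2 = ols(X, y)
print(f"a={a:.3f}  b1={b1:.3f}  b2={b2:.3f}")
```

Reading the result: raising x1 by one unit while keeping x2 fixed changes the predicted y by b1, which is what "holding the other x's constant" means.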
9 E.g.: Using the data set alcohol
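10 Multiple regression: beta coefficients. Suppose Y = alcohol consumption, $X_1$ = income (in £s per week) and $X_2$ = age (in years). How would you compare the estimates for $b_1$ and $b_2$? Because the two predictors are measured in different units, you would need to standardise the coefficients: the standardised beta coefficient for $X_1$ is $b_1$ multiplied by the standard deviation of $X_1$ and divided by the standard deviation of Y, i.e. $\beta_1 = b_1 \, s_{X_1} / s_Y$. The variable with the largest standardised beta coefficient (in absolute value) has the largest impact on Y.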
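The standardisation on this slide is a one-line calculation. The sketch below assumes an unstandardised coefficient has already been estimated; the function name `standardised_beta`, the coefficient 0.6 and the data are illustrative only, not taken from the alcohol data set.

```python
from statistics import stdev

def standardised_beta(b, x, y):
    """Rescale an unstandardised coefficient b into standard-deviation units:
    beta = b * sd(x) / sd(y)."""
    return b * stdev(x) / stdev(y)

# Hypothetical predictor, outcome and slope (0.6 is the OLS slope of y on x here)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
beta = standardised_beta(0.6, x, y)
print(f"standardised beta = {beta:.3f}")
```

A standardised beta reads as "a one standard deviation change in X goes with a beta standard deviation change in Y", which is why coefficients for income and age become comparable. As a check on the sketch, with a single predictor the standardised beta equals the Pearson correlation between x and y.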
11 F-test. An F-test can be used to test whether a subset of variables contributes significantly to the regression model (when doing hierarchical regression). If the F-test for the model as a whole is significant, then the model is considered able to ‘explain’ or ‘account for’ a significant amount of the variance in the dependent variable.
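Both uses of the F-test can be written in terms of the sums of squares introduced earlier. The sketch below uses the standard textbook formulas; the function names `f_overall` and `f_change` and the input numbers are hypothetical. It computes the F statistic only; judging significance still requires comparing it with the F distribution on the stated degrees of freedom (from tables or a statistics package).

```python
def f_overall(ss_model, ss_residual, n, k):
    """F statistic for the model as a whole.
    n = number of cases, k = number of predictors;
    degrees of freedom are (k, n - k - 1)."""
    return (ss_model / k) / (ss_residual / (n - k - 1))

def f_change(ss_res_reduced, ss_res_full, q, n, k_full):
    """F statistic for adding q predictors to a smaller (nested) model.
    ss_res_reduced / ss_res_full = SS Residual before / after adding them;
    k_full = number of predictors in the larger model;
    degrees of freedom are (q, n - k_full - 1)."""
    return ((ss_res_reduced - ss_res_full) / q) / (ss_res_full / (n - k_full - 1))

# Hypothetical values: SS Model = 3.6, SS Residual = 2.4, 5 cases, 1 predictor
f_model = f_overall(3.6, 2.4, n=5, k=1)
# The same test expressed as a change from the mean-only model (SS Residual = SS Total = 6.0)
f_step = f_change(6.0, 2.4, q=1, n=5, k_full=1)
print(f"F (overall) = {f_model:.2f}, F (change) = {f_step:.2f}")
```

The two numbers agree because the overall test is the special case of the change test in which the smaller model contains only the mean.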