Goodness of Fit The sum of squared deviations from the mean of a variable can be decomposed as follows:

TSS = ESS + RSS

where $TSS = \sum_i (y_i - \bar{y})^2$ is the total sum of squares, $ESS = \sum_i (\hat{y}_i - \bar{y})^2$ is the explained sum of squares and $RSS = \sum_i \hat{u}_i^2 = \sum_i (y_i - \hat{y}_i)^2$ is the residual sum of squares. This decomposition can be used to define the R-squared or coefficient of determination for a regression equation:

$R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{RSS}{TSS}$
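As a minimal numerical sketch of the decomposition and of $R^2$, assuming NumPy and simulated data (the variable names and the data-generating process are purely illustrative, not from the notes):

```python
# Verify TSS = ESS + RSS and compute R-squared for a simple OLS fit.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)    # y generated from a true linear model

X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

TSS = np.sum((y - y.mean()) ** 2)          # total sum of squares
ESS = np.sum((y_hat - y.mean()) ** 2)      # explained sum of squares
RSS = np.sum((y - y_hat) ** 2)             # residual sum of squares

print(np.isclose(TSS, ESS + RSS))          # True: the decomposition holds
print("R-squared:", ESS / TSS)             # equivalently 1 - RSS/TSS
```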
Properties of R-squared R-squared always lies in the range zero to one. If R-squared equals one then the regression is a perfect fit to the data (in practice this almost always indicates that something is wrong with the equation, for example a variable regressed on an identity involving itself!). If R-squared is equal to zero then the regression has no explanatory power: the regressors explain none of the variation in the dependent variable. In multivariate regressions the R-squared will always increase, or at least never fall, when we add an extra variable, even if that variable is completely irrelevant, as the sketch below illustrates.
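A minimal sketch of that last property, assuming simulated data: adding a pure-noise regressor (here called junk, a hypothetical name) still weakly increases R-squared, because the fitted RSS can only fall.

```python
# Adding an irrelevant regressor never decreases R-squared.
import numpy as np

def r_squared(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)
junk = rng.normal(size=100)                # unrelated to y by construction

X1 = np.column_stack([np.ones_like(x), x])
X2 = np.column_stack([X1, junk])
print(r_squared(X1, y))                    # R-squared without the noise variable
print(r_squared(X2, y))                    # never smaller than the line above
```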
Testing if an equation has explanatory power Suppose we wish to test the null hypothesis that none of the slope coefficients has explanatory power:

$H_0: \beta_2 = \beta_3 = \dots = \beta_k = 0$ against $H_1$: at least one $\beta_j \neq 0$.

Under the null hypothesis we can show that:

$F = \dfrac{ESS/(k-1)}{RSS/(n-k)} \sim F_{k-1,\, n-k}$

where $n$ is the sample size and $k$ is the number of estimated coefficients, including the intercept. This is the F-statistic for a regression equation. We can compare the test statistic with a critical value from the F tables and reject the null if it exceeds this value.
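A minimal sketch of this test, assuming NumPy and SciPy and simulated data; scipy.stats.f.ppf supplies the critical value in place of printed F tables.

```python
# Regression F test: F = (ESS/(k-1)) / (RSS/(n-k)) vs F(k-1, n-k) critical value.
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(2)
n = 60
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 0.5 + 1.5 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
k = X.shape[1]                             # coefficients incl. intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
ESS = np.sum((y_hat - y.mean()) ** 2)
RSS = np.sum((y - y_hat) ** 2)

F_stat = (ESS / (k - 1)) / (RSS / (n - k))
F_crit = f.ppf(0.95, k - 1, n - k)         # 5% critical value
print(F_stat, F_crit, F_stat > F_crit)     # reject H0 if statistic exceeds it
```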
Relationship between the F-statistic and R-squared Since $ESS = R^2 \cdot TSS$ and $RSS = (1 - R^2) \cdot TSS$, the F-statistic can be rewritten as

$F = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

so we can think of the F test as a test of $H_0: R^2 = 0$, i.e. that the regression has no explanatory power. This relationship remains true when we consider multivariate regressions.
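A quick numerical check, under the same kind of simulated-data assumptions as above, that the two forms of the F-statistic coincide:

```python
# The sums-of-squares form and the R-squared form of F are identical.
import numpy as np

rng = np.random.default_rng(3)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
k = X.shape[1]
y = X @ np.array([1.0, 0.8, -0.5]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
TSS = np.sum((y - y.mean()) ** 2)
RSS = np.sum((y - y_hat) ** 2)
R2 = 1 - RSS / TSS

F_from_ss = ((TSS - RSS) / (k - 1)) / (RSS / (n - k))   # ESS = TSS - RSS
F_from_r2 = (R2 / (k - 1)) / ((1 - R2) / (n - k))
print(np.isclose(F_from_ss, F_from_r2))                 # True
```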
For a bivariate regression equation, there is also a relationship between the F-test and the t-ratio for the slope coefficient: $F = t^2$, where $t$ is the t-ratio for testing $H_0: \beta_2 = 0$. This relationship only holds for bivariate regression equations. Things become more complicated when we move to multivariate regressions, where the F test restricts several coefficients jointly.
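A minimal sketch, assuming simulated bivariate data, confirming $F = t^2$ numerically:

```python
# In a bivariate regression the F statistic equals the squared slope t-ratio.
import numpy as np

rng = np.random.default_rng(4)
n = 30
x = rng.normal(size=n)
y = 2.0 + 0.7 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - 2)               # estimated error variance
var_beta = s2 * np.linalg.inv(X.T @ X)     # covariance matrix of the estimates
t_slope = beta[1] / np.sqrt(var_beta[1, 1])

TSS = np.sum((y - y.mean()) ** 2)
RSS = resid @ resid
F_stat = (TSS - RSS) / (RSS / (n - 2))     # here k - 1 = 1
print(np.isclose(F_stat, t_slope ** 2))    # True: F = t^2
```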
Relationship between the R-squared and the standard error of the regression In the bivariate case the standard error of the regression is $s = \sqrt{RSS/(n-2)}$, so the R-squared can be written as

$R^2 = 1 - \dfrac{(n-2)s^2}{TSS}$

For a given TSS, a smaller standard error of the regression therefore means a higher R-squared. A similar relationship will hold for the multivariate case but we will need to adjust for the loss of degrees of freedom when we introduce extra regressors, replacing $n-2$ with $n-k$.
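A short numerical confirmation of this rewriting, again with simulated data:

```python
# With s^2 = RSS/(n-2), R-squared equals 1 - (n-2)*s^2/TSS in the bivariate case.
import numpy as np

rng = np.random.default_rng(5)
n = 25
x = rng.normal(size=n)
y = 1.0 + 3.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
RSS = resid @ resid
TSS = np.sum((y - y.mean()) ** 2)

s2 = RSS / (n - 2)                         # squared standard error of the regression
print(np.isclose(1 - RSS / TSS, 1 - (n - 2) * s2 / TSS))  # True
```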
Properties of the OLS residuals The OLS residuals sum to zero: $\sum_i \hat{u}_i = 0$. This follows from the first-order condition for the intercept and is equivalent to the property that the OLS regression passes through the sample means of the data.
The OLS residuals are uncorrelated with the X variable: the first-order condition for the slope gives $\sum_i x_i \hat{u}_i = 0$. Note: since the residuals also sum to zero, the sample covariance between X and the residuals is $\frac{1}{n}\sum_i (x_i - \bar{x})\hat{u}_i = \frac{1}{n}\sum_i x_i \hat{u}_i = 0$. Therefore the sample correlation between the X variable and the residuals is exactly zero.
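A minimal sketch verifying both residual properties numerically, assuming NumPy and simulated data:

```python
# OLS residuals sum to zero and are orthogonal to (hence uncorrelated with) X.
import numpy as np

rng = np.random.default_rng(6)
n = 40
x = rng.normal(size=n)
y = 0.5 + 1.2 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ beta                           # OLS residuals

print(np.isclose(u.sum(), 0))              # sum of residuals is zero
print(np.isclose(x @ u, 0))                # sum of x_i * u_i is zero
print(np.isclose(np.cov(x, u)[0, 1], 0))   # hence the sample covariance is zero
```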