Regression Models Residuals and Diagnosing the Quality of a Model.


1 Regression Models Residuals and Diagnosing the Quality of a Model

2 Visualizing Regression Models

3 Collinearity

4 An Omitted Variable?

5 Models
A Model: A statement of the relationship between a phenomenon to be explained and the factors, or variables, which explain it.
Steps in the Process of Quantitative Analysis:
–Specification of the model
–Estimation of the model
–Evaluation of the model

6 Thus far…
We’ve discussed:
–The specification of a model
–The estimation of a model, and how to read and interpret the statistics we’ve produced: coefficients, t tests, F tests, R Square
Now we need to evaluate the model for problems and further elaboration.

7 We need to evaluate
The variation in the predicted values, and the difference between each observed Yi and its predicted Y. That difference is called a “residual.” We can analyze the residuals to see how good the equation is, and whether there are problems with the model that need correction or improvement.
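As a minimal sketch of what a residual is, the snippet below fits a bivariate least-squares line to a small made-up data set and subtracts each predicted Y from the observed Y. The data and the helper name `fit_line` are hypothetical and are not taken from the presentation.

```python
# Hypothetical toy data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [12.0, 19.0, 33.0, 38.0, 52.0]


def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


a, b = fit_line(xs, ys)
predicted = [a + b * x for x in xs]

# Residual = observed Y minus predicted Y.
residuals = [y - p for y, p in zip(ys, predicted)]

for x, y, p, e in zip(xs, ys, predicted, residuals):
    print(f"x={x:.0f}  observed={y:.1f}  predicted={p:.1f}  residual={e:+.1f}")
```

With an intercept in the model, least-squares residuals sum to zero, which is why the residual mean reported in the deck's summary statistics is 0.000.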

8 More statistics… Standard Error of the Estimate: The square root of the average squared error of prediction is used as a measure of the accuracy of prediction. (p. 281 and 340 in the text). For the population: For the sample:
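The two formula images on this slide did not survive in the transcript. The block below gives the usual textbook forms, offered as an assumption about what the slide showed rather than a recovered original. Here \hat{Y}_i is the predicted value and k is the number of independent variables, so the sample denominator is n - 2 in the bivariate case. That matches the residual degrees of freedom (465 = 467 - 2) in the bivariate housing output on slide 13.

```latex
% Population: average squared error of prediction, then the square root
\sigma_{est} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2}{N}}

% Sample: divide by the residual degrees of freedom instead of n
s_{est} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - k - 1}}
```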

9 Standard Error of the Estimate
Used to calculate a confidence interval around the predicted y. As a rule of thumb, multiply the SEE by 2, then add it to and subtract it from a predicted Y to get a rough 95% interval for an individual prediction. At the mean of the independent variable, the standard error of the predicted mean of Y = SEE/(square root of n).
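The rule of thumb can be sketched in a few lines. The SEE (20.419) and n (467) are taken from the bivariate housing model output on slide 13. The predicted value of 50.0 and the function names are hypothetical, chosen only for illustration.

```python
import math

SEE = 20.419  # standard error of estimate, from the bivariate model output
N = 467       # number of cases, from the same output


def rule_of_thumb_interval(predicted_y, see=SEE):
    """Rough 95% interval for an individual prediction: predicted Y +/- 2 * SEE."""
    return predicted_y - 2 * see, predicted_y + 2 * see


def se_of_mean_prediction_at_xbar(see=SEE, n=N):
    """Standard error of the predicted mean of Y when X equals its own mean."""
    return see / math.sqrt(n)


# 50.0 is a hypothetical predicted value, not a figure from the deck.
low, high = rule_of_thumb_interval(50.0)
se_mean = se_of_mean_prediction_at_xbar()

print(f"rough 95% interval for one prediction: {low:.2f} to {high:.2f}")
print(f"standard error of the mean prediction at x-bar: {se_mean:.3f}")
```

The interval for a single house is far wider than the uncertainty about the average house, because the first includes the scatter of individual cases around the line.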

10 Hypothetical Example
[Figure: scatterplot of Y against X with a fitted regression line. One observation has an observed Y of 55 and a predicted value of 48.8, so its residual is 6.2.]

11 Example from last week….
Newval = a + b1(Newsize) + b2(Families) + b3(Eastside) + b4(South)

Dep Var: NEWVAL   N: 467   Multiple R: 0.75   Squared multiple R: 0.56
Adjusted squared multiple R: 0.55   Standard error of estimate: 19.61

Effect     Coefficient   Std Error   Std Coef   Tolerance        t   P(2 Tail)
CONSTANT         -3.32        2.95       0.00           .    -1.13        0.26
NEWSIZE          23.60        1.32       0.67        0.68    17.88        0.00
FAMILIES         -5.27        2.15      -0.08        0.87    -2.46        0.01
EASTSIDE         14.06        2.53       0.20        0.78     5.56        0.00
SOUTH             6.08        2.75       0.08        0.81     2.21        0.03
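To show how the estimated equation produces a predicted value, here is a small sketch that plugs the coefficients from the output above into the model. The input values are hypothetical, the deck does not state the units of Newsize or Newval, and treating Eastside and South as 0/1 indicators is an assumption.

```python
# Coefficients copied from the regression output on this slide.
COEFFICIENTS = {
    "CONSTANT": -3.32,
    "NEWSIZE": 23.60,
    "FAMILIES": -5.27,
    "EASTSIDE": 14.06,
    "SOUTH": 6.08,
}


def predict_newval(newsize, families, eastside, south, coef=COEFFICIENTS):
    """Predicted Newval = a + b1*Newsize + b2*Families + b3*Eastside + b4*South."""
    return (
        coef["CONSTANT"]
        + coef["NEWSIZE"] * newsize
        + coef["FAMILIES"] * families
        + coef["EASTSIDE"] * eastside
        + coef["SOUTH"] * south
    )


# Hypothetical property: Newsize of 2.0, one family, on the east side, not south.
example = predict_newval(newsize=2.0, families=1, eastside=1, south=0)
print(f"predicted Newval: {example:.2f}")
```

The residual for any actual property would be its observed Newval minus the value this function returns.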

12 To understand the principles, let’s simplify….
We return to the bivariate case: House value is a function of the size of the building.
Regression models assume that the errors of prediction are homoscedastic, not autocorrelated, normally distributed, and not correlated with the independent variables. That is, the error term should be noise.
Now we ask:
–1. How accurate is our prediction?
–2. What are the characteristics of the residuals, or the error term?
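The assumptions listed above can each be probed with a simple statistic computed from the residuals. The sketch below is one informal way to do that in plain Python, not the procedure used in the presentation. The function names, the data, and the choice of checks (a Durbin-Watson statistic for autocorrelation, a high-half versus low-half spread ratio for heteroscedasticity, and a correlation with a candidate variable) are all assumptions made for illustration.

```python
import math


def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    cov = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    su = math.sqrt(sum((a - u_bar) ** 2 for a in u))
    sv = math.sqrt(sum((b - v_bar) ** 2 for b in v))
    return cov / (su * sv)


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def spread_ratio(xs, residuals):
    """Mean squared residual in the high-x half divided by that in the low-x half.

    A ratio far from 1 hints that the error variance changes with x.
    """
    ordered = [e for _, e in sorted(zip(xs, residuals))]
    half = len(ordered) // 2
    low, high = ordered[:half], ordered[-half:]
    mean_sq = lambda es: sum(e ** 2 for e in es) / len(es)
    return mean_sq(high) / mean_sq(low)


# Hypothetical residuals and a hypothetical candidate variable, in case order.
candidate = [1, 2, 3, 4, 5, 6]
residuals = [1.0, -2.0, 1.5, -0.5, 2.0, -2.0]

dw = durbin_watson(residuals)
ratio = spread_ratio(candidate, residuals)
r = pearson_r(candidate, residuals)
print(f"Durbin-Watson: {dw:.3f}  spread ratio: {ratio:.3f}  correlation: {r:.3f}")
```

Note that least-squares residuals are uncorrelated with the included independent variables by construction, so the correlation check is most informative when applied to a variable that was left out of the model.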

13 Model of Housing Values and Building Size

Dep Var: NEWVAL   N: 467   Multiple R: 0.719   Squared multiple R: 0.517
Adjusted squared multiple R: 0.516   Standard error of estimate: 20.419

Effect     Coefficient   Std Error   Std Coef   Tolerance        t   P(2 Tail)
CONSTANT        -8.667       2.012      0.000           .   -4.307       0.000
NEWSIZE         25.381       1.138      0.719       1.000   22.312       0.000

Analysis of Variance
Source       Sum-of-Squares    df   Mean-Square   F-ratio       P
Regression       207571.306     1    207571.306   497.842   0.000
Residual         193878.246   465       416.942
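The summary statistics in this output are all linked through the Analysis of Variance table. As a check on how they fit together, the sketch below recomputes R Square, adjusted R Square, the standard error of estimate, and F from the sums of squares reported above. Variable names are illustrative.

```python
import math

# Figures copied from the Analysis of Variance table on this slide.
SS_REGRESSION = 207571.306
SS_RESIDUAL = 193878.246
DF_REGRESSION = 1
DF_RESIDUAL = 465
N = 467

# R Square: share of the total variation explained by the regression.
r_squared = SS_REGRESSION / (SS_REGRESSION + SS_RESIDUAL)

# Adjusted R Square penalizes for the number of estimated coefficients.
adj_r_squared = 1 - (1 - r_squared) * (N - 1) / DF_RESIDUAL

# Residual mean square, and its square root, the standard error of estimate.
ms_residual = SS_RESIDUAL / DF_RESIDUAL
see = math.sqrt(ms_residual)

# F ratio: regression mean square over residual mean square.
f_ratio = (SS_REGRESSION / DF_REGRESSION) / ms_residual

print(f"R Square {r_squared:.3f}, adjusted {adj_r_squared:.3f}")
print(f"SEE {see:.3f}, F {f_ratio:.3f}")
```

In the bivariate case F is also the square of the slope's t statistic, up to rounding in the printed coefficient and standard error.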

14 Scatterplot of Newsize and Newval

15 Scatterplot, cont.

16 95% Confidence Intervals for Mean Predictions of Y (left) and Individual Predictions of Y (right)
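The two panels differ only in the standard error used. As a reference sketch for the bivariate case, and assuming the standard textbook formulas rather than anything printed on the slide, let s_{est} be the standard error of the estimate, x_0 the value of X at which we predict, and t the critical value with n - 2 degrees of freedom.

```latex
% Standard error of the predicted MEAN of Y at x_0 (left panel)
se_{mean}(x_0) = s_{est} \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}

% Standard error of an INDIVIDUAL prediction of Y at x_0 (right panel)
se_{ind}(x_0) = s_{est} \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}

% 95% interval in either case
\hat{Y}_0 \pm t_{0.975,\, n-2} \cdot se(x_0)
```

At x_0 = \bar{x} the first expression reduces to s_{est}/\sqrt{n}, and for large n the second is close to s_{est}, which is where the "plus or minus 2 times the SEE" rule of thumb comes from.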

17 Hypothetical Example
[Figure: scatterplot of Y against X with a fitted regression line. One observation has an observed Y of 55 and a predicted value of 48.8, so its residual is 6.2.]

18 Analysis of Residuals

                  ESTIMATE      NEWVAL      RESIDUAL
N of cases             467         467           467
Minimum             -2.647       6.400       -56.140
Maximum            157.129     399.600       242.471
Range              159.777     393.200       298.611
Sum              14463.200   14463.200         0.000
Median              25.391      24.000        -0.092
Mean                30.970      30.970         0.000
95% CI Upper        32.963      33.639         1.775
95% CI Lower        28.977      28.301        -1.775
Std. Error           1.014       1.358         0.903
Standard Dev        21.917      29.351        19.522
Variance           480.353     861.480       381.127
C.V.                 0.708       0.948   9.54775E+14
Skewness(G1)         1.337       6.756         7.030
SE Skewness          0.113       0.113         0.113
Kurtosis(G2)         2.875      67.925        79.001
SE Kurtosis          0.225       0.225         0.225
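One way to read the skewness and kurtosis rows is to divide each statistic by its standard error: a ratio well beyond about 2 in absolute value suggests the residuals are not normally distributed. The sketch below assumes the statistics package used the common small-sample formulas for those standard errors. That is an inference, based on the fact that the formulas give values close to the 0.113 and 0.225 reported above for n = 467.

```python
import math

N = 467
G1_RESIDUAL = 7.030   # skewness of the residuals, from the table above
G2_RESIDUAL = 79.001  # kurtosis of the residuals, from the table above


def se_skewness(n):
    """Small-sample standard error of the skewness statistic G1."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Small-sample standard error of the kurtosis statistic G2."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


ses = se_skewness(N)
sek = se_kurtosis(N)

# Ratios of each statistic to its standard error.
z_skew = G1_RESIDUAL / ses
z_kurt = G2_RESIDUAL / sek

print(f"SE skewness {ses:.3f}, SE kurtosis {sek:.3f}")
print(f"skewness ratio {z_skew:.1f}, kurtosis ratio {z_kurt:.1f}")
```

Both ratios come out far above 2, consistent with the long right tail implied by a maximum residual of 242.471 against a minimum of -56.140.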

19 Visualizing Regression Models

20 Collinearity

21 An Omitted Variable?

