Published by Shanon Manning. Modified over 9 years ago.
1
Economics 173 Business Statistics, Lecture 20, Fall 2001. © Professor J. Petry. http://www.cba.uiuc.edu/jpetry/Econ_173_fa01/
3
Example – Vacation Homes (18.1)
1. What is the standard error of the estimate? Interpret its value.
2. What is the coefficient of determination? What does this statistic tell you?
3. What is the coefficient of determination, adjusted for degrees of freedom? Why does this value differ from the coefficient of determination? What does this tell you about the model?
=========================================================
1. Test the overall validity of the model. What does the p-value of the test statistic tell you?
2. Interpret each of the coefficients.
3. Test to determine whether each of the independent variables is linearly related to the price of the lot.
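The three fit statistics asked about in the first group of questions can be computed from the observed and fitted values. The sketch below is a minimal illustration, assuming a regression with an intercept and k independent variables; the function name `fit_statistics` and the numbers are made up for illustration and are not the vacation-home data.

```python
import math

def fit_statistics(y, y_hat, k):
    """Return (standard error of estimate, R^2, adjusted R^2).

    y      observed values of the dependent variable
    y_hat  fitted values from a regression that includes an intercept
    k      number of independent variables in the model
    """
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    s_e = math.sqrt(sse / (n - k - 1))
    r2 = 1 - sse / sst
    # The adjustment divides each sum of squares by its degrees of freedom.
    adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))
    return s_e, r2, adj_r2

# Hypothetical observed and fitted values, for illustration only.
y = [10.0, 12.0, 15.0, 19.0, 24.0]
y_hat = [9.0, 13.0, 16.0, 18.0, 24.0]
s_e, r2, adj_r2 = fit_statistics(y, y_hat, k=1)
print(round(s_e, 3), round(r2, 3), round(adj_r2, 3))
```

Because the adjusted value penalizes each additional independent variable through the degrees of freedom, it is never larger than the unadjusted coefficient of determination when k is at least 1.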
4
18.4 Regression Diagnostics - II
The required conditions for the model assessment to apply must be checked.
– Is the error variable normally distributed? Draw a histogram of the residuals.
– Is the error variance constant? Plot the residuals versus $\hat{y}$.
– Are the errors independent? Plot the residuals versus the time periods.
– Can we identify outliers?
– Is multicollinearity a problem?
5
Example 18.2 House price and multicollinearity
– A real estate agent believes that a house's selling price can be predicted using the house size, number of bedrooms, and lot size.
– A random sample of 100 houses was drawn and the data recorded.
– Analyze the relationship among the four variables.
6
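Solution
The proposed model is $\text{PRICE} = \beta_0 + \beta_1\,\text{BEDROOMS} + \beta_2\,\text{H-SIZE} + \beta_3\,\text{LOTSIZE} + \varepsilon$
– Excel solution: [regression output not included in this transcript]
The model is valid, but no variable is significantly related to the selling price!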
7
– However, when the price is regressed on each independent variable alone, each variable is found to be strongly related to the selling price.
– Multicollinearity is the source of this problem.
Multicollinearity causes two kinds of difficulties:
– The t statistics appear to be too small.
– The coefficients cannot be interpreted as “slopes”.
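One common way to look for multicollinearity is to compute the correlations among the independent variables themselves. The sketch below is only an illustration: the six houses are hypothetical values (the sample of 100 houses is not reproduced here), and `pearson` is a helper written for this example.

```python
import math

def pearson(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for six houses, for illustration only.
houses = {
    "BEDROOMS": [2, 3, 3, 4, 5, 5],
    "H-SIZE": [1100, 1500, 1600, 2100, 2600, 2800],
    "LOTSIZE": [4000, 5200, 5500, 7000, 8800, 9100],
}

# Correlation for every pair of independent variables.
names = list(houses)
correlations = {
    (a, b): pearson(houses[a], houses[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Correlations close to 1 or -1 between independent variables suggest that they carry overlapping information, which is the situation in which the t statistics shrink and individual coefficients become hard to interpret.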
8
Remedying violations of the required conditions
– Nonnormality or heteroscedasticity can be remedied using transformations on the y variable.
– The transformations can improve the linear relationship between the dependent variable and the independent variables.
– Many computer software systems allow us to make the transformations easily.
9
A brief list of transformations
» $y' = \log y$ (for $y > 0$)
   Use when $s$ increases with $y$, or when the error distribution is positively skewed.
» $y' = y^2$
   Use when $s^2$ is proportional to $E(y)$, or when the error distribution is negatively skewed.
» $y' = y^{1/2}$ (for $y > 0$)
   Use when $s^2$ is proportional to $E(y)$.
» $y' = 1/y$
   Use when $s^2$ increases significantly when $y$ increases beyond some value.
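As a rough sketch, the four transformations in the list can be written as small Python functions and applied to a column of y values before re-running the regression. The dictionary keys, the `transform` helper, and the sample values are assumptions made for this illustration.

```python
import math

# The four transformations from the list, keyed by illustrative names.
TRANSFORMS = {
    "log": lambda y: math.log(y),        # natural log; requires y > 0
    "square": lambda y: y ** 2,
    "sqrt": lambda y: math.sqrt(y),      # the list restricts this to y > 0
    "reciprocal": lambda y: 1 / y,       # undefined at y = 0
}

def transform(values, name):
    """Apply the named transformation to every value of y."""
    f = TRANSFORMS[name]
    return [f(v) for v in values]

# Hypothetical y values, for illustration only.
y = [4.0, 9.0, 25.0]
for name in TRANSFORMS:
    print(name, [round(v, 4) for v in transform(y, name)])
```

The transformed column then takes the place of y as the dependent variable, and the diagnostics are repeated on the new residuals.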
10
Example 18.3: Analysis, diagnostics, transformations.
– A statistics professor wanted to know whether the time limit affects the marks on a quiz.
– A random sample of 100 students was split into 5 groups.
– Each student wrote a quiz, but each group was given a different time limit. [Data table of marks not included in this transcript.]
Analyze these results, and include diagnostics.
11
The model tested: $\text{MARK} = \beta_0 + \beta_1\,\text{TIME} + \varepsilon$
This model is useful and provides a good fit.
The errors seem to be normally distributed.
12
The standard error of estimate seems to increase with the predicted value of y.
Two transformations are used to remedy this problem:
1. $y' = \log_e y$
2. $y' = 1/y$
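The observation that the spread of the residuals grows with the predicted value can also be checked numerically. The sketch below fits a simple least-squares line and reports the root-mean-square residual for each time limit; with a positive slope, grouping by time limit is the same as grouping by predicted value. The data are hypothetical (the quiz marks are not reproduced here), and `fit_line` and `residual_spread_by_x` are helper names invented for this example.

```python
import math
from collections import defaultdict

def fit_line(x, y):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return b0, b1

def residual_spread_by_x(x, y):
    """Root-mean-square residual for each distinct x value, in increasing x order."""
    b0, b1 = fit_line(x, y)
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi - (b0 + b1 * xi))
    return {
        xi: math.sqrt(sum(r ** 2 for r in res) / len(res))
        for xi, res in sorted(groups.items())
    }

# Hypothetical time limits (minutes) and marks, for illustration only.
time = [40, 40, 40, 55, 55, 55, 70, 70, 70]
mark = [19, 20, 21, 24, 27, 30, 30, 35, 40]
b0, b1 = fit_line(time, mark)
spread = residual_spread_by_x(time, mark)
for t, s in spread.items():
    print(t, round(s, 3))
```

A spread that rises steadily from group to group, as it does for these made-up numbers, is the pattern that motivates trying the log or reciprocal transformation.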
13
Let us see what happens when a transformation is applied.
The original data, where “Mark” is a function of “Time”: for example, the points (40, 18) and (40, 23).
The modified data, where LogMark is a function of “Time”: the same points become (40, 2.89) and (40, 3.135), since $\log_e 18 = 2.89$ and $\log_e 23 = 3.135$.
14
The new regression analysis and the diagnostics are:
The model tested: $\text{LOGMARK} = \beta'_0 + \beta'_1\,\text{TIME} + \varepsilon'$
Predicted LogMark = 2.1295 + .0217 Time
This model is useful and provides a good fit.
15
The errors seem to be normally distributed.
The standard error still changes with the predicted y, but the change is smaller than before.
16
How do we use the modified model to predict?
Let TIME = 55 minutes.
LogMark = 2.1295 + .0217 Time = 2.1295 + .0217(55) = 3.323
To find the predicted mark, take the antilog:
$\text{antilog}_e\,3.323 = e^{3.323} = 27.743$
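The same prediction can be reproduced in a few lines of Python. This is a minimal sketch that uses the coefficients of the fitted equation Predicted LogMark = 2.1295 + .0217 Time; the function name `predict_mark` is an assumption made for this example.

```python
import math

# Coefficients of the fitted log model: LogMark = 2.1295 + 0.0217 * Time
B0 = 2.1295
B1 = 0.0217

def predict_mark(time_minutes):
    """Return (predicted LogMark, predicted mark) for a given time limit."""
    log_mark = B0 + B1 * time_minutes
    mark = math.exp(log_mark)  # antilog base e undoes the natural-log transformation
    return log_mark, mark

log_mark, mark = predict_mark(55)
print(round(log_mark, 3), round(mark, 2))
```

The key step is the back-transformation: the model predicts the log of the mark, so the exponential must be applied before the result is read as a mark.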