More Multiple Regression

Chapter 15, continued

III. Adjusted R²
In single-variable regression, we assess goodness of fit with R², the coefficient of determination.
R² = SSR/SST
This value is interpreted as the proportion of the variability in y that is explained by the estimated regression equation.
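As a minimal sketch of the R² = SSR/SST calculation, the snippet below fits a single-variable least-squares line and computes the ratio by hand. The five data points are made up purely for illustration; they are not the repair or television data from these slides.

```python
# Hypothetical data, chosen only to make the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for a single independent variable.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)  # sum of squares due to regression
r2 = ssr / sst

print(round(r2, 4))  # 0.6
```

Here 60% of the variability in y is explained by the fitted line.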

A. Inclusion of more variables
An unfortunate result of adding more independent variables to our regression is that R² never decreases, and almost always increases, even if we are adding insignificant variables. For example, if we had added x2 = “Color of the car” to our repair regression, R² would have increased marginally, despite the ridiculous idea that the color of a car should influence its repair cost.
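The effect can be checked with a small simulation. This is only a sketch on randomly generated data: the variable names, the coefficients used to generate y, and the 0/1 color code are all assumptions made for illustration, not the repair data from the lecture. The fit solves the normal equations directly so that no external library is needed.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def r_squared(predictors, y):
    """Least-squares fit of y on the predictor columns plus an intercept; returns R^2."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - sse / sst  # equals SSR/SST for a least-squares fit with an intercept

random.seed(0)
n = 60
x1 = [random.uniform(0, 10) for _ in range(n)]                 # a variable that matters
color_code = [float(random.choice([0, 1])) for _ in range(n)]  # unrelated to y
y = [2 + 1.5 * v + random.gauss(0, 3) for v in x1]

r2_one = r_squared([x1], y)
r2_two = r_squared([x1, color_code], y)
print(r2_one, r2_two)  # the second value is never smaller than the first
```

Because least squares can always set the new coefficient to zero, the two-variable fit can do no worse than the one-variable fit on SSE, so R² cannot fall.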

B. Adjustment
To avoid rewarding a model for adding more and more variables just to increase R², we compensate for the number of independent variables in the model. With n denoting the # of observations in the sample and p the # of independent variables included in the model, the adjusted R² is
Ra² = 1 - (1 - R²)(n - 1)/(n - p - 1)
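The adjustment described above can be sketched as a one-line function using the standard adjusted R² formula. The R² values in the usage lines are hypothetical, picked only to show how a tiny gain in R² can still lower the adjusted value; they are not taken from the lecture's Excel output.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1).

    r2: ordinary R^2, n: number of observations, p: number of independent variables.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical values: one variable gives R^2 = 0.300, a second adds only 0.005.
one_var = adjusted_r_squared(0.300, n=60, p=1)
two_var = adjusted_r_squared(0.305, n=60, p=2)
print(round(one_var, 4), round(two_var, 4))  # 0.2879 0.2806
```

R² rose from 0.300 to 0.305, but the adjusted value fell, which is the signal that the extra variable did not earn its place.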

C. An Example
Y is the # of hours of television watched in a week.
X1 is the amount of alcohol consumed in a typical week.
Can you interpret these estimated coefficients and test their significance? Can you correctly evaluate the fit of the equation?

Include one more variable
Now I’ll add X2 = Age of the student, which I don’t believe affects television viewing, but am adding to make a point. If you looked simply at R², you would conclude that the goodness of fit slightly improved. However, looking at the adjusted R² (Ra²), you can see that adding this insignificant variable actually decreased the adjusted fit. Alcohol is still significant and positive, but Age is insignificant.

IV. Model Assumptions
These assumptions are modified from Chapter 14 to accommodate the inclusion of multiple independent variables.
The error term ε is a normally distributed random variable with an expected value of zero.
The variance of ε is constant for all values of x1, x2, …, xp.
All ε are independent: the size of one error term is not influenced by any other error term.

V. Testing for Significance
Now that we have multiple independent variables, we can conduct a true F-test of overall significance.
H0: β1 = β2 = … = βp = 0
Ha: One or more of the parameters is not equal to zero.

A. The F-test
As described in Chapter 14, the test statistic is calculated by F = MSR/MSE, where MSR = SSR/p, p is the # of x-variables, and MSE = SSE/(n-p-1).
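A minimal sketch of the F = MSR/MSE calculation follows. The sums of squares are hypothetical round numbers chosen so the arithmetic is easy to check; only n = 60 and p = 2 match the television example.

```python
def f_statistic(ssr, sse, n, p):
    """Overall-significance test statistic F = MSR/MSE."""
    msr = ssr / p            # mean square due to regression, p numerator d.f.
    mse = sse / (n - p - 1)  # mean square error, n - p - 1 denominator d.f.
    return msr / mse

# Hypothetical sums of squares (not from the lecture's Excel output).
f_test = f_statistic(ssr=120.0, sse=855.0, n=60, p=2)
print(f_test)  # 4.0
```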

B. Rejection Rule
The critical F is based on an F distribution with p degrees of freedom in the numerator and (n-p-1) degrees of freedom in the denominator. So I’ll test the overall significance of my television-watching model.

C. The Example
I have a sample of n=60 and p=2 independent variables, so I have d.f.=2 in the numerator and d.f.=n-p-1=57 in the denominator. At the .05 level of significance, my critical F is approximately 3.15. If my test F is greater than 3.15, I reject the null and conclude that at least one of my coefficients is NOT zero and my model has overall significance.
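The critical value can be checked without a table. The sketch below relies on a special case: when the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form (1 + 2F/d)^(-d/2), where d is the denominator d.f., so the critical value can be solved for directly. The function name is made up for this illustration, and the shortcut does not apply when p is not 2.

```python
def critical_f_two_numerator_df(alpha, denom_df):
    """Critical F for 2 numerator d.f., from solving (1 + 2F/d)^(-d/2) = alpha."""
    return (denom_df / 2) * (alpha ** (-2 / denom_df) - 1)

print(round(critical_f_two_numerator_df(0.05, 57), 2))  # 3.16
print(round(critical_f_two_numerator_df(0.05, 60), 2))  # 3.15
```

The value for 57 denominator d.f. is about 3.16. Printed F tables usually skip from 40 to 60 denominator d.f., and the 60 d.f. entry is about 3.15, which is close enough for the rejection rule used here.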

D. Excel Output
My test statistic is greater than 3.15, so I reject H0. You can also see that the p-value is less than α (.05), which likewise indicates that I should reject H0. However, it is not less than α (.01). Thus my model is significant at the 95% level, but not the 99% level of confidence.
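To illustrate how a model can pass at α = .05 but fail at α = .01, here is a sketch that computes the p-value for an F test with 2 numerator d.f., again using the closed form (1 + 2F/d)^(-d/2), which is valid only for that case. The test statistic F = 4.0 is a hypothetical stand-in, since the actual value from the Excel output is not reproduced in this transcript.

```python
def p_value_two_numerator_df(f_stat, denom_df):
    """Upper-tail probability P(F > f_stat) for an F distribution with 2 numerator d.f."""
    return (1 + 2 * f_stat / denom_df) ** (-denom_df / 2)

f_test = 4.0  # hypothetical test statistic, not the lecture's value
p_value = p_value_two_numerator_df(f_test, denom_df=57)

for alpha in (0.05, 0.01):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: p-value = {p_value:.4f}, {decision}")
```

With these assumed numbers the p-value falls between .01 and .05, so H0 is rejected at the .05 level only, which is the same pattern the Excel output shows.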

E. T-Tests
A t-test of a coefficient’s statistical significance is done the same way as in Chapter 14. If |t| > tα/2, reject the null that β=0 for that coefficient. Reproducing my Excel output reveals that the coefficient on Age is insignificant: you can’t reject the null that that coefficient is zero. You CAN reject the null for the Alcohol coefficient. It is statistically significant.
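The two-sided rule can be sketched as follows. The coefficient estimates and standard errors are hypothetical, since the Excel output is not reproduced here, and the critical value of about 2.00 is the usual table value for α = .05 with roughly 57 to 60 degrees of freedom.

```python
def t_statistic(coefficient, standard_error):
    """t = b / s_b for testing H0: beta = 0."""
    return coefficient / standard_error

def is_significant(t_stat, t_critical):
    """Two-sided rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical

T_CRITICAL = 2.00  # approximate t(.025) for about 57 d.f., read from a t table

# Hypothetical estimates (coefficient, standard error); not the lecture's numbers.
estimates = {"Alcohol": (0.90, 0.30), "Age": (-0.12, 0.20)}
results = {}
for name, (b, se) in estimates.items():
    t = t_statistic(b, se)
    results[name] = is_significant(t, T_CRITICAL)
    print(name, round(t, 2), "significant" if results[name] else "insignificant")
```

Taking the absolute value matters: a large negative t is just as much evidence against β=0 as a large positive one.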
