Published by Maurice Harrison. Modified over 9 years ago.
1
Regression Analysis
Relationship with one independent variable
2
Lecture Objectives
You should be able to interpret regression output. Specifically:
1. Interpret the significance of the relationship (Significance F).
2. Interpret the parameter estimates (write and use the model).
3. Compute and interpret R-square and the standard error (ANOVA table).
3
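Basic Equation
ŷ = b0 + b1 x
b0 is the y-intercept and b1 is the slope: b1 = Δy/Δx.
The straight line represents the linear relationship between y and x.
[Figure: straight line annotated with b0 (y-intercept), b1 = slope = Δy/Δx, and ε. Horizontal axis: independent variable (x). Vertical axis: dependent variable (y).]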
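As a small illustration of using the equation, the sketch below plugs in the intercept and slope reported later in the deck for the shoe-size example (the "Computing SSE" slide). The function name `predict` is only an illustrative choice, not something from the slides.

```python
# Coefficients as reported on the "Computing SSE" slide for the shoe-size data.
B0 = -1.17593   # y-intercept
B1 = 0.612037   # slope: change in predicted shoe size per year of age


def predict(x, b0=B0, b1=B1):
    """Return the fitted value y-hat = b0 + b1 * x."""
    return b0 + b1 * x


# Predicted shoe size for an 11-year-old and a 12-year-old.
y11 = predict(11)
y12 = predict(12)

# The slope is the change in y-hat per unit change in x.
slope_check = (y12 - y11) / (12 - 11)

print(f"predicted size at age 11: {y11:.4f}")
print(f"predicted size at age 12: {y12:.4f}")
print(f"delta y / delta x:        {slope_check:.6f}")
```

The two predictions should agree with the "Pred. Y" column of the SSE table for ages 11 and 12, and their difference recovers the slope.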
4
Understanding the Equation
What is the equation of this line?
5
Total Variation: Sum of Squares Total (SST)
What if there were no information on x (and hence no regression)? There would be only the y-axis, with green dots showing the y values. The best forecast for y would then simply be the mean of y, and the total error in the forecasts would be the total variation from the mean.
[Figure labels: Mean Y; variation from mean (total variation). Horizontal axis: independent variable (x). Vertical axis: dependent variable (y).]
6
Sum of Squares Total (SST) Computation
Shoe sizes for 13 children

Obs   Age (X)   Shoe Size (Y)   Deviation from Mean   Squared Deviation
  1     11           5.0             -2.7692                7.6686
  2     12           6.0             -1.7692                3.1302
  3     12           5.0             -2.7692                7.6686
  4     13           7.5             -0.2692                0.0725
  5     13           6.0             -1.7692                3.1302
  6     13           8.5              0.7308                0.5340
  7     14           8.0              0.2308                0.0533
  8     15          10.0              2.2308                4.9763
  9     15           7.0             -0.7692                0.5917
 10     17           8.0              0.2308                0.0533
 11     18          11.0              3.2308               10.4379
 12     18           8.0              0.2308                0.0533
 13     19          11.0              3.2308               10.4379
Mean                 7.769
Sum                                   0.000                48.8077  (Sum of Squared Deviations, SST)

In computing SST, the variable X is irrelevant. The computation gives the total squared deviation of y from its mean.
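The SST computation can be reproduced in a few lines. This is a minimal sketch in plain Python using the 13 shoe sizes from the table; the variable names are illustrative.

```python
# Shoe sizes (y) for the 13 children; age (x) is not needed for SST.
sizes = [5.0, 6.0, 5.0, 7.5, 6.0, 8.5, 8.0, 10.0, 7.0, 8.0, 11.0, 8.0, 11.0]

n = len(sizes)
mean_y = sum(sizes) / n

# Deviation of each observation from the mean of y.
deviations = [y - mean_y for y in sizes]

# SST is the sum of the squared deviations.
sst = sum(d ** 2 for d in deviations)

print(f"mean of y:         {mean_y:.3f}")
print(f"sum of deviations: {sum(deviations):.3f}")
print(f"SST:               {sst:.4f}")
```

The deviations sum to zero (up to floating-point rounding), which is why they are squared before being added.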
7
Error after Regression
Information about x gives us the regression model, which predicts y better than the mean of y alone. Thus some of the total variation in y is explained by x, leaving some unexplained residual error.
[Figure labels: Mean Y; total variation; explained by regression; residual error (unexplained). Horizontal axis: independent variable (x). Vertical axis: dependent variable (y).]
8
Computing SSE
Shoe sizes for 13 children

Obs   Age (X)   Shoe Size (Y)   Pred. Y    Residual (Error)   Squared Residual
  1     11           5.0         5.5565        -0.5565             0.3097
  2     12           6.0         6.1685        -0.1685             0.0284
  3     12           5.0         6.1685        -1.1685             1.3654
  4     13           7.5         6.7806         0.7194             0.5176
  5     13           6.0         6.7806        -0.7806             0.6093
  6     13           8.5         6.7806         1.7194             2.9565
  7     14           8.0         7.3926         0.6074             0.3689
  8     15          10.0         8.0046         1.9954             3.9815
  9     15           7.0         8.0046        -1.0046             1.0093
 10     17           8.0         9.2287        -1.2287             1.5097
 11     18          11.0         9.8407         1.1593             1.3439
 12     18           8.0         9.8407        -1.8407             3.3883
 13     19          11.0        10.4528         0.5472             0.2995
Sum                                             0.0000            17.6880  (Sum of Squares Error, SSE)

Prediction equation: Intercept (b0) = -1.17593, Slope (b1) = 0.612037
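The slide reports the coefficients without showing how they were obtained. The sketch below assumes the usual least-squares formulas (b1 = Sxy/Sxx, b0 = ȳ - b1·x̄), which are not stated in the deck, and then computes the residuals and SSE. The helper name `fit_line` is hypothetical.

```python
ages = [11, 12, 12, 13, 13, 13, 14, 15, 15, 17, 18, 18, 19]
sizes = [5.0, 6.0, 5.0, 7.5, 6.0, 8.5, 8.0, 10.0, 7.0, 8.0, 11.0, 8.0, 11.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_line(ages, sizes)

# Predicted values, residuals (actual minus predicted), and their squared sum.
predicted = [b0 + b1 * x for x in ages]
residuals = [y - p for y, p in zip(sizes, predicted)]
sse = sum(e ** 2 for e in residuals)

print(f"intercept b0: {b0:.5f}")
print(f"slope b1:     {b1:.6f}")
print(f"SSE:          {sse:.4f}")
```

As with the deviations from the mean, the residuals of a least-squares line with an intercept sum to zero, which matches the 0.0000 in the table.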
9
The Regression Sum of Squares
Some of the total variation in y is explained by the regression, while the residual is the error in prediction that remains even after regression.
Sum of squares total = sum of squares explained by regression + sum of squares of error left after regression:
SST = SSR + SSE, or equivalently SSR = SST - SSE
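One way to check the decomposition is to compute SSR directly, as the squared distance of each fitted value from the mean of y, and compare it with SST - SSE. The sketch below does this for the shoe-size data; computing SSR this way is an assumption of the example, since the slides only obtain it by subtraction.

```python
ages = [11, 12, 12, 13, 13, 13, 14, 15, 15, 17, 18, 18, 19]
sizes = [5.0, 6.0, 5.0, 7.5, 6.0, 8.5, 8.0, 10.0, 7.0, 8.0, 11.0, 8.0, 11.0]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(sizes) / n

# Least-squares slope and intercept.
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, sizes))
      / sum((x - mean_x) ** 2 for x in ages))
b0 = mean_y - b1 * mean_x
fitted = [b0 + b1 * x for x in ages]

sst = sum((y - mean_y) ** 2 for y in sizes)               # total variation
ssr = sum((f - mean_y) ** 2 for f in fitted)              # explained by regression
sse = sum((y - f) ** 2 for y, f in zip(sizes, fitted))    # left unexplained

print(f"SST       = {sst:.4f}")
print(f"SSR       = {ssr:.4f}")
print(f"SSE       = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

The identity holds exactly only for a least-squares fit that includes an intercept; for an arbitrary line the two sides generally differ.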
10
R-square
The proportion of variation in y that is explained by the regression model is called R².
R² = SSR/SST = (SST - SSE)/SST
For the shoe size example, R² = (48.8077 - 17.6880)/48.8077 = 0.6376.
R² ranges from 0 to 1, with 1 indicating a perfect linear relationship between x and y.
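The arithmetic above is easy to verify. The sketch below computes R² from the sums of squares on the slides and, as a cross-check not taken from the deck, compares it with the squared Pearson correlation between age and shoe size, which coincides with R² when there is a single independent variable.

```python
import math

# Sums of squares as reported on the slides.
SST = 48.8077
SSE = 17.6880

r2 = (SST - SSE) / SST

# Cross-check: squared Pearson correlation from the raw data.
ages = [11, 12, 12, 13, 13, 13, 14, 15, 15, 17, 18, 18, 19]
sizes = [5.0, 6.0, 5.0, 7.5, 6.0, 8.5, 8.0, 10.0, 7.0, 8.0, 11.0, 8.0, 11.0]
n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(sizes) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, sizes))
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in sizes)
r = sxy / math.sqrt(sxx * syy)

print(f"R^2 from sums of squares: {r2:.4f}")
print(f"squared correlation:      {r ** 2:.4f}")
```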
11
Mean Squared Error
MSR = SSR / df(regression)
MSE = SSE / df(error)
Here df is the degrees of freedom. For regression, df = k, the number of independent variables. For error, df = n - k - 1, where n is the number of observations.
The error degrees of freedom are the number of observations left to estimate the error once the k slopes and the intercept have been estimated from the sample.
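For the shoe-size example there are n = 13 observations and k = 1 independent variable. The sketch below applies the two formulas to the sums of squares reported on the slides; the function name `mean_squares` is illustrative.

```python
def mean_squares(ssr, sse, n, k):
    """Return (MSR, MSE) given the sums of squares, sample size n,
    and number of independent variables k."""
    df_regression = k
    df_error = n - k - 1
    return ssr / df_regression, sse / df_error


# SSR and SSE as reported on the slides for the shoe-size data.
msr, mse = mean_squares(ssr=31.1197, sse=17.6880, n=13, k=1)

print(f"MSR = {msr:.4f}")
print(f"MSE = {mse:.4f}")
```

With a single independent variable the regression has one degree of freedom, so MSR equals SSR.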
12
Standard Error
Standard Error (SE) = √MSE
The standard error measures the typical size of the prediction error, in the units of y: the smaller it is, the better the model predicts y. It can be used to construct a confidence interval for a prediction.
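The slides do not say how the interval is built. The sketch below assumes the standard 95% prediction interval for simple regression, ŷ ± t · SE · √(1 + 1/n + (x0 - x̄)²/Sxx), with the critical value t = 2.201 for 11 degrees of freedom taken from a t table. Both the formula and the example age of 16 are assumptions of this example, not content from the deck.

```python
import math

ages = [11, 12, 12, 13, 13, 13, 14, 15, 15, 17, 18, 18, 19]
sizes = [5.0, 6.0, 5.0, 7.5, 6.0, 8.5, 8.0, 10.0, 7.0, 8.0, 11.0, 8.0, 11.0]

n, k = len(ages), 1
mean_x = sum(ages) / n
mean_y = sum(sizes) / n
sxx = sum((x - mean_x) ** 2 for x in ages)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, sizes))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Standard error of the regression: square root of MSE.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ages, sizes))
se = math.sqrt(sse / (n - k - 1))

# Two-sided 95% critical value of the t distribution with 11 df (table value).
T_CRIT = 2.201


def prediction_interval(x0):
    """Return (low, y_hat, high): an approximate 95% interval for a new y at x0."""
    y_hat = b0 + b1 * x0
    margin = T_CRIT * se * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    return y_hat - margin, y_hat, y_hat + margin


low, y_hat, high = prediction_interval(16)  # hypothetical 16-year-old
print(f"SE = {se:.6f}")
print(f"predicted size at age 16: {y_hat:.2f} (95% interval {low:.2f} to {high:.2f})")
```

The interval is several shoe sizes wide, which reflects both the size of the standard error and the small sample.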
13
Summary Output & ANOVA

SUMMARY OUTPUT
Regression Statistics
Multiple R           0.798498
R Square             0.637599   (= SSR/SST = 31.1/48.8)
Adjusted R Square    0.604653
Standard Error       1.268068   (= √MSE = √1.608)
Observations         13

ANOVA
                    df            SS         MS         F         Significance F
Regression          1 (k)         31.1197    31.1197    19.3531   0.0011
Residual (Error)    11 (n-k-1)    17.6880     1.6080
Total               12 (n-1)      48.8077

F = MSR/MSE = 31.1/1.6. Significance F is the p-value for the regression.
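Most of the summary statistics follow from the three sums of squares plus n and k. The sketch below recomputes them from the values in the ANOVA table. The formulas for Multiple R (the square root of R²) and Adjusted R² (1 - (1 - R²)(n - 1)/(n - k - 1)) are the standard definitions and are not spelled out on the slides.

```python
import math

# Values from the ANOVA table.
n, k = 13, 1
ssr, sse, sst = 31.1197, 17.6880, 48.8077

r2 = ssr / sst
multiple_r = math.sqrt(r2)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse
se = math.sqrt(mse)

print(f"Multiple R        {multiple_r:.4f}")
print(f"R Square          {r2:.4f}")
print(f"Adjusted R Square {adj_r2:.4f}")
print(f"Standard Error    {se:.4f}")
print(f"F                 {f_stat:.2f}")
```

Because the inputs are rounded to four decimals, the results agree with the summary output to about four significant digits rather than exactly.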
14
The Hypothesis for Regression
H0: β1 = β2 = β3 = … = 0
Ha: At least one of the βs is not 0
If all βs are 0, y is not related to any of the x variables. The alternative hypothesis we try to support is therefore that there is in fact a relationship. The Significance F is the p-value for this test.
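To see where a Significance F value comes from, the sketch below computes the p-value for the shoe-size example (F = 19.3531 with 1 and 11 degrees of freedom). It relies on a shortcut that applies only when there is one independent variable: an F(1, df) statistic is the square of a t statistic with df degrees of freedom, so the p-value is the two-tailed t probability beyond √F. The numerical integration, the function names, and the 0.05 significance level are all assumptions of this example.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_test_p_value_one_predictor(f_stat, df_error, steps=2000):
    """P(F > f_stat) for an F(1, df_error) distribution.

    Uses P(F > f) = P(|T| > sqrt(f)) and integrates the t density
    from 0 to sqrt(f) with Simpson's rule (steps must be even).
    """
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_pdf(0.0, df_error) + t_pdf(upper, df_error)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df_error)
    # Integral over [0, upper], doubled by symmetry to cover [-upper, upper].
    central_area = 2 * total * h / 3
    return 1 - central_area


p_value = f_test_p_value_one_predictor(19.3531, 11)
alpha = 0.05  # hypothetical significance level, not from the slides
reject_h0 = p_value < alpha

print(f"Significance F (p-value): {p_value:.4f}")
print(f"Reject H0 at alpha = {alpha}: {reject_h0}")
```

With more than one independent variable the shortcut no longer applies, and the upper tail of the F(k, n-k-1) distribution has to be evaluated directly, typically with a statistics library.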