Chapter Eleven: Simple Linear Regression Analysis
McGraw-Hill/Irwin. Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
Simple Linear Regression
11.1 The Simple Linear Regression Model
11.2 The Least Squares Point Estimates
11.3 Model Assumptions, Mean Squared Error, Std. Error
11.4 Testing Significance of Slope and y-Intercept
11.5 Confidence Intervals and Prediction Intervals
11.6 The Coefficient of Determination and Correlation
11.7 An F Test for the Simple Linear Regression Model
*11.8 Checking Regression Assumptions by Residuals
*11.9 Some Shortcut Formulas
11.1 The Simple Linear Regression Model
$y = \mu_{y|x} + \varepsilon = \beta_0 + \beta_1 x + \varepsilon$
$\mu_{y|x} = \beta_0 + \beta_1 x$ is the mean value of the dependent variable y when the value of the independent variable is x.
$\beta_0$ is the y-intercept, the mean of y when x is 0.
$\beta_1$ is the slope, the change in the mean of y per unit change in x.
$\varepsilon$ is an error term that describes the effect on y of all factors other than x.
The Simple Linear Regression Model Illustrated
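11.2 The Least Squares Point Estimates
Estimation/Prediction Equation: $\hat{y} = b_0 + b_1 x$
Least squares point estimate of the slope $\beta_1$: $b_1 = SS_{xy} / SS_{xx}$, where $SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $SS_{xx} = \sum (x_i - \bar{x})^2$
Least squares point estimate of the y-intercept $\beta_0$: $b_0 = \bar{y} - b_1 \bar{x}$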
Example: The Least Squares Point Estimates
Slope $b_1$, y-intercept $b_0$, and prediction $\hat{y}$ at x = 40
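As a minimal sketch of the least squares calculation, the Python below computes the slope estimate as $SS_{xy}/SS_{xx}$ and the intercept estimate as $\bar{y} - b_1\bar{x}$, then predicts at x = 40. The function name `least_squares` and the eight (temperature, fuel consumption) pairs are assumptions made for illustration; these slides do not list the data.

```python
def least_squares(x, y):
    """Return (b0, b1): least squares point estimates of the intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the means
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

b0, b1 = least_squares(temps, fuel_cons)
y_hat_40 = b0 + b1 * 40  # point prediction at x = 40
print(f"b1 = {b1:.4f}, b0 = {b0:.2f}, prediction at x = 40: {y_hat_40:.3f}")
```

The deviation form of the sums is used here because it mirrors the definitions directly; it is not necessarily the form used on the original slide.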
11.3 The Regression Model Assumptions
Assumptions about the model error terms, the $\varepsilon$'s:
Mean Zero: The mean of the error terms is equal to 0.
Constant Variance: The variance of the error terms, $\sigma^2$, is the same for all values of x.
Normality: The error terms follow a normal distribution for all values of x.
Independence: The values of the error terms are statistically independent of each other.
Regression Model Assumptions Illustrated
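Mean Square Error and Standard Error
Sum of Squared Errors: $SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$
Mean Square Error, point estimate of residual variance $\sigma^2$: $s^2 = MSE = SSE / (n - 2)$
Standard Error, point estimate of residual standard deviation $\sigma$: $s = \sqrt{MSE} = \sqrt{SSE / (n - 2)}$
Example 11.6 The Fuel Consumption Case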
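The sum of squared errors, mean square error, and standard error can be sketched in a few lines of Python. This is an illustration under stated assumptions: the helper name `standard_error` is hypothetical, and the eight data pairs are assumed values, since the slides do not reproduce the data.

```python
import math


def standard_error(x, y):
    """Return (SSE, MSE, s) for the least squares line fitted to x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    # SSE: squared differences between observed and predicted y
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)  # n - 2 degrees of freedom: two estimated parameters
    return sse, mse, math.sqrt(mse)


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

sse, mse, s = standard_error(temps, fuel_cons)
print(f"SSE = {sse:.3f}, MSE = {mse:.3f}, s = {s:.4f}")
```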
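11.4 Significance Test and Estimation for Slope
If the regression assumptions hold, we can reject $H_0: \beta_1 = 0$ at the $\alpha$ level of significance (probability of Type I error equal to $\alpha$) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than $\alpha$.
Test Statistic: $t = b_1 / s_{b_1}$, where $s_{b_1} = s / \sqrt{SS_{xx}}$, s is the standard error, and $SS_{xx} = \sum (x_i - \bar{x})^2$
Alternative $H_a: \beta_1 \neq 0$: reject $H_0$ if $|t| > t_{\alpha/2}$; the p-value is twice the area under the t curve to the right of $|t|$
Alternative $H_a: \beta_1 > 0$: reject $H_0$ if $t > t_\alpha$; the p-value is the area under the t curve to the right of t
Alternative $H_a: \beta_1 < 0$: reject $H_0$ if $t < -t_\alpha$; the p-value is the area under the t curve to the left of t
$100(1 - \alpha)\%$ Confidence Interval for $\beta_1$: $[b_1 \pm t_{\alpha/2}\, s_{b_1}]$
$t_\alpha$, $t_{\alpha/2}$ and p-values are based on n - 2 degrees of freedom.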
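Significance Test and Estimation for y-Intercept
If the regression assumptions hold, we can reject $H_0: \beta_0 = 0$ at the $\alpha$ level of significance (probability of Type I error equal to $\alpha$) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than $\alpha$.
Test Statistic: $t = b_0 / s_{b_0}$, where $s_{b_0} = s \sqrt{1/n + \bar{x}^2 / SS_{xx}}$, s is the standard error, and $SS_{xx} = \sum (x_i - \bar{x})^2$
Alternative $H_a: \beta_0 \neq 0$: reject $H_0$ if $|t| > t_{\alpha/2}$; the p-value is twice the area under the t curve to the right of $|t|$
Alternative $H_a: \beta_0 > 0$: reject $H_0$ if $t > t_\alpha$; the p-value is the area under the t curve to the right of t
Alternative $H_a: \beta_0 < 0$: reject $H_0$ if $t < -t_\alpha$; the p-value is the area under the t curve to the left of t
$100(1 - \alpha)\%$ Confidence Interval for $\beta_0$: $[b_0 \pm t_{\alpha/2}\, s_{b_0}]$
$t_\alpha$, $t_{\alpha/2}$ and p-values are based on n - 2 degrees of freedom.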
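The two-sided t tests and confidence intervals for the slope and the y-intercept can be sketched as follows. Everything named here is an assumption for illustration: the function `coefficient_tests`, the eight data pairs (the slides do not list the data), and the rejection point 2.447, which is the t-table value for $\alpha/2 = 0.025$ with n - 2 = 6 degrees of freedom. The rejection point is passed in as a parameter so the sketch needs only the standard library.

```python
import math


def coefficient_tests(x, y, t_crit):
    """Two-sided t tests and confidence intervals for slope and intercept.

    t_crit is the rejection point t_{alpha/2} with n - 2 degrees of freedom,
    looked up from a t table by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    # Standard errors of the two point estimates
    s_b1 = s / math.sqrt(ss_xx)
    s_b0 = s * math.sqrt(1 / n + x_bar ** 2 / ss_xx)

    results = {}
    for name, est, se in (("slope", b1, s_b1), ("intercept", b0, s_b0)):
        t = est / se
        results[name] = {
            "estimate": est,
            "std_error": se,
            "t": t,
            "reject_H0": abs(t) > t_crit,
            "conf_int": (est - t_crit * se, est + t_crit * se),
        }
    return results


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

tests = coefficient_tests(temps, fuel_cons, t_crit=2.447)
for name, r in tests.items():
    lo, hi = r["conf_int"]
    print(f"{name}: t = {r['t']:.2f}, reject H0: {r['reject_H0']}, "
          f"95% CI = [{lo:.4f}, {hi:.4f}]")
```

A one-sided test would compare t with $t_\alpha$ or $-t_\alpha$ instead; only the two-sided case is shown here.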
Example: Inferences About Slope and y-Intercept Tests Intervals Example 11.7 The Fuel Consumption Case Excel Output
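11.5 Confidence and Prediction Intervals
Prediction ($x = x_0$): $\hat{y} = b_0 + b_1 x_0$
Distance Value: $\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}$, where $SS_{xx} = \sum (x_i - \bar{x})^2$
If the regression assumptions hold,
$100(1 - \alpha)\%$ confidence interval for the mean value of y, $\mu_{y|x_0}$: $[\hat{y} \pm t_{\alpha/2}\, s \sqrt{\text{Distance Value}}]$
$100(1 - \alpha)\%$ prediction interval for an individual value of y: $[\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \text{Distance Value}}]$
$t_{\alpha/2}$ is based on n - 2 degrees of freedom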
Example: Confidence and Prediction Intervals
Example 11.7 The Fuel Consumption Case
Minitab Output (predicted FuelCons when Temp, x = 40)
Predicted Values
   Fit   StDev Fit          95.0% CI              95.0% PI
10.721       0.241   ( 10.130, 11.312)   (  9.014, 12.428)
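The interval calculation behind output of this kind can be sketched in Python. This is a sketch under assumptions, not the Minitab procedure itself: the function `interval_estimates` is a hypothetical name, the eight data pairs are assumed values (the slides do not list the data), and 2.447 is the t-table rejection point for $\alpha/2 = 0.025$ with 6 degrees of freedom. With these assumed values the fit at x = 40 comes out near 10.721 and the intervals land close to those in the Minitab output above.

```python
import math


def interval_estimates(x, y, x0, t_crit):
    """Return (fit, stdev_fit, confidence_interval, prediction_interval) at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    fit = b0 + b1 * x0
    # Distance value grows as x0 moves away from the mean of the observed x
    distance = 1 / n + (x0 - x_bar) ** 2 / ss_xx
    stdev_fit = s * math.sqrt(distance)
    ci_half = t_crit * stdev_fit
    pi_half = t_crit * s * math.sqrt(1 + distance)
    return (fit, stdev_fit,
            (fit - ci_half, fit + ci_half),
            (fit - pi_half, fit + pi_half))


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

fit, stdev_fit, ci, pi = interval_estimates(temps, fuel_cons, x0=40, t_crit=2.447)
print(f"Fit = {fit:.3f}, StDev Fit = {stdev_fit:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), 95% PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is always wider than the confidence interval because of the extra 1 under the square root, which accounts for the variation of an individual y around its mean.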
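11.6 The Simple Coefficient of Determination
The simple coefficient of determination is $r^2 = \text{Explained variation} / \text{Total variation}$, where Total variation $= \sum (y_i - \bar{y})^2$ and Explained variation $= \sum (\hat{y}_i - \bar{y})^2$
$r^2$ is the proportion of the total variation in y explained by the simple linear regression model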
The Simple Correlation Coefficient
The simple correlation coefficient measures the strength of the linear relationship between y and x and is denoted by r.
$r = +\sqrt{r^2}$ if $b_1$ is positive, and $r = -\sqrt{r^2}$ if $b_1$ is negative,
where $b_1$ is the slope of the least squares line.
Example 11.15 Fuel Consumption Excel Output
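A short Python sketch of the coefficient of determination and the correlation coefficient follows, with r taking the sign of the least squares slope. The function name `determination_and_correlation` and the eight data pairs are assumptions for illustration; the slides do not list the data.

```python
import math


def determination_and_correlation(x, y):
    """Return (r_squared, r), where r carries the sign of the slope b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar

    total = sum((yi - y_bar) ** 2 for yi in y)                  # total variation
    explained = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)    # explained variation
    r_squared = explained / total
    r = math.copysign(math.sqrt(r_squared), b1)
    return r_squared, r


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

r_squared, r = determination_and_correlation(temps, fuel_cons)
print(f"r^2 = {r_squared:.3f}, r = {r:.3f}")
```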
Different Values of the Correlation Coefficient
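11.7 F Test for Simple Linear Regression Model
To test $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \neq 0$ at the $\alpha$ level of significance
Test Statistic: $F(\text{model}) = \frac{\text{Explained variation}}{\text{Unexplained variation} / (n - 2)}$
Reject $H_0$ if $F(\text{model}) > F_\alpha$ or p-value $< \alpha$
$F_\alpha$ is based on 1 numerator and n - 2 denominator degrees of freedom.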
Example: F Test for Simple Linear Regression
Example 11.17 The Fuel Consumption Case Excel Output
F-test at $\alpha = 0.05$ level of significance
Test Statistic: F(model)
Reject $H_0$ at the $\alpha$ level of significance, since F(model) $> F_\alpha$
$F_\alpha$ is based on 1 numerator and 6 denominator degrees of freedom.
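The F statistic for the simple linear regression model is the explained variation divided by the unexplained variation over n - 2. A minimal Python sketch is given below. The function name `f_test`, the eight data pairs, and the rejection point 5.99 (the F-table value for $\alpha = 0.05$ with 1 and 6 degrees of freedom) are assumptions supplied for illustration; the slides do not list the data.

```python
def f_test(x, y, f_crit):
    """Return (F(model), reject_H0) for H0: beta_1 = 0 versus Ha: beta_1 != 0.

    f_crit is the rejection point F_alpha with 1 and n - 2 degrees of freedom,
    looked up from an F table by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar

    explained = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
    unexplained = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    f_model = explained / (unexplained / (n - 2))
    return f_model, f_model > f_crit


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

f_model, reject = f_test(temps, fuel_cons, f_crit=5.99)
print(f"F(model) = {f_model:.2f}, reject H0: {reject}")
```

In simple linear regression F(model) equals the square of the t statistic for the slope, so this test and the two-sided t test for $\beta_1$ always reach the same conclusion.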
*11.8 Checking the Regression Assumptions by Residual Analysis
For an observed value of y, the residual is $e = y - \hat{y}$ (observed value minus predicted value), where the predicted value of y is calculated as $\hat{y} = b_0 + b_1 x$.
If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance $\sigma^2$.
Residual Plots
  Residuals versus independent variables
  Residuals versus predicted y's
  Residuals in time order (if the response is a time series)
  Histogram of residuals
  Normal plot of the residuals
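The residuals that feed these plots can be computed with a short Python sketch. The function name `residual_table` and the eight data pairs are assumptions for illustration (the slides do not list the data), and the plotting itself is left to whatever tool is at hand.

```python
def residual_table(x, y):
    """Return a list of (x_i, predicted y_i, residual e_i) rows."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    rows = []
    for xi, yi in zip(x, y):
        y_hat = b0 + b1 * xi
        rows.append((xi, y_hat, yi - y_hat))  # residual = observed - predicted
    return rows


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

rows = residual_table(temps, fuel_cons)
for xi, y_hat, e in rows:
    print(f"x = {xi:5.1f}  fitted = {y_hat:7.3f}  residual = {e:7.3f}")

# Least squares residuals sum to zero, up to floating-point rounding
residual_sum = sum(e for _, _, e in rows)
```

Plotting the third column against the first or second column gives the residuals-versus-x and residuals-versus-fit plots; keeping the rows in observation order gives the time-order plot.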
Checking the Constant Variance Assumption Example 11.18: The QHIC Case Plot: Residual versus x and predicted responses
Checking the Normality Assumption Example 11.18: The QHIC Case Plots: Histogram and Normal Plot of Residuals
Checking the Independence Assumption Plots: Residuals versus Fits (to check for functional form, not shown) Residuals versus Time Order
Combination Residual Plots Example 11.18: The QHIC Case Minitab Output Plots: Histogram and Normal Plot of Residuals, Residuals versus Order (I Chart), Residuals versus Fit.
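*11.9 Some Shortcut Formulas
Total variation $= SS_{yy}$
Explained variation $= SS_{xy}^2 / SS_{xx}$
Unexplained variation $= SS_{yy} - SS_{xy}^2 / SS_{xx}$
where
$SS_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$
$SS_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$
$SS_{yy} = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$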
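Shortcut formulas of this kind replace sums of deviations with raw sums, for example $SS_{xx} = \sum x_i^2 - (\sum x_i)^2/n$. The Python sketch below computes $SS_{xy}$, $SS_{xx}$, and $SS_{yy}$ both ways and derives the explained and unexplained variation from the shortcut values. The function names and the eight data pairs are assumptions for illustration; the slides do not list the data.

```python
def shortcut_sums(x, y):
    """Return (SS_xy, SS_xx, SS_yy) computed from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum_y ** 2 / n
    return ss_xy, ss_xx, ss_yy


def deviation_sums(x, y):
    """Return (SS_xy, SS_xx, SS_yy) computed from deviations about the means."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy, ss_xx, ss_yy


# Assumed illustrative data (not given in the slides)
temps = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
fuel_cons = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]

ss_xy, ss_xx, ss_yy = shortcut_sums(temps, fuel_cons)
explained = ss_xy ** 2 / ss_xx        # explained variation
unexplained = ss_yy - explained       # unexplained variation (SSE)
print(f"SS_xy = {ss_xy:.4f}, SS_xx = {ss_xx:.3f}, SS_yy = {ss_yy:.5f}")
print(f"explained = {explained:.4f}, unexplained = {unexplained:.4f}")
```

The two forms agree algebraically; the raw-sum form is convenient for hand calculation, while the deviation form is usually less sensitive to rounding when the values are large.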
Simple Linear Regression Summary:
11.1 The Simple Linear Regression Model
11.2 The Least Squares Point Estimates
11.3 Model Assumptions, Mean Squared Error, Std. Error
11.4 Testing Significance of Slope and y-Intercept
11.5 Confidence Intervals and Prediction Intervals
11.6 The Coefficient of Determination and Correlation
11.7 An F Test for the Simple Linear Regression Model
*11.8 Checking Regression Assumptions by Residuals
*11.9 Some Shortcut Formulas