Published by Berniece Conley. Modified over 6 years ago.
Lecture 18 Outline:
1. Role of Variables in a Regression Equation
2. Effects of Additional Predictors
11/10/2018 ST3131, Lecture 18
Role of Variables in a Regression Equation
Question: Given a regression that currently has q predictor variables, should we delete a variable from the model? Should we add a variable to the model?
Answer: Compute the t-test for each variable in the model.
(1) If the t-test is large in absolute value, the variable is retained.
(2) If the t-test is small in absolute value, the variable is omitted.
The answer is valid only when the underlying model assumptions are valid. Thus, the t-test should be used together with appropriate graphs of the data. The added-variable plot and the residual plus component plot are two plots that give this information visually and are often very illuminating.
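As a rough illustration of the t-test rule, the sketch below computes the t-ratio (coefficient divided by its standard error) for each variable in an ordinary least-squares fit. The function name `t_statistics` and the synthetic data are assumptions made for this example; they are not from the lecture, and no cutoff for "large" or "small" is implied.

```python
import numpy as np

def t_statistics(X, y):
    """Return (coefficients, standard errors, t-ratios) for OLS with an intercept.

    X is an (n, q) array of predictors; a column of ones is prepended here.
    """
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    df = n - A.shape[1]                           # residual degrees of freedom
    sigma2 = resid @ resid / df                   # estimate of the error variance
    cov = sigma2 * np.linalg.inv(A.T @ A)         # covariance matrix of beta
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se

# Synthetic illustration (hypothetical data): x1 drives y, x2 is pure noise.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 + rng.normal(size=n)

beta, se, t = t_statistics(np.column_stack([x1, x2]), y)
for name, b, s, tv in zip(["Constant", "x1", "x2"], beta, se, t):
    print(f"{name:8s} coef={b: .3f}  se={s:.3f}  t={tv: .2f}")
```

In this setup the t-ratio for x1 should be far from zero, while the one for x2 should be comparatively small, which is the pattern the rule looks for.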
Added-Variable Plot/Partial Regression Plot:
Plot the Y-residuals against the Xj-residuals.
Xj-residuals: residuals after Xj is linearly adjusted for the other predictors (the part of Xj not linearly explained by the other predictors).
Y-residuals: residuals after Y is linearly adjusted for all predictors except Xj (the part of Y not linearly explained by the predictors other than Xj).
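The two sets of residuals can be computed directly, as in the sketch below. The helper names (`ols_residuals`, `added_variable_residuals`) and the synthetic data are assumptions for illustration. The check at the end relies on a standard least-squares fact: the slope of the Y-residuals on the Xj-residuals equals the coefficient of Xj in the full regression.

```python
import numpy as np

def ols_residuals(A, v):
    """Residuals of v after least-squares regression on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def added_variable_residuals(X, y, j):
    """Return (Xj-residuals, Y-residuals) for predictor column j of X."""
    n = X.shape[0]
    # Intercept plus every predictor except column j
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    xj_res = ols_residuals(others, X[:, j])  # part of Xj not explained by the others
    y_res = ols_residuals(others, y)         # part of Y not explained by the others
    return xj_res, y_res

# Synthetic, correlated predictors (hypothetical data, for illustration only).
rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([x1, x2])

xj_res, y_res = added_variable_residuals(X, y, j=1)
av_slope = (xj_res @ y_res) / (xj_res @ xj_res)   # slope through the origin

full = np.column_stack([np.ones(n), X])
beta_full, *_ = np.linalg.lstsq(full, y, rcond=None)
print(av_slope, beta_full[2])   # the two values agree up to rounding
```

A scatter plot of `y_res` against `xj_res` (for example with matplotlib) would then be the added-variable plot for that predictor.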
Regression Analysis: Time versus Distance, Climb
The Scottish Hills Races Data (Y: record time in seconds; X1: distance in miles; X2: climb in feet; n = 35, p = 2)
Regression of Time on Distance and Climb
Coefficient table (columns Coef, SE Coef, T, P): rows Constant, Distance, Climb
R-Sq = 91.9%   R-Sq(adj) = 91.4%
Analysis of Variance (columns DF, SS, MS, F, P): rows Regression, Residual Error, Total
Sequential SS: rows Distance, Climb
Unusual Observations (columns Obs, Distance, Time, Fit, SE Fit, Residual, St Resid): three observations, flagged RX, X, and R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
Advantages (added-variable plot; points labeled 7, 11, and 18 in the figure)
(1) Indicate
(2) Point out
(3) Easy to interpret
Residual Plus Component Plot/Partial Residual Plot:
Plot $e + \hat{\beta}_j X_j$ against $X_j$, where
$e$ is the residual vector of Y regressed on all predictor variables,
$\hat{\beta}_j$ is the coefficient of the j-th predictor variable, and
$\hat{\beta}_j X_j$ is the contribution of $X_j$ to the fitted value.
Advantages
(1) Indicate
(2) Indicate if non-linearity between Y and Xj is present (suggesting a possible transformation for linearizing the data)
Remark: The indication of non-linearity is not present in the added-variable plot, since its x-axis is not the variable itself.
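A minimal sketch of the quantity on the vertical axis of this plot, the full-model residuals plus the Xj component, is given below. The function name `partial_residuals` and the synthetic data (where Y depends on x1 through a logarithm) are assumptions chosen for illustration. The final check uses the fact that the full-model residuals are orthogonal to Xj and to the constant, so a straight line fitted to the plot has slope equal to the Xj coefficient and intercept zero.

```python
import numpy as np

def partial_residuals(X, y, j):
    """Return (e + b_j * Xj, b_j), where e and b_j come from the full regression."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ beta                      # residuals from the full model
    b_j = beta[j + 1]                     # +1 skips the intercept
    return e + b_j * X[:, j], b_j

# Synthetic data (hypothetical): Y depends on x1 through a curve, not a line.
rng = np.random.default_rng(2)
n = 60
x1 = rng.uniform(1.0, 5.0, size=n)
x2 = rng.normal(size=n)
y = 3.0 * np.log(x1) + 0.8 * x2 + rng.normal(scale=0.2, size=n)
X = np.column_stack([x1, x2])

pr, b1 = partial_residuals(X, y, j=0)

# A straight-line fit of the partial residuals on Xj recovers b_j.
line = np.polyfit(X[:, 0], pr, deg=1)     # [slope, intercept]
print(line[0], b1)
```

Because the horizontal axis is x1 itself, a scatter plot of `pr` against `X[:, 0]` would be expected to bend around the fitted line in this example, hinting that a transformation of x1 may be worth trying.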
Results for: P010.txt (New York Rivers Data)
Regression Analysis: Nitrogen versus Agr, Forest, Rsdntial, ComIndl
Coefficient table (columns Coef, SE Coef, T, P): rows Constant, Agr, Forest, Rsdntial, ComIndl
R-Sq = 70.9%   R-Sq(adj) = 63.2%
Analysis of Variance (columns DF, SS, MS, F, P): rows Regression, Residual Error, Total
Unusual Observations (columns Obs, Agr, Nitrogen, Fit, SE Fit, Residual, St Resid): four observations, flagged RX, RX, R, and R
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
Effects of Additional Predictors
Questions:
(1) Is the regression coefficient of the new variable significant?
(2) Does the introduction of the new variable substantially change the regression coefficients of the variables already in the equation?
When a new variable is introduced into a regression equation, four possibilities result, depending on the answers to the above questions:
======================================================================
Case   New coefficient test   Previous coefficients          Action
======================================================================
1      Insignificant          Do not change substantially
2      Significant            Change substantially
3      Significant            Do not change substantially
4      Insignificant          Change substantially
======================================================================
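The two questions can be checked numerically by fitting the model with and without the candidate variable. The sketch below is one way to do that; the function names (`fit`, `effect_of_adding`), the synthetic data, and the use of relative change as the measure of "change" are assumptions for this example, and what counts as a substantial change is left to judgment.

```python
import numpy as np

def fit(X, y):
    """OLS with an intercept; returns (coefficients, t-ratios)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return beta, beta / se

def effect_of_adding(X_old, x_new, y):
    """Return (t-ratio of the new variable, relative change in the old slopes)."""
    beta_old, _ = fit(X_old, y)
    beta_new, t_new = fit(np.column_stack([X_old, x_new]), y)
    k = X_old.shape[1]
    rel_change = (beta_new[1:k + 1] - beta_old[1:]) / np.abs(beta_old[1:])
    return t_new[-1], rel_change

# Synthetic data (hypothetical): one old predictor and two candidate variables.
rng = np.random.default_rng(3)
n = 80
x1 = rng.normal(size=n)
A1 = np.column_stack([np.ones(n), x1])
z_raw = rng.normal(size=n)
# Candidate made exactly uncorrelated with x1 by removing its projection on [1, x1]
z_uncorr = z_raw - A1 @ np.linalg.lstsq(A1, z_raw, rcond=None)[0]
# Candidate that is strongly correlated with x1
z_collinear = x1 + 0.3 * rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.5 * z_uncorr + rng.normal(size=n)

X_old = x1[:, None]
t_u, chg_u = effect_of_adding(X_old, z_uncorr, y)
t_c, chg_c = effect_of_adding(X_old, z_collinear, y)
print("uncorrelated candidate: t =", t_u, " relative change =", chg_u)
print("collinear candidate:    t =", t_c, " relative change =", chg_c)
```

The uncorrelated candidate leaves the old slope unchanged (up to rounding) while having a large t-ratio, which corresponds to the third row of the table. The collinear candidate's results depend on the random draw and are printed only for comparison.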
Special Situations
Case 1: the added variable has some special meaning.
Case 2: corrective action should be taken in case of collinearity.
Case 3: the ideal situation, which occurs when the new variable is uncorrelated with the old ones.
Case 4: collinearity exists and corrective action should be taken. See Chapter 11 for more details.
Regression Analysis: Nitrogen versus Forest
Coefficient table (columns Coef, SE Coef, T, P): rows Constant, Forest

Regression Analysis: Nitrogen versus Forest, Agr
Coefficient table (columns Coef, SE Coef, T, P): rows Constant, Forest, Agr
Regression Analysis: Nitrogen versus Forest, Rsdntial
Coefficient table (columns Coef, SE Coef, T, P): rows Constant, Forest, Rsdntial

Regression Analysis: Nitrogen versus Forest, ComIndl
Coefficient table (columns Coef, SE Coef, T, P): rows Constant, Forest, ComIndl