9/19/2018 ST3131, Lecture 6
Chapter 3 Multiple Linear Regression
Lecture 6 outline:
  Review of Lecture 5
  Chapter 3 Multiple Linear Regression
    Motivation Example
    MLR Model
    Estimation of the MLR Model
    Methods for Assessment of Linearity
Review of Lecture 5: Special SLR Models
No-Intercept Model: the SLR model with $\beta_0 = 0$, that is, $Y = \beta_1 X + \varepsilon$.
Review of Lecture 5: Special SLR Models
No-Slope Model: the SLR model with $\beta_1 = 0$, that is, $Y = \beta_0 + \varepsilon$.
Review of Lecture 5: Special SLR Models
Trivial Regression Model
One-sample t-test: tests the trivial model against the no-slope model.
Review of Lecture 5: Special SLR Models
Paired two-sample t-test
Transformation: replace each pair of observations by its difference.
The paired two-sample t-test then becomes a one-sample t-test on the differences.
Chapter 3 Multiple Linear Regression
Motivation Example: Supervisor Performance Data
In a large financial organization there are 30 departments, each with a supervisor and 35 employees. To study supervisor performance, the employees in each department are given a questionnaire with the following items:
(1 response variable)
  Y:  Overall rating of job being done
(6 predictor variables)
  X1: Handles employee complaints
  X2: Does not allow special privileges
  X3: Opportunity to learn new things
  X4: Raises based on performance
  X5: Too critical of poor performance
  X6: Rate of advancing to better jobs
Motivation Example (cont.)
For each item, the employees are required to choose a number from
  1          2          3          4          5
  ----------------------------------------------------------------
  very satisfactory                          very unsatisfactory
To evaluate the supervisor on an item, the proportion of “favorable” responses among the 35 answers in a department is taken as that department's observation for the item.
Motivation Example (cont.)
Part of the Supervisor Performance Data

   Y   X1   X2   X3   X4   X5   X6
  ================================
  43   51   30   39   61   92   45
  63   64   51   54   63   73   47
  71   70   68   69   76   86   48
  61   63   45   47   54   84   35
  …

Each row: observations for one department on all items.
Each column: observations for all departments on one item.
Examples: X3 = (39, 54, 69, 47, …)', Y = (43, 63, 71, 61, 81, …)'
X = (X1, X2, X3, X4, X5, X6) is called a design matrix.
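The row and column reading of the data can be sketched in Python. This is only an illustration: the helper name `column` and the list-of-lists layout are assumptions, and only the four departments listed above are included.

```python
# The four departments shown in the excerpt (columns: Y, X1, ..., X6).
rows = [
    [43, 51, 30, 39, 61, 92, 45],
    [63, 64, 51, 54, 63, 73, 47],
    [71, 70, 68, 69, 76, 86, 48],
    [61, 63, 45, 47, 54, 84, 35],
]
names = ["Y", "X1", "X2", "X3", "X4", "X5", "X6"]


def column(name):
    """Return one item's observations across the listed departments."""
    j = names.index(name)
    return [row[j] for row in rows]


y = column("Y")                      # response vector
design = [row[1:] for row in rows]   # predictor columns only: X = (X1, ..., X6)

print("X3 =", column("X3"))
print("Y  =", y)
```

Each entry of `design` is one department (a row of X), while `column("X3")` walks down a single column.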
MLR Models
SLR is not good enough for many practical cases in which more than one predictor variable is involved in predicting the response variable. We need the Multiple Linear Regression (MLR) model.
Multiple Linear Regression Model: 1 response variable Y, p predictor variables X1, X2, …, Xp
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$$
MLR generalizes SLR.
Remark: SLR is a special case of MLR. When p = 1, MLR reduces to SLR, $Y = \beta_0 + \beta_1 X_1 + \varepsilon$.
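Estimation of the MLR Model
Least Squares Method: choose $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_p$ to minimize the sum of squared errors
$$S(\beta_0, \beta_1, \ldots, \beta_p) = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip}\right)^2$$
Solution:
$$\hat\beta = (\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_p)' = (Z'Z)^{-1} Z'Y$$
where Z = (1, X1, …, Xp) is the n × (p+1) matrix whose first column is all ones and whose remaining columns are the predictor variables, and Y = (y_1, …, y_n)'.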
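A minimal Python sketch of least squares through the normal equations may help make the method concrete. It assumes an intercept column of ones is prepended to the predictors, and the function names `solve` and `least_squares` are illustrative, not part of the lecture. Because only four departments are listed in the data excerpt, the demonstration fits Y on X1 alone (the p = 1 case), so its coefficients are not the full-data estimates.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(predictors, y):
    """Return [b0, b1, ..., bp] minimizing the sum of squared errors."""
    z = [[1.0] + [float(v) for v in row] for row in predictors]  # prepend intercept column
    k = len(z[0])
    ztz = [[sum(r[i] * r[j] for r in z) for j in range(k)] for i in range(k)]
    zty = [sum(r[i] * yi for r, yi in zip(z, y)) for i in range(k)]
    return solve(ztz, zty)   # normal equations: (Z'Z) b = Z'y


# First four departments from the data excerpt, single predictor X1.
y = [43, 63, 71, 61]
x1 = [[51], [64], [70], [63]]

beta = least_squares(x1, y)
fitted = [beta[0] + beta[1] * row[0] for row in x1]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
print("coefficients:", beta)
```

With the full 30-department data set and all six predictors, the same `least_squares` call would take six-element rows instead of one-element rows.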
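Fitted Values and Squares Decomposition
Fitted regression equation: $\hat Y = \hat\beta_0 + \hat\beta_1 X_1 + \cdots + \hat\beta_p X_p$
Fitted values: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \cdots + \hat\beta_p x_{ip}$, i = 1, …, n
Residuals: $e_i = y_i - \hat y_i$
Noise variance estimator: $\hat\sigma^2 = SSE / (n - p - 1)$, where $SSE = \sum_{i=1}^{n} e_i^2$ is the Sum of Squared Errors and n-(p+1) is the degrees of freedom of SSE
Observation decomposition: $y_i - \bar y = (\hat y_i - \bar y) + (y_i - \hat y_i)$
Squares decomposition: SST = SSR + SSE, with $SST = \sum_{i=1}^{n} (y_i - \bar y)^2$ and $SSR = \sum_{i=1}^{n} (\hat y_i - \bar y)^2$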
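The squares decomposition can be checked numerically. The sketch below is an illustration under stated assumptions: it uses only the four departments listed in the data excerpt and a single predictor X1 (p = 1), fitted with the closed-form SLR estimates, so the numbers are not those of the full analysis.

```python
# Four listed departments: response Y and predictor X1.
y = [43, 63, 71, 61]
x = [51, 64, 70, 63]
n, p = len(y), 1

x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form least squares estimates for one predictor.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

sst = sum((yi - y_bar) ** 2 for yi in y)       # total sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in fitted)  # regression sum of squares
sse = sum(e ** 2 for e in residuals)           # sum of squared errors

sigma2_hat = sse / (n - (p + 1))               # noise variance estimator

print("SST:", sst, " SSR + SSE:", ssr + sse)
print("noise variance estimate:", sigma2_hat)
```

The two printed sums agree up to floating-point rounding, which is the identity SST = SSR + SSE.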
Methods of Assessment of Linearity / Quality of Linear Fit
A. Coefficient of Determination
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
the proportion of total variability explained by the linear regression.
B. Correlation between responses and fitted values
$$R = \mathrm{Cor}(Y, \hat Y)$$
R is called the multiple correlation coefficient between Y and the predictor variables X1, X2, …, Xp; it measures the linear relationship between Y and X1, X2, …, Xp.
Methods of Assessment of Linearity / Quality of Linear Fit (cont.)
That is, $R^2 = [\mathrm{Cor}(Y, \hat Y)]^2$.
Remark:
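The link between the coefficient of determination and the correlation between responses and fitted values can be illustrated numerically. The sketch assumes the four listed departments and the single predictor X1, so it is the p = 1 special case; the helper name `corr` is hypothetical.

```python
import math


def corr(u, v):
    """Pearson correlation between two equal-length sequences."""
    u_bar, v_bar = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


y = [43, 63, 71, 61]
x1 = [51, 64, 70, 63]
n = len(y)
x_bar, y_bar = sum(x1) / n, sum(y) / n

# Least squares fit of Y on X1 (closed form for one predictor).
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x1, y)) / sum(
    (xi - x_bar) ** 2 for xi in x1
)
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x1]

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

r_squared = 1 - sse / sst        # A. coefficient of determination
r_multiple = corr(y, fitted)     # B. correlation between Y and fitted values

print("R^2:", r_squared, " Cor(Y, fitted)^2:", r_multiple ** 2)
```

With one predictor, `r_multiple` also equals the absolute value of the simple correlation between Y and X1; with several predictors only the identity with the fitted values carries over.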
Analysis of the Supervisor Performance Data
Results for: P054.txt
Correlations: Y, X1, X2, X3, X4, X5, X6

          Y      X1      X2      X3      X4      X5
X1    0.825
      0.000

X2    0.426   0.558
      0.019   0.001

X3    0.624   0.597   0.493
      0.000   0.001   0.006

X4    0.590   0.669   0.445   0.640
      0.001   0.000   0.014   0.000

X5    0.156   0.188   0.147   0.116   0.377
      0.409   0.321   0.438   0.542   0.040

X6    0.155   0.225   0.343   0.532   0.574   0.283
      0.413   0.233   0.063   0.003   0.001   0.129

Cell Contents: Pearson correlation
               P-Value
Regression Analysis: Y versus X1, X2, X3, X4, X5, X6

The regression equation is
Y = 10.8 + 0.613 X1 - 0.073 X2 + 0.320 X3 + 0.082 X4 + 0.038 X5 - 0.217 X6

Predictor      Coef   SE Coef       T       P
Constant      10.79     11.59    0.93   0.362
X1           0.6132    0.1610    3.81   0.001
X2          -0.0731    0.1357   -0.54   0.596
X3           0.3203    0.1685    1.90   0.070
X4           0.0817    0.2215    0.37   0.715
X5           0.0384    0.1470    0.26   0.796
X6          -0.2171    0.1782   -1.22   0.236

S = 7.068   R-Sq = 73.3%   R-Sq(adj) = 66.3%
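Each entry in the T column of the coefficient table is the coefficient divided by its standard error. A short Python check, using only the numbers reported in the output (the dictionary layout is an illustrative choice), reproduces the column up to rounding:

```python
# (Coef, SE Coef, reported T) copied from the regression output.
table = {
    "Constant": (10.79, 11.59, 0.93),
    "X1": (0.6132, 0.1610, 3.81),
    "X2": (-0.0731, 0.1357, -0.54),
    "X3": (0.3203, 0.1685, 1.90),
    "X4": (0.0817, 0.2215, 0.37),
    "X5": (0.0384, 0.1470, 0.26),
    "X6": (-0.2171, 0.1782, -1.22),
}

# t-ratio for each coefficient: estimate divided by its standard error.
t_ratios = {name: coef / se for name, (coef, se, _) in table.items()}

for name, (_, _, reported) in table.items():
    print(f"{name:8s} computed T = {t_ratios[name]:6.2f}   reported T = {reported:6.2f}")
```

Only X1 has a t-ratio that is large relative to the others, which matches its small reported P-value.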
Analysis of Variance

Source            DF        SS       MS       F       P
Regression         6   3147.97   524.66   10.50   0.000
Residual Error    23   1149.00    49.96
Total             29   4296.97

Source   DF    Seq SS
X1        1   2927.58
X2        1      7.52
X3        1    137.25
X4        1      0.94
X5        1      0.56
X6        1     74.11
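The summary quantities in the output (S, R-Sq, R-Sq(adj), MS, F) can be recomputed from the Analysis of Variance sums of squares. The sketch below uses n = 30 departments and p = 6 predictors from the example. The adjusted R-squared formula is the standard definition, stated here as an assumption since the slides do not spell it out.

```python
import math

n, p = 30, 6
ssr, sse, sst = 3147.97, 1149.00, 4296.97           # Regression, Residual Error, Total
seq_ss = [2927.58, 7.52, 137.25, 0.94, 0.56, 74.11]  # sequential SS for X1..X6

msr = ssr / p                    # mean square for regression
mse = sse / (n - p - 1)          # mean square error = noise variance estimate
f_stat = msr / mse               # overall F statistic
s = math.sqrt(mse)               # S in the output
r_sq = ssr / sst                 # R-Sq
r_sq_adj = 1 - mse / (sst / (n - 1))  # R-Sq(adj), standard definition

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print(f"S = {s:.3f}, R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"sum of Seq SS = {sum(seq_ss):.2f}")
```

The sequential sums of squares add up to the regression sum of squares (up to rounding in the printed table), and X1 alone accounts for most of it.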
After-class Questions:
When is an MLR model needed?
Can we fit several SLR models instead of fitting a single MLR model?