1
Multiple Regression (1)
By Shakeel Nouman, M.Phil Statistics, Govt. College University Lahore, Statistical Officer
2
11 Multiple Regression (1)
Using Statistics
The k-Variable Multiple Regression Model
The F Test of a Multiple Regression Model
How Good is the Regression
Tests of the Significance of Individual Regression Parameters
Testing the Validity of the Regression Model
Using the Multiple Regression Model for Prediction
3
11 Multiple Regression (2)
Qualitative Independent Variables
Polynomial Regression
Nonlinear Models and Transformations
Multicollinearity
Residual Autocorrelation and the Durbin-Watson Test
Partial F Tests and Variable Selection Methods
The Matrix Approach to Multiple Regression Analysis
Summary and Review of Terms
4
11-1 Using Statistics
Lines: Any two points (A and B), or an intercept and slope ($\beta_0$ and $\beta_1$), define a line on a two-dimensional surface.
Planes: Any three points (A, B, and C), or an intercept and coefficients of $x_1$ and $x_2$ ($\beta_0$, $\beta_1$, and $\beta_2$), define a plane in a three-dimensional space.
5
11-2 The k-Variable Multiple Regression Model
The population regression model of a dependent variable, Y, on a set of k independent variables, $X_1, X_2, \ldots, X_k$, is given by:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$$
where $\beta_0$ is the Y-intercept of the regression surface and each $\beta_i$, $i = 1, 2, \ldots, k$, is the slope of the regression surface (sometimes called the response surface) with respect to $X_i$.
Model assumptions:
1. $\varepsilon \sim N(0, \sigma^2)$, independent of other errors.
2. The variables $X_i$ are uncorrelated with the error term.
6
Simple and Multiple Least-Squares Regression
In a simple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression line.
In a multiple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression plane.
7
The Estimated Regression Relationship
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$$
where $\hat{Y}$ is the predicted value of Y, the value lying on the estimated regression surface. The terms $b_0, \ldots, b_k$ are the least-squares estimates of the population regression parameters $\beta_i$.
The actual, observed value of Y is the predicted value plus an error:
$$y_j = b_0 + b_1 x_{1j} + b_2 x_{2j} + \cdots + b_k x_{kj} + e_j$$
8
Least-Squares Estimation: The 2-Variable Normal Equations
Minimizing the sum of squared errors with respect to the estimated coefficients $b_0$, $b_1$, and $b_2$ yields the following normal equations:
$$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$$
$$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$$
$$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$$
9
Example 11-1
[Table: data columns Y, X1, X2, X1X2, X1², X2², X1Y, X2Y.]
Normal equations:
$$743 = 10 b_0 + 123 b_1 + 65 b_2$$
$$9382 = 123 b_0 + 1615 b_1 + 869 b_2$$
$$5040 = 65 b_0 + 869 b_1 + 509 b_2$$
Solving this system gives $b_0 \approx 47.165$, $b_1 \approx 1.599$, $b_2 \approx 1.149$.
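The three normal equations of Example 11-1 form a linear system in b0, b1, and b2. The following is a minimal sketch of one way to solve it, assuming plain Gaussian elimination with partial pivoting; the helper name `solve_linear_system` is hypothetical and is not taken from the slides or from the spreadsheet template they mention.

```python
def solve_linear_system(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix, so each row operation acts on both sides at once.
    m = [list(map(float, row)) + [float(r)] for row, r in zip(a, rhs)]
    for col in range(n):
        # Pivot on the largest remaining entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    # Back-substitution from the last row upward.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


# Coefficients and right-hand sides copied from the normal equations above.
coefficients = [[10, 123, 65], [123, 1615, 869], [65, 869, 509]]
right_hand_side = [743, 9382, 5040]
b0, b1, b2 = solve_linear_system(coefficients, right_hand_side)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

Substituting the printed values back into any of the three equations is a quick check that the solution is consistent.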
10
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
11
Decomposition of the Total Deviation in a Multiple Regression Model
Total Deviation = Regression Deviation + Error Deviation
SST = SSR + SSE
12
11-3 The F Test of a Multiple Regression Model
A statistical test for the existence of a linear relationship between Y and any or all of the independent variables $X_1, X_2, \ldots, X_k$:
$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
$H_1$: Not all the $\beta_i$ ($i = 1, 2, \ldots, k$) are 0
13
Using the Template: Analysis of Variance Table (Example 11-1)
[Figure: F distribution with 2 and 7 degrees of freedom, density f(F); critical point $F_{0.01} = 9.55$ ($\alpha = 0.01$); test statistic 86.34.]
The test statistic, F = 86.34, is greater than the critical point of F(2, 7) for any common level of significance (p-value ≈ 0), so the null hypothesis is rejected, and we might conclude that the dependent variable is related to one or more of the independent variables.
14
11-4 How Good is the Regression
15
Decomposition of the Sum of Squares and the Adjusted Coefficient of Determination
SST = SSR + SSE
Example 11-1: s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%
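The figures quoted for Example 11-1 can be cross-checked from R-sq alone. The sketch below assumes the standard definitions of the adjusted coefficient of determination and of the overall F ratio written in terms of R-sq, with n = 10 observations and k = 2 independent variables (consistent with the F(2, 7) degrees of freedom quoted for this example). The function names are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - (k + 1))."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - (k + 1))


def f_from_r_squared(r_squared, n, k):
    """Overall F = (R^2 / k) / ((1 - R^2) / (n - (k + 1)))."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - (k + 1)))


n, k = 10, 2   # sample size and number of independent variables
r_sq = 0.961   # R-sq as reported, rounded to three decimals

print(round(100 * adjusted_r_squared(r_sq, n, k), 1))
print(round(f_from_r_squared(r_sq, n, k), 2))
```

Because R-sq is entered here rounded to three decimals, the F value computed this way agrees with the reported 86.34 only approximately.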
16
Measures of Performance in Multiple Regression and the ANOVA Table
17
11-5 Tests of the Significance of Individual Regression Parameters
Hypothesis tests about individual regression slope parameters:
(1) $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
(2) $H_0: \beta_2 = 0$, $H_1: \beta_2 \neq 0$
...
(k) $H_0: \beta_k = 0$, $H_1: \beta_k \neq 0$
18
Regression Results for Individual Parameters
19
Example 11-1: Using the Template
Regression results for Alka-Seltzer sales
20
Using the Template: Example 11-2
Regression results for Exports to Singapore
21
11-6 Testing the Validity of the Regression Model: Residual Plots
Residuals vs. M1: It appears that the residuals are randomly distributed, with no pattern and with equal variance, as M1 increases.
22
11-6 Testing the Validity of the Regression Model: Residual Plots
Residuals vs. Price: It appears that the residuals grow in magnitude as Price increases, so the variance of the residuals is not constant.
23
Normal Probability Plot for the Residuals: Example 11-2
A linear trend in the plot indicates that the residuals are approximately normally distributed.
24
Investigating the Validity of the Regression: Outliers and Influential Observations
[Figure: Outliers. A plot of y against x showing the regression line fitted with the outlier and the regression line fitted without it.]
[Figure: Influential observations. A single point with a large value of $x_i$ determines the regression line when all data are included, although there is no relationship in the remaining cluster of points.]
25
Outliers and Influential Observations: Example 11-2
Unusual Observations
[Table: columns Obs, M1, EXPORTS, Fit, Stdev.Fit, Residual, St.Resid; rows flagged X or R.]
R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it large influence.
26
11-7 Using the Multiple Regression Model for Prediction
[Figure: Estimated regression plane for Example 11-1, with axes Sales, Advertising, and Promotions.]
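A point prediction from the estimated plane is the fitted equation evaluated at chosen values of the independent variables. The sketch below assumes that Sales is the dependent variable with Advertising as X1 and Promotions as X2, and it uses coefficients obtained by solving the Example 11-1 normal equations (rounded). The function name and the input values 10 and 5 are hypothetical.

```python
# Coefficients from solving the Example 11-1 normal equations (rounded).
# Assumption: X1 = Advertising, X2 = Promotions, Y = Sales.
B0, B1, B2 = 47.164942, 1.5990404, 1.1487479


def predict_sales(advertising, promotions):
    """Point prediction on the estimated regression plane."""
    return B0 + B1 * advertising + B2 * promotions


# Hypothetical inputs, chosen only for illustration.
print(round(predict_sales(10, 5), 2))
```

A point prediction alone says nothing about its uncertainty; a prediction interval additionally needs the standard error of the prediction, which this sketch does not compute.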
27
Prediction in Multiple Regression
28
11-8 Qualitative (or Categorical) Independent Variables (in Regression)
Example 11-3
[Table: data columns MOVIE, EARN, COST, PROM, BOOK.]
29
Picturing Qualitative Variables in Regression
A regression with one quantitative variable (X1) and one qualitative variable (X2):
[Figure: two parallel lines of Y against X1. The line for X2 = 0 has intercept b0; the line for X2 = 1 has intercept b0 + b2.]
A multiple regression with two quantitative variables (X1 and X2) and one qualitative variable (X3):
[Figure: two parallel planes over x1 and x2, separated vertically by b3.]
30
Picturing Qualitative Variables in Regression: Three Categories and Two Dummy Variables
A regression with one quantitative variable (X1) and two dummy variables (X2 and X3):
[Figure: three parallel lines of Y against X1. The line for X2 = 0 and X3 = 0 has intercept b0; the line for X2 = 1 and X3 = 0 has intercept b0 + b2; the line for X2 = 0 and X3 = 1 has intercept b0 + b3.]
A qualitative variable with r levels or categories is represented with (r-1) 0/1 (dummy) variables.
[Table: coding of the categories Adventure, Drama, and Romance by X2 and X3.]
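The rule that r categories need r-1 dummy variables can be sketched in a few lines. The function name `dummy_encode` is hypothetical, and treating the first listed category as the baseline (all zeros) is an assumption; it is not necessarily the coding the example uses for X2 and X3.

```python
def dummy_encode(category, levels):
    """Return r-1 0/1 indicators; the first level is the baseline (all zeros)."""
    if category not in levels:
        raise ValueError(f"unknown category: {category!r}")
    return [1 if category == level else 0 for level in levels[1:]]


# Baseline choice (Adventure) is an assumption made for illustration.
levels = ["Adventure", "Drama", "Romance"]
for name in levels:
    print(name, dummy_encode(name, levels))
```

Using r indicators instead of r-1 alongside an intercept would make the columns sum to the intercept column, which is a case of perfect collinearity.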
31
Using Qualitative Variables in Regression: Example 11-4
Regression of Salary on Education, Experience, and Gender:
Term: Constant, Education, Experience, Gender
(SE): (32.6), (45.1), (78.5), (212.4)
(t): (262.2), (21.0), (16.0), (-15.3)
On average, female salaries are $3256 below male salaries.
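Each t ratio in a regression table is the coefficient estimate divided by its standard error. As a small check, assuming the Gender coefficient is -3256 (read off the statement that female salaries are $3256 lower) with the reported standard error of 212.4, the ratio reproduces the reported t value. The function name is hypothetical.

```python
def t_ratio(estimate, standard_error):
    """t statistic for H0: coefficient = 0."""
    return (estimate - 0.0) / standard_error


# Gender coefficient inferred from the text; standard error as reported.
gender_t = t_ratio(-3256, 212.4)
print(round(gender_t, 1))  # -15.3
```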
32
Interactions between Quantitative and Qualitative Variables: Shifting Slopes
A regression with interaction between a quantitative variable (X1) and a qualitative variable (X2):
[Figure: two non-parallel lines of Y against X1. The line for X2 = 0 has intercept b0 and slope b1; the line for X2 = 1 has intercept b0 + b2 and slope b1 + b3.]
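The intercepts and slopes of the two lines follow from a single fitted equation with a cross-product term. The derivation below is a sketch that assumes the usual form of such a model, with b3 as the coefficient of the product X1X2:

```latex
\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2

% Setting X_2 = 0 drops the last two terms:
X_2 = 0: \quad \hat{Y} = b_0 + b_1 X_1

% Setting X_2 = 1 shifts both the intercept and the slope:
X_2 = 1: \quad \hat{Y} = (b_0 + b_2) + (b_1 + b_3) X_1
```

If b3 = 0 the two lines are parallel and the model reduces to the shifted-intercept case.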
33
11-9 Polynomial Regression
One-variable polynomial regression model:
$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \cdots + \beta_m X^m + \varepsilon$$
where m is the degree of the polynomial, the highest power of X appearing in the equation. The degree of the polynomial is the order of the model.
34
Polynomial Regression: Example 11-5
35
Polynomial Regression: Other Variables and Cross-Product Terms
[Table: columns Variable, Estimate, Standard Error, T-statistic; rows for the X terms, including a cross-product term.]
36
11-10 Nonlinear Models and Transformations: Multiplicative Model
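A multiplicative model becomes linear in its parameters after taking logarithms of both sides. The sketch below assumes the commonly used form with three independent variables and a multiplicative error term; the number of variables is an illustrative assumption.

```latex
% Multiplicative model (error enters as a factor):
Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2} X_3^{\beta_3} \varepsilon

% Logarithmic transformation, linear in the parameters:
\log Y = \log \beta_0 + \beta_1 \log X_1 + \beta_2 \log X_2
         + \beta_3 \log X_3 + \log \varepsilon
```

Least squares is then applied to the regression of log Y on the logs of the X variables.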
37
Transformations: Exponential Model
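An exponential model is linearized by taking the logarithm of Y only. The sketch below assumes the commonly used one-variable form with a multiplicative error term:

```latex
% Exponential model:
Y = \beta_0 e^{\beta_1 X} \varepsilon

% Logarithmic transformation of Y, linear in X:
\log Y = \log \beta_0 + \beta_1 X + \log \varepsilon
```

Here log Y is regressed on X itself, in contrast with the multiplicative model, where the independent variables are also logged.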
38
Plots of Transformed Variables
[Figure: plots of the transformed variables; recoverable axis labels include ADVERT, LOG, and Y-HAT.]
39
Variance Stabilizing Transformations
Square root transformation: Useful when the variance of the regression errors is approximately proportional to the conditional mean of Y.
Logarithmic transformation: Useful when the variance of the regression errors is approximately proportional to the square of the conditional mean of Y.
Reciprocal transformation: Useful when the variance of the regression errors is approximately proportional to the fourth power of the conditional mean of Y.
40
Regression with Dependent Indicator Variables
[Figure: the logistic function, plotted as y against x, with an upper bound of 1.]
The logistic function:
$$p = E(Y \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$
Transformation to linearize the logistic function:
$$p' = \log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 X$$
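The linearizing transformation can be checked numerically: applying the log-odds (logit) to the output of the logistic function should return the linear predictor. The sketch below assumes the standard one-variable logistic form; the coefficients b0 = -1.0 and b1 = 0.5 are hypothetical values chosen only for illustration.

```python
import math


def logistic(x, b0, b1):
    """Logistic function of the linear predictor b0 + b1 * x."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))


def logit(p):
    """Log-odds transformation, the inverse of the logistic function."""
    return math.log(p / (1.0 - p))


b0, b1 = -1.0, 0.5  # hypothetical coefficients
for x in (0.0, 2.0, 4.0):
    p = logistic(x, b0, b1)
    # The last column should equal b0 + b1 * x up to rounding.
    print(x, round(p, 4), round(logit(p), 4))
```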
41
11-11 Multicollinearity
Orthogonal X variables provide information from independent sources. No multicollinearity.
Perfectly collinear X variables provide identical information content. No regression.
Some degree of collinearity. Problems with regression depend on the degree of collinearity.
A high degree of negative collinearity also causes problems with regression.
42
Effects of Multicollinearity
Variances of regression coefficients are inflated.
Magnitudes of regression coefficients may be different from what is expected.
Signs of regression coefficients may not be as expected.
Adding or removing variables produces large changes in coefficients.
Removing a data point may cause large changes in coefficient estimates or signs.
In some cases, the F ratio may be significant while the t ratios are not.
43
Detecting the Existence of Multicollinearity: Correlation Matrix of Independent Variables and Variance Inflation Factors
44
Variance Inflation Factor
[Figure: relationship between VIF and $R_h^2$, with $R_h^2$ on the horizontal axis and VIF on the vertical axis.]
45
Variance Inflation Factor (VIF)
Observation: The VIF (Variance Inflation Factor) values for the variables Lend and Price are both greater than 5. This indicates that some degree of multicollinearity exists with respect to these two variables.
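The threshold of 5 can be related to the underlying R-squared. The sketch below assumes the standard definition VIF = 1 / (1 - R_h^2), where R_h^2 comes from regressing X_h on the other independent variables; the function name is hypothetical.

```python
def variance_inflation_factor(r_squared_h):
    """VIF for X_h, given R_h^2 from regressing X_h on the other predictors."""
    if not 0.0 <= r_squared_h < 1.0:
        raise ValueError("R_h^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared_h)


# Illustrative R_h^2 values, not taken from the example data.
for r2 in (0.0, 0.5, 0.8, 0.9):
    print(r2, round(variance_inflation_factor(r2), 2))
```

Under this definition a VIF above 5 corresponds to an R_h^2 above 0.8, meaning more than 80% of the variation in that variable is explained by the other independent variables.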
46
Solutions to the Multicollinearity Problem
Drop a collinear variable from the regression
Change the sampling plan to include elements outside the multicollinearity range
Transformations of variables
Ridge regression
47
11-12 Residual Autocorrelation and the Durbin-Watson Test
An autocorrelation is a correlation of the values of a variable with values of the same variable lagged one or more periods back. Consequences of autocorrelation include inaccurate estimates of variances and inaccurate predictions.
[Table: residuals and their lagged values, up to four periods back.]
The Durbin-Watson test (first-order autocorrelation):
$H_0: \rho_1 = 0$
$H_1: \rho_1 \neq 0$
The Durbin-Watson test statistic:
$$d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$
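The Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. A minimal sketch, assuming that standard definition; the residual sequences are made up to show the two extremes.

```python
def durbin_watson(residuals):
    """d = sum((e_i - e_{i-1})^2 for i >= 2) / sum(e_i^2)."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residuals: alternating signs push d toward 4 (negative
# autocorrelation); identical residuals give d = 0 (positive autocorrelation).
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))
```

Values near 2 are what uncorrelated residuals tend to produce, which is why the test regions are symmetric around 2.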
48
Critical Points of the Durbin-Watson Statistic: $\alpha = 0.05$, n = Sample Size, k = Number of Independent Variables
[Table: one row per n, with columns dL and dU for each of k = 1, k = 2, k = 3, k = 4, and k = 5.]
49
Using the Durbin-Watson Statistic
The range of the statistic, from 0 to 4, is divided into five regions:
0 to dL: Positive Autocorrelation
dL to dU: Test is Inconclusive
dU to 4-dU: No Autocorrelation
4-dU to 4-dL: Test is Inconclusive
4-dL to 4: Negative Autocorrelation
For n = 67, k = 4: $4 - d_U \approx 2.27$ and $4 - d_L \approx 2.53 < 2.58$, so H0 is rejected, and we conclude there is negative first-order autocorrelation.
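The five regions translate directly into a decision rule. The sketch below uses a hypothetical function name, and the critical points dL = 1.47 and dU = 1.73 are not quoted directly in the text; they are backed out from 4 - dL ≈ 2.53 and 4 - dU ≈ 2.27, so treat them as approximations.

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using critical points dL and dU."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > 4.0 - d_lower:
        return "negative autocorrelation"
    if d_upper < d < 4.0 - d_upper:
        return "no autocorrelation"
    return "inconclusive"


# d = 2.58 with approximate critical points for n = 67, k = 4.
print(durbin_watson_decision(2.58, 1.47, 1.73))
```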
50
11-13 Partial F Tests and Variable Selection Methods
Full model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon$
Reduced model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
Partial F test:
$H_0: \beta_3 = \beta_4 = 0$
$H_1$: $\beta_3$ and $\beta_4$ not both 0
Partial F statistic:
$$F_{(r,\, n-(k+1))} = \frac{(SSE_R - SSE_F)/r}{MSE_F}$$
where $SSE_R$ is the sum of squared errors of the reduced model; $SSE_F$ is the sum of squared errors of the full model; $MSE_F$ is the mean square error of the full model [$MSE_F = SSE_F/(n-(k+1))$]; and r is the number of variables dropped from the full model.
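The partial F statistic is a short calculation once both models have been fitted. In the sketch below the function name is hypothetical and the sums of squared errors are made-up numbers chosen so the arithmetic is easy to follow; k is the number of independent variables in the full model.

```python
def partial_f(sse_reduced, sse_full, r, n, k):
    """Partial F = ((SSE_R - SSE_F) / r) / MSE_F, MSE_F = SSE_F / (n - (k + 1))."""
    mse_full = sse_full / (n - (k + 1))
    return ((sse_reduced - sse_full) / r) / mse_full


# Made-up values: dropping r = 2 of k = 4 variables raises SSE from 100 to 120
# with n = 25 observations, so MSE_F = 100 / 20 = 5 and F = (20 / 2) / 5.
print(partial_f(120.0, 100.0, 2, 25, 4))
```

The result is compared with the critical point of the F distribution with r and n-(k+1) degrees of freedom.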
51
Variable Selection Methods
All possible regressions: Run regressions with all possible combinations of independent variables and select the best model.
The reported p-value indicates that we should reject the null hypothesis H0: the slopes for Lend and Exch. are zero.
52
Variable Selection Methods
Stepwise procedures
Forward selection: Add one variable at a time to the model, on the basis of its F statistic.
Backward elimination: Remove one variable at a time, on the basis of its F statistic.
Stepwise regression: Adds variables to the model and subtracts variables from the model, on the basis of the F statistic.
53
Stepwise Regression
1. Compute the F statistic for each variable not in the model.
2. Is there at least one variable with p-value < Pin? If no, stop.
3. If yes, enter the most significant (smallest p-value) variable into the model.
4. Calculate the partial F for all variables in the model.
5. Is there a variable with p-value > Pout? If yes, remove the variable.
6. Return to step 1.
54
Stepwise Regression: Using the Computer (MINITAB)
MTB > STEPWISE 'EXPORTS' PREDICTORS 'M1' 'LEND' 'PRICE' 'EXCHANGE'
Stepwise Regression
F-to-Enter:   F-to-Remove:
Response is EXPORTS on 4 predictors, with N = 67
[Output table: rows Step, Constant, M1, T-Ratio, PRICE, T-Ratio, S, R-Sq.]
55
Using the Computer: MINITAB
MTB > REGRESS 'EXPORTS' 'M1' 'LEND' 'PRICE' 'EXCHANGE';
SUBC> vif;
SUBC> dw.
Regression Analysis
The regression equation is EXPORTS = (constant) + M1 + LEND + PRICE + EXCHANGE
[Output table: columns Predictor, Coef, Stdev, t-ratio, p, VIF; rows Constant, M1, LEND, PRICE, EXCHANGE.]
R-sq = 82.5%   R-sq(adj) = 81.4%
Analysis of Variance
[Output table: columns SOURCE, DF, SS, MS, F, p; rows Regression, Error, Total.]
Durbin-Watson statistic = 2.58
56
Using the Computer: SAS (continued)
Parameter Estimates
[Output table: columns Variable, DF, Parameter Estimate, Standard Error, T for H0: Parameter=0, Prob > |T|; rows INTERCEP, M1, LEND, PRICE, EXCHANGE.]
[Output table: columns Variable, DF, Variance Inflation; rows INTERCEP, M1, LEND, PRICE, EXCHANGE.]
Durbin-Watson D
(For Number of Obs.)
1st Order Autocorrelation
57
11-15: The Matrix Approach to Regression Analysis (1)
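In matrix notation the whole model, and its least-squares solution, can be written compactly. The sketch below assumes the standard formulation, with X the n by (k+1) matrix whose first column is all ones:

```latex
% Model and least-squares estimator:
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y},
\qquad
\hat{\mathbf{Y}} = \mathbf{X}\mathbf{b}

% The normal equations are X'X b = X'Y. For Example 11-1:
\begin{pmatrix} 10 & 123 & 65 \\ 123 & 1615 & 869 \\ 65 & 869 & 509 \end{pmatrix}
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix}
=
\begin{pmatrix} 743 \\ 9382 \\ 5040 \end{pmatrix}
```

The 3 by 3 matrix is X'X and the right-hand side is X'Y, which is why the scalar normal equations of Example 11-1 have a symmetric coefficient pattern.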
58
The Matrix Approach to Regression Analysis (2)
59
Name: Shakeel Nouman
Religion: Christian
Domicile: Punjab (Lahore)
M.Phil (Statistics), GC University (degree awarded by GC University)
M.Sc (Statistics), GC University
Statistical Officer (BS-17), Economics & Marketing Division, Livestock Production Research Institute, Bahadurnagar (Okara), Livestock & Dairy Development Department, Govt. of Punjab