1 Multivariate Linear Regression Models Shyh-Kang Jeng Department of Electrical Engineering/ Graduate Institute of Communication/ Graduate Institute of Networking and Multimedia
2 Regression Analysis A statistical methodology – For predicting the values of one or more response (dependent) variables – Predictions are made from a collection of predictor (independent) variable values
3 Example 7.1 Fitting a Straight Line Observed data: predictor z_1 and response y, with y = 1, 4, 3, 8, 9. Linear regression model
4 Example 7.1 Fitting a Straight Line [Figure: plot of y against z]
5 Classical Linear Regression Model
7 Example 7.1
8 Examples 6.6 & 6.7
9 Example 7.2 One-Way ANOVA
10 Method of Least Squares
11 Result 7.1
12 Proof of Result 7.1
13 Proof of Result 7.1
14 Example 7.1 Fitting a Straight Line Observed data: predictor z_1 and response y, with y = 1, 4, 3, 8, 9. Linear regression model
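As a worked illustration of the least squares fit for this example, here is a minimal Python sketch. The response values are the ones listed on the slide. The predictor values z_1 = 0, 1, 2, 3, 4 are not present in this transcript and are an assumption made for the sketch, and the function name `fit_line` is hypothetical.

```python
def fit_line(z, y):
    """Least squares intercept and slope for the model y = b0 + b1*z + error."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    # Corrected cross-product and corrected sum of squares.
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    b1 = s_zy / s_zz
    b0 = y_bar - b1 * z_bar
    return b0, b1


z1 = [0, 1, 2, 3, 4]  # ASSUMED predictor values (not in the transcript)
y = [1, 4, 3, 8, 9]   # response values from the slide
b0, b1 = fit_line(z1, y)
residuals = [yi - (b0 + b1 * zi) for zi, yi in zip(z1, y)]
```

Under the assumed predictor values the fitted line is y = 1 + 2 z_1, and the residuals sum to zero, as they must for any least squares fit that includes an intercept.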
15 Example 7.3
16 Coefficient of Determination
17 Geometry of Least Squares
18 Geometry of Least Squares
19 Projection Matrix
20 Result 7.2
21 Proof of Result 7.2
22 Proof of Result 7.2
23 Result 7.3 Gauss' Least Squares Theorem
24 Proof of Result 7.3
25 Result 7.4
26 Proof of Result 7.4
27 Proof of Result 7.4
28 Proof of Result 4.11
29 Proof of Result 7.4
30 Proof of Result 7.4
31 Proof of Result 7.4
32 χ² Distribution
33 Result 7.5
34 Proof of Result 7.5
35 Example 7.4 (Real Estate Data) 20 homes in a Milwaukee, Wisconsin, neighborhood Regression model
36 Example 7.4
37 Result 7.6
38 Effect of Rank In situations where Z is not of full rank, rank(Z) replaces r+1 and rank(Z_1) replaces q+1 in Result 7.6
39 Proof of Result 7.6
40 Proof of Result 7.6
41 Wishart Distribution
42 Generalization of Result 7.6
43 Example 7.5 (Service Ratings Data)
44 Example 7.5: Design Matrix
45 Example 7.5
46 Result 7.7
47 Proof of Result 7.7
48 Result 7.8
49 Proof of Result 7.8
50 Example 7.6 (Computer Data)
51 Example 7.6
52 Adequacy of the Model
53 Residual Plots
54 Q-Q Plots and Histograms Used to detect unusual observations or severe departures from normality that may require special attention in the analysis. If n is large, minor departures from normality will not greatly affect inferences about β
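The coordinates of a normal Q-Q plot of residuals can be sketched in a few lines of Python. This is an illustrative sketch, not the deck's own procedure: the function name `normal_qq_points` and the sample residuals are hypothetical, and the probability levels (j - 1/2)/n are one common convention among several.

```python
from statistics import NormalDist


def normal_qq_points(residuals):
    """Pair each ordered residual with a standard normal quantile.

    Uses the probability levels (j - 0.5) / n for j = 1, ..., n.
    """
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    quantiles = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    return list(zip(quantiles, ordered))


# Hypothetical residuals, for illustration only.
points = normal_qq_points([0.0, 1.0, -2.0, 1.0, 0.0])
```

If the residuals are roughly normal, the points fall close to a straight line; a point far from the line flags an unusual observation.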
55 Test of Independence of Time
56 Example 7.7: Residual Plot
57 Leverage “Outliers” in either the response or the explanatory variables may have a considerable effect on the analysis and determine the fit. Leverage for simple linear regression with one explanatory variable z
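The slide's leverage formula is missing from this transcript. For a straight-line fit with an intercept, the standard expression is h_jj = 1/n + (z_j - z̄)² / Σ (z_i - z̄)², and the sketch below computes it. The function name `leverages` and the sample values are hypothetical.

```python
def leverages(z):
    """Leverage h_jj of each observation in a straight-line fit with intercept."""
    n = len(z)
    z_bar = sum(z) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    return [1 / n + (zj - z_bar) ** 2 / s_zz for zj in z]


# Hypothetical predictor values; the last one lies far from the rest.
z = [0, 1, 2, 3, 10]
h = leverages(z)
```

The leverages always sum to 2, the number of fitted parameters, so one value close to 1 means that single observation largely determines the fitted line.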
58 Mallows' C_p Statistic Select variables from all possible combinations
59 Usage of Mallows' C_p Statistic
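The formula for the statistic is missing from this transcript. A common definition is C_p = (residual sum of squares for a subset model with p parameters) / (residual variance of the full model) - (n - 2p), where p counts the intercept. The sketch below assumes that definition; the function and argument names are hypothetical.

```python
def mallows_cp(rss_subset, p, rss_full, p_full, n):
    """Mallows' C_p for a subset model.

    rss_subset: residual sum of squares of the subset model
    p:          parameters in the subset model, intercept included
    rss_full:   residual sum of squares of the model with all predictors
    p_full:     parameters in the full model, intercept included
    n:          number of observations
    """
    s2_full = rss_full / (n - p_full)  # residual variance of the full model
    return rss_subset / s2_full - (n - 2 * p)


# Hypothetical numbers: 20 observations, full model with 3 parameters.
cp_full = mallows_cp(rss_subset=6.0, p=3, rss_full=6.0, p_full=3, n=20)
cp_small = mallows_cp(rss_subset=9.0, p=2, rss_full=6.0, p_full=3, n=20)
```

By construction the full model has C_p equal to its own parameter count. Subset models whose (p, C_p) point lies near the line C_p = p are usually the ones worth a closer look.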
60 Stepwise Regression 1. The predictor variable that explains the largest significant proportion of the variation in Y is the first variable to enter. 2. The next variable to enter is the one that makes the largest contribution to the regression sum of squares. Use Result 7.6 to determine its significance (F-test).
61 Stepwise Regression 3. Once a new variable is included, the individual contributions to the regression sum of squares of the other variables already in the equation are checked using F-tests. If an F-statistic is small, that variable is deleted. 4. Steps 2 and 3 are repeated until all possible additions are nonsignificant and all possible deletions are significant.
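Steps 1 to 4 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the names `rss`, `partial_f` and `stepwise` are hypothetical, the thresholds `f_enter=4.0` and `f_remove=3.9` are placeholders rather than critical values from an F table, every design matrix is assumed to have full rank, and n is assumed to exceed the number of parameters.

```python
def rss(cols, y):
    """Residual sum of squares from regressing y on an intercept plus cols.

    Assumes the design matrix has full rank (no collinear columns).
    """
    n = len(y)
    Z = [[1.0] + [c[i] for c in cols] for i in range(n)]
    k = len(Z[0])
    # Augmented normal equations [Z'Z | Z'y].
    A = [[sum(Z[i][a] * Z[i][b] for i in range(n)) for b in range(k)]
         + [sum(Z[i][a] * y[i] for i in range(n))] for a in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * p for x, p in zip(A[r], A[col])]
    beta = [A[a][k] / A[a][a] for a in range(k)]
    return sum((y[i] - sum(b * z for b, z in zip(beta, Z[i]))) ** 2
               for i in range(n))


def partial_f(rss_small, rss_big, n, k_big):
    """F statistic for the one extra variable in the bigger model.

    k_big counts the bigger model's parameters, intercept included.
    """
    return (rss_small - rss_big) / (rss_big / (n - k_big))


def stepwise(predictors, y, f_enter=4.0, f_remove=3.9):
    """Return the selected predictor names; predictors maps name -> column."""
    n = len(y)

    def cols(names):
        return [predictors[name] for name in names]

    selected = []
    for _ in range(2 * len(predictors)):  # guard against cycling
        current = rss(cols(selected), y)
        # Steps 1-2: find the candidate with the largest significant F.
        best, best_f = None, f_enter
        for name in predictors:
            if name in selected:
                continue
            trial = selected + [name]
            f = partial_f(current, rss(cols(trial), y), n, len(trial) + 1)
            if f > best_f:
                best, best_f = name, f
        if best is None:
            break  # step 4: no remaining addition is significant
        selected.append(best)
        # Step 3: re-test the variables that were already in the model.
        for name in selected[:-1]:
            full = rss(cols(selected), y)
            reduced = rss(cols([s for s in selected if s != name]), y)
            if partial_f(reduced, full, n, len(selected) + 1) < f_remove:
                selected.remove(name)
    return selected


# Hypothetical data: y depends on x1 only, plus small fixed disturbances.
data = {
    "x1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "x2": [1, 0, 1, 1, 0, 0, 1, 0, 1, 0],
}
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.2, -0.15, 0.05, -0.05, -0.05]
y = [2 + 3 * x + e for x, e in zip(data["x1"], noise)]
chosen = stepwise(data, y)
```

Keeping `f_remove` slightly below `f_enter` makes it less likely that a variable is dropped and then immediately re-entered. In practice the thresholds would come from the F distribution with the appropriate degrees of freedom, as Result 7.6 indicates.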
62 Treatment of Collinearity If Z is not of full rank, Z'Z does not have an inverse and the columns of Z are collinear. Exact collinearity is unlikely, but a linear combination of the columns of Z may be nearly 0. This can be overcome somewhat by – Deleting one of a pair of predictor variables that are strongly correlated – Relating the response Y to the principal components of the predictor variables
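The first remedy, finding pairs of strongly correlated predictors so that one of each pair can be dropped, might look like the sketch below. The function names, the 0.95 cutoff and the sample data are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def correlation(a, b):
    """Sample correlation coefficient between two equal-length columns."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    s_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    s_aa = sum((x - a_bar) ** 2 for x in a)
    s_bb = sum((y - b_bar) ** 2 for y in b)
    return s_ab / sqrt(s_aa * s_bb)


def strongly_correlated_pairs(predictors, threshold=0.95):
    """Name pairs whose absolute correlation exceeds the threshold."""
    return [(n1, n2)
            for (n1, c1), (n2, c2) in combinations(predictors.items(), 2)
            if abs(correlation(c1, c2)) > threshold]


# Hypothetical predictors: x2 is almost exactly twice x1.
predictors = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1],
    "x3": [5, 1, 4, 2, 3],
}
pairs = strongly_correlated_pairs(predictors)
```

A pairwise check only catches near-collinearity between two columns. A near-zero linear combination involving three or more columns needs a different diagnostic, such as the smallest eigenvalues of Z'Z.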
63 Bias Caused by a Misspecified Model
64 Example 7.3 Observed data: z_1, y_1, y_2. Regression model
65 Multivariate Multiple Regression
66 Multivariate Multiple Regression
67 Multivariate Multiple Regression
68 Multivariate Multiple Regression
69 Multivariate Multiple Regression
70 Multivariate Multiple Regression
71 Example 7.8
72 Example 7.8
73 Result 7.9
74 Proof of Result 7.9
75 Proof of Result 7.9
76 Proof of Result 7.9
77 Forecast Error
78 Forecast Error
79 Result 7.10
80 Result 7.11
81 Example 7.9
82 Other Multivariate Test Statistics
83 Predictions from Regressions
84 Predictions from Regressions
85 Predictions from Regressions
86 Example 7.10
87 Example 7.10
88 Example 7.10
89 Linear Regression
90 Result 7.12
91 Proof of Result 7.12
92 Proof of Result 7.12
93 Population Multiple Correlation Coefficient
94 Example 7.11
95 Linear Predictors and Normality
96 Result 7.13
97 Proof of Result 7.13
98 Invariance Property
99 Example 7.12
100 Example 7.12
101 Prediction of Several Variables
102 Result 7.14
103 Example 7.13
104 Example 7.13
105 Partial Correlation Coefficient
106 Example 7.14
107 Mean Corrected Form of the Regression Model
108 Mean Corrected Form of the Regression Model
109 Mean Corrected Form for Multivariate Multiple Regressions
110 Relating the Formulations
111 Example 7.15 Example 7.6: classical linear regression model. Example 7.12: joint normal distribution, with the best predictor given by the conditional mean. Both approaches yielded the same predictor of Y_1
112 Remarks on Both Formulations Conceptually different. Classical model: – Input variables are set by the experimenter – Optimal among linear predictors. Conditional mean model: – Predictor values are random variables observed along with the response values – Optimal among all choices of predictors
113 Example 7.16 Natural Gas Data
114 Example 7.16: First Model
115 Example 7.16: Second Model