1 Lab 4 Multiple Linear Regression

2 Meaning
• An extension of simple linear regression
• It models the mean of a response variable as a linear function of several explanatory variables
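Written out, the model on this slide looks roughly as follows. The symbols are assumed notation, not taken from the slides: Y is the response, X_1 to X_k are the explanatory variables, and the betas are the regression coefficients.

```latex
\mu\{Y \mid X_1, \dots, X_k\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k
```

Simple linear regression is the special case k = 1.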

3 Ways of analysis
• Matrix of scatterplots
• Matrix of correlations
• Regression: fit the model (variable selection); interpret the model, t-test and F-test in regression; prediction; diagnostics (linearity, constant variance, normality, independence, outliers)

4 The response and the independent variables
• The response: iq
• The independent variables:
  MILK: 0 = no breast milk, 1 = yes
  FEM: 0 = male kid, 1 = female
  WEEKS: weeks on ventilation
  SOCIAL: mum's social class (1, 2, 3, 4, with 1 being the highest)
  RANK: birth order of the kid
  EDUC: mum's education level (1, 2, 3, 4, 5, with 5 being the highest)

5 Matrix of scatterplots

6 Correlation among iq, weeks, social, educ, rank

7 Matrix of correlations
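Outside SPSS, a correlation matrix like this one can be sketched in a few lines of Python. The numbers below are made up for illustration and are not the lab data; only the variable names follow the slides.

```python
import numpy as np

# Hypothetical values for illustration only; not the lab data.
iq    = np.array([ 98.0, 105.0, 110.0,  92.0, 101.0, 115.0])
weeks = np.array([  4.0,   2.0,   1.0,   6.0,   3.0,   1.0])
educ  = np.array([  2.0,   3.0,   4.0,   1.0,   3.0,   5.0])

# np.corrcoef treats each row as one variable and returns the
# matrix of pairwise Pearson correlations.
names = ["iq", "weeks", "educ"]
corr = np.corrcoef(np.vstack([iq, weeks, educ]))

for name, row in zip(names, corr):
    print(name.ljust(6), " ".join(f"{r:6.2f}" for r in row))
```

The diagonal is always 1, and the matrix is symmetric, so only the entries above (or below) the diagonal need to be read.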

9 Regression: fit the model
• Procedure: Analyze > Regression > Linear
• Methods of determining independent variables

10 Methods (details in Instruction 4, p. 18)
• Enter: The model is obtained with all specified variables. This is the default method.
• Stepwise
• Remove
• Backward: The variables are removed from the model one by one if they meet the criterion for removal (a maximum significance level or a minimum F value).
• Forward
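The backward method can be illustrated with a minimal Python sketch. It uses the "minimum F value" form of the removal criterion, where the partial F of a single variable equals its squared t-statistic. The function name, the threshold of 2.71, and the data are assumptions made for this illustration; this is not SPSS's implementation.

```python
import numpy as np

def backward_eliminate(X, y, names, f_remove=2.71):
    """Drop predictors one at a time while the weakest has partial F < f_remove."""
    keep = list(range(X.shape[1]))
    while keep:
        # Design matrix: intercept plus the predictors still in the model.
        A = np.column_stack([np.ones(len(y)), X[:, keep]])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        sigma2 = resid @ resid / (len(y) - A.shape[1])
        cov = sigma2 * np.linalg.inv(A.T @ A)
        # Partial F for each predictor (intercept excluded) is t squared.
        f_stats = beta[1:] ** 2 / np.diag(cov)[1:]
        worst = int(np.argmin(f_stats))
        if f_stats[worst] >= f_remove:
            break                      # every remaining variable passes
        del keep[worst]                # remove the weakest and refit
    return [names[i] for i in keep]

# Hypothetical data: y depends on x1 only; x2 is unrelated by construction.
x1 = np.arange(1.0, 9.0)
x2 = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
noise = 0.5 * np.array([1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0])
y = 2.0 + 3.0 * x1 + noise

selected = backward_eliminate(np.column_stack([x1, x2]), y, ["x1", "x2"])
print(selected)
```

The model is refitted after every removal, because the partial F values of the remaining variables change once a variable leaves.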

11 Regression: interpret the model
• Interpretation of the output
  1. Variables entered/removed
  2. Model summary (R, R^2)
  3. ANOVA test (F-test)

12 Note on the F-test
• It tests the overall significance of the model
• Its null distribution: F-distribution
• It can be extended to construct the extra-sum-of-squares F-test
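For reference, the extra-sum-of-squares statistic is usually written as below. The notation is assumed here, not taken from the slides: RSS is the residual sum of squares and df the residual degrees of freedom of each fitted model.

```latex
F = \frac{\left(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}}\right) / \left(df_{\text{reduced}} - df_{\text{full}}\right)}
         {\mathrm{RSS}_{\text{full}} / df_{\text{full}}}
\;\sim\; F_{\,df_{\text{reduced}} - df_{\text{full}},\; df_{\text{full}}}
\quad \text{under } H_0
```

The overall F-test in the ANOVA table is the special case in which the reduced model contains only the intercept.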

13 4. Coefficients (estimation, t-test, CI of coefficients)
• t-test in the i-th row
• CI of coefficients

14 Note on the t-test and CI of coefficients
• t-test: tests the significance of a single independent variable; it can be one-sided; its null distribution: t-distribution
• 95% CI of a coefficient: the range that contains the coefficient with 95% confidence, i.e. the plausible range for the change in mean Y per 1-unit increase in the corresponding X, with the other variables held fixed
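The quantities in the coefficients table can be reproduced with a short Python sketch. The function name and the data are hypothetical, and the critical value is passed in by hand from a t table so that the example needs only NumPy.

```python
import numpy as np

def ols_summary(X, y, t_crit):
    """Return estimates, standard errors, t-statistics and CIs for an OLS fit.

    X must already contain a leading column of ones for the intercept.
    t_crit is the two-sided critical value t(0.975, n - p) from a t table.
    """
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y                 # least-squares estimates
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)         # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))  # standard error of each coefficient
    t_stat = beta / se                       # tests H0: coefficient = 0
    ci = np.column_stack([beta - t_crit * se, beta + t_crit * se])
    return beta, se, t_stat, ci

# Hypothetical data (not the lab data): 8 cases, two predictors.
x1 = np.arange(1.0, 9.0)
x2 = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
noise = 0.5 * np.array([1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0])
y = 2.0 + 3.0 * x1 + 1.0 * x2 + noise
X = np.column_stack([np.ones(8), x1, x2])

# n - p = 8 - 3 = 5 degrees of freedom; t(0.975, 5) is about 2.571.
beta, se, t_stat, ci = ols_summary(X, y, t_crit=2.571)
for name, b, s, t, (lo, hi) in zip(["const", "x1", "x2"], beta, se, t_stat, ci):
    print(f"{name:6s} b={b:6.3f} se={s:5.3f} t={t:7.2f} 95% CI=({lo:.3f}, {hi:.3f})")
```

A coefficient whose 95% CI excludes 0 is significant at the 5% level in the two-sided t-test, so the two columns of the output tell a consistent story.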

15 Regression: prediction
• Point estimation
• Confidence interval of the mean (CI)
• Prediction interval of one observation (PI)
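The three quantities differ only in their standard errors. In assumed matrix notation (not from the slides), with x_0 the vector of explanatory values including a leading 1, X the design matrix, n cases and p coefficients:

```latex
\hat{y}_0 = x_0^{\top}\hat{\beta}
\qquad
\text{CI of the mean: } \hat{y}_0 \pm t_{n-p,\,0.975}\;\hat{\sigma}\sqrt{x_0^{\top}(X^{\top}X)^{-1}x_0}
\qquad
\text{PI: } \hat{y}_0 \pm t_{n-p,\,0.975}\;\hat{\sigma}\sqrt{1 + x_0^{\top}(X^{\top}X)^{-1}x_0}
```

The extra 1 under the square root accounts for the variability of a single new observation around its mean, which is why the PI is always wider than the CI.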

17 Multiple regression: diagnostics
Obtain plots to test the validity of the assumptions
• Linearity: residuals vs predicted value (Y) / explanatory variable (X)
• Constant variance: residuals vs predicted value (Y) / explanatory variable (X)
• Normality: QQ plot of residuals
• Independence: residuals vs the time order of the observations
• Outliers and influential observations

18 What is an influential observation?
• An observation is influential if removing it markedly changes the estimated coefficients of the regression model.
• An outlier may be an influential observation.

19 To identify outliers and/or influential observations
• Studentized residuals: a case may be considered an outlier if the absolute value of its studentized residual exceeds 2.
• Leverage values: a leverage larger than 2p/n (p = number of regression coefficients, n = number of observations) implies that the observation has a high potential for influence.
• Cook's distances: if Cook's distance is close to or larger than 1, the case may be considered influential.
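The three measures can be computed directly from the hat matrix, as in the following sketch. The function name and data are hypothetical, the residuals are internally studentized (an assumption about which variant is meant), and "close to or larger than 1" is simplified to a cutoff of 1.

```python
import numpy as np

def influence_measures(X, y):
    """Leverage, internally studentized residuals and Cook's distance.

    X must already contain a leading column of ones for the intercept.
    """
    n, p = X.shape
    hat = X @ np.linalg.inv(X.T @ X) @ X.T       # hat matrix H
    h = np.diag(hat)                             # leverage values h_ii
    resid = y - hat @ y                          # ordinary residuals
    s2 = resid @ resid / (n - p)                 # residual variance estimate
    student = resid / np.sqrt(s2 * (1.0 - h))    # studentized residuals
    cooks = student**2 * h / (p * (1.0 - h))     # Cook's distances
    return h, student, cooks

# Hypothetical data: the last case sits far from the others in x.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 9.1, 20.0])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

h, student, cooks = influence_measures(X, y)
outliers = np.where(np.abs(student) > 2)[0]      # |studentized residual| > 2
high_leverage = np.where(h > 2 * p / n)[0]       # leverage > 2p/n
influential = np.where(cooks >= 1)[0]            # Cook's distance at least 1

print("outliers:", outliers)
print("high leverage:", high_leverage)
print("influential:", influential)
```

Leverage depends only on the explanatory values, so a case can have high leverage without being an outlier; Cook's distance combines both pieces of information.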

21 Miscellanies
• Multicollinearity: as a rule of thumb, it is a concern when the correlation between two independent variables is close to or above 0.85 in absolute value
• Remember to use Ln(WEEKS) from Question 5
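The 0.85 rule of thumb can be applied mechanically to every pair of independent variables. The helper below is a hypothetical sketch on made-up data, not part of the lab instructions.

```python
import numpy as np

def flag_collinear_pairs(columns, threshold=0.85):
    """Return (name_a, name_b, r) for each pair with |r| >= threshold."""
    names = list(columns)
    corr = np.corrcoef(np.vstack([columns[name] for name in names]))
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):       # each pair once
            if abs(corr[i, j]) >= threshold:
                flagged.append((names[i], names[j], round(float(corr[i, j]), 3)))
    return flagged

# Hypothetical predictors: x2 is nearly twice x1, x3 is unrelated to both.
predictors = {
    "x1": np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]),
    "x2": np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.1]),
    "x3": np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0]),
}

flagged = flag_collinear_pairs(predictors)
print(flagged)
```

Pairwise correlations only catch collinearity between two variables at a time, so a clean result here does not rule out a near-linear relationship among three or more.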

22 Miscellanies
• Understand the meaning of the 95% CI of coefficients
• Identify the "full model" and the "reduced model" when doing the extra-sum-of-squares F-test
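To make the full/reduced distinction concrete, here is a small Python sketch of the extra-sum-of-squares F-test. The reduced model is the full model with the tested variables left out. Function names and data are hypothetical; the p-value would still have to be read from an F table or from software.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares and residual degrees of freedom of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid), len(y) - X.shape[1]

def extra_ss_f(X_full, X_reduced, y):
    """F statistic and its (numerator, denominator) degrees of freedom."""
    rss_f, df_f = rss(X_full, y)
    rss_r, df_r = rss(X_reduced, y)
    f_stat = ((rss_r - rss_f) / (df_r - df_f)) / (rss_f / df_f)
    return f_stat, df_r - df_f, df_f

# Hypothetical data (not the lab data): 8 cases, two predictors.
x1 = np.arange(1.0, 9.0)
x2 = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
noise = 0.5 * np.array([1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0])
y = 2.0 + 3.0 * x1 + 1.0 * x2 + noise

ones = np.ones(8)
X_full = np.column_stack([ones, x1, x2])     # full model: intercept, x1, x2
X_reduced = np.column_stack([ones, x1])      # reduced model: x2 dropped

f_stat, df_num, df_den = extra_ss_f(X_full, X_reduced, y)
print(f"F = {f_stat:.2f} on ({df_num}, {df_den}) df")
```

When only one variable is dropped, as here, the F statistic equals the square of that variable's t-statistic in the full model.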

