Class 15: Tuesday, Nov. 2 Multiple Regression (Chapter 11, Moore and McCabe).

Example: Predicting Emergency Calls to the AAA Club
The AAA club of New York provides Emergency Road Service (ERS) to its members. This service is especially useful in the winter months, when people can be stranded by frozen locks, dead batteries, weather-induced accidents, and spinning tires. If the weather is very bad, the club can be overwhelmed with calls. By tracking the weather conditions, the club can divert resources from other club activities to the ERS for projected peak days. To allocate its resources efficiently, the club would like to be able to predict ERS calls from the weather forecast on the previous day.

Data
The club has data for 28 past days in the winter:
–the number of ERS calls to New York AAA offices
–the forecasted average temperature ((forecast high + forecast low)/2)
–the range of the forecasted temperatures (forecast high – forecast low)
–whether rain is forecast (0 if no rain in forecast, 1 if rain in forecast)
–whether snow is forecast (0 if no snow in forecast, 1 if snow in forecast)
–whether the day is a weekday (1 if M, T, W, Th, F; 0 if Sat or Sun)
–whether the day is a Sunday (1 if Sunday, 0 if not)
–whether a subzero temperature is forecast (1 if subzero temperature forecast, 0 if not)
Source: New York Motorist, March, 1994

Simple Linear Regression using Average Forecast Temperature
Root mean square error = calls. Can we do better by using more variables than just average forecast temperature to predict the calls?

Multiple Linear Regression Model
Model for the distribution of Y for the subpopulation of units with explanatory variables X1, …, Xp. Multiple linear regression model:
–Y = β0 + β1X1 + … + βpXp + e.
–The distribution of the disturbance e is normal with mean 0 and SD σ.
–Observations are independent.

Multiple Linear Regression in JMP
–Analyze, Fit Model.
–Put the response variable in Y.
–Click on the explanatory variables and then click Add under Construct Model Effects.
–Click Run Model.
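The least-squares fit that JMP's Fit Model performs can be sketched in Python with NumPy. The data below are simulated stand-ins for the AAA variables (the actual 28-day dataset is not reproduced in these notes), so the coefficient values are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 28  # same number of days as in the AAA example

# Hypothetical predictors mimicking the AAA variables
avg_temp = rng.uniform(-5, 40, n)          # forecasted average temperature
temp_range = rng.uniform(5, 25, n)         # forecast high - forecast low
rain = rng.integers(0, 2, n)               # rain in forecast?
snow = rng.integers(0, 2, n)               # snow in forecast?
weekday = rng.integers(0, 2, n)            # M-F?
sunday = (1 - weekday) * rng.integers(0, 2, n)
subzero = (avg_temp < 0).astype(float)     # subzero temperature forecast?

# Simulated response: calls fall as temperature rises, rise when snow is forecast
calls = 120 - 2.0 * avg_temp + 30 * snow + rng.normal(0, 15, n)

# Design matrix with an intercept column, then solve the least-squares problem
X = np.column_stack([np.ones(n), avg_temp, temp_range, rain,
                     snow, weekday, sunday, subzero])
coef, _, _, _ = np.linalg.lstsq(X, calls, rcond=None)
print(coef)  # b0, b_avg_temp, ..., b_subzero
```

These are the same estimates JMP reports in its Parameter Estimates table for this design matrix; least squares chooses the coefficients that minimize the sum of squared prediction errors.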

Root mean square error = calls for multiple regression, compared to calls for simple linear regression on average temperature.

Making Predictions
Suppose we want to estimate the average number of calls for New York AAA offices for a day when
–The average temperature is predicted to be 20.
–The temperature range is predicted to be 10 degrees.
–No rain is in the forecast.
–Snow is in the forecast.
–It is a weekday (so weekday=1, Sunday=0).
–The temperature is not predicted to be subzero.
The estimated mean number of calls for days with these properties is
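The estimated mean is just the fitted coefficients dotted with the new day's explanatory variables (with a 1 for the intercept). A sketch with hypothetical coefficient values, since the actual JMP estimates are not reproduced in these notes:

```python
import numpy as np

# Hypothetical fitted coefficients (illustrative only, not the real AAA estimates):
# intercept, avg_temp, temp_range, rain, snow, weekday, sunday, subzero
b = np.array([80.0, -2.0, 1.5, 10.0, 30.0, 5.0, -3.0, 25.0])

# The day described on the slide: avg temp 20, range 10, no rain, snow,
# weekday (so sunday=0), not subzero
x_new = np.array([1, 20, 10, 0, 1, 1, 0, 0])

y_hat = b @ x_new  # estimated mean number of calls for days like this
print(y_hat)
```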

Residuals and Root Mean Square Errors
Residual for observation i = prediction error for observation i = (actual Yi) – (predicted Yi).
Root mean square error = typical size of the absolute value of a prediction error.
As with the simple linear regression model, if the multiple linear regression model holds:
–About 68% of the observations will be within one RMSE of their predicted value.
–About 95% of the observations will be within two RMSEs of their predicted value.
–About 99% of the observations will be within three RMSEs of their predicted value.
For the New York AAA data, about 95% of the time the actual number of ERS calls will be within two RMSEs of the predicted number of calls based on the multiple linear regression of calls on average forecast temperature, forecasted range of temperature, rain forecast, snow forecast, weekday, Sunday, and subzero.
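These calculations can be checked numerically. On simulated data where the model holds by construction (a stand-in for the AAA data, which are not reproduced here), the 68%/95% rules come out roughly as advertised:

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 500, 15.0
x1 = rng.uniform(-5, 40, n)
x2 = rng.uniform(5, 25, n)
y = 100 - 2 * x1 + 1.5 * x2 + rng.normal(0, sigma, n)

X = np.column_stack([np.ones(n), x1, x2])
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
resid = y - fitted                      # residual = actual - predicted

# Root mean square error: typical size of a prediction error
p = X.shape[1] - 1                      # number of explanatory variables
rmse = np.sqrt(np.sum(resid**2) / (n - p - 1))

# If the model holds, roughly 68% / 95% of observations fall
# within 1 / 2 RMSEs of their predicted value
within1 = np.mean(np.abs(resid) <= rmse)
within2 = np.mean(np.abs(resid) <= 2 * rmse)
print(rmse, within1, within2)
```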

Regression Coefficients
Interpretation of a coefficient: an increase of one degree in the average temperature is associated with a decrease in AAA calls, on average, assuming that all other variables are held constant (i.e., assuming nothing else changes).

Inferences About Regression Coefficients
The t-test for regression coefficient j tests H0: βj = 0 versus Ha: βj ≠ 0. This answers the question: is variable Xj useful for predicting Y when the other variables (the other X's) are already included in the model? If H0 is not rejected (p-value > 0.05), it means that we don't need to include Xj in the model if we have all the other X's in the model (either Xj is not useful in predicting Y or it is redundant).
Range of plausible values for βj = 95% confidence interval for βj = bj ± t* × SE(bj).
For the New York AAA data, the p-value for the t-test that the average temperature coefficient equals 0 is greater than 0.05: average temperature is not a useful predictor once we have already taken into account range, rain, snow, weekday, Sunday, and subzero.
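The t-statistics, p-values, and confidence intervals that JMP reports can be computed from the standard formula SE(bj) = sqrt(s² [(X'X)⁻¹]jj). A sketch on simulated data, where x2 is built to have no real effect on y:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 28
x1 = rng.uniform(-5, 40, n)   # e.g., average forecast temperature
x2 = rng.uniform(5, 25, n)    # e.g., forecast temperature range
y = 100 - 2 * x1 + rng.normal(0, 15, n)   # x2 truly has no effect

X = np.column_stack([np.ones(n), x1, x2])
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
df = n - X.shape[1]                     # n - (number of coefficients)
s2 = resid @ resid / df                 # estimate of the error variance

# Standard errors from the diagonal of s^2 (X'X)^-1
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

# t-statistic and two-sided p-value for H0: beta_j = 0
t_stat = coef / se
p_val = 2 * stats.t.sf(np.abs(t_stat), df)

# 95% confidence interval: b_j +/- t* x SE(b_j)
t_crit = stats.t.ppf(0.975, df)
ci = np.column_stack([coef - t_crit * se, coef + t_crit * se])
print(t_stat, p_val, ci)
```

Here x1, whose true slope is -2, should get a small p-value, while x2's p-value will typically be large since it is pure noise.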

R-Squared
R-squared: as in simple linear regression, it measures the proportion of variability in Y explained by the regression of Y on these X's. It is between 0 and 1; nearer to 1 indicates more variability explained.

Overall F-test
Test of whether any of the predictors are useful: H0: β1 = β2 = … = βp = 0 vs. Ha: at least one of β1, …, βp does not equal zero. Tests whether the model provides better predictions than the sample mean of Y.
p-value for the test: Prob>F in the Analysis of Variance table. Here p-value = 0.005, strong evidence that at least one of the predictors is useful for predicting ERS calls for the New York AAA club.
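R-squared and the overall F-statistic are linked by F = (R²/p) / ((1−R²)/(n−p−1)), where p is the number of predictors. A sketch on simulated data (a stand-in for the AAA data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 28
x1 = rng.uniform(-5, 40, n)
x2 = rng.uniform(5, 25, n)
y = 100 - 2 * x1 + 1.5 * x2 + rng.normal(0, 15, n)

X = np.column_stack([np.ones(n), x1, x2])
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# R-squared: proportion of variability in y explained by the regression
ss_res = resid @ resid
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Overall F-test of H0: every slope is zero (model no better than ybar)
p = X.shape[1] - 1                       # number of predictors
f_stat = (r2 / p) / ((1 - r2) / (n - p - 1))
p_value = stats.f.sf(f_stat, p, n - p - 1)
print(r2, f_stat, p_value)
```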

Prediction Intervals and CIs for Mean Response
Approximate 95% prediction interval for an observation with explanatory variables X1, …, Xp: predicted value ± 2 × RMSE.
Exact 95% prediction interval from JMP that takes into account the uncertainty in the estimates of the regression coefficients:
–Create a row with the X values but no Y.
–After Fit Model, click the red triangle next to Response, click Save Columns, and click Indiv Confid Intervals. This saves columns with the lower and upper bounds of the 95% prediction intervals.
95% confidence interval for the mean response in JMP: follow the same procedure as for the 95% prediction interval, but when you click Save Columns, click Mean Confid Intervals instead.
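The approximate interval ŷ ± 2 × RMSE can be sketched directly (on simulated stand-in data; JMP's exact interval, which also accounts for the uncertainty in the coefficient estimates, will be slightly wider):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 28
x1 = rng.uniform(-5, 40, n)
y = 100 - 2 * x1 + rng.normal(0, 15, n)

X = np.column_stack([np.ones(n), x1])
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
rmse = np.sqrt(resid @ resid / (n - 2))

# Approximate 95% prediction interval for a new day with x1 = 20
y_hat = coef[0] + coef[1] * 20
pi = (y_hat - 2 * rmse, y_hat + 2 * rmse)
print(pi)
```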

CIs for Mean Response and Prediction Intervals for AAA Data
For a day where
–The average temperature is predicted to be 20.
–The temperature range is predicted to be 10 degrees.
–No rain is in the forecast.
–Snow is in the forecast.
–It is a weekday.
–The temperature is not predicted to be subzero.
95% confidence interval for the mean response: (-29.46, )
95% prediction interval: ( , )

Next class:
–Checking the simple linear regression model.
–More on the interpretation of multiple regression coefficients. Multiple regression as a method for controlling for known lurking variables.
–Hand out on the final project.
–Hand out HW6, due next Thursday.