BUSI 410 Business Analytics

BUSI 410 Business Analytics Module 19: Multiple Regression

Last lecture: simple linear regression (with one driver); reading regression output; presenting the regression equation in a report.

Something is missing… We assumed that the errors 𝜀 are normal, independent of the driver, and independent of one another. Is that so? Skew: 1.2

Multiple regression: a regression with multiple independent variables (but still a single dependent variable). Multiple regression accounts for the joint impact of multiple drivers.

Sakura Motors: What drives car emission? Use your logic to choose drivers: fuel economy, acceleration, weight, passenger capacity, engine displacement, cylinders, horsepower.

Sakura Motors: Choose a varied sample We can explain variation in emission by variations in the drivers… only if there are variations!

Sakura Motors: Regression output looks good
SUMMARY OUTPUT
Regression Statistics
  Multiple R          0.963233
  R Square            0.927817
  Adjusted R Square   0.908383
  Standard Error      0.643631
  Observations        34
ANOVA
              df   SS         MS         F          Significance F
  Regression   7   138.4448   19.77783   47.74239   3.05E-13
  Residual    26   10.7708    0.414261
  Total       33   149.2156
Coefficients
              Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
  Intercept    9.164994      1.903614          4.814523  5.48E-05    5.252059   13.07793
  MPG         -0.22612       0.036559         -6.18514   1.53E-06   -0.30127    -0.15097
  seconds      0.229146      0.09864           2.323068  0.028265    0.02639     0.431903
  liters       0.414272      0.293367          1.412128  0.16977    -0.18875     1.017296
  pounds (K)   0.544133      0.294646          1.846736  0.076198   -0.06152     1.149786
  cylinders   -0.03513       0.188284         -0.18656   0.853456   -0.42215     0.351897
  horsepower  -0.00052       0.003688         -0.14111   0.888875   -0.0081      0.00706
  passengers  -0.08552       0.119738         -0.7142    0.481466   -0.33164     0.160607
R Square: % of variation in the dependent variable explained by the variation in the independent variables.
Significance F: the p-value for H0: all slopes are zero (R Square = 0) vs. H1: at least one slope is non-zero (R Square > 0).
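As a cross-check, the summary statistics in this output can be recomputed from the ANOVA sums of squares. The sketch below is illustrative only: the variable names are hypothetical, and the formulas are the standard textbook definitions of R Square, Adjusted R Square, the F statistic and the regression standard error, not anything specific to the course materials.

```python
import math

# Values copied from the ANOVA table of the full Sakura Motors model
ss_regression, df_regression = 138.4448, 7
ss_residual, df_residual = 10.7708, 26
ss_total, df_total = 149.2156, 33

# Share of total variation explained by the regression
r_square = ss_regression / ss_total

# Adjusted R Square penalizes each additional driver
adj_r_square = 1 - (1 - r_square) * df_total / df_residual

# Mean squares and the overall F statistic
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Regression standard error is the square root of the residual mean square
standard_error = math.sqrt(ms_residual)

print(f"R Square          = {r_square:.6f}")
print(f"Adjusted R Square = {adj_r_square:.6f}")
print(f"F                 = {f_stat:.4f}")
print(f"Standard Error    = {standard_error:.6f}")
```

The printed values should agree with the Excel output up to rounding of the inputs.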

Sakura Motors: A closer look
Insignificant drivers
              Coefficients    P-value
  Intercept    9.164993684    5.48333E-05
  MPG         -0.226120196    1.52989E-06
  seconds      0.229146363    0.028265174
  liters       0.414271531    0.169769828
  pounds (K)   0.544132795    0.076198053
  cylinders   -0.0351256      0.853456304
  horsepower  -0.000520352    0.888874572
  passengers  -0.085516635    0.481466463
“Wrong” signs

(Multi)collinearity: two or more drivers being highly correlated. Rule of thumb: multicollinearity is present if |correlation| > 0.7 between drivers. Multicollinearity symptoms: important drivers appear insignificant; coefficients have “wrong” signs; forecast standard error increases.
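The rule of thumb can be applied mechanically by computing every pairwise correlation between drivers and flagging those above the threshold. The following is a minimal sketch under stated assumptions: the helper names `pearson` and `collinear_pairs` are hypothetical, and the three drivers hold made-up numbers, not the Sakura Motors data.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def collinear_pairs(drivers, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(drivers.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged

# Made-up illustrative values (NOT the Sakura Motors sample)
drivers = {
    "driver_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "driver_b": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly a multiple of driver_a
    "driver_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}

flagged = collinear_pairs(drivers)
print(flagged)
```

Only the `driver_a`/`driver_b` pair is flagged here, which suggests that one of the two carries redundant information.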

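Sakura Motors: Confirming multicollinearity
Pairwise correlations between drivers (each driver against the drivers listed after it):
  MPG vs. seconds, liters, pounds (K), cylinders, horsepower, passengers: -0.05, -0.81, -0.77, -0.74, -0.53, -0.17
  seconds vs. the later drivers: -0.01, -0.19, -0.36 (only three of the five values survive in this transcript)
  liters vs. pounds (K), cylinders, horsepower, passengers: 0.84, 0.92, 0.76, 0.59
  pounds (K) vs. cylinders, horsepower, passengers: 0.81, 0.72, 0.70
  cylinders vs. horsepower, passengers: 0.77, 0.60
  horsepower vs. passengers: 0.55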

How to reduce multicollinearity? Remove irrelevant or redundant information. Use parsimony (the “KISS” principle): “Everything should be made as simple as possible, but not simpler” (Albert Einstein). Use liters to represent engine size (eliminate cylinders, horsepower). Use pounds to represent car size (eliminate passengers).

Sakura Motors: Partial regression output
SUMMARY OUTPUT
Regression Statistics
  Multiple R          0.962329107
  R Square            0.92607731    (was 0.9278)
  Adjusted R Square   0.915881076
  Standard Error      0.616732707   (was 0.6436)
  Observations        34
ANOVA
              df   SS            MS         F          Significance F
  Regression   4   138.1851705   34.54629   90.82543   5.70804E-16
  Residual    29   11.03041774   0.380359
  Total       33   149.2155882
Coefficients
              Coefficients    Standard Error   t Stat     P-value    Lower 95%      Upper 95%
  Intercept    8.98999042     1.802313247       4.988029  2.62E-05    5.303846      12.6761348
  MPG         -0.228397885    0.034089787      -6.69989   2.38E-07   -0.298119327   -0.15867644
  seconds      0.239516296    0.086938163       2.755019  0.010033    0.061707792    0.4173248
  liters       0.360546138    0.197403115       1.826446  0.078094   -0.043188557    0.76428083
  pounds (K)   0.426995446    0.235862056       1.810361  0.080613   -0.055396615    0.90938751
All signs “correct”; all drivers significant at 0.1.
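To show how the partial model would be used, the sketch below plugs the four coefficients into the regression equation for a single car. The car's characteristics are invented for illustration, the function name `predict_emission` is hypothetical, and the emission units are whatever units the original data used (the slides do not state them).

```python
# Coefficients copied from the partial regression output
intercept = 8.98999042
coefficients = {
    "MPG": -0.228397885,
    "seconds": 0.239516296,
    "liters": 0.360546138,
    "pounds (K)": 0.426995446,
}

def predict_emission(car):
    """Regression equation: intercept plus the sum of slope * driver value."""
    return intercept + sum(coefficients[name] * car[name] for name in coefficients)

# Hypothetical car, not taken from the Sakura Motors sample
hypothetical_car = {"MPG": 30, "seconds": 10, "liters": 2.0, "pounds (K)": 3.0}

predicted = predict_emission(hypothetical_car)
print(f"Predicted emission: {predicted:.4f}")
```

A point prediction like this is most trustworthy for cars whose characteristics fall within the range of the sample used to fit the model.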

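Sakura Motors: Is the partial model worse? Removing a driver never increases $R^2$ (and almost always decreases it), so a lower $R^2$ alone does not show that the partial model is worse. The partial F-test is a hypothesis test for
H0: $R_p^2 = R_f^2$ (the partial model has as much explanatory power as the full model)
H1: $R_p^2 < R_f^2$ (the partial model has less explanatory power than the full model)
The test statistic is the reduction in $R^2$ per removed driver divided by the full model's unexplained variation per residual DF:
$$F = \frac{(R_f^2 - R_p^2)/(\text{\# of removed drivers})}{(1 - R_f^2)/(\text{full model's residual DF})} = \frac{(0.927817 - 0.926077)/3}{(1 - 0.927817)/26} \approx 0.209$$
The p-value is F.DIST.RT(F, # of removed drivers, full model's residual DF) = F.DIST.RT(0.209, 3, 26) = 0.89. With a p-value this large, we cannot reject H0: the data give no evidence that the partial model has less explanatory power than the full model.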
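The same calculation can be reproduced outside Excel. The sketch below uses only the Python standard library: it computes the partial F statistic from the two R Square values and approximates the right-tail probability by numerically integrating the F density with Simpson's rule. The function names are hypothetical, the integration is a simple approximation rather than Excel's algorithm, and the density helper assumes more than two numerator degrees of freedom (true here, since three drivers were removed).

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution; this sketch assumes d1 > 2, so the density is 0 at x = 0."""
    if x <= 0:
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (
        (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
        - log_beta
    )
    return math.exp(log_pdf)

def f_right_tail(x, d1, d2, steps=2000):
    """Approximate P(F > x), the quantity F.DIST.RT returns, via Simpson's rule on [0, x]."""
    h = x / steps  # steps must be even for Simpson's rule
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1 - total * h / 3

# Values from the full and partial regression outputs
r2_full, r2_partial = 0.927817, 0.926077
removed_drivers = 3        # cylinders, horsepower, passengers
residual_df_full = 26

f_stat = ((r2_full - r2_partial) / removed_drivers) / ((1 - r2_full) / residual_df_full)
p_value = f_right_tail(f_stat, removed_drivers, residual_df_full)

print(f"Partial F = {f_stat:.4f}, p-value = {p_value:.2f}")
```

The p-value rounds to the 0.89 reported on the slide, so the partial model is not shown to be worse than the full one.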

Exercise: Tackle multicollinearity
Predict salary using age, experience and education? Drop age.
Predict basketball performance using height and weight? Use height and body mass index (BMI).
Predict election result using % of income groups? Drop (any) one group.

Sakura Motors: Final check on residuals. Residuals seem independent of drivers.

Sakura Motors: Final check on residuals. Skewness = 0.46; residuals seem approximately normal.
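For reference, a skewness figure like this can be computed directly from a list of residuals. The slides do not say which formula was used; the sketch below assumes the adjusted sample skewness (the formula behind Excel's SKEW function), and the residual values are made up for illustration.

```python
import math

def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1)
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)

# Made-up residuals (NOT the Sakura Motors residuals)
symmetric_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed_residuals = [1.0, 1.0, 1.0, 1.0, 6.0]

print(sample_skewness(symmetric_residuals))     # close to 0 for symmetric data
print(sample_skewness(right_skewed_residuals))  # positive for a long right tail
```

Values near zero are consistent with the symmetry of a normal distribution, while larger positive values indicate a long right tail.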

Today’s assignment: Lab Practice 7, due on 11/9 by class time; Homework 4, due on 11/14 by class time; Group Case Project, due today by midnight; project self- and peer-evaluation (Canvas => Quizzes), 1 bonus pt.

For next class: bring your laptop, textbook and course pack to the Lab Session.