Lecture Unit 8 8.4 Multiple Regression

8.4 Introduction
In this section we extend simple linear regression, which had a single explanatory variable, to allow for any number of explanatory variables. We expect to build a model that fits the data better than the simple linear regression model.

Introduction
We shall use computer printout to:
- Assess the model
  - How well does it fit the data?
  - Is it useful?
  - Are any required conditions violated?
- Employ the model
  - Interpret the coefficients
  - Make predictions using the prediction equation
  - Estimate the expected value of the dependent variable

Multiple Regression Model
We allow for k explanatory variables to potentially be related to the response variable:
y = b0 + b1x1 + b2x2 + … + bkxk + e
Here y is the dependent variable, x1, …, xk are the independent variables, b0, b1, …, bk are the coefficients, and e is the random error variable.

The Multiple Regression Model
Idea: examine the linear relationship between one response variable (y) and two or more explanatory variables (the xi).
Population model: y = β0 + β1x1 + β2x2 + … + βkxk + ε, with Y-intercept β0, population slopes β1, …, βk, and random error ε.
Estimated multiple regression model: ŷ = b0 + b1x1 + b2x2 + … + bkxk, where ŷ is the estimated (or predicted) value of y, b0 is the estimated intercept, and b1, …, bk are the estimated slope coefficients.
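A minimal sketch of the distinction (Python with NumPy; the data, coefficient values, and error standard deviation are made up for illustration): data are generated from an assumed population model, and the estimated model is then obtained by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2                          # hypothetical sample size and number of explanatory variables
X = rng.uniform(0, 100, size=(n, k))

# Population model: y = beta0 + beta1*x1 + beta2*x2 + error
beta = np.array([5.0, 0.4, 0.7])       # assumed "true" intercept and slopes (illustration only)
error = rng.normal(0, 10, size=n)      # random error: mean 0, constant standard deviation
y = beta[0] + X @ beta[1:] + error

# Estimated model: yhat = b0 + b1*x1 + b2*x2 (no error term)
X1 = np.column_stack([np.ones(n), X])  # design matrix with a column of 1s for the intercept
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(b)                               # b0, b1, b2 should be close to the population coefficients
```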

Simple Linear Regression
[Figure: scatterplot of y against x with the fitted line (intercept = b0, slope = b1). For an observation at xi, the figure marks the observed value of y, the predicted value of y on the line, and the vertical gap between them, εi, the random error for that x value.]

Multiple Regression, 2 explanatory variables
[Figure: with two explanatory variables x1 and x2, least squares fits a plane (instead of a line) through the cloud of points in (x1, x2, y) space; the scatter of points around the plane is the random error.]

Multiple Regression Model
Two-variable model:
[Figure: a sample observation (x1i, x2i, yi) plotted above the fitted plane; the residual is e = yi − ŷi, the vertical distance from the point to the plane.]
The best fit equation, ŷ, is found by minimizing the sum of squared errors, Σe².
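A sketch of that criterion in Python with NumPy (the data are made up): the least squares coefficients can be obtained from the normal equations, and any other coefficient vector yields a larger sum of squared errors.

```python
import numpy as np

def sse(b, X1, y):
    """Sum of squared errors for a coefficient vector b (intercept first)."""
    e = y - X1 @ b
    return float(e @ e)

rng = np.random.default_rng(1)
X = rng.uniform(0, 100, size=(50, 2))          # two explanatory variables (made-up data)
y = 3 + 0.5 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(0, 5, size=50)
X1 = np.column_stack([np.ones(len(y)), X])     # design matrix with an intercept column

b_ls = np.linalg.solve(X1.T @ X1, X1.T @ y)    # normal equations: (X'X) b = X'y
print(sse(b_ls, X1, y))                        # the minimum achievable sum of squared errors
print(sse(b_ls + 0.05, X1, y))                 # any other coefficient vector gives a larger SSE
```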

Required conditions for the error variable
- The error e is normally distributed.
- The mean of e is zero, and its standard deviation (σe) is constant for all values of y.
- The errors are independent.
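One informal way to check these conditions (a sketch, not part of the slides) is to inspect the residuals from the fitted model; here simulated residuals stand in for the real ones, and SciPy's Shapiro-Wilk test is used as a rough normality check.

```python
import numpy as np
from scipy import stats

# 'residuals' would normally be y - yhat from the fitted model;
# simulated values stand in here purely for illustration.
rng = np.random.default_rng(2)
residuals = rng.normal(0, 11.5, size=203)

print(residuals.mean())            # should be close to zero
print(residuals.std(ddof=1))       # overall spread; plot residuals vs fitted values to judge whether it is constant
print(stats.shapiro(residuals))    # Shapiro-Wilk statistic and P-value as a rough normality check
```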

8.4 Estimating the Coefficients and Assessing the Model
The procedure used to perform regression analysis:
1. Obtain the model coefficients and statistics using statistical software.
2. Diagnose violations of the required conditions, and try to remedy any problems identified.
3. Assess the model fit using statistics obtained from the sample.
4. If the model assessment indicates a good fit to the data, use the model to interpret the coefficients and generate predictions.

Estimating the Coefficients and Assessing the Model, Example
Predicting final exam scores in BUS/ST 350. We would like to predict final exam scores in 350 using information generated during the semester. Predictors of the final exam score: Exam 1, Exam 2, Exam 3, and homework total.

Estimating the Coefficients and Assessing the Model, Example
Data were collected from 203 randomly selected students from previous semesters. The following model is proposed:
final exam = b0 + b1 exam1 + b2 exam2 + b3 exam3 + b4 hwtot
[Data table: one row per student with columns exam1, exam2, exam3, hwtot, and finalexm.]

Regression Analysis, Excel Output
This is the sample regression equation (sometimes called the prediction equation):
Final exam score = 0.0498 + 0.1002 exam1 + 0.1541 exam2 + 0.2960 exam3 + 0.1077 hwtot
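For readers not working in Excel, here is a sketch of the same fit in Python with statsmodels (the file name, and the assumption that the data sit in a CSV with these column names, are mine rather than the course's):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file with one row per student and columns exam1, exam2, exam3, hwtot, finalexm
df = pd.read_csv("final_exam_scores.csv")

model = smf.ols("finalexm ~ exam1 + exam2 + exam3 + hwtot", data=df).fit()
print(model.summary())    # coefficients, standard errors, R-squared, and the ANOVA F test
print(model.params)       # the sample regression (prediction) equation coefficients
```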

Interpreting the Coefficients
b0 = 0.0498. This is the intercept, the value of y when all the explanatory variables take the value zero. Since the data ranges of the independent variables do not include zero, do not interpret the intercept.
b1 = 0.1002. In this model, for each additional point on exam 1, the final exam score increases on average by 0.1002 (assuming the other variables are held constant).

Interpreting the Coefficients
b2 = 0.1541. In this model, for each additional point on exam 2, the final exam score increases on average by 0.1541 (assuming the other variables are held constant).
b3 = 0.2960. For each additional point on exam 3, the final exam score increases on average by 0.2960 (assuming the other variables are held constant).
b4 = 0.1077. For each additional point on the homework total, the final exam score increases on average by 0.1077 (assuming the other variables are held constant).

Final Exam Scores, Predictions
Predict the average final exam score of a student with the following scores: Exam 1 score 75, Exam 2 score 79, Exam 3 score 85, homework total 310. (In Excel, the TREND function can produce this prediction.)
Final exam score = 0.0498 + 0.1002(75) + 0.1541(79) + 0.2960(85) + 0.1077(310) = 78.2857
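The same prediction computed by hand in Python, using the coefficients reported above; only the student's scores are plugged in.

```python
# Coefficients from the fitted model: intercept, then exam1, exam2, exam3, hwtot
b = [0.0498, 0.1002, 0.1541, 0.2960, 0.1077]
x = [75, 79, 85, 310]     # exam 1, exam 2, exam 3 scores and homework total for the student

yhat = b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))
print(round(yhat, 4))     # 78.2857
```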

Model Assessment
The model is assessed using three tools:
- The standard error of the residuals
- The coefficient of determination
- The F-test of the analysis of variance
The standard error of the residuals is also used in constructing the other two tools.

Standard Error of Residuals
The standard deviation of the residuals is estimated by the standard error of the residuals:
se = sqrt(SSE / (n − k − 1))
The magnitude of se is judged by comparing it to the mean of the response variable, ȳ.
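A sketch of that calculation in Python; the residuals here are simulated stand-ins, but with the real residuals from the fitted model this function gives the value shown on the printout.

```python
import numpy as np

def standard_error_of_residuals(residuals, k):
    """s_e = sqrt(SSE / (n - k - 1)), where k is the number of explanatory variables."""
    n = len(residuals)
    sse = float(np.sum(np.square(residuals)))
    return np.sqrt(sse / (n - k - 1))

# Illustration with simulated residuals for a model with k = 4 explanatory variables
rng = np.random.default_rng(3)
res = rng.normal(0, 11.5, size=203)
print(standard_error_of_residuals(res, k=4))   # should come out in the neighborhood of 11.5
```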

Regression Analysis, Excel Output
In the ANOVA portion of the output: SSE is the sum of squares of the residuals, MSE = SSE/198 is the mean square error, equal to (standard error of the residuals)², and the standard error of the residuals is sqrt(MSE).

Standard Error of Residuals
From the printout, se = 11.5122. Compared with the mean value of y, se does not seem particularly small. Question: can we conclude that the model does not fit the data well?

Coefficient of Determination
R² (like r² in simple linear regression) is the proportion of the variation in y that is explained by differences in the explanatory variables x1, x2, …, xk:
R² = 1 − SSE/SSTotal
From the printout, R² = 0.382466, so 38.25% of the variation in final exam score is explained by differences in the exam1, exam2, exam3, and hwtot explanatory variables; 61.75% remains unexplained. When adjusted for degrees of freedom, adjusted R² = 36.99%.
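A short Python sketch of both quantities; the final call simply checks the adjusted R² quoted above from R² = 0.382466 with n = 203 and k = 4.

```python
def r_squared(sse, ss_total):
    """R^2 = 1 - SSE / SSTotal: proportion of the variation in y explained by the model."""
    return 1 - sse / ss_total

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2, which penalizes R^2 for the k explanatory variables used."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Check of the figures quoted above: R^2 = 0.382466 with n = 203 students and k = 4 predictors
print(adjusted_r_squared(0.382466, n=203, k=4))   # about 0.3699, i.e. 36.99%
```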

Testing the Validity of the Model
We pose the question: is there at least one explanatory variable linearly related to the response variable? To answer it we test the hypotheses
H0: b1 = b2 = … = bk = 0
H1: at least one bi is not equal to zero.
If at least one bi is not equal to zero, the model has some validity.

Testing the Validity of the Final Exam Scores Regression Model
The hypotheses are tested by what is called an F test, shown in the ANOVA section of the Excel output. The degrees of freedom are k for the regression, n − k − 1 for the residuals, and n − 1 in total; the mean squares are MSR = SSR/k and MSE = SSE/(n − k − 1), and the test statistic F = MSR/MSE is reported along with its P-value.
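A sketch of the same computation in Python with SciPy (the SSR and SSE values below are placeholders, not the numbers from the actual printout):

```python
from scipy import stats

def f_test(ssr, sse, n, k):
    """Return (F, P-value) for H0: b1 = ... = bk = 0, using F = MSR / MSE."""
    msr = ssr / k                      # mean square due to regression, df = k
    mse = sse / (n - k - 1)            # mean square error, df = n - k - 1
    f = msr / mse
    p = stats.f.sf(f, k, n - k - 1)    # probability that an F(k, n-k-1) variable exceeds f
    return f, p

# Placeholder sums of squares for a model with n = 203 observations and k = 4 predictors
print(f_test(ssr=16000.0, sse=26000.0, n=203, k=4))
```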

Testing the Validity of the Final Exam Scores Regression Model
Total variation in y: SSTotal = SSR + SSE. A large F results from a large SSR; in that case much of the variation in y is explained by the regression model, the model is useful, and the null hypothesis H0 should be rejected. Reject H0 when the P-value < 0.05.

Testing the Validity of the Final Exam Scores Regression Model
The P-value (Significance F) is less than 0.05, so we reject the null hypothesis. Conclusion: there is sufficient evidence to reject the null hypothesis in favor of the alternative; at least one of the bi is not equal to zero, so at least one explanatory variable is linearly related to y. This linear regression model is valid.

Testing the Coefficients
The hypotheses for each bi, tested in the Excel printout, are
H0: bi = 0
H1: bi ≠ 0
Test statistic: t = bi / sbi, with d.f. = n − k − 1, where sbi is the standard error of bi.
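A sketch of that test in Python with SciPy; the coefficient below is the exam 1 slope from the output, but its standard error is a made-up illustration value, not the printout's.

```python
from scipy import stats

def coefficient_t_test(b_i, s_b_i, n, k):
    """Return (t, two-sided P-value) for H0: bi = 0, using t = bi / s_bi with n - k - 1 df."""
    t = b_i / s_b_i
    p = 2 * stats.t.sf(abs(t), n - k - 1)
    return t, p

# b_i = 0.1002 is the exam1 coefficient; s_b_i = 0.05 is a hypothetical standard error
print(coefficient_t_test(b_i=0.1002, s_b_i=0.05, n=203, k=4))
```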