Chapter 12 Multiple Linear Regression Doing it with more variables! More is better. Chapter 12A.

What are we doing?

12-1 Multiple Linear Regression Models Many applications of regression analysis involve situations in which there is more than one regressor variable. A regression model that contains more than one regressor variable is called a multiple regression model.

Introduction For example, suppose that the effective life of a cutting tool depends on the cutting speed and the tool angle. A possible multiple regression model is

Y = β0 + β1x1 + β2x2 + ε

where Y is tool life, x1 is cutting speed, and x2 is tool angle.

The Model

Y = β0 + β1X1 + β2X2 + … + βkXk + ε

More than one regressor or predictor variable. Linear in the unknown parameters – the β's. β0 is the intercept, the βi are partial regression coefficients, and ε is the error term. The model can handle nonlinear functions as predictors, e.g. X3 = Z². Interactions can be present, e.g. β12X1X2.
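As a quick sketch of what fitting this model means in practice (not from the slides; all numbers are invented, and NumPy is assumed available), ordinary least squares can recover the β's from data:

```python
import numpy as np

# Invented data: n = 6 observations, k = 2 regressors,
# generated from Y = 3 + 2*X1 - 1*X2 + error.
rng = np.random.default_rng(0)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 3.0 + 2.0 * X1 - 1.0 * X2 + rng.normal(0.0, 0.1, size=6)

# Design matrix: a leading column of ones carries the intercept beta_0.
X = np.column_stack([np.ones(6), X1, X2])

# Least squares estimates of (beta_0, beta_1, beta_2).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # roughly [3, 2, -1]
```

With only a little noise, the estimates land close to the coefficients used to generate the data.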

The Data The data collection step in a regression analysis

A Data Example Example – Oakland games won:

13 = β0 + β1x1,1 + β2x1,2 + … + βkx1,k + ε1

A similar equation holds for every data point, so there are more equations than β's.

Least Squares Estimation of the Parameters The least squares function is

L = Σi εi² = Σi (yi − β0 − Σj βj xij)²

The least squares estimates must satisfy

∂L/∂β0 = 0 and ∂L/∂βj = 0, j = 1, …, k

The Least Squares Normal Equations Setting these derivatives to zero gives the least squares normal equations. The solutions to the normal equations are the least squares estimators of the regression coefficients.

The Matrix Approach

y = Xβ + ε

where
y – the vector of response values
X – our observations, the predictor variables
β – the vector of coefficients we must estimate
ε – the unknown vector of error terms, possibly normally distributed

Solving those normal equations In matrix form the normal equations are (X'X)β̂ = X'y, and the solution is

β̂ = (X'X)⁻¹X'y

Least-Squares in Matrix Form

More Matrix Approach
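A minimal sketch of the matrix approach (tiny made-up data set, NumPy assumed): form X'X and X'y, then solve the normal equations directly.

```python
import numpy as np

# Tiny made-up example: one regressor plus an intercept column of ones.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([2.1, 3.9, 6.2, 7.8])

# The normal equations in matrix form: (X'X) beta_hat = X'y.
XtX = X.T @ X
Xty = X.T @ y

# Solving the linear system is numerically preferable to forming (X'X)^-1.
beta_hat = np.linalg.solve(XtX, Xty)
print(beta_hat)  # [0.15, 1.94]
```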

Example 12-2 Wire bonding is a method of making interconnections between a microchip and other electronics as part of semiconductor device fabrication.

Example 12-2

Some Basic Terms and Concepts Residuals are estimators of the error terms in the regression model: ei = yi − ŷi. We use an unbiased estimator of the variance of the error term. SS_E is called the residual sum of squares and n − p is the residual degrees of freedom. ‘Residual’ – what remains after the regression explains all of the variability in the data it can.

Estimating σ² An unbiased estimator of σ² is

σ̂² = SS_E / (n − p)
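A sketch of this estimator in code (made-up numbers, NumPy assumed): compute the residuals, then SS_E/(n − p).

```python
import numpy as np

# Made-up data: n = 6 observations, p = 3 coefficients (intercept + 2 slopes).
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 4.0],
              [1.0, 4.0, 3.0],
              [1.0, 5.0, 6.0],
              [1.0, 6.0, 5.0]])
y = np.array([4.9, 6.1, 5.2, 8.0, 7.1, 10.0])
n, p = X.shape

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat      # e = y - y_hat
ss_e = residuals @ residuals      # SS_E = e'e
sigma2_hat = ss_e / (n - p)       # unbiased estimator of sigma^2
print(sigma2_hat)
```

Note that the residual vector is orthogonal to the columns of X, which is a handy sanity check on any least squares fit.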

Properties of the Least Squares Estimators Note that in this treatment, the elements of X are not random variables. They are the observed values of the xij. We treat them as though they are constants, often coefficients of random variables like the εi. The first result says that the estimators are unbiased. The second result gives the covariance structure of the estimators – both the diagonal and off-diagonal elements. It is important to remember that in a typical multiple regression model the estimates of the coefficients are not independent of one another.

Properties of the Least Squares Estimators Unbiased estimators: E(β̂) = β. Covariance matrix: Cov(β̂) = σ²(X'X)⁻¹.

Covariance Matrix of the Regression Coefficients In general, we do not know σ². We estimate it by the mean square error of the residuals, which yields estimated standard errors for the coefficients. The quality of our estimates of the regression coefficients is very much related to (X'X)⁻¹, and the estimates of the coefficients are not independent.

Test for Significance of Regression The appropriate hypotheses are

H0: β1 = β2 = … = βk = 0
H1: βj ≠ 0 for at least one j

The test statistic is

F0 = (SS_R / k) / (SS_E / (n − p)) = MS_R / MS_E

ANOVA The basic idea is that the data (the yi values) have some variability – if they didn't, there would be nothing to explain. A successful model explains most of the variability, leaving little to be carried by the error term.
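This decomposition and the significance-of-regression F test can be sketched as follows (made-up data; NumPy and SciPy assumed available):

```python
import numpy as np
from scipy import stats

# Made-up data: n = 6 observations, intercept + k = 2 regressors (p = 3).
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 4.0],
              [1.0, 4.0, 3.0],
              [1.0, 5.0, 6.0],
              [1.0, 6.0, 5.0]])
y = np.array([4.9, 6.1, 5.2, 8.0, 7.1, 10.0])
n, p = X.shape
k = p - 1

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
ss_t = np.sum((y - y.mean()) ** 2)       # total variability in the y_i
ss_e = np.sum((y - X @ beta_hat) ** 2)   # left over for the error term
ss_r = ss_t - ss_e                       # explained by the regression

f0 = (ss_r / k) / (ss_e / (n - p))       # F statistic
p_value = stats.f.sf(f0, k, n - p)       # upper-tail p-value
print(f0, p_value)
```

A small p-value says the regression explains significantly more variability than chance alone would.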

R² The coefficient of multiple determination:

R² = SS_R / SS_T = 1 − SS_E / SS_T

The Adjusted R² The adjusted R² statistic,

R²_adj = 1 − (SS_E / (n − p)) / (SS_T / (n − 1)),

penalizes the analyst for adding terms to the model. It can help guard against overfitting (including regressors that are not really useful).
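A small numeric sketch (all numbers invented) showing the degrees-of-freedom penalty pulling the adjusted R² below R²:

```python
# Invented sums of squares: n = 20 observations, p = 4 estimated coefficients.
n, p = 20, 4
ss_t = 100.0    # total sum of squares
ss_e = 20.0     # residual sum of squares

r2 = 1 - ss_e / ss_t
adj_r2 = 1 - (ss_e / (n - p)) / (ss_t / (n - 1))
print(r2, adj_r2)  # 0.8 and 0.7625
```

Adding a useless regressor lowers SS_E only slightly while shrinking n − p, so the adjusted value can actually fall.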

Tests on Individual Regression Coefficients and Subsets of Coefficients The hypotheses are

H0: βj = βj0
H1: βj ≠ βj0

The test statistic is

t0 = (β̂j − βj0) / se(β̂j)

Reject H0 if |t0| > t_{α/2, n−p}. This is called a partial or marginal test.
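A sketch of the partial t test (made-up data; NumPy and SciPy assumed), using the diagonal of σ̂²(X'X)⁻¹ for the standard errors:

```python
import numpy as np
from scipy import stats

# Made-up data: test H0: beta_1 = 0 against H1: beta_1 != 0.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 4.0],
              [1.0, 4.0, 3.0],
              [1.0, 5.0, 6.0],
              [1.0, 6.0, 5.0]])
y = np.array([4.9, 6.1, 5.2, 8.0, 7.1, 10.0])
n, p = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
sigma2_hat = np.sum((y - X @ beta_hat) ** 2) / (n - p)
se = np.sqrt(sigma2_hat * np.diag(XtX_inv))  # standard errors of the beta_hat_j

j = 1                                        # index of the coefficient under test
t0 = (beta_hat[j] - 0.0) / se[j]             # t0 = (beta_hat_j - beta_j0) / se
p_value = 2 * stats.t.sf(abs(t0), n - p)     # two-sided p-value
print(t0, p_value)
```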

Linear Independence of the Predictors - some random thoughts Instabilities in regression coefficients will occur when the values of one predictor are ‘nearly’ a linear combination of the other predictors. It would be incredibly unlikely that you would get an exact linear dependence; coming close is bad enough. What is the dimension of the space you are working in? It is n, where n is the number of data points in your sample. The response you are trying to match is an n-dimensional vector, and you are trying to match it with a set of k (k << n) predictors. The predictors had better be related to the response if this is going to be successful!

Interactions and Higher Order Terms - still thinking randomly Including interaction terms (products of two predictors), higher-order terms, or functions of predictors does not make the model nonlinear. Suppose you believe that the following relation may apply:

Y = β0 + β1X1 + β22X2² + β23X2X3 + β4 exp(X4) + ε

This is still a linear regression model – linear in the β's. After recording the values of X1 through X4, you simply compute the values of these derived predictors and enter them as columns in the worksheet for the regression software. The model would become nonlinear if you were trying to estimate a parameter inside the exponential function, e.g. β4 exp(β5X4).
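The point about derived predictors can be sketched directly (invented data, NumPy assumed): each nonlinear term becomes just another column of the design matrix, and ordinary least squares still applies.

```python
import numpy as np

# Invented data generated from
#   Y = 1 + 2*X1 + 0.5*X2^2 - 1*X2*X3 + 0.3*exp(X4) + error
rng = np.random.default_rng(1)
X1, X2, X3, X4 = rng.uniform(0.0, 2.0, size=(4, 30))
y = (1.0 + 2.0 * X1 + 0.5 * X2**2 - 1.0 * X2 * X3
     + 0.3 * np.exp(X4) + rng.normal(0.0, 0.05, size=30))

# Derived predictor columns: the model stays linear in the betas.
X = np.column_stack([np.ones(30), X1, X2**2, X2 * X3, np.exp(X4)])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # roughly [1, 2, 0.5, -1, 0.3]
```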

The NFL Again - problem Predictor variables:
Att – pass attempts
Comp – completed passes
Pct Comp – percent of passes completed
Yds – yards gained passing
Yds per Att – yards gained per pass attempt
Pct TD – percent of attempts that are TDs
Long – longest pass completion
Int – number of interceptions
Pct Int – percentage of attempts that are interceptions
Response variable – quarterback rating

The NFL Again – problem 12-15

Fit a multiple regression model using Pct Comp, Pct TD, and Pct Int. Estimate σ². Determine the standard errors of the regression coefficients. Predict the rating when Pct Comp = 60%, Pct TD = 4%, and Pct Int = 3%.

Now the solutions

More NFL - problem Test the regression model for significance using α = 0.05. Find the p-value. Conduct a t-test on each regression coefficient. These are very good problems to answer.

Again with the answers

Even more answers

Next Time Confidence intervals, again. Modeling and model adequacy. Also, doing it with computers. Computers are good.