Topic 14: Inference in Multiple Regression

Outline
  Review multiple linear regression
  Inference of regression coefficients
    –Application to book example
  Inference of mean
    –Application to book example
  Inference of future observation
  Diagnostics and remedies

Data for Multiple Regression
  $Y_i$ is the response variable
  $X_{i1}, X_{i2}, \dots, X_{i,p-1}$ are the $p-1$ explanatory variables
  $Y_i, X_{i1}, X_{i2}, \dots, X_{i,p-1}$ are the data for case $i$, where $i = 1$ to $n$

Multiple Regression Model
  $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_{p-1} X_{i,p-1} + e_i$
  $Y_i$ is the value of the response variable for the $i$th case
  $\beta_0$ is the intercept
  $\beta_1, \beta_2, \dots, \beta_{p-1}$ are the regression coefficients for the explanatory variables
  The $e_i$ are independent, Normally distributed random errors with mean 0 and variance $\sigma^2$

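Least Squares Solutions
  $b = (X'X)^{-1} X'Y$ estimates $\beta$, where $X$ is the $n \times p$ design matrix (a column of 1s followed by the explanatory variables)
  $s^2 = \mathrm{MSE} = \mathrm{SSE}/(n-p)$ estimates $\sigma^2$
  $s$ = Root MSE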
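As a rough check on the least squares solutions, the estimates can be computed directly with matrix algebra. The sketch below assumes SAS/IML is available and that the data set a1 (variables young, income, and sales, as read in for the book example in this topic) already exists. The matrix names X0, X, Y, b, e, mse, and rootmse are chosen here for illustration only.

```sas
proc iml;
  use a1;                                  /* assumes data set a1 exists */
  read all var {young income} into X0;     /* explanatory variables */
  read all var {sales} into Y;             /* response */
  close a1;
  n = nrow(X0);
  X = j(n, 1, 1) || X0;                    /* column of 1s for the intercept */
  p = ncol(X);
  b = inv(t(X) * X) * t(X) * Y;            /* b = (X'X)^{-1} X'Y */
  e = Y - X * b;                           /* residuals */
  mse = ssq(e) / (n - p);                  /* s^2 = MSE = SSE / (n-p) */
  rootmse = sqrt(mse);                     /* s = Root MSE */
  print b mse rootmse;
quit;
```

The printed values should agree with the Parameter Estimates and Root MSE that proc reg reports for the same model.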

ANOVA F-test
  $H_0$: $\beta_1 = \beta_2 = \dots = \beta_{p-1} = 0$
  $H_a$: $\beta_k \neq 0$ for at least one $k = 1, 2, \dots, p-1$
  Under $H_0$, $F \sim F(p-1, n-p)$
  Reject $H_0$ if $F$ is large; in terms of the P-value, reject if the P-value ≤ 0.05
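The slide does not write out the test statistic. In the usual ANOVA notation it is the ratio of the model and error mean squares; the labels SSM and MSM for the model sum of squares and mean square are an assumption about naming, matching the "Model" row of the SAS output.

```latex
\[
F = \frac{\mathrm{MSM}}{\mathrm{MSE}}
  = \frac{\mathrm{SSM}/(p-1)}{\mathrm{SSE}/(n-p)}
\]
```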

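Inference for individual regression coefficients
  We can show $b \sim N(\beta, \sigma^2 (X'X)^{-1})$
  Define $s^2(b) = \mathrm{MSE}\,(X'X)^{-1}$; $s^2(b_k)$ is the diagonal element of $s^2(b)$ corresponding to $b_k$
  Then $(b_k - \beta_k)/s(b_k) \sim t(n-p)$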

Significance Test for $\beta_k$
  $H_0$: $\beta_k = 0$
  Same test statistic: $t^* = b_k / s(b_k)$
  Still use $df_E$, which now equals $n-p$
  P-value computed from the $t(n-p)$ distribution
  This tests the significance of a variable given that the other variables are already in the model (i.e., fitted last)

Confidence interval for $\beta_k$
  CI: $b_k \pm t_c\, s(b_k)$, where $t_c = t(.975, n-p)$
  Same form as before, but $df_E$ now equals $n-p$
  This interval gives the plausible values of $\beta_k$ given that the other variables are in the model
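The test and interval for each coefficient can be reproduced by hand as a sanity check. This is a minimal sketch, assuming SAS/IML is available and that data set a1 (young, income, sales) exists; it takes $s(b_k)$ to be the square root of the matching diagonal element of $\mathrm{MSE}\,(X'X)^{-1}$, and all matrix names are illustrative.

```sas
proc iml;
  use a1;                                  /* assumes data set a1 exists */
  read all var {young income} into X0;
  read all var {sales} into Y;
  close a1;
  n = nrow(X0);
  X = j(n, 1, 1) || X0;                    /* design matrix with intercept column */
  p = ncol(X);
  xpxi = inv(t(X) * X);                    /* (X'X)^{-1} */
  b = xpxi * t(X) * Y;
  mse = ssq(Y - X * b) / (n - p);
  se = sqrt(mse * vecdiag(xpxi));          /* s(b_k) for each coefficient */
  tstar = b / se;                          /* t* = b_k / s(b_k), elementwise */
  pval = 2 * (1 - probt(abs(tstar), n - p)); /* two-sided P-value from t(n-p) */
  tc = tinv(0.975, n - p);                 /* t_c = t(.975, n-p) */
  lower = b - tc * se;
  upper = b + tc * se;
  print b se tstar pval lower upper;
quit;
```

The rows are in the order intercept, young, income, so they can be compared line by line with the proc reg Parameter Estimates table.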

Example II (KNNL p 236)
  Dwaine Studios, Inc. operates portrait studios in 21 cities of medium size
  $Y_i$ is sales in city $i$
  $X_1$: population aged 16 and under
  $X_2$: per capita disposable income

Read in the data
  data a1;
    infile '../data/ch06fi05.txt';
    input young income sales;
  run;
  proc print data=a1;
  run;

Partial Proc Print Results
  Columns: Obs, young, income, sales

Proc Reg
  proc reg data=a1;
    model sales=young income;
  run;

Output
  Analysis of Variance
  Columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F
  Rows: Model (Pr > F <.0001), Error, Corrected Total
  Fit statistics: Root MSE, R-Square = 0.917
  At least one variable is helpful in predicting sales

Output
  Parameter Estimates
  Columns: Variable, DF, Parameter Estimate, Standard Error, t Value, Pr > |t|
  Rows: Intercept, young (Pr > |t| <.0001), income
  Both variables are helpful in explaining sales after the other is already in the model

CLB option
  Used to get confidence intervals for each coefficient
  proc reg data=a1;
    model sales=young income/clb;
  run;

Output
  Parameter Estimates
  Columns: Variable, DF, Parameter Estimate, Standard Error, 95% Confidence Limits
  Rows: Intercept, young, income

What if only young is fit?
  Parameter Estimates
  Columns: Variable, DF, Parameter Estimate, Standard Error, 95% Confidence Limits
  Rows: Intercept, young
  The CIs for both the intercept and young change dramatically when young is the only explanatory variable
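One way to produce this comparison is to fit both models in a single proc reg call. The sketch below assumes data set a1 exists; the model labels full and reduced are names chosen here, not taken from the course program.

```sas
proc reg data=a1;
  full:    model sales = young income / clb;   /* both explanatory variables */
  reduced: model sales = young / clb;          /* young only */
run;
quit;
```

Comparing the two Parameter Estimates tables shows how the estimate and interval for young depend on whether income is in the model.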

Estimation of $E(Y_h)$
  $X_h$ is now a vector of the form $(1, X_{h1}, X_{h2}, \dots, X_{h,p-1})'$
  We want a point estimate and a confidence interval for the subpopulation mean corresponding to the set of explanatory variables $X_h$

Theory for $E(Y_h)$
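The formulas for this slide are not present in the transcript. The standard results for estimating the mean response, written in the notation used earlier ($b$ for the coefficient estimates, $X$ for the design matrix), are sketched below; the 95% level is assumed to match the intervals elsewhere in this topic.

```latex
\[
\begin{aligned}
\hat{Y}_h &= X_h' b, \qquad E(\hat{Y}_h) = X_h' \beta = E(Y_h) \\
\sigma^2(\hat{Y}_h) &= \sigma^2\, X_h' (X'X)^{-1} X_h \\
s^2(\hat{Y}_h) &= \mathrm{MSE}\; X_h' (X'X)^{-1} X_h \\
\text{95\% CI for } E(Y_h) &: \quad \hat{Y}_h \pm t(.975,\, n-p)\; s(\hat{Y}_h)
\end{aligned}
\]
```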

Using CLM option
  proc reg data=a1;
    model sales=young income/clm;
    id young income;
  run;
  The id statement adds young and income to the output table

CLM Output
  Output Statistics
  Columns: Obs, young, income, Dependent Variable, Predicted Value, Std Error Mean Predict, 95% CL Mean

Prediction of $Y_h$
  $X_h$ is still a vector of the form $(1, X_{h1}, X_{h2}, \dots, X_{h,p-1})'$
  We want a prediction of $Y_h$ based on a set of predictor values, with an interval that expresses the uncertainty in our prediction

Theory for $Y_h$
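The formulas for this slide are also missing from the transcript. The standard results for predicting a new observation are sketched below, assuming the new error $e_h$ is independent of the data used to fit the model; the label $s^2(\mathrm{pred})$ follows the usual textbook convention and is an assumption about the original slide.

```latex
\[
\begin{aligned}
Y_h &= X_h' \beta + e_h, \qquad \hat{Y}_h = X_h' b \\
s^2(\mathrm{pred}) &= \mathrm{MSE}\,\bigl(1 + X_h' (X'X)^{-1} X_h\bigr) \\
\text{95\% prediction interval} &: \quad \hat{Y}_h \pm t(.975,\, n-p)\; s(\mathrm{pred})
\end{aligned}
\]
```

The extra MSE term accounts for the variability of the new observation around its mean, so the prediction interval is always wider than the confidence interval for the mean at the same $X_h$.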

Using the CLI option
  proc reg data=a1;
    model sales=young income/cli;
    id young income;
  run;
  The id statement adds young and income to the output table

CLI Output
  Output Statistics
  Columns: Obs, young, income, Dependent Variable, Predicted Value, Std Error Mean Predict, 95% CL Predict
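The CLM and CLI options report intervals only for rows in the input data set. A common way to get intervals at a new $X_h$ is to append a row with the response set to missing; that row is left out of the fit but still receives a predicted value and both intervals. The sketch below assumes data set a1 exists. The data set names new and a2 and the values young = 65.4, income = 17.6 are hypothetical, chosen here only for illustration.

```sas
data new;
  young = 65.4;      /* hypothetical value for illustration */
  income = 17.6;     /* hypothetical value for illustration */
  sales = .;         /* missing response: row is excluded from the fit */
run;

data a2;
  set a1 new;        /* original cases followed by the new case */
run;

proc reg data=a2;
  model sales = young income / clm cli;
  id young income;
run;
quit;
```

The last row of the Output Statistics table then holds the estimate, the 95% CL Mean, and the 95% CL Predict for the new case.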

Diagnostics
  Look at the distribution of each variable
  Look at the relationship between pairs of variables
  Plot the residuals versus
    –the predicted/fitted values
    –each explanatory variable
    –time (if available)
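These plots might be produced along the following lines. This is a sketch rather than the course program: it assumes data set a1 exists, and the output data set name diag and the variable names pred and resid are chosen here.

```sas
/* Pairwise relationships among the response and explanatory variables */
proc sgscatter data=a1;
  matrix sales young income;
run;

/* Save predicted values and residuals from the fitted model */
proc reg data=a1;
  model sales = young income;
  output out=diag p=pred r=resid;
run;
quit;

/* Residuals versus predicted values */
proc sgplot data=diag;
  scatter x=pred y=resid;
  refline 0 / axis=y;
run;

/* Residuals versus one explanatory variable; repeat for income */
proc sgplot data=diag;
  scatter x=young y=resid;
  refline 0 / axis=y;
run;
```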

Diagnostics
  Are the residuals approximately Normal?
    –Look at a histogram
    –Normal quantile plot
  Is the variance constant?
    –Plot the residuals vs anything that might be related to the variance (e.g., residuals vs predicted values and residuals vs each X)
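A histogram and Normal quantile plot of the residuals can be obtained with proc univariate. The sketch assumes data set a1 exists; the names diag and resid are illustrative.

```sas
/* Save the residuals from the fitted model */
proc reg data=a1 noprint;
  model sales = young income;
  output out=diag r=resid;
run;
quit;

/* Histogram with a fitted Normal curve, and a Normal quantile plot */
proc univariate data=diag;
  var resid;
  histogram resid / normal;
  qqplot resid / normal(mu=est sigma=est);
run;
```

Points falling close to the reference line in the quantile plot are consistent with approximately Normal residuals.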

Remedies
  Remedies are similar to those for simple regression
  Transformations such as Box-Cox
  Analyze with/without outliers
  More detail in KNNL Ch 9 and 10
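One possible way to explore a Box-Cox transformation of the response in SAS is proc transreg. The sketch below assumes data set a1 exists and uses the procedure's default grid of candidate powers; it is not necessarily the approach used in the course program.

```sas
proc transreg data=a1;
  model boxcox(sales) = identity(young income);
run;
```

The output reports the power that maximizes the likelihood, which can guide the choice of a convenient transformation such as a log or square root.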

Background Reading
  We finished Chapter 6.
  The program used to generate the output for confidence intervals for means and prediction intervals is topic14.sas