Rainfall Example
The data set contains corn yield (bushels per acre) and rainfall (inches) in six US corn-producing states (Iowa, Nebraska, Illinois, Indiana, Missouri and Ohio). A straight-line model is not adequate: yield increases with rainfall up to 12 inches and then starts to decrease. A better model for these data is a quadratic model: Yield = β0 + β1∙rain + β2∙rain^2 + ε. This is still a multiple linear regression model, since it is linear in the β's. However, we cannot interpret the individual coefficients, since we cannot change rain while holding rain^2 constant…
STA302/1001 - week 11
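A minimal sketch of fitting such a quadratic model by ordinary least squares. The data here are synthetic and the coefficient values are made up for illustration; they are not the corn data or estimates from the slides. The point is only that the design matrix has columns 1, rain and rain^2, so the fit is an ordinary linear least-squares problem.

```python
import numpy as np

# Synthetic illustration only: these are NOT the corn data from the slides.
rng = np.random.default_rng(0)
rain = np.linspace(6.0, 17.0, 40)             # hypothetical rainfall, inches
true_beta = np.array([-5.0, 6.0, -0.25])      # made-up beta0, beta1, beta2
yield_ = true_beta[0] + true_beta[1] * rain + true_beta[2] * rain**2
yield_ = yield_ + rng.normal(0.0, 0.5, rain.size)  # add random noise

# Design matrix with columns: intercept, rain, rain^2.
# The model is linear in the betas, so least squares applies directly.
X = np.column_stack([np.ones_like(rain), rain, rain**2])
beta_hat, *_ = np.linalg.lstsq(X, yield_, rcond=None)

# A fitted parabola with negative curvature peaks at rain = -b1 / (2 * b2).
peak_rain = -beta_hat[1] / (2.0 * beta_hat[2])
print("estimated coefficients:", beta_hat)
print("rainfall at fitted maximum:", peak_rain)
```

Because rain and rain^2 move together, the fitted curve (for example, where it peaks) is usually easier to interpret than either coefficient on its own.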

More on Rainfall Example
A plot of the residuals from the quadratic model versus year suggested a possible pattern of increase over time. Fit a model with year… To assess whether yield's relationship with rainfall depends on year, we include an interaction term in the model…

Interaction
Two predictor variables are said to interact if the effect that one of them has on the response depends on the value of the other. To include an interaction term in a model, we simply take the product of the two predictor variables and include the resulting variable in the model as an additional predictor. Interaction terms should not routinely be added to the model. Why? We should add interaction terms when the question of interest has to do with interaction or when we suspect that interaction exists (e.g., from a plot of residuals versus the interaction term). If an interaction term for two predictor variables is in the model, we should also include the terms for the predictor variables themselves, even if their coefficients are not statistically significantly different from 0.
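The product construction can be sketched as follows. The function name `add_interaction` and the noise-free data are hypothetical, chosen so that the fitted coefficients are easy to check by eye. Both main-effect columns are kept alongside the product column.

```python
import numpy as np

def add_interaction(x1, x2):
    """Design matrix with intercept, x1, x2 and the product x1 * x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Hypothetical, noise-free data: x2 is a 0/1 variable, x1 is numerical.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
y = 2.0 + 1.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2

X = add_interaction(x1, x2)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# With an interaction, the slope in x1 depends on the value of x2.
slope_when_x2_is_0 = beta_hat[1]
slope_when_x2_is_1 = beta_hat[1] + beta_hat[3]
print(slope_when_x2_is_0, slope_when_x2_is_1)
```

The two printed slopes differ by the interaction coefficient, which is what "the effect of one predictor depends on the other" means in this model.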

Indicator Variables
Often, a data set will contain categorical variables which are potential predictor variables. To include these categorical variables in the model we define dummy (indicator) variables. A dummy variable takes only two values, 0 and 1. For a categorical variable with j categories we need j-1 indicator variables.
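A small sketch of the j-1 coding, using the six state names from the rainfall example as the categories. The helper `dummy_code` and the choice of Iowa as the baseline category are assumptions made for illustration; any category can serve as the baseline.

```python
import numpy as np

def dummy_code(values, baseline=None):
    """Return (matrix, kept_levels): one 0/1 column per non-baseline level."""
    levels = sorted(set(values))
    if baseline is None:
        baseline = levels[0]          # arbitrary default baseline
    kept = [lev for lev in levels if lev != baseline]
    columns = np.array(
        [[1.0 if v == lev else 0.0 for lev in kept] for v in values]
    )
    return columns, kept

# Hypothetical observations of a 6-category variable.
states = ["Iowa", "Nebraska", "Illinois", "Indiana",
          "Missouri", "Ohio", "Iowa", "Ohio"]
dummies, kept = dummy_code(states, baseline="Iowa")

print(kept)            # the 5 levels that get their own indicator column
print(dummies.shape)   # 8 observations, 6 - 1 = 5 indicator columns
```

Rows for the baseline category are all zeros, so each dummy coefficient is read as a difference from the baseline.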

Meadowfoam Example
Meadowfoam is a small plant found in the US Pacific Northwest. Its seed oil is unique among vegetable oils for its long carbon strings, and it is nongreasy and highly stable. A study was conducted to find out how to elevate meadowfoam production to a profitable crop. In a growth chamber, plants were grown under 6 light intensities (in micromol/m^2/sec) and two timings of the onset of the light treatment, either late (coded 0) or early (coded 1). The response variable is the average number of flowers per plant for 10 seedlings grown under each of the 12 treatment conditions. This is an example of an experiment in which we can make causal conclusions. There are two explanatory variables, light intensity and timing. There are 24 data points, 2 at each treatment combination.

Questions of Interest
What is the effect of timing on seedling growth? What are the effects of the different light intensities? Does the effect of intensity depend on timing?

Indicator Variables in Meadowfoam Example
To include the variable timing in the model we define a dummy variable that takes the value 1 for early timing and the value 0 for late timing. The variable intensity has 6 levels (150, 300, 450, 600, 750, 900). We will treat these levels as 6 categories. It is useful to do so if we expect a complex relationship between the response variable and intensity and if the goal is to determine which intensity level is “best”. The cost of using dummy variables is degrees of freedom, since a variable with several categories needs several dummy variables (here 5 for the 6 intensity levels). We define the dummy variables as follows….

Partial F-test
The partial F-test is designed to test whether a subset of the β's are all 0 simultaneously. The approach has two steps. First we fit a model with all the predictor variables. We call this model the “full model”. Then we fit a model without the predictor variables whose coefficients we are interested in testing. We call this model the “reduced model”. We then compare the SSReg and RSS in these two models….

Test Statistic for Partial F-test
To test whether the coefficients of some of the explanatory variables are all 0 we use the following test statistic: F = (Extra SS / Extra df) / (RSS_full / df_full), where Extra SS = RSS_red - RSS_full, Extra df = number of parameters being tested, and df_full = residual degrees of freedom of the full model. To get the Extra SS in SAS we can simply fit two regressions (reduced and full), or we can look at the Type I SS, which are also called Sequential Sums of Squares. The Sequential SS gives the additional contribution to SSReg that each variable makes over and above the variables previously listed. The Sequential SS depends on the order in which the variables are stated in the model statement; the variables whose coefficients we want to test must be listed last.
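The two-fit approach can be sketched in Python rather than SAS. This assumes the usual form of the statistic, F = (Extra SS / Extra df) / (RSS_full / df_full), with df_full the residual degrees of freedom of the full model, and assumes both design matrices have full column rank. The helper names `rss` and `partial_f` and the synthetic data are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    return float(resid @ resid)

def partial_f(X_full, X_reduced, y):
    """Partial F statistic comparing a reduced model with a full model."""
    n, p_full = X_full.shape
    p_red = X_reduced.shape[1]
    extra_df = p_full - p_red            # number of coefficients being tested
    df_full = n - p_full                 # residual df of the full model
    rss_full = rss(X_full, y)
    extra_ss = rss(X_reduced, y) - rss_full
    f_stat = (extra_ss / extra_df) / (rss_full / df_full)
    return f_stat, extra_df, df_full

# Synthetic data: y depends on x1 and x2 but not on x3.
rng = np.random.default_rng(1)
n = 30
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(0.0, 1.0, n)

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2, x3])
X_reduced = np.column_stack([ones, x1])   # drops x2 and x3, the tested terms

f_stat, extra_df, df_full = partial_f(X_full, X_reduced, y)
print(f_stat, extra_df, df_full)
```

Under the null hypothesis the statistic is compared with an F distribution on (extra_df, df_full) degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, extra_df, df_full)` gives the corresponding P-value.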

Meadowfoam Example Continuation
Suppose now we treat the variable light intensity as a quantitative variable. There are three possible models to look at the relationship between seedling growth and the two predictor variables… If we want to know whether the effect of light intensity on the number of flowers per plant depends on timing, we need to include an interaction term in the model….
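One way to write a model with such an interaction term is sketched below. The notation is assumed for illustration and is not taken from the slides: light is the intensity, early is the 0/1 timing dummy, and the β labels are arbitrary.

```latex
\begin{align*}
\mu(\text{flowers} \mid \text{light}, \text{early})
  &= \beta_0 + \beta_1\,\text{light} + \beta_2\,\text{early}
     + \beta_3\,(\text{light} \times \text{early}) \\
\text{late } (\text{early} = 0):\quad
  \mu &= \beta_0 + \beta_1\,\text{light} \\
\text{early } (\text{early} = 1):\quad
  \mu &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,\text{light}
\end{align*}
```

In this form the slope for light is β1 under late timing and β1 + β3 under early timing, so asking whether the effect of intensity depends on timing amounts to testing whether β3 = 0.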

Meadowfoam Example: Summary of Findings
There is no evidence that the effect of light intensity on flowers depends on timing (P-value = 0.91). That means that the interaction effect is not significant. If interaction did exist, it would be difficult to talk about the effect of light intensity on Y, as it would vary with timing. Since the interaction was not significant, we remove it from the model. For the same timing, increasing light intensity by 100 micromol/m^2/sec decreases the mean number of flowers per plant by 4.0 flowers per plant. 95% CI: (-5.1, -3). For the same light intensity, beginning the light treatment early increases the mean number of flowers per plant by 12.2 flowers per plant. 95% CI: (6.7, 17.6).