An alternative approach to testing for a linear association: The Analysis of Variance (ANOVA) Table



Translating a research question into a statistical procedure
Is there a (linear) relationship between skin cancer mortality and latitude?
–How?
–Also, the (analysis of variance) F-test
Is there a (linear) relationship between height and grade point average?
–How?
–Also, the (analysis of variance) F-test

Where does this topic fit in?
–Model formulation
–Model estimation
–Model evaluation
–Model use

Example: Skin cancer mortality and latitude

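Minitab regression output, Mort versus Lat. Most numeric entries were lost from this transcript; what survives is:
–R-Sq = 68.0%, R-Sq(adj) = 67.3%
–Coefficient table: columns Predictor, Coef, SE Coef, T, P; rows Constant, Lat
–Analysis of Variance table: columns Source, DF, SS, MS, F, P; rows Regression, Residual Error, Total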


Example: Height and GPA

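Minitab regression output, gpa versus height. Most numeric entries were lost from this transcript; what survives is:
–R-Sq = 0.3%, R-Sq(adj) = 0.0%
–Coefficient table: columns Predictor, Coef, SE Coef, T, P; rows Constant, height
–Analysis of Variance table: columns Source, DF, SS, MS, F, P; rows Regression, Residual Error, Total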


The basic idea
Break down the total variation in y (“total sum of squares”) into two components:
–a component that is “due to” the change in x (“regression sum of squares”)
–a component that is just due to random error (“error sum of squares”)
If the regression sum of squares is a large component of the total sum of squares, it suggests that there is a linear association between x and y.

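A geometric decomposition
Each observation's deviation from the sample mean splits into the part captured by the fitted line and the residual:
$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$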

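The decomposition holds for the sum of the squared deviations, too:
$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
–Total sum of squares (SSTO): $\sum_{i=1}^{n} (y_i - \bar{y})^2$
–Regression sum of squares (SSR): $\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
–Error sum of squares (SSE): $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
That is, SSTO = SSR + SSE.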
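The sum-of-squares decomposition can be checked numerically. The following is a minimal sketch in plain Python; the data values and the helper names `fit_line` and `sums_of_squares` are hypothetical and chosen only for illustration, not taken from the slides.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sums_of_squares(x, y):
    """Return (SSTO, SSR, SSE) for the least-squares line through (x, y)."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    ssto = sum((yi - y_bar) ** 2 for yi in y)               # total variation in y
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # variation explained by the line
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # leftover (error) variation
    return ssto, ssr, sse


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
ssto, ssr, sse = sums_of_squares(x, y)
print(f"SSTO = {ssto:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"SSR + SSE = {ssr + sse:.3f}")
```

In this made-up data set nearly all of SSTO is accounted for by SSR, which is the pattern the slides describe as suggesting a linear association.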

Breakdown of degrees of freedom
$(n-1) = 1 + (n-2)$
–n-1: degrees of freedom associated with SSTO
–1: degrees of freedom associated with SSR
–n-2: degrees of freedom associated with SSE

Example: Skin cancer mortality and latitude
Minitab regression output, Mort versus Lat (R-Sq = 68.0%, R-Sq(adj) = 67.3%). The Analysis of Variance table has columns Source, DF, SS, MS, F, P and rows Regression, Residual Error, Total; its numeric entries were lost from this transcript.

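Definitions of Mean Squares
We already know the mean square error (MSE) is defined as:
$MSE = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2} = \frac{SSE}{n-2}$
Similarly, the regression mean square (MSR) is defined as:
$MSR = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{1} = \frac{SSR}{1}$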

Expected Mean Squares
–If β1 = 0, we'd expect the ratio MSR/MSE to be …
–If β1 ≠ 0, we'd expect the ratio MSR/MSE to be …
–Use the ratio MSR/MSE to decide whether or not to reject β1 = 0.
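One way to reason about the two questions above is through the expected values of the mean squares. The block below states the standard results for the simple linear regression model, assuming independent errors with constant variance; the symbols $\sigma^2$ (error variance) and $\bar{x}$ (mean of the predictor) are notation introduced here and do not appear in the transcript.

```latex
\begin{align*}
E(MSE) &= \sigma^2 \\
E(MSR) &= \sigma^2 + \beta_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2
\end{align*}
% If \beta_1 = 0, both mean squares estimate \sigma^2, so MSR/MSE should be near 1.
% If \beta_1 \neq 0, E(MSR) > E(MSE), so MSR/MSE should tend to exceed 1.
```

Under these assumptions, a ratio well above 1 is evidence against β1 = 0, which is why only large values of the ratio count against the null hypothesis.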

Analysis of Variance (ANOVA) Table
Source of variation   DF    SS     MS                 F
Regression            1     SSR    MSR = SSR/1        F* = MSR/MSE
Residual error        n-2   SSE    MSE = SSE/(n-2)
Total                 n-1   SSTO

The formal F-test for slope parameter β1
–Null hypothesis H0: β1 = 0
–Alternative hypothesis HA: β1 ≠ 0
–Test statistic: F* = MSR/MSE
–P-value: the probability that we'd get an F* statistic at least as large as the one we observed, if the null hypothesis is true.
The P-value is determined by comparing F* to an F distribution with 1 numerator degree of freedom and n-2 denominator degrees of freedom.
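The P-value step can be sketched without a statistics package. The code below is an illustrative approximation, not Minitab's method: it uses the fact that an F(1, n-2) variable is the square of a t(n-2) variable, and integrates the t density numerically with Simpson's rule. The function names `t_density` and `f_test_p_value` are hypothetical.

```python
import math


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def f_test_p_value(f_star, df_error, steps=10_000):
    """Approximate P(F > f_star) for an F(1, df_error) random variable.

    Uses P(F > f) = P(|T| > sqrt(f)) with T ~ t(df_error), and Simpson's
    rule (steps must be even) for the integral of the density over [0, sqrt(f)].
    """
    t = math.sqrt(f_star)
    h = t / steps
    total = t_density(0.0, df_error) + t_density(t, df_error)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df_error)
    central_half = total * h / 3  # approximates P(0 < T < t)
    return max(0.0, 1.0 - 2.0 * central_half)


# Example call: an F* of 4.35 with 20 error degrees of freedom
# sits close to the conventional 5% cutoff.
p_value = f_test_p_value(4.35, 20)
print(f"P-value is approximately {p_value:.4f}")
```

In practice a statistics package reports this tail probability directly; the sketch only makes visible what "comparing F* to an F distribution" means.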

Winning times (in seconds) in men's 200 meter Olympic sprints. Are men getting faster?
Data columns: Row, Year, Men200m (the data rows are not preserved in this transcript).

Analysis of Variance Table
Minitab Analysis of Variance output (columns Source, DF, SS, MS, F, P; rows Regression, Residual Error, Total; numeric entries lost from this transcript), annotated with:
–DFE = n-2 = 22-2 = 20
–DFTO = n-1 = 22-1 = 21
–MSR = SSR/1 = 15.8
–MSE = SSE/(n-2) = 1.8/20 = 0.09
–F* = MSR/MSE = […]/0.089 = […]
–P = probability that an F(1,20) random variable is greater than […] = 0.000…
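The arithmetic on this slide can be retraced in a few lines. This sketch uses only the rounded values that survive in the transcript (n = 22, SSR = 15.8, SSE = 1.8), so its F* is an approximation and will not exactly match the unrounded figure in the original Minitab output.

```python
# Values as they appear (rounded) in the slide annotations.
n = 22
ssr = 15.8  # regression sum of squares, rounded
sse = 1.8   # error sum of squares, rounded

df_regression = 1
df_error = n - 2  # 20
df_total = n - 1  # 21

msr = ssr / df_regression  # regression mean square
mse = sse / df_error       # mean square error, about 0.09
f_star = msr / mse         # roughly 175.6 with these rounded inputs

print(f"DF: regression={df_regression}, error={df_error}, total={df_total}")
print(f"MSR={msr:.2f}, MSE={mse:.3f}, F*={f_star:.1f}")
```

An F* this large on 1 and 20 degrees of freedom is far out in the tail of the F distribution, consistent with the reported P-value of 0.000….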

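For the simple linear regression model, the F-test and t-test are equivalent.
Minitab output for the 200 meter example (numeric entries lost from this transcript):
–Coefficient table: columns Predictor, Coef, SE Coef, T, P; rows Constant, Year
–Analysis of Variance table: columns Source, DF, SS, MS, F, P; rows Regression, Residual Error, Total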

Equivalence of F-test to t-test
For a given α level, the F-test of β1 = 0 versus β1 ≠ 0 is algebraically equivalent to the two-tailed t-test. The two tests give exactly the same P-values, so:
–If one test rejects H0, then so will the other.
–If one test does not reject H0, then neither will the other.
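The algebraic equivalence shows up numerically as F* = (t*)². The sketch below checks this on a small hypothetical data set; the function name `slope_t_and_f` and the data values are illustrative assumptions, not part of the slides.

```python
import math


def slope_t_and_f(x, y):
    """Return (t*, F*) for testing beta1 = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]

    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mse = sse / (n - 2)

    t_star = b1 / math.sqrt(mse / sxx)  # slope estimate divided by its standard error
    f_star = (ssr / 1) / mse            # MSR / MSE
    return t_star, f_star


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
t_star, f_star = slope_t_and_f(x, y)
print(f"t* = {t_star:.3f}, t*^2 = {t_star ** 2:.3f}, F* = {f_star:.3f}")
```

Because F* equals t* squared, the F tail probability beyond F* matches the two-tailed t probability beyond |t*|, which is why the two P-values agree.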

Should I use the F-test or the t-test?
–The F-test is only appropriate for testing that the slope differs from 0 (β1 ≠ 0).
–Use the t-test to test that the slope is positive (β1 > 0) or negative (β1 < 0).
–The F-test is more useful in multiple regression models, when we want to test that more than one slope parameter is 0.

Getting the ANOVA table in Minitab
The Analysis of Variance (ANOVA) Table is part of the default output for either command:
–Stat >> Regression >> Regression …
–Stat >> Regression >> Fitted line plot …

Stat >> Regression >> Regression

Stat >> Regression >> Fitted line plot

Example: Is number of stories linearly related to building height?

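Minitab regression output, HEIGHT versus STORIES. Most numeric entries were lost from this transcript; what survives is:
–R-Sq = 90.4%, R-Sq(adj) = 90.2%
–Coefficient table: columns Predictor, Coef, SE Coef, T, P; rows Constant, STORIES
–Analysis of Variance table: columns Source, DF, SS, MS, F, P; rows Regression, Residual Error, Total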