Simple Linear Regression ANOVA for regression (10.2)

Slides:



Advertisements
Similar presentations
11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
Advertisements

Topic 12: Multiple Linear Regression
 Population multiple regression model  Data for multiple regression  Multiple linear regression model  Confidence intervals and significance tests.
Chapter 12 Simple Linear Regression
Forecasting Using the Simple Linear Regression Model and Correlation
Inference for Regression
11 Simple Linear Regression and Correlation CHAPTER OUTLINE
Regression Analysis Simple Regression. y = mx + b y = a + bx.
Regression Analysis Module 3. Regression Regression is the attempt to explain the variation in a dependent variable using the variation in independent.
Learning Objectives Copyright © 2002 South-Western/Thomson Learning Data Analysis: Bivariate Correlation and Regression CHAPTER sixteen.
Learning Objectives Copyright © 2004 John Wiley & Sons, Inc. Bivariate Correlation and Regression CHAPTER Thirteen.
Linear regression models
Objectives (BPS chapter 24)
Chapter 12 Simple Linear Regression
Correlation and Regression. Spearman's rank correlation An alternative to correlation that does not make so many assumptions Still measures the strength.
Chapter 10 Simple Regression.
Chapter Topics Types of Regression Models
Introduction to Probability and Statistics Linear Regression and Correlation.
11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
Correlation and Regression Analysis
Simple Linear Regression and Correlation
Chapter 7 Forecasting with Simple Regression
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Chapter 12 Section 1 Inference for Linear Regression.
Linear Regression/Correlation
Lecture 5 Correlation and Regression
Introduction to Linear Regression and Correlation Analysis
Inference for regression - Simple linear regression
Simple Linear Regression Models
BPS - 3rd Ed. Chapter 211 Inference for Regression.
OPIM 303-Lecture #8 Jose M. Cruz Assistant Professor.
© 2003 Prentice-Hall, Inc.Chap 13-1 Basic Business Statistics (9 th Edition) Chapter 13 Simple Linear Regression.
INTRODUCTORY LINEAR REGRESSION SIMPLE LINEAR REGRESSION - Curve fitting - Inferences about estimated parameter - Adequacy of the models - Linear.
+ Chapter 12: Inference for Regression Inference for Linear Regression.
Multiple regression - Inference for multiple regression - A case study IPS chapters 11.1 and 11.2 © 2006 W.H. Freeman and Company.
Copyright © Cengage Learning. All rights reserved. 12 Simple Linear Regression and Correlation
1 Chapter 12 Simple Linear Regression. 2 Chapter Outline  Simple Linear Regression Model  Least Squares Method  Coefficient of Determination  Model.
Multiple Regression I 4/9/12 Transformations The model Individual coefficients R 2 ANOVA for regression Residual standard error Section 9.4, 9.5 Professor.
MBP1010H – Lecture 4: March 26, Multiple regression 2.Survival analysis Reading: Introduction to the Practice of Statistics: Chapters 2, 10 and 11.
Chapter 5: Regression Analysis Part 1: Simple Linear Regression.
Lecture 8 Simple Linear Regression (cont.). Section Objectives: Statistical model for linear regression Data for simple linear regression Estimation.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
1 11 Simple Linear Regression and Correlation 11-1 Empirical Models 11-2 Simple Linear Regression 11-3 Properties of the Least Squares Estimators 11-4.
Analisa Regresi Week 7 The Multiple Linear Regression Model
Inference for regression - More details about simple linear regression IPS chapter 10.2 © 2006 W.H. Freeman and Company.
STA 286 week 131 Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression.
1 Regression Analysis The contents in this chapter are from Chapters of the textbook. The cntry15.sav data will be used. The data collected 15 countries’
Lecture 10 Chapter 23. Inference for regression. Objectives (PSLS Chapter 23) Inference for regression (NHST Regression Inference Award)[B level award]
Multiple Regression. Simple Regression in detail Y i = β o + β 1 x i + ε i Where Y => Dependent variable X => Independent variable β o => Model parameter.
June 30, 2008Stat Lecture 16 - Regression1 Inference for relationships between variables Statistics Lecture 16.
Chapter 10 Inference for Regression
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 11/20/12 Multiple Regression SECTIONS 9.2, 10.1, 10.2 Multiple explanatory.
Regression Analysis. 1. To comprehend the nature of correlation analysis. 2. To understand bivariate regression analysis. 3. To become aware of the coefficient.
Inference for regression - More details about simple linear regression IPS chapter 10.2 © 2006 W.H. Freeman and Company.
Chapter 12 Simple Linear Regression n Simple Linear Regression Model n Least Squares Method n Coefficient of Determination n Model Assumptions n Testing.
1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + 
Lecture 10 Introduction to Linear Regression and Correlation Analysis.
© 2001 Prentice-Hall, Inc.Chap 13-1 BA 201 Lecture 19 Measure of Variation in the Simple Linear Regression Model (Data)Data.
BPS - 5th Ed. Chapter 231 Inference for Regression.
The “Big Picture” (from Heath 1995). Simple Linear Regression.
Statistics for Business and Economics Module 2: Regression and time series analysis Spring 2010 Lecture 4: Inference about regression Priyantha Wijayatunga,
The simple linear regression model and parameter estimation
Inference for Least Squares Lines
11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
CHAPTER 29: Multiple Regression*
CHAPTER 26: Inference for Regression
Prepared by Lee Revere and John Large
CHAPTER 12 More About Regression
Linear Regression and Correlation
Linear Regression and Correlation
Presentation transcript:

Simple Linear Regression ANOVA for regression (10.2) Lecture 9 Simple Linear Regression ANOVA for regression (10.2)

Analysis of variance for regression The regression model is: Data = fit + residual yi = (b0 + b1xi) + (ei) where the ei are independent and normally distributed N(0,s), and s is the same for all values of x. Sums of squares measure the variation present in responses. It can be partitioned as: SST = SS model + SS error DFT = DF model + DF error

For a simple linear relationship, the ANOVA tests the hypotheses H0: β1 = 0 versus Ha: β1 ≠ 0 by comparing MSM (model) to MSE (error): F = MSM/MSE When H0 is true, F follows the F(1, n − 2) distribution. The p-value is P(> F). The ANOVA test and the two-sided t-test for H0: β1 = 0 yield the same p-value. Software output for regression may provide t, F, or both, along with the p-value.

ANOVA table Source Sum of squares SS DF Mean square MS F P-value Model 1 SSG/DFG MSG/MSE Tail area above F Error n − 2 SSE/DFE Total n − 1 SST = SSM + SSE DFT = DFM + DFE The standard deviation of the sampling distribution, s, for n sample data points is calculated from the residuals ei = yi – ŷi s is an unbiased estimate of the regression standard deviation σ.

Coefficient of determination, r2 The coefficient of determination, r2, square of the correlation coefficient, is the percentage of the variance in y (vertical scatter from the regression line) that can be explained by changes in x. r 2 = variation in y caused by x (i.e., the regression line) total variation in observed y values around the mean Go over point that missing zero is OK - this is messy data, a prediction based on messy data. Residuals should be scattered randomly around the line, or there is something wrong with your data - not linear, outliers, etc.

What is the relationship between the average speed a car is driven and its fuel efficiency? We plot fuel efficiency (in miles per gallon, MPG) against average speed (in miles per hour, MPH) for a random sample of 60 cars. The relationship is curved. When speed is log transformed (log of miles per hour, LOGMPH) the new scatterplot shows a positive, linear relationship.

SST (sum of squares total) is the sum of the two R-squared is the ratio: SSM/SST=494/552 In this case both tests check the same thing that is why the p-value is identical

Calculations for regression inference To estimate the parameters of the regression, we calculate the standard errors for the estimated regression coefficients. The standard error of the least-squares slope β1 is: The standard error of the intercept β0 is:

To estimate or predict future responses, we calculate the following standard errors The standard error of the mean response µy is: The standard error for predicting an individual response ŷ is:

1918 flu epidemics The line graph suggests that about 7 to 8% of those diagnosed with the flu died within about a week of diagnosis. We look at the relationship between the number of deaths in a given week and the number of new diagnosed cases one week earlier.

P-value for H0: β = 0; Ha: β ≠ 0 R2 = SSM / SST P-value for H0: β = 0; Ha: β ≠ 0 SSM SST 595349

Inference for correlation To test for the null hypothesis of no linear association, we have the choice of also using the correlation parameter ρ. When x is clearly the explanatory variable, this test is equivalent to testing the hypothesis H0: β = 0. When there is no clear explanatory variable (e.g., arm length vs. leg length), a regression of x on y is not any more legitimate than one of y on x. In that case, the correlation test of significance should be used. When both x and y are normally distributed H0: ρ = 0 tests for no association of any kind between x and y—not just linear associations.

The test of significance for ρ uses the one-sample t-test for: H0: ρ = 0. We compute the t statistics for sample size n and correlation coefficient r. The p-value is the area under t (n – 2) for values of T as extreme as t or more in the direction of Ha:

Relationship between average car speed and fuel efficiency p-value n There is a significant correlation (r is not 0) between fuel efficiency (MPG) and the logarithm of average speed (LOGMPH).