Simple Linear Regression

Presentation transcript:

Simple Linear Regression Example - mammals. Response variable: gestation (length of pregnancy, in days). Explanatory variable: brain weight.

“Man”: an extreme negative residual, but that residual is not statistically significant. The extreme brain weight of “Man” does create high leverage that is statistically significant.

“Man” Is the point for “Man” influencing where the simple linear regression line is going? Is this influence statistically significant?

Simple Linear Regression. Predicted Gestation = 85.25 + 0.30*Brain Weight. R² = 0.372, so only 37.2% of the variation in gestation is explained by the linear relationship with brain weight.

Exclude “Man” What happens to the simple linear regression line if we exclude “Man” from the data? Do the estimated intercept and estimated slope change?

Simple Linear Regression. Predicted Gestation = 62.05 + 0.634*Brain Weight. R² = 0.600, so 60% of the variation in gestation is explained by the linear relationship with brain weight.
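As a rough sketch of how these two fits could be reproduced (the mammal data themselves are not included in these slides, so the file name and column names below are assumptions, not part of the original example):

```python
# Sketch: refit the gestation ~ brain weight line with and without "Man".
# Assumes a hypothetical mammals.csv with columns "species", "gestation",
# and "brain_wt"; these names are placeholders, not from the slides.
import pandas as pd
import statsmodels.api as sm

def fit_line(data):
    X = sm.add_constant(data["brain_wt"])   # intercept + brain weight
    fit = sm.OLS(data["gestation"], X).fit()
    return fit.params, fit.rsquared

mammals = pd.read_csv("mammals.csv")
print(fit_line(mammals))                               # slides: 85.25 + 0.30*brain, R² = 0.372
print(fit_line(mammals[mammals["species"] != "Man"]))  # slides: 62.05 + 0.634*brain, R² = 0.600
```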

Changes The estimated slope has more than doubled once “Man” is removed. The estimated intercept has decreased by over 20 days.

Influence It appears that the point associated with “Man” influences where the simple linear regression line goes. Is this influence statistically significant?

Influence Measures Quantifying influence involves how much a point differs in the response direction as well as in the explanatory direction, so influence measures combine information on the residual and the leverage.

Cook’s D. D = (z² / (k + 1)) * (h / (1 − h)²), where z is the standardized residual, h is the leverage, and k is the number of explanatory variables in the model.

Cook’s D If D > 1, then the point is considered to have high influence.
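A small self-check of this rule using only numbers reported elsewhere in these slides (standardized residuals and leverages from the table further below, with k = 1 explanatory variable); this is an illustrative sketch, not part of the original deck:

```python
# Cook's D from a standardized residual z (residual / RMSE), a leverage h,
# and k explanatory variables, matching the formula quoted above.
def cooks_d(z, h, k=1):
    return (z ** 2 / (k + 1)) * (h / (1 - h) ** 2)

print(round(cooks_d(3.010, 0.0217), 2))  # Brazilian Tapir: about 0.10
print(round(cooks_d(2.443, 0.0839), 2))  # Okapi: about 0.30
```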

Cook’s D for “Man”
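The slide’s calculation is not reproduced in this transcript, but plugging in the values reported for “Man” later in the deck (standardized residual z = –2.516, leverage h = 0.6612, k = 1) gives roughly D = (2.516² / 2) * (0.6612 / 0.3388²) ≈ 18, far above the cutoff of 1.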

Cook’s D for “Man” Because the D value for “Man” is greater than 1, it is considered to exert high influence on where the regression line goes.

Cook’s D There are no other mammals with a value of D greater than 1: the Okapi has D = 0.30 and the Brazilian Tapir has D = 0.10.

Studentized Residuals The studentized residual is the standardized residual adjusted for the leverage.
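Concretely, with z denoting the standardized residual and h the leverage, the adjustment consistent with the table that follows is rs = z / sqrt(1 − h); for example, 3.010 / sqrt(1 − 0.0217) ≈ 3.043 for the Brazilian Tapir.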

Studentized Residuals

Mammal             z (standardized)   h (leverage)   rs (studentized)
Brazilian Tapir         3.010             0.0217           3.043
“Man”                  –2.516             0.6612          –4.323
Okapi                   2.443             0.0839           2.552

Studentized Residuals If the conditions for the errors are met, then studentized residuals have an approximate t-distribution with degrees of freedom equal to n – k – 1.

Computing a P-value In JMP: Col – Formula, (1 – t Distribution(|rs|, n – k – 1))*2. For our example, rs = 3.043 and n – k – 1 = 48, giving a P-value of 0.0038.
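The same two-sided P-value can also be sketched outside JMP, for instance with SciPy (an illustration, not part of the original slides):

```python
# Two-sided P-value for a studentized residual rs with n - k - 1 degrees of freedom.
from scipy import stats

rs, df = 3.043, 48
print(2 * stats.t.sf(abs(rs), df))   # about 0.0038, matching the slide
```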

Studentized Residuals

Mammal             z (standardized)   h (leverage)   rs (studentized)   P-value
Brazilian Tapir         3.010             0.0217           3.043          0.0038
“Man”                  –2.516             0.6612          –4.323         <0.0001
Okapi                   2.443             0.0839           2.552          0.0139
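For completeness, all three diagnostics (leverage, studentized residual, Cook’s D) can be pulled from a single fitted model; the sketch below assumes the same hypothetical mammals.csv and column names as the earlier sketch and uses statsmodels’ influence utilities:

```python
# Sketch: reproduce a diagnostics table like the one above from one fit.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy import stats

mammals = pd.read_csv("mammals.csv")                 # hypothetical file
fit = sm.OLS(mammals["gestation"],
             sm.add_constant(mammals["brain_wt"])).fit()

infl = fit.get_influence()
h  = infl.hat_matrix_diag                 # leverage
rs = infl.resid_studentized_internal      # "studentized residual" in these slides
D  = infl.cooks_distance[0]               # Cook's D for every mammal
pvals = 2 * stats.t.sf(np.abs(rs), fit.df_resid)   # df_resid = n - k - 1
```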

Conclusion – “Man” The P-value is much less than 0.001 (the Bonferroni-corrected cutoff, 0.05/50), so “Man” has a statistically significant influence on where the regression line is going.

Other Mammals The Brazilian Tapir has the most extreme standardized residual but not much leverage, so it is not influential according to either Cook’s D or the studentized residual test (its P-value of 0.0038 is above the 0.001 cutoff).

Other Mammals The Okapi has high leverage, greater than 0.08, but its standardized residual is not that extreme, so it is not influential according to either Cook’s D or the studentized residual test (its P-value of 0.0139 is also above the 0.001 cutoff).