Simple Linear Regression


Chapter 2: Simple Linear Regression. Linear Regression Analysis, 5E, Montgomery, Peck & Vining.

2.1 Simple Linear Regression Model
Single regressor, x; response, y. Population regression model: y = β0 + β1x + ε
β0 (intercept): if x = 0 is within the range of the data, β0 is the mean of the distribution of the response y when x = 0; if x = 0 is not in the range, β0 has no practical interpretation.
β1 (slope): the change in the mean of the distribution of the response produced by a unit change in x.
ε: random error.

2.1 Simple Linear Regression Model
The response, y, is a random variable: there is a probability distribution for y at each value of x.
Mean: E(y|x) = β0 + β1x
Variance: Var(y|x) = σ²
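As a concrete illustration of the model (a minimal Python sketch using numpy; the parameter values and x grid below are hypothetical, chosen only for demonstration), the response is its mean plus a random error:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical true parameters and regressor values (illustration only)
beta0, beta1, sigma = 2.2, 0.6, 0.5
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

mean_y = beta0 + beta1 * x                        # E(y | x): the regression line
y = mean_y + rng.normal(0.0, sigma, size=x.size)  # y varies about its mean, variance sigma^2
```

Each observed y falls randomly around the line E(y|x), which is exactly the "probability distribution for y at each value of x" described above.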

2.2 Least-Squares Estimation of the Parameters
β0 and β1 are unknown and must be estimated. Least-squares estimation seeks to minimize the sum of squares of the differences between the observed responses, yi, and the straight line.
Sample regression model: yi = β0 + β1xi + εi, i = 1, 2, …, n

2.2 Least-Squares Estimation of the Parameters
Let β̂0 and β̂1 represent the least-squares estimators of β0 and β1, respectively. These estimators must minimize
S(β0, β1) = Σ (yi − β0 − β1xi)²,
so they must satisfy ∂S/∂β0 = 0 and ∂S/∂β1 = 0.

2.2 Least-Squares Estimation of the Parameters
Simplifying yields the least-squares normal equations:
n β̂0 + β̂1 Σxi = Σyi
β̂0 Σxi + β̂1 Σxi² = Σxi yi

2.2 Least-Squares Estimation of the Parameters
Solving the normal equations yields the ordinary least-squares estimators:
β̂1 = [Σxi yi − (Σxi)(Σyi)/n] / [Σxi² − (Σxi)²/n]
β̂0 = ȳ − β̂1 x̄

2.2 Least-Squares Estimation of the Parameters
The fitted simple linear regression model: ŷ = β̂0 + β̂1x
Sum-of-squares notation:
Sxx = Σ(xi − x̄)² = Σxi² − (Σxi)²/n
Sxy = Σ(xi − x̄)(yi − ȳ) = Σxi yi − (Σxi)(Σyi)/n

2.2 Least-Squares Estimation of the Parameters
Then, in sum-of-squares notation, β̂1 = Sxy / Sxx.

2.2 Least-Squares Estimation of the Parameters
Residuals: ei = yi − ŷi, i = 1, 2, …, n
Residuals will be used to assess the adequacy of the model.
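The estimators above can be computed directly; here is a minimal numpy sketch on a small hypothetical data set (not the book's rocket propellant data):

```python
import numpy as np

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

xbar, ybar = x.mean(), y.mean()
Sxx = np.sum((x - xbar) ** 2)           # corrected sum of squares of x
Sxy = np.sum((x - xbar) * (y - ybar))   # corrected sum of cross products

b1 = Sxy / Sxx           # least-squares slope estimate
b0 = ybar - b1 * xbar    # least-squares intercept estimate

yhat = b0 + b1 * x       # fitted values
e = y - yhat             # residuals
```

Note that the residuals sum to zero, one of the properties of the least-squares fit discussed later in the chapter.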

Example 2.1 – The Rocket Propellant Data


Example 2.1 – Rocket Propellant Data

Example 2.1 – Rocket Propellant Data
The least-squares regression line is ŷ = β̂0 + β̂1x, with β̂1 = Sxy/Sxx and β̂0 = ȳ − β̂1x̄ computed from the propellant data (see the fitted equation in the text).


2.2 Least-Squares Estimation of the Parameters
Just because we can fit a linear model doesn't mean that we should:
- How well does this equation fit the data?
- Is the model likely to be useful as a predictor?
- Are any of the basic assumptions (such as constant variance and uncorrelated errors) violated, and if so, how serious is this?

2.2 Least-Squares Estimation of the Parameters
Computer output (Minitab)

2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
The ordinary least-squares (OLS) estimator of the slope is a linear combination of the observations yi:
β̂1 = Σ ci yi, where ci = (xi − x̄)/Sxx.
This representation is useful in deriving the expected value and variance of the estimators.

2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
The least-squares estimators are unbiased estimators of their respective parameters: E(β̂1) = β1 and E(β̂0) = β0.
The variances are Var(β̂1) = σ²/Sxx and Var(β̂0) = σ²(1/n + x̄²/Sxx).
By the Gauss–Markov theorem, the OLS estimators are Best Linear Unbiased Estimators (BLUE).

2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
Useful properties of the least-squares fit:
1. The residuals sum to zero: Σei = 0.
2. The observed and fitted values have the same sum: Σyi = Σŷi.
3. The least-squares regression line always passes through the centroid (x̄, ȳ) of the data.
4. Σxi ei = 0.
5. Σŷi ei = 0.

2.2.3 Estimation of σ²
Residual (error) sum of squares: SSRes = Σei² = Σ(yi − ŷi)² = SST − β̂1Sxy

2.2.3 Estimation of σ²
Unbiased estimator of σ²: σ̂² = SSRes/(n − 2) = MSRes
The quantity n − 2 is the number of degrees of freedom for the residual sum of squares.

2.2.3 Estimation of σ²
σ̂² depends on the residual sum of squares, so:
- any violation of the assumptions on the model errors could damage the usefulness of this estimate;
- a misspecification of the model can damage the usefulness of this estimate;
- the estimate is model dependent.
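Continuing the small hypothetical data set from the earlier sketch (illustration only, not the book's example), σ² is estimated from the residual sum of squares:

```python
import numpy as np

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)            # residuals

SS_res = np.sum(e ** 2)          # residual (error) sum of squares
sigma2_hat = SS_res / (n - 2)    # MS_res: unbiased estimator of sigma^2, n - 2 df
```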

2.3 Hypothesis Testing on the Slope and Intercept
Three assumptions are needed to apply procedures such as hypothesis tests and confidence intervals: the model errors εi are normally distributed, independently distributed, and have constant variance; i.e., εi ~ NID(0, σ²).

2.3.1 Use of t-Tests: Slope
H0: β1 = β10  versus  H1: β1 ≠ β10
Standard error of the slope: se(β̂1) = sqrt(MSRes/Sxx)
Test statistic: t0 = (β̂1 − β10)/se(β̂1)
Reject H0 if |t0| > t(α/2, n−2). The P-value approach can also be used.

2.3.1 Use of t-Tests: Intercept
H0: β0 = β00  versus  H1: β0 ≠ β00
Standard error of the intercept: se(β̂0) = sqrt(MSRes(1/n + x̄²/Sxx))
Test statistic: t0 = (β̂0 − β00)/se(β̂0)
Reject H0 if |t0| > t(α/2, n−2). The P-value approach can also be used.
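Both t-tests take only a few lines in Python (numpy and scipy assumed; the data are the same hypothetical values used in the earlier sketches, and the hypothesized values are β10 = β00 = 0):

```python
import numpy as np
from scipy import stats

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)

se_b1 = np.sqrt(MS_res / Sxx)                               # standard error of the slope
se_b0 = np.sqrt(MS_res * (1.0 / n + x.mean() ** 2 / Sxx))   # standard error of the intercept

t0_slope = b1 / se_b1                                # test statistic for H0: beta1 = 0
t0_intercept = b0 / se_b0                            # test statistic for H0: beta0 = 0
p_slope = 2.0 * stats.t.sf(abs(t0_slope), df=n - 2)  # two-sided P-value
```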

2.3.2 Testing Significance of Regression
H0: β1 = 0  versus  H1: β1 ≠ 0
This tests the significance of regression; that is, whether there is a linear relationship between the response and the regressor. Failing to reject H0: β1 = 0 implies that there is no linear relationship between y and x.


Testing significance of regression

2.3.3 The Analysis of Variance Approach
Partitioning of total variability: it can be shown that
Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(yi − ŷi)²
that is, SST = SSR + SSRes.

2.3.3 The Analysis of Variance
Degrees of freedom: SST has n − 1, SSR has 1, and SSRes has n − 2.
Mean squares: MSR = SSR/1 and MSRes = SSRes/(n − 2).

2.3.3 The Analysis of Variance
ANOVA procedure for testing H0: β1 = 0: compute F0 = MSR/MSRes.
A large value of F0 indicates that regression is significant; specifically, reject H0 if F0 > F(α, 1, n−2). The P-value approach can also be used.
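A sketch of the ANOVA computation on the same hypothetical data (Python, with scipy supplying the F quantile):

```python
import numpy as np
from scipy import stats

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / Sxx

SST = np.sum((y - y.mean()) ** 2)   # total sum of squares, n - 1 df
SSR = b1 * Sxy                      # regression sum of squares, 1 df
SS_res = SST - SSR                  # residual sum of squares, n - 2 df

F0 = (SSR / 1.0) / (SS_res / (n - 2))   # MS_R / MS_res
F_crit = stats.f.ppf(0.95, 1, n - 2)    # reject H0: beta1 = 0 if F0 > F_crit
```

For these data F0 equals the square of the slope t-statistic, illustrating the t-test/ANOVA equivalence noted in Section 2.3.3.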


2.3.3 The Analysis of Variance
Relationship between t0 and F0: for H0: β1 = 0, it can be shown that t0² = F0. So for testing significance of regression, the t-test and the ANOVA procedure are equivalent (this is true only in simple linear regression).

2.4 Interval Estimation in Simple Linear Regression
100(1 − α)% confidence interval for the slope: β̂1 ± t(α/2, n−2) · se(β̂1)
100(1 − α)% confidence interval for the intercept: β̂0 ± t(α/2, n−2) · se(β̂0)
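On the same hypothetical data, 95% confidence intervals come straight from these formulas (scipy's t quantile assumed):

```python
import numpy as np
from scipy import stats

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)

se_b1 = np.sqrt(MS_res / Sxx)
se_b0 = np.sqrt(MS_res * (1.0 / n + x.mean() ** 2 / Sxx))

t_crit = stats.t.ppf(0.975, n - 2)   # t(alpha/2, n-2) for alpha = 0.05
ci_slope = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
ci_intercept = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)
```

For this toy sample the slope interval covers zero, consistent with the regression not being significant at the 5% level.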

See page 30 of the text.

2.4 Interval Estimation in Simple Linear Regression
100(1 − α)% confidence interval for σ²:
(n − 2)MSRes/χ²(α/2, n−2) ≤ σ² ≤ (n − 2)MSRes/χ²(1−α/2, n−2)

2.4.2 Interval Estimation of the Mean Response
Let x0 be the level of the regressor variable at which we want to estimate the mean response, E(y|x0) = β0 + β1x0.
Point estimator once the model is fit: μ̂(y|x0) = ŷ0 = β̂0 + β̂1x0
To construct a confidence interval on the mean response, we need the variance of this point estimator.

2.4.2 Interval Estimation of the Mean Response
The variance of μ̂(y|x0) is Var(μ̂(y|x0)) = σ²(1/n + (x0 − x̄)²/Sxx).

2.4.2 Interval Estimation of the Mean Response
100(1 − α)% confidence interval for E(y|x0):
ŷ0 ± t(α/2, n−2) · sqrt(MSRes(1/n + (x0 − x̄)²/Sxx))
Notice that the length of the CI depends on the location of x0: it is shortest at x0 = x̄ and widens as x0 moves away from x̄.
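A sketch of this interval at a hypothetical point x0 = 4 on the same toy data:

```python
import numpy as np
from scipy import stats

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)

x0 = 4.0
y0_hat = b0 + b1 * x0                                         # point estimate of E(y | x0)
se_mean = np.sqrt(MS_res * (1.0 / n + (x0 - x.mean()) ** 2 / Sxx))
t_crit = stats.t.ppf(0.975, n - 2)
ci_mean = (y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean)
```

The half-width grows with (x0 − x̄)², so the interval is narrowest at x0 = x̄.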

See pages 34–35 of the text.

2.5 Prediction of New Observations
Suppose we wish to construct a prediction interval on a future observation, y0, corresponding to a particular level of the regressor, say x0. The point estimate is ŷ0 = β̂0 + β̂1x0.
The confidence interval on the mean response at this point is not appropriate for this situation. Why? A future observation varies about its mean with variance σ², in addition to the uncertainty in estimating that mean.

2.5 Prediction of New Observations
Let the random variable ψ be ψ = y0 − ŷ0.
ψ is normally distributed with E(ψ) = 0 and Var(ψ) = σ²(1 + 1/n + (x0 − x̄)²/Sxx), since y0 is independent of ŷ0.

2.5 Prediction of New Observations
100(1 − α)% prediction interval on a future observation y0 at x0:
ŷ0 ± t(α/2, n−2) · sqrt(MSRes(1 + 1/n + (x0 − x̄)²/Sxx))
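Compared with the confidence interval on the mean response, the prediction interval adds the extra "1 +" term for the variance of the new observation. A sketch on the same toy data at a hypothetical x0 = 4:

```python
import numpy as np
from scipy import stats

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)

x0 = 4.0
y0_hat = b0 + b1 * x0
se_mean = np.sqrt(MS_res * (1.0 / n + (x0 - x.mean()) ** 2 / Sxx))        # for E(y|x0)
se_pred = np.sqrt(MS_res * (1.0 + 1.0 / n + (x0 - x.mean()) ** 2 / Sxx))  # for a new y0
t_crit = stats.t.ppf(0.975, n - 2)
pi = (y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred)
```

The prediction interval is always wider than the confidence interval on the mean response at the same x0.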


2.6 Coefficient of Determination
R², the coefficient of determination, is the proportion of variation explained by the regressor x:
R² = SSR/SST = 1 − SSRes/SST
For the rocket propellant data, the computed value is given in the text.
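On the toy data used in the earlier sketches (hypothetical, not the rocket propellant data):

```python
import numpy as np

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

Sxx = np.sum((x - x.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / Sxx

SST = np.sum((y - y.mean()) ** 2)
SSR = b1 * Sxy
R2 = SSR / SST      # equivalently 1 - SS_res / SST; always between 0 and 1
```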

2.6 Coefficient of Determination
R² can be misleading!
- Simply adding more terms to the model will increase R².
- As the range of the regressor variable increases (decreases), R² generally increases (decreases).
- R² does not indicate the appropriateness of a linear model.

2.7 Considerations in the Use of Regression
- Extrapolation beyond the range of the data is risky.
- Extreme points will often influence the slope.
- Outliers can disturb the least-squares fit.
- A linear relationship does not imply a cause-and-effect relationship (see the interesting example on p. 40 of the book).
- Sometimes the value of the regressor variable is unknown and must itself be estimated or predicted.


2.8 Regression Through the Origin
The no-intercept model is y = β1x + ε, with least-squares estimator β̂1 = Σxi yi / Σxi².
This model is appropriate for situations where the origin (0, 0) has some physical meaning. A scatter diagram can help in deciding whether an intercept or a no-intercept model should be used. In addition, the practitioner can fit both models and examine the t-tests and residual mean squares.
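A sketch comparing the two fits on the toy data used earlier (hypothetical values; in practice the data, not convenience, should decide which model is appropriate):

```python
import numpy as np

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# no-intercept fit: minimize sum of (y_i - b1*x_i)^2
b1_origin = np.sum(x * y) / np.sum(x ** 2)

# ordinary intercept fit, for comparison
Sxx = np.sum((x - x.mean()) ** 2)
b1_full = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0_full = y.mean() - b1_full * x.mean()
```

Forcing the line through the origin changes the slope noticeably here, since the data favor a positive intercept.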

2.9 Estimation by Maximum Likelihood
The method of least squares is not the only parameter-estimation method that can be used for linear regression models. If the form of the response (error) distribution is known, the maximum-likelihood (ML) estimation method can be used. In simple linear regression with normal errors, the MLEs of the regression coefficients are identical to those obtained by the method of least squares. Details of finding the MLEs are on page 47. The MLE of σ² is biased (not a serious problem here, though).


The MLEs of β0 and β1 are identical to the OLS estimators; the MLE of σ², SSRes/n, is biased.


2.10 Case Where the Regressor is Random
When x and y have an unknown joint distribution, all of the "fixed-x" regression results hold if: (1) given x, y is normally and independently distributed with mean β0 + β1x and variance σ², and (2) the xi are independent random variables whose probability distribution does not involve β0, β1, or σ².


The sample correlation coefficient is related to the slope:
β̂1 = r · sqrt(SST/Sxx), equivalently r = β̂1 · sqrt(Sxx/SST) = Sxy/sqrt(Sxx · SST)
Also, r² = R² in simple linear regression.
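These identities are easy to verify numerically on the toy data from the earlier sketches (hypothetical values):

```python
import numpy as np

# small hypothetical data set for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

Sxx = np.sum((x - x.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
SST = np.sum((y - y.mean()) ** 2)

r = Sxy / np.sqrt(Sxx * SST)     # sample correlation coefficient
b1 = Sxy / Sxx                   # least-squares slope
R2 = b1 * Sxy / SST              # coefficient of determination

b1_from_r = r * np.sqrt(SST / Sxx)   # slope-correlation relationship
```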
