
1 Bivariate Regression

2 What is Bivariate Regression? Bivariate regression analyzes the relationship between two variables. It specifies one dependent (response) variable and one independent (predictor) variable. The hypothesized relationship may be linear, quadratic, or some other functional form.

3 Bivariate Regression: Model Form (the slide shows the model form as a figure; the equation is developed on the following slides).

4 Regression Terminology: Models and Parameters. The unknown parameters are β₀ (intercept) and β₁ (slope). The assumed model for a linear relationship is yᵢ = β₀ + β₁xᵢ + εᵢ for all observations (i = 1, 2, …, n). The error term εᵢ is not observable and is assumed to be normally distributed with mean 0 and standard deviation σ.

5 The Simple Linear Regression Model. The population simple linear regression model is Y = β₀ + β₁X + ε, where β₀ + β₁X is the nonrandom (systematic) component and ε is the random component. Here Y is the dependent variable, the variable we wish to explain or predict; X is the independent variable, also called the predictor variable; ε is the error term, the only random component in the model and thus the only source of randomness in Y; β₀ is the intercept of the systematic component of the regression relationship; and β₁ is its slope. The conditional mean of Y is E[Y|X] = β₀ + β₁X.

6 Picturing the Simple Linear Regression Model. The simple linear regression model gives an exact linear relationship between the expected (average) value of Y, the dependent variable, and X, the independent or predictor variable: E[Yᵢ] = β₀ + β₁Xᵢ. Actual observed values of Y differ from the expected value by an unexplained random error: Yᵢ = E[Yᵢ] + εᵢ = β₀ + β₁Xᵢ + εᵢ. (Regression plot: the line E[Y] = β₀ + β₁X with intercept β₀ and slope β₁; the error εᵢ is the vertical distance from the observed point (Xᵢ, Yᵢ) to the line.)
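To make the model concrete, here is a minimal sketch of this data-generating process in Python (NumPy assumed; the parameter values β₀ = 2, β₁ = 0.5, σ = 1 and the sample size are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameter values (assumed, not from the slides)
beta0, beta1, sigma, n = 2.0, 0.5, 1.0, 50

x = rng.uniform(0, 10, n)      # predictor values
eps = rng.normal(0, sigma, n)  # errors: mean 0, standard deviation sigma
y = beta0 + beta1 * x + eps    # y_i = beta0 + beta1 * x_i + eps_i

e_y = beta0 + beta1 * x        # conditional mean E[Y|X], the systematic component
```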

7 Estimation: The Method of Least Squares. Estimating a simple linear regression relationship means finding estimated (predicted) values of the intercept and slope of the regression line. The estimated regression equation is Y = b₀ + b₁X + e, where b₀ estimates the intercept of the population regression line, β₀; b₁ estimates its slope, β₁; and e stands for the observed errors, the residuals from fitting the estimated line b₀ + b₁X to a set of n points.

8 Fitting a Regression Line (four panels: the data; three errors from an arbitrarily fitted line; three errors from the least squares regression line; errors from the least squares regression line are minimized).

9 Errors in Regression (plot: the error εᵢ is the vertical distance between the observed point at Xᵢ and the regression line).

10 Least Squares Regression

11 Ordinary Least Squares Formulas. The ordinary least squares (OLS) method estimates the slope and intercept of the regression line so that the residuals are small; specifically, it minimizes the sum of the squared residuals. The sum of the residuals is 0. The sum of the squared residuals is SSE.

12 Ordinary Least Squares Formulas: Slope and Intercept. The OLS estimator for the slope is b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)², or equivalently b₁ = (Σxᵢyᵢ − n·x̄·ȳ) / (Σxᵢ² − n·x̄²). The OLS estimator for the intercept is b₀ = ȳ − b₁x̄.
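These estimators translate directly to code. A minimal sketch (the function name ols_fit is illustrative; x and y are the simulated arrays from the earlier sketch):

```python
import numpy as np

def ols_fit(x, y):
    """OLS estimates: slope b1 = S_xy / S_xx, intercept b0 = ybar - b1 * xbar."""
    xbar, ybar = x.mean(), y.mean()
    b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    b0 = ybar - b1 * xbar
    return b0, b1

b0, b1 = ols_fit(x, y)  # x, y from the simulation sketch above
```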

13 Ordinary Least Squares Formulas: Assessing Fit. We want to explain the total variation in Y around its mean, SST = Σ(yᵢ − ȳ)² (the total sum of squares). The regression sum of squares, SSR = Σ(ŷᵢ − ȳ)², is the explained variation in Y.

14 Ordinary Least Squares Formulas: Assessing Fit (continued). The error sum of squares, SSE = Σ(yᵢ − ŷᵢ)², is the unexplained variation in Y. If the fit is good, SSE will be relatively small compared to SST. A perfect fit is indicated by SSE = 0. The magnitude of SSE depends on n and on the units of measurement.

15 Coefficient of Determination. R² is a measure of relative fit based on a comparison of SSR and SST: R² = SSR/SST = 1 − SSE/SST, with 0 ≤ R² ≤ 1. Often expressed as a percent, R² = 1 (i.e., 100%) indicates a perfect fit. In a bivariate regression, R² = r², the square of the sample correlation coefficient.
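A sketch of the sum-of-squares decomposition and R² under the same assumptions (fit_decomposition is an illustrative name; b0 and b1 are the OLS estimates from above):

```python
import numpy as np

def fit_decomposition(x, y, b0, b1):
    """Return (SST, SSR, SSE, R^2) for the fitted line yhat = b0 + b1 * x."""
    yhat = b0 + b1 * x                    # fitted values
    sst = np.sum((y - y.mean()) ** 2)     # total variation in y
    ssr = np.sum((yhat - y.mean()) ** 2)  # explained variation
    sse = np.sum((y - yhat) ** 2)         # unexplained variation
    return sst, ssr, sse, ssr / sst       # note: SST = SSR + SSE
```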

16 Tests for Significance: Standard Error of Regression. The standard error, s_yx = √(SSE / (n − 2)), is an overall measure of model fit. If the fitted model's predictions are perfect (SSE = 0), then s_yx = 0; thus, a small s_yx indicates a better fit. It is used to construct confidence intervals. The magnitude of s_yx depends on the units of measurement of Y and on the magnitude of the data.
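A sketch of this formula (the function name is illustrative):

```python
import numpy as np

def std_error_regression(x, y, b0, b1):
    """Standard error of regression: s_yx = sqrt(SSE / (n - 2))."""
    resid = y - (b0 + b1 * x)  # residuals e_i = y_i - yhat_i
    return np.sqrt(np.sum(resid ** 2) / (len(x) - 2))
```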

17 Tests for Significance: Confidence Intervals for Slope and Intercept. The standard error of the slope is s_b₁ = s_yx / √(Σ(xᵢ − x̄)²). The standard error of the intercept is s_b₀ = s_yx · √(1/n + x̄² / Σ(xᵢ − x̄)²).

18 Tests for Significance: Confidence Intervals for Slope and Intercept (continued). The confidence interval for the true slope is b₁ ± t·s_b₁, and for the true intercept b₀ ± t·s_b₀, where t is the critical value of Student's t with n − 2 degrees of freedom.
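A sketch of the slope interval, assuming SciPy is available for the t critical value (slope_ci is an illustrative name; std_error_regression is defined in the sketch above):

```python
import numpy as np
from scipy import stats

def slope_ci(x, y, b0, b1, alpha=0.05):
    """Confidence interval b1 +/- t * s_b1 for the true slope beta1."""
    n = len(x)
    syx = std_error_regression(x, y, b0, b1)           # s_yx from the sketch above
    s_b1 = syx / np.sqrt(np.sum((x - x.mean()) ** 2))  # standard error of the slope
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)           # two-tailed critical value
    return b1 - t * s_b1, b1 + t * s_b1
```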

19 Tests for Significance: Hypothesis Tests. If β₁ = 0, then X cannot influence Y and the regression model collapses to a constant β₀ plus random error. The hypotheses to be tested are H₀: β₁ = 0 versus H₁: β₁ ≠ 0.

20 Tests for Significance: Hypothesis Tests (continued). A t test is used with ν = n − 2 degrees of freedom. The test statistics are t = b₁ / s_b₁ for the slope and t = b₀ / s_b₀ for the intercept. The critical value is obtained from Appendix D or Excel for a given α. Reject H₀ if |t| > t_α or if the p-value < α.
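A sketch of the slope test under the same assumptions (two-tailed; slope_t_test is an illustrative name):

```python
import numpy as np
from scipy import stats

def slope_t_test(x, y, b0, b1):
    """t statistic and two-tailed p-value for H0: beta1 = 0."""
    n = len(x)
    resid = y - (b0 + b1 * x)
    syx = np.sqrt(np.sum(resid ** 2) / (n - 2))        # standard error of regression
    s_b1 = syx / np.sqrt(np.sum((x - x.mean()) ** 2))  # standard error of the slope
    t = b1 / s_b1                                      # t = (b1 - 0) / s_b1
    p = 2 * stats.t.sf(abs(t), df=n - 2)               # two-tailed p-value
    return t, p
```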

