
Simple Linear Regression


1 Simple Linear Regression
Chapter 2: Simple Linear Regression. Linear Regression Analysis 5E, Montgomery, Peck & Vining.

2 2.1 Simple Linear Regression Model
Single regressor x; response y. Population regression model: y = β0 + β1x + ε. β0 (intercept): if x = 0 is within the range of the data, β0 is the mean of the distribution of the response y when x = 0; if x = 0 is not in the range, β0 has no practical interpretation. β1 (slope): the change in the mean of the distribution of the response produced by a unit change in x. ε: random error.

3 2.1 Simple Linear Regression Model
The response y is a random variable; there is a probability distribution for y at each value of x. Mean: E(y|x) = β0 + β1x. Variance: Var(y|x) = σ².

4 2.2 Least-Squares Estimation of the Parameters
0 and 1 are unknown and must be estimated Least squares estimation seeks to minimize the sum of squares of the differences between the observed response, yi, and the straight line. Sample regression model Linear Regression Analysis 5E Montgomery, Peck & Vining

5 2.2 Least-Squares Estimation of the Parameters
Let b0 and b1 represent the least-squares estimators of β0 and β1, respectively. These estimators must minimize S(β0, β1) = Σ(yi − β0 − β1xi)²; that is, they satisfy ∂S/∂β0 = 0 and ∂S/∂β1 = 0.

6 2.2 Least-Squares Estimation of the Parameters
Simplifying yields the least-squares normal equations: n·b0 + b1·Σxi = Σyi and b0·Σxi + b1·Σxi² = Σxiyi.

7 2.2 Least-Squares Estimation of the Parameters
Solving the normal equations yields the ordinary least-squares estimators: b1 = [Σxiyi − (Σxi)(Σyi)/n] / [Σxi² − (Σxi)²/n] and b0 = ȳ − b1x̄.

8 2.2 Least-Squares Estimation of the Parameters
The fitted simple linear regression model: ŷ = b0 + b1x. Sum-of-squares notation: Sxx = Σ(xi − x̄)² = Σxi² − (Σxi)²/n and Sxy = Σ(xi − x̄)(yi − ȳ) = Σxiyi − (Σxi)(Σyi)/n.

9 2.2 Least-Squares Estimation of the Parameters
Then b1 = Sxy / Sxx and b0 = ȳ − b1x̄.
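As a concrete illustration, the slope and intercept estimates can be computed directly from Sxx and Sxy. A minimal Python sketch on a small made-up data set (these numbers are illustrative only, not the textbook's example):

```python
# Small made-up data set (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Corrected sums of squares and cross-products
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = Sxy / Sxx          # least-squares slope
b0 = ybar - b1 * xbar   # least-squares intercept
```

For these numbers, b1 = 2.2 and b0 = −0.6, so the fitted line is ŷ = −0.6 + 2.2x.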

10 2.2 Least-Squares Estimation of the Parameters
Residuals: ei = yi − ŷi. The residuals will be used to assess the adequacy of the model.
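A short sketch, on the same made-up data as above, showing that least-squares residuals automatically satisfy the normal equations: they sum to zero and are orthogonal to the regressor:

```python
# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

y_hat = [b0 + b1 * xi for xi in x]           # fitted values
e = [yi - yh for yi, yh in zip(y, y_hat)]    # residuals

# Consequences of the normal equations:
sum_e = sum(e)                                # sum of residuals is ~0
sum_xe = sum(xi * ei for xi, ei in zip(x, e)) # residuals orthogonal to x
```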

11 Example 2.1 – The Rocket Propellant Data


13 Example 2.1 – Rocket Propellant Data

14 Example 2.1 – Rocket Propellant Data
The least-squares regression line is ŷ = 2627.82 − 37.15x.


16 2.2 Least-Squares Estimation of the Parameters
Just because we can fit a linear model doesn't mean that we should. How well does this equation fit the data? Is the model likely to be useful as a predictor? Are any of the basic assumptions (such as constant variance and uncorrelated errors) violated? If so, how serious is this?

17 2.2 Least-Squares Estimation of the Parameters
Computer Output (Minitab)

18 2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
The ordinary least-squares (OLS) estimator of the slope is a linear combination of the observations yi: b1 = Σci·yi, where ci = (xi − x̄)/Sxx. This representation is useful in deriving the expected value and variance of b1.

19 2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
The least-squares estimators are unbiased estimators of their respective parameters: E(b1) = β1 and E(b0) = β0. The variances are Var(b1) = σ²/Sxx and Var(b0) = σ²(1/n + x̄²/Sxx). By the Gauss–Markov theorem, the OLS estimators are Best Linear Unbiased Estimators (BLUE).

20 2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
Useful properties of the least-squares fit: 1. The residuals sum to zero: Σei = 0. 2. The sum of the fitted values equals the sum of the observed values: Σŷi = Σyi. 3. The least-squares regression line always passes through the centroid (x̄, ȳ) of the data.

21 2.2.3 Estimation of σ²
Residual (error) sum of squares: SSRes = Σ(yi − ŷi)² = Σei². A convenient computational form is SSRes = SST − b1·Sxy, where SST = Σ(yi − ȳ)².

22 2.2.3 Estimation of σ²
Unbiased estimator of σ²: MSRes = SSRes/(n − 2). The quantity n − 2 is the number of degrees of freedom for the residual sum of squares.

23 2.2.3 Estimation of σ²
The estimate of σ² depends on the residual sum of squares. Consequently: any violation of the assumptions on the model errors could damage the usefulness of this estimate; a misspecification of the model can damage the usefulness of this estimate; this estimate is model dependent.

24 2.3 Hypothesis Testing on the Slope and Intercept
Three assumptions are needed to apply procedures such as hypothesis testing and confidence intervals. The model errors εi: are normally distributed; are independently distributed; have constant variance. That is, εi ~ NID(0, σ²).

25 2.3.1 Use of t-Tests
Slope: H0: β1 = β10 versus H1: β1 ≠ β10. Standard error of the slope: se(b1) = √(MSRes/Sxx). Test statistic: t0 = (b1 − β10)/se(b1). Reject H0 if |t0| > t(α/2, n−2). Can also use the P-value approach.
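A sketch of the slope t statistic for H0: β1 = 0 on the same made-up data; in practice the critical value t(α/2, n−2) would come from a t table or software:

```python
import math

# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
SS_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
MS_res = SS_res / (n - 2)

se_b1 = math.sqrt(MS_res / Sxx)   # standard error of the slope
t0 = (b1 - 0.0) / se_b1           # test statistic for H0: beta1 = 0
```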

26 2.3.1 Use of t-Tests
Intercept: H0: β0 = β00 versus H1: β0 ≠ β00. Standard error of the intercept: se(b0) = √(MSRes(1/n + x̄²/Sxx)). Test statistic: t0 = (b0 − β00)/se(b0). Reject H0 if |t0| > t(α/2, n−2). Can also use the P-value approach.

27 2.3.2 Testing Significance of Regression
H0: 1 = H1: 1  0 This tests the significance of regression; that is, is there a linear relationship between the response and the regressor. Failing to reject 1 = 0, implies that there is no linear relationship between y and x Linear Regression Analysis 5E Montgomery, Peck & Vining


29 Testing Significance of Regression

30 2.3.3 The Analysis of Variance Approach
Partitioning of total variability: it can be shown that Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(yi − ŷi)², that is, SST = SSR + SSRes.

31 2.3.3 The Analysis of Variance
Degrees of freedom: SST has n − 1, SSR has 1, and SSRes has n − 2. Mean squares: MSR = SSR/1 and MSRes = SSRes/(n − 2).

32 2.3.3 The Analysis of Variance
ANOVA procedure for testing H0: β1 = 0: compute F0 = MSR/MSRes. A large value of F0 indicates that the regression is significant; specifically, reject H0 if F0 > F(α, 1, n−2). Can also use the P-value approach.


34 2.3.3 The Analysis of Variance
Relationship between t0 and F0: for H0: β1 = 0, it can be shown that t0² = F0. So for testing significance of regression, the t-test and the ANOVA procedure are equivalent (true only in simple linear regression).
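The ANOVA partition and the t–F equivalence can be checked numerically on the made-up data used throughout these sketches:

```python
import math

# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx

SST = sum((yi - ybar) ** 2 for yi in y)   # total sum of squares
SSR = b1 * Sxy                            # regression sum of squares
SS_res = SST - SSR                        # residual sum of squares
MSR = SSR / 1
MS_res = SS_res / (n - 2)
F0 = MSR / MS_res                         # ANOVA F statistic
t0 = b1 / math.sqrt(MS_res / Sxx)         # slope t statistic, H0: beta1 = 0
```

For this data F0 equals t0², as the slide states.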

35 2.4 Interval Estimation in Simple Linear Regression
100(1 − α)% confidence interval for the slope: b1 ± t(α/2, n−2)·se(b1). 100(1 − α)% confidence interval for the intercept: b0 ± t(α/2, n−2)·se(b0).
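A sketch of the 95% confidence interval for the slope on the made-up data; the critical value t(0.025, 3) = 3.182 is taken from a standard t table (an assumption of this example, not computed in code):

```python
import math

# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
MS_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

t_crit = 3.182                               # t(0.025, n-2 = 3 df), from a t table
half = t_crit * math.sqrt(MS_res / Sxx)      # half-width of the interval
ci_slope = (b1 - half, b1 + half)            # 95% CI for beta1
```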

36 Also see page 30 of the text.

37 2.4 Interval Estimation in Simple Linear Regression
100(1 − α)% confidence interval for σ²: (n − 2)MSRes/χ²(α/2, n−2) ≤ σ² ≤ (n − 2)MSRes/χ²(1−α/2, n−2).

38 2.4.2 Interval Estimation of the Mean Response
Let x0 be the level of the regressor variable at which we want to estimate the mean response, E(y|x0). Point estimator of E(y|x0) once the model is fit: ŷ0 = b0 + b1x0. To construct a confidence interval on the mean response, we need the variance of this point estimator.

39 2.4.2 Interval Estimation of the Mean Response
The variance of ŷ0 = b0 + b1x0 is Var(ŷ0) = σ²(1/n + (x0 − x̄)²/Sxx).

40 2.4.2 Interval Estimation of the Mean Response
100(1 − α)% confidence interval for E(y|x0): ŷ0 ± t(α/2, n−2)·√(MSRes(1/n + (x0 − x̄)²/Sxx)). Notice that the length of the CI depends on the location of x0: it is shortest at x0 = x̄ and widens as |x0 − x̄| increases.
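The dependence of the interval width on x0 is easy to see numerically. A sketch on the same made-up data (t(0.025, 3) = 3.182 assumed from a t table):

```python
import math

# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
MS_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
t_crit = 3.182  # t(0.025, 3 df), from a t table

def ci_half_width(x0):
    """Half-width of the 95% CI for the mean response at x0."""
    return t_crit * math.sqrt(MS_res * (1 / n + (x0 - xbar) ** 2 / Sxx))

w_center = ci_half_width(xbar)   # narrowest interval, at x0 = xbar
w_edge = ci_half_width(5.0)      # wider interval, away from xbar
```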

41 See pages 34–35 of the text.

42 2.5 Prediction of New Observations
Suppose we wish to construct a prediction interval on a future observation y0 corresponding to a particular level of x, say x0. The point estimate is ŷ0 = b0 + b1x0. The confidence interval on the mean response at this point is not appropriate for this situation. Why?

43 2.5 Prediction of New Observations
Let the random variable ψ be ψ = y0 − ŷ0. ψ is normally distributed with E(ψ) = 0 and Var(ψ) = σ²(1 + 1/n + (x0 − x̄)²/Sxx), since the future observation y0 is independent of ŷ0.

44 2.5 Prediction of New Observations
100(1 − α)% prediction interval on a future observation y0 at x0: ŷ0 ± t(α/2, n−2)·√(MSRes(1 + 1/n + (x0 − x̄)²/Sxx)).
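A sketch comparing the two intervals at the same x0 on the made-up data: the prediction interval carries the extra "1 +" term for the new observation's own error, so it is always wider than the CI on the mean response:

```python
import math

# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
MS_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
t_crit = 3.182  # t(0.025, 3 df), from a t table

x0 = 4.0
# CI half-width on the mean response vs. PI half-width on a new observation
ci_half = t_crit * math.sqrt(MS_res * (1 / n + (x0 - xbar) ** 2 / Sxx))
pi_half = t_crit * math.sqrt(MS_res * (1 + 1 / n + (x0 - xbar) ** 2 / Sxx))
```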


46 2.6 Coefficient of Determination
R², the coefficient of determination: R² = SSR/SST = 1 − SSRes/SST, the proportion of variation explained by the regressor x. For the rocket propellant data, R² = 0.9018.
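Both forms of R² can be computed on the made-up data used in the earlier sketches:

```python
# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx

SST = sum((yi - ybar) ** 2 for yi in y)   # total sum of squares
SSR = b1 * Sxy                            # regression sum of squares
SS_res = SST - SSR                        # residual sum of squares
R2 = SSR / SST                            # proportion of variation explained
```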

47 2.6 Coefficient of Determination
R2 can be misleading! Simply adding more terms to the model will increase R2 As the range of the regressor variable increases (decreases), R2 generally increases (decreases). R2 does not indicate the appropriateness of a linear model Linear Regression Analysis 5E Montgomery, Peck & Vining

48 2.7 Considerations in the Use of Regression
Extrapolation can be risky. Extreme points will often influence the slope. Outliers can disturb the least-squares fit. A linear relationship does not imply a cause-and-effect relationship (see the interesting example in the book, p. 40). Sometimes the value of the regressor variable is unknown and must itself be estimated or predicted.


50 2.8 Regression Through the Origin
The no-intercept model is y = β1x + ε. This model is appropriate for situations where the origin (0, 0) has some physical meaning. A scatter diagram can aid in determining whether an intercept or no-intercept model should be used. In addition, the practitioner can fit both models and compare them: examine the t-tests and the residual mean square.
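For the no-intercept model, the least-squares slope is Σxiyi/Σxi². A minimal sketch on the made-up data used earlier:

```python
# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]

# Least-squares slope for regression through the origin: sum(x*y) / sum(x^2)
b1_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
```

Note that b1_origin generally differs from the slope of the intercept model fit to the same data.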

51 2.9 Estimation by Maximum Likelihood
The method of least squares is not the only parameter-estimation method that can be used for linear regression models. If the form of the response (error) distribution is known, the maximum-likelihood (ML) estimation method can be used. In simple linear regression with normal errors, the MLEs of the regression coefficients are identical to those obtained by the method of least squares. Details of finding the MLEs are on page 47. The MLE of σ² is biased (not a serious problem here, though).


53 The MLEs of β0 and β1 are identical to the OLS estimators; the MLE of σ² is biased.


55 2.10 Case Where the Regressor is Random
When x and y have an unknown joint distribution, all of the "fixed x's" regression results hold if: the conditional distribution of y given x is normal with mean β0 + β1x and constant variance σ², and the x's are independent random variables whose distribution does not depend on β0, β1, or σ².


59 The sample correlation coefficient is related to the slope: b1 = (Syy/Sxx)^(1/2)·r, or equivalently r = b1·(Sxx/Syy)^(1/2). Also, r² = R², the coefficient of determination.
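The slope–correlation relationship and the identity r² = R² can be verified on the made-up data from the earlier sketches:

```python
import math

# Made-up data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 8, 11]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Syy = sum((yi - ybar) ** 2 for yi in y)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx

r = b1 * math.sqrt(Sxx / Syy)   # sample correlation from the slope
R2 = (b1 * Sxy) / Syy           # coefficient of determination, SSR/SST
```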


