…Don’t be afraid of others because they are bigger than you; real size is measured in wisdom. 11/10/2018, ST3131, Lecture 2
Chapter 2 Simple Linear Regression
ST5213 Semester II, 2000/2001. In this chapter, we consider the simplest regression model, the Simple Linear Regression (SLR) model

Y = β₀ + β₁X + ε,

which describes the linear relationship between Y and X. Tasks: 1. Review some basic statistics. 2. Define measures of the direction/strength of the linear relationship between Y and X. 3. Find formulas for the estimators β̂₀ and β̂₁.
Review of Some Basic Statistics
Let Y and X each have n observations: (y₁, x₁), …, (yₙ, xₙ). Summary statistics:
Mean: ȳ = (1/n) Σᵢ yᵢ, the average of the observations of Y; a measure of the sample center of Y.
Deviation: yᵢ − ȳ, the difference of an observation from the mean.
Variance: Var(Y) = (1/(n−1)) Σᵢ (yᵢ − ȳ)², the average of the squared deviations of Y.
Standard deviation: sd(Y) = √Var(Y), a measure of the spread of Y.
Standardization of Y: ỹᵢ = (yᵢ − ȳ)/sd(Y).
Similarly, we can define all of these terms for X.
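The summary statistics above can be computed directly. A minimal Python sketch (the function names are illustrative, not from the lecture):

```python
import math

def summary_stats(values):
    """Return mean, sample variance (n-1 divisor), and standard deviation."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance = sum(d * d for d in deviations) / (n - 1)
    return mean, variance, math.sqrt(variance)

def standardize(values):
    """Standardize: subtract the mean, divide by the standard deviation."""
    mean, _, sd = summary_stats(values)
    return [(v - mean) / sd for v in values]

y = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_y, var_y, sd_y = summary_stats(y)   # mean 3.0, variance 2.5
z = standardize(y)                       # standardized Y: mean 0, variance 1
```

Running `summary_stats` on the standardized values confirms the properties on the next slide: mean 0 and variance 1.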
Properties of Standardized Variables
The standardized variable ỹ has mean 0 and variance 1.
Proof: mean(ỹ) = (1/n) Σᵢ (yᵢ − ȳ)/sd(Y) = 0, since the deviations sum to 0; Var(ỹ) = (1/(n−1)) Σᵢ (yᵢ − ȳ)²/Var(Y) = 1.
Exact Linear Relationship between Y and X
Given Y = β₀ + β₁X, consider the linear relationship between Y and X:
Case 1) Positive: as X increases, Y increases, when β₁ > 0;
2) Negative: as X increases, Y decreases, when β₁ < 0;
3) Linearly uncorrelated: Y does not change as X changes, when β₁ = 0.
[Figure: three panels with β₀ = 1 and β₁ = −1, 0, 1.]
Conclusion: β₁ is an indicator of the direction in which Y changes with X.
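The three cases can be checked numerically. A short sketch with β₀ = 1, as in the figure panels:

```python
def exact_line(beta0, beta1, xs):
    """Exact linear relationship Y = beta0 + beta1 * X (no error term)."""
    return [beta0 + beta1 * x for x in xs]

xs = [0.0, 1.0, 2.0, 3.0]

y_pos = exact_line(1.0, 1.0, xs)    # beta1 > 0: Y increases with X
y_neg = exact_line(1.0, -1.0, xs)   # beta1 < 0: Y decreases with X
y_flat = exact_line(1.0, 0.0, xs)   # beta1 = 0: Y constant in X
```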
Distorted Linear Relationship between Y and X
Given Y = β₀ + β₁X + ε, consider the linear relationship between Y and X:
Case 1) Positive: as X increases, Y tends to increase, when β₁ > 0;
2) Negative: as X increases, Y tends to decrease, when β₁ < 0;
3) Linearly uncorrelated: Y tends not to change as X changes, when β₁ = 0.
[Figure: three panels with β₀ = 1 and β₁ = −1, 0, 1.]
Conclusion: β₁ is again an indicator of the direction in which Y changes with X.
Intuitive Derivation of the LS-estimators
Sample covariance of Y and X: Cov(Y,X) = (1/(n−1)) Σᵢ (yᵢ − ȳ)(xᵢ − x̄), the sum of the cross-products of the deviations of Y and X divided by (n−1).
Intuitive derivation: from Y = β₀ + β₁X + ε, taking the covariance with X on both sides and assuming Cov(ε, X) = 0, we have Cov(Y,X) = β₁Var(X). Thus β̂₁ = Cov(Y,X)/Var(X), and taking means on both sides with mean(ε) = 0 gives β̂₀ = ȳ − β̂₁x̄.
Formulas for the LS-estimators
Assume Cov(ε, X) = 0 and mean(ε) = 0. The LS-estimators are
β̂₁ = Cov(Y,X)/Var(X),  β̂₀ = ȳ − β̂₁x̄.
Since Var(X) > 0, Cov(Y,X) has the same sign as β̂₁. Thus Cov(Y,X) is also an indicator of the direction of the linear relationship between Y and X:
Case 1) Positive when Cov(Y,X) > 0; 2) Negative when Cov(Y,X) < 0; 3) Uncorrelated when Cov(Y,X) = 0.
Summary: indicators of the direction of the linear relationship between Y and X:
1) the slope β₁; 2) the slope estimator β̂₁; 3) Cov(Y,X).
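These formulas can be sketched in a few lines of Python (helper names are illustrative). On data generated from an exact line, the estimators recover the true coefficients:

```python
def mean(v):
    return sum(v) / len(v)

def cov(y, x):
    """Sample covariance with (n-1) divisor."""
    my, mx = mean(y), mean(x)
    return sum((yi - my) * (xi - mx) for yi, xi in zip(y, x)) / (len(y) - 1)

def ls_estimates(y, x):
    """Slope = Cov(Y,X)/Var(X); intercept = ybar - slope * xbar."""
    beta1 = cov(y, x) / cov(x, x)      # Var(X) = Cov(X,X)
    beta0 = mean(y) - beta1 * mean(x)
    return beta0, beta1

# Data from the exact line Y = 2 + 3X.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * xi for xi in x]
b0, b1 = ls_estimates(y, x)   # recovers b0 = 2.0, b1 = 3.0
```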
Properties of Cov(Y,X)
1. Symmetric, i.e., Cov(Y,X) = Cov(X,Y).
2. Scale-dependent, i.e., when the scales of Y or X change, so does their covariance. Let Y1 = a + bY and X1 = c + dX. Then Cov(Y1,X1) = bd·Cov(Y,X).
3. Takes values from −∞ to +∞, since b and d can take any values. Thus Cov(Y,X) does not measure the strength of the linear relationship between Y and X.
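The scale-dependence property Cov(a + bY, c + dX) = bd·Cov(Y,X) can be verified numerically; the values of a, b, c, d below are arbitrary:

```python
def mean(v):
    return sum(v) / len(v)

def cov(y, x):
    """Sample covariance with (n-1) divisor."""
    my, mx = mean(y), mean(x)
    return sum((yi - my) * (xi - mx) for yi, xi in zip(y, x)) / (len(y) - 1)

y = [1.0, 3.0, 2.0, 5.0, 4.0]
x = [2.0, 4.0, 6.0, 8.0, 10.0]
a, b, c, d = 7.0, -2.0, 3.0, 10.0

y1 = [a + b * yi for yi in y]
x1 = [c + d * xi for xi in x]

# The shifts (a, c) drop out; the scales (b, d) multiply the covariance.
lhs = cov(y1, x1)
rhs = b * d * cov(y, x)
```

Since b and d are unbounded, so is the rescaled covariance, which is why Cov alone cannot measure strength.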
Correlation Coefficient between Y and X
The correlation coefficient of Y and X is defined as the covariance of the standardized Y and X, i.e., the covariance of Y and X divided by their standard deviations:
Cor(Y,X) = Cov(Y,X)/(sd(Y)·sd(X)).
Clearly Cor(Y,X) and Cov(Y,X) have the same sign, so Cor(Y,X) is also an indicator of the direction of the linear relationship between Y and X:
1) Positive when Cor(Y,X) > 0; 2) Negative when Cor(Y,X) < 0; 3) Linearly uncorrelated when Cor(Y,X) = 0.
Properties of Cor(Y,X)
1. Symmetric, i.e., Cor(Y,X) = Cor(X,Y).
2. Scale-invariant, i.e., it does not change when the scales of Y and X change. Let Y1 = a + bY and X1 = c + dX with b > 0 and d > 0. Then Cor(Y1,X1) = Cor(Y,X).
3. Takes values between −1 and 1. The strength of the linear relationship:
1) strong when |Cor(Y,X)| is close to 1;
2) weak when |Cor(Y,X)| is close to 0;
3) linearly uncorrelated when Cor(Y,X) = 0, but Y and X can still have some relationship. Counterexample: the top-right picture, where Y = 2 − cos(6.28X) (a perfect nonlinear relationship) while Cor(Y,X) = 0.
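The counterexample can be reproduced numerically. A sketch using a grid of X values symmetric about 0.5: cos(2πX) takes equal values at X and 1 − X while the X-deviations are opposite, so the cross-products cancel and the correlation is essentially zero despite the perfect nonlinear relationship:

```python
import math

def mean(v):
    return sum(v) / len(v)

def cor(y, x):
    """Sample correlation: Cov(Y,X) / (sd(Y) * sd(X))."""
    my, mx = mean(y), mean(x)
    sxy = sum((yi - my) * (xi - mx) for yi, xi in zip(y, x))
    syy = sum((yi - my) ** 2 for yi in y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / math.sqrt(syy * sxx)

# X on a grid symmetric about 0.5; Y a perfect nonlinear function of X.
x = [(i + 0.5) / 20 for i in range(20)]
y = [2.0 - math.cos(2 * math.pi * xi) for xi in x]

r = cor(y, x)   # essentially 0: zero correlation, perfect dependence
```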
Examples of Correlation Coefficients
Cor(Y,X) = .98: very strong linearity. Cor(Y,X) = .71: strong linearity. Cor(Y,X) = −.09: nearly uncorrelated.
Robustness: neither Cov(Y,X) nor Cor(Y,X) is a robust statistic, since their values can be strongly affected by a few outliers.
Example: the four datasets of Anscombe's quartet (see next slide) have the same summary statistics but quite different pictures: (a) can be described by a linear model; (b) can be described by a quadratic model; (c) has an outlier, and so does (d).
Anscombe's Quartet: (a) strong linearity; (b) strong nonlinearity; (c) an outlier appears; (d) an outlier appears.
Table for Computing Variance and Covariance
Columns: i | yᵢ − ȳ | (yᵢ − ȳ)² | xᵢ − x̄ | (xᵢ − x̄)² | (yᵢ − ȳ)(xᵢ − x̄), with one row for each i = 1, 2, …, n and a Total row.
Note that Var(Y) = sum of squared deviations of Y divided by (n−1); Var(X) = sum of squared deviations of X divided by (n−1); Cov(Y,X) = sum of products of deviations of Y and X divided by (n−1).
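The table computation can be sketched in Python; each row holds the deviations, squared deviations, and cross-product for one observation, and the totals give Var(Y), Var(X), and Cov(Y,X):

```python
def deviation_table(y, x):
    """Build rows (i, y_dev, y_dev^2, x_dev, x_dev^2, cross) and the (n-1)-divisor totals."""
    n = len(y)
    ybar = sum(y) / n
    xbar = sum(x) / n
    rows = []
    for i, (yi, xi) in enumerate(zip(y, x), start=1):
        dy, dx = yi - ybar, xi - xbar
        rows.append((i, dy, dy * dy, dx, dx * dx, dy * dx))
    var_y = sum(r[2] for r in rows) / (n - 1)
    var_x = sum(r[4] for r in rows) / (n - 1)
    cov_yx = sum(r[5] for r in rows) / (n - 1)
    return rows, var_y, var_x, cov_yx

y = [1.0, 2.0, 3.0, 4.0]
x = [2.0, 4.0, 6.0, 8.0]
rows, var_y, var_x, cov_yx = deviation_table(y, x)
# var_y = 5/3, var_x = 20/3, cov_yx = 10/3
```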
Example: Computer Repair Data (Table 2.5, Page 27; see Table 2.6, Page 28 for the detailed computation). Conclusion: Y and X are strongly linearly related. Drawback: Cor(Y,X) cannot be used to predict Y values given X values. This can be done with Simple Linear Regression Analysis.
Strict Derivation of the LS-estimators
SLR model: Y = β₀ + β₁X + ε.
Intercept β₀ = the predicted value of Y when X = 0. Slope β₁ = the change in Y for a unit change in X.
Least Squares Method: find β̂₀ and β̂₁ to minimize the Sum of Squared Errors (SSE):
SSE(β₀, β₁) = Σᵢ (yᵢ − β₀ − β₁xᵢ)².
The minimizers are β̂₁ = Cov(Y,X)/Var(X) and β̂₀ = ȳ − β̂₁x̄, the same as those in Slide 7.
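That the closed-form estimators minimize the SSE can be checked numerically: the SSE at (β̂₀, β̂₁) should be smaller than the SSE at any nearby perturbed coefficients. A minimal sketch on made-up data:

```python
def mean(v):
    return sum(v) / len(v)

def ls_fit(y, x):
    """Closed-form least squares: slope = Cov(Y,X)/Var(X), intercept = ybar - slope*xbar."""
    my, mx = mean(y), mean(x)
    sxy = sum((yi - my) * (xi - mx) for yi, xi in zip(y, x))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def sse(y, x, b0, b1):
    """Sum of squared errors for the line b0 + b1 * x."""
    return sum((yi - b0 - b1 * xi) ** 2 for yi, xi in zip(y, x))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
b0, b1 = ls_fit(y, x)
best = sse(y, x, b0, b1)

# Perturbing the coefficients in any direction increases the SSE.
worse = min(sse(y, x, b0 + d0, b1 + d1)
            for d0 in (-0.1, 0.1) for d1 in (-0.1, 0.1))
```

Since the SSE is a strictly convex quadratic in (β₀, β₁), the closed-form minimizer is unique, so every perturbed SSE is strictly larger.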
Proof: write SSE(β₀, β₁) = Σᵢ (yᵢ − β₀ − β₁xᵢ)². Setting the partial derivatives with respect to β₀ and β₁ to 0 gives the normal equations Σᵢ (yᵢ − β₀ − β₁xᵢ) = 0 and Σᵢ xᵢ(yᵢ − β₀ − β₁xᵢ) = 0.
Proof (continued): Equality holds when β₁ = Cov(Y,X)/Var(X) and β₀ = ȳ − β₁x̄, which are the least squares estimators of the parameters β₀ and β₁. Since β̂₁ = Cov(Y,X)/Var(X), we have an important property of Cor(Y,X): Cor(Y,X) = β̂₁·sd(X)/sd(Y). Moreover, we have another important equation: ȳ = β̂₀ + β̂₁x̄, i.e., the fitted line passes through (x̄, ȳ).
Example: Computer Repair Data (continued). We have Cov(Y,X) = 136 and Var(X) = 2.96. Thus the LS regression line is Minutes = β̂₀ + β̂₁·Units, with fitted values ŷᵢ = β̂₀ + β̂₁xᵢ and residuals eᵢ = yᵢ − ŷᵢ. Using this formula, we can compute the fitted (predicted) values, e.g., at X = 4 the fitted value is 66.20; at X = 11 the predicted value is computed in the same way.
Exercise
(1) Fill in the following table, then compute the mean, variance, and standard deviation of Y and X.
(2) Compute the covariance and correlation of Y and X.
(3) Compute the simple linear regression coefficients.

i     | yᵢ − ȳ | (yᵢ − ȳ)² | xᵢ  | xᵢ − x̄ | (xᵢ − x̄)² | (yᵢ − ȳ)(xᵢ − x̄)
1     | −.3    | .09       | .1  | −.9    | .81       | .27
2     | −.2    | .04       | .4  | −.6    | .36       | .12
3     | −.1    | .01       | .7  |        |           |
4     |        |           | 1.2 | .2     |           |
5     |        |           | 1.6 | .6     |           |
6     | .3     |           | 2.0 |        |           |
Total: Σxᵢ = 6.0. Statistics: x̄ = 1.0.
Reading Assignment
Review Sections 2.1–2.5 of Chapter 2. Read Sections … of Chapter 2. Consider these problems:
a) How do we do significance tests of the parameters?
b) How do we construct confidence intervals for the parameters?
c) How do we make inferences about prediction?