1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + 

1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 +  1 x n Estimated Simple Linear Regression Equation y = b 0 + b 1 x ^

2 2 Slide 最小平方直線（最佳預測直線） n 通過平面分佈圖資料點的直線中，使預測誤差平方和爲最小者即稱爲最小平方直線，而此方法即稱爲最小平方法（ Least Square Method ） n 何謂誤差平方和？設爲 n 個資料點，若以做爲以 X 預測 Y 的直線，則當 X ＝ x1 ，預測值與實際觀察的 y1 之差異即稱爲預測誤差，誤差平方和即定義爲求使函數 f 爲最小時，由微積分解 “ 極大或極小 ” 方法。

3 3 Slide 最小平方直線解此聯立方程組：可得可得故最小平方直線為

4 4 Slide Example: Reed Auto Sales n Simple Linear Regression Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 6 previous sales are shown below. Number of TV Ads Number of Cars Sold Number of TV Ads Number of Cars Sold 114 324 218 117 327 222

5 5 Slide n Slope for the Estimated Regression Equation b 1 = 264 - (12)(122)/5 = 5 b 1 = 264 - (12)(122)/5 = 5 28 - (12) 2 /5 28 - (12) 2 /5 n y -Intercept for the Estimated Regression Equation b 0 = 20.333 - 5(2) = 10.333 b 0 = 20.333 - 5(2) = 10.333 n Estimated Regression Equation y = 10.333 + 5 x ^ Example: Reed Auto Sales

6 6 Slide Example: Reed Auto Sales n Scatter Diagram

7 7 Slide The Coefficient of Determination n Relationship Among SST, SSR, SSE SST = SSR + SSE n Coefficient of Determination r 2 = SSR/SST where: SST = total sum of squares SST = total sum of squares SSR = sum of squares due to regression SSR = sum of squares due to regression SSE = sum of squares due to error SSE = sum of squares due to error ^^

8 8 Slide 判定係數 n 定義： r 2 = SSR/SST n 用以表示 Y 的變異數中已被 X 解釋的部分（比率）當 r 2 愈大時，表示最小平方直線愈精確當 r 2 愈大時，表示最小平方直線愈精確 1 － r 2 為總變異數 (SST) 中無法由 X 解釋的餘量（剩餘的比率） 1 － r 2 為總變異數 (SST) 中無法由 X 解釋的餘量（剩餘的比率） n 表示汽車銷售量的差異與變化有 85.2% 可由 “ 廣告次數 ” 這個因素來解釋（而有 14.8% 無法由 “ 廣告次數 ” 所解釋）表示汽車銷售量的差異與變化有 85.2% 可由 “ 廣告次數 ” 這個因素來解釋（而有 14.8% 無法由 “ 廣告次數 ” 所解釋） Example: Reed Auto Sales r 2 = SSR/SST = 100/117.333 =.852273

9 9 Slide The Correlation Coefficient n Sample Correlation Coefficient where: b 1 = the slope of the estimated regression b 1 = the slope of the estimated regressionequation

10 Slide Example: Reed Auto Sales n Sample Correlation Coefficient The sign of b 1 in the equation is “+”. r xy = +.923186 r xy = +.923186

11 Slide Model Assumptions Assumptions About the Error Term  Assumptions About the Error Term  The error  is a random variable with mean of zero. The error  is a random variable with mean of zero. The variance of , denoted by  2, is the same for all values of the independent variable. The variance of , denoted by  2, is the same for all values of the independent variable. The values of  are independent. The values of  are independent. The error  is a normally distributed random variable. The error  is a normally distributed random variable.

12 Slide Testing for Significance To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of  1 is zero. To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of  1 is zero. n Two tests are commonly used t Test t Test F Test F Test Both tests require an estimate of  2, the variance of  in the regression model. Both tests require an estimate of  2, the variance of  in the regression model.

13 Slide Testing for Significance An Estimate of  2 An Estimate of  2 The mean square error (MSE) provides the estimate of  2, and the notation s 2 is also used. s 2 = MSE = SSE/(n-2) s 2 = MSE = SSE/(n-2)where:

14 Slide Testing for Significance An Estimate of  An Estimate of  To estimate  we take the square root of  2. To estimate  we take the square root of  2. The resulting s is called the standard error of the estimate. The resulting s is called the standard error of the estimate.

15 Slide Testing for Significance: t Test n Hypotheses H 0 :  1 = 0 H 0 :  1 = 0 H a :  1 = 0 H a :  1 = 0 n Test Statistic n Rejection Rule Reject H 0 if t t  where t  is based on a t distribution with where t  is based on a t distribution with n - 2 degrees of freedom. n - 2 degrees of freedom.

16 Slide n t Test Hypotheses H 0 :  1 = 0 Hypotheses H 0 :  1 = 0 H a :  1 = 0 H a :  1 = 0 Rejection Rule Rejection Rule For  =.05 and d.f. = 4, t.025 = 2.776 For  =.05 and d.f. = 4, t.025 = 2.776 Reject H 0 if t > 2.776 Reject H 0 if t > 2.776 Test Statistics Test Statistics t = 5/1.0408 = 4.804 Conclusions Conclusions Reject H 0 Reject H 0 P-value 2P{T>4.804}=0.0086 4.804}=0.0086 <0.05 Reject H 0 Reject H 0 Example: Reed Auto Sales

17 Slide Confidence Interval for  1 We can use a 95% confidence interval for  1 to test the hypotheses just used in the t test. We can use a 95% confidence interval for  1 to test the hypotheses just used in the t test. H 0 is rejected if the hypothesized value of  1 is not included in the confidence interval for  1. H 0 is rejected if the hypothesized value of  1 is not included in the confidence interval for  1.

18 Slide Confidence Interval for  1 The form of a confidence interval for  1 is: The form of a confidence interval for  1 is: where b 1 is the point estimate is the margin of error is the t value providing an area of  /2 in the upper tail of a t distribution with n - 2 degrees t distribution with n - 2 degrees of freedom

19 Slide Example: Reed Auto Sales n Rejection Rule Reject H 0 if 0 is not included in the confidence interval for  1. 95% Confidence Interval for  1 95% Confidence Interval for  1 = 5 2.776(1.0408) = 5 2.89 = 5 2.776(1.0408) = 5 2.89 or 2.11 to 7.89 n Conclusion Reject H 0

20 Slide Testing for Significance: F Test n Hypotheses H 0 :  1 = 0 H 0 :  1 = 0 H a :  1 = 0 H a :  1 = 0 n Test Statistic F = MSR/MSE n Rejection Rule Reject H 0 if F > F  where F  is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator.

21 Slide n F Test Hypotheses H 0 :  1 = 0 Hypotheses H 0 :  1 = 0 H a :  1 = 0 H a :  1 = 0 Rejection Rule Rejection Rule For  =.05 and d.f. = 1, 4: F.05 = 7.709 For  =.05 and d.f. = 1, 4: F.05 = 7.709 Reject H 0 if F > 7.709. Reject H 0 if F > 7.709. Test Statistic Test Statistic F = MSR/MSE = 100/4.333 = 23.077 Conclusion Conclusion We can reject H 0. Example: Reed Auto Sales

22 Slide Some Cautions about the Interpretation of Significance Tests Rejecting H 0 :  1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y. Rejecting H 0 :  1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y. Just because we are able to reject H 0 :  1 = 0 and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y. Just because we are able to reject H 0 :  1 = 0 and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.

23 Slide n Confidence Interval Estimate of E ( y p ) n Prediction Interval Estimate of y p y p + t  /2 s ind y p + t  /2 s ind where the confidence coefficient is 1 -  and t  /2 is based on a t distribution with n - 2 d.f. is the standard error of the estimate of E ( y p ) is the standard error of the estimate of E ( y p ) s ind is the standard error of individual estimate of estimate of Using the Estimated Regression Equation for Estimation and Prediction

24 Slide Standard Errors of Estimate of E ( y p ) and y p

25 Slide E ( y p ) 與 y p 估計式的變異數 n 的變異數：  的變異數：  的變異數： n 估計式的變異數：

26 Slide n Point Estimation If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be: y = 10.333 + 5(3) = 25.333 cars n Confidence Interval for E ( y p ) 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is: 25.333 + 3.730 = 21.603 to 29.063 cars 25.333 + 3.730 = 21.603 to 29.063 cars n Prediction Interval for y p 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is: 25.333 + 6.878 = 18.455 to 32.211 cars ^ Example: Reed Auto Sales

1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + 

Similar presentations

Presentation on theme: "1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + "— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + 

Similar presentations

Presentation on theme: "1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + "— Presentation transcript:

Similar presentations

About project

Feedback