Published by Bernadette Matthews. Modified over 8 years ago.
1
Simple Linear Regression and Correlation (continued)
Reference: Chapter 17 of Statistics for Management and Economics, 7th Edition, Gerald Keller.
2
17.4 Error Variable: Required Conditions
The error $\varepsilon$ is a critical part of the regression model. Four requirements involving the distribution of $\varepsilon$ must be satisfied.
– The probability distribution of $\varepsilon$ is normal.
– The mean of $\varepsilon$ is zero: $E(\varepsilon) = 0$.
– The standard deviation of $\varepsilon$ is $\sigma_\varepsilon$, the same for all values of x.
– The errors associated with different values of y are all independent.
3
Observational and Experimental Data
Observational: Y and X are both random variables. Example: Y = return, X = inflation. Models: regression, correlation (bivariate normal).
Experimental: Y is a random variable, X is controlled. Example: Y = blood pressure, X = medicine dose. Models: regression.
4
17.5 Assessing the Model
For our assumed model, the least squares method will produce a regression line whether or not there is a linear relationship between x and y.
5
Consequently, it is important to assess how well the linear model fits the data, because linearity was assumed rather than verified. Several methods are used to assess the model. All are based on the sum of squares for errors, SSE.
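The point that least squares always returns a line can be illustrated with a small sketch. The data below are hypothetical (not from the slides): y depends on x exactly, but through a quadratic rather than a linear relationship. The helper name `least_squares` is likewise only illustrative.

```python
def least_squares(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: y = x**2, a perfect but non-linear relationship.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]

b0, b1 = least_squares(x, y)

# Sum of squares for errors around the fitted line.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

print(f"fitted line: y_hat = {b0:.2f} + {b1:.2f} * x, SSE = {sse:.2f}")
```

The method still produces a line (here a flat one through the mean of y), but the large SSE relative to the total variation signals that the line explains none of it.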
6
Sum of Squares for Errors
– This is the sum of squared differences between the observed points and the points on the regression line.
– It can serve as a measure of how well the line fits the data.
SSE is defined by
$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
7
Standard Error of Estimate, $s_\varepsilon$
– The standard deviation of the error variable, $\sigma_\varepsilon$, shows the dispersion around the true line for a given x.
– If $\sigma_\varepsilon$ is large, the observations are widely dispersed around the true line. If $\sigma_\varepsilon$ is small, the observations tend to be close to the line, and the model fits the data well.
– Therefore, we can use $\sigma_\varepsilon$ as a measure of the suitability of a linear model.
8
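$\sigma_\varepsilon$ is not known, so we use an estimate of it. An estimate of $\sigma_\varepsilon$ is given by $s_\varepsilon$, the standard error of estimate:
$$s_\varepsilon = \sqrt{\frac{\mathrm{SSE}}{n-2}}$$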
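As a rough sketch of how the standard error of estimate might be computed, the snippet below fits the least squares line to the food company data from these slides (X = ADVER, Y = SALES) and then evaluates SSE and the estimate of the error standard deviation. The variable names are illustrative choices, not part of the original slides.

```python
import math

# Food company data as tabulated in the slides (X = ADVER, Y = SALES).
adver = [276, 552, 720, 648, 336, 396, 1056, 1188, 372]
sales = [115.0, 135.6, 153.6, 117.6, 106.8, 150.0, 164.4, 190.8, 136.8]

n = len(adver)
x_bar = sum(adver) / n
y_bar = sum(sales) / n

# Least squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adver, sales))
s_xx = sum((x - x_bar) ** 2 for x in adver)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals, SSE, and the standard error of estimate with n - 2 degrees of freedom.
residuals = [y - (b0 + b1 * x) for x, y in zip(adver, sales)]
sse = sum(e ** 2 for e in residuals)
s_eps = math.sqrt(sse / (n - 2))

print(f"b0 = {b0:.4f}, b1 = {b1:.6f}")
print(f"SSE = {sse:.3f}, s_eps = {s_eps:.3f}")
```

The SSE obtained this way should agree with the total of the squared-residual column in the slides' table (about 1812.658), which gives a standard error of estimate of roughly 16.09.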
9
The example with the food company:
10
Earlier: Inference about $\mu$ (when $\sigma^2$ is unknown)
Model: $X \sim N(\mu, \sigma^2)$
Hypotheses: $H_0: \mu = 50$, $H_1: \mu \neq 50$
Test statistic: $t = \dfrac{\bar{x} - 50}{s/\sqrt{n}} \sim t(n-1)$ if $H_0$ is true
Level of significance: $\alpha$
11
Rejection region: reject $H_0$ if $t_{obs} > t_{crit}$ or $t_{obs} < -t_{crit}$
Observation: $t_{obs}$
Conclusion: if $t_{obs} > t_{crit}$ or $t_{obs} < -t_{crit}$, reject $H_0$; if $-t_{crit} \le t_{obs} \le t_{crit}$, don't reject $H_0$
Interpretation: if $H_0$ is rejected, we have empirical support for $H_1$
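The two-sided test procedure recapped above can be sketched as follows. The sample is hypothetical (the slides give no data for this recap), the function name `one_sample_t` is illustrative, and the critical value 2.365 is the tabulated $t_{0.025}$ for 7 degrees of freedom, hard-coded here to keep the sketch dependency-free.

```python
import math


def one_sample_t(sample, mu0):
    """Return the t statistic for H0: mu = mu0 when sigma is unknown."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample standard deviation with n - 1 in the denominator.
    s = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
    return (mean - mu0) / (s / math.sqrt(n))


# Hypothetical sample of n = 8 observations.
sample = [52, 48, 55, 51, 49, 53, 54, 50]

t_obs = one_sample_t(sample, mu0=50)
t_crit = 2.365  # t table value for alpha/2 = 0.025 and n - 1 = 7 degrees of freedom

# Two-sided rejection region: reject H0 if t_obs > t_crit or t_obs < -t_crit.
reject_h0 = abs(t_obs) > t_crit

print(f"t_obs = {t_obs:.3f}, t_crit = {t_crit}, reject H0: {reject_h0}")
```

For this sample the observed statistic falls inside the interval between the two critical values, so $H_0: \mu = 50$ is not rejected at the 5% level.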
12
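Testing the Slope
We test whether there is a slope, that is, whether x and y are linearly related. Formally:
$H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
Test statistic: $t = \dfrac{b_1 - \beta_1}{s_{b_1}}$
Under $H_0$, $t = b_1 / s_{b_1} \sim t(n-2)$, where $s_{b_1} = \dfrac{s_\varepsilon}{\sqrt{\sum_i (x_i - \bar{x})^2}}$
Confidence interval: $b_1 \pm t_{\alpha/2,\,n-2}\, s_{b_1}$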
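A minimal sketch of the slope test and its confidence interval, applied to the food company data from these slides (X = ADVER, Y = SALES). It assumes the usual textbook forms: the standard error of the slope is the standard error of estimate divided by the square root of the sum of squared x deviations, and the statistic has n - 2 degrees of freedom. The function name `slope_inference` is illustrative, and the critical value 2.365 is the tabulated $t_{0.025}$ for 7 degrees of freedom.

```python
import math

# Food company data as tabulated in the slides (X = ADVER, Y = SALES).
adver = [276, 552, 720, 648, 336, 396, 1056, 1188, 372]
sales = [115.0, 135.6, 153.6, 117.6, 106.8, 150.0, 164.4, 190.8, 136.8]


def slope_inference(x, y, t_crit):
    """Return (b1, s_b1, t_obs, ci) for the test of H0: beta1 = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_eps = math.sqrt(sse / (n - 2))   # standard error of estimate
    s_b1 = s_eps / math.sqrt(s_xx)     # standard error of the slope
    t_obs = b1 / s_b1                  # test statistic under H0: beta1 = 0
    ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
    return b1, s_b1, t_obs, ci


T_CRIT = 2.365  # t table value for alpha/2 = 0.025 and n - 2 = 7 degrees of freedom
b1, s_b1, t_obs, ci = slope_inference(adver, sales, T_CRIT)

print(f"b1 = {b1:.6f}, s_b1 = {s_b1:.6f}, t_obs = {t_obs:.3f}")
print(f"95% CI for beta1: ({ci[0]:.4f}, {ci[1]:.4f})")
```

With these data the observed statistic exceeds the critical value and the interval excludes zero, so the test rejects $H_0: \beta_1 = 0$ at the 5% level.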
13
Coefficient of determination
To measure the strength of the linear relationship we use the coefficient of determination, $R^2$. It measures how much of the variation in Y is explained by the variation in X, that is, the percentage of the variation in Y that the model explains.
14
Food Company example: Call X = ADVER and Y = SALES

i      x_i    y_i      y_i - y_bar   (y_i - y_bar)^2   y_hat_i    (y_i - y_hat_i)^2
1      276    115.0    -26.177778    685.27605         118.0087   9.052434
2      552    135.6    -5.577778     31.11160          136.8165   1.479981
3      720    153.6    12.422222     154.31160         148.2648   28.464554
4      648    117.6    -23.577778    555.91160         143.3584   663.494881
5      336    106.8    -34.377778    1181.83160        122.0974   234.009911
6      396    150.0    8.822222      77.83160          126.1860   567.104757
7      1056   164.4    23.222222     539.27160         171.1613   45.714584
8      1188   190.8    49.622222     2462.36494        180.1563   113.288358
9      372    136.8    -4.377778     19.16494          124.5506   150.048384
Total  5544   1270.6                 5707.076                     1812.658
Mean   616    141.178
15
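Standard error of the estimate: $s_\varepsilon = \sqrt{\mathrm{SSE}/(n-2)} = \sqrt{1812.658/7} \approx 16.09$
Standard deviation of $b_1$: $s_{b_1} = s_\varepsilon / \sqrt{\sum_i (x_i - \bar{x})^2}$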
16
Coefficient of determination
The overall variability in y is explained in part by the regression model and remains, in part, unexplained (the error).
Variation in the dependent variable Y = variation explained by the independent variable + unexplained variation
SST = SSR + SSE
The greater the explained variation, the better the model. The coefficient of determination is a measure of the explanatory power of the model.
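The decomposition SST = SSR + SSE can be checked numerically. The sketch below uses the food company data from these slides (X = ADVER, Y = SALES) and assumes the standard definition of the coefficient of determination as the explained share of the total variation, SSR / SST. Variable names are illustrative.

```python
# Food company data as tabulated in the slides (X = ADVER, Y = SALES).
adver = [276, 552, 720, 648, 336, 396, 1056, 1188, 372]
sales = [115.0, 135.6, 153.6, 117.6, 106.8, 150.0, 164.4, 190.8, 136.8]

n = len(adver)
x_bar = sum(adver) / n
y_bar = sum(sales) / n

# Least squares fit.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(adver, sales))
s_xx = sum((x - x_bar) ** 2 for x in adver)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in adver]

# Total, explained, and unexplained variation.
sst = sum((y - y_bar) ** 2 for y in sales)
ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(sales, fitted))

r_squared = ssr / sst  # equivalently 1 - sse / sst

print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"R^2 = {r_squared:.4f}")
```

For these data roughly 68% of the variation in SALES is explained by ADVER, and the remaining share corresponds to SSE.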
22
Cause and effect
Note that conclusions about cause and effect, X → Y, are based on knowledge of the subject. Experimental studies can establish it. It is hard to establish with observational studies. E.g.: Is "smoking → lung cancer" true? The regression model only shows the linear relationship. We will make the same inferences even if we switch the variables!
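The remark about switching the variables can be illustrated with a short sketch. Using the food company data from these slides (X = ADVER, Y = SALES), it computes the t statistic for the slope twice, once regressing SALES on ADVER and once regressing ADVER on SALES. The function name `slope_t` is illustrative.

```python
import math

# Food company data as tabulated in the slides (X = ADVER, Y = SALES).
adver = [276, 552, 720, 648, 336, 396, 1056, 1188, 372]
sales = [115.0, 135.6, 153.6, 117.6, 106.8, 150.0, 164.4, 190.8, 136.8]


def slope_t(x, y):
    """Return the t statistic for H0: beta1 = 0 when regressing y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(s_xx)
    return b1 / s_b1


t_sales_on_adver = slope_t(adver, sales)
t_adver_on_sales = slope_t(sales, adver)

print(f"t (SALES on ADVER) = {t_sales_on_adver:.4f}")
print(f"t (ADVER on SALES) = {t_adver_on_sales:.4f}")
```

The two statistics coincide up to floating-point rounding, because the slope t statistic depends on the data only through the sample correlation and n. The test therefore cannot tell which variable is the cause.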
24
Do the analysis in the right order: first the theory of the subject, then the statistical model!