
1 Applied Quantitative Analysis and Practices LECTURE#23 By Dr. Osman Sadiq Paracha

2 Previous Lecture Summary
- Simple Linear Regression: Correlation vs. Regression
- Introduction to Simple Linear Regression
- Simple Linear Regression Model
- Least Squares Method
- Interpretation of the Model
- Measures of Variation

3 Simple Linear Regression Model
- Only one independent variable, X
- The relationship between X and Y is described by a linear function
- Changes in Y are assumed to be related to changes in X

4 Simple Linear Regression Model
The population model has a linear component and a random error component:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where $Y_i$ is the dependent variable, $X_i$ the independent variable, $\beta_0$ the population Y intercept, $\beta_1$ the population slope coefficient, and $\varepsilon_i$ the random error term. $\beta_0 + \beta_1 X_i$ is the linear component and $\varepsilon_i$ the random error component.

5 Simple Linear Regression Model (continued)
[Figure: scatter plot of Y against X with the population regression line drawn; intercept $\beta_0$, slope $\beta_1$, and, for a given $X_i$, the random error $\varepsilon_i$ shown as the vertical gap between the observed value of Y and the predicted value of Y.]
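To make the model concrete, here is a minimal sketch (not part of the lecture) that simulates data from the population model above; the parameter values, the X range, and the noise scale are illustrative assumptions, not values from the slides.

```python
import numpy as np

# Simulate Y_i = beta_0 + beta_1 * X_i + eps_i.
rng = np.random.default_rng(42)
beta_0, beta_1 = 98.25, 0.11                # hypothetical population parameters
X = rng.uniform(1000, 3000, size=50)        # independent variable (e.g., square feet)
eps = rng.normal(loc=0, scale=40, size=50)  # random error term
Y = beta_0 + beta_1 * X + eps               # dependent variable
```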

6 Simple Linear Regression Equation (Prediction Line)
The simple linear regression equation provides an estimate of the population regression line:
$\hat{Y}_i = b_0 + b_1 X_i$
where $\hat{Y}_i$ is the estimated (or predicted) Y value for observation i, $X_i$ the value of X for observation i, $b_0$ the estimate of the regression intercept, and $b_1$ the estimate of the regression slope.

7 The Least Squares Method
$b_0$ and $b_1$ are obtained by finding the values that minimize the sum of the squared differences between $Y_i$ and $\hat{Y}_i$:
$\min \sum (Y_i - \hat{Y}_i)^2 = \min \sum \left( Y_i - (b_0 + b_1 X_i) \right)^2$
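As a sketch of how these estimates can be computed directly from the closed-form least squares solution (assuming NumPy arrays X and Y, e.g., from the simulation above):

```python
import numpy as np

def least_squares(X, Y):
    """Closed-form least squares estimates (b0, b1) for Y = b0 + b1*X."""
    x_bar, y_bar = X.mean(), Y.mean()
    # Slope: co-deviation of X and Y divided by the squared deviation of X
    b1 = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)
    # Intercept: forces the fitted line through the point (x_bar, y_bar)
    b0 = y_bar - b1 * x_bar
    return b0, b1
```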

8 Interpretation of the Slope and the Intercept
- $b_0$ is the estimated average value of Y when the value of X is zero
- $b_1$ is the estimated change in the average value of Y as a result of a one-unit increase in X

9 Simple Linear Regression Example: Graphical Representation
House price model: scatter plot and prediction line with intercept $b_0$ = 98.248 and slope $b_1$ = 0.10977, i.e.
house price ($1000s) = 98.248 + 0.10977 (square feet)

10 Simple Linear Regression Example: Making Predictions
When using a regression model for prediction, predict only within the relevant range of the data: the range of observed X values is the relevant range for interpolation. Do not try to extrapolate beyond the range of observed X's. (A guarded prediction sketch follows below.)
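A minimal sketch of this rule using the fitted house-price line from the example (price in $1000s); the relevant-range bounds here are hypothetical placeholders standing in for the min/max of the observed square footage.

```python
def predict_house_price(square_feet, x_min=1100.0, x_max=2500.0):
    """Predict price ($1000s) only within the relevant range of observed X."""
    if not (x_min <= square_feet <= x_max):
        raise ValueError("square_feet is outside the relevant range; "
                         "do not extrapolate beyond the observed X's")
    return 98.248 + 0.10977 * square_feet
```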

11 Measures of Variation
Total variation is made up of two parts:
$SST = SSR + SSE$
where SST is the total sum of squares, SSR the regression sum of squares, and SSE the error sum of squares; $\bar{Y}$ denotes the mean value of the dependent variable, $Y_i$ the observed value of the dependent variable, and $\hat{Y}_i$ the predicted value of Y for the given $X_i$ value.

12 Measures of Variation (continued)
- SST = total sum of squares (total variation): measures the variation of the $Y_i$ values around their mean $\bar{Y}$
- SSR = regression sum of squares (explained variation): variation attributable to the relationship between X and Y
- SSE = error sum of squares (unexplained variation): variation in Y attributable to factors other than X

13 Measures of Variation (continued)
$SST = \sum (Y_i - \bar{Y})^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
[Figure: for a single observation $(X_i, Y_i)$, the deviation $Y_i - \bar{Y}$ is split into the explained part $\hat{Y}_i - \bar{Y}$ and the unexplained part $Y_i - \hat{Y}_i$.]

14 Coefficient of Determination, r²
The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable. The coefficient of determination is also called r-squared and is denoted $r^2$:
$r^2 = \frac{SSR}{SST}$
note: $0 \le r^2 \le 1$
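A sketch computing SST, SSR, SSE, and $r^2$ from fitted values (assuming NumPy arrays X and Y and estimates b0, b1 as returned by the least_squares sketch above):

```python
import numpy as np

def variation_measures(X, Y, b0, b1):
    """Return (SST, SSR, SSE, r2) for a fitted simple linear regression."""
    Y_hat = b0 + b1 * X
    sst = np.sum((Y - Y.mean()) ** 2)      # total variation
    ssr = np.sum((Y_hat - Y.mean()) ** 2)  # explained variation
    sse = np.sum((Y - Y_hat) ** 2)         # unexplained variation
    r2 = ssr / sst                         # coefficient of determination
    return sst, ssr, sse, r2
```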

15 Examples of Approximate r² Values
$r^2 = 1$: perfect linear relationship between X and Y; 100% of the variation in Y is explained by variation in X.
[Figures: scatter plots in which every point lies exactly on the regression line.]

16 Examples of Approximate r² Values
$0 < r^2 < 1$: weaker linear relationships between X and Y; some but not all of the variation in Y is explained by variation in X.
[Figures: scatter plots with points clustered around, but not exactly on, the regression line.]

17 Examples of Approximate r² Values
$r^2 = 0$: no linear relationship between X and Y; the value of Y does not depend on X (none of the variation in Y is explained by variation in X).
[Figure: scatter plot with a horizontal regression line.]

18 Simple Linear Regression Example: Coefficient of Determination

Regression Statistics
  Multiple R         0.76211
  R Square           0.58082
  Adjusted R Square  0.52842
  Standard Error     41.33032
  Observations       10

ANOVA
             df   SS           MS           F         Significance F
  Regression  1   18934.9348   18934.9348   11.0848   0.01039
  Residual    8   13665.5652    1708.1957
  Total       9   32600.5000

              Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
  Intercept      98.24833       58.03348    1.69296  0.12892  -35.57720  232.07386
  Square Feet     0.10977        0.03297    3.32938  0.01039    0.03374    0.18580

$r^2 = SSR / SST = 18934.9348 / 32600.5000 = 0.58082$: 58.08% of the variation in house prices is explained by variation in square feet.

19 Standard Error of Estimate
The standard deviation of the variation of observations around the regression line is estimated by
$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n-2}}$
where SSE is the error sum of squares and n is the sample size.
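A direct transcription of this formula (a sketch assuming NumPy arrays of observed and fitted values):

```python
import numpy as np

def std_error_of_estimate(Y, Y_hat):
    """S_YX = sqrt(SSE / (n - 2)); n - 2 because b0 and b1 are estimated."""
    sse = np.sum((Y - Y_hat) ** 2)
    return np.sqrt(sse / (len(Y) - 2))
```

With the example's SSE = 13665.5652 and n = 10, this gives $\sqrt{13665.5652/8} \approx 41.33$, matching the Standard Error in the regression output.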

20 Simple Linear Regression Example: Standard Error of Estimate
From the regression output on slide 18, $S_{YX} = 41.33032$; equivalently, $\sqrt{SSE/(n-2)} = \sqrt{13665.5652/8} = 41.33032$.

21 Comparing Standard Errors
$S_{YX}$ is a measure of the variation of observed Y values from the regression line. The magnitude of $S_{YX}$ should always be judged relative to the size of the Y values in the sample data; e.g., $S_{YX}$ = $41.33K is moderately small relative to house prices in the $200K - $400K range.
[Figures: scatter plots with the same fitted line but different spread of points, illustrating small versus large $S_{YX}$.]

22 Assumptions of Regression: L.I.N.E.
- Linearity: the relationship between X and Y is linear
- Independence of errors: error values are statistically independent
- Normality of error: error values are normally distributed for any given value of X
- Equal variance (also called homoscedasticity): the probability distribution of the errors has constant variance

23 Residual Analysis
The residual for observation i, $e_i = Y_i - \hat{Y}_i$, is the difference between its observed and predicted value. Check the assumptions of regression by examining the residuals:
- Examine for the linearity assumption
- Evaluate the independence assumption
- Evaluate the normal distribution assumption
- Examine for constant variance for all levels of X (homoscedasticity)
Graphical analysis of residuals: plot the residuals vs. X (a sketch follows below).
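A minimal plotting sketch for the residuals-vs.-X check (assuming Matplotlib and arrays X, Y, Y_hat as above):

```python
import matplotlib.pyplot as plt

def plot_residuals_vs_x(X, Y, Y_hat):
    """Residuals vs. X: a patternless band around zero supports the assumptions."""
    plt.scatter(X, Y - Y_hat)
    plt.axhline(0, linestyle="--", color="gray")
    plt.xlabel("X")
    plt.ylabel("Residual")
    plt.show()
```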

24 Residual Analysis for Linearity
[Figures: residuals plotted against X. A curved, systematic residual pattern indicates the relationship is not linear; a random scatter around zero is consistent with linearity.]

25 Residual Analysis for Independence
[Figures: residuals plotted against X. A systematic pattern over X suggests the errors are not independent; a random scatter is consistent with independence.]

26 Checking for Normality
- Examine the stem-and-leaf display of the residuals
- Examine the boxplot of the residuals
- Examine the histogram of the residuals
- Construct a normal probability plot of the residuals

27 Residual Analysis for Normality
When using a normal probability plot, normal errors will display approximately as a straight line.
[Figure: normal probability plot, percent vs. residual, with the points falling roughly on a straight line.]
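One way to produce such a plot is SciPy's probplot (a sketch; residuals is assumed to be a NumPy array of the $e_i$ values):

```python
import matplotlib.pyplot as plt
from scipy import stats

def normal_probability_plot(residuals):
    """Normal errors should fall approximately on the straight reference line."""
    stats.probplot(residuals, dist="norm", plot=plt)
    plt.show()
```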

28 Residual Analysis for Equal Variance
[Figures: residuals plotted against X. A spread that widens or narrows with X indicates non-constant variance; an even band of scatter indicates constant variance.]

29 Simple Linear Regression Example: Residual Output

  Observation  Predicted House Price  Residuals
  1            251.92316              -6.923162
  2            273.87671              38.12329
  3            284.85348              -5.853484
  4            304.06284              3.937162
  5            218.99284              -19.99284
  6            268.38832              -49.38832
  7            356.20251              48.79749
  8            367.17929              -43.17929
  9            254.66746              4.33264
  10           284.85348              -29.85348

These residuals do not appear to violate any regression assumptions.

30 Inferences About the Slope
The standard error of the regression slope coefficient ($b_1$) is estimated by
$S_{b_1} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}}$
where $S_{b_1}$ is the estimate of the standard error of the slope and $S_{YX} = \sqrt{SSE/(n-2)}$ is the standard error of the estimate.
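A sketch putting the pieces together for a t test of $H_0: \beta_1 = 0$ (assuming NumPy arrays and the earlier estimates b0, b1):

```python
import numpy as np

def slope_inference(X, Y, b0, b1):
    """Return (S_b1, t) for testing H0: beta_1 = 0."""
    Y_hat = b0 + b1 * X
    s_yx = np.sqrt(np.sum((Y - Y_hat) ** 2) / (len(Y) - 2))  # std. error of estimate
    s_b1 = s_yx / np.sqrt(np.sum((X - X.mean()) ** 2))       # std. error of slope
    return s_b1, b1 / s_b1

# For the house-price output above, b1 = 0.10977 and S_b1 = 0.03297
# give t = 0.10977 / 0.03297 = 3.32938, matching the table on slide 18.
```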

31 Lecture Summary
- Measures of variation (graphical representation)
- Coefficient of determination ($r^2$), with an example
- Standard error of the estimate
- Assumptions of regression (L.I.N.E.): linearity, independence of errors, normality of errors, equal variance
- Inferences about the slope

