Chapter 5: Regression Analysis Part 1: Simple Linear Regression.


1 Chapter 5: Regression Analysis Part 1: Simple Linear Regression

2 Regression Analysis Building models that characterize the relationship between a dependent variable and one (simple regression) or more (multiple regression) independent variables, all of which are numerical. For example: Sales = a + b*Price + c*Coupons + d*Advertising + e*Price*Advertising. Applies to cross-sectional data and to time series data (forecasting).

3 Simple Linear Regression Single independent variable Linear relationship

4 SLR Model Y = β0 + β1·X + ε, where β0 is the intercept, β1 is the slope, and ε is the error term. E(Y|X) = β0 + β1·X is the mean of Y at a given X; f(Y|X) is the distribution of Y at a given X.

5 Error Terms (Residuals) εi = Yi − β0 − β1·Xi

6 Estimation of the Regression Line True regression line (unknown): Y = β0 + β1·X + ε. Estimated regression line: Ŷ = b0 + b1·X. Observable errors (residuals): ei = Yi − b0 − b1·Xi.

7 Least Squares Regression Choose b0 and b1 to minimize the sum of squared errors Σ(Yi − b0 − b1·Xi)². The solution is b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² and b0 = Ȳ − b1·X̄.
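
The closed-form least-squares formulas can be sketched in a few lines of Python; the five data points here are invented purely for illustration, not from the slides.

```python
# Least-squares estimates for simple linear regression,
# using the closed-form formulas b1 = Sxy/Sxx and b0 = Ybar - b1*Xbar.

def least_squares(x, y):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sxy = sum((Xi - Xbar)(Yi - Ybar)); Sxx = sum((Xi - Xbar)^2)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

X = [1, 2, 3, 4, 5]        # made-up sample data
Y = [2, 4, 5, 4, 5]
b0, b1 = least_squares(X, Y)
print(round(b0, 6), round(b1, 6))  # 2.2 0.6
```

For this data, X̄ = 3 and Ȳ = 4, so b1 = 6/10 = 0.6 and b0 = 4 − 0.6·3 = 2.2.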

8 Excel Trendlines Construct a scatter diagram. Method 1: select Chart > Add Trendline. Method 2: select the data series, then right-click and choose Add Trendline. Example prediction from a fitted trendline: Ŷ = 0.0854(6000) − 108.59 = 403.81.

9 Without Regression The best estimate for Y is the mean Ȳ, independent of the value of X. A measure of total variation is SST = Σ(Yi − Ȳ)²; without regression, all of this variation is unexplained.

10 With Regression Fitted line: Ŷ = b0 + b1·X. For an observed value Yi with fitted value Ŷi, the variation explained by regression is Ŷi − Ȳ, and the variation left unexplained after regression is Yi − Ŷi.

11 Sums of Squares SST = Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σ(Yi − Ŷi)² = SSR + SSE, where SSR is the explained variation and SSE is the unexplained variation.

12 Coefficient of Determination R² = SSR/SST = (SST − SSE)/SST = 1 − SSE/SST = coefficient of determination: the proportion of variation explained by the independent variable (regression model). 0 ≤ R² ≤ 1. Adjusted R² incorporates the sample size and the number of explanatory variables (in multiple regression models); it is useful for comparing models with different numbers of explanatory variables.
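
The SST = SSR + SSE decomposition and R² can be verified numerically. In this sketch the data set and the coefficients b0 = 2.2, b1 = 0.6 (the least-squares fit of this invented data) are purely illustrative.

```python
# Partition total variation into explained and unexplained parts,
# then compute R^2 = 1 - SSE/SST.

X = [1, 2, 3, 4, 5]        # made-up sample data
Y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6          # least-squares fit for this data

y_bar = sum(Y) / len(Y)
fitted = [b0 + b1 * xi for xi in X]

sst = sum((yi - y_bar) ** 2 for yi in Y)                 # total
ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
sse = sum((yi - fi) ** 2 for yi, fi in zip(Y, fitted))   # unexplained

r_squared = 1 - sse / sst
print(round(sst, 6), round(ssr, 6), round(sse, 6), round(r_squared, 6))
```

Here SST = 6, SSR = 3.6, and SSE = 2.4, so the model explains 60% of the variation (R² = 0.6).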

13 Correlation Coefficient Sample correlation coefficient R = ±√R², taking the sign of b1. Properties: −1 ≤ R ≤ 1; R = 1 ⇒ perfect positive correlation; R = −1 ⇒ perfect negative correlation; R = 0 ⇒ no correlation.

14 Standard Error of the Estimate MSE = SSE/(n − 2) is an unbiased estimate of the variance of the errors about the regression line. The standard error of the estimate, S_YX = √MSE, measures the spread of the data about the line.
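
A minimal sketch of S_YX = √(SSE/(n − 2)), again on an invented 5-point data set whose least-squares fit is b0 = 2.2, b1 = 0.6.

```python
import math

# Standard error of the estimate: the residuals come from the
# illustrative fit Yhat = 2.2 + 0.6*X on made-up data.

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(X, Y)]
sse = sum(e ** 2 for e in residuals)
n = len(X)
mse = sse / (n - 2)      # unbiased estimate of the error variance
s_yx = math.sqrt(mse)    # standard error of the estimate
print(round(mse, 4), round(s_yx, 4))
```

With SSE = 2.4 and n = 5, MSE = 0.8 and S_YX ≈ 0.8944.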

15 Confidence Bands Analogous to confidence intervals, but their width depends on the specific value of the independent variable: Ŷ ± t(α/2, n−2) · S_YX · √h_i, where h_i is the leverage of observation i.

16 Regression as ANOVA Testing for significance of regression: H0: β1 = 0 versus H1: β1 ≠ 0. Since SST = SSR + SSE, the null hypothesis implies that SST = SSE, i.e., SSR = 0. MSR = SSR/1 is the variance explained by regression, and F = MSR/MSE. If F exceeds the critical value, it is likely that β1 ≠ 0, i.e., the regression is significant.

17 t-test for Significance of Regression t = (b1 − β1)/S_b1 with n − 2 degrees of freedom, where S_b1 = S_YX / √Σ(Xi − X̄)² is the standard error of the slope. This allows you to test H0: slope = β1 versus H1: slope ≠ β1.
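
The F statistic and the t statistic for H0: β1 = 0 can be computed together; in simple linear regression they are linked by t² = F. The data set below is invented for illustration.

```python
import math

# F-test and t-test for significance of the slope on made-up data.

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

sxx = sum((xi - x_bar) ** 2 for xi in X)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y)) / sxx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in X]
sse = sum((yi - fi) ** 2 for yi, fi in zip(Y, fitted))
ssr = sum((fi - y_bar) ** 2 for fi in fitted)

mse = sse / (n - 2)
msr = ssr / 1                 # one explanatory variable
f_stat = msr / mse            # F = MSR/MSE
s_b1 = math.sqrt(mse / sxx)   # standard error of the slope
t_stat = (b1 - 0) / s_b1      # test against H0: beta1 = 0
print(round(f_stat, 4), round(t_stat, 4))
```

For this data F = 4.5 and t ≈ 2.1213, and indeed 2.1213² ≈ 4.5; whether that is significant depends on the chosen α and the critical value with 3 degrees of freedom.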

18 Excel Regression Tool Excel menu > Tools > Data Analysis > Regression Input variable ranges Check appropriate boxes Select output options

19 Regression Output Correlation coefficient, S_YX, b0, b1, p-value for significance of regression, t-test for the slope, and confidence interval for the slope.

20 Residuals Standardized residuals are residuals divided by their standard error, expressed in units independent of the units of the data.

21 Assumptions Underlying Regression Linearity Check with scatter diagram of the data or the residual plot Normally distributed errors for each X with mean 0 and constant variance Examine histogram of standardized residuals or use goodness-of-fit tests Homoscedasticity – constant variance about the regression line for all values of the independent variable Examine by plotting residuals and looking for differences in variances at different values of X No autocorrelation. Residuals should be independent for each value of the independent variable. Especially important if the independent variable is time (forecasting models).

22 Residual Plot

23 Histogram of Residuals

24 Evaluating Homoscedasticity Plot the residuals against X: a band of roughly constant width is acceptable (OK); a spread that widens or narrows across X indicates heteroscedastic errors.

25 Autocorrelation Durbin-Watson statistic: D < 1 suggests autocorrelation; 1.5 < D < 2.5 suggests no autocorrelation; D > 2.5 suggests negative autocorrelation. The PHStat tool calculates this statistic.
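
The Durbin-Watson statistic is D = Σ(ei − ei−1)² / Σei², summing the first term from the second residual onward. A sketch on an invented residual series:

```python
# Durbin-Watson statistic for a sequence of residuals.
# Values near 2 indicate no autocorrelation, per the slide's
# rules of thumb; the residuals below are made up.

def durbin_watson(residuals):
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

e = [-0.8, 0.6, 1.0, -0.6, -0.2]
print(round(durbin_watson(e), 4))
```

Here D = 4.84/2.4 ≈ 2.0167, which falls in the "no autocorrelation" range.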

26 Regression and Investment Risk Systematic risk – variation in stock price explained by the market Measured by beta Beta = 1: perfect match to market movements Beta < 1: stock is less volatile than market Beta > 1: stock is more volatile than market

27 Systematic Risk (Beta) Beta is the slope of the regression line of the stock's returns against the market's returns.
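
As a sketch, beta can be computed as the least-squares slope of stock returns on market returns, which equals Cov(stock, market)/Var(market). The return series below are invented; the stock is constructed to move exactly 1.5 times the market, so its beta is 1.5.

```python
# Beta as the regression slope of stock returns on market returns.

def beta(stock_returns, market_returns):
    n = len(market_returns)
    m_bar = sum(market_returns) / n
    s_bar = sum(stock_returns) / n
    cov = sum((m - m_bar) * (s - s_bar)
              for m, s in zip(market_returns, stock_returns))
    var = sum((m - m_bar) ** 2 for m in market_returns)
    return cov / var

market = [0.01, 0.02, -0.01, 0.03]      # made-up market returns
stock = [0.015, 0.03, -0.015, 0.045]    # exactly 1.5x the market
print(round(beta(stock, market), 4))    # 1.5
```

A beta above 1, as here, marks the stock as more volatile than the market.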

