1
Chapter 5: Regression Analysis Part 1: Simple Linear Regression
2
Regression Analysis
Building models that characterize the relationship between a dependent variable and one (simple regression) or more (multiple regression) independent variables, all of which are numerical. For example:
Sales = a + b*Price + c*Coupons + d*Advertising + e*Price*Advertising
Applies to cross-sectional data and to time series data (forecasting).
3
Simple Linear Regression Single independent variable Linear relationship
4
SLR Model
Y = β₀ + β₁X + ε, where β₀ is the intercept, β₁ is the slope, and ε is the error term.
E(Y|X): conditional mean of Y given X; f(Y|X): conditional distribution of Y given X.
5
Error Terms (Residuals)
εᵢ = Yᵢ − β₀ − β₁Xᵢ
6
Estimation of the Regression Line
True regression line (unknown): Y = β₀ + β₁X + ε
Estimated regression line: Ŷ = b₀ + b₁X
Observable errors: eᵢ = Yᵢ − b₀ − b₁Xᵢ
7
Least Squares Regression
Choose b₀ and b₁ to minimize the sum of squared errors Σ(Yᵢ − b₀ − b₁Xᵢ)².
Solution: b₁ = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)² and b₀ = Ȳ − b₁X̄
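A minimal pure-Python sketch of the least squares formulas on this slide; the X and Y values are a made-up toy dataset, not from the slides:

```python
def least_squares(x, y):
    """Return (b0, b1) minimizing the sum of squared errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # b1 = sum((Xi - X_bar)(Yi - Y_bar)) / sum((Xi - X_bar)^2)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar   # b0 = Y_bar - b1 * X_bar
    return b0, b1

# Toy dataset used for illustration
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
b0, b1 = least_squares(X, Y)   # b0 ≈ 2.2, b1 ≈ 0.6
```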
8
Excel Trendlines
Construct a scatter diagram.
Method 1: select Chart > Add Trendline.
Method 2: select the data series and right-click.
Example prediction from a fitted trendline: Ŷ = 0.0854(6000) − 108.59 = 403.81
9
Without Regression
The best estimate for Y is the mean Ȳ, independent of the value of X.
A measure of total variation is SST = Σ(Yᵢ − Ȳ)² (unexplained variation).
10
With Regression
Observed values Yᵢ and fitted values Ŷᵢ on the fitted line Ŷ = b₀ + b₁X.
Variation unexplained after regression: Y − Ŷ. Variation explained by regression: Ŷ − Ȳ.
11
Sums of Squares
SST = Σ(Yᵢ − Ȳ)² = Σ(Ŷᵢ − Ȳ)² + Σ(Yᵢ − Ŷᵢ)² = SSR + SSE
SSR is the explained variation; SSE is the unexplained variation.
12
Coefficient of Determination
R² = SSR/SST = (SST − SSE)/SST = 1 − SSE/SST, the coefficient of determination: the proportion of variation explained by the independent variable (regression model). 0 ≤ R² ≤ 1.
Adjusted R² incorporates sample size and the number of explanatory variables (in multiple regression models). Useful for comparing models with different numbers of explanatory variables.
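The sums-of-squares decomposition and R² can be sketched in pure Python. The data and the coefficients b₀ = 2.2, b₁ = 0.6 are an illustrative toy example (the coefficients are the least squares estimates for this data, so SST = SSR + SSE holds):

```python
def r_squared(x, y, b0, b1):
    """R^2 = SSR/SST = 1 - SSE/SST for the fitted line Y_hat = b0 + b1*X."""
    n = len(y)
    y_bar = sum(y) / n
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained by regression
    assert abs(sst - (ssr + sse)) < 1e-9                   # SST = SSR + SSE
    return ssr / sst

# Toy data; b0 = 2.2 and b1 = 0.6 are its least squares estimates
r2 = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], 2.2, 0.6)
```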
13
Correlation Coefficient
Sample correlation coefficient: R = ±√R², taking the sign of the slope b₁.
Properties: −1 ≤ R ≤ 1
R = 1 ⇒ perfect positive correlation; R = −1 ⇒ perfect negative correlation; R = 0 ⇒ no correlation.
14
Standard Error of the Estimate
MSE = SSE/(n − 2), an unbiased estimate of the variance of the errors about the regression line.
The standard error of the estimate, S_YX = √MSE = √(SSE/(n − 2)), measures the spread of the data about the line.
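The standard error of the estimate can be computed directly from the residuals; same illustrative toy data and fitted coefficients as before:

```python
import math

def standard_error_of_estimate(x, y, b0, b1):
    """S_YX = sqrt(MSE) = sqrt(SSE / (n - 2))."""
    n = len(y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)    # unbiased estimate of the error variance
    return math.sqrt(mse)

s_yx = standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], 2.2, 0.6)
```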
15
Confidence Bands
Analogous to confidence intervals, but the width depends on the specific value of the independent variable:
Ŷ ± t(α/2, n−2) · S_YX · √hᵢ, where hᵢ grows with the distance of Xᵢ from X̄.
16
Regression as ANOVA
Testing for significance of regression: H₀: β₁ = 0 vs. H₁: β₁ ≠ 0
SST = SSR + SSE; the null hypothesis implies that SST = SSE, i.e., SSR = 0.
MSR = SSR/1 = variance explained by regression
F = MSR/MSE
If F > the critical value, it is likely that β₁ ≠ 0, i.e., the regression line is significant.
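The F statistic for significance of regression can be sketched as follows, reusing the toy data and its least squares coefficients from earlier (illustrative only):

```python
def f_statistic(x, y, b0, b1):
    """F = MSR/MSE for H0: beta1 = 0, with 1 and n-2 degrees of freedom."""
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    msr = (sst - sse) / 1   # MSR = SSR / 1
    mse = sse / (n - 2)
    return msr / mse

f = f_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], 2.2, 0.6)
```

The computed F would then be compared against the F critical value with (1, n−2) degrees of freedom.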
17
t-test for Significance of Regression
t = (b₁ − β₁)/s_b₁ with n − 2 degrees of freedom, where s_b₁ is the standard error of the slope.
This allows you to test hypotheses about specific slope values, e.g., H₀: β₁ = 1 vs. H₁: β₁ ≠ 1.
18
Excel Regression Tool Excel menu > Tools > Data Analysis > Regression Input variable ranges Check appropriate boxes Select output options
19
Regression Output
Correlation coefficient, S_YX, b₀, b₁
p-value for significance of regression
t-test for slope
Confidence interval for slope
20
Residuals
Standard (standardized) residuals are residuals divided by their standard error, expressed in units independent of the units of the data.
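A simplified sketch of this scaling, dividing each residual by the standard error of the estimate (a full studentized residual would also adjust for each point's leverage); toy data and coefficients as in the earlier examples:

```python
import math

def standard_residuals(x, y, b0, b1):
    """Residuals scaled by the standard error of the estimate (unit-free)."""
    n = len(y)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s_yx = math.sqrt(sum(e * e for e in resid) / (n - 2))  # S_YX
    return [e / s_yx for e in resid]

sr = standard_residuals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], 2.2, 0.6)
```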
21
Assumptions Underlying Regression
Linearity: check with a scatter diagram of the data or with the residual plot.
Normally distributed errors for each X, with mean 0 and constant variance: examine a histogram of the standardized residuals or use goodness-of-fit tests.
Homoscedasticity (constant variance about the regression line for all values of the independent variable): examine by plotting residuals and looking for differences in variance at different values of X.
No autocorrelation: residuals should be independent for each value of the independent variable. Especially important if the independent variable is time (forecasting models).
22
Residual Plot
23
Histogram of Residuals
24
Evaluating Homoscedasticity
[Two residual plots: one with roughly constant spread (OK), one heteroscedastic.]
25
Autocorrelation
The Durbin–Watson statistic D tests for autocorrelation in the residuals:
D < 1 suggests positive autocorrelation
1.5 < D < 2.5 suggests no autocorrelation
D > 2.5 suggests negative autocorrelation
The PHStat tool calculates this statistic.
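The Durbin–Watson statistic is straightforward to compute by hand; the residuals below are from the illustrative toy fit used in the earlier examples (not actual time series data):

```python
def durbin_watson(residuals):
    """D = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2; values near 2 suggest no autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Residuals from the toy least squares fit used earlier
d = durbin_watson([-0.8, 0.6, 1.0, -0.6, -0.2])
```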
26
Regression and Investment Risk Systematic risk – variation in stock price explained by the market Measured by beta Beta = 1: perfect match to market movements Beta < 1: stock is less volatile than market Beta > 1: stock is more volatile than market
27
Systematic Risk (Beta) Beta is the slope of the regression line
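Since beta is just the least squares slope of stock returns regressed on market returns, it can be sketched with the same formula as before. The return series below are hypothetical, constructed so the stock moves 1.5x the market:

```python
def beta(stock_returns, market_returns):
    """Slope of the regression of stock returns on market returns."""
    n = len(market_returns)
    m_bar = sum(market_returns) / n
    s_bar = sum(stock_returns) / n
    cov = sum((m - m_bar) * (s - s_bar)
              for m, s in zip(market_returns, stock_returns))
    var = sum((m - m_bar) ** 2 for m in market_returns)
    return cov / var

# Hypothetical returns: the stock moves 1.5x the market, so beta > 1 (more volatile)
market = [0.01, -0.02, 0.03, 0.005]
stock = [1.5 * r for r in market]
b = beta(stock, market)
```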