BUSI 6220: Brief Overview of Linear Regression Models (Pre-MBA level)
By Dr. Nick Evangelopoulos, © 2012


Slide 2: Simple Linear Regression
Model with one predictor variable X:
Y = b0 + b1X + residual
where b0 + b1X is the explained part of Y and the residual is the unexplained part (error).
The fitted line is a straight line, since the model is of 1st order: Ŷ = b0 + b1X.
[Scatter plot of Y versus X with the fitted straight line]
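The fitted line on the slide can be computed with the closed-form least-squares formulas. A minimal sketch in Python/NumPy, using made-up illustrative data (the X and Y values are assumptions, not from the slides):

```python
import numpy as np

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])

# Closed-form least-squares estimates for Y = b0 + b1*X + residual
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

Y_hat = b0 + b1 * X    # explained part of Y (the fitted line)
residuals = Y - Y_hat  # unexplained part (error)
```

With an intercept in the model, the residuals always sum to zero, which is one quick sanity check on the fit.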

Slide 3: Quadratic Regression
Quadratic regression model: Y = b0 + b1X + b2X²
[Scatter plot of Y versus X with a fitted quadratic curve]

Slide 4: Polynomial Regression
3rd-order regression model: Y = b0 + b1X + b2X² + b3X³
[Scatter plot of Y versus X with a fitted cubic curve]
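A 3rd-order model like the one above can be fitted by treating X, X², and X³ as separate predictors; `np.polyfit` does this in one call. A sketch with simulated data (the true coefficients and noise level are assumptions for the demo):

```python
import numpy as np

# Simulated data from an assumed cubic trend plus small noise.
rng = np.random.default_rng(0)
X = np.linspace(-2, 2, 40)
Y = 1.0 + 0.5 * X - 2.0 * X**2 + 0.8 * X**3 + rng.normal(0, 0.1, X.size)

# Fit Y = b0 + b1*X + b2*X^2 + b3*X^3.
# np.polyfit returns coefficients from the highest degree down.
b3, b2, b1, b0 = np.polyfit(X, Y, deg=3)
```

With low noise, the fitted coefficients land close to the true values used to generate the data.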

Slide 5: Indicator (Dummy) Variables
Dummy, or indicator, variables allow qualitative variables to be included in the model. For example:
I1 = 1 if female, 0 if male

Slide 6: Indicator (Dummy) Variables
Model with an indicator variable: Y = b0 + b1X + b2I
Rewrite the model as:
For I = 0: Y = b0 + b1X
For I = 1: Y = (b0 + b2) + b1X
[Plot of Y versus X showing the two fitted lines]

Slide 7: Indicator Variables with Interaction
Model with an indicator variable and an interaction term: Y = b0 + b1X + b2I + b3XI
Rewrite the model as:
For I = 0: Y = b0 + b1X
For I = 1: Y = (b0 + b2) + (b1 + b3)X
[Plot of Y versus X showing the two fitted lines with different slopes]
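The dummy-plus-interaction model on slides 6 and 7 can be fitted as ordinary least squares with a design matrix whose columns are 1, X, I, and X·I. A sketch with fabricated data built exactly from the two group equations (all numbers are assumptions for the demo):

```python
import numpy as np

# Hypothetical data: group I=0 follows Y = 10 + 2X exactly;
# group I=1 follows Y = (10+3) + (2+1)X exactly.
X = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5], dtype=float)
I = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
Y = np.where(I == 0, 10 + 2 * X, 13 + 3 * X)

# Design matrix for Y = b0 + b1*X + b2*I + b3*X*I
A = np.column_stack([np.ones_like(X), X, I, X * I])
b0, b1, b2, b3 = np.linalg.lstsq(A, Y, rcond=None)[0]
```

Because the data were generated without noise, the fit recovers the intercept shift b2 and the slope shift b3 exactly, matching the "For I = 1" rewrite above.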

Slide 8: Hypothesis Test on the Slope of the Regression Line
H0: β1 = 0 (X provides no information)
Ha: β1 ≠ 0 (X does provide information)
Two-tailed test. Test statistic: t = b1 / s_b1
For large data sets, reject H0 if |t| > 2.
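The slide's test statistic t = b1 / s_b1 can be computed by hand: the standard error of the slope is s_b1 = s / √Sxx, where s² = SSE / (n − 2). A sketch with made-up data (the values are assumptions for illustration):

```python
import numpy as np

# Hypothetical data, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.9])

n = X.size
Sxx = np.sum((X - X.mean()) ** 2)
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * X.mean()
residuals = Y - (b0 + b1 * X)

# Standard error of the slope: s_b1 = s / sqrt(Sxx), with s^2 = SSE / (n - 2)
s = np.sqrt(np.sum(residuals ** 2) / (n - 2))
s_b1 = s / np.sqrt(Sxx)
t = b1 / s_b1

# Rough large-sample rule from the slide: reject H0: beta1 = 0 if |t| > 2
reject_H0 = abs(t) > 2
```

For a formal test one would compare |t| against the t-distribution with n − 2 degrees of freedom; the |t| > 2 cutoff is the slide's large-sample shortcut.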

Slide 9: Model Assumptions and Residual Analysis
Residuals should have: randomness, constant variance, a normal distribution.
[Residual plot: Y − Ŷ versus Ŷ]

Slide 10: Residual Analysis
Violation of the constant variance assumption. How to fix it: transformation.
[Residual plot: Y − Ŷ versus Ŷ]

Slide 11: Residual Analysis
Violation of the randomness assumption. How to fix it: add more predictor variables to explain the patterns. In time series data, add lags of Y or X as predictors: Y_{t-1}, X1_{t-1}, X1_{t-2}, X2_{t-1}, etc.
[Sequence plot of the residuals Y − Ŷ]
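Building the lagged predictors the slide mentions amounts to shifting the series and dropping the first few observations so every row has its lags available. A minimal sketch with assumed series y and x1 (the data and the choice of two lags are illustrative):

```python
import numpy as np

# Assumed short time series, for illustration only.
y = np.array([10.0, 11.0, 12.5, 13.0, 14.2, 15.1, 16.0, 17.3])
x1 = np.array([1.0, 1.1, 1.3, 1.2, 1.5, 1.6, 1.7, 1.9])

max_lag = 2
# Drop the first max_lag observations so every row has its lags available.
Y_t = y[max_lag:]
design = np.column_stack([
    np.ones(y.size - max_lag),  # intercept
    y[max_lag - 1:-1],          # Y_{t-1}
    x1[max_lag - 1:-1],         # X1_{t-1}
    x1[:-max_lag],              # X1_{t-2}
])
coefs = np.linalg.lstsq(design, Y_t, rcond=None)[0]
```

After the fit, a sequence plot of the new residuals should look patternless if the lags captured the serial structure.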

Slide 12: Residual Analysis
Violation of the normality assumption. How to fix it: transformation (start with easy transformations, such as Log(Y), then continue with bucket transformations, etc.).
[Frequency histogram of the residuals]
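The Log(Y) transformation the slide suggests turns multiplicative relationships into linear ones, which often makes skewed residuals more symmetric. A tiny sketch with an assumed exactly-exponential Y (fabricated for the demo), showing that fitting on the log scale recovers a straight line:

```python
import numpy as np

# Assumed data with multiplicative (exponential) growth.
Y = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
X = np.arange(Y.size, dtype=float)

# Fit on the log scale: log(Y) = b0 + b1*X is a linear relationship.
b1, b0 = np.polyfit(X, np.log(Y), deg=1)
```

Here log(Y) doubles into a perfectly linear sequence, so the log-scale fit has slope ln 2 and intercept 0; with real data the residuals on the log scale would simply be less skewed, not zero.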

Slide 13: Stepwise Procedures
These procedures choose or eliminate variables one at a time, in an effort to avoid including variables that either have no predictive ability or are highly correlated with other predictor variables.
- Forward selection: add one variable at a time until the next contribution is insignificant.
- Backward elimination: remove one variable at a time, starting with the "worst", until R² drops significantly.
- Stepwise selection: forward selection with the ability to remove variables that become insignificant.

Slide 14: Stepwise Regression
An example run of the stepwise procedure:
1. Include X3
2. Include X6
3. Include X2
4. Include X5
5. Remove X2 (when X5 entered the model, X2 became unnecessary)
6. Include X7
7. Remove X7 (it is insignificant)
8. Stop. The final model includes X3, X5, and X6.
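The forward half of the procedure can be sketched as a greedy loop: at each step, add the candidate that most reduces the residual sum of squares, and stop when the improvement is too small. This sketch uses a simple relative-improvement cutoff as a stand-in for the formal significance test, and simulated data in which (as in the slide's example) only X3, X5, and X6 matter; the data, names, and cutoff are all assumptions:

```python
import numpy as np

def sse(cols, y):
    # Residual sum of squares of an OLS fit (with intercept) on the given columns.
    A = np.column_stack([np.ones(y.size)] + cols)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def forward_select(candidates, y, min_improvement=0.01):
    # Greedy forward selection: repeatedly add the candidate that most reduces
    # SSE, stopping when the relative improvement drops below the cutoff.
    chosen = []                      # (name, column) pairs already in the model
    remaining = dict(candidates)
    best_sse = sse([], y)
    while remaining:
        trials = {name: sse([c for _, c in chosen] + [col], y)
                  for name, col in remaining.items()}
        name = min(trials, key=trials.get)
        if best_sse - trials[name] < min_improvement * best_sse:
            break                    # next contribution is too small: stop
        chosen.append((name, remaining.pop(name)))
        best_sse = trials[name]
    return [name for name, _ in chosen]

# Simulated candidates X1..X7; only X3, X5, X6 actually drive y.
rng = np.random.default_rng(1)
n = 200
X = {f"X{i}": rng.normal(size=n) for i in range(1, 8)}
y = 2 * X["X3"] - 1.5 * X["X5"] + X["X6"] + rng.normal(0, 0.5, n)
selected = forward_select(X, y)
```

A full stepwise procedure would also re-test the included variables after each addition and drop any that became insignificant, as in steps 5 and 7 of the slide's example.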

Slide 15: Interpretation of Regression Coefficients: Linear Regression
When Y is continuous (linear regression), the standard interpretation of a slope coefficient is as follows: the slope is the change in Y when a particular input X increases by one unit, while all other inputs are held constant.
Example: a coefficient of −2.5 indicates that the target variable decreases as the predictor increases (since the coefficient is negative). Specifically, for each one-unit increase in the predictor, the target variable decreases by 2.5 units.

Slide 16: Interpretation of Regression Coefficients: Logistic Regression
When Y is binary (logistic regression), the standard interpretation of an odds ratio coefficient is as follows: the odds ratio is a multiplier on the odds of the target event T, odds = P(T) / P(non-T), when a particular input X increases by one unit, while all other inputs are held constant.
Example: an odds ratio of 1.05 indicates that the probability of the target event increases as the predictor increases. Specifically, for each one-unit increase in the predictor, the odds P(event) / P(non-event) are multiplied by 1.05.
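The slide's odds-ratio arithmetic can be traced step by step. A sketch assuming an odds ratio of 1.05 (as in the example) and an assumed baseline event probability of 0.20:

```python
import math

# Assumed fitted log-odds coefficient whose odds ratio is 1.05.
beta = math.log(1.05)
odds_ratio = math.exp(beta)      # odds multiplier per one-unit increase in X

p = 0.20                         # assumed baseline probability of the event
odds = p / (1 - p)               # baseline odds: 0.20 / 0.80 = 0.25
odds_after = odds * odds_ratio   # odds after a one-unit increase in X
p_after = odds_after / (1 + odds_after)  # convert odds back to a probability
```

Note that multiplying the odds by 1.05 raises the probability by less than 5 percentage points; the odds ratio is a multiplier on the odds, not on the probability itself.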

