© Stevenson, McGraw Hill, 2007- Assoc. Prof. Sami Fethi, EMU, All Right Reserved. Regression Analysis; Chapter4 MGMT 405, POM, 2010/11. Lec Notes Chapter.


1 Chapter 4: Regression Analysis. Department of Business Administration, Fall 2010-2011. Too complicated by hand!

2 Outline: What You Will Learn. Purpose of regression analysis; simple linear regression model; overall significance (F-test); individual significance (t-test); coefficient of determination and correlation coefficient; confidence interval; multiple regression model; comparison of simple linear regression analysis and multiple regression analysis.

3 Purpose of Regression Analysis. Regression analysis is used primarily to model causality and provide prediction: to predict the values of a dependent (response) variable based on values of at least one independent (explanatory) variable, and to explain the effect of the independent variables on the dependent variable. The relationship between X and Y can be shown on a scatter diagram.

4 Scatter Diagram. A scatter diagram is a two-dimensional graph of plotted points in which the vertical axis represents values of the dependent variable and the horizontal axis represents values of the independent (explanatory) variable. The pattern of the plotted points shows graphically how the variables are related. A scatter diagram is mostly used to support or cast doubt on a suspected cause-and-effect relationship. The following example shows the relationship between a firm's advertising expenditure and its sales revenue.

5 Scatter Diagram: Example

6 Scatter Diagram. The scatter diagram shows a positive relationship between the two variables, and the relationship is approximately linear. This gives a rough estimate of the linear relationship between the variables in the form of an equation such as Y = a + bX.

7 Regression Analysis. In the equation, a is the vertical intercept of the estimated linear relationship and gives the value of Y when X = 0, while b is the slope of the line and gives an estimate of the increase in Y resulting from each unit increase in X. The difficulty with fitting a line to the scatter diagram by eye is that different researchers would probably obtain different results, even if they use the same data points. The solution is to use regression analysis.

8 Regression Analysis. Regression analysis is a statistical technique for obtaining the line that best fits the data points, so that all researchers reach the same result. The regression line (line of best fit) minimizes the sum of the squared vertical deviations (e_t) of each point from the line. This method is called Ordinary Least Squares (OLS).

9 Regression Analysis. In the table, Y_1 refers to the actual (observed) sales revenue of $44 mn associated with the advertising expenditure of $10 mn in the first year for which data were collected. In the following graph, Ŷ_1 is the corresponding sales revenue of the firm estimated from the regression line for the advertising expenditure of $10 mn in the first year. The symbol e_1 is the corresponding vertical deviation, or error, of the actual sales revenue from the revenue estimated from the regression line in the first year. This can be expressed as e_1 = Y_1 - Ŷ_1.

10 Regression Analysis. In the graph, Ŷ_1 is the sales revenue of the firm estimated from the regression line for the advertising expenditure of $10 mn in the first year. The symbol e_1 is the corresponding vertical deviation, or error, of the actual sales revenue from the revenue estimated from the regression line in the first year. This can be expressed as e_1 = Y_1 - Ŷ_1.

11 Regression Analysis. Since there are 10 observations, there are 10 vertical deviations, or errors (e_1 to e_10). The regression line obtained is the line that best fits the data points in the sense that the sum of the squared vertical deviations from the line is a minimum: each of the 10 e values is first squared, and the squares are then summed.

12 Simple Regression Analysis. We are now in a position to: calculate the value of a (the vertical intercept) and the value of b (the slope coefficient) of the regression line; conduct tests of significance of the parameter estimates; construct a confidence interval for the true parameter; and test the overall explanatory power of the regression.

13 Simple Linear Regression Model. The regression line is a straight line that describes the dependence of the average value of one variable on the other. Labeled components of the model equation: dependent (response) variable, Y intercept, slope coefficient, independent (explanatory) variable, random error, regression line.

14 Ordinary Least Squares (OLS). Model: Y_t = a + b X_t + e_t

15 Ordinary Least Squares (OLS). Objective: determine the slope and intercept that minimize the sum of the squared errors.

16 Ordinary Least Squares (OLS): Estimation Procedure
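
As a sketch of the estimation procedure, assuming the standard textbook least-squares solution for a line with intercept a and slope b: the hats and bars below are notation chosen here, not taken from the slide. The first line is the objective; the second gives the slope and intercept that minimize it.

```latex
\min_{\hat a,\,\hat b}\ \sum_{t=1}^{n} e_t^{2}
  = \sum_{t=1}^{n} \bigl(Y_t - \hat a - \hat b X_t\bigr)^{2}

\hat b = \frac{\sum_{t=1}^{n} (X_t - \bar X)(Y_t - \bar Y)}
              {\sum_{t=1}^{n} (X_t - \bar X)^{2}},
\qquad
\hat a = \bar Y - \hat b\,\bar X
```

Here \bar X and \bar Y are the sample means of X and Y, and n is the number of observations (10 in the advertising example).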

17 Ordinary Least Squares (OLS): Estimation Example

18 Ordinary Least Squares (OLS): Estimation Example

19 The Equation of the Regression Line. The equation of the regression line can be written as Ŷ_t = 7.60 + 3.53 X_t. When X = 0 (zero advertising expenditure), the expected sales revenue of the firm is $7.60 mn. In the first year, when X = $10 mn, Ŷ_1 = $42.90 mn. Strictly speaking, the regression line should be used only to estimate the sales revenue resulting from advertising expenditures that lie within the range of the observed data.

20 Tests of Significance: Standard Error. To test the hypothesis that b is statistically significant (i.e., that advertising positively affects sales), we first need to calculate the standard error (standard deviation) of b^. The standard error is calculated with the following expression:

21 Tests of Significance: Standard Error of the Slope Estimate
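
A sketch of the usual textbook expression for the standard error of the slope, assuming the slide uses the standard form; the symbols are chosen here to match the surrounding notes (n observations, k estimated parameters, so k = 2 for a line with an intercept and one slope).

```latex
s_{\hat b} = \sqrt{\frac{\sum_{t=1}^{n} (Y_t - \hat Y_t)^{2}}
                        {(n - k)\,\sum_{t=1}^{n} (X_t - \bar X)^{2}}}
```

The numerator is the sum of squared residuals e_t, and n - k is the degrees of freedom used in the t test.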

22 Tests of Significance: Example Calculation. Ŷ_1 = 7.60 + 3.53 X_1 = 7.60 + 3.53(10) = 42.90
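
The same calculation can be sketched in a few lines of Python, together with the first-year residual e_1 = Y_1 - Ŷ_1. The function name `predict_sales` is made up for this illustration; the coefficients and the observed $44 mn come from the notes.

```python
def predict_sales(advertising_mn, intercept=7.60, slope=3.53):
    """Estimated sales revenue ($ mn) from the fitted regression line."""
    return intercept + slope * advertising_mn


observed_sales_year_1 = 44.0           # actual sales in the first year ($ mn)
y_hat_1 = predict_sales(10)            # advertising of $10 mn in the first year
e_1 = observed_sales_year_1 - y_hat_1  # residual: actual minus estimated

print(round(y_hat_1, 2), round(e_1, 2))  # 42.9 1.1
```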

23 Tests of Significance: Example Calculation

24 Tests of Significance: Calculation of the t Statistic. Degrees of freedom = n - k = 10 - 2 = 8. Critical value (tabulated) at the 5% level = 2.306.
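
A minimal sketch of the t test, assuming the usual definition of the t statistic as the estimate divided by its standard error. The slope 3.53 and the standard error 0.52 are the values used elsewhere in these notes; the variable names are illustrative.

```python
b_hat = 3.53        # estimated slope of the regression line
s_b = 0.52          # standard error of the slope estimate
t_critical = 2.306  # tabulated two-tailed 5% value for 8 degrees of freedom

t_stat = b_hat / s_b
is_significant = abs(t_stat) > t_critical

print(round(t_stat, 2), is_significant)  # 6.79 True
```

Because the calculated t exceeds the critical value, the slope is statistically significant at the 5% level.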

25 Confidence Interval. Having accepted the alternative hypothesis that there is a relationship between X and Y, we can also construct a confidence interval for the true parameter from the estimated coefficient. Using the tabulated value t = 2.306 for the 5% level and 8 df in our example, the true value of b lies between 2.33 and 4.73 with 95% confidence: b^ +/- 2.306 (sb^) = 3.53 +/- 2.306 (0.52).
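
The interval bounds can be checked with a short sketch. The helper name `confidence_interval` is hypothetical; the inputs are the slope, standard error, and tabulated t value from this slide.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Return (lower, upper) bounds: estimate +/- t_critical * std_error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin


lower, upper = confidence_interval(3.53, 0.52, 2.306)
print(round(lower, 2), round(upper, 2))  # 2.33 4.73
```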

26 Tests of Significance: Decomposition of Sum of Squares. Total variation = explained variation + unexplained variation.

27 Tests of Significance: Decomposition of Sum of Squares

28 Coefficient of Determination (R^2). The coefficient of determination is defined as the proportion of the total variation (dispersion) in the dependent variable that is explained by the variation in the explanatory variables in the regression. In our example, it measures how much of the variation in the firm's sales is explained by the variation in its advertising expenditures.
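
The advertising data table is not reproduced in this transcript, so the sketch below illustrates the decomposition and R^2 with the seven-store data from the produce-store example later in these notes instead. It assumes the standard definitions (total, explained, and unexplained sums of squares around the mean of Y); all variable names are illustrative.

```python
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]  # in $1000

n = len(square_feet)
mean_x = sum(square_feet) / n
mean_y = sum(annual_sales) / n

# Fit the least-squares line first.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(square_feet, annual_sales))
s_xx = sum((x - mean_x) ** 2 for x in square_feet)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in square_feet]

# Total variation = explained variation + unexplained variation.
total_ss = sum((y - mean_y) ** 2 for y in annual_sales)
explained_ss = sum((f - mean_y) ** 2 for f in fitted)
unexplained_ss = sum((y - f) ** 2 for y, f in zip(annual_sales, fitted))

r_squared = explained_ss / total_ss
print(round(r_squared, 3))
```

For these data the ratio comes out at roughly 0.94, meaning that about 94% of the variation in store sales is explained by store size.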

29 Tests of Significance: Coefficient of Determination

30 Coefficient of Correlation (r). The coefficient of correlation is the square root of the coefficient of determination. It is simply a measure of the degree of linear association, or covariation, that exists between variables X and Y. In our example, r is about 0.92, which indicates a strong positive linear association between X and Y. The sign of r is always the same as the sign of b^.

31 Tests of Significance: Coefficient of Correlation

32 Simple Linear Regression: Example. You wish to examine the linear dependency of the annual sales of produce stores on their size in square feet. Sample data for 7 stores were obtained. Find the equation of the straight line that best fits the data.

Store   Square Feet   Annual Sales ($1000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

33 Scatter Diagram: Example (Excel output)

34 Simple Linear Regression Equation: Example (from Excel printout)

35 Graph of the Simple Linear Regression Equation: Example. Ŷ_i = 1636.415 + 1.487 X_i
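
The Excel result can be reproduced approximately with a small least-squares routine. This is a sketch using the seven data points from the example; `fit_simple_ols` is a name made up here, and the formulas are the standard slope and intercept expressions rather than anything specific to Excel.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]  # in $1000

intercept, slope = fit_simple_ols(square_feet, annual_sales)
print(round(intercept, 1), round(slope, 3))
```

The printed values should agree with the intercept of 1636.415 and slope of 1.487 reported on the slide, up to rounding.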

36 Interpretation of Results: Example. The slope of 1.487 means that for each one-unit increase in X, the average of Y is predicted to increase by an estimated 1.487 units. Because sales are measured in thousands of dollars, the equation estimates that each additional square foot of store size increases expected annual sales by $1,487.

37 Crucial Assumptions. The error term is normally distributed. The error term has zero expected value (mean). The error term has constant variance in each time period and for all values of X. The error term's value in one time period is unrelated to its value in any other period.

38 Multiple Regression Analysis: Model

39 Multiple Regression Analysis. The relationship between one dependent variable and two or more independent variables is a linear function. Labeled components of the model equation: dependent (response) variable, Y intercept, slopes, independent (explanatory) variables, random error.

40 Multiple Regression Model: Example. Develop a model for estimating the heating oil used by a single-family home in the month of January, based on average temperature and the amount of insulation in inches.

41 Multiple Regression Model: Example (Excel output). For each one-degree increase in temperature, the estimated average amount of heating oil used decreases by 5.437 gallons, holding insulation constant. For each additional inch of insulation, the estimated average use of heating oil decreases by 20.012 gallons, holding temperature constant.
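
The "holding the other variable constant" reading can be sketched as a calculation of predicted changes. Only the two slope coefficients appear in this transcript (the intercept from the Excel output is not reproduced), so the hypothetical helper below returns the change in estimated oil use rather than a level.

```python
TEMP_COEF = -5.437         # gallons per one-degree increase in temperature
INSULATION_COEF = -20.012  # gallons per additional inch of insulation


def predicted_change_in_oil(delta_temp_degrees, delta_insulation_inches):
    """Change in estimated heating oil use (gallons) for the given changes."""
    return (TEMP_COEF * delta_temp_degrees
            + INSULATION_COEF * delta_insulation_inches)


# Ten degrees warmer, same insulation.
print(round(predicted_change_in_oil(10, 0), 2))  # -54.37
# Two more inches of insulation, same temperature.
print(round(predicted_change_in_oil(0, 2), 3))   # -40.024
```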

42 Multiple Regression Analysis: Adjusted Coefficient of Determination

43 Interpretation of the Coefficient of Multiple Determination. 96.56% of the total variation in heating oil use can be explained by temperature and amount of insulation. 95.99% of the total variation in heating oil use can be explained by temperature and amount of insulation after adjusting for the number of explanatory variables and the sample size.
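
The link between the two percentages can be sketched with the usual adjusted R^2 formula, 1 - (1 - R^2)(n - 1)/(n - k). The sample size n = 15 is not stated on this slide; it is inferred here from the 12 error degrees of freedom and k = 3 parameters given in the overall significance test later in these notes. The function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_params):
    """Adjust R^2 for sample size and number of estimated parameters."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_params)


# R^2 = 0.9656 from the slide; n = 15 and k = 3 are inferred (see lead-in).
print(round(adjusted_r_squared(0.9656, 15, 3), 4))  # 0.9599
```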

44 Testing for Overall Significance. Shows whether Y depends linearly on all of the X variables together as a group. Uses the F test statistic. Hypotheses: H0: β1 = β2 = ... = βk = 0 (no linear relationship); H1: at least one βi ≠ 0 (at least one independent variable affects Y). The null hypothesis is a very strong statement and is almost always rejected.

45 Multiple Regression Analysis: Analysis of Variance and F Statistic

46 Test for Overall Significance: Excel Output Example. Annotations on the output: k - 1 = 2, the number of explanatory variables; k = 3, the number of parameters; n - 1; p-value.

47 Test for Overall Significance: Example Solution. H0: β1 = β2 = ... = βk = 0. H1: at least one βj ≠ 0. α = 0.05; df = 2 and 12. Critical value: F = 3.89. Test statistic: F = 168.47. Decision: reject H0 at α = 0.05. Conclusion: there is evidence that at least one independent variable affects Y.
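
As a rough check, the F statistic can be recovered from R^2 using the standard identity F = [R^2/(k - 1)] / [(1 - R^2)/(n - k)]. This sketch assumes n = 15 (implied by df = 2 and 12 with k = 3) and uses the rounded R^2 of 0.9656 from the earlier slide, so the result differs slightly from the 168.47 in the Excel output. The function name is made up for this illustration.

```python
def f_statistic(r_squared, n_obs, n_params):
    """Overall-significance F statistic computed from R^2."""
    explained_per_df = r_squared / (n_params - 1)
    unexplained_per_df = (1 - r_squared) / (n_obs - n_params)
    return explained_per_df / unexplained_per_df


f_critical = 3.89  # tabulated 5% value for 2 and 12 degrees of freedom
f_value = f_statistic(0.9656, 15, 3)

print(round(f_value, 1), f_value > f_critical)  # 168.4 True
```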

48 t Test Statistic: Excel Output Example. The output highlights the t test statistic for X1 (temperature) and the t test statistic for X2 (insulation).

49 t Test: Example Solution. Does temperature have a significant effect on monthly consumption of heating oil? Test at α = 0.05. H0: β1 = 0. H1: β1 ≠ 0. df = 12. Critical values: -2.1788 and 2.1788 (0.025 in each tail). Test statistic: t = -16.1699. Decision: reject H0 at α = 0.05. Conclusion: there is evidence of a significant effect of temperature on oil consumption, holding constant the effect of insulation.

50 Problems in Regression Analysis. Multicollinearity: two or more explanatory variables are highly correlated. Heteroskedasticity: the variance of the error term is not constant across observations. Autocorrelation: consecutive error terms are correlated. Functional form: the model is misspecified, for example by the omission of a variable. Normality: whether or not the residuals are normally distributed.

51 Steps in Demand Estimation. Model specification: identify variables. Collect data. Specify functional form. Estimate function. Test the results.

52 Functional Form Specifications: linear function, power function, and estimation format.
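
A sketch of how a power function is commonly put into an estimation format: taking logarithms of Q = a * P^b gives ln Q = ln a + b ln P, which is linear in the parameters and can be fitted by OLS. The single-variable form, the helper `fit_line`, and the price and quantity data are assumptions made for this illustration; the data are generated from a = 100 and b = -2, not taken from the notes.

```python
import math


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


# Hypothetical data that follow Q = 100 * P^(-2) exactly.
prices = [1.0, 2.0, 4.0, 5.0, 10.0]
quantities = [100.0 * p ** -2 for p in prices]

# Estimation format: regress ln Q on ln P.
log_prices = [math.log(p) for p in prices]
log_quantities = [math.log(q) for q in quantities]
ln_a, b = fit_line(log_prices, log_quantities)
a = math.exp(ln_a)

print(round(a, 4), round(b, 4))
```

The fitted values recover a of about 100 and b of about -2. In the log form the slope b is the price elasticity of demand, which is one reason the power function is popular in demand estimation.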

53 Demand Estimation. The demand for a commodity arises from consumers' willingness and ability to purchase the commodity. Consumer demand theory postulates that the quantity demanded of a commodity is a function of, or depends on, the price of the commodity, the consumers' income, the prices of related commodities, and the tastes of the consumer: Q_dx = f(P, I, P_c, P_s, T), with expected signs (-, +, -, +, +). To use these important demand relationships in decision analysis, we need to estimate empirically the structural form and parameters of the demand function. This is demand estimation.

54 Demand Estimation. In general, we seek answers to the following questions: How much will the firm's revenue change after an increase in the price of the commodity? How much will the quantity demanded of the commodity increase if consumers' incomes increase? What if the firm doubles its advertising expenditure? What if competitors lower their prices? Firms should know the answers to these questions if they want to achieve the objective of maximizing their value.

55 Dummy-Variable Models. When explanatory variables are qualitative in nature, they are represented by dummy variables. These are also known as indicator variables, binary variables, categorical variables, or dichotomous variables, such as the variable D in the following equation:

56 Dummy-Variable Models. A categorical explanatory variable with 2 or more levels (yes or no, on or off, male or female) is handled with dummy variables coded as 0 or 1. Only the intercepts are different; the model assumes equal slopes across categories. The regression model has the same form. Can the dependent variable be a dummy?
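
A minimal sketch of the "different intercepts, equal slopes" idea, assuming a model of the form Y = b0 + b1 X + b2 D with D coded 0 or 1. The equation form, the function name, and the coefficient values (10, 2, and 5) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def predict(x, d, b0=10.0, b1=2.0, b2=5.0):
    """Fitted value for quantitative regressor x and dummy d (0 or 1)."""
    return b0 + b1 * x + b2 * d


# The gap between the two categories is b2 at every value of x ...
gap_at_low_x = predict(1, 1) - predict(1, 0)
gap_at_high_x = predict(8, 1) - predict(8, 0)

# ... while the slope with respect to x is b1 in both categories.
slope_when_d0 = predict(2, 0) - predict(1, 0)
slope_when_d1 = predict(2, 1) - predict(1, 1)

print(gap_at_low_x, gap_at_high_x, slope_when_d0, slope_when_d1)  # 5.0 5.0 2.0 2.0
```

The dummy shifts the line up or down by b2 without changing its slope, so the two categories trace out parallel lines.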

57 Thanks

