BNAD 276: Statistical Inference in Management Spring 2016 Green sheets.


1

2 BNAD 276: Statistical Inference in Management Spring 2016 Green sheets

3

4 Schedule of readings. Before our fourth and final exam (April 28th): OpenStax Chapters 1 - 13 (Chapter 12 is emphasized); Plous Chapter 17: Social Influences, and Chapter 18: Group Judgments and Decisions.

5 Homework. On class website: please complete homework worksheet #19, Multiple Regression Worksheet. Due: Tuesday, April 26th. Study guide for Exam 4 is online.

6 By the end of lecture today (4/21/16): Simple and Multiple Regression. Using correlation for predictions; r versus r^2. Regression uses the predictor (independent) variable to make predictions about the predicted (dependent) variable. Coefficient of correlation is the name for “r”. Coefficient of determination is the name for “r^2” (remember it is never negative, so it carries no direction information). Standard error of the estimate is our measure of the variability of the dots around the regression line (roughly the average deviation of each data point from the regression line, like a standard deviation). Coefficient of regression will be “b” for each variable (like slope).

7 Review. Rory’s Regression: predicting sales (dependent variable) from number of visits, or sales calls (independent variable). Regression line and equation: r = 0.71, b = 11.579 (slope), a = 20.526 (intercept), so Y' = 20.526 + 11.579 X. Describe the relationship. Correlation: this is a strong positive correlation; sales tend to increase as sales calls increase. Predict using the regression line (and regression equation). Slope: as sales calls increase by 1, sales should increase by 11.579. Intercept: suggests that we can assume each salesperson will sell at least 20.526 systems.
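As a rough sketch of how the regression equation on this slide turns into a prediction, the snippet below plugs a number of sales calls into Y' = a + bX using the slope and intercept quoted above. The function name `predict_sales` and the example input of 10 sales calls are illustrative assumptions, not part of the lecture.

```python
# Slope and intercept quoted on the slide for Rory's regression.
SLOPE = 11.579      # b: extra systems sold per additional sales call
INTERCEPT = 20.526  # a: predicted systems sold at zero sales calls


def predict_sales(sales_calls: float) -> float:
    """Return the predicted number of systems sold, Y' = a + b * X."""
    return INTERCEPT + SLOPE * sales_calls


# Hypothetical input: 10 sales calls is an assumed value for illustration.
predicted = predict_sales(10)
print(f"Predicted systems sold for 10 sales calls: {predicted:.3f}")

# One more sales call changes the prediction by exactly the slope.
change = predict_sales(11) - predict_sales(10)
print(f"Change for one extra sales call: {change:.3f}")
```

The second print illustrates the slope interpretation: each additional sales call moves the prediction by b.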

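8 Review. The observed r of 0.71 exceeds the critical value of 0.632 (0.71 > 0.632), so the correlation is significant. r^2 = 0.71^2, or about 0.50, so 50% of the variability is explained and the other 50% has yet to be explained.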
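The "50% explained" figure comes from squaring r. A minimal sketch of that arithmetic, using the r = 0.71 quoted for Rory's regression (the variable names are illustrative assumptions):

```python
# Coefficient of correlation quoted for Rory's regression.
r = 0.71

# Coefficient of determination: proportion of variability explained.
r_squared = r ** 2

# Whatever is not explained by the predictor remains unexplained.
unexplained = 1 - r_squared

print(f"r^2 = {r_squared:.4f} (about {r_squared:.0%} explained)")
print(f"Unexplained = {unexplained:.4f} (about {unexplained:.0%})")
```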

9 Review: Summary. Slope: as sales calls increase by one, 11.579 more systems should be sold. Intercept: suggests that we can assume each salesperson will sell at least 20.526 systems.

10 Homework Review

11 Multiple regression equations. Can use variables to predict: behavior of the stock market; probability of an accident; amount of pollution in a particular well; quality of a wine for a particular year; which candidates will make the best workers.

12 Can use variables to predict which candidates will make the best workers. Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a. We measured current workers: the best workers tend to have the highest “success scores” (success scores range from 1 to 1,000). We try to predict which applicants will have the highest success score. We have found that these variables predict success: Age (X_1), Niceness (X_2), Harshness (X_3). Niceness and Harshness are both 10-point scales (Niceness: 10 = really nice; Harshness: 10 = really harsh). According to your research, age has only a small effect on success, while workers’ attitude has a big effect. It turns out that the best workers have high “niceness” scores and low “harshness” scores. Your results are summarized by this regression formula: Success score = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700.

13 Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a. According to your research, age has only a small effect on success, while workers’ attitude has a big effect. It turns out that the best workers have high “niceness” scores and low “harshness” scores. Your results are summarized by this regression formula: Success score = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700.

14 Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a. Y' is the dependent variable: “Success score” is your dependent variable. X_1, X_2 and X_3 are the independent variables: “Age”, “Niceness” and “Harshness” are the independent variables. Each “b” is called a regression coefficient. Each “b” shows the change in Y' for each unit change in its own X (holding the other independent variables constant). a is the Y-intercept. According to your research, age has only a small effect on success, while workers’ attitude has a big effect. It turns out that the best workers have high “niceness” scores and low “harshness” scores. Your results are summarized by this regression formula: Success score = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700.

15 The Multiple Regression Equation: Interpreting the Regression Coefficients. Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a; Success score = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700. b_1: the regression coefficient for age (X_1) is 1. The coefficient is positive and suggests a positive relationship between age and success: as age increases, the success score increases. The numeric value of the regression coefficient provides more information. If age increases by 1 year and we hold the other two independent variables constant, we can predict a 1 point increase in the success score.

16 The Multiple Regression Equation: Interpreting the Regression Coefficients. Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a; Success score = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700. b_2: the regression coefficient for niceness (X_2) is 20. The coefficient is positive and suggests a positive relationship between niceness and success: as niceness increases, the success score increases. The numeric value of the regression coefficient provides more information. If the “niceness score” increases by one and we hold the other two independent variables constant, we can predict a 20 point increase in the success score.

17 The Multiple Regression Equation: Interpreting the Regression Coefficients. Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a; Success score = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700. b_3: the regression coefficient for harshness (X_3) is -75. The coefficient is negative and suggests a negative relationship between harshness and success: as harshness increases, the success score decreases. The numeric value of the regression coefficient provides more information. If the “harshness score” increases by one and we hold the other two independent variables constant, we can predict a 75 point decrease in the success score.

18 Here comes Victoria. Her scores are as follows: Age = 30, Niceness = 8, Harshness = 2. What would we predict her “success index” to be? Prediction line: Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a, that is, Y' = 1 X_1 + 20 X_2 - 75 X_3 + 700, or Y' = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700. Y' = (1)(30) + (20)(8) - 75(2) + 700 = 740. We predict Victoria will have a Success Index of 740.

19 Here comes Victor. His scores are as follows: Age = 35, Niceness = 2, Harshness = 8. What would we predict his “success index” to be? Prediction line: Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a, that is, Y' = 1 X_1 + 20 X_2 - 75 X_3 + 700, or Y' = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700. Y' = (1)(35) + (20)(2) - 75(8) + 700 = 175. We predict Victor will have a Success Index of 175. For comparison, Victoria (Age = 30, Niceness = 8, Harshness = 2): Y' = (1)(30) + (20)(8) - 75(2) + 700 = 740, so we predict Victoria will have a Success Index of 740.

20 Can use variables to predict which candidates will make the best workers. We predict Victor will have a Success Index of 175. We predict Victoria will have a Success Index of 740. Who will we hire?
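The two predictions above can be reproduced with a few lines of code. This is only a sketch of the arithmetic in the slides' regression formula; the function name `success_score` and the dictionary layout are illustrative assumptions.

```python
def success_score(age: float, nice: float, harsh: float) -> float:
    """Success score = (1)(Age) + (20)(Nice) + (-75)(Harsh) + 700."""
    return 1 * age + 20 * nice - 75 * harsh + 700


# Applicant scores as given on the slides: (age, niceness, harshness).
applicants = {
    "Victoria": (30, 8, 2),
    "Victor": (35, 2, 8),
}

# Predicted success index for each applicant.
scores = {name: success_score(*values) for name, values in applicants.items()}

for name, score in scores.items():
    print(f"{name}: predicted Success Index = {score}")

# The applicant with the highest predicted score.
best = max(scores, key=scores.get)
print(f"Highest predicted score: {best}")
```

On these inputs the formula favors Victoria, mostly because of the large negative weight on harshness.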

21 Conducting multiple regression analyses that are relevant and useful starts with measurement designed to decrease uncertainty. “Anything can be measured. If a thing can be observed in any way at all, it lends itself to some type of measurement method. No matter how ‘fuzzy’ the measurement is, it’s still a measurement if it tells you more than you knew before.” - Douglas Hubbard, author of “How to Measure Anything: Finding the Value of ‘Intangibles’ in Business”

22 Measurements don’t have to be precise to be useful. “Anything can be measured. If a thing can be observed in any way at all, it lends itself to some type of measurement method. No matter how ‘fuzzy’ the measurement is, it’s still a measurement if it tells you more than you knew before.” - Douglas Hubbard, author of “How to Measure Anything: Finding the Value of ‘Intangibles’ in Business”. How do we operationally define and measure constructs that we care about? “A problem well stated is a problem half solved.” - Charles Kettering (1876 - 1958), American inventor, holder of 300 patents, including electrical ignition for automobiles. “It is better to be approximately right, than to be precisely wrong.” - Warren Buffett

23 Multiple Linear Regression: Example. Can we predict heating cost? Three variables are thought to relate to the heating costs: (1) the mean daily outside temperature, (2) the number of inches of insulation in the attic, and (3) the age in years of the furnace. To investigate, Salisbury's research department selected a random sample of 20 recently sold homes. It determined the cost to heat each home last January.

24

25 The Multiple Regression Equation: Interpreting the Regression Coefficients. b_1: the regression coefficient for mean outside temperature (X_1) is -4.583. The coefficient is negative and shows a negative relationship between heating cost and temperature. As the outside temperature increases, the cost to heat the home decreases. The numeric value of the regression coefficient provides more information. If we increase temperature by 1 degree and hold the other two independent variables constant, we can estimate a decrease of $4.583 in monthly heating cost.

26 The Multiple Regression Equation: Interpreting the Regression Coefficients. b_2: the regression coefficient for attic insulation (X_2) is -14.831. The coefficient is negative and shows a negative relationship between heating cost and insulation. The more insulation in the attic, the less the cost to heat the home, so the negative sign for this coefficient is logical. For each additional inch of insulation, we expect the cost to heat the home to decline $14.83 per month, holding the outside temperature and the age of the furnace constant.

27 The Multiple Regression Equation: Interpreting the Regression Coefficients. b_3: the regression coefficient for age of the furnace (X_3) is 6.101. The coefficient is positive and shows a positive relationship between heating cost and the age of the furnace. As the age of the furnace goes up, the cost to heat the home increases. Specifically, for each additional year older the furnace is, we expect the cost to increase $6.10 per month, holding the other two independent variables constant.

28

29

30

31

32

33

34 Applying the Model for Estimation What is the estimated heating cost for a home if: the mean outside temperature is 30 degrees, there are 5 inches of insulation in the attic, and the furnace is 10 years old?

35 Multiple regression equations. Prediction line: Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a. Very often we want to select students or employees who have the highest probability of success in our school or company. Andy is an administrator at a paralegal program and he wants to predict the Grade Point Average (GPA) for the incoming class. He thinks these independent variables will be helpful in predicting GPA: High School GPA (X_1), SAT - Verbal (X_2), SAT - Mathematical (X_3). Andy completes a multiple regression analysis and comes up with this regression equation: Y' = 1.2 X_1 + 0.00163 X_2 - 0.00194 X_3 - 0.411, that is, Y' = 1.2 (HS GPA) + 0.00163 (SAT Verbal) - 0.00194 (SAT Math) - 0.411.

36 Prediction line: Y' = b_1 X_1 + b_2 X_2 + b_3 X_3 + a, that is, Y' = 1.2 X_1 + 0.00163 X_2 - 0.00194 X_3 - 0.411. Here comes Victoria. Her scores are as follows: High School GPA = 3.81, SAT Verbal = 500, SAT Mathematical = 600. What would we predict her GPA to be in the paralegal program? Y' = 1.2(3.81) + 0.00163(500) - 0.00194(600) - 0.411 = 4.572 + 0.815 - 1.164 - 0.411 = 3.812. We predict Victoria will have a GPA of 3.812. Predict Victor’s GPA. His scores are as follows: High School GPA = 2.63, SAT - Verbal = 469, SAT - Mathematical = 440. Y' = 1.2(2.63) + 0.00163(469) - 0.00194(440) - 0.411 = 3.156 + 0.76447 - 0.8536 - 0.411 = 2.656. We predict Victor will have a GPA of 2.656, or about 2.66.
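A short sketch of the same GPA arithmetic, using the coefficients from Andy's regression equation as quoted on the slides. The function name `predict_gpa` and its parameter names are illustrative assumptions.

```python
def predict_gpa(hs_gpa: float, sat_verbal: float, sat_math: float) -> float:
    """Y' = 1.2 (HS GPA) + 0.00163 (SAT Verbal) - 0.00194 (SAT Math) - 0.411."""
    return 1.2 * hs_gpa + 0.00163 * sat_verbal - 0.00194 * sat_math - 0.411


# Scores as given on the slide.
victoria = predict_gpa(hs_gpa=3.81, sat_verbal=500, sat_math=600)
victor = predict_gpa(hs_gpa=2.63, sat_verbal=469, sat_math=440)

print(f"Victoria: predicted GPA = {victoria:.3f}")
print(f"Victor:   predicted GPA = {victor:.3f}")
```

Keeping each predictor as a named argument makes it harder to swap the verbal and mathematical SAT scores by accident, which matters here because their coefficients have opposite signs.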

37

38 [Figure: three scatterplots of Heating Cost (vertical axis, 0 to 500) against Average Temperature, Insulation, and Age of Furnace (horizontal axis, 0 to 80). Each panel lists two correlation values. Average Temperature: r(18) = -0.50 and r(18) = -0.811508835. Insulation: r(18) = -0.40 and r(18) = -0.257101335. Age of Furnace: r(18) = +0.60 and r(18) = +0.536727562.]


40 [Regression output] Coefficients: intercept = +427.19; temperature = -4.5827; insulation = -14.8308; age of furnace = +6.1010. Y' = 427.19 - 4.5827 x_1 - 14.8308 x_2 + 6.1010 x_3


45 Rounded regression coefficients: 4.58 (temperature), 14.83 (insulation), 6.10 (age of furnace). Y' = 427.19 - 4.5827(30) - 14.8308(5) + 6.1010(10) = 427.19 - 137.481 - 74.154 + 61.010 = $276.56. Next: calculate the predicted heating cost using the new value for the age of the furnace, and use the regression coefficient for the furnace ($6.10) to estimate the change.

46 Rounded regression coefficients: 4.58 (temperature), 14.83 (insulation), 6.10 (age of furnace). Calculate the predicted heating cost using the new value for the age of the furnace, and use the regression coefficient for the furnace ($6.10) to estimate the change. Furnace 10 years old: Y' = 427.19 - 4.5827(30) - 14.8308(5) + 6.1010(10) = 427.19 - 137.481 - 74.154 + 61.010 = $276.56. Furnace 11 years old: Y' = 427.19 - 4.5827(30) - 14.8308(5) + 6.1010(11) = 427.19 - 137.481 - 74.154 + 67.111 = $282.66. These differ by only one year, but heating cost changed by $6.10: 282.66 - 276.56 = 6.10.
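The heating-cost comparison can be sketched in code as follows. The coefficients are the ones shown in the slides' regression equation; the function name `predict_heating_cost` and its parameter names are illustrative assumptions.

```python
def predict_heating_cost(temp: float, insulation: float, furnace_age: float) -> float:
    """Y' = 427.19 - 4.5827 x1 - 14.8308 x2 + 6.1010 x3 (monthly cost in dollars)."""
    return 427.19 - 4.5827 * temp - 14.8308 * insulation + 6.1010 * furnace_age


# Home from the slides: 30 degrees outside, 5 inches of insulation.
cost_10 = predict_heating_cost(temp=30, insulation=5, furnace_age=10)
cost_11 = predict_heating_cost(temp=30, insulation=5, furnace_age=11)

print(f"Furnace 10 years old: ${cost_10:.3f}")
print(f"Furnace 11 years old: ${cost_11:.3f}")

# With temperature and insulation held constant, one extra year of
# furnace age changes the prediction by the furnace coefficient.
print(f"Difference: ${cost_11 - cost_10:.3f}")
```

The unrounded predictions are 276.565 and 282.666, so the last printed digit can differ by a cent from the slide depending on how the value is rounded; the one-year difference is the furnace coefficient, 6.101.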

47 [Figure: three scatterplots of GPA (vertical axis, 1.0 to 4.0) against High School GPA (horizontal axis, 0 to 4), SAT (Verbal) and SAT (Mathematical) (horizontal axis, 0 to 600). Each panel lists two correlation values. High School GPA: r(7) = 0.50 and r(7) = +0.911444123. SAT (Verbal): r(7) = +0.80 and r(7) = +0.616334867. SAT (Mathematical): r(7) = +0.80 and r(7) = +0.487295007.]


51 - 56 [Regression output, built up over slides 51 to 56] Coefficients, each marked Yes or No: intercept = -0.41107 (No); High School GPA = +1.2013 (Yes); SAT Verbal = 0.0016 (No); SAT Mathematical = -0.0019 (No). The one marked Yes: High School GPA. Y' = 1.2013 x_1 + 0.0016 x_2 - 0.0019 x_3 - 0.41107

57 Rounded coefficients: 1.20, .0016, .0019. Y' = 1.2013 x_1 + 0.0016 x_2 - 0.0019 x_3 - 0.41107. Y' = 1.2013(2.8) + 0.0016(430) - 0.0019(460) - 0.411 = 2.76

58 Rounded coefficients: 1.20, .0016, .0019. Y' = 1.2013 x_1 + 0.0016 x_2 - 0.0019 x_3 - 0.41107. Y' = 1.2013(3.8) + 0.0016(430) - 0.0019(460) - 0.411 = 3.96

59 Rounded coefficients: 1.20, .0016, .0019. Yes, use the regression coefficient for the HS GPA (1.2) to estimate the change: 3.96 - 2.76 = 1.2
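A sketch of the comparison on the last few slides: two students with the same SAT scores (Verbal 430, Mathematical 460) whose High School GPAs differ by one point (2.8 versus 3.8). The coefficients are those shown in the regression output, with the intercept rounded to -0.411 as in the slides' calculation; the function name `predict_gpa_output` is an illustrative assumption.

```python
def predict_gpa_output(hs_gpa: float, sat_verbal: float, sat_math: float) -> float:
    """Y' = 1.2013 x1 + 0.0016 x2 - 0.0019 x3 - 0.411."""
    return 1.2013 * hs_gpa + 0.0016 * sat_verbal - 0.0019 * sat_math - 0.411


# Same SAT scores, High School GPA one point apart.
low = predict_gpa_output(hs_gpa=2.8, sat_verbal=430, sat_math=460)
high = predict_gpa_output(hs_gpa=3.8, sat_verbal=430, sat_math=460)

print(f"HS GPA 2.8: predicted GPA = {low:.4f}")
print(f"HS GPA 3.8: predicted GPA = {high:.4f}")

# Holding the SAT scores constant, a one-point change in HS GPA shifts
# the prediction by the HS GPA regression coefficient.
difference = high - low
print(f"Difference = {difference:.4f}")
```

The unrounded predictions are 2.76664 and 3.96794, which the slides report as 2.76 and 3.96; either way the difference is the High School GPA coefficient, about 1.2.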

60

