BNAD 276: Statistical Inference in Management Spring 2016


1 BNAD 276: Statistical Inference in Management Spring 2016
Welcome Green sheets

2

3 Schedule of readings
Before our fourth and final exam (April 28th):
OpenStax Chapters 1 – 13 (Chapter 12 is emphasized)
Plous Chapter 17: Social Influences
Plous Chapter 18: Group Judgments and Decisions

4 Homework
On class website: Please complete homework worksheet #18 (Simple Regression Worksheet). Due: Tuesday, April 19th

5 By the end of lecture today 4/14/16
Simple and Multiple Regression
Using correlation for predictions
r versus r²
Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent)
Coefficient of correlation is the name for “r”
Coefficient of determination is the name for “r²” (remember it is never negative, so it carries no direction info)
Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line, like a standard deviation)
Coefficient of regression will be “b” for each variable (like slope)

6 df = 3; critical r = 0.878

7

8 df = 3; critical r = 0.878; Yes; Yes
The relationship between the hours worked and weekly pay is a strong positive correlation. This correlation is significant, r(3) = 0.92; p < 0.05

9 df = 3; observed r = -0.73; critical r = 0.878; No; No
The relationship between wait time and number of operators working is negative and strong, but not reliable enough to reach significance. This correlation is not significant, r(3) = -0.73; n.s.
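The decision rule used in the two examples above can be sketched in a few lines of Python. This is only an illustration: the function names `is_significant` and `report` are made up here, and the critical value 0.878 (df = 3, p < 0.05) is the one quoted on the slides.

```python
def is_significant(observed_r: float, critical_r: float) -> bool:
    """Return True when the size of the observed r exceeds the critical r."""
    # The sign of r only gives direction, so compare the absolute value.
    return abs(observed_r) > critical_r


def report(observed_r: float, df: int, critical_r: float) -> str:
    """Format the result in the style the slides use, e.g. 'r(3) = 0.92; p < 0.05'."""
    tail = "p < 0.05" if is_significant(observed_r, critical_r) else "n.s."
    return f"r({df}) = {observed_r:.2f}; {tail}"


# Values taken from the two worked examples (df = 3, critical r = 0.878).
hours_vs_pay = report(0.92, df=3, critical_r=0.878)
wait_vs_operators = report(-0.73, df=3, critical_r=0.878)
print(hours_vs_pay)       # r(3) = 0.92; p < 0.05
print(wait_vs_operators)  # r(3) = -0.73; n.s.
```

Note that -0.73 is a fairly strong correlation, yet with only df = 3 it does not clear the critical value.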

10 We are measuring 9 students

11

12 Critical r = 0.666
[Figure: three scatterplots of GPA (1.0 to 4.0) against High School GPA, SAT (Verbal), and SAT (Mathematical)]
Decisions shown: Do not reject null, r is not significant; Do not reject null, r is not significant; Reject null, r is significant
r(7) = 0.50

13 [Figure: the same three scatterplots of GPA against High School GPA, SAT (Verbal), and SAT (Mathematical); r(7) = 0.50]

14 [Figure: the same three scatterplots of GPA against High School GPA, SAT (Verbal), and SAT (Mathematical); r(7) = 0.50]

15 [Figure: the same three scatterplots of GPA against High School GPA, SAT (Verbal), and SAT (Mathematical); r(7) = 0.50]

16 Assumptions Underlying Linear Regression
For each value of X, there is a group of Y values.
These Y values are normally distributed.
The means of these normal distributions of Y values all lie on the straight line of regression.
The standard deviations of these normal distributions are equal.
Revisit this slide
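One way to picture these assumptions is to simulate data that satisfies them. The sketch below is a toy illustration only: the intercept, slope, and standard deviation (`A`, `B`, `SIGMA`) are hypothetical values chosen for the demo, not numbers from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical intercept, slope, and common standard deviation.
A, B, SIGMA = 2.0, 0.5, 1.0


def simulate_y(x: float, n: int = 2000) -> list[float]:
    """Draw n normally distributed Y values whose mean lies on the line A + B*x."""
    return [random.gauss(A + B * x, SIGMA) for _ in range(n)]


for x in (1, 5, 10):
    ys = simulate_y(x)
    # The mean of each group should be close to A + B*x, and the spread
    # (standard deviation) should be about the same at every x.
    print(x, round(statistics.mean(ys), 2), round(statistics.stdev(ys), 2))
```

Each group of Y values is centered on the line, and every group has roughly the same spread, which is what the four assumptions describe.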

17 Regression Example
Rory is the owner of a small software company and employs 10 sales staff. Rory sends his staff all over the world consulting, selling, and setting up his system. He wants to evaluate his staff in terms of who are the most (and least) productive salespeople, and also whether more sales calls actually result in more systems being sold. So, he simply measures the number of sales calls made by each salesperson and how many systems they successfully sold.
Revisit this slide

18 Regression Example: Do more sales calls result in more sales made?
Step 1: Draw scatterplot
Step 2: Estimate r
[Figure: scatterplot of systems sold (dependent variable) against number of sales calls made (independent variable); labeled points include Ava, Emily, Isabella, Emma, Ethan, Joshua, Jacob]
Revisit this slide

19 Regression Example Do more sales calls result in more sales made? Step 3: Calculate r Step 4: Is it a significant correlation? Revisit this slide
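For Step 3, a minimal sketch of computing r by hand is given below. The data lists `calls` and `sold` are hypothetical numbers invented for the demo (Rory's actual data are not in this transcript), and `pearson_r` is just an illustrative name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-products over the root of the two sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical data for 10 salespeople (NOT the values from the lecture).
calls = [10, 20, 20, 30, 30, 40, 40, 50, 60, 70]
sold = [30, 40, 60, 30, 60, 50, 60, 40, 70, 60]

r_hypothetical = pearson_r(calls, sold)
print(round(r_hypothetical, 2))
```

The result always falls between -1 and +1; its sign gives the direction of the relationship and its size gives the strength.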

20 Do more sales calls result in more sales made?
Step 3: Calculate r
Step 4: Is it a significant correlation?
n = 10, df = 8, alpha = .05
Observed r is larger than critical r (0.71 > 0.632), therefore we reject the null hypothesis.
Yes, it is a significant correlation: r(8) = 0.71; p < 0.05
Revisit this slide
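Step 4 can be sketched as a lookup: degrees of freedom are n - 2, and the observed r is compared with the critical r for that df. The small table below holds only the three critical values that appear in these slides (alpha = .05); the names `CRITICAL_R_05`, `degrees_of_freedom`, and `reject_null` are illustrative assumptions, not part of the lecture.

```python
# Critical r values at alpha = .05 quoted in these slides, keyed by df.
CRITICAL_R_05 = {3: 0.878, 7: 0.666, 8: 0.632}


def degrees_of_freedom(n_pairs: int) -> int:
    """For a correlation, df is the number of paired observations minus 2."""
    return n_pairs - 2


def reject_null(observed_r: float, n_pairs: int) -> bool:
    """Reject the null hypothesis when |r| is larger than the critical r for this df."""
    df = degrees_of_freedom(n_pairs)
    return abs(observed_r) > CRITICAL_R_05[df]


print(degrees_of_freedom(10))  # 8
print(reject_null(0.71, 10))   # True  (0.71 > 0.632, Rory's sales data)
print(reject_null(0.50, 9))    # False (0.50 < 0.666, the 9-student example)
```

A fuller version would read critical values from a complete table; this one raises a `KeyError` for any df not listed.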

21 Regression: Predicting sales
Step 1: Draw prediction line (draw a regression line and state the regression equation)
r = 0.71; b = (slope); a = (intercept)
What are we predicting?
Revisit this slide

22 Regression: Predicting sales
Step 1: Draw prediction line (draw a regression line and state the regression equation)
r = 0.71; b = (slope); a = (intercept)
Revisit this slide

23 Regression: Predicting sales
Step 1: Draw prediction line (draw a regression line and state the regression equation)
r = 0.71; b = (slope); a = (intercept)
Revisit this slide

24 Rory’s Regression: Predicting sales from number of visits (sales calls)
Describe relationship: regression line (and equation), r = 0.71
Correlation: This is a strong positive correlation. Sales tend to increase as sales calls increase.
Predict using regression line (and regression equation)
b = (slope). Slope: as sales calls increase by 1, sales should increase by “b” systems.
a = (intercept). Intercept: suggests that we can assume each salesperson will sell at least “a” systems.
Dependent variable: systems sold. Independent variable: number of sales calls.

25 Regression: Predicting sales
Step 1: Predict sales for a certain number of sales calls
Step 2: State the regression equation: Y’ = a + bx
Step 3: Solve for some value of Y’. If they make one sales call: Y’ = a + b(1)
What should you expect from a salesperson who makes 1 call? They should sell Y’ systems.
If they sell more → overperforming
If they sell fewer → underperforming
[Figure: points labeled Madison and Joshua]
Revisit this slide

26 Regression: Predicting sales
Step 1: Predict sales for a certain number of sales calls
Step 2: State the regression equation: Y’ = a + bx
Step 3: Solve for some value of Y’. If they make two sales calls: Y’ = a + b(2)
What should you expect from a salesperson who makes 2 calls? They should sell Y’ systems.
If they sell more → overperforming
If they sell fewer → underperforming
[Figure: points labeled Isabella and Jacob]
Revisit this slide

27 Regression: Predicting sales
Step 1: Predict sales for a certain number of sales calls
Step 2: State the regression equation: Y’ = a + bx
Step 3: Solve for some value of Y’. If they make three sales calls: Y’ = a + b(3)
What should you expect from a salesperson who makes 3 calls? They should sell Y’ systems.
If they sell more → overperforming
If they sell fewer → underperforming
[Figure: points labeled Ava and Emma]

28 Regression: Predicting sales
Step 1: Predict sales for a certain number of sales calls
Step 2: State the regression equation: Y’ = a + bx
Step 3: Solve for some value of Y’. If they make four sales calls: Y’ = a + b(4)
What should you expect from a salesperson who makes 4 calls? They should sell Y’ systems.
If they sell more → overperforming
If they sell fewer → underperforming
[Figure: point labeled Emily]
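The three prediction steps can be sketched in code. The intercept and slope below (`A_HYP`, `B_HYP`) are hypothetical placeholders chosen only so the arithmetic is easy to follow; they are not Rory's actual regression coefficients, which do not appear in this transcript.

```python
def predict(x: float, a: float, b: float) -> float:
    """Regression equation Y' = a + b*x."""
    return a + b * x


# Hypothetical intercept and slope, for illustration only.
A_HYP, B_HYP = 20.0, 12.0

for n_calls in (1, 2, 3, 4):
    expected = predict(n_calls, A_HYP, B_HYP)
    print(n_calls, expected)
# 1 32.0
# 2 44.0
# 3 56.0
# 4 68.0
```

Each extra sales call raises the prediction by exactly the slope, and the prediction for zero calls would be the intercept.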

29 Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
What should you expect from each salesperson? They should sell Y’ systems, depending on their number of sales calls.
If they sell more → overperforming
If they sell fewer → underperforming
[Figure: scatterplot; labeled points: Ava, Emma, Isabella, Emily, Madison, Joshua, Jacob]

30 Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score).
How did Ava do? Y – Y’ = 14.7
Ava sold 14.7 more than expected, taking into account how many sales calls she made → overperforming

31 Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score).
How did Jacob do? Y – Y’ = -23.7
Jacob sold 23.7 fewer than expected, taking into account how many sales calls he made → underperforming

32 Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
What should you expect from each salesperson? They should sell Y’ systems, depending on their number of sales calls.
If they sell more → overperforming
If they sell fewer → underperforming
[Figure: scatterplot; labeled points: Ava, Emma, Isabella, Emily, Madison, Joshua, Jacob]

33 Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score).
Residuals shown: Ava 14.7, Emily -6.8, Madison 7.9, Jacob -23.7
[Figure: scatterplot; other labeled points: Emma, Isabella, Joshua]
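As a small sketch of the staff evaluation, the four residuals quoted on the slides can be turned into over/underperforming labels. The helper names `residual` and `performance` are illustrative assumptions.

```python
def residual(actual_y: float, predicted_y: float) -> float:
    """Residual = actual Y minus predicted Y'."""
    return actual_y - predicted_y


def performance(res: float) -> str:
    """Positive residual: sold more than predicted. Negative: sold fewer."""
    if res > 0:
        return "overperforming"
    if res < 0:
        return "underperforming"
    return "as expected"


# Residuals quoted on the slides for four of the salespeople.
residuals = {"Ava": 14.7, "Emily": -6.8, "Madison": 7.9, "Jacob": -23.7}

for name, res in residuals.items():
    print(name, res, performance(res))
# Ava 14.7 overperforming
# Emily -6.8 underperforming
# Madison 7.9 overperforming
# Jacob -23.7 underperforming
```

The label depends only on the sign of the residual, so it already takes into account how many sales calls each person made.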

34 Does the prediction line perfectly predict the predicted variable when using the predictor variable?
No, we are wrong sometimes…
How can we estimate how much “error” we have? Exactly?
The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score), for example 14.7 and -23.7.
The green lines show how much “error” there is in our prediction line…how much we are wrong in our predictions.
How would we find our “average residual”?

35 How do we find the average amount of error in our prediction?
Residual scores: Ava is 14.7, Jacob is -23.7, Emily is -6.8, Madison is 7.9
We want the average amount by which actual scores deviate on either side of the predicted score.
The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score). The green lines show how much “error” there is in our prediction line…how much we are wrong in our predictions.
How would we find our “average residual”?
Step 1: Find the error for each value (just the residuals): $Y - Y'$
Step 2: Add up the residuals: $\sum (Y - Y')$. Big problem: $\sum (Y - Y') = 0$
Square the deviations: $\sum (Y - Y')^2$
Divide by df: $\sum (Y - Y')^2 / (n - 2)$
Square root: $\sqrt{\sum (Y - Y')^2 / (n - 2)}$
Compare with the standard deviation: $\sqrt{\sum x^2 / N}$

36 How do we find the average amount of error in our prediction?
Sound familiar?? Deviation scores: Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”
Standard deviation: $\sqrt{\sum x^2 / N}$
The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score). The green lines show how much “error” there is in our prediction line…how much we are wrong in our predictions.
How would we find our “average residual”?
Step 1: Find the error for each value (just the residuals): $Y - Y'$
Step 2: Find the average: $\sqrt{\sum (Y - Y')^2 / (n - 2)}$

37 These would be helpful to know by heart – please memorize these formulas
Standard error of the estimate (line) = $\sqrt{\sum (Y - Y')^2 / (n - 2)}$
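The standard error of the estimate (square the residuals, add them up, divide by n - 2, take the square root) can be sketched as follows. The four data points and their predicted values are a hypothetical toy example, not the sales data from the lecture.

```python
import math


def standard_error_of_estimate(actual, predicted):
    """sqrt( sum of squared residuals / (n - 2) )."""
    n = len(actual)
    ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(ss_residual / (n - 2))


# Hypothetical toy data: actual Y values and the Y' values from their
# least-squares line (Y' = 2.0 + 0.7x for x = 1, 2, 3, 4).
actual = [2, 4, 5, 4]
predicted = [2.7, 3.4, 4.1, 4.8]

# The raw residuals cancel out (they sum to zero), which is why we square them.
raw_sum = sum(y - y_hat for y, y_hat in zip(actual, predicted))

se = standard_error_of_estimate(actual, predicted)
print(round(se, 3))  # 1.072
```

Here the squared residuals add up to 2.30, so the result is the square root of 2.30 / 2 = 1.15.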

38 Standard error of the estimate
How well does the prediction line predict the predicted variable when using the predictor variable?
What if we want to know the “average deviation score”? Find the standard error of the estimate (line).
Standard error of the estimate:
a measure of the average amount of predictive error
the average amount that Y’ scores differ from Y scores
a mean of the lengths of the green lines
Slope doesn’t give “variability” info
Intercept doesn’t give “variability” info
Correlation “r” does give “variability” info
Residuals do give “variability” info

39 How well does the prediction line predict the Ys from the Xs?
Residuals
Shorter green lines suggest better prediction (smaller error)
Longer green lines suggest worse prediction (larger error)
Why are the green lines vertical? Remember, we are predicting the variable on the Y axis, so error is how far we are wrong about Y (vertical).
A note about curvilinear relationships and patterns of the residuals

40 Does the prediction line perfectly predict the predicted variable when using the predictor variable?
No, we are wrong sometimes…
How can we estimate how much “error” we have?
The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score), for example 14.7 and -23.7.
The green lines show how much “error” there is in our prediction line…how much we are wrong in our predictions.
Perfect correlation = +1.00 or -1.00
Each variable perfectly predicts the other
No variability in the scatterplot
The dots fall exactly on a straight line

41 Regression Analysis – Least Squares Principle
When we calculate the regression line, we try to minimize the distance between the predicted Ys and the actual (data) Y points (the length of the green lines).
Remember, because the negative and positive values cancel each other out, we have to square those distances (deviations).
So we are trying to minimize the “sum of squares of the vertical distances between the actual Y values and the predicted Y values”.
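The least squares principle can be sketched with the standard closed-form slope and intercept for simple regression. The data are a hypothetical toy example, and the function names are illustrative; the check at the end shows that nudging the line in any direction makes the sum of squared vertical distances larger.

```python
def least_squares_line(xs, ys):
    """Return (a, b), the intercept and slope that minimize the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x               # slope
    a = mean_y - b * mean_x     # intercept
    return a, b


def sum_squared_error(xs, ys, a, b):
    """Sum of squared vertical distances between actual Y and predicted Y' = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 4]

a, b = least_squares_line(xs, ys)
best = sum_squared_error(xs, ys, a, b)
print(round(a, 2), round(b, 2), round(best, 2))  # 2.0 0.7 2.3

# Any other line does worse: shift the intercept or slope and the error grows.
for da in (-0.5, 0.5):
    for db in (-0.2, 0.2):
        assert sum_squared_error(xs, ys, a + da, b + db) > best
```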

42 Is the regression line better than just guessing the mean of the Y variable?
How much does the information about the relationship actually help? Which minimizes error better?
How much better does the regression line predict the observed results? That is what r² tells us. Wow!
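A rough sketch of this comparison: measure the squared error when we always guess the mean of Y, measure it again when we use the regression line, and see what share of the error the line removes. The data and the line Y' = 2.0 + 0.7x are a hypothetical toy example (that line is the least-squares fit for these four points).

```python
# Hypothetical toy data and its least-squares line Y' = 2.0 + 0.7x.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 4]
A_FIT, B_FIT = 2.0, 0.7

mean_y = sum(ys) / len(ys)

# Error when we ignore X and always guess the mean of Y.
ss_total = sum((y - mean_y) ** 2 for y in ys)

# Error when we predict with the regression line.
ss_error = sum((y - (A_FIT + B_FIT * x)) ** 2 for x, y in zip(xs, ys))

# Proportion of the total variability that the line accounts for.
r_squared = 1 - ss_error / ss_total

print(round(ss_total, 2), round(ss_error, 2), round(r_squared, 3))  # 4.75 2.3 0.516
```

In this toy case the line removes about half of the error we would make by guessing the mean every time.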

43 What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Example: If mother’s and daughter’s heights are correlated with an r = .8, then what amount (proportion or percentage) of variance of mother’s height is accounted for by daughter’s height?
.64 because (.8)² = .64

44 What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Example: If mother’s and daughter’s heights are correlated with an r = .8, then what proportion of variance of mother’s height is not accounted for by daughter’s height?
.36 because (1.0 - .64) = .36, or 36% because 100% - 64% = 36%

45 What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Example: If ice cream sales and temperature are correlated with an r = .5, then what amount (proportion or percentage) of variance of ice cream sales is accounted for by temperature?
.25 because (.5)² = .25

46 What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Example: If ice cream sales and temperature are correlated with an r = .5, then what amount (proportion or percentage) of variance of ice cream sales is not accounted for by temperature?
.75 because (1.0 - .25) = .75, or 75% because 100% - 25% = 75%
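The arithmetic in these four examples is short enough to check in code. The function names below are illustrative; the r values (.8 and .5) are the ones from the examples.

```python
def variance_explained(r: float) -> float:
    """Coefficient of determination: r squared."""
    return r ** 2


def variance_unexplained(r: float) -> float:
    """Proportion of variance NOT accounted for: 1 - r squared."""
    return 1 - r ** 2


for label, r in [("mother/daughter heights", 0.8), ("ice cream/temperature", 0.5)]:
    print(label, round(variance_explained(r), 2), round(variance_unexplained(r), 2))
# mother/daughter heights 0.64 0.36
# ice cream/temperature 0.25 0.75

# Squaring removes the sign, so r = -.8 explains just as much variance as r = +.8.
print(round(variance_explained(-0.8), 2))  # 0.64
```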

47 Some useful terms
Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent)
Coefficient of correlation is the name for “r”
Coefficient of determination is the name for “r²” (remember it is never negative, so it carries no direction info)
Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line, like a standard deviation)

48 Rory’s Regression: Predicting sales from number of visits (sales calls)
Describe relationship: regression line (and equation), r = 0.71
Correlation: This is a strong positive correlation. Sales tend to increase as sales calls increase.
Predict using regression line (and regression equation)
b = (slope). Slope: as sales calls increase by 1, sales should increase by “b” systems.
a = (intercept). Intercept: suggests that we can assume each salesperson will sell at least “a” systems.
Dependent variable: systems sold. Independent variable: number of sales calls.
Review

49 Review

50 Summary
Intercept: suggests that we can assume each salesperson will sell at least “a” systems
Slope: as sales calls increase by one, “b” more systems should be sold
Review

51 Thank you! See you next time!!

