Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2016 Room 150 Harvill.

2 Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2016 Room 150 Harvill Building 9:00 - 9:50 Mondays, Wednesdays & Fridays

4 Homework. On class website: No Homework Due: Monday, April 18th

5 By the end of lecture today (4/15/16): Simple and Multiple Regression. Using correlation for predictions; r versus r². Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent). Coefficient of correlation is the name for “r”. Coefficient of determination is the name for “r²” (remember it is never negative, so it carries no direction info). Standard error of the estimate is our measure of the variability of the dots around the regression line (the average deviation of each data point from the regression line, like a standard deviation). Coefficient of regression will be “b” for each variable (like slope).
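As a small illustration of r versus r², the sketch below squares the r = 0.71 used in this lecture's sales example. The variable names are illustrative only and do not come from the slides.

```python
# r from the lecture's sales example; r squared is the proportion of
# variance in Y accounted for by X.
r = 0.71
r_squared = r ** 2

print(f"r   = {r:+.2f}  (strength and direction)")
print(f"r^2 = {r_squared:.4f}  (proportion of variance explained)")

# Squaring discards the sign: r = -0.71 would give the same r squared,
# which is why r squared carries no direction information.
same_for_negative_r = (-r) ** 2 == r_squared
print(same_for_negative_r)
```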

6 Schedule of readings. Before our fourth and final exam (May 2nd): OpenStax Chapters 1 – 13 (Chapter 12 is emphasized); Plous Chapter 17: Social Influences; Plous Chapter 18: Group Judgments and Decisions

9 Labs will meet this week Project 4

10 Regression Example. Rory is the owner of a small software company and employs 10 sales staff. Rory sends his staff all over the world consulting, selling and setting up his system. He wants to evaluate his staff in terms of who are the most (and least) productive salespeople, and also whether more sales calls actually result in more systems being sold. So, he simply measures the number of sales calls made by each salesperson and how many systems they successfully sold.

11 Regression Example. Do more sales calls result in more sales made? Step 1: Draw scatterplot. Step 2: Estimate r. [Scatterplot: number of sales calls made (independent variable, 0 to 4) against number of systems sold (dependent variable, 0 to 70); labeled points include Ethan, Isabella, Ava, Emma, Emily, Jacob, Joshua]

12 Regression Example Do more sales calls result in more sales made? Step 3: Calculate r Step 4: Is it a significant correlation?
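A minimal sketch of what "calculate r" involves, using the standard Pearson formula. The slides do not list the ten salespeople's raw numbers, so the `calls` and `sold` values below are hypothetical and are not Rory's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: co-deviation of X and Y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical data for illustration only (not from the slides).
calls = [1, 2, 3, 4, 5]
sold = [2, 4, 5, 4, 5]

r = pearson_r(calls, sold)
print(round(r, 3))
```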

13 Do more sales calls result in more sales made? Step 3: Calculate r. Step 4: Is it a significant correlation? n = 10, df = 8, alpha = .05. Observed r is larger than critical r (0.71 > 0.632), therefore we reject the null hypothesis. Yes, it is a significant correlation: r(8) = 0.71; p < 0.05
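The critical r of 0.632 can be cross-checked from a t table. The sketch below assumes the two-tailed critical t for df = 8 at alpha = .05 is 2.306; that figure comes from a standard t table, not from the slides. It converts that critical t into a critical r and also computes the t statistic equivalent to the observed r.

```python
from math import sqrt

r_observed = 0.71
n = 10
df = n - 2  # degrees of freedom for a correlation: n - 2

# Assumption: value read from a standard t table (df = 8, two-tailed, alpha = .05).
t_critical = 2.306

# Critical r implied by the critical t.
critical_r = t_critical / sqrt(t_critical ** 2 + df)

# Equivalent t statistic for the observed r.
t_observed = r_observed * sqrt(df) / sqrt(1 - r_observed ** 2)

significant = abs(r_observed) > critical_r
print(round(critical_r, 3), round(t_observed, 2), significant)
```

Comparing r with the critical r and comparing t with the critical t are two views of the same decision, so both lead to rejecting the null hypothesis here.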

14 Regression: Predicting sales. Step 1: Draw prediction line (draw a regression line and state the regression equation). What are we predicting? r = 0.71, b = 11.579 (slope), a = 20.526 (intercept)

17 Rory’s Regression: Predicting sales from number of visits (sales calls). Dependent variable: number of systems sold. Independent variable: number of sales calls made. Regression line (and equation): r = 0.71, b = 11.579 (slope), a = 20.526 (intercept). Describe relationship: Correlation: this is a strong positive correlation; sales tend to increase as sales calls increase. Predict using regression line (and regression equation): Slope: as sales calls increase by 1, predicted sales increase by 11.579. Intercept: suggests that a salesperson who makes zero sales calls would still be predicted to sell 20.526 systems.

18 Regression: Predicting sales. Step 1: Predict sales for a certain number of sales calls. What should you expect from a salesperson who makes 1 call (Madison, Joshua)? Step 2: State the regression equation: Y’ = a + bx, so Y’ = 20.526 + 11.579x. Step 3: Solve for some value of Y’: Y’ = 20.526 + 11.579(1), Y’ = 32.105. If you make one sales call, you should sell 32.105 systems. If they sell more → overperforming; if they sell fewer → underperforming.

19 Regression: Predicting sales. Step 1: Predict sales for a certain number of sales calls. What should you expect from a salesperson who makes 2 calls (Isabella, Jacob)? Step 2: State the regression equation: Y’ = a + bx, so Y’ = 20.526 + 11.579x. Step 3: Solve for some value of Y’: Y’ = 20.526 + 11.579(2), Y’ = 43.684. If you make two sales calls, you should sell 43.684 systems. If they sell more → overperforming; if they sell fewer → underperforming.

20 Regression: Predicting sales. Step 1: Predict sales for a certain number of sales calls. What should you expect from a salesperson who makes 3 calls (Ava, Emma)? Step 2: State the regression equation: Y’ = a + bx, so Y’ = 20.526 + 11.579x. Step 3: Solve for some value of Y’: Y’ = 20.526 + 11.579(3), Y’ = 55.263. If you make three sales calls, you should sell 55.263 systems. If they sell more → overperforming; if they sell fewer → underperforming.

21 Regression: Predicting sales. Step 1: Predict sales for a certain number of sales calls. What should you expect from a salesperson who makes 4 calls (Emily)? Step 2: State the regression equation: Y’ = a + bx, so Y’ = 20.526 + 11.579x. Step 3: Solve for some value of Y’: Y’ = 20.526 + 11.579(4), Y’ = 66.842. If you make four sales calls, you should sell 66.842 systems. If they sell more → overperforming; if they sell fewer → underperforming.
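The four predictions above all come from the same equation, Y’ = 20.526 + 11.579x. A minimal sketch that evaluates it for 1 to 4 sales calls follows; the function and constant names are illustrative and not from the slides.

```python
INTERCEPT = 20.526  # a, from the slides
SLOPE = 11.579      # b, from the slides


def predict_sales(sales_calls):
    """Predicted systems sold, Y' = a + b * x."""
    return INTERCEPT + SLOPE * sales_calls


predictions = {calls: round(predict_sales(calls), 3) for calls in range(1, 5)}
for calls, predicted in predictions.items():
    print(f"{calls} call(s): predicted sales = {predicted}")
```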

22 Regression: Evaluating Staff. Step 1: Compare expected sales levels to actual sales levels. What should you expect from each salesperson? They should sell x systems depending on sales calls. If they sell more → overperforming; if they sell fewer → underperforming. [Scatterplot labels: Madison, Isabella, Ava, Emma, Emily, Jacob, Joshua]

23 Regression: Evaluating Staff. Step 1: Compare expected sales levels to actual sales levels. How did Ava do? Ava sold 14.7 more than expected, taking into account how many sales calls she made → overperforming. The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score). Ava: 70 - 55.3 = 14.7

24 Regression: Evaluating Staff. Step 1: Compare expected sales levels to actual sales levels. How did Jacob do? Jacob sold 23.7 fewer than expected, taking into account how many sales calls he made → underperforming. The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score). Jacob: 20 - 43.7 = -23.7
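The two residuals worked out for Ava and Jacob can be reproduced as actual minus predicted. The sketch below uses only figures given on the slides (Ava: 3 calls, 70 sold; Jacob: 2 calls, 20 sold); the names `staff` and `residuals` are illustrative.

```python
INTERCEPT = 20.526  # a, from the slides
SLOPE = 11.579      # b, from the slides

# (sales calls made, systems actually sold), as given on the slides.
staff = {"Ava": (3, 70), "Jacob": (2, 20)}

residuals = {}
for name, (calls, sold) in staff.items():
    predicted = INTERCEPT + SLOPE * calls
    residuals[name] = sold - predicted  # residual = Y - Y'

for name, value in residuals.items():
    label = "overperforming" if value > 0 else "underperforming"
    print(f"{name}: residual = {value:+.1f} ({label})")
```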

26 Regression: Evaluating Staff. Step 1: Compare expected sales levels to actual sales levels. The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score). Residuals shown: Ava 14.7, Jacob -23.7, Emily -6.8, Madison 7.9. [Scatterplot labels: Madison, Isabella, Ava, Emma, Emily, Jacob, Joshua]

27 Does the prediction line perfectly predict the predicted variable when using the predictor variable? No, we are wrong sometimes… How can we estimate how much “error” we have? Exactly? The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score), for example 14.7 and -23.7. The green lines show how much “error” there is in our prediction line, that is, how much we are wrong in our predictions. How would we find our “average residual”?

28 How do we find the average amount of error in our prediction? The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score). The green lines show how much “error” there is in our prediction line, that is, how much we are wrong in our predictions. How would we find our “average residual”, the average amount by which actual scores deviate on either side of the predicted score? Step 1: Find the error for each value (just the residuals), $Y - Y'$. Residual scores: Ava is 14.7, Emily is -6.8, Madison is 7.9, Jacob is -23.7. Step 2: Add up the residuals, $\sum (Y - Y')$, and divide by N (as in $\sum x / N$). Big problem: $\sum (Y - Y') = 0$. So instead: square the deviations, $\sum (Y - Y')^2$; divide by df, $\sum (Y - Y')^2 / (n - 2)$; then take the square root, $\sqrt{\sum (Y - Y')^2 / (n - 2)}$.

29 How do we find the average amount of error in our prediction? The difference between expected Y’ and actual Y is called the “residual” (it’s a deviation score). The green lines show how much “error” there is in our prediction line, that is, how much we are wrong in our predictions. How would we find our “average residual”? Step 1: Find the error for each value (just the residuals), $Y - Y'$. Step 2: Find the average: $\sqrt{\sum (Y - Y')^2 / (n - 2)}$. Sound familiar? Compare the deviation scores used for the standard deviation: Diallo is 0", Mike is -4", Hunter is -2", Preston is 2".

30 These would be helpful to know by heart, so please memorize these formulas. Standard error of the estimate (line) = $\sqrt{\sum (Y - Y')^2 / (n - 2)}$
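A minimal sketch of the standard error of the estimate, following the steps above: find the residuals, square them, sum, divide by n - 2, and take the square root. Because the slides do not list the full sales dataset, the `xs` and `ys` values are hypothetical, and the least-squares helper `fit_line` is an illustrative addition rather than something taken from the lecture.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares intercept (a) and slope (b) for Y' = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def standard_error_of_estimate(xs, ys):
    """Return (standard error of the estimate, list of residuals Y - Y')."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sum_squared = sum(e ** 2 for e in residuals)
    return sqrt(sum_squared / (len(xs) - 2)), residuals


# Hypothetical data for illustration only (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

se, residuals = standard_error_of_estimate(xs, ys)
print("sum of residuals:", round(sum(residuals), 10))  # the "big problem"
print("standard error of the estimate:", round(se, 4))
```

The raw residuals of a least-squares line cancel out when summed, which is why the formula squares them before averaging.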

31 How well does the prediction line predict the Ys from the Xs? Residuals: shorter green lines suggest better prediction (smaller error); longer green lines suggest worse prediction (larger error). Why are the green lines vertical? Remember, we are predicting the variable on the Y axis, so error is how wrong we are about Y (vertical). A note about curvilinear relationships and patterns of the residuals.
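One way to see why the pattern of residuals matters for curvilinear relationships: fit a straight line to data that actually follow a curve and inspect the residuals. The data below (y = x squared) are hypothetical and chosen only to make the pattern obvious; this is a sketch, not an example from the lecture.

```python
def fit_line(xs, ys):
    """Least-squares intercept (a) and slope (b) for Y' = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    return mean_y - slope * mean_x, slope


# Hypothetical curvilinear data: y = x squared.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.1f}")
```

The residuals are positive at both ends and negative in the middle. A systematic pattern like this, rather than random scatter around zero, suggests that a straight line is not the right shape for the relationship.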
