Published by Λαδων Γεωργίου. Modified over 5 years ago.
1
Introduction to Statistics for the Social Sciences
SBS200 - Lecture Section 001, Spring 2019
Room 150 Harvill Building
9:00 - 9:50 Mondays, Wednesdays & Fridays
April 22
2
The Green Sheets
3
Schedule of readings
Before our fourth and final exam (April 29th):
OpenStax Chapters 1 – 13 (Chapter 12 is emphasized)
Plous Chapter 17: Social Influences
Plous Chapter 18: Group Judgments and Decisions
Study Guide for Exam 4: on website
4
Lab sessions
Labs this week are optional and will focus on preparing for Exam 4.
5
Solutions for Review Sheet Available
6
We refer to the predicted variable as the dependent variable (Y) and the predictor variable as the independent variable (X).
Why are we finding the regression line? How would we use it?
regression coefficient (slope)
correlation coefficient (“r”)
Revisit this slide
7
Regression Example
Rory is the owner of a small software company and employs 10 sales staff. Rory sends his staff all over the world consulting, selling and setting up his system. He wants to evaluate his staff in terms of who are the most (and least) productive salespeople, and also whether more sales calls actually result in more systems being sold. So, he simply measures the number of sales calls made by each salesperson and how many systems they successfully sold.
Revisit this slide
8
Rory’s Regression: Predicting sales from number of visits (sales calls)
Describe relationship: regression line (and equation), r = 0.71
Correlation: This is a strong positive correlation. Sales tend to increase as sales calls increase.
Predict using regression line (and regression equation)
b = (slope)
Slope: as sales calls increase by 1, sales should increase by
a = (intercept)
Intercept: suggests that we can assume each salesperson will sell at least systems
[Scatterplot axes: Independent Variable (sales calls), Dependent Variable (sales)]
Revisit this slide
9
Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
Difference between expected Y’ and actual Y is called “residual” (it’s a deviation score)
Ava: Y – Y’ = 14.7
How did Ava do? Ava sold 14.7 more than expected, taking into account how many sales calls she made: overperforming.
Revisit this slide
10
Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
Difference between expected Y’ and actual Y is called “residual” (it’s a deviation score)
Jacob: Y – Y’ = -23.7
How did Jacob do? Jacob sold 23.7 fewer than expected, taking into account how many sales calls he made: underperforming.
Revisit this slide
11
Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
What should you expect from each salesperson? They should sell x systems depending on sales calls.
If they sell more: overperforming
If they sell fewer: underperforming
[Scatterplot labels: Ava, Emma, Isabella, Emily, Madison, Joshua, Jacob]
12
Regression: Evaluating Staff
Step 1: Compare expected sales levels to actual sales levels
Difference between expected Y’ and actual Y is called “residual” (it’s a deviation score)
[Scatterplot labels: Ava, Emma, Isabella, Emily, Madison, Joshua, Jacob. Residuals shown: 14.7 (Ava), -6.8, -23.7, 7.9]
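The expected-versus-actual comparison can be sketched in a few lines of Python. This is only an illustration: the slope, intercept, names, and sales figures below are hypothetical placeholders, not Rory's actual data, since the regression equation itself is not reproduced in these slides.

```python
# Hypothetical regression line: predicted sales Y' = a + b * (sales calls).
# These values of a and b are made up for illustration only.
A_INTERCEPT = 20.0
B_SLOPE = 1.5


def residual(calls, actual_sales, a=A_INTERCEPT, b=B_SLOPE):
    """Return Y - Y': actual sales minus the sales predicted from calls."""
    predicted = a + b * calls
    return actual_sales - predicted


def rate(res):
    """Label a salesperson by the sign of the residual."""
    if res > 0:
        return "overperforming"
    if res < 0:
        return "underperforming"
    return "as expected"


# Hypothetical staff records: name -> (sales calls, systems sold)
staff = {"Person A": (20, 60), "Person B": (30, 50)}

for name, (calls, sold) in staff.items():
    res = residual(calls, sold)
    print(name, res, rate(res))
```

A positive residual means the person sold more than the line predicts for their number of calls; a negative residual means they sold fewer.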
13
How do we find the average amount of error in our prediction?
The green lines show how much “error” there is in our prediction line…how much we are wrong in our predictions.
Difference between expected Y’ and actual Y is called “residual” (it’s a deviation score).
How would we find our “average residual”?
Step 1: Find error for each value (just the residuals): Y – Y’
Step 2: Find average: $\sqrt{\frac{\sum (Y - Y')^2}{n - 2}}$
Sound familiar?? Deviation scores: Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”. Standard deviation: $\sqrt{\frac{\sum x^2}{N}}$
14
Standard error of the estimate (line) = $\sqrt{\frac{\sum (Y - Y')^2}{n - 2}}$
These would be helpful to know by heart; please memorize these formulas.
15
Standard error of the estimate (line)
How well does the prediction line predict the predicted variable when using the predictor variable?
What if we want to know the “average deviation score”? Finding the standard error of the estimate (line)
Standard error of the estimate:
a measure of the average amount of predictive error
the average amount that Y’ scores differ from Y scores
a mean of the lengths of the green lines
Slope doesn’t give “variability” info
Intercept doesn’t give “variability” info
Correlation “r” does give “variability” info
Residuals do give “variability” info
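As a rough sketch of how the standard error of the estimate could be computed, the function below takes the square root of the summed squared residuals divided by n - 2. The function name and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def standard_error_of_estimate(actual, predicted):
    """Compute sqrt( sum((Y - Y')^2) / (n - 2) ) for paired Y and Y' values."""
    n = len(actual)
    if n != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if n <= 2:
        raise ValueError("need more than 2 data points (n - 2 must be positive)")
    sum_squared_residuals = sum((y - yp) ** 2 for y, yp in zip(actual, predicted))
    return math.sqrt(sum_squared_residuals / (n - 2))


# Hypothetical actual (Y) and predicted (Y') scores.
actual = [10, 12, 15, 19]
predicted = [11, 12, 16, 18]

# Residuals are -1, 0, -1, 1, so the sum of squares is 3 and n - 2 is 2.
print(standard_error_of_estimate(actual, predicted))
```

Like a standard deviation, the result is in the units of Y, so it can be read as the typical distance of a data point from the regression line.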
16
What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Examples: If mother’s and daughter’s heights are correlated with an r = .8, then what amount (proportion or percentage) of variance of mother’s height is accounted for by daughter’s height?
.64 because (.8)² = .64
17
What is r²?
r² = The proportion of the total variance in one variable that is predictable by its relationship with the other variable
Examples: If mother’s and daughter’s heights are correlated with an r = .8, then what proportion of variance of mother’s height is not accounted for by daughter’s height?
.36 because (1 – .64) = .36, or 36% because 100% - 64% = 36%
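The two height examples come down to squaring r and subtracting from 1. A minimal Python sketch (the function names are illustrative, not from the course materials):

```python
def variance_explained(r):
    """Proportion of variance accounted for: r squared."""
    return r ** 2


def variance_unexplained(r):
    """Proportion of variance not accounted for: 1 minus r squared."""
    return 1 - r ** 2


r = 0.8  # correlation between mother's and daughter's heights (from the slide)
print(round(variance_explained(r), 2))    # 0.64
print(round(variance_unexplained(r), 2))  # 0.36
```

Because r is squared, a correlation of -.8 gives the same two proportions; r² carries no direction information.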
18
Some useful terms
Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent)
Coefficient of correlation is the name for “r”
Coefficient of determination is the name for “r²” (remember it is always positive: no direction info)
Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line, like standard deviation)
19
Review
20
Summary
Intercept: suggests that we can assume each salesperson will sell at least systems
Slope: as sales calls increase by one, more systems should be sold
Review
21
Homework Review
22
r = +0.92: positive, strong
The relationship between the hours worked and weekly pay is a strong positive correlation. This correlation is significant, r(3) = 0.92; p < 0.05
Intercept: 55.286
Slope: 6.0857
y' = 6.0857x + 55.286
Predicted values: 207.43, 85.71
84% of the total variance of “weekly pay” is accounted for by “hours worked”
For each additional hour worked, weekly pay will increase by $6.09
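The prediction step for this homework item can be sketched as follows, reading 55.286 as the intercept and 6.0857 as the slope (consistent with the $6.09 per hour statement). The inputs of 25 and 5 hours are an assumption: the homework questions are not reproduced in this transcript, and these values are used only because they reproduce the slide's answers of 207.43 and 85.71.

```python
INTERCEPT = 55.286  # a, from the slide
SLOPE = 6.0857      # b, from the slide


def predict_weekly_pay(hours, a=INTERCEPT, b=SLOPE):
    """Predicted weekly pay y' = b * hours + a."""
    return b * hours + a


# Assumed inputs (25 and 5 hours); see the note above.
print(round(predict_weekly_pay(25), 2))  # 207.43
print(round(predict_weekly_pay(5), 2))   # 85.71
```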
23
[Scatterplot: Wait Time (y-axis, 280 to 400) vs. Number of Operators (x-axis, 4 to 8)]
24
Critical r = 0.878
No, we do not reject the null
r = -.73: negative, strong
The relationship between wait time and number of operators working is negative and strong. This correlation is not significant, r(3) = -0.73; n.s.
As the number of operators increases, wait time decreases
Intercept: 458
Slope: -18.5
y' = -18.5x + 458
Predicted values: 365 seconds, 328 seconds
The proportion of total variance of wait time accounted for by number of operators is 54%.
For each additional operator added, wait time will decrease by 18.5 seconds
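The reject / do-not-reject decision compares the size of the observed r (ignoring its sign) with the critical r for the given degrees of freedom, where df = n - 2. A minimal sketch, using the values from this slide; the function name is illustrative, and the critical value is taken from the slide rather than computed:

```python
def is_significant(observed_r, critical_r):
    """Reject the null only if |observed r| exceeds the critical r."""
    return abs(observed_r) > critical_r


n = 5          # r(3) implies n = 5, since df = n - 2
df = n - 2
observed_r = -0.73
critical_r = 0.878  # from the slide, for df = 3

print(df, is_significant(observed_r, critical_r))  # 3 False
```

Even a fairly large r can fail to reach significance when the sample is this small, which is why the critical value for df = 3 is so high.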
25
[Scatterplot: Percent of BAs (y-axis, 21 to 39) vs. Median Income (x-axis)]
26
Critical r = 0.632
Yes, we reject the null
Predicted variable: Percent of residents with a BA degree
n = 10; df = 8
r = 0.8875: positive, strong
The relationship between median income and percent of residents with a BA degree is strong and positive. This correlation is significant, r(8) = 0.89; p < 0.05.
As median income goes up, so does percent of residents who have a BA degree
Intercept: 3.1819
Slope: 0.0005
y' = 0.0005x + 3.1819
Predicted values: 25% of residents, 35% of residents
The proportion of total variance of % of BAs accounted for by median income is 78%.
For each additional $1 in income, percent of BAs increases by .0005
27
[Scatterplot: Crime Rate (y-axis, 12 to 30) vs. Median Income (x-axis)]
28
Critical r = 0.632
No, we do not reject the null
Predicted variable: Crime Rate
n = 10; df = 8
negative, moderate
The relationship between crime rate and median income is negative and moderate. This correlation is not significant, r(8) = -0.63; n.s. [the observed r is not bigger than the critical r of 0.632].
As median income goes up, crime rate tends to go down
Intercept: 4662.5
Slope: -0.0499
y' = -0.0499x + 4662.5
Predicted values: 2,417 thefts, 1,418.5 thefts
.396 or 40%: The proportion of total variance of thefts accounted for by median income is 40%.
For each additional $1 in income, thefts go down by .0499
29
Thank you! See you next time!!