1
Yesterday
Correlation
-Definition
-Deviation Score Formula, Z score formula
-Hypothesis Test
Regression
-Intercept and Slope
-Unstandardized Regression Line
-Standardized Regression Line
-Hypothesis Tests
2
Summary
Correlation: Pearson’s r
Unstandardized Regression Line
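As a minimal sketch of the two summary quantities, the Python below computes Pearson's r and the unstandardized regression line Y' = a + bX from deviation scores. The helper name `deviation_sums` and the five data points are made up for illustration; they are not from the slides.

```python
import math

def deviation_sums(x, y):
    """Return SSX, SSY, and SP (sum of cross-products of deviation scores)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    ssx = sum((xi - mx) ** 2 for xi in x)
    ssy = sum((yi - my) ** 2 for yi in y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return ssx, ssy, sp

# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ssx, ssy, sp = deviation_sums(x, y)
r = sp / math.sqrt(ssx * ssy)                # Pearson's r
b = sp / ssx                                 # unstandardized slope
a = sum(y) / len(y) - b * sum(x) / len(x)    # intercept

print(f"r = {r:.3f}, Y' = {a:.2f} + {b:.2f} * X")
```

The same three sums (SSX, SSY, SP) feed both r and the slope, which is why the two summaries are so closely linked.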
3
Some issues with r
-Outliers have strong effects
-Restriction of range can suppress or augment r
-Correlation is not causation
-A lack of linear correlation does not mean there is no association
4
Outliers
Child 19 is lowering r
Child 18 is increasing r
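To see how much a single point can move r, here is a small sketch with invented scores (not the children from the slide). The last point sits far from the trend, so it pulls r down; a far-out point that lies along the trend would push r up instead.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    ssy = sum((yi - my) ** 2 for yi in y)
    return sp / math.sqrt(ssx * ssy)

# Hypothetical scores; the last point (6, 0) is a deliberate outlier
x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 2, 5, 4, 0]

r_with = pearson_r(x, y)
r_without = pearson_r(x[:-1], y[:-1])
print(f"with outlier: r = {r_with:.2f}; without: r = {r_without:.2f}")
```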
5
The restricted range problem
The relationship you see between X and Y may depend on the range of X. For example, the size of a child’s vocabulary has a strong positive association with the child’s age. But if all of the children in your data set are in the same grade in school, you may not see much association.
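The effect can be sketched with invented numbers: a score that rises with age plus a fixed, repeating disturbance. Age is in arbitrary units and every value here is hypothetical. Over the full range r is high; inside a narrow band of ages the same data give a much weaker r.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    ssy = sum((yi - my) ** 2 for yi in y)
    return sp / math.sqrt(ssx * ssy)

# Hypothetical data: age (arbitrary units) and a vocabulary score
ages = list(range(20))
disturbance = [3, -3, -3, 3] * 5          # fixed pattern standing in for noise
vocab = [age + d for age, d in zip(ages, disturbance)]

r_full = pearson_r(ages, vocab)

# Restrict X to a narrow band, as if sampling a single school grade
band = [(a, v) for a, v in zip(ages, vocab) if 8 <= a <= 11]
r_restricted = pearson_r([a for a, _ in band], [v for _, v in band])

print(f"full range: r = {r_full:.2f}; restricted range: r = {r_restricted:.2f}")
```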
6
Common causes, confounds
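Two variables might be associated because they share a common cause. For example, there is a positive correlation between ice cream sales and drownings, most plausibly because both rise in hot weather. In many cases there is also the question of reverse causality: Y may be causing X rather than X causing Y.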
7
Non-linearity
Some variables are not linearly related, though a relationship obviously exists.
For monotonic relationships that are not linear, we use Spearman’s rank correlation (r_s, also written as Spearman’s rho).
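One way to see the difference is to compute Spearman's coefficient as Pearson's r on the ranks. The sketch below uses an invented monotonic but non-linear relationship (Y = X cubed); the `ranks` helper is hypothetical and assumes no tied values.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    ssy = sum((yi - my) ** 2 for yi in y)
    return sp / math.sqrt(ssx * ssy)

def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

# Hypothetical data: monotonic, but clearly not linear
x = [1, 2, 3, 4, 5, 6]
y = [xi ** 3 for xi in x]

r = pearson_r(x, y)                      # linear correlation, below 1
r_s = pearson_r(ranks(x), ranks(y))      # rank correlation, exactly 1 here

print(f"Pearson r = {r:.3f}, Spearman r_s = {r_s:.3f}")
```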
8
Regression: Analyzing the “Fit”
How well does the regression line describe the data?
Assessing “fit” relies on analysis of residuals
-Are the residuals randomly distributed? (If not, perhaps a linear model is inappropriate)
-How large are the residuals? Too big? (A low correlation means big residuals)
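The link between a low correlation and big residuals can be made concrete: for a least-squares line, the sum of squared residuals equals (1 - r²) times SSY. A short check on invented data (the `fit_line` helper is hypothetical):

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for Y' = a + b * X."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    b = sp / ssx
    return my - b * mx, b

# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]   # Y - Y'
ss_resid = sum(e ** 2 for e in residuals)

my = sum(y) / len(y)
ssy = sum((yi - my) ** 2 for yi in y)
r_squared = 1 - ss_resid / ssy

print(f"SSresid = {ss_resid:.2f}, SSY = {ssy:.2f}, r^2 = {r_squared:.2f}")
```

The closer r² is to 0, the closer SSresid is to all of SSY.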
9
Assumptions of Regression
-The residuals have a mean of 0 and a variance of s²_resid
-The residuals are uncorrelated with X
-The residuals are homoscedastic (similarly sized across the range of X)
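A quick numerical check of the first two properties, on invented data. Note that for a least-squares line with an intercept they hold in the sample by construction, so in practice it is the third property (equal spread) that needs inspecting. The `fit_line` helper is hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for Y' = a + b * X."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    b = sp / ssx
    return my - b * mx, b

# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

mean_resid = sum(residuals) / len(residuals)
mx = sum(x) / len(x)
# Cross-product of X deviations with residuals; zero means r(X, resid) = 0
sp_x_resid = sum((xi - mx) * e for xi, e in zip(x, residuals))

print(f"mean residual = {mean_resid:.6f}, SP(X, resid) = {sp_x_resid:.6f}")
```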
10
Residual Diagnostics I: Graphing
11
Residual Diagnostics I: Graphing
Residual Plot (axis label: resid)
Problem: curvilinearity
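What curvilinearity looks like in the residuals can be sketched without a plot. Below, a straight line is fitted to invented data that actually follow Y = X²; the residuals come out positive at both ends and negative in the middle, a systematic U shape rather than random scatter. The `fit_line` helper is hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for Y' = a + b * X."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    b = sp / ssx
    return my - b * mx, b

# Hypothetical data: a curved relationship forced through a straight line
x = [0, 1, 2, 3, 4, 5, 6]
y = [xi ** 2 for xi in x]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

for xi, e in zip(x, residuals):
    print(f"X = {xi}: residual = {e:+.1f}")
```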
12
Residual Diagnostics I: Graphing
(Figure; axis label: Agreeableness Time 2)
13
Residual Diagnostics I: Graphing
Residual Plot (axis label: Residuals)
Problem: heteroscedasticity
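A rough numerical stand-in for this kind of plot is to compare the typical residual size at low and high X. In the invented data below the scatter grows with X, so the average absolute residual is much larger in the upper half. The `fit_line` helper and the split at X = 4 are assumptions made for illustration, not a formal test.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for Y' = a + b * X."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    b = sp / ssx
    return my - b * mx, b

# Hypothetical data: Y alternates between 1.5 * X and 0.5 * X,
# so its scatter around the trend grows in proportion to X
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.5, 1.0, 4.5, 2.0, 7.5, 3.0, 10.5, 4.0]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

low = [abs(e) for xi, e in zip(x, residuals) if xi <= 4]
high = [abs(e) for xi, e in zip(x, residuals) if xi > 4]
spread_low = sum(low) / len(low)
spread_high = sum(high) / len(high)

print(f"mean |residual|: low X = {spread_low:.2f}, high X = {spread_high:.2f}")
```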
14
Regression: Analyzing the “Fit”
How well does the regression line describe the data?
Assessing “fit” relies on analysis of residuals
-Are the residuals randomly distributed? (If not, perhaps a linear model is inappropriate)
-How large are the residuals? Too big? (A low correlation means big residuals)
Residual plots
ANOVA
15
Regression ANOVA
SSY = SSmodel + SSresid
(Figure labels: Y, Y')
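Written out, the partition of the sum of squares of Y looks as follows. This is the standard decomposition for a least-squares line with an intercept; the notation (Y' for the predicted value, \bar{Y} for the mean of Y) is our assumption about what the slide's figure showed.

```latex
\begin{aligned}
\sum_i (Y_i - \bar{Y})^2 &= \sum_i (Y'_i - \bar{Y})^2 + \sum_i (Y_i - Y'_i)^2 \\
SS_Y &= SS_{\text{model}} + SS_{\text{resid}}
\end{aligned}
```

The first term on the right is the predicted deviation of each point from the mean, and the second is the residual (unpredicted) deviation.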
16
Regression ANOVA
Source   SS   df   s²
Model
Error
Total
F = t²
“the amount of variance in Y explained by our model”
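The F = t² relationship can be checked numerically. The sketch below builds the ANOVA quantities for a simple regression on invented data and compares F with the squared t statistic for the slope, using the usual standard error of b, sqrt(s²_resid / SSX). Variable names are our own.

```python
import math

# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
ssx = sum((xi - mx) ** 2 for xi in x)
ssy = sum((yi - my) ** 2 for yi in y)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
b = sp / ssx

ss_model = b ** 2 * ssx           # equals the sum of (Y' - mean Y)^2
ss_resid = ssy - ss_model
ms_model = ss_model / 1           # df for the model = 1
ms_resid = ss_resid / (n - 2)     # df for error = n - 2

F = ms_model / ms_resid
t = b / math.sqrt(ms_resid / ssx)  # t statistic for the slope
r_squared = ss_model / ssy         # proportion of variance in Y explained

print(f"F = {F:.3f}, t^2 = {t ** 2:.3f}, r^2 = {r_squared:.2f}")
```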
17
Exercise
Fill in the ANOVA table

X   Y
1   3
1   4
5   4
5   6
9   6
9   7

Mean: X = 5, Y = 5
Stdevp:
r =
b = 0.375
a = 3.125
18
Exercise
Columns: X, Y, Y' (predicted value), (Y - Y')² (residual, the unpredicted deviation), (Y' - mean of Y)² (the predicted deviation)

X   Y   Y'   (Y - Y')²   (Y' - mean of Y)²
1   3
1   4
5   4
5   6
9   6
9   7

SSresid = …
SSmodel = …

Mean: X = 5, Y = 5
Stdevp:
r =
b = 0.375
a = 3.125
19
Exercise
Columns: X, Y, Y' (predicted value), (Y - Y')² (residual, the unpredicted deviation), (Y' - mean of Y)² (the predicted deviation)

X   Y   Y'    (Y - Y')²   (Y' - mean of Y)²
1   3   3.5   (-0.5)²     (-1.5)²
1   4   3.5   (0.5)²      (-1.5)²
5   4   5     (-1)²       (0)²
5   6   5     (1)²        (0)²
9   6   6.5   (-0.5)²     (1.5)²
9   7   6.5   (0.5)²      (1.5)²

SSresid = …
SSmodel = …

Mean: X = 5, Y = 5
Stdevp:
r =
b = 0.375
a = 3.125
20
Regression ANOVA
Source   SS   df   s²   F
Model
Error
Total
21
Regression ANOVA
Source   SS   df   s²     F
Model     9    1   9      12
Error     3    4   0.75
Total    12    5
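The table can be reproduced in a few lines. This sketch assumes the six exercise points are (1, 3), (1, 4), (5, 4), (5, 6), (9, 6), (9, 7), which is our reading of the flattened exercise table; it is consistent with the stated means, b = 0.375, and a = 3.125.

```python
# Exercise data as we read it from the slides (an assumption, see above)
x = [1, 1, 5, 5, 9, 9]
y = [3, 4, 4, 6, 6, 7]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
ssx = sum((xi - mx) ** 2 for xi in x)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
b = sp / ssx                      # slope
a = my - b * mx                   # intercept

predicted = [a + b * xi for xi in x]                            # Y'
ss_model = sum((p - my) ** 2 for p in predicted)                # predicted deviations
ss_resid = sum((yi - p) ** 2 for yi, p in zip(y, predicted))    # residuals
ss_total = sum((yi - my) ** 2 for yi in y)

df_model, df_error, df_total = 1, n - 2, n - 1
ms_model = ss_model / df_model
ms_error = ss_resid / df_error
F = ms_model / ms_error

print("Source   SS     df   s2     F")
print(f"Model    {ss_model:<6g} {df_model:<4} {ms_model:<6g} {F:g}")
print(f"Error    {ss_resid:<6g} {df_error:<4} {ms_error:<6g}")
print(f"Total    {ss_total:<6g} {df_total}")
```

Since SSmodel / SSY = 9 / 12, the model accounts for 75% of the variance in Y under this reading of the data.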