Published by Thomasine Thornton. Modified over 6 years ago.
2
Review
4
Statistics Needed
Need to find the best place to draw the regression line on a scatter plot.
Need to quantify the cluster of scores around this regression line (i.e., the correlation coefficient).
5
Computational formula
6
Correlation
7
Hypothesis testing of r
Is there a significant relationship between X and Y (or are they independent)?
Are two independent correlations significantly different from each other?
10
Regression Equation
Y = a + bX
Where:
Y = value predicted from a particular X value
a = point at which the regression line intersects the Y axis
b = slope of the regression line
X = X value for which you wish to predict a Y value
11
Regression
12
How to draw the regression line
13
Hypothesis Testing
Have learned:
How to calculate r as an estimate of the relationship between two variables
How to calculate b as a measure of the rate of change of Y as a function of X
Next: determine whether these values are significantly different from 0
14
Testing b
The significance tests for r and b are equivalent.
If X and Y are related (r), then it must be true that Y varies with X (b).
It is important to learn the significance test for b because it carries over to multiple regression.
15
Calculate t-observed
t = b / Sb
b = Slope
Sb = Standard error of slope
17
Multiple Regression
Good news! No math: too complicated to do by hand.
Bad news! Almost all conceptual.
18
Causal Models
X (IV) is the cause of Y (DV)
19
Causal Models
X (IV) is the cause of Y (DV)
This is an assumption: causation is not demonstrated with statistics!
X → Y
20
Remember
Person     Candy       Depression
Charlie    5           55
Augustus   7           43
Veruca     4           59
Mike       [missing]   108
Violet     [missing]   65
21
Remember
Y = 127 − 13.26(X)
COV = -30.5
N = 5
r = -.81
Sx = 1.52
Sy = 24.82
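The summary values above are tied together by two standard identities, r = COV / (Sx · Sy) and b = COV / Sx². The sketch below checks the slide's numbers under the assumption that Sx and Sy are sample standard deviations; the function names are hypothetical. Because Sx is rounded to 1.52 here, the slope comes out close to, but not exactly, the slide's -13.26.

```python
def correlation_from_cov(cov_xy, s_x, s_y):
    """Pearson r from the covariance and the two standard deviations."""
    return cov_xy / (s_x * s_y)


def slope_from_cov(cov_xy, s_x):
    """Least-squares slope b from the covariance and the variance of X."""
    return cov_xy / s_x ** 2


# Values taken from the candy/depression slide (already rounded there).
cov_xy, s_x, s_y = -30.5, 1.52, 24.82

r = correlation_from_cov(cov_xy, s_x, s_y)
b = slope_from_cov(cov_xy, s_x)

print(round(r, 2))  # -0.81, matching the slide
print(round(b, 2))  # near -13.2; differs slightly from -13.26 due to rounding of Sx
```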
22
Causal Models
Candy → Depression (b = -13.26)
23
Example
Data collected from 15 people:
Salary
Years since Ph.D.
Publications
24
Example Predict the salary of a person from the time since their Ph.D. (in years)
26
Example
Predict the salary of a person from the time since their Ph.D. (in years)
Y = 51,670 + 1,218(X)
What do these mean?
$51,670 is what a person tends to earn right after graduating (Years = 0).
Each year after that, a person's salary increases by $1,218.
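As a quick illustration of reading the intercept and slope, here is a minimal sketch that plugs values into the equation from this slide. The function name `predict_salary` is hypothetical; the two constants are the ones stated on the slide.

```python
def predict_salary(years_since_phd):
    """Predicted salary from the single-predictor equation on the slide."""
    intercept = 51_670  # predicted salary at Years = 0
    slope = 1_218       # predicted increase per additional year
    return intercept + slope * years_since_phd


print(predict_salary(0))   # 51670
print(predict_salary(10))  # 63850
```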
27
Causal Models
Years since Ph.D. → Salary (b = 1,218)
28
Example Predict the salary of a person from the number of publications they have
30
Causal Models
Publications → Salary (b = 334)
31
What if we have two IVs?
It is possible to use two IVs at the same time to predict a DV.
Use both publications and years since Ph.D. to predict salary.
32
Causal Models
Publications → Salary (b = 334)
Years since Ph.D. → Salary (b = 1,218)
33
Causal Models
Publications → Salary (b = 334)
Years since Ph.D. → Salary (b = 1,218)
This is how to interpret the values if the IVs are independent.
34
Causal Models
Publications → Salary (b = 334)
Years since Ph.D. → Salary (b = 1,218)
Problem: the information provided by Publications and Years is probably somewhat redundant.
36
Causal Models
Publications → Salary
Years since Ph.D. → Salary
Publications ↔ Years since Ph.D. (r = .66)
37
Causal Models
Publications → Salary
Years since Ph.D. → Salary
Publications ↔ Years since Ph.D. (r = .66)
Must estimate the regression coefficients so that this relationship is taken into account (called “partial regression coefficients”).
38
Regression Coefficients
Basic logic is exactly the same as normal regression.
Least squares.
Has one intercept, and each of the IVs has one slope.
39
Regression Coefficients
b0 = the intercept
b1 = the slope of the first IV
b2 = the slope of the second IV
bp = the slope of the pth IV
Y = b0 + b1(X1) + b2(X2) + ... + bp(Xp)
40
Example Predict the salary of a person from the number of publications they have and the years since they got their Ph.D.
42
Regression Coefficients
Current Problem
Y = Salary
X1 = Years since Ph.D.; X2 = Publications
43
Causal Models
Publications → Salary (b = 122)
Years since Ph.D. → Salary (b = 977)
Publications ↔ Years since Ph.D. (r = .66)
44
Regression Coefficients
Current Problem
Y = Salary
X1 = Years since Ph.D.; X2 = Publications
What does a person who just graduated (years = 0) with 2 publications likely earn?
46
Regression Coefficients
Current Problem
Y = Salary
X1 = Years since Ph.D.; X2 = Publications
What does a person who graduated 10 years ago with no publications make?
48
Question
Current Problem
Y = Salary
X1 = Years since Ph.D.; X2 = Publications
Which IV has a greater “effect” on salary?
51
Standardized Regression Coefficients
Conceptually the same as standardizing all variables and then doing the regression analysis.
Why does this work?
Example with Years predicting Salary.
52
Standardized Regression Coefficients
With a single predictor, unstandardized:
Years since Ph.D. → Salary (b = 1,218)
With a single predictor, standardized:
Years since Ph.D. → Salary (β = .71)
54
Standardized Regression Coefficients
With a single IV, the correlation between years and salary (r = .71) is the SAME as the standardized regression weight!
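One way to see why the two agree: with a single predictor, the standardized weight is the raw slope rescaled by the two standard deviations, β = b · Sx / Sy, which is algebraically equal to r. The standard deviations for the salary data are not given in these slides, so the sketch below uses the candy/depression values from earlier in the deck instead; the function name is hypothetical.

```python
def standardized_slope(b, s_x, s_y):
    """Convert a raw slope to a standardized weight: beta = b * Sx / Sy."""
    return b * s_x / s_y


# Candy/depression example: b = -13.26, Sx = 1.52, Sy = 24.82, r = -.81
beta = standardized_slope(-13.26, 1.52, 24.82)

print(round(beta, 2))  # -0.81, the same value as r in that example
```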
55
Standardized Regression Coefficients
β1 = standardized regression coefficient of the first IV
β2 = standardized regression coefficient of the second IV
βp = standardized regression coefficient of the pth IV
β0 = intercept (always = 0)
56
Remember
Publications → Salary (b = 122)
Years since Ph.D. → Salary (b = 977)
Publications ↔ Years since Ph.D. (r = .66)
58
Standardized
Publications → Salary (β = .21)
Years since Ph.D. → Salary (β = .57)
Publications ↔ Years since Ph.D. (r = .66)
59
Regression Coefficients
Current Problem
Yz = Standardized Salary
Z1 = Years since Ph.D. (Standardized)
Z2 = Publications (Standardized)
Which IV has a greater “effect” on salary?
Can interpret in SD units.
60
Regression Coefficients
Current Problem
Yz = Standardized Salary
Z1 = Years since Ph.D. (Standardized)
Z2 = Publications (Standardized)
What would you predict the salary to be if a person's Years = 1.2 and Publications = -.50?
Interpret what these values mean!
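A minimal sketch of this prediction, assuming the standardized weights shown earlier in the deck (.57 for Years, .21 for Publications) and an intercept of 0. The function and constant names are hypothetical.

```python
BETA_YEARS = 0.57         # standardized weight for Years since Ph.D.
BETA_PUBLICATIONS = 0.21  # standardized weight for Publications


def predict_z_salary(z_years, z_publications):
    """Predicted standardized salary; the standardized intercept is 0."""
    return BETA_YEARS * z_years + BETA_PUBLICATIONS * z_publications


z_salary = predict_z_salary(1.2, -0.50)
print(round(z_salary, 3))  # 0.579
```

Read in SD units: a person 1.2 SDs above the mean on years and half an SD below the mean on publications is predicted to earn about .58 SDs above the mean salary.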
61
Testing the full model How well does the model predict?
The fit test for the full model and its significance are equal for both standardized and unstandardized models
62
Person   Z1      Z2      ZY
1        -1.26   .35     -.83
2        [row incomplete in source: -.53, -.24]
3        -.63    -1.40   -.84
4        .63     1.23    1.56
5        [row incomplete in source: 1.26, .36]
63
Person   Z1      Z2      ZY      Pred
1        -1.26   .35     -.83    -.477
2        [row incomplete in source: -.53, -.24, -.287]
3        -.63    -1.40   -.84    -1.097
4        .63     1.23    1.56    1.01
5        [row incomplete in source: 1.26, .36, .862]
64
Person   Z1      Z2      ZY      Pred
1        -1.26   .35     -.83    -.477
2        [row incomplete in source: -.53, -.24, -.287]
3        -.63    -1.40   -.84    -1.097
4        .63     1.23    1.56    1.01
5        [row incomplete in source: 1.26, .36, .862]
r = .902
65
Multiple R
67
Testing for Significance
Once an equation is created (standardized or unstandardized), it is typically tested for significance.
Two levels:
1) Level of each regression coefficient
2) Level of the entire model
69
Multiple R
Commonly used as R²
Pros and Cons
Can be tested for significance
Does the set of variables (taken together) predict Y at better than chance levels?
H1: R* > 0
H0: R* = 0
71
Significance testing for Multiple R
F = (R² / p) / ((1 − R²) / (N − p − 1))
p = number of predictors
N = total number of observations
74
Significance testing for Multiple R
Fcrit (Page # 737)
Need two df:
Numerator df = p
Denominator df = N − p − 1
75
Significance testing for Multiple R
Fcrit
Need two df:
Numerator df = p
Denominator df = N − p − 1
Fcrit (2, 2) = 19.00
76
Multiple R
If F > Fcrit, reject H0 and accept H1.
If F ≤ Fcrit, fail to reject H0.
Current problem: fail to reject H0.
These two variables do not significantly predict the outcome.
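The decision for the five-person example can be retraced with a short sketch. It assumes the standard F statistic for a multiple correlation, F = (R² / p) / ((1 − R²) / (N − p − 1)), which matches the degrees of freedom listed on the slides, and it takes R = .902 and Fcrit(2, 2) = 19.00 from the slides. The function name is hypothetical.

```python
def f_for_multiple_r(r_squared, n, p):
    """F statistic for testing whether multiple R differs from 0.

    Numerator df = p, denominator df = n - p - 1.
    """
    return (r_squared / p) / ((1 - r_squared) / (n - p - 1))


R = 0.902      # multiple R from the five-person example
F_CRIT = 19.00  # critical F for (2, 2) df, as given on the slide

f_obs = f_for_multiple_r(R ** 2, n=5, p=2)
reject_h0 = f_obs > F_CRIT

print(round(f_obs, 2))  # about 4.36
print(reject_h0)        # False: fail to reject H0
```

With only five observations the denominator df is 2, so even a large R fails to reach significance.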
78
Practice
The teaching salary example
Based on 15 people
Two IVs
79
Significance testing for Multiple R
p = number of predictors
N = total number of observations
80
Significance testing for Multiple R
Fcrit (2, 12) = 3.89
81
Multiple R
If F > Fcrit, reject H0 and accept H1.
If F ≤ Fcrit, fail to reject H0.
Current problem: accept H1.
These two variables do significantly predict the outcome.
83
Detour
Moving back to issues of correlation.
This will help with . . .
84
Testing for Significance
Once an equation is created (standardized or unstandardized), it is typically tested for significance.
Two levels:
1) Level of each regression coefficient
2) Level of the entire model
86
How strong is the relationship between publications and salary if we partial out the effect of years?
What this is saying:
87
Salary
88
Salary Publications
89
r²SP = .35
[Venn diagram: Salary and Publications]
90
r²SP = .35
[Venn diagram: Salary and Publications]
r² is a ratio = Variance explained / Total variance
Total variance of Salary = 1 (standardized)
91
r²SP = .35 (shared variance); 1 − r²SP = .65 (remaining variance of Salary)
[Venn diagram: Salary and Publications]
92
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
93
[Venn diagram: Salary, Publications, Years; regions a, b, c, e; one region marked “?”]
94
[Venn diagram: Salary, Publications, Years; regions a, b, c, e; one region marked “?”]
How strong is the relationship between publications and salary if we partial out the effect of years?
95
Semipartial correlation of publications and salary
[Venn diagram: Salary, Publications, Years]
96
Semipartial correlation of publications and salary
[Venn diagram: Salary, Publications, Years]
Multiple R² = a + b + c
97
Multiple R
98
R² = .53 or a + b + c
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
99
R² = .53 or a + b + c
r²SY = .50 or b + c
r²SP = .35 or a + c
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
100
R² = .53 or a + b + c
r²SY = .50 or b + c
r²SP = .35 or a + c
So what is just “a”?
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
101
R² = .53 or a + b + c
r²SY = .50 or b + c
r²SP = .35 or a + c
So what is just “a”?
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
a = (a + b + c) − (b + c)
102
R² = .53 or a + b + c
r²SY = .50 or b + c
r²SP = .35 or a + c
So what is just “a”?
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
a = (a + b + c) − (b + c), or R² − r²SY
103
R² = .53 or a + b + c
r²SY = .50 or b + c
r²SP = .35 or a + c
So what is just “a”?
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
a = R² − r²SY = .53 − .50 = .03
Thus the semipartial correlation = √.03 = .17
104
R² = .53 or a + b + c
r²SY = .50 or b + c
r²SP = .35 or a + c
[Venn diagram: Salary, Publications, Years; regions a, b, c, e]
What is the correlation between years and salary controlling for publications?
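The subtraction logic can be sketched in a few lines. This assumes the squared semipartial correlation of one predictor equals the full-model R² minus the r² of the other predictor alone, as in the Venn diagram, and that the sign is positive (both weights are positive in this example). The names are hypothetical; the three input values come from the slides.

```python
import math


def semipartial_r(r2_full, r2_other_predictor):
    """Semipartial correlation: square root of the unique variance explained."""
    return math.sqrt(r2_full - r2_other_predictor)


R2_FULL = 0.53   # a + b + c
R2_YEARS = 0.50  # b + c (Salary with Years)
R2_PUBS = 0.35   # a + c (Salary with Publications)

sr_pubs = semipartial_r(R2_FULL, R2_YEARS)  # region a: Publications beyond Years
sr_years = semipartial_r(R2_FULL, R2_PUBS)  # region b: Years beyond Publications

print(round(sr_pubs, 2))   # 0.17
print(round(sr_years, 2))  # 0.42
```

By the same reasoning, region b is R² − r²SP = .53 − .35 = .18, so the semipartial correlation for years would be about .42 under these assumptions.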