G Lect 2w
»Review of expectations
»Conditional distributions
»Regression line
»Marginal and conditional distributions
G Multiple Regression in Psychology

Review of Expectation Operators
Let $X$ and $Y$ be random variables and $k$ be some constant
»$E(kX) = kE(X) = k\mu_X$
»$E(X+k) = E(X)+k = \mu_X + k$
»$E(X+Y) = E(X)+E(Y) = \mu_X + \mu_Y$
»$E(X-Y) = E(X)-E(Y) = \mu_X - \mu_Y$
»$E[(X-\mu_X)^2] = V(X) = \sigma_X^2$
»$V(kX) = k^2 V(X) = k^2\sigma_X^2$
»$V(X+k) = V(X) = \sigma_X^2$
»$E[(X-\mu_X)(Y-\mu_Y)] = Cov(X,Y) = \sigma_{XY}$
»$Cov(k_1+X, k_2+Y) = \sigma_{XY}$
»$Cov(k_1X, k_2Y) = k_1k_2\sigma_{XY}$
»$V(k_1X+k_2Y) = k_1^2\sigma_X^2 + k_2^2\sigma_Y^2 + 2k_1k_2\sigma_X\sigma_Y\rho_{XY}$
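The variance rule for a weighted sum also holds exactly for sample moments, which makes it easy to check numerically. The sketch below uses small made-up data (not from the lecture) and hypothetical helper names `var` and `cov`; it writes the cross term as $2k_1k_2\,Cov(X,Y)$, which equals $2k_1k_2\sigma_X\sigma_Y\rho_{XY}$.

```python
from statistics import mean

def var(xs):
    """Sample variance with an n-1 denominator."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cov(xs, ys):
    """Sample covariance with an n-1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical illustration data, not taken from the lecture.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 4.0, 9.0]
k1, k2 = 3.0, -2.0

# Variance of k1*X + k2*Y computed directly from the combined scores ...
combo = [k1 * a + k2 * b for a, b in zip(x, y)]
direct = var(combo)

# ... and from the rule V(k1 X + k2 Y) = k1^2 V(X) + k2^2 V(Y) + 2 k1 k2 Cov(X, Y).
by_rule = k1**2 * var(x) + k2**2 * var(y) + 2 * k1 * k2 * cov(x, y)

# Adding constants leaves the covariance unchanged.
shifted_cov = cov([a + 5 for a in x], [b - 3 for b in y])
```

The two routes to the variance, `direct` and `by_rule`, agree up to floating-point rounding, and `shifted_cov` matches `cov(x, y)`.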

Example
Suppose we want to contrast POMS anxious and depressed moods using $A-D$. What is the expected variance?
In the sample on day 29,
»$Var(Anx)=1.129$, $Var(Dep)=0.420$, $Corr(A,D)=0.64$
»$Cov(A,D) = .64 \times (1.129 \times .420)^{1/2} = .441$
»$Var(1 \cdot A+(-1) \cdot D) = 1(1.129)+1(.420)+(2)(-1)(.441) = .667$
Expectation operators are useful with both population and sample values.
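The arithmetic for this example can be reproduced in a few lines. This is a minimal sketch using the day-29 sample values quoted above; the function names are hypothetical. Carrying the unrounded covariance gives a variance of about 0.67 for the difference score.

```python
from math import sqrt

def cov_from_corr(r, var_x, var_y):
    """Covariance implied by a correlation and two variances."""
    return r * sqrt(var_x * var_y)

def var_linear_combo(k1, k2, var_x, var_y, cov_xy):
    """V(k1*X + k2*Y) = k1^2 V(X) + k2^2 V(Y) + 2 k1 k2 Cov(X, Y)."""
    return k1**2 * var_x + k2**2 * var_y + 2 * k1 * k2 * cov_xy

# Sample values from the slide (POMS anxious and depressed mood, day 29).
var_anx, var_dep, corr_ad = 1.129, 0.420, 0.64

cov_ad = cov_from_corr(corr_ad, var_anx, var_dep)
var_diff = var_linear_combo(1, -1, var_anx, var_dep, cov_ad)  # Var(A - D)
var_sum = var_linear_combo(1, 1, var_anx, var_dep, cov_ad)    # Var(A + D), for contrast

print(f"Cov(A,D) = {cov_ad:.3f}")
print(f"Var(A-D) = {var_diff:.3f}")
```

Because the two moods are positively correlated, the difference score has less variance than the sum: the covariance term is subtracted for $A-D$ and added for $A+D$.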

Second example
Suppose that $Z_1$ and $Z_2$ are independent random variables
»Each with mean 0
»Each with variance 1
Suppose that someone computes $W = Z_1 + Z_2$.
What is $E(W)$?
What is $V(W)$?
How strongly will $W$ be correlated with $Z_1$?
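One way to work these questions is to apply the expectation rules to weight vectors: for $W = a'Z$, $V(W) = a'\Sigma a$, where $\Sigma$ is the covariance matrix of $(Z_1, Z_2)$. The sketch below is an illustration under the stated assumptions (independence, means 0, variances 1); the helper name `bilinear` is hypothetical.

```python
from math import sqrt

# Covariance matrix of (Z1, Z2): independent, so off-diagonals are 0; variances are 1.
sigma = [[1.0, 0.0],
         [0.0, 1.0]]
means = [0.0, 0.0]

def bilinear(a, s, b):
    """Return a' S b, the covariance between the combinations a'Z and b'Z."""
    return sum(a[i] * s[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

w = [1.0, 1.0]    # weights defining W = Z1 + Z2
z1 = [1.0, 0.0]   # weights defining Z1 itself

e_w = sum(wi * mi for wi, mi in zip(w, means))   # E(W) = E(Z1) + E(Z2)
v_w = bilinear(w, sigma, w)                      # V(W) = V(Z1) + V(Z2) + 2*Cov
cov_w_z1 = bilinear(w, sigma, z1)                # Cov(W, Z1) = V(Z1) + Cov(Z2, Z1)
corr_w_z1 = cov_w_z1 / sqrt(v_w * bilinear(z1, sigma, z1))
```

Under these assumptions $E(W)=0$, $V(W)=2$, and the correlation of $W$ with $Z_1$ is $1/\sqrt{2}$, about .71.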

Reminder
Standard deviations and variances are particularly useful when variables are normally distributed.
Expectation operators assume that $f(X)$, $f(Y)$ and $f(X,Y)$ can be known, but they do not assume that these describe bell-shaped or normal distributions.
Covariances and correlations can be estimated with non-normal variables, but be careful about statistical tests.

E(Y|X) for Y: Depression and X: Anxious Mood
»Note that the conditional distribution of depression shifts with increasing values of anxious mood.
»The linear model is only an approximation.

Linear Regression
Approximate $E(Y|X)$ with a linear model
»$E(Y|X) = b_0 + b_1X$
»$Y = b_0 + b_1X + e$
We choose values of $b_0$ and $b_1$ that minimize the variance of $e$
»Ordinary Least Squares (OLS)
Y = X + e
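The least-squares idea can be illustrated with a small closed-form fit. This is a sketch on made-up data (not the lecture's POMS data), with hypothetical function names `ols_fit` and `sse`; it checks that moving either coefficient away from the OLS solution increases the sum of squared residuals.

```python
from statistics import mean

def ols_fit(xs, ys):
    """Closed-form simple regression: returns (b0, b1)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

def sse(xs, ys, b0, b1):
    """Sum of squared residuals e = y - (b0 + b1*x)."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical illustration data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.3, 1.1, 1.0, 1.9]

b0, b1 = ols_fit(xs, ys)
best = sse(xs, ys, b0, b1)

# Any nearby line fits worse in the least-squares sense.
worse = [sse(xs, ys, b0 + d0, b1 + d1)
         for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]]

# With an intercept in the model, the fitted residuals sum to zero.
resid_sum = sum(y - (b0 + b1 * x) for x, y in zip(xs, ys))
```

Because the fitted residuals average zero, minimizing their sum of squares is the same as minimizing their variance once the intercept is chosen.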

Interpretation of Linear Regression Results
The expected value of depression for a person with Anxiety of zero is $b_0$
»$E(Y|X) = b_0 + b_1X$
»When $X=0$, $E(Y|X=0) = b_0 + b_1 \cdot 0 = b_0$
For each unit change in Anxiety, depression is expected to increase by .391.
»Compare $X=1$ and $X=2$
»Compare $X=2$ and $X=3$
»Increase is always .391
Sometimes we have to rescale X to make the interpretation of coefficients clearer.
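The constant-increase property and the effect of rescaling can be shown in a few lines. In this sketch the slope .391 is the value quoted above, while the intercept `B0` and the centering point `X_MEAN` are placeholder assumptions, since neither value is given here.

```python
def expected_y(x, b0, b1):
    """Linear approximation to E(Y | X = x)."""
    return b0 + b1 * x

B1 = 0.391    # slope reported in the lecture
B0 = 0.0      # placeholder intercept (assumption, not the reported estimate)

# The step from X to X+1 is the same wherever it starts.
steps = [expected_y(x + 1, B0, B1) - expected_y(x, B0, B1) for x in (0, 1, 2)]

# Rescaling example: centering X at a hypothetical mean.
# With Xc = X - X_MEAN, the model becomes E(Y|Xc) = (B0 + B1*X_MEAN) + B1*Xc,
# so the new intercept is the expected Y at the mean of X and the slope is unchanged.
X_MEAN = 1.5  # hypothetical value
b0_centered = expected_y(X_MEAN, B0, B1)
```

Centering is one common rescaling: it leaves the slope alone but turns the intercept into the expected value of Y for a person at the chosen reference point, which is often easier to interpret than the value at X = 0.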

Sample Estimates vs. Population Parameters
Assume the following
»Sample is representative of a well-defined population
»Observations are independent
»Variance of residuals does not depend on X
»Residuals are distributed $N(0, \sigma^2)$
Then
»Standard errors of $b_0$ and $b_1$ can be estimated.
»Ratio of $b$ to its standard error is distributed as a t statistic on $(n-2)$ df
»Confidence bounds for regression weights can be estimated.
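Written out, the test statistic and confidence bounds take the following standard form. The symbols $\beta_j$ (population coefficient), $\widehat{SE}$ (estimated standard error) and $\alpha$ (one minus the confidence level) are notation assumed here rather than taken from the slide.

```latex
\begin{align*}
t &= \frac{b_j - \beta_j}{\widehat{SE}(b_j)} \sim t_{n-2}, \qquad j = 0, 1, \\
\text{CI}_{1-\alpha}: &\quad b_j \pm t_{1-\alpha/2,\,n-2}\,\widehat{SE}(b_j).
\end{align*}
```

Under the null hypothesis $\beta_j = 0$ the statistic reduces to the ratio of $b_j$ to its standard error.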

Example revisited (estimates from Excel)
Data are consistent with an intercept ($b_0$) of zero.
Data are consistent with a population slope in the range (.28, .51).

Note on distribution of residuals
If the distribution of Y is normal, then residuals tend to be normal.
Even if the distribution of Y is not normal, residuals may more closely resemble a normal distribution.
[Figure panels: Distribution (Y), Distribution (e)]

Regression Estimates from Sample Moments
Ordinary Least Squares (OLS) estimates always satisfy the following relations
Let $S_X$, $S_Y$, and $S_{XY}$ be the sample standard deviations of X and Y, and the covariance
»The estimated standard error of the slope $b_1$ is given by
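In terms of these sample moments, the standard OLS identities for simple regression can be written as below. This is a sketch in textbook form and may not match the slide's exact layout; $n$ (sample size), $\bar{X}$, $\bar{Y}$ (sample means), $r_{XY}$ (sample correlation), $e_i$ (fitted residuals) and $S_e$ (residual standard deviation) are notation assumed here.

```latex
\begin{align*}
b_1 &= \frac{S_{XY}}{S_X^2} = r_{XY}\,\frac{S_Y}{S_X}, \\
b_0 &= \bar{Y} - b_1\bar{X}, \\
S_e^2 &= \frac{\sum_i e_i^2}{n-2} = \frac{n-1}{n-2}\left(S_Y^2 - b_1^2 S_X^2\right), \\
\widehat{SE}(b_1) &= \frac{S_e}{S_X\sqrt{n-1}}.
\end{align*}
```

The last line shows why the slope is estimated more precisely when the residual spread is small, the spread of X is large, or the sample is large.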

Marginal from Conditional Expectations
Suppose that 60% of the NYU undergrads were female (X=1) and 40% were male (X=0).
Suppose E(Y|X=1)=5’5” and E(Y|X=0)=5’9”
What is E(Y)?
»E(Y|X=1)P(X=1)+E(Y|X=0)P(X=0) = 5’5”(.60)+5’9”(.40) = 5’6.6”
»E(Y)=E[E(Y|X)], where the expectation outside the brackets is over the distribution of X and the expectation inside is over the distribution of Y (for each given X)
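The same calculation can be sketched as a probability-weighted average of the conditional means. The code below converts the heights to inches and uses the group proportions from the example; the function names `marginal_mean` and `feet_inches` are hypothetical.

```python
def marginal_mean(cond_means, probs):
    """E(Y) = sum over x of E(Y | X = x) * P(X = x)."""
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(cond_means[x] * probs[x] for x in probs)

def feet_inches(total_inches):
    """Split a height in inches into (feet, remaining inches)."""
    feet = int(total_inches // 12)
    return feet, total_inches - 12 * feet

# Conditional mean heights in inches: X=1 female (5'5"), X=0 male (5'9").
cond_means = {1: 5 * 12 + 5, 0: 5 * 12 + 9}
probs = {1: 0.60, 0: 0.40}

e_y = marginal_mean(cond_means, probs)   # 65(.60) + 69(.40)
feet, inches = feet_inches(e_y)
```

The marginal mean sits closer to the female mean because that group carries more of the probability weight.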