Review Guess the correlation. -2.0 -0.9 -0.1 0.1 0.9.


1 Review
Guess the correlation.
-2.0   -0.9   -0.1   0.1   0.9

2 Review
Calculate the correlation between X and Y.
zX = [ ]   zY = [ ]
-1.21   -.30   -.24   -.01
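The z-score vectors on this slide did not survive in the transcript, so the following sketch uses made-up data. It assumes the correlation is computed as the mean product of z-scores, with z-scores based on the population standard deviation (dividing by n); a course that divides by n - 1 throughout gets the same r as long as it is consistent.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values using the population SD (divide by n)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """Pearson r as the mean product of paired z-scores."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return mean([a * b for a, b in zip(zx, zy)])

# Hypothetical data, not the values from the slide.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = correlation(x, y)
print(round(r, 2))
```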

3 Review
Which statement is true?
If the correlation is zero, then the variables are independent
If the correlation is nonzero, then the variables are independent
If two variables are independent, then their correlation is zero
If two variables have a nonlinear relationship, then their correlation is zero
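A small numeric check can help when reasoning about these statements. The sketch below uses two made-up data sets in which Y is fully determined by X through Y = X²: one where the X values are symmetric around zero and one where they are all non-negative. The helper name `pearson_r` is an illustrative choice.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from deviations around the means."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Y depends entirely on X, yet the linear correlation is zero.
symmetric_x = [-2, -1, 0, 1, 2]
r_symmetric = pearson_r(symmetric_x, [x ** 2 for x in symmetric_x])

# Same nonlinear rule, but the correlation is clearly nonzero.
positive_x = [0, 1, 2, 3, 4]
r_positive = pearson_r(positive_x, [x ** 2 for x in positive_x])

print(r_symmetric, round(r_positive, 2))
```

In the first data set the variables are dependent but uncorrelated; in the second, the relationship is nonlinear but the correlation is far from zero.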

4 Regression 11/5

5 Regression
Correlation can tell how to predict one variable from another
What about multiple predictor variables?
  Explaining Y using X1, X2, … Xm
  Income based on age, education
  Memory based on pre-test, IQ
  Social behavior based on personality dimensions
Regression
  Finds best combination of predictors to explain outcome variable
  Determines unique contribution of each predictor
  Can test whether each predictor has reliable influence

6 Linear Prediction
One predictor
  Draw a line through data: \( \hat{Y} = b_0 + b_1 X \)
Multiple predictors
  \( \hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_m X_m \)
  Each predictor has linear effect
  Effects of different predictors simply add together
Intercept: b0
  Predicted value of Y when all Xs are zero
Regression coefficients: bi
  Influence of Xi
  Sign tells direction; magnitude tells strength
  Can be any value (not standardized like correlation)

7 Example
Predict income from education and intelligence
Y = $k/yr
X1 = years past HS
X2 = IQ
Regression equation: \( \hat{Y} = -70 + 10 X_1 + 1 X_2 \)
4-year college, IQ = 120: -70 + 10*4 + 1*120 = 90 → $90k/yr
2-year college, IQ = 90: -70 + 10*2 + 1*90 = 40 → $40k/yr
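The prediction step can be sketched in a few lines. The intercept (-70) and education coefficient (10) used here are assumptions: they are the values consistent with the slide's two worked results ($90k and $40k) given an IQ coefficient of 1. The function name `predict` is illustrative.

```python
def predict(intercept, coefficients, predictors):
    """Linear prediction: intercept plus the sum of coefficient * predictor."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))

# Assumed coefficients (inferred from the worked results, not stated outright).
B0 = -70.0          # intercept, in $k/yr
B = [10.0, 1.0]     # per year past HS, per IQ point

four_year = predict(B0, B, [4, 120])   # 4-year college, IQ = 120
two_year = predict(B0, B, [2, 90])     # 2-year college, IQ = 90
print(four_year, two_year)
```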

8 Finding Regression Coefficients
Goal: regression equation with best fit to data
  Minimize squared error between Y and \( \hat{Y} \)
Math solution
  Matrix algebra
  Combines Xs into a matrix
    Rows for subjects, columns for variables
  Transposes, matrix inverse, derivatives and other fun
Practical solution
  Use a computer
  R: linear model function, lm(Y~X1+X2+X3)
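For readers curious about the matrix-algebra route, here is a minimal sketch in plain Python rather than R. It assumes the usual least-squares normal equations, (XᵀX)b = XᵀY, and solves them by Gaussian elimination instead of forming an explicit inverse. The function names and the data are made up for illustration; in practice `lm` or a numerical library is the better tool.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x

def fit_regression(predictor_rows, y):
    """Return [b0, b1, ..., bm] minimizing the squared error."""
    # Rows for subjects, columns for variables, plus a leading 1 for b0.
    X = [[1.0] + list(row) for row in predictor_rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated from Y = 3 + 2*X1 - 1*X2 with no noise.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [3 + 2 * x1 - 1 * x2 for x1, x2 in rows]
b0, b1, b2 = fit_regression(rows, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the example data contain no noise, the fit should recover the generating coefficients up to rounding error.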

9 Regression vs. Correlation
One predictor: regression = correlation
  Plus commitment about which variable predicts the other
Multiple predictors
  Predictors affect each other
  Each coefficient depends on what other predictors are included
Example: 3rd-variable problem
  Math achievement and verbal achievement both depend on overall IQ
  Math helps with English?
  Add IQ as a predictor: verbal ~ math + IQ
  b_math will be near 0
  No effect of math once IQ is accounted for
Regression finds best combination of predictors to explain outcome
  Best values of b1, b2, etc. taken together
  Each coefficient shows contribution of predictor beyond effects of others
  Allows identification of most important predictors
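The third-variable example can be illustrated with a small simulation. Everything here is an assumption made for the sketch: scores are standardized, math and verbal are each simulated as IQ plus independent noise, and the least-squares helper is a plain-Python stand-in for `lm`.

```python
import random

def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x

def fit(predictor_rows, y):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    X = [[1.0] + list(row) for row in predictor_rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

random.seed(1)
n = 2000
iq = [random.gauss(0, 1) for _ in range(n)]
# Math and verbal both depend on IQ but not on each other.
math_score = [q + random.gauss(0, 1) for q in iq]
verbal = [q + random.gauss(0, 1) for q in iq]

# verbal ~ math: math looks like a useful predictor.
_, b_math_alone = fit([(m,) for m in math_score], verbal)
# verbal ~ math + IQ: the math coefficient shrinks toward zero.
_, b_math, b_iq = fit(list(zip(math_score, iq)), verbal)
print(round(b_math_alone, 2), round(b_math, 2), round(b_iq, 2))
```

With math as the only predictor its coefficient is clearly positive; once IQ is included, the math coefficient drops to roughly zero and IQ carries the effect.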

10 Explained Variability
Total sum of squares
  Variability in Y
  \( SS_Y = \sum (Y - M_Y)^2 \)
Residual sum of squares
  Deviation between predicted and actual scores
  \( SS_{\text{residual}} = \sum (Y - \hat{Y})^2 \)

11 Explained Variability
[Figure: SSY split into SSresidual and SSregression]
SSregression
  Variability explained by regression
  Difference between total and residual sums of squares
  \( SS_{\text{regression}} = SS_Y - SS_{\text{residual}} \)
R2 ("R-squared")
  Fraction of variability explained by the regression
  \( R^2 = SS_{\text{regression}} / SS_Y \)
  Measures how well the predictors (Xs) can predict or explain the outcome (Y)
  Same as r2 from correlation when only one predictor
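The sums of squares and R² can be computed directly from observed and predicted scores. The sketch below uses hypothetical data in which the predictions are the means of two groups (a least-squares fit to a single two-level predictor), so the decomposition holds exactly. The function name is illustrative.

```python
def explained_variability(y, y_hat):
    """Return (SS_Y, SS_residual, SS_regression, R squared)."""
    m_y = sum(y) / len(y)
    ss_y = sum((v - m_y) ** 2 for v in y)                    # total
    ss_residual = sum((v - p) ** 2 for v, p in zip(y, y_hat))
    ss_regression = ss_y - ss_residual                       # explained
    return ss_y, ss_residual, ss_regression, ss_regression / ss_y

# Hypothetical observed scores and least-squares predictions.
y = [2, 4, 6, 8]
y_hat = [3, 3, 7, 7]
ss_y, ss_residual, ss_regression, r2 = explained_variability(y, y_hat)
print(ss_y, ss_residual, ss_regression, r2)
```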

12 Hypothesis Testing with Regression
Does the regression explain meaningful variance in the outcome?
Null hypothesis
  All regression coefficients equal zero in population
  Regression only “explaining” random error in sample
Approach
  Find how much variance the predictors explain: SSregression
  Compare to amount expected by chance
  Ratio gives F statistic, which we can use for hypothesis testing
  If F is larger than expected by H0, regression explains more variance than expected by chance
  Reject null hypothesis if F is large enough

13 Hypothesis Testing with Regression
[Figure: SSY split into SSresidual and SSregression]
Likelihood function for F is an F distribution
  Comes from ratio of two chi-square variables
  Two df values, from numerator and denominator
[Figure: F distribution, P(F5,10), with Fcrit, α, the observed F, and p marked]

14 Degrees of Freedom
SSY: dfY = n - 1
SSresidual: dfresidual = n - m - 1
SSregression: dfregression = m
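Putting the sums of squares and degrees of freedom together gives the F statistic. The transcript does not show the formula itself, so this sketch assumes the standard ratio of mean squares, F = MSregression / MSresidual, where each mean square is a sum of squares divided by its df. The numbers are hypothetical.

```python
def f_statistic(ss_y, ss_residual, n, m):
    """F for the overall regression with n subjects and m predictors."""
    ss_regression = ss_y - ss_residual
    df_regression = m
    df_residual = n - m - 1
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual

# Hypothetical study: 50 subjects, 4 predictors.
f = f_statistic(ss_y=500, ss_residual=300, n=50, m=4)
print(round(f, 2))
```

The resulting F is then compared with Fcrit from an F distribution with (m, n - m - 1) degrees of freedom, or converted to a p-value.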

15 Testing Individual Predictors
Does predictor variable have any effect on outcome?
  Does it provide any information beyond other predictors?
Null hypothesis: bi = 0 in population
  Other bs might be nonzero
Compute standard error for bi
  Uses MSresidual to estimate the error variance
t statistic: \( t = b_i / SE_{b_i} \)
  Get tcrit or p-value from t distribution with dfresidual
  Can do one-tailed test if predicted direction of effect
    Sign of bi
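The slide's own formula is missing from the transcript. For reference, the standard textbook form of the test is given below, assuming X is the design matrix (one row per subject, a leading column of ones for the intercept) and the subscript ii picks out the diagonal entry that corresponds to predictor i.

```latex
SE_{b_i} = \sqrt{MS_{\text{residual}} \,\bigl[(X^{\top}X)^{-1}\bigr]_{ii}},
\qquad
t = \frac{b_i}{SE_{b_i}},
\qquad
df = df_{\text{residual}} = n - m - 1
```

A predictor that overlaps heavily with the others has a larger diagonal entry, hence a larger standard error and a smaller t for the same coefficient.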

16 Review
Predicting income from education and IQ:
A high-school graduate with IQ = 125 considers getting a 4-year bachelor’s degree. What is her expected increase in income?
$4k   $40k   $95k   $165k

17 Review
A study investigating personality differences in speech rate measures 100 subjects on 5 personality dimensions. These dimensions are used to predict words per minute (WPM) in a timed speech. The total variability in the outcome variable is SSWPM = The residual variability after accounting for the personality predictors is SSresidual = 9000. What is R2 for the regression?
0.06   0.25   0.75   30   3000

18 Review
A study investigating personality differences in speech rate measures 100 subjects on 5 personality dimensions. These dimensions are used to predict words per minute (WPM) in a timed speech. The total variability in the outcome variable is SSWPM = The residual variability after accounting for the personality predictors is SSresidual = 9000. Calculate the F statistic for testing whether personality predicts speech rate. (dfregression = 5, dfresidual = 94)
0.02   0.33   4.70   6.27

