Published by Clarissa Mills. Modified over 9 years ago.
1
Review
Guess the correlation.
A. -2.0
B. -0.9
C. -0.1
D. 0.1
E. 0.9
2
Review
Calculate the correlation between X and Y.
\(z_X\) = [-1.0-0.3-0.2-0.6-1.6]
\(z_Y\) = [-0.6-1.3-0.6-0.0-1.3]
A. -1.21
B. -.30
C. -.24
D. -.01
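As a minimal sketch of the calculation this kind of question asks for: the correlation is the sum of the products of paired z-scores, divided by n - 1. The data below are hypothetical (they are not the slide's values), and the sketch assumes z-scores are computed with the sample standard deviation.

```python
from math import sqrt


def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def correlation(x, y):
    """Pearson r as the sum of z-score products divided by n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 3))
```

Because both variables are standardized first, the result always falls between -1 and 1, which is why an option such as -1.21 can be ruled out without any arithmetic.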
3
Review
Which statement is true?
A. If the correlation is zero, then the variables are independent
B. If the correlation is nonzero, then the variables are independent
C. If two variables are independent, then their correlation is zero
D. If two variables have a nonlinear relationship, then their correlation is zero
4
Regression 11/4
5
Regression
Correlation can tell how to predict one variable from another
What about multiple predictor variables?
  – Explaining Y using \(X_1, X_2, \dots, X_m\)
  – Income based on age, education
  – Memory based on pre-test, IQ
  – Social behavior based on personality dimensions
Regression
  – Finds best combination of predictors to explain outcome variable
  – Determines unique contribution of each predictor
  – Can test whether each predictor has reliable influence
6
Linear Prediction
One predictor
  – Draw a line through data
Multiple predictors
  – Each predictor has linear effect
  – Effects of different predictors simply add together
Intercept: \(b_0\)
  – Value of Y when all Xs are zero
Regression coefficients: \(b_i\)
  – Influence of \(X_i\)
  – Sign tells direction; magnitude tells strength
  – Can be any value (not standardized like correlation)
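Written out, the additive linear model described on this slide has the general form below. The symbol \(\hat{Y}\) for the predicted outcome is an assumption introduced here for illustration.

```latex
\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_m X_m
```

Setting every \(X_i\) to zero leaves only \(b_0\), which is why the intercept is the predicted value of Y when all Xs are zero.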
7
Example
Predict income from education and intelligence
  – Y = income in $k/yr
  – \(X_1\) = years past HS
  – \(X_2\) = IQ
Regression equation: \(\hat{Y} = -70 + 10X_1 + 1X_2\)
4-year college, IQ = 120
  – -70 + 10*4 + 1*120 = 90
  – $90k/yr
2-year college, IQ = 90
  – -70 + 10*2 + 1*90 = 40
  – $40k/yr
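The two worked cases can be checked with a short sketch. The function name `predict_income` is hypothetical; the coefficients (-70, 10, 1) are the ones used in the slide's arithmetic.

```python
def predict_income(years_past_hs, iq):
    """Predicted income in $k/yr using the example's coefficients."""
    b0, b_education, b_iq = -70, 10, 1
    return b0 + b_education * years_past_hs + b_iq * iq


print(predict_income(4, 120))  # 4-year college, IQ = 120 -> 90
print(predict_income(2, 90))   # 2-year college, IQ = 90  -> 40
```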
8
Finding Regression Coefficients
Goal: regression equation with best fit to data
  – Minimize squared error between Y and the predicted value \(\hat{Y}\)
Math solution
  – Matrix algebra
  – Combines Xs into a matrix
      – Rows for subjects, columns for variables
  – Transposes, matrix inverse, derivatives and other fun
Practical solution
  – Use a computer
  – R: linear model function, lm(Y~X1+X2+X3)
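To make the matrix solution concrete, here is a minimal sketch in plain Python that solves the normal equations \((X^\top X)b = X^\top Y\) by Gaussian elimination. It is an illustration of the idea, not the algorithm R's `lm` uses internally. The function names and the six subjects are hypothetical; the incomes are generated without noise from the example equation -70 + 10*X1 + 1*X2, so the fit should recover those coefficients.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_regression(predictors, y):
    """Least-squares coefficients [b0, b1, ..., bm] via the normal equations.

    predictors: one row per subject, one column per predictor variable.
    """
    # A leading 1 in each row carries the intercept b0.
    X = [[1.0] + list(row) for row in predictors]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical subjects: (years past HS, IQ).
predictors = [(0, 100), (2, 90), (4, 120), (4, 105), (6, 130), (2, 115)]
income = [-70 + 10 * x1 + 1 * x2 for x1, x2 in predictors]
coefficients = fit_regression(predictors, income)
print([round(b, 6) for b in coefficients])
```

In practice the slide's advice stands: a statistics package handles this more robustly than hand-rolled elimination, which is shown here only to expose the steps.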
9
Regression vs. Correlation
One predictor: regression = correlation
  – Plus commitment about which variable predicts the other
  – And not standardized
Multiple predictors
  – Predictors affect each other
  – Each coefficient depends on what other predictors are included
Example: third-variable problem
  – Math achievement and verbal achievement both depend on overall IQ
  – Math helps with English?
  – Add IQ as a predictor: verbal ~ math + IQ
  – \(b_{math}\) will be near 0
  – No effect of math once IQ is accounted for
Regression finds best combination of predictors to explain outcome
  – Best values of \(b_1\), \(b_2\), etc. taken together
  – Each coefficient shows contribution of predictor beyond effects of others
  – Allows identification of most important predictors
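The third-variable example can be illustrated with a small constructed data set. Everything below is hypothetical: the scores are built so that math and verbal each equal half of IQ plus a component unrelated to anything else, and the helper names are made up for this sketch. Under those assumptions, math predicts verbal on its own, but its coefficient drops to zero once IQ is included.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_regression(predictors, y):
    """Least-squares coefficients [b0, b1, ..., bm] via the normal equations."""
    X = [[1.0] + list(row) for row in predictors]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Three mutually uncorrelated +/-1 patterns over 8 hypothetical students.
iq_pattern = [-1, -1, -1, -1, 1, 1, 1, 1]
math_pattern = [-1, -1, 1, 1, -1, -1, 1, 1]
verbal_pattern = [-1, 1, -1, 1, -1, 1, -1, 1]

iq = [100 + 10 * p for p in iq_pattern]
# Each achievement score = half of IQ plus its own unrelated component.
math_score = [0.5 * q + 5 * p for q, p in zip(iq, math_pattern)]
verbal_score = [0.5 * q + 5 * p for q, p in zip(iq, verbal_pattern)]

# verbal ~ math
b_math_alone = fit_regression([[m] for m in math_score], verbal_score)[1]
# verbal ~ math + IQ
b_math_with_iq = fit_regression(
    [[m, q] for m, q in zip(math_score, iq)], verbal_score
)[1]

print(round(b_math_alone, 6), round(b_math_with_iq, 6))
```

With real data the second coefficient would be near zero rather than exactly zero, since sampled noise is never perfectly uncorrelated with the predictors.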
10
Explained Variability
Total sum of squares
  – Variability in Y
  – \(SS_Y = \sum (Y - M_Y)^2\)
Residual sum of squares
  – Deviation between predicted and actual scores
  – \(SS_{residual} = \sum (Y - \hat{Y})^2\)
11
Explained Variability
\(SS_{regression}\)
  – Variability explained by regression
  – Difference between total and residual sums of squares
  – \(SS_{regression} = SS_Y - SS_{residual}\)
\(R^2\) ("R-squared")
  – Fraction of variability explained by the regression
  – \(SS_{regression} / SS_Y\)
  – Measures how well the predictors (Xs) can predict or explain the outcome (Y)
  – Same as \(r^2\) from correlation when only one predictor
[Diagram: \(SS_Y\) partitioned into \(SS_{residual}\) and \(SS_{regression}\)]
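A short sketch of the sums-of-squares bookkeeping follows. The data are hypothetical, and the predictions are assumed to come from the least-squares line 2.2 + 0.6*X fitted to X = 1..5; the function name `r_squared` is made up for this example.

```python
def r_squared(y, y_hat):
    """Return (SS_Y, SS_residual, SS_regression, R^2) for outcomes and predictions."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    ss_regression = ss_total - ss_residual
    return ss_total, ss_residual, ss_regression, ss_regression / ss_total


# Hypothetical outcomes and their least-squares predictions.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
ss_total, ss_residual, ss_regression, r2 = r_squared(y, y_hat)
print(ss_total, round(ss_residual, 3), round(ss_regression, 3), round(r2, 3))
```

Note that the identity \(SS_{regression} = SS_Y - SS_{residual}\) relies on the predictions being least-squares fits with an intercept; for arbitrary predictions the difference can even be negative.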
12
Hypothesis Testing with Regression
Does the regression explain meaningful variance in the outcome?
Null hypothesis
  – All regression coefficients equal zero in population
  – Regression only "explaining" random error in sample
Approach
  – Find how much variance the predictors explain: \(SS_{regression}\)
  – Compare to amount expected by chance
Ratio gives F statistic, which we can use for hypothesis testing
  – If F is larger than expected under \(H_0\), regression explains more variance than expected by chance
  – Reject null hypothesis if F is large enough
13
Hypothesis Testing with Regression
Under the null hypothesis, F follows an F distribution
  – Comes from the ratio of two chi-square variables, each divided by its df
  – Two df values, from numerator and denominator
[Figure: F distribution \(P(F_{5,10})\), marking \(F_{crit}\), the observed F, and the p-value]
14
Degrees of Freedom
(n subjects, m predictors)
  – \(df_Y = n - 1\)
  – \(df_{regression} = m\)
  – \(df_{residual} = n - m - 1\)
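The degrees of freedom turn sums of squares into mean squares, and the F statistic is their ratio. The sketch below assumes the usual definition, F = (SS_regression / df_regression) / (SS_residual / df_residual); the function name and the input numbers are hypothetical.

```python
def f_statistic(ss_total, ss_residual, n, m):
    """Return (F, df_regression, df_residual) for n subjects and m predictors."""
    ss_regression = ss_total - ss_residual
    df_regression = m
    df_residual = n - m - 1
    ms_regression = ss_regression / df_regression  # explained variance per predictor
    ms_residual = ss_residual / df_residual        # estimate of error variance
    return ms_regression / ms_residual, df_regression, df_residual


# Hypothetical study: 23 subjects, 2 predictors.
f_value, df_reg, df_res = f_statistic(ss_total=500, ss_residual=300, n=23, m=2)
print(round(f_value, 2), df_reg, df_res)
```

The resulting F would then be compared against the critical value of an F distribution with (df_regression, df_residual) degrees of freedom.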
15
Testing Individual Predictors
Does predictor variable have any effect on outcome?
  – Does it provide any information beyond other predictors?
  – Null hypothesis: \(b_i = 0\) in population
  – Other bs might be nonzero
Compute standard error for \(b_i\)
  – Uses \(MS_{residual}\) to estimate the error variance
t statistic: \(t = b_i / SE(b_i)\)
Get \(t_{crit}\) or p-value from t distribution with \(df_{residual}\)
Can do one-tailed test if predicted direction of effect
  – Sign of \(b_i\)
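For the single-predictor case the test can be sketched in a few lines. This assumes the standard simple-regression result \(SE(b_1) = \sqrt{MS_{residual} / \sum (X - M_X)^2}\); with several predictors the standard errors come out of the matrix solution instead. The data and the function name are hypothetical.

```python
from math import sqrt


def slope_t_statistic(x, y):
    """Return (b1, SE(b1), t, df_residual) for a one-predictor regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
    b0 = mean_y - b1 * mean_x
    ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df_residual = n - 2  # n - m - 1 with m = 1 predictor
    ms_residual = ss_residual / df_residual
    se_b1 = sqrt(ms_residual / ss_x)
    return b1, se_b1, b1 / se_b1, df_residual


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, se_b1, t_value, df_res = slope_t_statistic(x, y)
print(round(b1, 3), round(se_b1, 3), round(t_value, 3), df_res)
```

The t value is then compared with the critical value from a t distribution with df_residual degrees of freedom.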
16
Review
Predicting income from education and IQ: \(\hat{Y} = -70 + 10X_1 + 1X_2\)
A high-school graduate with IQ = 125 considers getting a 4-year bachelor's degree. What is her expected increase in income?
A. $4k
B. $40k
C. $95k
D. $165k
17
Review
A study investigating personality differences in speech rate measures 100 subjects on 5 personality dimensions. These dimensions are used to predict words per minute (WPM) in a timed speech. The total variability in the outcome variable is \(SS_{WPM} = 12000\). The residual variability after accounting for the personality predictors is \(SS_{residual} = 9000\). What is \(R^2\) for the regression?
A. 0.06
B. 0.25
C. 0.75
D. 30
E. 3000
18
Review
A study investigating personality differences in speech rate measures 100 subjects on 5 personality dimensions. These dimensions are used to predict words per minute (WPM) in a timed speech. The total variability in the outcome variable is \(SS_{WPM} = 12000\). The residual variability after accounting for the personality predictors is \(SS_{residual} = 9000\). Calculate the F statistic for testing whether personality predicts speech rate. (\(df_{regression} = 5\), \(df_{residual} = 94\))
A. 0.02
B. 0.33
C. 4.70
D. 6.27