Multiple Regression: I PSGE 7211
Goals Introduce multiple regression: What changes when I add more variables? Work on writing up results in APA style
Do Now What’s the difference between regression and correlation? How are they similar? How are they different? What would be an appropriate research question for a regression analysis?
Multiple Regression Multiple regression = regression with more than one independent variable (IV). DV (dependent variable): 8th grade overall GPA (100 pts). IVs: parental schooling (yrs); average homework time per week (hrs). Both IVs are entered into a simultaneous regression.
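A minimal sketch of what "simultaneous" entry means: both predictors go into one least-squares fit, so each slope is estimated with the other predictor in the model. The function name and the small data set below are hypothetical and made up for illustration (this is not the class dataset); the data are noise-free, so the fit should recover the coefficients used to build them.

```python
def fit_two_predictor_regression(y, x1, x2):
    """Least-squares estimates (a, b1, b2) for y' = a + b1*x1 + b2*x2."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Normal equations (X'X) coef = X'y, stored as an augmented 3x4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(3):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return tuple(row[3] for row in aug)


# Made-up, noise-free illustration data (NOT the class dataset).
pared = [10, 12, 12, 14, 16, 18]   # parental schooling, years
hwork = [2, 5, 1, 3, 6, 4]         # homework time, hours per week
gpa = [70 + 0.25 * p + 1.0 * h for p, h in zip(pared, hwork)]

a, b_pared, b_hwork = fit_two_predictor_regression(gpa, pared, hwork)
print(round(a, 3), round(b_pared, 3), round(b_hwork, 3))
```

In practice a statistics package would do this fit and also report standard errors and significance tests; the sketch only shows where the coefficients come from.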
Grades on Parent Ed, HW [Figure: diagram relating Parental Ed, Time HW, and Grades]
Grades on ParEd, HW $r^2_{pared} = (.191)^2 = .036$; $r^2_{hw} = (.354)^2 = .125$
Grades on ParEd, HW Multiple Correlation Coefficient (“Mult R”): R = .36, so $R^2 = .36 \times .36 = .130$. 13% of the variance in GPA is explained by PARED and HWORK together. Adjusted $R^2$ accounts for the number of predictors in the model. Question: Why isn’t $R^2 = r^2_{hwork} + r^2_{pared}$ (.125 + .036 = .161)?
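As a rough illustration of the adjustment, the usual formula is adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below assumes n = 100, which is inferred from the degrees of freedom F(2, 97) reported in the formal write-up and is not stated directly on the slides; the function name is hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for a model with k predictors fit to n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# R^2 = .13 from the slide; n = 100 is an inference from F(2, 97).
adj_r2 = adjusted_r_squared(r2=0.13, n=100, k=2)
print(round(adj_r2, 3))
```

The adjusted value is always a little lower than R², and the gap grows as predictors are added relative to the sample size.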
[Figure: diagrams relating Grades, TimeHW, and ParEd] Multicollinearity!
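One way to see why the squared correlations do not simply add: with two predictors, R² depends on the correlation between the predictors as well as on their correlations with the DV. The sketch below uses the standard two-predictor formula with the slide's values of .191 and .354; the predictor intercorrelation of .30 is a hypothetical value chosen for illustration, since the slides do not report it.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for two predictors from the three pairwise correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


r_pared, r_hwork = 0.191, 0.354  # correlations with Grades, from the slides

# Uncorrelated predictors: R^2 equals the simple sum of squared correlations.
r2_uncorrelated = r_squared_two_predictors(r_pared, r_hwork, 0.0)

# Hypothetical overlap between PARED and HWORK (r = .30 is an assumption).
r2_overlapping = r_squared_two_predictors(r_pared, r_hwork, 0.30)

print(round(r2_uncorrelated, 3), round(r2_overlapping, 3))
```

When the predictors overlap, the variance they share in Grades is counted only once, so R² falls below the sum of the two squared correlations.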
The overall model is significant, which indicates that, taken together, parents’ years of schooling and average time spent on homework explain a significant proportion of the variance in 8th grade GPA.
$y' = a + b_1x_1 + b_2x_2$. Predicted value of GPA = 69.984 + .258(PARED) + 1.143(HWORK). No time spent on homework, parents never attended school: predicted GPA = 69.984 (the intercept). Parents with 16 years of schooling, 5 hours of homework per week: predicted GPA = 69.984 + .258(16 yrs) + 1.143(5 hrs) = 79.827.
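The prediction arithmetic can be checked with a short sketch. The coefficients are the ones on the slide; the function name is hypothetical.

```python
def predicted_gpa(pared_years, hwork_hours):
    """Predicted 8th grade GPA from the slide's fitted equation."""
    intercept, b_pared, b_hwork = 69.984, 0.258, 1.143
    return intercept + b_pared * pared_years + b_hwork * hwork_hours


# Intercept case: no homework, parents never attended school.
gpa_at_zero = predicted_gpa(0, 0)

# Worked example from the slide: 16 years of schooling, 5 hours of homework.
gpa_example = predicted_gpa(16, 5)

print(round(gpa_at_zero, 3), round(gpa_example, 3))
```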
Note that HWORK is significant but PARED is not. What does this mean? Holding PARED constant, for each additional hour spent on homework per week, GPA is expected to go up 1.143 points. However, PARED is not statistically significant, indicating that it does not reliably predict GPA once HWORK is in the model. Thus, the only conclusion that can be drawn is that HWORK is a better predictor of grades than PARED.
Assuming that PARED was significant and entered as a control variable, then... For each additional hour spent on homework, GPA is expected to go up 1.143 points among students whose parents have the same number of years of schooling (e.g., the average of 14 years). The PARED term, .258 × 14 = 3.612, shifts the predicted GPA level for these students; it is not added to the homework slope.
Confidence Intervals: Confidence intervals give us a range of plausible values for the coefficients. Although we cannot be entirely certain that the true value falls in the calculated range, we can be reasonably confident. Hypothesis tests answer the question of whether the true coefficient is zero; if a coefficient is not significant, 0 will fall within its confidence interval (see PARED). We can be 95% confident that the true coefficient for HWORK lies between .440 and 1.847.
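A rough check of where the HWORK interval comes from, using b ± t_crit × SE. Two assumptions are made here: the standard error is backed out from b and the reported t(97) = 3.224 rather than read from output, and 1.985 is used as an approximate two-tailed .05 critical value for 97 degrees of freedom. The function name is hypothetical.

```python
def confidence_interval(b, se, t_crit):
    """Return (lower, upper) bounds: b plus or minus t_crit * se."""
    margin = t_crit * se
    return b - margin, b + margin


b_hwork = 1.143
t_hwork = 3.224            # t(97) from the formal write-up
se_hwork = b_hwork / t_hwork  # SE implied by b and t (an inference)
t_crit_approx = 1.985      # approximate critical t for df = 97 (assumption)

ci_low, ci_high = confidence_interval(b_hwork, se_hwork, t_crit_approx)
excludes_zero = ci_low > 0 or ci_high < 0

print(round(ci_low, 3), round(ci_high, 3), excludes_zero)
```

The bounds should land close to the .440 and 1.847 on the slide, with small differences due to rounding. Because the interval excludes 0, it agrees with the significant t test for HWORK.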
b and Betas b = unstandardized coefficients; β = standardized coefficients (b transformed into standard deviation units)
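The transformation can be sketched as β = b × (SD of the predictor / SD of the DV). The slides do not report the standard deviations, so the values below are hypothetical and chosen only to show the arithmetic; the function name is also hypothetical.

```python
def standardized_beta(b, sd_x, sd_y):
    """Convert an unstandardized slope b into standard deviation units."""
    return b * sd_x / sd_y


# Hypothetical standard deviations (NOT from the class dataset).
sd_hwork_hypothetical = 2.0   # SD of homework hours
sd_gpa_hypothetical = 8.0     # SD of GPA points

beta_hwork = standardized_beta(1.143, sd_hwork_hypothetical, sd_gpa_hypothetical)
print(round(beta_hwork, 3))
```

A β of this kind reads as "a one standard deviation increase in the predictor is associated with a β standard deviation change in the DV," which makes predictors on different scales easier to compare.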
Interpreting findings Interpretations Formal (e.g., results section) In English (e.g., discussion section) “Real World” (e.g., explain to a parent, teacher, student, policy maker)
Formal Formal: This research was designed to determine the influence of time spent on homework on 8th grade students’ GPAs, while controlling for parents’ level of education. Students’ 8th grade GPAs were regressed on average time spent on homework per week and parents’ level of education. The overall multiple regression was significant, F(2, 97) = 7.242, p < .01, and the two variables accounted for 13% of the variance in GPA. Only homework was found to have a statistically significant effect on grades, t(97) = 3.224, p < .01. Controlling for parental education, for every additional hour spent on homework, students’ GPAs are expected to increase by 1.143 points.
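The reported F can be roughly reproduced from R² with F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch assumes n = 100, inferred from the error degrees of freedom (97 = n - 2 - 1); the function name is hypothetical.

```python
def f_from_r_squared(r2, n, k):
    """Overall F statistic for a regression with k predictors and n cases."""
    df_model = k
    df_error = n - k - 1
    return (r2 / df_model) / ((1 - r2) / df_error)


# R^2 = .13 (rounded) from the write-up; n = 100 inferred from F(2, 97).
f_overall = f_from_r_squared(r2=0.13, n=100, k=2)
print(round(f_overall, 2))
```

The result is close to the reported 7.242; the small gap is presumably because the write-up's R² is rounded to two decimals.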
English English: The results suggest that while parental education may not be an important predictor of student GPA, the amount of time spent on homework has a marked influence on 8th grade achievement. Spending, on average, one additional hour per week on homework should result in an increase of a little more than one point (1.143 points) in students’ overall GPA.
Real World? Real World: What would be your real world explanation of these findings?
Lab Time One more example Conduct a multiple regression analysis on your dataset Writing up regression analyses Check the wiki – “writing up regression results” Revise HW 3 and/or 4
Writing up regression analyses Results Overview of purpose of the study Preliminary analyses Descriptives (can summarize in a table) Correlational analyses (can summarize in a table or report in text) Main analyses Regression analyses
HW 5 On a dataset of your choice: Run a multiple regression analysis (2 predictors, 1 continuous DV). Write up your findings in short APA style format. Provide a brief introduction (overall purpose of your analyses). Present preliminary analyses (descriptives and correlations). Present the multiple regression analyses. Use APA-style tables to present your descriptive statistics and correlational analyses.