Regression Assumptions of OLS
Assumptions of multiple regression
- Equal probability of selection (SRS)
- Linearity (visible and invisible variables)
- Independence of observations: errors are uncorrelated
- The mean of the error term is ALWAYS zero: the mean does not depend on x
- Normality (of the error term)
- Homoskedasticity: the variance does not depend on x
- No multicollinearity
Homoskedasticity
- The variance of the error term is fixed (equal across all cases).
- Compliance with this assumption can be checked empirically.
- Consequences if violated: coefficient estimates remain unbiased, but standard errors are biased (often downward), so significance tests and confidence intervals are unreliable.
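The empirical check can be sketched in Python. This is a crude split-half comparison of residual spread on hypothetical simulated data (not a formal test such as Breusch-Pagan); the data, seed, and cutoffs are all illustrative assumptions.

```python
import random

random.seed(42)

# Simulate data where the error spread grows with x (heteroskedastic by construction).
n = 200
x = [random.uniform(1, 10) for _ in range(n)]
y = [2 + 0.5 * xi + random.gauss(0, 0.3 * xi) for xi in x]

# Closed-form simple OLS fit.
mx = sum(x) / n
my = sum(y) / n
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Crude check: mean squared residual in the low-x half vs. the high-x half.
pairs = sorted(zip(x, resid))
half = n // 2
var_low = sum(r ** 2 for _, r in pairs[:half]) / half
var_high = sum(r ** 2 for _, r in pairs[half:]) / (n - half)
print(var_low, var_high)  # under homoskedasticity these should be similar
```

Here the two halves differ sharply because the noise was built to scale with x; with homoskedastic errors the two numbers would be close.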
Multicollinearity
- Has to do with the quality of the information matrix.
- No independent variable should be (nearly) predictable as a linear combination of the other independent variables.
Multicollinearity - Diagnostics
- Tolerance: 1 − R² from regressing each IV on the other IVs (the multiple correlation among IVs)
- VIF: inverse of tolerance; indicates inflated standard errors
- Rough cutoffs: VIF > 2 (or > 2.5) warrants attention
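With just two IVs, the diagnostics reduce to the bivariate correlation: tolerance = 1 − r², VIF = 1/tolerance. A minimal sketch on hypothetical scores (the variable names echo the exercise below; the numbers are made up):

```python
# Tolerance and VIF when one IV is regressed on the other.
momagr = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]   # hypothetical maternal aggression scores
dadagr = [1.0, 3.0, 3.5, 5.5, 5.0, 8.0]   # hypothetical, strongly correlated with momagr

n = len(momagr)
mm = sum(momagr) / n
md = sum(dadagr) / n
cov = sum((a - mm) * (b - md) for a, b in zip(momagr, dadagr))
ss_m = sum((a - mm) ** 2 for a in momagr)
ss_d = sum((b - md) ** 2 for b in dadagr)
r = cov / (ss_m * ss_d) ** 0.5

tolerance = 1 - r ** 2   # 1 - R^2 from regressing one IV on the other
vif = 1 / tolerance      # inverse of tolerance
print(round(r, 3), round(vif, 2))
```

With these toy data r is about .96, so the VIF is far above the 2.5 cutoff: the two IVs carry nearly the same information and their standard errors would be badly inflated.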
Example: Regression with SPSS
Regression exercise
[Path diagram: maternal aggression and paternal aggression → harsh parenting → child aggression]
Correlations
SPSS output
SPSS output
Regression exercise
[Path diagram: maternal aggression and paternal aggression → harsh parenting → child aggression]
SPSS step 1: Harsh parenting
Step 2: Direct effects of mom
Step 3: Mediated effects of mom
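The three SPSS steps can be sketched in Python with bivariate slopes on hypothetical toy data. This is only a sketch: the variable names mirror the exercise, the numbers are invented, and a full mediation step 3 would regress the outcome on the mediator while controlling for the predictor, which simple slopes do not do.

```python
# Baron & Kenny-style mediation steps, sketched with bivariate OLS slopes.
def slope(x, y):
    """Simple OLS slope of y on x: cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

momagr = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor: maternal aggression (toy data)
harsh  = [1.5, 2.0, 3.5, 4.0, 5.5]   # mediator: harsh parenting (toy data)
kidagr = [1.0, 2.5, 3.0, 4.5, 5.0]   # outcome: child aggression (toy data)

c = slope(momagr, kidagr)   # step: predictor -> outcome (total effect)
a = slope(momagr, harsh)    # step: predictor -> mediator
b = slope(harsh, kidagr)    # step: mediator -> outcome
                            # (the real step 3 would also control for momagr)
print(a, b, c)
```

Mediation is suggested when a and b are substantial and the predictor's effect shrinks once the mediator is controlled.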
Multicollinearity check
Regression Including Nominal or Ordinal Variables
Categorical variables in regression
Association with DV
Dummy variables
Regression with dummy variables
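Dummy coding can be sketched in Python. The three-level harsh-parenting variable and the choice of "low" as the reference group are illustrative assumptions; a k-level variable gets k − 1 dummies, with the reference category coded 0 on all of them.

```python
# Dummy-code a 3-level parenting variable with "low" as the reference group.
harsh = ["low", "medium", "high", "medium", "low", "high"]  # hypothetical data

levels = ["medium", "high"]  # the reference category "low" gets no dummy
dummies = {lev: [1 if h == lev else 0 for h in harsh] for lev in levels}
print(dummies["medium"])  # [0, 1, 0, 1, 0, 0]
print(dummies["high"])    # [0, 0, 1, 0, 0, 1]
```

Entered together in a regression, each dummy's coefficient is the difference in the DV between that category and the reference group.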
ANOVA
UNIANOVA kidagr BY harsh_o WITH momagr dadagr
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /CRITERIA = ALPHA(.05)
  /DESIGN = momagr dadagr harsh_o .
Regression Interaction effects
Moderated regression
[Path diagram: maternal aggression, paternal aggression, and harsh parenting predicting child aggression, with an interaction (moderation) term]
Moderated regression momdadagr = momagr*dadagr
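The product term on the slide can be computed directly; a common refinement (an addition to what the slide shows, not part of it) is to mean-center the components first, which reduces collinearity between the product and its parts. The scores are hypothetical.

```python
momagr = [2.0, 4.0, 6.0]   # hypothetical maternal aggression scores
dadagr = [1.0, 5.0, 3.0]   # hypothetical paternal aggression scores

# Product term exactly as on the slide: momdadagr = momagr * dadagr.
momdadagr = [a * b for a, b in zip(momagr, dadagr)]
print(momdadagr)  # [2.0, 20.0, 18.0]

# Refinement: mean-center before multiplying.
mm = sum(momagr) / len(momagr)
md = sum(dadagr) / len(dadagr)
centered = [(a - mm) * (b - md) for a, b in zip(momagr, dadagr)]
print(centered)  # [4.0, 0.0, 0.0]
```

Entering momagr, dadagr, and the product in one model lets the coefficient on the product test whether one parent's effect depends on the level of the other's.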
Issues Related to Regression Homework
Issues
- Interpreting regression coefficients when measurement units are not meaningful (interval level, different units of measurement)
- Legend of the conceptual framework
- Test of mediated effects (X → Y → Z)
- Atheoretical regression models
- Write-up: hypotheses; "less…than"; according to the conceptual framework; regression equations in text; decimal points
Regression & ANOVA: Wrap up
Common elements
- All of these models are linear: DV = b0 + b1*IV1 + b2*IV2 + b3*IV3 + e
- All of these models assume an interval/ratio-level DV.
- All of these models can handle categorical or interval/ratio IVs.
- All of these models use some form of the least squares method (minimizing squared deviations of observed values from predicted values).
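The least squares idea above can be shown with the closed-form solution for one IV: the slope that minimizes the squared residuals is cov(x, y)/var(x). The data are hypothetical.

```python
# Least squares fit for a single IV: b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x).
x = [1.0, 2.0, 3.0, 4.0]   # hypothetical IV
y = [2.0, 4.0, 5.0, 9.0]   # hypothetical DV

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
print(b0, b1)  # intercept and slope of the best-fitting line
```

No other line through these points has a smaller sum of squared residuals, which is the shared estimation principle behind both regression and ANOVA.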
Common elements
- All of these methods assume SRS (independence of observations).
- All of these methods assume homoskedasticity.
- All of these methods can only model "flat" (linear) and unidirectional effects.
ANOVA / Regression
- Differences arise from "traditions": ANOVA grew out of experimental design, regression out of non-experimental/survey design.
- Differences in the yield of information: regression is superior.
Within-Subjects Designs: regression with fixed or random effects.
Factor Analyses: regression with an "unknown" (latent) IV.
HLMs: regression coefficients themselves are DVs.