PSY 1950 General Linear Model November 12, 2008
The General Linear Model Or, What the Hell's Going on During Estimation?
Generalized linear models General linear model Multiple regression Simple regression ANOVA
Motivation
Benefits to GLM approach over variance-ratio method
–Efficiency
–Easier in the case of unequal sample sizes
–ANCOVA
–Present and future statistical techniques and software
Goal: vague understanding
History
Correlational versus experimental approach
–Correlational: Does value of Y change with value of X?
–Experimental: Does mean value of Y change with category of X?
Computer revolution
–ANOVA/regression calculations based on matrix algebra
General Linear Model
General model
–Categorical or continuous predictor
Linear model
–Linear in the parameters:
  parameters are not multiplied by other parameters, e.g., NOT Y = abX
  parameters appear only to the first power, e.g., NOT Y = b^2 X
  parameters are not exponents, e.g., NOT Y = X^b
–Variables do not need to satisfy the above criteria
  transformation workaround, e.g., Y = bX^2 can be rewritten as Y = bZ, with Z = X^2
–Not necessarily a straight-line relationship
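The transformation workaround can be sketched in a few lines. This is a minimal illustration, assuming made-up, noise-free data generated from Y = 3X^2; the variable names and the no-intercept estimator are choices made for this sketch, not part of the slides.

```python
import numpy as np

# Hypothetical noise-free data generated from Y = 3 * X**2 (illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x**2

# Transform the predictor: with Z = X**2 the model Y = b*Z is linear in b,
# even though the relationship between Y and X is a curve.
z = x**2

# Least-squares estimate of b for a no-intercept model: b = sum(z*y) / sum(z*z).
b = np.sum(z * y) / np.sum(z * z)
print(b)
```

The model stays "linear" because b enters to the first power; only the variable was transformed.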
The Foundation of Statistics datum = model + error
The Simplest Model datum = mean + error
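As a rough check of why the mean serves as the simplest model, the sketch below searches a grid of candidate values and confirms that the mean gives the smallest sum of squared errors. The data and the helper `sse` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical data (illustration only).
data = np.array([2.0, 4.0, 6.0, 8.0])

def sse(model_value):
    """Sum of squared errors when every datum is predicted by one value."""
    return float(np.sum((data - model_value) ** 2))

mean = data.mean()

# Try candidate model values from 0 to 10 in steps of 0.1 and keep the best one.
candidates = np.linspace(0, 10, 101)
best = candidates[np.argmin([sse(c) for c in candidates])]

print(mean, best, sse(mean))
```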
Regression Model Y = bX + a + e
The Foundation of Statistics datum – model = error
Method of Least Squares ∑(datum – model)^2 = ∑error^2
Method of Least Squares Regression
∑(Y – bX – a)^2 = ∑e^2
∑(Y – Ŷ)^2 = ∑e^2
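For a single predictor, the values of b and a that minimize ∑(Y – bX – a)^2 have a well-known closed form. The sketch below computes them on made-up data and compares the result with NumPy's `polyfit`; the data values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical data (illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Closed-form least-squares estimates for Y = bX + a + e.
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# Residuals and the quantity being minimized.
y_hat = b * x + a
sum_sq_error = np.sum((y - y_hat) ** 2)

# np.polyfit with degree 1 returns [slope, intercept] from the same criterion.
slope, intercept = np.polyfit(x, y, 1)
print(b, a, sum_sq_error)
```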
One-way ANOVA as GLM datum = model + error
By comparing the error estimates made by the reduced model and the full model, we can assess how much variance the grouping parameter explains.
[Figure panels: Reduced Model, Full Model]
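The model comparison can be sketched numerically. This example assumes, as is standard for one-way ANOVA, that the reduced model predicts the grand mean for everyone and the full model predicts each group's own mean. The scores are hypothetical.

```python
import numpy as np

# Hypothetical scores for three groups of three (illustration only).
y = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0])
group = np.repeat([0, 1, 2], 3)
k = 3

# Reduced model: one parameter (grand mean). Full model: one mean per group.
reduced_pred = np.full_like(y, y.mean())
group_means = np.array([y[group == g].mean() for g in range(k)])
full_pred = group_means[group]

sse_reduced = np.sum((y - reduced_pred) ** 2)
sse_full = np.sum((y - full_pred) ** 2)

# Error removed by adding the grouping parameters, per parameter added,
# relative to the error that remains per residual degree of freedom.
df_model = k - 1
df_error = len(y) - k
f_ratio = ((sse_reduced - sse_full) / df_model) / (sse_full / df_error)
r_squared = (sse_reduced - sse_full) / sse_reduced
print(sse_reduced, sse_full, f_ratio, r_squared)
```

The resulting ratio is the same F that the variance-ratio method produces for these data.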
t-Test as Regression Y = bX + a + e Code grouping variable (X) as 0/1
Example Libido = b(Viagra) + a + e
t-Test as Regression
What do the following represent?
–slope value
–intercept value
–slope p-value
–intercept p-value
–model significance
How would the above change if you coded the groups differently?
(1, 0), (-1, 1), (1, 2)
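One way to explore these questions is to fit the regression under each coding and compare the slope and intercept. The sketch below uses hypothetical scores and a helper named `fit` invented for this illustration; it covers the estimates only, not the p-values.

```python
import numpy as np

# Hypothetical scores for two groups (illustration only): means are 2 and 5.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
in_second_group = np.array([0, 0, 0, 1, 1, 1])

def fit(code_first, code_second):
    """Least-squares fit of Y = bX + a + e for a given group coding; returns (b, a)."""
    x = np.where(in_second_group == 1, code_second, code_first).astype(float)
    design = np.column_stack([x, np.ones_like(x)])
    (b, a), *_ = np.linalg.lstsq(design, y, rcond=None)
    return b, a

print(fit(0, 1))   # slope = difference in means; intercept = mean of group coded 0
print(fit(1, 0))   # slope changes sign; intercept = mean of the other group
print(fit(-1, 1))  # slope = half the difference; intercept = midpoint of the means
print(fit(1, 2))   # slope = difference in means; intercept = prediction at X = 0
```

In every coding the fitted values are the two group means, so the overall model fit is unchanged; only the interpretation of b and a shifts.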
One-Way ANOVA as Regression
Y = b_1 X_1 + b_2 X_2 + … + b_p X_p + a + e
Number of predictors (p) is one less than the number of groups (k)
–Just as one predictor distinguishes two groups, two predictors distinguish three groups, and so on …
–Any redundancy creates multiple solutions to the least squares minimization
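The redundancy point can be illustrated with a rank check. In this sketch (hypothetical group labels, names chosen for illustration), using k dummy predictors plus an intercept makes the design matrix rank deficient, so different coefficient sets produce identical predictions.

```python
import numpy as np

# Hypothetical group labels for k = 3 groups (illustration only).
group = np.array([0, 0, 1, 1, 2, 2])
dummies = np.eye(3)[group]          # one 0/1 column per group: k columns
ones = np.ones((len(group), 1))     # intercept column

redundant = np.hstack([dummies, ones])       # k predictors + intercept
reduced = np.hstack([dummies[:, 1:], ones])  # k - 1 predictors + intercept

# The three dummy columns sum to the intercept column, so one column is redundant.
rank_redundant = np.linalg.matrix_rank(redundant)
rank_reduced = np.linalg.matrix_rank(reduced)

# Two different coefficient vectors that give exactly the same predictions.
coef_a = np.array([2.0, 3.0, 7.0, 0.0])
coef_b = np.array([-8.0, -7.0, -3.0, 10.0])
same_predictions = np.allclose(redundant @ coef_a, redundant @ coef_b)
print(rank_redundant, rank_reduced, same_predictions)
```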
Example Libido = b_1(Viagra) + b_2(High) + a + e
What do the following represent?
–regression coefficients
–intercept values
–coefficient p-value
–intercept p-value
–model significance
How would the above change if you coded the groups differently?
original: 00, 10, 01
00, 20, 02
01, 10, , 10, 01
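The first two codings listed above can be compared directly. This is a sketch on hypothetical scores (not the course's data), with a helper `fit` invented for the illustration; each group receives a pair of codes (X_1, X_2).

```python
import numpy as np

# Hypothetical scores for three groups of three (illustration only): means 2, 3, 7.
y = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0])
group = np.repeat([0, 1, 2], 3)

def fit(codes):
    """codes: one (X1, X2) pair per group. Returns the array [b1, b2, a]."""
    x = np.array(codes, dtype=float)[group]
    design = np.column_stack([x, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

original = fit([(0, 0), (1, 0), (0, 1)])  # codes 00, 10, 01
doubled = fit([(0, 0), (2, 0), (0, 2)])   # codes 00, 20, 02
print(original)  # b1, b2 = each group's mean minus the all-zero group's mean; a = that group's mean
print(doubled)   # coefficients are halved; the intercept is unchanged
```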
Coding
Dummy
–Comparison group coded with all zeros
–Variables named after conditions of interest
–Data coded 1/0 based on membership
Effect
–One group coded with all “-1”s
  Can be uninteresting conceptually or chosen based upon proximity of its mean to grand mean
–Variables named after conditions of interest
–Data coded 1/0 based on membership
Contrast
–Data coded based upon expected pattern (e.g., 1, 1, -2)
–Sum of weights must equal zero
–Additional variables must represent orthogonal contrasts
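To see what effect and contrast coding do to the estimates, the sketch below fits both on hypothetical, equal-sized groups. The helper `fit`, the data, and the second contrast (1, -1, 0) are assumptions made for illustration; only the (1, 1, -2) pattern comes from the slide.

```python
import numpy as np

# Hypothetical scores for three equal-sized groups (illustration only): means 2, 3, 7.
y = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0])
group = np.repeat([0, 1, 2], 3)

def fit(codes):
    """codes: one (X1, X2) pair per group. Returns the array [b1, b2, a]."""
    x = np.array(codes, dtype=float)[group]
    design = np.column_stack([x, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Effect coding: first group coded -1 on every variable.
# Intercept = grand mean; each b = that group's mean minus the grand mean.
effect = fit([(-1, -1), (1, 0), (0, 1)])

# Contrast coding: X1 uses weights (1, 1, -2); X2 uses (1, -1, 0) across the groups.
weights = np.array([[1, 1, -2], [1, -1, 0]])
sums_to_zero = bool(np.all(weights.sum(axis=1) == 0))
orthogonal = bool(weights[0] @ weights[1] == 0)
contrast = fit([(1, 1), (1, -1), (-2, 0)])
print(effect, contrast, sums_to_zero, orthogonal)
```

With equal group sizes the intercept is the grand mean under both schemes, which is one reason these codings are convenient to interpret.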
Coding