Generalized Linear Models (know this)
- An alternative to data transformations
- Principle: make the model fit the data, rather than changing the data to fit the model
- Models include link functions that allow for heterogeneous variances and nonlinearity
- Analysis and estimation are based on maximum likelihood methods
- Becoming more widely used; recommended by the experts
- Some understanding of the underlying theory is needed to implement them properly
Notes adapted from ASA GLMM Workshop, Long Beach, CA, 2010

Generalized Linear Models
- An ANOVA/regression model is fit to a non-normal data set
- Three elements:
  - Random component – a probability distribution for Yi from the exponential family of distributions (this is known)
  - Systematic component – the linear predictor (X variables) in the model
  - Link function – links the random and systematic components
- Form of the linear predictor is mean + trt effect
- No error term

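As a sketch, the three elements can be written in symbols for a one-factor treatment model. The symbols g, eta, beta_0 and tau_i are notation assumed here for illustration; they are not taken from the workshop notes.

```latex
\begin{align*}
\text{Random component:}     \quad & Y_i \sim \text{exponential-family distribution with mean } \mu_i = E(Y_i) \\
\text{Systematic component:} \quad & \eta_i = \beta_0 + \tau_i \\
\text{Link function:}        \quad & g(\mu_i) = \eta_i
\end{align*}
```

The linear predictor carries no error term because the variability of Y_i is described by the chosen distribution rather than by an added residual.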
Log of Distribution = "Log-Likelihood"
- Binary responses (0 or 1)
- Probability of success follows a binomial distribution
- Logistic regression – response variables are binary (e.g. alive or dead); predictors can be categorical or continuous
MLE
- Make multiple observations on Y and N
- Apply the likelihood formula for a range of P values
- Determine which value of P has the greatest likelihood, given the data set
"Canonical parameter"
- Takes the form Y * function of P

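The MLE steps above can be sketched as a simple grid search. For Y successes in N trials the binomial log-likelihood is log L(P) = log C(N,Y) + Y*log(P/(1-P)) + N*log(1-P), where the term Y*log(P/(1-P)) is the "Y * function of P" piece. The values Y = 7 and N = 10, the grid spacing, and the data set and variable names are made up for illustration.

```sas
/* Hypothetical data: Y = 7 successes out of N = 10 trials */
data loglik;
  y = 7;
  n = 10;
  do i = 1 to 99;
    p = i / 100;                      /* candidate values 0.01, 0.02, ..., 0.99 */
    /* binomial log-likelihood; lcomb(n, y) is the log of "n choose y" */
    ll = lcomb(n, y) + y*log(p/(1 - p)) + n*log(1 - p);
    output;
  end;
  drop i;
run;

/* Sort so the value of P with the greatest log-likelihood comes first */
proc sort data=loglik;
  by descending ll;
run;

proc print data=loglik(obs=1);
run;
```

With these made-up values the greatest log-likelihood falls at P = 0.70, which equals Y/N. In practice the procedures find the maximum by iterative numerical methods rather than by scanning a grid.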
Example – logit link
- µ can only vary from 0 to 1
- The logit of µ, log(µ / (1 − µ)), can take on any value
- Use an inverse function to convert means to the original scale

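A minimal sketch of the logit link and its inverse in a DATA step. The values of mu and the names eta and mu_back are chosen here for illustration only.

```sas
data logit_demo;
  do mu = 0.01, 0.25, 0.50, 0.75, 0.99;   /* means on the original (probability) scale */
    eta     = log(mu / (1 - mu));         /* logit link: maps (0, 1) onto the whole real line */
    mu_back = 1 / (1 + exp(-eta));        /* inverse link: maps eta back to the original scale */
    output;
  end;
run;

proc print data=logit_demo;
run;
```

Each mu_back should reproduce the corresponding mu up to rounding error, which is what the ilink option does for means estimated on the link scale.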
Some Common Distributions & Link(s)
- Log linear models – good for analysis of contingency tables; do not distinguish between response and predictor variables; based on a Poisson distribution

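As a hedged reminder, three commonly used distribution and link pairings can be written as PROC GLIMMIX model statements. The data set trial and the variables yield, dead, total and count are hypothetical names, and the RBD layout with random blocks is assumed only to keep the examples parallel.

```sas
/* Hypothetical data set "trial" with variables treatment, block,
   yield (continuous), dead and total (events/trials), and count */

/* Normal response, identity link */
proc glimmix data=trial;
  class treatment block;
  model yield = treatment / dist=normal link=identity;
  random block;
run;

/* Binomial response in events/trials form, logit link */
proc glimmix data=trial;
  class treatment block;
  model dead/total = treatment / dist=binomial link=logit;
  random block;
run;

/* Poisson counts, log link */
proc glimmix data=trial;
  class treatment block;
  model count = treatment / dist=poisson link=log;
  random block;
run;
```

These links are also the defaults PROC GLIMMIX uses for the three distributions, so link= could be omitted; spelling it out makes the choice visible.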
RBD Mixed Model Analyses with SAS (know this)

Distribution                        Treatments Fixed, Blocks Fixed                Treatments Fixed, Blocks Random
Normal (continuous)                 PROC GLM: Linear Model (LM)                   PROC MIXED: Linear Mixed Model (LMM)
Non-normal (categories or counts)   PROC GENMOD: Generalized Linear Model (GLM)   PROC GLIMMIX: Generalized Linear Mixed Model (GLMM)

- Mixed Models contain both random and fixed effects
- Note that PROC GLM will only handle LM!
- PROC GLIMMIX can handle all of the situations above

Linear Models for an RBD in SAS (know this)
Treatments fixed, Blocks fixed
- PROC GLM (normal) or PROC GENMOD (non-normal)
- All effects appear in the model statement
    Model Response = Block Treatment;
Treatments fixed, Blocks random
- PROC MIXED (normal) or PROC GLIMMIX (non-normal)
- Only fixed effects appear in the model statement
    Model Response = Treatment;
    Random Block;

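The model statements above can be placed in complete procedure steps roughly as follows. This is a sketch: the data set rbd and the variables yield (continuous) and count (counts) are hypothetical, and the lsmeans options are one reasonable choice rather than a required one.

```sas
/* Treatments fixed, blocks fixed, normal response */
proc glm data=rbd;
  class block treatment;
  model yield = block treatment;
  lsmeans treatment / stderr pdiff;
run;
quit;

/* Treatments fixed, blocks fixed, non-normal (count) response */
proc genmod data=rbd;
  class block treatment;
  model count = block treatment / dist=poisson link=log type3;
  lsmeans treatment / ilink diff;
run;

/* Treatments fixed, blocks random, normal response */
proc mixed data=rbd;
  class block treatment;
  model yield = treatment;
  random block;
  lsmeans treatment / diff;
run;
```

Block appears in the model statement only when it is treated as fixed; in PROC MIXED it moves to the random statement.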
GLIMMIX basic syntax for an RBD
    proc glimmix;
      class treatment block;
      model response = treatment / link=log s dist=poisson;
      random block;
      lsmeans treatment / ilink diff;
    run;
- Fixed effects go in the model statement
- Random effects go in the random statement
- Default means and standard errors from the lsmeans statement are on a log scale
- The ilink option gives back-transformed means and estimated standard errors on the original scale
- The diff option requests significance tests between all possible pairs of treatments in the trial

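To try the syntax above, a small data set is needed. The sketch below uses made-up counts for three treatments in four blocks; the data set name rbd_counts and all values are invented for illustration and carry no meaning.

```sas
/* Made-up counts for 3 treatments in 4 blocks; illustration only */
data rbd_counts;
  input block treatment $ response;
  datalines;
1 A 12
1 B 18
1 C 25
2 A 9
2 B 15
2 C 21
3 A 14
3 B 20
3 C 30
4 A 11
4 B 16
4 C 24
;
run;

proc glimmix data=rbd_counts;
  class treatment block;
  model response = treatment / link=log s dist=poisson;   /* s prints fixed-effect solutions */
  random block;
  lsmeans treatment / ilink diff;
run;
```

Naming the data set with data= avoids relying on the most recently created data set, which is what the procedure uses when data= is omitted.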
Estimation in LMM, GLM, and GLMM (know this)
- Does not use Least Squares estimation
- Does not calculate Sums of Squares or Mean Squares
- Estimates are by Maximum Likelihood
- Output includes:
  - Source of variation
  - Degrees of freedom
  - F tests and p-values
  - Treatment means and standard errors
  - Comparisons of means and standard errors

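In PROC MIXED the likelihood method is selected with the method= option. A minimal sketch, assuming a hypothetical data set rbd with a continuous response yield:

```sas
proc mixed data=rbd method=ml;   /* method=reml is the default when method= is omitted */
  class block treatment;
  model yield = treatment;
  random block;
  lsmeans treatment / diff;
run;
```

Restricted maximum likelihood (REML) is a likelihood-based variant, so the output still contains F tests and means but no sums of squares or mean squares.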