Linear statistical models 2009
Count data
Contingency tables and log-linear models
Poisson regression
Contingency tables and log-linear models
Expected frequency: $\mu_{ij} = E(n_{ij})$, where $n_{ij}$ is the observed count in cell $(i, j)$ of the table
Log-linear models are linear models of the log expected frequency (the log is used as the link function)
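A log-linear model for independence
$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y$$
where $\mu_{ij}$ is the expected frequency in cell $(i, j)$ of a table with $I$ rows (variable $X$) and $J$ columns (variable $Y$)
The last parameter of each kind can be set to zero: $\lambda_I^X = \lambda_J^Y = 0$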
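A short derivation may help to show why independence gives an additive model on the log scale. The symbols $n$ (total count), $\pi_{i+}$ and $\pi_{+j}$ (row and column marginal probabilities) and the $\lambda$ notation are assumptions introduced here for illustration, with the last level of each factor taken as the reference.

```latex
\begin{align*}
\mu_{ij} &= n\,\pi_{i+}\,\pi_{+j}
  && \text{(rows and columns independent)}\\
\log \mu_{ij} &= \log n + \log \pi_{i+} + \log \pi_{+j}\\
  &= \lambda + \lambda_i^X + \lambda_j^Y,\\[4pt]
\lambda &= \log n + \log \pi_{I+} + \log \pi_{+J},\\
\lambda_i^X &= \log\frac{\pi_{i+}}{\pi_{I+}}, \qquad
\lambda_j^Y = \log\frac{\pi_{+j}}{\pi_{+J}}
\end{align*}
```

With this parameterisation $\lambda_I^X = \lambda_J^Y = 0$, so each remaining parameter is a log ratio relative to the last level.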
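The saturated log-linear model
$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_{ij}^{XY}$$
where $\mu_{ij}$ is the expected frequency in cell $(i, j)$ and $\lambda_{ij}^{XY}$ is the interaction (association) term
Independence can be tested by relating the difference in deviance $D_2 - D_1$ to a $\chi^2$ distribution with $df_2 - df_1$ degrees of freedom (index 1: saturated model, index 2: independence model).
What are $D_1$ and $df_1$ for the saturated model?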
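As a sketch of how the comparison could be run, assuming the data set linear.snoring with the variables count, snore and heart from the example analysis, the two models can be fitted in turn. The saturated model reproduces the observed counts exactly, so its deviance and residual degrees of freedom are both zero, and the deviance of the independence model can be referred directly to the $\chi^2$ distribution.

```sas
/* Independence model: main effects only */
proc genmod data=linear.snoring;
  class snore heart;
  model count = snore heart / link=log dist=Poisson;
run;

/* Saturated model: adds the snore*heart interaction */
proc genmod data=linear.snoring;
  class snore heart;
  model count = snore heart snore*heart / link=log dist=Poisson;
run;
```

The Deviance row of the "Criteria For Assessing Goodness Of Fit" table in each output gives the values of D and df to compare.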
Analysis of example data (1)
    proc genmod data=linear.snoring;
      class snore heart;
      model count = snore heart / link=log dist=Poisson;
    run;
Can a Poisson distribution be justified?
Analysis of example data (2)
Analysis Of Parameter Estimates

    Parameter          DF   Estimate   Standard Error   Wald 95% Confidence Limits   Chi-Square   Pr > ChiSq
    Intercept                                                                                     <.0001
    Snore     Often                                                                               <.0001
    Snore     Seldom
    Heart     No                                                                                  <.0001
    Heart     Yes
    Scale

The estimates are on the scale of $\log(\mu)$
Contingency table with one response variable
Consider the example data written in the following form
    proc genmod data=linear.snoring2;
      class snore;
      model heart/total = snore / link=logit dist=binomial;
    run;
Analysis of example data (2)
Analysis Of Parameter Estimates

    Parameter          DF   Estimate   Standard Error   Wald 95% Confidence Limits   Chi-Square   Pr > ChiSq
    Intercept                                                                                     <.0001
    Snore     No                                                                                  <.0001
    Snore     Yes
    Scale

    Snore   log(p/(1-p))   p
    Yes
    No
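The fitted linear predictor is on the logit scale, so one extra step is needed to get back to a probability. The sketch below assumes the usual PROC GENMOD coding in which the last class level (here Snore = Yes) is the reference. The symbols $\eta$, $\beta_0$ and $\beta_{\text{No}}$ are notation introduced here for illustration.

```latex
\begin{align*}
\eta_{\text{Yes}} &= \beta_0, \qquad
\eta_{\text{No}} = \beta_0 + \beta_{\text{No}}\\[4pt]
p &= \frac{e^{\eta}}{1 + e^{\eta}}\\[4pt]
\beta_{\text{No}} &= \log\frac{p_{\text{No}}/(1-p_{\text{No}})}
                              {p_{\text{Yes}}/(1-p_{\text{Yes}})}
\end{align*}
```

The coefficient for Snore = No is therefore a log odds ratio relative to Snore = Yes.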
The multinomial distribution
Consider a nominal random variable that takes $k$ distinct values with probabilities $p_1, p_2, \ldots, p_k$
Assume that we have made $n$ independent observations of that variable
Then
$$P(n_1, n_2, \ldots, n_k) = \frac{n!}{n_1!\, n_2! \cdots n_k!}\; p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$$
where $n_j$ is the number of times the $j$th value is observed and $n_1 + n_2 + \cdots + n_k = n$
Note that $n$ is fixed in a multinomial distribution. If the observations arrive at random, so that the total count is itself random, a Poisson distribution is usually preferable.
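The two sampling schemes are closely linked, which is one reason Poisson log-linear models can also be used for tables with a fixed total. As a sketch, suppose the counts $N_1, \ldots, N_k$ are independent Poisson variables with means $\mu_1, \ldots, \mu_k$, and write $\mu = \sum_j \mu_j$ and $n = \sum_j n_j$ (notation assumed here for illustration). Conditioning on the total gives a multinomial distribution:

```latex
\begin{align*}
P\Bigl(N_1 = n_1, \ldots, N_k = n_k \,\Big|\, \textstyle\sum_j N_j = n\Bigr)
 &= \frac{\prod_{j=1}^{k} e^{-\mu_j}\,\mu_j^{n_j}/n_j!}
         {e^{-\mu}\,\mu^{n}/n!}\\[4pt]
 &= \frac{n!}{n_1! \cdots n_k!}\,
    \prod_{j=1}^{k}\Bigl(\frac{\mu_j}{\mu}\Bigr)^{n_j}
\end{align*}
```

This is the multinomial distribution with probabilities $p_j = \mu_j/\mu$.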
Higher order tables
Consider the following data on drug use
Model:
Terminology
A = alcohol, C = cigarette, M = marijuana
Model A C M: mutual independence model
Model A C M A*C A*M C*M: homogeneous association model
Model A C M A*C A*M: model in which C and M are conditionally independent given A
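The three models could be fitted along the following lines. This is only a sketch: the data set name linear.druguse and the variable names count, a, c and m are hypothetical, chosen to match the A, C, M notation, and assume one row per cell of the three-way table.

```sas
/* Mutual independence model: A C M */
proc genmod data=linear.druguse;   /* hypothetical data set name */
  class a c m;
  model count = a c m / link=log dist=Poisson;
run;

/* Homogeneous association model: all two-way interactions */
proc genmod data=linear.druguse;
  class a c m;
  model count = a c m a*c a*m c*m / link=log dist=Poisson;
run;

/* C and M conditionally independent given A */
proc genmod data=linear.druguse;
  class a c m;
  model count = a c m a*c a*m / link=log dist=Poisson;
run;
```

Because the models are nested, differences in deviance between them can be compared with a $\chi^2$ distribution in the same way as for two-way tables.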
Poisson regression I
Poisson distribution: $Y \sim \text{Poisson}(\mu)$, i.e. $P(Y = y) = e^{-\mu}\mu^{y}/y!$ for $y = 0, 1, 2, \ldots$
Log link: $\log \mu = \alpha + \beta x$
where $x$ is a covariate
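With a log link the covariate acts multiplicatively on the mean. A short worked step, assuming a linear predictor of the form $\log \mu(x) = \alpha + \beta x$ (the symbols $\alpha$ and $\beta$ are notation assumed here):

```latex
\begin{align*}
\mu(x) &= e^{\alpha + \beta x}\\[4pt]
\frac{\mu(x+1)}{\mu(x)} &= \frac{e^{\alpha + \beta (x+1)}}{e^{\alpha + \beta x}}
  = e^{\beta}
\end{align*}
```

A one-unit increase in $x$ thus multiplies the expected count by $e^{\beta}$. For example, $\beta = 0$ corresponds to no effect of the covariate.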
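Poisson regression II
Poisson distribution: $Y_{ijk} \sim \text{Poisson}(\mu_{ijk})$
Log link: $\log \mu_{ijk} = \lambda + \alpha_i + \beta_j + \tau_k$
where the parameters $\alpha_i$, $\beta_j$ and $\tau_k$ are row, column and treatment effects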
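A model with row, column and treatment effects could be fitted as sketched below. The data set name linear.design and the variable names y, row, col and treat are hypothetical, chosen only to illustrate the structure of the call.

```sas
proc genmod data=linear.design;    /* hypothetical data set name */
  class row col treat;             /* all three effects are categorical */
  model y = row col treat / link=log dist=Poisson type3;
run;
```

The type3 option requests likelihood ratio tests for each effect adjusted for the others, which is one way to assess the treatment effect.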