Generalized Linear Models II Distributions, link functions, diagnostics (linearity, homoscedasticity, leverage)


1 Generalized Linear Models II Distributions, link functions, diagnostics (linearity, homoscedasticity, leverage)

2 Dichotomous key: picking a distribution for your data

3 Common distributions (but see next slide for others and additional details)

Discrete or continuous?

- Discrete. Possible values: 0/1 or 0,1,2,… etc.
  - 0/1: Binomial (logistic regression)
  - 0,1,2,…: Poisson or Binomial. Check for overdispersion:
    - Resid. deviance ~= Resid. df (~ n-p): Poisson ok
    - Resid. deviance >> Resid. df (~ n-p): compare fit w/ quasi-Poisson, quasi-binomial, or negative binomial; then check Resid. deviance = Resid. df (~ n-p) again and compare s.dev. resids to normality
- Continuous. Range of data:
  - -∞ to +∞: Normal. Check residuals for normality
  - >0 to +∞: Gamma or Inverse-Gaussian. Check s.dev. residuals for normality

If distributional checks fail, examine the data/residuals and try to determine the source of deviance! Bimodality? Linearity? Fat tails? Excess zeros?
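The overdispersion check in the key above can be sketched in R. This is a minimal illustration on simulated data, not the course dataset: the variables `x` and `y`, the coefficients, and the negative binomial `size` are all assumptions chosen only to produce overdispersed counts.

```r
# Simulated counts (hypothetical data, not from the slides): negative binomial
# draws are overdispersed relative to a Poisson with the same mean.
set.seed(1)
n <- 200
x <- runif(n)
y <- rnbinom(n, mu = exp(0.5 + 1.0 * x), size = 1)

# Fit a Poisson GLM and compare residual deviance to residual df (n - p).
m_pois <- glm(y ~ x, family = poisson)
ratio <- deviance(m_pois) / df.residual(m_pois)
ratio  # values well above 1 suggest overdispersion

# Quasi-Poisson gives the same coefficient estimates but also estimates a
# dispersion parameter, which scales up the standard errors.
m_quasi <- glm(y ~ x, family = quasipoisson)
dispersion <- summary(m_quasi)$dispersion
dispersion
```

A negative binomial fit (for example with `MASS::glm.nb`) is the other alternative the key mentions; unlike the quasi-Poisson fit it has a likelihood, so it can be compared by AIC.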

4 Possible values:

Discrete
- 0/1: Bernoulli (success/failure; logistic regression)
- 0,1,2,… N (known): Binomial (# successes in fixed # trials); Multinomial (more than 2 categories, fixed # trials)
- 0,1,2,… infinity: Geometric (# trials to 1st success); Poisson (# successes in large # trials); Negative Binomial (# trials to nth success, or over-dispersed Poisson)

Continuous
- -∞ to +∞: Normal
- >0 to +∞: Exponential (time to 1st success); Gamma (time to nth success); Inverse-Gaussian (time for Brownian motion with drift to first reach a fixed level; note that it is not the distribution of 1/x for normal x)
- 0 to 1: Beta (fraction of total, proportions)

Check out Wikipedia pages for each distribution for more info!

5 As sample sizes get large, many distributions converge on the normal distribution (more precisely, as parameters such as the Poisson mean, the number of binomial trials, or the gamma shape get large). See, e.g.
http://en.wikipedia.org/wiki/Negative_binomial_distribution
http://en.wikipedia.org/wiki/Gamma_distribution
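The convergence described on slide 5 can be illustrated with a short simulation. This sketch uses the Poisson as one example; the helper `sample_skewness` and the chosen means are assumptions for illustration. A normal distribution has skewness 0, and the Poisson's skewness is 1/sqrt(mean), so it shrinks as the mean grows.

```r
set.seed(42)

# Sample skewness: third central moment divided by the cubed standard deviation.
sample_skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

# Draw large Poisson samples with increasing means (hypothetical values).
lambdas <- c(1, 10, 100)
skews <- sapply(lambdas, function(l) sample_skewness(rpois(1e5, l)))

# Skewness should decrease toward 0 as the mean increases.
round(skews, 2)
```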

6 Group exercise Get a partner Describe a real dataset to your partner Partner picks a potentially appropriate distribution Switch roles Repeat!

7 Link Functions

- Enforce appropriate range for expected response (e.g. between 0 and 1 for 'probability of success', >0 for counts, etc.)
- Linearize relationship between expected response and predictors: $G(E(y)) = b_0 + b_1 x_1 + b_2 x_2 + \dots$
- Be careful to interpret coefficients properly given a link function! $E(y) = G^{-1}(b_0 + b_1 x_1 + b_2 x_2 + \dots)$

E.g., writing $\eta = b_0 + b_1 x_1 + b_2 x_2 + \dots$ for the linear predictor:

Link     Constraint        Inverse
Log      E(y) > 0          E(y) = exp(η)
Logit    E(y) in (0,1)     E(y) = exp(η) / (1 + exp(η))

See Table 15.1 in GLM chapter for lots more!
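The link and inverse-link relationship on slide 7 can be checked directly in R with `make.link()` from the base `stats` package. This is a minimal sketch; the linear predictor values `eta` and the coefficients `b0` and `b1` are hypothetical numbers chosen for illustration.

```r
# make.link() returns a list holding the link function and its inverse.
log_link <- make.link("log")
logit_link <- make.link("logit")

eta <- c(-2, 0, 2)  # hypothetical linear predictor values

mu_count <- log_link$linkinv(eta)   # exp(eta): always > 0
mu_prob <- logit_link$linkinv(eta)  # exp(eta) / (1 + exp(eta)): always in (0, 1)

# Applying the link to the mean recovers the linear predictor.
all.equal(log_link$linkfun(mu_count), eta)
all.equal(logit_link$linkfun(mu_prob), eta)

# Interpretation under a log link: a one-unit increase in x multiplies
# E(y) by exp(b1) rather than adding b1.
b0 <- 0.5; b1 <- 0.3  # hypothetical coefficients
effect_ratio <- exp(b0 + b1 * 2) / exp(b0 + b1 * 1)
all.equal(effect_ratio, exp(b1))
```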

8 Canonical link functions
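The slide's table did not survive in this transcript, but R's family objects report their default link, which for the families below is the canonical one. This sketch simply lists them; the `families` list is an arbitrary selection.

```r
# Family objects from base R's stats package, created with their default links.
families <- list(
  gaussian = gaussian(),
  binomial = binomial(),
  poisson = poisson(),
  Gamma = Gamma(),
  inverse.gaussian = inverse.gaussian()
)

# Extract the name of each family's default link.
default_links <- sapply(families, function(f) f$link)
default_links
```

Note that `family=Gamma` with no link specified, as in the example on slide 11, therefore fits with the inverse link; a log link has to be requested explicitly with `Gamma(link = "log")`.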

9 Sample problems for count data: Binomial vs. Poisson
http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/se202/poiss_bin.html

10 Leverage (see diagnostic plots & websites on next slide) Xxx et al 2006 PLoS Biology
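Leverage can also be inspected numerically rather than only from plots. The sketch below uses simulated data with one deliberately extreme predictor value; the data, the Poisson model, and the 2p/n cutoff (a common rule of thumb, not something stated on the slide) are all assumptions.

```r
set.seed(7)
n <- 50
x <- c(runif(n - 1, 0, 1), 5)  # the last point lies far from the rest in x
y <- rpois(n, lambda = exp(0.2 + 0.5 * x))
fit <- glm(y ~ x, family = poisson)

# Hat values measure leverage; for a GLM they sum to the number of coefficients.
h <- hatvalues(fit)
p <- length(coef(fit))

# Flag observations whose leverage exceeds twice the average (2p/n).
high_leverage <- which(h > 2 * p / n)
high_leverage

# Cook's distance combines leverage with residual size to measure influence.
cooks <- cooks.distance(fit)
head(sort(cooks, decreasing = TRUE), 3)
```

High leverage alone is not a problem; it matters when the point also has a large residual, which is what Cook's distance and the residuals-vs-leverage panel of `plot()` on a fitted model are meant to show.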

11 R: example GLM with data

# read in data
bd <- read.csv("c:/marm/teaching/293qe/bat_lambda.csv")
str(bd); head(bd)

# What not to do - run models blindly!
b1 <- glm(Lambda ~ PreWNS_Pop, family = Gamma, data = bd); summary(b1)

# What to do - plot data
plot(Lambda ~ PreWNS_Pop, data = bd)

# What does it suggest would be a good idea?
bd$Lpop <- log(bd$PreWNS_Pop)
plot(Lambda ~ Lpop, data = bd)

# Fit nested models: log population, plus species, plus their interaction
b1 <- glm(Lambda ~ Lpop, family = Gamma, data = bd); summary(b1)
b2 <- glm(Lambda ~ Lpop + Species, family = Gamma, data = bd); summary(b2)
b3 <- glm(Lambda ~ Lpop * Species, family = Gamma, data = bd); summary(b3)

# Compare models, then look at diagnostic plots
anova(b1, b2, b3, test = "Chisq")
AIC(b1, b2, b3)
plot(b3)

http://stats.stackexchange.com/questions/52089/what-does-having-constant-variance-in-a-linear-regression-model-mean
http://stats.stackexchange.com/questions/58141/interpreting-plot-lm

